torch_geometric.datasets.BrcaTcga

class BrcaTcga(root: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None, force_reload: bool = False)

Bases: InMemoryDataset

The breast cancer (BRCA TCGA Pan-Cancer Atlas) dataset, consisting of patients with survival information and gene expression data from cBioPortal, together with a network of biological interactions between the genes (the graph nodes) from Pathway Commons. The dataset contains the gene features of 1,082 patients, with each patient's overall survival time (in months) as the label.

Pre-processing and example model code showing how to use this dataset can be found here.

Parameters:
  • root (str) – Root directory where the dataset should be saved.

  • transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)

  • force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
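The parameters above can be combined as in the following minimal sketch. The helper names `has_positive_survival` and `load_brca_tcga` and the root path `data/BrcaTcga` are hypothetical, and the filter assumes that `data.y` holds a single number (the survival time in months); check the processed data before relying on that.

```python
from typing import Any


def has_positive_survival(data: Any) -> bool:
    """Hypothetical pre_filter: keep patients whose survival time is > 0.

    Assumes `data.y` holds a single number (survival time in months).
    """
    return float(data.y) > 0.0


def load_brca_tcga(root: str = "data/BrcaTcga"):
    """Build the dataset, storing its files under `root`."""
    # Imported here so the filter above can be exercised without PyG installed.
    from torch_geometric.datasets import BrcaTcga

    return BrcaTcga(root=root, pre_filter=has_positive_survival)


if __name__ == "__main__":
    dataset = load_brca_tcga()
    print(len(dataset))  # number of patient graphs kept after filtering
    print(dataset[0])    # a single torch_geometric.data.Data object
```

Because `pre_filter` is applied when the dataset is processed, changing the filter later likely requires passing `force_reload=True` (or clearing the processed files under `root`) for it to take effect.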

STATS:

#graphs   #nodes   #edges    #features
1,082     9,288    271,771   1,082
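With 1,082 graphs, a train/validation/test split is usually the first step before fitting a model. The sketch below is one simple way to do it, not a split prescribed by the dataset; the function name `split_indices`, the 10% fractions, and the seed are illustrative assumptions.

```python
import random
from typing import List, Tuple


def split_indices(
    num_graphs: int,
    val_frac: float = 0.1,
    test_frac: float = 0.1,
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle graph indices reproducibly and cut them into three parts."""
    indices = list(range(num_graphs))
    random.Random(seed).shuffle(indices)  # local RNG, global state untouched

    num_val = int(num_graphs * val_frac)
    num_test = int(num_graphs * test_frac)

    val_idx = indices[:num_val]
    test_idx = indices[num_val:num_val + num_test]
    train_idx = indices[num_val + num_test:]
    return train_idx, val_idx, test_idx


# 1,082 is the number of graphs listed in the statistics above.
train_idx, val_idx, test_idx = split_indices(1082)
print(len(train_idx), len(val_idx), len(test_idx))  # 866 108 108
```

Since PyG datasets accept a list of indices, the resulting lists can be used directly, for example `dataset[train_idx]`.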