torch_geometric.datasets
Zachary’s karate club network from the “An Information Flow Model for Conflict and Fission in Small Groups” paper, containing 34 nodes, connected by 156 (undirected and unweighted) edges. |
|
A variety of graph kernel benchmark datasets, e.g., “IMDB-BINARY”, “REDDIT-BINARY” or “PROTEINS”, collected from the TU Dortmund University. |
|
A variety of artificially and semi-artificially generated graph datasets from the “Benchmarking Graph Neural Networks” paper. |
|
The citation network datasets “Cora”, “CiteSeer” and “PubMed” from the “Revisiting Semi-Supervised Learning with Graph Embeddings” paper. |
|
The NELL dataset, a knowledge graph from the “Toward an Architecture for Never-Ending Language Learning” paper. |
|
The full citation network datasets from the “Deep Gaussian Embedding of Graphs: Unsupervised Inductive Learning via Ranking” paper. |
|
Alias for torch_geometric.datasets.CitationFull with name="cora". |
|
The Coauthor CS and Coauthor Physics networks from the “Pitfalls of Graph Neural Network Evaluation” paper. |
|
The Amazon Computers and Amazon Photo networks from the “Pitfalls of Graph Neural Network Evaluation” paper. |
|
The protein-protein interaction networks from the “Predicting Multicellular Function through Multi-layer Tissue Networks” paper, containing positional gene sets, motif gene sets and immunological signatures as features (50 in total) and gene ontology sets as labels (121 in total). |
|
The Reddit dataset from the “Inductive Representation Learning on Large Graphs” paper, containing Reddit posts belonging to different communities. |
|
The Reddit dataset from the “GraphSAINT: Graph Sampling Based Inductive Learning Method” paper, containing Reddit posts belonging to different communities. |
|
The Flickr dataset from the “GraphSAINT: Graph Sampling Based Inductive Learning Method” paper, containing descriptions and common properties of images. |
|
The Yelp dataset from the “GraphSAINT: Graph Sampling Based Inductive Learning Method” paper, containing customer reviewers and their friendship. |
|
The Amazon dataset from the “GraphSAINT: Graph Sampling Based Inductive Learning Method” paper, containing products and their categories. |
|
The QM7b dataset from the “MoleculeNet: A Benchmark for Molecular Machine Learning” paper, consisting of 7,211 molecules with 14 regression targets. |
|
The QM9 dataset from the “MoleculeNet: A Benchmark for Molecular Machine Learning” paper, consisting of about 130,000 molecules with 19 regression targets. |
|
The ZINC dataset from the ZINC database and the “Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules” paper, containing about 250,000 molecular graphs with up to 38 heavy atoms. |
|
The MoleculeNet benchmark collection from the “MoleculeNet: A Benchmark for Molecular Machine Learning” paper, containing datasets from physical chemistry, biophysics and physiology. |
|
The relational entities networks “AIFB”, “MUTAG”, “BGS” and “AM” from the “Modeling Relational Data with Graph Convolutional Networks” paper. |
|
The GED datasets from the “Graph Edit Distance Computation via Graph Neural Networks” paper. |
|
MNIST superpixels dataset from the “Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs” paper, containing 70,000 graphs with 75 nodes each. |
|
The FAUST humans dataset from the “FAUST: Dataset and Evaluation for 3D Mesh Registration” paper, containing 100 watertight meshes representing 10 different poses for 10 different subjects. |
|
The dynamic FAUST humans dataset from the “Dynamic FAUST: Registering Human Bodies in Motion” paper. |
|
The ShapeNet part level segmentation dataset from the “A Scalable Active Framework for Region Annotation in 3D Shape Collections” paper, containing about 17,000 3D shape point clouds from 16 shape categories. |
|
The ModelNet10/40 datasets from the “3D ShapeNets: A Deep Representation for Volumetric Shapes” paper, containing CAD models of 10 and 40 categories, respectively. |
|
The CoMA 3D faces dataset from the “Generating 3D faces using Convolutional Mesh Autoencoders” paper, containing 20,466 meshes of extreme expressions captured over 12 different subjects. |
|
The SHREC 2016 partial matching dataset from the “SHREC’16: Partial Matching of Deformable Shapes” paper. |
|
The TOSCA dataset from the “Numerical Geometry of Non-Rigid Shapes” book, containing 80 meshes. |
|
The PCPNet dataset from the “PCPNet: Learning Local Shape Properties from Raw Point Clouds” paper, consisting of 30 shapes, each given as a point cloud, densely sampled with 100k points. |
|
The (pre-processed) Stanford Large-Scale 3D Indoor Spaces dataset from the “3D Semantic Parsing of Large-Scale Indoor Spaces” paper, containing point clouds of six large-scale indoor parts in three buildings with 12 semantic elements (and one clutter class). |
|
Synthetic dataset of various geometric shapes like cubes, spheres or pyramids. |
|
The Bitcoin-OTC dataset from the “EvolveGCN: Evolving Graph Convolutional Networks for Dynamic Graphs” paper, consisting of 138 who-trusts-whom networks of sequential time steps. |
|
The Integrated Crisis Early Warning System (ICEWS) dataset used in, e.g., the “Recurrent Event Network for Reasoning over Temporal Knowledge Graphs” paper, consisting of events collected from 1/1/2018 to 10/31/2018 (24-hour time granularity). |
|
The Global Database of Events, Language, and Tone (GDELT) dataset used in, e.g., the “Recurrent Event Network for Reasoning over Temporal Knowledge Graphs” paper, consisting of events collected from 1/1/2018 to 1/31/2018 (15-minute time granularity). |
|
The DBP15K dataset from the “Cross-lingual Entity Alignment via Joint Attribute-Preserving Embedding” paper, where Chinese, Japanese and French versions of DBpedia were linked to its English version. |
|
The WILLOW-ObjectClass dataset from the “Learning Graphs to Match” paper, containing 10 equal keypoints of at least 40 images in each category. |
|
The Pascal VOC 2011 dataset with Berkeley annotations of keypoints from the “Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations” paper, containing 0 to 23 keypoints per example over 20 categories. |
|
The Pascal-PF dataset from the “Proposal Flow” paper, containing 4 to 16 keypoints per example over 20 categories. |
|
A variety of graph datasets collected from SNAP at Stanford University. |
|
A suite of sparse matrix benchmarks known as the Suite Sparse Matrix Collection collected from a wide range of applications. |
|
The TrackML Particle Tracking Challenge dataset to reconstruct particle tracks from 3D points left in the silicon detectors. |
|
The heterogeneous AMiner dataset from the “metapath2vec: Scalable Representation Learning for Heterogeneous Networks” paper, consisting of nodes of type "paper", "author" and "venue". |
|
The WordNet18 dataset from the “Translating Embeddings for Modeling Multi-Relational Data” paper, containing 40,943 entities, 18 relations and 151,442 fact triplets, e.g., furniture includes bed. |
|
The WordNet18RR dataset from the “Convolutional 2D Knowledge Graph Embeddings” paper, containing 40,943 entities, 11 relations and 93,003 fact triplets. |
|
The semi-supervised Wikipedia-based dataset from the “Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks” paper, containing 11,701 nodes, 216,123 edges, 10 classes and 20 different training splits. |
|
The WebKB datasets used in the “Geom-GCN: Geometric Graph Convolutional Networks” paper. |
|
The Wikipedia networks used in the “Geom-GCN: Geometric Graph Convolutional Networks” paper. |
|
The actor-only induced subgraph of the film-director-actor-writer network used in the “Geom-GCN: Geometric Graph Convolutional Networks” paper. |
|
The MixHop synthetic dataset from the “MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing” paper, containing 10 graphs, each with varying degree of homophily (ranging from 0.0 to 0.9). |
|
The tree-structured fake news propagation graph classification dataset from the “User Preference-aware Fake News Detection” paper. |
-
class AMiner(root: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)
The heterogeneous AMiner dataset from the “metapath2vec: Scalable Representation Learning for Heterogeneous Networks” paper, consisting of nodes of type "paper", "author" and "venue". Venue categories and author research interests are available as ground truth labels for a subset of nodes.
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
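The difference between transform and pre_transform described above can be illustrated with a small stand-in class. This is only a sketch under stated assumptions: ToyDataset, add_degree and mark_accessed are hypothetical names and not part of torch_geometric, and an in-memory list stands in for the processed files on disk. The sketch imitates the documented behaviour, with pre_transform applied once before storing and transform applied on every access.

```python
from typing import Callable, List, Optional


class ToyDataset:
    """Hypothetical stand-in that mimics the documented hook semantics."""

    def __init__(
        self,
        raw_items: List[dict],
        transform: Optional[Callable[[dict], dict]] = None,
        pre_transform: Optional[Callable[[dict], dict]] = None,
    ) -> None:
        self.transform = transform
        # pre_transform runs once, before the items are "saved"
        # (here: kept in a list instead of being written to disk).
        if pre_transform is not None:
            raw_items = [pre_transform(dict(item)) for item in raw_items]
        self._stored = raw_items

    def __len__(self) -> int:
        return len(self._stored)

    def __getitem__(self, idx: int) -> dict:
        # transform runs on every access, on a copy of the stored item.
        item = dict(self._stored[idx])
        return self.transform(item) if self.transform is not None else item


calls = {"pre": 0, "access": 0}


def add_degree(item: dict) -> dict:
    """Example pre_transform: cache the neighbor count once."""
    calls["pre"] += 1
    item["degree"] = len(item["neighbors"])
    return item


def mark_accessed(item: dict) -> dict:
    """Example transform: runs each time an item is read."""
    calls["access"] += 1
    item["accessed"] = True
    return item


dataset = ToyDataset(
    [{"neighbors": [1, 2]}, {"neighbors": [0]}],
    transform=mark_accessed,
    pre_transform=add_degree,
)
first = dataset[0]
again = dataset[0]
```

Expensive, deterministic preprocessing is usually a better fit for pre_transform, while random augmentation belongs in transform so that it is re-drawn on each access.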
-
class Actor(root: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)
The actor-only induced subgraph of the film-director-actor-writer network used in the “Geom-GCN: Geometric Graph Convolutional Networks” paper. Each node corresponds to an actor, and an edge between two nodes denotes co-occurrence on the same Wikipedia page. Node features correspond to keywords in the Wikipedia pages. The task is to classify the nodes into five categories in terms of the words on the actor’s Wikipedia page.
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
-
class Amazon(root: str, name: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)
The Amazon Computers and Amazon Photo networks from the “Pitfalls of Graph Neural Network Evaluation” paper. Nodes represent goods and edges represent that two goods are frequently bought together. Given product reviews as bag-of-words node features, the task is to map goods to their respective product category.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset ("Computers", "Photo").
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
-
class AmazonProducts(root: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)
The Amazon dataset from the “GraphSAINT: Graph Sampling Based Inductive Learning Method” paper, containing products and their categories.
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
-
class BitcoinOTC(root: str, edge_window_size: int = 10, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)
The Bitcoin-OTC dataset from the “EvolveGCN: Evolving Graph Convolutional Networks for Dynamic Graphs” paper, consisting of 138 who-trusts-whom networks of sequential time steps.
- Parameters
root (string) – Root directory where the dataset should be saved.
edge_window_size (int, optional) – The window size for the existence of an edge in the graph sequence since its initial creation. (default: 10)
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
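The idea behind edge_window_size can be sketched in plain Python. The snippet below is not the BitcoinOTC implementation. It assumes that an edge created at time step t remains visible in the snapshots for steps t through t + edge_window_size - 1; the exact boundary convention is an assumption, and snapshot_edges and the toy edges are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def snapshot_edges(
    created_at: Dict[Edge, int], num_steps: int, edge_window_size: int = 10
) -> List[List[Edge]]:
    """Return, for each time step, the edges still inside their window.

    Assumption: an edge created at step t is visible at steps
    t, t + 1, ..., t + edge_window_size - 1.
    """
    snapshots: List[List[Edge]] = []
    for step in range(num_steps):
        alive = [
            edge
            for edge, t in sorted(created_at.items())
            if t <= step < t + edge_window_size
        ]
        snapshots.append(alive)
    return snapshots


# Three hypothetical trust edges created at steps 0, 1 and 3.
created = {(0, 1): 0, (1, 2): 1, (2, 0): 3}
snaps = snapshot_edges(created, num_steps=5, edge_window_size=2)
```

Under this assumption, a larger window keeps old trust relations in the graph sequence for longer, so consecutive snapshots overlap more.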
-
class CitationFull(root: str, name: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)
The full citation network datasets from the “Deep Gaussian Embedding of Graphs: Unsupervised Inductive Learning via Ranking” paper. Nodes represent documents and edges represent citation links. Datasets include citeseer, cora, cora_ml, dblp, pubmed.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset ("Cora", "Cora_ML", "CiteSeer", "DBLP", "PubMed").
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
-
class CoMA(root: str, train: bool = True, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None)
The CoMA 3D faces dataset from the “Generating 3D faces using Convolutional Mesh Autoencoders” paper, containing 20,466 meshes of extreme expressions captured over 12 different subjects.
Note
Data objects hold mesh faces instead of edge indices. To convert the mesh to a graph, use torch_geometric.transforms.FaceToEdge as pre_transform. To convert the mesh to a point cloud, use torch_geometric.transforms.SamplePoints as transform to sample a fixed number of points on the mesh faces according to their face area.
- Parameters
root (string) – Root directory where the dataset should be saved.
train (bool, optional) – If True, loads the training dataset, otherwise the test dataset. (default: True)
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
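The note above mentions converting mesh faces to graph edges. The following plain-Python sketch shows the underlying idea for triangle meshes only. It is not the implementation of torch_geometric.transforms.FaceToEdge, and faces_to_undirected_edges is a hypothetical helper name.

```python
from typing import List, Set, Tuple


def faces_to_undirected_edges(
    faces: List[Tuple[int, int, int]]
) -> List[Tuple[int, int]]:
    """Collect the unique undirected edges of a triangle mesh."""
    edges: Set[Tuple[int, int]] = set()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            # Store each edge once, with the smaller vertex index first.
            edges.add((min(u, v), max(u, v)))
    return sorted(edges)


# Two triangles that share the edge (1, 2).
faces = [(0, 1, 2), (1, 3, 2)]
edges = faces_to_undirected_edges(faces)
```

The shared edge appears only once in the result, which is why two triangles yield five edges rather than six.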
The Coauthor CS and Coauthor Physics networks from the “Pitfalls of Graph Neural Network Evaluation” paper. Nodes represent authors that are connected by an edge if they co-authored a paper. Given paper keywords for each author’s papers, the task is to map authors to their respective field of study.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset ("CS", "Physics").
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
-
class CoraFull(root: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)
Alias for torch_geometric.datasets.CitationFull with name="cora".
-
class DBP15K(root: str, pair: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)
The DBP15K dataset from the “Cross-lingual Entity Alignment via Joint Attribute-Preserving Embedding” paper, where Chinese, Japanese and French versions of DBpedia were linked to its English version. Node features are given by pre-trained and aligned monolingual word embeddings from the “Cross-lingual Knowledge Graph Alignment via Graph Matching Neural Network” paper.
- Parameters
root (string) – Root directory where the dataset should be saved.
pair (string) – The pair of languages ("en_zh", "en_fr", "en_ja", "zh_en", "fr_en", "ja_en").
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
-
class DynamicFAUST(root: str, subjects: Optional[List[str]] = None, categories: Optional[List[str]] = None, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None)
The dynamic FAUST humans dataset from the “Dynamic FAUST: Registering Human Bodies in Motion” paper.
Note
Data objects hold mesh faces instead of edge indices. To convert the mesh to a graph, use torch_geometric.transforms.FaceToEdge as pre_transform. To convert the mesh to a point cloud, use torch_geometric.transforms.SamplePoints as transform to sample a fixed number of points on the mesh faces according to their face area.
- Parameters
root (string) – Root directory where the dataset should be saved.
subjects (list, optional) – List of subjects to include in the dataset. Can include the subjects "50002", "50004", "50007", "50009", "50020", "50021", "50022", "50025", "50026", "50027". If set to None, the dataset will contain all subjects. (default: None)
categories (list, optional) – List of categories to include in the dataset. Can include the categories "chicken_wings", "hips", "jiggle_on_toes", "jumping_jacks", "knees", "light_hopping_loose", "light_hopping_stiff", "one_leg_jump", "one_leg_loose", "personal_move", "punching", "running_on_spot", "running_on_spot_bugfix", "shake_arms", "shake_hips", "shoulders". If set to None, the dataset will contain all categories. (default: None)
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
-
class Entities(root: str, name: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)
The relational entities networks “AIFB”, “MUTAG”, “BGS” and “AM” from the “Modeling Relational Data with Graph Convolutional Networks” paper. Training and test splits are given by node indices.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset ("AIFB", "MUTAG", "BGS", "AM").
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
-
class FAUST(root: str, train: bool = True, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None)
The FAUST humans dataset from the “FAUST: Dataset and Evaluation for 3D Mesh Registration” paper, containing 100 watertight meshes representing 10 different poses for 10 different subjects.
Note
Data objects hold mesh faces instead of edge indices. To convert the mesh to a graph, use torch_geometric.transforms.FaceToEdge as pre_transform. To convert the mesh to a point cloud, use torch_geometric.transforms.SamplePoints as transform to sample a fixed number of points on the mesh faces according to their face area.
- Parameters
root (string) – Root directory where the dataset should be saved.
train (bool, optional) – If True, loads the training dataset, otherwise the test dataset. (default: True)
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
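Sampling points "according to their face area", as the note above puts it, can be sketched without any mesh library. The code below is a minimal illustration, not the implementation of torch_geometric.transforms.SamplePoints. The names triangle_area and sample_points, the fixed seed, and the square-root trick for uniform barycentric coordinates are choices made for this sketch.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Face = Tuple[int, int, int]


def triangle_area(a: Point, b: Point, c: Point) -> float:
    """Area of a 3D triangle: half the norm of the cross product."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    cross = (
        ab[1] * ac[2] - ab[2] * ac[1],
        ab[2] * ac[0] - ab[0] * ac[2],
        ab[0] * ac[1] - ab[1] * ac[0],
    )
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def sample_points(
    pos: Sequence[Point], faces: Sequence[Face], num: int, seed: int = 0
) -> List[Point]:
    """Draw `num` points on the mesh surface, weighting faces by area."""
    rng = random.Random(seed)
    areas = [triangle_area(pos[i], pos[j], pos[k]) for i, j, k in faces]
    # Larger faces are picked proportionally more often.
    chosen = rng.choices(list(faces), weights=areas, k=num)
    points: List[Point] = []
    for i, j, k in chosen:
        # Uniform barycentric coordinates inside the chosen triangle.
        r1, r2 = math.sqrt(rng.random()), rng.random()
        w = (1.0 - r1, r1 * (1.0 - r2), r1 * r2)
        points.append(
            tuple(
                w[0] * pos[i][d] + w[1] * pos[j][d] + w[2] * pos[k][d]
                for d in range(3)
            )
        )
    return points


# A unit square in the z = 0 plane, split into two triangles.
square_pos = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
square_faces = [(0, 1, 2), (0, 2, 3)]
cloud = sample_points(square_pos, square_faces, num=100)
```

Weighting by area keeps the point density roughly uniform over the surface; picking faces uniformly instead would oversample regions made of many small triangles.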
-
class Flickr(root: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)
The Flickr dataset from the “GraphSAINT: Graph Sampling Based Inductive Learning Method” paper, containing descriptions and common properties of images.
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
-
class GDELT(root: str, split: str = 'train', transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None)
The Global Database of Events, Language, and Tone (GDELT) dataset used in, e.g., the “Recurrent Event Network for Reasoning over Temporal Knowledge Graphs” paper, consisting of events collected from 1/1/2018 to 1/31/2018 (15-minute time granularity).
- Parameters
root (string) – Root directory where the dataset should be saved.
split (string) – If "train", loads the training dataset. If "val", loads the validation dataset. If "test", loads the test dataset. (default: "train")
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
-
class GEDDataset(root: str, name: str, train: bool = True, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None)
The GED datasets from the “Graph Edit Distance Computation via Graph Neural Networks” paper. GEDs can be accessed via the global attributes ged and norm_ged for all train/train graph pairs and all train/test graph pairs:

    from torch_geometric.datasets import GEDDataset

    # `root` is the directory where the dataset should be saved.
    dataset = GEDDataset(root, name="LINUX")
    data1, data2 = dataset[0], dataset[1]
    ged = dataset.ged[data1.i, data2.i]  # GED between `data1` and `data2`.

Note
"ALKANE" is missing GEDs for train/test graph pairs since they are not provided in the official datasets.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset (one of "AIDS700nef", "LINUX", "ALKANE", "IMDBMulti").
train (bool, optional) – If True, loads the training dataset, otherwise the test dataset. (default: True)
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
-
class GNNBenchmarkDataset(root: str, name: str, split: str = 'train', transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None)
A variety of artificially and semi-artificially generated graph datasets from the “Benchmarking Graph Neural Networks” paper.
Note
The ZINC dataset is provided via torch_geometric.datasets.ZINC.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset (one of "PATTERN", "CLUSTER", "MNIST", "CIFAR10", "TSP", "CSL").
split (string, optional) – If "train", loads the training dataset. If "val", loads the validation dataset. If "test", loads the test dataset. (default: "train")
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
-
class GeometricShapes(root: str, train: bool = True, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None)
Synthetic dataset of various geometric shapes like cubes, spheres or pyramids.
Note
Data objects hold mesh faces instead of edge indices. To convert the mesh to a graph, use torch_geometric.transforms.FaceToEdge as pre_transform. To convert the mesh to a point cloud, use torch_geometric.transforms.SamplePoints as transform to sample a fixed number of points on the mesh faces according to their face area.
- Parameters
root (string) – Root directory where the dataset should be saved.
train (bool, optional) – If True, loads the training dataset, otherwise the test dataset. (default: True)
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
-
class ICEWS18(root, split='train', transform=None, pre_transform=None, pre_filter=None)
The Integrated Crisis Early Warning System (ICEWS) dataset used in, e.g., the “Recurrent Event Network for Reasoning over Temporal Knowledge Graphs” paper, consisting of events collected from 1/1/2018 to 10/31/2018 (24-hour time granularity).
- Parameters
root (string) – Root directory where the dataset should be saved.
split (string) – If "train", loads the training dataset. If "val", loads the validation dataset. If "test", loads the test dataset. (default: "train")
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
-
class KarateClub(transform=None)
Zachary’s karate club network from the “An Information Flow Model for Conflict and Fission in Small Groups” paper, containing 34 nodes, connected by 156 (undirected and unweighted) edges. Every node is labeled by one of four classes obtained via modularity-based clustering, following the “Semi-supervised Classification with Graph Convolutional Networks” paper. Training is based on a single labeled example per class, i.e., a total of 4 labeled nodes.
- Parameters
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
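The "single labeled example per class" setup described above can be checked with a small helper. This is a sketch under assumptions: labeled_per_class and load_karate_club_counts are hypothetical helper names, and the loader assumes that torch_geometric is installed and that the loaded data object exposes y and train_mask tensors. The import is done lazily so the counting helper can be used without the library.

```python
from collections import Counter
from typing import Dict, Sequence


def labeled_per_class(
    labels: Sequence[int], train_mask: Sequence[bool]
) -> Dict[int, int]:
    """Count how many training (labeled) nodes each class has."""
    counts = Counter(
        label for label, is_train in zip(labels, train_mask) if is_train
    )
    return dict(counts)


def load_karate_club_counts() -> Dict[int, int]:
    """Load KarateClub and count its labeled nodes per class.

    Assumes torch_geometric is installed; it is imported lazily here.
    """
    from torch_geometric.datasets import KarateClub

    data = KarateClub()[0]
    return labeled_per_class(data.y.tolist(), data.train_mask.tolist())


# Toy check with hypothetical labels: one labeled node per class.
toy_counts = labeled_per_class(
    [0, 1, 1, 0, 2], [True, True, False, False, True]
)
```

If the description above holds, load_karate_club_counts() would be expected to report one labeled node for each of the four classes.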
-
class MNISTSuperpixels(root, train=True, transform=None, pre_transform=None, pre_filter=None)
MNIST superpixels dataset from the “Geometric Deep Learning on Graphs and Manifolds Using Mixture Model CNNs” paper, containing 70,000 graphs with 75 nodes each. Every graph is labeled by one of 10 classes.
- Parameters
root (string) – Root directory where the dataset should be saved.
train (bool, optional) – If True, loads the training dataset, otherwise the test dataset. (default: True)
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
-
class
MixHopSyntheticDataset
(root, homophily, transform=None, pre_transform=None)[source]¶ The MixHop synthetic dataset from the “MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing” paper, containing 10 graphs, each with varying degree of homophily (ranging from 0.0 to 0.9). All graphs have 5,000 nodes, where each node corresponds to 1 out of 10 classes. The feature values of the nodes are sampled from a 2D Gaussian distribution, which are distinct for each class.
- Parameters
root (string) – Root directory where the dataset should be saved.
homophily (float) – The degree of homophily (one of
0.0
,0.1
, …,0.9
).transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)
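The text above does not define the degree of homophily. One common definition, assumed here, is the fraction of edges whose two endpoints share a class label. The helper name edge_homophily and the toy graph are illustrative only.

```python
from typing import List, Sequence, Tuple


def edge_homophily(edges: Sequence[Tuple[int, int]], labels: List[int]) -> float:
    """Fraction of edges that connect two nodes of the same class."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy path graph 0-1-2-3 with two classes (illustrative, not dataset values).
edges = [(0, 1), (1, 2), (2, 3)]
labels = [0, 0, 1, 1]
h = edge_homophily(edges, labels)
```

In this toy graph two of the three edges connect same-class nodes, so h is 2/3.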
-
class
ModelNet
(root, name='10', train=True, transform=None, pre_transform=None, pre_filter=None)[source]¶ The ModelNet10/40 datasets from the “3D ShapeNets: A Deep Representation for Volumetric Shapes” paper, containing CAD models of 10 and 40 categories, respectively.
Note
Data objects hold mesh faces instead of edge indices. To convert the mesh to a graph, use torch_geometric.transforms.FaceToEdge as pre_transform. To convert the mesh to a point cloud, use torch_geometric.transforms.SamplePoints as transform to sample a fixed number of points on the mesh faces according to their face area.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string, optional) – The name of the dataset (
"10"
for ModelNet10,"40"
for ModelNet40). (default:"10"
)train (bool, optional) – If
True
, loads the training dataset, otherwise the test dataset. (default:True
)transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
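The note above on turning mesh faces into graph edges can be illustrated without any library. This sketch only shows the idea behind such a conversion: every triangle contributes its three sides as edges in both directions. It is not the code of torch_geometric.transforms.FaceToEdge, and the function name faces_to_edges is hypothetical.

```python
from typing import Iterable, List, Tuple


def faces_to_edges(faces: Iterable[Tuple[int, int, int]]) -> List[Tuple[int, int]]:
    """Return the sorted, de-duplicated directed edges of a triangle mesh."""
    edges = set()
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            # Store both directions so the graph is undirected.
            edges.add((u, v))
            edges.add((v, u))
    return sorted(edges)


# Two triangles sharing the side (1, 2).
faces = [(0, 1, 2), (1, 2, 3)]
edges = faces_to_edges(faces)
```

The shared side is stored only once per direction, so the two triangles yield five undirected edges (ten directed pairs).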
-
class
MoleculeNet
(root, name, transform=None, pre_transform=None, pre_filter=None)[source]¶ The MoleculeNet benchmark collection from the “MoleculeNet: A Benchmark for Molecular Machine Learning” paper, containing datasets from physical chemistry, biophysics and physiology. All datasets come with the additional node and edge features introduced by the Open Graph Benchmark.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset (
"ESOL"
,"FreeSolv"
,"Lipo"
,"PCBA"
,"MUV"
,"HIV"
,"BACE"
,"BBBP"
,"Tox21"
,"ToxCast"
,"SIDER"
,"ClinTox"
).transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
-
class
NELL
(root, transform=None, pre_transform=None)[source]¶ The NELL dataset, a knowledge graph from the “Toward an Architecture for Never-Ending Language Learning” paper. The dataset is processed as in the “Revisiting Semi-Supervised Learning with Graph Embeddings” paper.
Note
Entity nodes are described by sparse feature vectors of type torch_sparse.SparseTensor, which can either be used directly or be converted via data.x.to_dense(), data.x.to_scipy() or data.x.to_torch_sparse_coo_tensor().
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)
-
class
PCPNetDataset
(root, category, split='train', transform=None, pre_transform=None, pre_filter=None)[source]¶ The PCPNet dataset from the “PCPNet: Learning Local Shape Properties from Raw Point Clouds” paper, consisting of 30 shapes, each given as a point cloud, densely sampled with 100k points. For each shape, surface normals and local curvatures are given as node features.
- Parameters
root (string) – Root directory where the dataset should be saved.
category (string) – The training set category (one of
"NoNoise"
,"Noisy"
,"VarDensity"
,"NoisyAndVarDensity"
forsplit="train"
orsplit="val"
, or one of"All"
,"LowNoise"
,"MedNoise"
,"HighNoise"
,"VarDensityStriped"
,"VarDensityGradient"
forsplit="test"
).split (string) – If
"train"
, loads the training dataset. If"val"
, loads the validation dataset. If"test"
, loads the test dataset. (default:"train"
)transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
-
class
PPI
(root, split='train', transform=None, pre_transform=None, pre_filter=None)[source]¶ The protein-protein interaction networks from the “Predicting Multicellular Function through Multi-layer Tissue Networks” paper, containing positional gene sets, motif gene sets and immunological signatures as features (50 in total) and gene ontology sets as labels (121 in total).
- Parameters
root (string) – Root directory where the dataset should be saved.
split (string) – If
"train"
, loads the training dataset. If"val"
, loads the validation dataset. If"test"
, loads the test dataset. (default:"train"
)transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
-
class
PascalPF
(root, category, transform=None, pre_transform=None, pre_filter=None)[source]¶ The Pascal-PF dataset from the “Proposal Flow” paper, containing 4 to 16 keypoints per example over 20 categories.
- Parameters
root (string) – Root directory where the dataset should be saved.
category (string) – The category of the images (one of
"Aeroplane"
,"Bicycle"
,"Bird"
,"Boat"
,"Bottle"
,"Bus"
,"Car"
,"Cat"
,"Chair"
,"Diningtable"
,"Dog"
,"Horse"
,"Motorbike"
,"Person"
,"Pottedplant"
,"Sheep"
,"Sofa"
,"Train"
,"TVMonitor"
)transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
-
class
PascalVOCKeypoints
(root, category, train=True, transform=None, pre_transform=None, pre_filter=None)[source]¶ The Pascal VOC 2011 dataset with Berkeley annotations of keypoints from the “Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations” paper, containing 0 to 23 keypoints per example over 20 categories. The dataset is pre-filtered to exclude difficult, occluded and truncated objects. The keypoints contain interpolated features from a pre-trained VGG16 model on ImageNet (relu4_2 and relu5_1).
- Parameters
root (string) – Root directory where the dataset should be saved.
category (string) – The category of the images (one of
"Aeroplane"
,"Bicycle"
,"Bird"
,"Boat"
,"Bottle"
,"Bus"
,"Car"
,"Cat"
,"Chair"
,"Diningtable"
,"Dog"
,"Horse"
,"Motorbike"
,"Person"
,"Pottedplant"
,"Sheep"
,"Sofa"
,"Train"
,"TVMonitor"
)train (bool, optional) – If
True
, loads the training dataset, otherwise the test dataset. (default:True
)transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
-
class
Planetoid
(root: str, name: str, split: str = 'public', num_train_per_class: int = 20, num_val: int = 500, num_test: int = 1000, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None)[source]¶ The citation network datasets “Cora”, “CiteSeer” and “PubMed” from the “Revisiting Semi-Supervised Learning with Graph Embeddings” paper. Nodes represent documents and edges represent citation links. Training, validation and test splits are given by binary masks.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset (
"Cora"
,"CiteSeer"
,"PubMed"
).split (string) –
The type of dataset split (
"public"
,"full"
,"random"
). If set to"public"
, the split will be the public fixed split from the “Revisiting Semi-Supervised Learning with Graph Embeddings” paper. If set to"full"
, all nodes except those in the validation and test sets will be used for training (as in the “FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling” paper). If set to"random"
, train, validation, and test sets will be randomly generated, according tonum_train_per_class
,num_val
andnum_test
. (default:"public"
)num_train_per_class (int, optional) – The number of training samples per class in case of
"random"
split. (default:20
)num_val (int, optional) – The number of validation samples in case of
"random"
split. (default:500
)num_test (int, optional) – The number of test samples in case of
"random"
split. (default:1000
)transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)
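The "random" split described above can be sketched in plain Python: draw num_train_per_class nodes per class for training, then take num_val and num_test nodes from the remaining ones. The function name random_split, the seed argument and the list-based masks are assumptions for illustration; the library's own sampling procedure may differ in detail.

```python
import random
from typing import List, Tuple


def random_split(labels: List[int], num_train_per_class: int = 20,
                 num_val: int = 500, num_test: int = 1000,
                 seed: int = 0) -> Tuple[List[bool], List[bool], List[bool]]:
    """Build boolean train/val/test masks for a list of node labels."""
    rng = random.Random(seed)
    n = len(labels)

    # Training nodes: a fixed number of random nodes per class.
    train_mask = [False] * n
    for c in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == c]
        rng.shuffle(idx)
        for i in idx[:num_train_per_class]:
            train_mask[i] = True

    # Validation and test nodes: disjoint slices of the shuffled remainder.
    remaining = [i for i in range(n) if not train_mask[i]]
    rng.shuffle(remaining)
    val_idx = set(remaining[:num_val])
    test_idx = set(remaining[num_val:num_val + num_test])
    val_mask = [i in val_idx for i in range(n)]
    test_mask = [i in test_idx for i in range(n)]
    return train_mask, val_mask, test_mask


# Toy labels: 60 nodes spread evenly over 3 classes.
labels = [i % 3 for i in range(60)]
train_mask, val_mask, test_mask = random_split(
    labels, num_train_per_class=2, num_val=10, num_test=20)
```

Because validation and test nodes are drawn from the nodes left over after the training draw, the three masks never overlap.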
-
class
QM7b
(root, transform=None, pre_transform=None, pre_filter=None)[source]¶ The QM7b dataset from the “MoleculeNet: A Benchmark for Molecular Machine Learning” paper, consisting of 7,211 molecules with 14 regression targets.
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
-
class
QM9
(root: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None)[source]¶ The QM9 dataset from the “MoleculeNet: A Benchmark for Molecular Machine Learning” paper, consisting of about 130,000 molecules with 19 regression targets. Each molecule includes complete spatial information for the single low energy conformation of the atoms in the molecule. In addition, we provide the atom features from the “Neural Message Passing for Quantum Chemistry” paper.
Target | Property | Description | Unit
0 | \(\mu\) | Dipole moment | \(\textrm{D}\)
1 | \(\alpha\) | Isotropic polarizability | \({a_0}^3\)
2 | \(\epsilon_{\textrm{HOMO}}\) | Highest occupied molecular orbital energy | \(\textrm{eV}\)
3 | \(\epsilon_{\textrm{LUMO}}\) | Lowest unoccupied molecular orbital energy | \(\textrm{eV}\)
4 | \(\Delta \epsilon\) | Gap between \(\epsilon_{\textrm{HOMO}}\) and \(\epsilon_{\textrm{LUMO}}\) | \(\textrm{eV}\)
5 | \(\langle R^2 \rangle\) | Electronic spatial extent | \({a_0}^2\)
6 | \(\textrm{ZPVE}\) | Zero point vibrational energy | \(\textrm{eV}\)
7 | \(U_0\) | Internal energy at 0K | \(\textrm{eV}\)
8 | \(U\) | Internal energy at 298.15K | \(\textrm{eV}\)
9 | \(H\) | Enthalpy at 298.15K | \(\textrm{eV}\)
10 | \(G\) | Free energy at 298.15K | \(\textrm{eV}\)
11 | \(c_{\textrm{v}}\) | Heat capacity at 298.15K | \(\frac{\textrm{cal}}{\textrm{mol K}}\)
12 | \(U_0^{\textrm{ATOM}}\) | Atomization energy at 0K | \(\textrm{eV}\)
13 | \(U^{\textrm{ATOM}}\) | Atomization energy at 298.15K | \(\textrm{eV}\)
14 | \(H^{\textrm{ATOM}}\) | Atomization enthalpy at 298.15K | \(\textrm{eV}\)
15 | \(G^{\textrm{ATOM}}\) | Atomization free energy at 298.15K | \(\textrm{eV}\)
16 | \(A\) | Rotational constant | \(\textrm{GHz}\)
17 | \(B\) | Rotational constant | \(\textrm{GHz}\)
18 | \(C\) | Rotational constant | \(\textrm{GHz}\)
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
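The target indices listed for QM9 can be used to pick single regression targets out of the 19-entry target vector. The sketch below uses a plain list with placeholder numbers (not real QM9 values) and hypothetical helper names. It also shows that target 4 is the difference between targets 3 and 2.

```python
from typing import List

# Indices taken from the target table above.
HOMO, LUMO, GAP = 2, 3, 4
NUM_TARGETS = 19


def select_target(y: List[float], index: int) -> float:
    """Pick one regression target out of the 19-entry target vector."""
    if len(y) != NUM_TARGETS:
        raise ValueError("expected 19 targets")
    return y[index]


def homo_lumo_gap(y: List[float]) -> float:
    """Gap between the LUMO and HOMO energies (both in eV)."""
    return select_target(y, LUMO) - select_target(y, HOMO)


# Placeholder numbers for illustration only; NOT real QM9 values.
y = [0.0] * NUM_TARGETS
y[HOMO] = -7.0
y[LUMO] = 1.0
y[GAP] = homo_lumo_gap(y)
```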
-
class
Reddit
(root, transform=None, pre_transform=None)[source]¶ The Reddit dataset from the “Inductive Representation Learning on Large Graphs” paper, containing Reddit posts belonging to different communities.
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)
-
class
Reddit2
(root, transform=None, pre_transform=None)[source]¶ The Reddit dataset from the “GraphSAINT: Graph Sampling Based Inductive Learning Method” paper, containing Reddit posts belonging to different communities.
Note
This is a sparser version of the original Reddit dataset (~23M edges instead of ~114M edges), and is used in papers such as SGC and GraphSAINT.
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)
-
class
S3DIS
(root, test_area=6, train=True, transform=None, pre_transform=None, pre_filter=None)[source]¶ The (pre-processed) Stanford Large-Scale 3D Indoor Spaces dataset from the “3D Semantic Parsing of Large-Scale Indoor Spaces” paper, containing point clouds of six large-scale indoor parts in three buildings with 12 semantic elements (and one clutter class).
- Parameters
root (string) – Root directory where the dataset should be saved.
test_area (int, optional) – Which area to use for testing (1-6). (default:
6
)train (bool, optional) – If
True
, loads the training dataset, otherwise the test dataset. (default:True
)transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
-
class
SHREC2016
(root, partiality, category, train=True, transform=None, pre_transform=None, pre_filter=None)[source]¶ The SHREC 2016 partial matching dataset from the “SHREC’16: Partial Matching of Deformable Shapes” paper. The reference shape can be accessed via dataset.ref.
Note
Data objects hold mesh faces instead of edge indices. To convert the mesh to a graph, use torch_geometric.transforms.FaceToEdge as pre_transform. To convert the mesh to a point cloud, use torch_geometric.transforms.SamplePoints as transform to sample a fixed number of points on the mesh faces according to their face area.
- Parameters
root (string) – Root directory where the dataset should be saved.
partiality (string) – The partiality of the dataset (one of
"Holes"
,"Cuts"
).category (string) – The category of the dataset (one of
"Cat"
,"Centaur"
,"David"
,"Dog"
,"Horse"
,"Michael"
,"Victoria"
,"Wolf"
).train (bool, optional) – If
True
, loads the training dataset, otherwise the test dataset. (default:True
)transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
-
class
SNAPDataset
(root, name, transform=None, pre_transform=None, pre_filter=None)[source]¶ A variety of graph datasets collected from SNAP at Stanford University.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset.
transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
-
class
ShapeNet
(root, categories=None, include_normals=True, split='trainval', transform=None, pre_transform=None, pre_filter=None)[source]¶ The ShapeNet part level segmentation dataset from the “A Scalable Active Framework for Region Annotation in 3D Shape Collections” paper, containing about 17,000 3D shape point clouds from 16 shape categories. Each category is annotated with 2 to 6 parts.
- Parameters
root (string) – Root directory where the dataset should be saved.
categories (string or [string], optional) – The category of the CAD models (one or a combination of
"Airplane"
,"Bag"
,"Cap"
,"Car"
,"Chair"
,"Earphone"
,"Guitar"
,"Knife"
,"Lamp"
,"Laptop"
,"Motorbike"
,"Mug"
,"Pistol"
,"Rocket"
,"Skateboard"
,"Table"
). Can be explicitly set toNone
to load all categories. (default:None
)include_normals (bool, optional) – If set to
False
, will not include normal vectors as input features. (default:True
)split (string, optional) – If
"train"
, loads the training dataset. If"val"
, loads the validation dataset. If"trainval"
, loads the training and validation dataset. If"test"
, loads the test dataset. (default:"trainval"
)transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
-
class
SuiteSparseMatrixCollection
(root, group, name, transform=None, pre_transform=None)[source]¶ A suite of sparse matrix benchmarks known as the Suite Sparse Matrix Collection collected from a wide range of applications.
- Parameters
root (string) – Root directory where the dataset should be saved.
group (string) – The group of the sparse matrix.
name (string) – The name of the sparse matrix.
transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)
-
class
TOSCA
(root, categories=None, transform=None, pre_transform=None, pre_filter=None)[source]¶ The TOSCA dataset from the “Numerical Geometry of Non-Rigid Shapes” book, containing 80 meshes. Meshes within the same category have the same triangulation and an equal number of vertices numbered in a compatible way.
Note
Data objects hold mesh faces instead of edge indices. To convert the mesh to a graph, use torch_geometric.transforms.FaceToEdge as pre_transform. To convert the mesh to a point cloud, use torch_geometric.transforms.SamplePoints as transform to sample a fixed number of points on the mesh faces according to their face area.
- Parameters
root (string) – Root directory where the dataset should be saved.
categories (list, optional) – List of categories to include in the dataset. Can include the categories
"Cat"
,"Centaur"
,"David"
,"Dog"
,"Gorilla"
,"Horse"
,"Michael"
,"Victoria"
,"Wolf"
. If set toNone
, the dataset will contain all categories. (default:None
)transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
-
class
TUDataset
(root: str, name: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None, use_node_attr: bool = False, use_edge_attr: bool = False, cleaned: bool = False)[source]¶ A variety of graph kernel benchmark datasets, e.g. “IMDB-BINARY”, “REDDIT-BINARY” or “PROTEINS”, collected from the TU Dortmund University. In addition, this dataset wrapper provides cleaned dataset versions as motivated by the “Understanding Isomorphism Bias in Graph Data Sets” paper, containing only non-isomorphic graphs.
Note
Some datasets may not come with any node labels. You can then either make use of the argument use_node_attr to load additional continuous node attributes (if present) or provide synthetic node features using transforms such as torch_geometric.transforms.Constant or torch_geometric.transforms.OneHotDegree.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset.
transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)use_node_attr (bool, optional) – If
True
, the dataset will contain additional continuous node attributes (if present). (default:False
)use_edge_attr (bool, optional) – If
True
, the dataset will contain additional continuous edge attributes (if present). (default:False
)cleaned (bool, optional) – If
True
, the dataset will contain only non-isomorphic graphs. (default:False
)
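When a dataset has no node labels, the note above suggests synthetic node features. The idea behind a one-hot degree encoding can be sketched in plain Python as follows. This is an illustration under stated assumptions (each undirected edge is listed once, and the helper name one_hot_degree is hypothetical), not the implementation of torch_geometric.transforms.OneHotDegree.

```python
from typing import List, Sequence, Tuple


def one_hot_degree(num_nodes: int, edges: Sequence[Tuple[int, int]],
                   max_degree: int) -> List[List[float]]:
    """One-hot encode each node's degree as a (max_degree + 1)-dim vector."""
    degree = [0] * num_nodes
    for u, v in edges:  # assumption: every undirected edge appears once
        degree[u] += 1
        degree[v] += 1

    features = []
    for d in degree:
        if d > max_degree:
            raise ValueError("node degree exceeds max_degree")
        row = [0.0] * (max_degree + 1)
        row[d] = 1.0
        features.append(row)
    return features


# Path graph 0 - 1 - 2: the end nodes have degree 1, the middle node degree 2.
x = one_hot_degree(num_nodes=3, edges=[(0, 1), (1, 2)], max_degree=2)
```

A constant feature, the other option named in the note, would instead assign the same value (for example 1.0) to every node.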
-
class
TrackMLParticleTrackingDataset
(root, transform=None)[source]¶ The TrackML Particle Tracking Challenge dataset to reconstruct particle tracks from 3D points left in the silicon detectors.
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)
-
class
UPFD
(root, name, feature, split='train', transform=None, pre_transform=None, pre_filter=None)[source]¶ The tree-structured fake news propagation graph classification dataset from the “User Preference-aware Fake News Detection” paper. It includes two sets of tree-structured fake & real news propagation graphs extracted from Twitter. For a single graph, the root node represents the source news, and leaf nodes represent Twitter users who retweeted the same root news. A user node has an edge to the news node if and only if the user retweeted the root news directly. Two user nodes have an edge if and only if one user retweeted the root news from the other user. Four different node features are encoded using different encoders. Please refer to GNN-FakeNews repo for more details.
Note
For an example of using UPFD, see examples/upfd.py.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the graph set (
"politifact"
,"gossipcop"
).feature (string) – The node feature type (
"profile"
,"spacy"
,"bert"
,"content"
). If set to"profile"
, the 10-dimensional node feature is composed of ten Twitter user profile attributes. If set to"spacy"
, the 300-dimensional node feature is composed of Twitter user historical tweets encoded by the spaCy word2vec encoder. If set to"bert"
, the 768-dimensional node feature is composed of Twitter user historical tweets encoded by the bert-as-service. If set to"content"
, the 310-dimensional node feature is composed of a 300-dimensional “spacy” vector plus a 10-dimensional “profile” vector.split (string, optional) – If
"train"
, loads the training dataset. If"val"
, loads the validation dataset. If"test"
, loads the test dataset. (default:"train"
)transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
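The "content" feature described above combines the other two feature types. A minimal sketch of that combination follows. It assumes the 300-dimensional "spacy" vector comes first and the 10-dimensional "profile" vector second (the order is not stated above), and it uses placeholder values rather than real features.

```python
from typing import List

SPACY_DIM = 300    # dimensionality of the "spacy" feature
PROFILE_DIM = 10   # dimensionality of the "profile" feature


def content_feature(spacy_vec: List[float], profile_vec: List[float]) -> List[float]:
    """Concatenate a "spacy" and a "profile" vector into a "content" vector."""
    if len(spacy_vec) != SPACY_DIM or len(profile_vec) != PROFILE_DIM:
        raise ValueError("unexpected feature dimensionality")
    # Assumed order: spacy part first, profile part second.
    return list(spacy_vec) + list(profile_vec)


# Placeholder vectors for illustration only.
x = content_feature([0.0] * SPACY_DIM, [1.0] * PROFILE_DIM)
```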
-
class
WILLOWObjectClass
(root, category, transform=None, pre_transform=None, pre_filter=None)[source]¶ The WILLOW-ObjectClass dataset from the “Learning Graphs to Match” paper, containing 10 equal keypoints of at least 40 images in each category. The keypoints contain interpolated features from a pre-trained VGG16 model on ImageNet (relu4_2 and relu5_1).
- Parameters
root (string) – Root directory where the dataset should be saved.
category (string) – The category of the images (one of
"Car"
,"Duck"
,"Face"
,"Motorbike"
,"Winebottle"
).transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before every access. (default:None
)pre_transform (callable, optional) – A function/transform that takes in an
torch_geometric.data.Data
object and returns a transformed version. The data object will be transformed before being saved to disk. (default:None
)pre_filter (callable, optional) – A function that takes in an
torch_geometric.data.Data
object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default:None
)
-
class
WebKB
(root, name, transform=None, pre_transform=None)[source]¶ The WebKB datasets used in the “Geom-GCN: Geometric Graph Convolutional Networks” paper. Nodes represent web pages and edges represent hyperlinks between them. Node features are the bag-of-words representation of web pages. The task is to classify the nodes into one of five categories: student, project, course, staff, and faculty.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset ("Cornell", "Texas", "Wisconsin").
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
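A minimal usage sketch for the signature above. `load_webkb` and `WEBKB_NAMES` are hypothetical helpers, not part of the library; the helper accepts only the spellings listed in the parameter description, which may be stricter than the class itself. The import is deferred so the name check can be used without the package installed.

```python
WEBKB_NAMES = ("Cornell", "Texas", "Wisconsin")  # names listed in the parameter description


def load_webkb(root, name, transform=None, pre_transform=None):
    """Validate `name`, then construct the dataset (hypothetical helper)."""
    if name not in WEBKB_NAMES:
        raise ValueError(
            f"Unknown WebKB dataset {name!r}; expected one of {WEBKB_NAMES}"
        )
    # Imported lazily so the name check works without torch_geometric installed.
    from torch_geometric.datasets import WebKB

    return WebKB(root, name, transform=transform, pre_transform=pre_transform)
```

A call such as `load_webkb("data/WebKB", "Cornell")` would then build the dataset under the given root directory, assuming `torch_geometric` is available.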
-
class WikiCS(root, transform=None, pre_transform=None)
The semi-supervised Wikipedia-based dataset from the “Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks” paper, containing 11,701 nodes, 216,123 edges, 10 classes and 20 different training splits.
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
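With several training splits, one natural workflow is to evaluate on each split and average the results. The sketch below uses plain Python lists and assumes the split masks are laid out as one row per node and one column per split; that layout, the helper names, and the toy data are assumptions for illustration, not a description of how the dataset object stores its masks.

```python
from statistics import mean


def split_column(mask_rows, split_index):
    """Return the boolean mask for one split from a [num_nodes][num_splits] table."""
    return [row[split_index] for row in mask_rows]


def mean_score_over_splits(mask_rows, score_fn):
    """Average a user-supplied score over every split column."""
    num_splits = len(mask_rows[0])
    return mean(score_fn(split_column(mask_rows, s)) for s in range(num_splits))


# Toy table: 4 nodes, 2 splits (the dataset above is described as having 20).
toy_masks = [
    [True, False],
    [True, True],
    [False, True],
    [False, False],
]

# Placeholder "score": the fraction of nodes selected for training in a split.
train_fraction = mean_score_over_splits(toy_masks, lambda m: sum(m) / len(m))
```

In practice `score_fn` would train and evaluate a model on the given split; the placeholder here only keeps the example self-contained.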
-
class WikipediaNetwork(root, name, transform=None, pre_transform=None)
The Wikipedia networks used in the “Geom-GCN: Geometric Graph Convolutional Networks” paper. Nodes represent web pages and edges represent hyperlinks between them. Node features represent several informative nouns in the Wikipedia pages. The task is to classify the nodes into five categories in terms of the average monthly traffic of the web page.
- Parameters
root (string) – Root directory where the dataset should be saved.
name (string) – The name of the dataset ("Chameleon", "Squirrel").
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
-
class WordNet18(root, transform=None, pre_transform=None)
The WordNet18 dataset from the “Translating Embeddings for Modeling Multi-Relational Data” paper, containing 40,943 entities, 18 relations and 151,442 fact triplets, e.g., furniture includes bed.
Note
The original WordNet18 dataset suffers from test leakage, i.e., more than 80% of test triplets can be found in the training set with another relation type. Therefore, it should no longer be used for research evaluation. We recommend using its cleaned version, WordNet18RR, instead.
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
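The test leakage described in the note can be illustrated on toy data. The sketch below counts a test triplet as leaked when the same entity pair, in either direction, appears in the training set under a different relation. That reading of "found in the training set with another relation type", the function name, and the relation names are assumptions for illustration; the triplets are made up and are not taken from the dataset.

```python
def leaked_fraction(train_triplets, test_triplets):
    """Fraction of test triplets whose entity pair occurs in training
    under a different relation (direction ignored)."""
    # Map each unordered entity pair to the relations seen in training.
    seen = {}
    for head, rel, tail in train_triplets:
        seen.setdefault(frozenset((head, tail)), set()).add(rel)

    if not test_triplets:
        return 0.0

    leaked = 0
    for head, rel, tail in test_triplets:
        other_relations = seen.get(frozenset((head, tail)), set()) - {rel}
        if other_relations:
            leaked += 1
    return leaked / len(test_triplets)


# Made-up triplets with hypothetical relation names.
train = [
    ("bed", "hyponym", "furniture"),
    ("dog", "hypernym", "poodle"),
    ("cat", "hyponym", "animal"),
]
test = [
    ("furniture", "hypernym", "bed"),   # inverse of a training fact: leaked
    ("poodle", "hyponym", "dog"),       # inverse of a training fact: leaked
    ("animal", "member_of", "kingdom"), # entity pair unseen in training
    ("cat", "hyponym", "animal"),       # same relation as in training, not counted
]
ratio = leaked_fraction(train, test)
```

A model can answer leaked test triplets by memorising inverse relations, which is why a high ratio makes the benchmark uninformative.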
-
class WordNet18RR(root, transform=None, pre_transform=None)
The WordNet18RR dataset from the “Convolutional 2D Knowledge Graph Embeddings” paper, containing 40,943 entities, 11 relations and 93,003 fact triplets.
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
-
class Yelp(root, transform=None, pre_transform=None)
The Yelp dataset from the “GraphSAINT: Graph Sampling Based Inductive Learning Method” paper, containing customer reviewers and their friendships.
- Parameters
root (string) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
-
class ZINC(root, subset=False, split='train', transform=None, pre_transform=None, pre_filter=None)
The ZINC dataset from the ZINC database and the “Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules” paper, containing about 250,000 molecular graphs with up to 38 heavy atoms. The task is to regress a synthetically computed property called the constrained solubility.
- Parameters
root (string) – Root directory where the dataset should be saved.
subset (boolean, optional) – If set to True, will only load a subset of the dataset (12,000 molecular graphs), following the “Benchmarking Graph Neural Networks” paper. (default: False)
split (string, optional) – If "train", loads the training dataset. If "val", loads the validation dataset. If "test", loads the test dataset. (default: "train")
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
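A minimal usage sketch for the `split` and `subset` parameters. `check_split`, `load_zinc`, and `ZINC_SPLITS` are hypothetical helpers, not part of the library; they only wrap the constructor signature documented above. The import is deferred so the validation can be used without the package installed.

```python
ZINC_SPLITS = ("train", "val", "test")  # values listed for the `split` parameter


def check_split(split):
    """Reject split names other than the three documented ones."""
    if split not in ZINC_SPLITS:
        raise ValueError(f"Unknown split {split!r}; expected one of {ZINC_SPLITS}")
    return split


def load_zinc(root, split="train", subset=False):
    """Construct one split of the dataset (hypothetical helper)."""
    # Imported lazily so check_split works without torch_geometric installed.
    from torch_geometric.datasets import ZINC

    return ZINC(root, subset=subset, split=check_split(split))
```

Calling `load_zinc("data/ZINC", split="val", subset=True)` would then request the validation portion of the smaller subset, assuming `torch_geometric` is available.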