torch_geometric.datasets.TUDataset
class TUDataset(root: str, name: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None, force_reload: bool = False, use_node_attr: bool = False, use_edge_attr: bool = False, cleaned: bool = False)
Bases: InMemoryDataset
A variety of graph kernel benchmark datasets, e.g., "IMDB-BINARY", "REDDIT-BINARY" or "PROTEINS", collected from the TU Dortmund University. In addition, this dataset wrapper provides cleaned dataset versions as motivated by the “Understanding Isomorphism Bias in Graph Data Sets” paper, containing only non-isomorphic graphs.
Note
Some datasets may not come with any node labels. You can then either make use of the argument use_node_attr to load additional continuous node attributes (if present) or provide synthetic node features using transforms such as torch_geometric.transforms.Constant or torch_geometric.transforms.OneHotDegree.
Parameters:
root (str) – Root directory where the dataset should be saved.
name (str) – The name of the dataset.
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
use_node_attr (bool, optional) – If True, the dataset will contain additional continuous node attributes (if present). (default: False)
use_edge_attr (bool, optional) – If True, the dataset will contain additional continuous edge attributes (if present). (default: False)
cleaned (bool, optional) – If True, the dataset will contain only non-isomorphic graphs. (default: False)
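The parameters above can be combined roughly as in the following minimal sketch. The helper names load_tu_dataset and load_with_constant_features and the data/TUDataset root path are illustrative assumptions, not part of the library; only TUDataset and Constant come from the documented API. The imports are deferred into the function bodies so that defining the helpers does not require PyTorch Geometric, although calling them does.

```python
from typing import Any, Callable, Optional


def load_tu_dataset(
    name: str,
    root: str = "data/TUDataset",  # hypothetical path; any writable directory works
    transform: Optional[Callable[[Any], Any]] = None,
    use_node_attr: bool = False,
    cleaned: bool = False,
):
    """Hypothetical convenience wrapper around TUDataset (not part of PyG)."""
    # Deferred import: PyTorch Geometric is only needed when the helper is called.
    from torch_geometric.datasets import TUDataset

    return TUDataset(
        root=root,
        name=name,
        transform=transform,
        use_node_attr=use_node_attr,
        cleaned=cleaned,
    )


def load_with_constant_features(name: str, root: str = "data/TUDataset"):
    """Load a dataset and attach a constant synthetic feature to every node."""
    from torch_geometric.transforms import Constant

    # Passed as `transform`, so it is applied each time a graph is accessed.
    return load_tu_dataset(name, root=root, transform=Constant())
```

Calling load_with_constant_features("IMDB-BINARY") should fetch the dataset into the root directory on first use and return graphs that each carry a synthetic node feature, which is one way to handle datasets that ship without node features.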
STATS:
Name            #graphs   #nodes   #edges    #features   #classes
MUTAG           188       ~17.9    ~39.6     7           2
ENZYMES         600       ~32.6    ~124.3    3           6
PROTEINS        1,113     ~39.1    ~145.6    3           2
COLLAB          5,000     ~74.5    ~4914.4   0           3
IMDB-BINARY     1,000     ~19.8    ~193.1    0           2
REDDIT-BINARY   2,000     ~429.6   ~995.5    0           2
…
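As a rough illustration of how per-dataset averages like those in the table could be computed, the sketch below averages node and edge counts over a collection of graphs. The function name summarize and the toy stand-in objects are assumptions made for this example; the only requirement is that each graph exposes num_nodes and num_edges attributes.

```python
from types import SimpleNamespace
from typing import Iterable, Tuple


def summarize(graphs: Iterable) -> Tuple[int, float, float]:
    """Return (#graphs, mean #nodes, mean #edges) for the given graphs."""
    count = 0
    total_nodes = 0
    total_edges = 0
    for graph in graphs:
        count += 1
        total_nodes += graph.num_nodes
        total_edges += graph.num_edges
    if count == 0:
        raise ValueError("cannot summarize an empty collection of graphs")
    return count, total_nodes / count, total_edges / count


# Stand-in objects so the sketch runs without downloading any dataset.
toy_graphs = [
    SimpleNamespace(num_nodes=3, num_edges=4),
    SimpleNamespace(num_nodes=5, num_edges=8),
]
print(summarize(toy_graphs))  # (2, 4.0, 6.0)
```

With PyTorch Geometric installed, a loaded TUDataset instance could be passed in place of toy_graphs, since its Data objects expose num_nodes and num_edges; the dataset's num_features and num_classes properties would then supply the remaining two columns.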