torch_geometric.datasets.MoleculeNet

class MoleculeNet(root: str, name: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None, force_reload: bool = False)

Bases: InMemoryDataset

The MoleculeNet benchmark collection from the “MoleculeNet: A Benchmark for Molecular Machine Learning” paper, containing datasets from physical chemistry, biophysics and physiology. All datasets come with the additional node and edge features introduced by the Open Graph Benchmark.

Parameters:
  • root (str) – Root directory where the dataset should be saved.

  • name (str) – The name of the dataset ("ESOL", "FreeSolv", "Lipo", "PCBA", "MUV", "HIV", "BACE", "BBBP", "Tox21", "ToxCast", "SIDER", "ClinTox").

  • transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)

  • force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
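The parameters above can be combined as in the following minimal sketch. The helper names keep_small_graphs and load_esol, the max_nodes threshold, and the "data/MoleculeNet" root directory are illustrative assumptions, not part of the library. The filter reads only the num_nodes attribute, so it is exercised here on a stand-in object rather than on a real Data instance.

```python
from types import SimpleNamespace


def keep_small_graphs(data, max_nodes=50):
    """Hypothetical pre_filter: keep molecules with at most `max_nodes` atoms."""
    return data.num_nodes <= max_nodes


def load_esol(root="data/MoleculeNet", force_reload=False):
    """Load ESOL, dropping large molecules when the dataset is processed.

    `root` is an assumed location; any writable directory works.
    """
    # Imported lazily so the filter can be tried without PyG installed.
    from torch_geometric.datasets import MoleculeNet

    return MoleculeNet(
        root=root,
        name="ESOL",
        pre_filter=keep_small_graphs,
        force_reload=force_reload,
    )


# Stand-in with only the attribute the filter reads; real inputs are
# torch_geometric.data.Data objects.
toy = SimpleNamespace(num_nodes=12)
print(keep_small_graphs(toy))  # True
```

Calling load_esol() downloads and processes the dataset on first use, so it needs network access and the library's molecule-parsing dependencies. Because pre_filter is applied when the dataset is processed, changing the filter later will likely require force_reload=True to take effect.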

STATS:

Name            #graphs   #nodes   #edges   #features   #classes
ESOL            1,128     ~13.3    ~27.4    9           1
FreeSolv        642       ~8.7     ~16.8    9           1
Lipophilicity   4,200     ~27.0    ~59.0    9           1
PCBA            437,929   ~26.0    ~56.2    9           128
MUV             93,087    ~24.2    ~52.6    9           17
HIV             41,127    ~25.5    ~54.9    9           1
BACE            1,513     ~34.1    ~73.7    9           1
BBBP            2,050     ~23.9    ~51.6    9           1
Tox21           7,831     ~18.6    ~38.6    9           12
ToxCast         8,597     ~18.7    ~38.4    9           617
SIDER           1,427     ~33.6    ~70.7    9           27
ClinTox         1,484     ~26.1    ~55.5    9           2
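Assuming the "~" entries in the statistics above are per-graph averages, similar figures could be computed from a loaded dataset with a sketch like the one below. The summarize helper is hypothetical, not a library function. It relies only on the num_nodes and num_edges attributes, so it is shown on stand-in objects; passing a MoleculeNet instance should work the same way, since iterating a dataset yields Data objects.

```python
from types import SimpleNamespace


def summarize(graphs):
    """Return (#graphs, mean #nodes, mean #edges) in a single pass."""
    count = total_nodes = total_edges = 0
    for g in graphs:
        count += 1
        total_nodes += g.num_nodes
        total_edges += g.num_edges
    if count == 0:
        return 0, 0.0, 0.0
    return count, total_nodes / count, total_edges / count


# Stand-ins for Data objects; only the two attributes read above are set.
toy = [
    SimpleNamespace(num_nodes=3, num_edges=4),
    SimpleNamespace(num_nodes=5, num_edges=8),
]
print(summarize(toy))  # (2, 4.0, 6.0)
```

The edge averages are roughly twice the node averages, which is consistent with each bond being stored as two directed edges, though that reading is an inference from the numbers rather than something this page states.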