torch_geometric.datasets.HydroNet
- class HydroNet(root: str, name: Optional[str] = None, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, force_reload: bool = False, num_workers: int = 8, clusters: Optional[Union[int, List[int]]] = None, use_processed: bool = True)
Bases: InMemoryDataset
The HydroNet dataset from the “HydroNet: Benchmark Tasks for Preserving Intermolecular Interactions and Structural Motifs in Predictive and Generative Models for Molecular Data” paper, consisting of 5 million water clusters held together by hydrogen bonding networks. This dataset provides the atomic coordinates and the total energy in kcal/mol of each cluster.
- Parameters:
root (str) – Root directory where the dataset should be saved.
name (str, optional) – Name of the subset of the full dataset to use: "small" uses 500k graphs sampled from the "medium" dataset, "medium" uses 2.7m graphs with a maximum size of 75 nodes. Mutually exclusive with the clusters argument. (default: None)
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
num_workers (int) – Number of multiprocessing workers to use for pre-processing the dataset. (default: 8)
clusters (int or List[int], optional) – Select a subset of clusters from the full dataset. If set to None, will select all. (default: None)
use_processed (bool) – Option to use a pre-processed version of the original xyz dataset. (default: True)
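The following is a minimal sketch of how the parameters above might be combined, not part of the library itself. The helper names hydronet_kwargs and load_hydronet and the path "data/HydroNet" are hypothetical. The check on name and clusters only mirrors the documented mutual exclusivity and is not the library's own validation. The import is deferred because constructing the dataset for the first time downloads and processes data under root.

```python
from typing import Any, Dict, List, Optional, Union


def hydronet_kwargs(
    root: str,
    name: Optional[str] = None,
    clusters: Optional[Union[int, List[int]]] = None,
    **extra: Any,
) -> Dict[str, Any]:
    """Assemble keyword arguments for HydroNet (hypothetical helper).

    Enforces the documented rule that `name` and `clusters` are mutually
    exclusive. This is an illustration, not the library's own check.
    """
    if name is not None and clusters is not None:
        raise ValueError("Pass either `name` or `clusters`, not both.")
    kwargs: Dict[str, Any] = {"root": root, **extra}
    if name is not None:
        kwargs["name"] = name
    if clusters is not None:
        kwargs["clusters"] = clusters
    return kwargs


def load_hydronet(**kwargs: Any):
    """Instantiate the dataset with the given keyword arguments.

    The import is deferred so that this module can be loaded without
    torch_geometric installed; the first call may download a large archive.
    """
    from torch_geometric.datasets import HydroNet

    return HydroNet(**kwargs)


# Example call (left commented out because it triggers a download):
# dataset = load_hydronet(**hydronet_kwargs("data/HydroNet", name="small"))
```

Passing name="small" selects the 500k-graph subset, while clusters=[3, 4] would instead select specific clusters from the full dataset; supplying both raises a ValueError in this sketch.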