torch_geometric.datasets.MoleculeGPTDataset
- class MoleculeGPTDataset(root: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None, force_reload: bool = False, total_page_num: int = 10, total_block_num: int = 1)
Bases: InMemoryDataset
The dataset from the “MoleculeGPT: Instruction Following Large Language Models for Molecular Property Prediction” paper.
- Parameters:
root (str) – Root directory where the dataset should be saved.
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
total_page_num (int, optional) – The number of pages from PubChem. (default: 10)
total_block_num (int, optional) – The number of blocks of SDF files from PubChem. (default: 1)
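
The callable parameters above can be illustrated with a minimal sketch. The helper names (keep_small_molecules, tag_size, build_dataset), the node-count thresholds, and the is_tiny attribute are hypothetical and chosen only for illustration; the sketch assumes only that each data object exposes num_nodes, as torch_geometric.data.Data does. Building the real dataset requires torch_geometric and may download and process data from PubChem on first use, so that step is kept inside a function that is not called here.

```python
from types import SimpleNamespace
from typing import Any


def keep_small_molecules(data: Any, max_nodes: int = 50) -> bool:
    """Candidate pre_filter: keep graphs with at most `max_nodes` nodes.

    The threshold of 50 is an arbitrary illustrative value.
    """
    return data.num_nodes <= max_nodes


def tag_size(data: Any) -> Any:
    """Candidate transform: attach a hypothetical `is_tiny` flag.

    As a `transform`, this would run on every access to a data object.
    """
    data.is_tiny = data.num_nodes < 10
    return data


def build_dataset(root: str):
    """Construct the dataset with the callables above (not called here).

    The import is deferred because it needs torch_geometric installed,
    and instantiation may trigger a download and processing step.
    """
    from torch_geometric.datasets import MoleculeGPTDataset

    return MoleculeGPTDataset(
        root=root,
        transform=tag_size,
        pre_filter=keep_small_molecules,
        total_page_num=1,   # smaller than the default of 10
        total_block_num=1,
    )


# Quick local check with stand-in objects instead of real Data objects.
small = SimpleNamespace(num_nodes=5)
large = SimpleNamespace(num_nodes=120)
print(keep_small_molecules(small), keep_small_molecules(large))
print(tag_size(small).is_tiny)
```

Because pre_filter is applied when the dataset is processed rather than on access, changing it after the dataset has been saved would likely require force_reload=True to take effect.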