torch_geometric.nn.models.MoleculeGPT
- class MoleculeGPT(llm: LLM, graph_encoder: Module, smiles_encoder: Module, mlp_out_channels: int = 32, max_tokens: Optional[int] = 20)[source]
Bases: Module
The MoleculeGPT model from the “MoleculeGPT: Instruction Following Large Language Models for Molecular Property Prediction” paper.
- Parameters:
llm (LLM) – The LLM to use.
graph_encoder (torch.nn.Module) – Encode 2D molecule graph.
smiles_encoder (torch.nn.Module) – Encode 1D SMILES.
mlp_out_channels (int, optional) – The size of each embedding after Q-Former encoding. (default: 32)
max_tokens (int, optional) – The maximum number of output tokens of the 1D/2D encoders. (default: 20)
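A minimal construction sketch (untested): the encoder classes, checkpoint names, and feature dimensions below are illustrative assumptions rather than the configuration used in examples/llm/molecule_gpt.py; any graph and SMILES encoders with compatible interfaces can be substituted.

```python
import torch
from torch_geometric.nn.models import GIN, MoleculeGPT
from torch_geometric.nn.nlp import LLM, SentenceTransformer

# LLM wrapper around the tested HuggingFace checkpoint.
llm = LLM(model_name='lmsys/vicuna-7b-v1.5')

# 2D molecular graph encoder; channel sizes are placeholder values.
graph_encoder = GIN(in_channels=9, hidden_channels=1024, num_layers=3,
                    out_channels=1024)

# 1D SMILES encoder; the checkpoint name is a placeholder.
smiles_encoder = SentenceTransformer(model_name='DeepChem/ChemBERTa-77M-MTR')

model = MoleculeGPT(
    llm=llm,
    graph_encoder=graph_encoder,
    smiles_encoder=smiles_encoder,
    mlp_out_channels=32,  # embedding size after Q-Former encoding
    max_tokens=20,        # max output tokens of the 1D/2D encoders
)
```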
Warning
This module has been tested with the HuggingFace model llm_to_use="lmsys/vicuna-7b-v1.5" and may not work with other models. Browse other models at HuggingFace Models and let us know if you encounter any issues.
Note
For an example of using MoleculeGPT, see examples/llm/molecule_gpt.py.
- forward(x: Tensor, edge_index: Tensor, batch: Tensor, edge_attr: Optional[Tensor], smiles: List[str], instructions: List[str], label: List[str], additional_text_context: Optional[List[str]] = None)[source]
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
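A hypothetical call sketch, assuming model was built as in the constructor example above and data is a standard PyG mini-batch of molecular graphs; the smiles attribute, the prompt and target strings, and the return semantics are assumptions and should be checked against examples/llm/molecule_gpt.py.

```python
# `data` is assumed to be a torch_geometric.data.Batch of molecular graphs.
out = model(
    x=data.x,                    # node (atom) features
    edge_index=data.edge_index,  # bond connectivity
    batch=data.batch,            # node-to-graph assignment vector
    edge_attr=data.edge_attr,    # bond features, may be None
    smiles=data.smiles,          # one SMILES string per molecule (assumed attribute)
    instructions=["What is this molecule's water solubility?"],  # illustrative prompt
    label=["It is highly soluble."],                             # illustrative target
)
```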