torch_geometric.graphgym¶
Workflow Modules¶
- Load latest model checkpoint
- Save model checkpoint at given epoch
- Only keep the latest model checkpoint; remove all older checkpoints
- Parses the command line arguments.
- CfgNode represents an internal node in the configuration tree.
- This function sets the default config values.
- Load configurations from file system and command line
- Dumps the config to the output directory specified in cfg.out_dir
- Create the directory for each random seed experiment run
- Create the directory for aggregated results over all the random seeds
- Performs weight initialization
- Create data loader object
- Set up printing options
- Create logger for the experiment
- Compute loss and prediction score
- Create model for graph machine learning
- Create optimizer for the model
- Create learning rate scheduler for the optimizer
- The core training pipeline
- load_ckpt(model, optimizer=None, scheduler=None)[source]¶
Load latest model checkpoint
- Parameters
model (torch.nn.Module) – The model that will be loaded
optimizer (torch.optim, optional) – The optimizer that will be loaded
scheduler (torch.optim, optional) – The scheduler that will be loaded
- Returns
Epoch count after loading the model
- save_ckpt(model, optimizer, scheduler, epoch)[source]¶
Save model checkpoint at given epoch
- Parameters
model (torch.nn.Module) – The model that will be saved
optimizer (torch.optim) – The optimizer that will be saved
scheduler (torch.optim) – The scheduler that will be saved
epoch (int) – The epoch when the model is saved
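Together with the "keep only the latest checkpoint" cleanup listed above, these two functions implement simple checkpoint rotation. A minimal file-system sketch of that rotation logic, assuming a hypothetical `ckpt_<epoch>.pt` naming scheme (not GraphGym's actual path layout):

```python
import os
import tempfile


def save_marker(ckpt_dir, epoch):
    """Write a placeholder checkpoint file for the given epoch."""
    path = os.path.join(ckpt_dir, f"ckpt_{epoch}.pt")
    with open(path, "w") as f:
        f.write(str(epoch))
    return path


def clean_old(ckpt_dir):
    """Keep only the checkpoint with the highest epoch number."""
    ckpts = sorted(
        (f for f in os.listdir(ckpt_dir) if f.startswith("ckpt_")),
        key=lambda f: int(f.split("_")[1].split(".")[0]),
    )
    for stale in ckpts[:-1]:  # everything except the latest
        os.remove(os.path.join(ckpt_dir, stale))


with tempfile.TemporaryDirectory() as d:
    for epoch in (1, 2, 5):
        save_marker(d, epoch)
    clean_old(d)
    print(os.listdir(d))  # only ckpt_5.pt remains
```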
- set_cfg(cfg)[source]¶
This function sets the default config values. Notes: 1) For a given experiment, only part of the arguments will be used; the remaining unused arguments will not affect anything, so feel free to register any argument in graphgym.contrib.config. 2) At most two levels of configs are supported, e.g., cfg.dataset.name.
- Returns
Configuration used by the experiment.
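The two-level limit (e.g. `cfg.dataset.name`) is easiest to see with a minimal attribute-style node. The `Node` class below is a toy stand-in for yacs' `CfgNode`, not GraphGym's actual implementation:

```python
class Node(dict):
    """Minimal attribute-style config node (a toy stand-in for CfgNode)."""
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__


cfg = Node()
cfg.out_dir = "results"    # level-one option
cfg.dataset = Node()       # level-two group
cfg.dataset.name = "Cora"  # cfg.<group>.<option> -- at most two levels

print(cfg.dataset.name)  # Cora
```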
- load_cfg(cfg, args)[source]¶
Load configurations from file system and command line
- Parameters
cfg (CfgNode) – Configuration node
args (ArgumentParser) – Command argument parser
- dump_cfg(cfg)[source]¶
Dumps the config to the output directory specified in cfg.out_dir
- Parameters
cfg (CfgNode) – Configuration node
- set_run_dir(out_dir, fname)[source]¶
Create the directory for each random seed experiment run
- Parameters
out_dir (string) – Directory for output, specified in cfg.out_dir
fname (string) – Filename for the yaml format configuration file
- set_agg_dir(out_dir, fname)[source]¶
Create the directory for aggregated results over all the random seeds
- Parameters
out_dir (string) – Directory for output, specified in cfg.out_dir
fname (string) – Filename for the yaml format configuration file
- compute_loss(pred, true)[source]¶
Compute loss and prediction score
- Parameters
pred (torch.tensor) – Unnormalized prediction
true (torch.tensor) – Ground truth labels
Returns: Loss, normalized prediction score
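To make "loss plus normalized prediction score" concrete, here is a pure-Python sketch of the binary case (sigmoid normalization plus binary cross-entropy). GraphGym's actual implementation dispatches on the configured loss function, so treat this as an illustrative assumption, not the library's code:

```python
import math


def binary_loss_and_score(pred, true):
    """Return (mean BCE loss, sigmoid-normalized scores) for raw logits."""
    scores = [1.0 / (1.0 + math.exp(-p)) for p in pred]  # normalize to (0, 1)
    eps = 1e-12  # guard against log(0)
    losses = [
        -(t * math.log(s + eps) + (1 - t) * math.log(1 - s + eps))
        for s, t in zip(scores, true)
    ]
    return sum(losses) / len(losses), scores


loss, scores = binary_loss_and_score([2.0, -1.5], [1, 0])
print(round(loss, 4), [round(s, 3) for s in scores])
```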
- create_model(to_device=True, dim_in=None, dim_out=None)[source]¶
Create model for graph machine learning
- create_optimizer(params, optimizer_config: torch_geometric.graphgym.optimizer.OptimizerConfig)[source]¶
Create optimizer for the model
- Parameters
params – PyTorch model parameters
Returns: PyTorch optimizer
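The factory pattern behind create_optimizer — look up a constructor by name in a module dictionary, then apply the config — can be sketched without PyTorch. The `ToyAdam`/`ToySGD` classes and the simplified `OptimizerConfig` below are stand-ins, not `torch.optim` or GraphGym's real dataclass:

```python
from dataclasses import dataclass


@dataclass
class OptimizerConfig:  # simplified stand-in for graphgym's OptimizerConfig
    optimizer: str = "adam"
    base_lr: float = 0.01


class ToyAdam:
    def __init__(self, params, lr):
        self.params, self.lr = list(params), lr


class ToySGD:
    def __init__(self, params, lr):
        self.params, self.lr = list(params), lr


OPTIMIZER_DICT = {"adam": ToyAdam, "sgd": ToySGD}  # name -> constructor


def create_optimizer(params, cfg: OptimizerConfig):
    try:
        ctor = OPTIMIZER_DICT[cfg.optimizer]
    except KeyError:
        raise ValueError(f"Optimizer {cfg.optimizer!r} not registered")
    return ctor(params, lr=cfg.base_lr)


opt = create_optimizer([0.0, 0.0], OptimizerConfig(optimizer="sgd", base_lr=0.1))
print(type(opt).__name__, opt.lr)  # ToySGD 0.1
```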
Model Modules¶
- Provides an encoder for integer node features.
- The atom encoder used in the OGB molecule dataset.
- The bond encoder used in the OGB molecule dataset.
- Wrapper for a GNN layer
- Wrapper for an NN layer before GNN message passing
- Simple stage that stacks GNN layers
- Encodes node and edge features
- General GNN model: encoder + stage + head
- GNN prediction head for node prediction tasks.
- GNN prediction head for edge/link prediction tasks.
- GNN prediction head for graph prediction tasks.
- General wrapper for layers
- General wrapper for a stack of multiple layers
- Basic Linear layer.
- BatchNorm for node features.
- BatchNorm for edge features.
- Basic MLP model.
- Graph Convolutional Network (GCN) layer
- GraphSAGE convolution layer
- Graph Attention Network (GAT) layer
- Graph Isomorphism Network (GIN) layer
- SplineCNN layer
- A general GNN layer
- A general GNN layer that also supports edge features
- A general GNN layer that supports edge features and edge sampling
- Globally pool node embeddings into graph embeddings, via elementwise sum.
- Globally pool node embeddings into graph embeddings, via elementwise mean.
- Globally pool node embeddings into graph embeddings, via elementwise max.
- class IntegerFeatureEncoder(emb_dim, num_classes=None)[source]¶
Provides an encoder for integer node features.
- class AtomEncoder(emb_dim, num_classes=None)[source]¶
The atom Encoder used in OGB molecule dataset.
- Parameters
emb_dim (int) – Output embedding dimension
num_classes – None
- class BondEncoder(emb_dim)[source]¶
The bond Encoder used in OGB molecule dataset.
- Parameters
emb_dim (int) – Output edge embedding dimension
- class FeatureEncoder(dim_in)[source]¶
Encoding node and edge features
- Parameters
dim_in (int) – Input feature dimension
- class GNNGraphHead(dim_in, dim_out)[source]¶
GNN prediction head for graph prediction tasks. The optional post_mp layer (specified by cfg.gnn.post_mp) is used to transform the pooled embedding using an MLP.
- class GeneralLayer(name, layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶
General wrapper for layers
- Parameters
name (string) – Name of the layer registered in layer_dict
dim_in (int) – Input dimension
dim_out (int) – Output dimension
has_act (bool) – Whether to apply an activation after the layer
has_bn (bool) – Whether to apply BatchNorm in the layer
has_l2norm (bool) – Whether to apply L2 normalization after the layer
**kwargs (optional) – Additional args
- class GeneralMultiLayer(name, layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶
General wrapper for a stack of multiple layers
- Parameters
name (string) – Name of the layer registered in layer_dict
num_layers (int) – Number of layers in the stack
dim_in (int) – Input dimension
dim_out (int) – Output dimension
dim_inner (int) – The dimension for the inner layers
final_act (bool) – Whether to apply an activation after the layer stack
**kwargs (optional) – Additional args
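The role of dim_inner is easiest to see from the per-layer dimensions of the stack: every layer operates at dim_inner except the input and output boundaries. A small sketch of that dimension schedule (the helper name `stack_dims` is hypothetical, not part of GraphGym):

```python
def stack_dims(num_layers, dim_in, dim_out, dim_inner):
    """Return (d_in, d_out) for each layer in a GeneralMultiLayer-style stack."""
    dims = []
    for i in range(num_layers):
        d_in = dim_in if i == 0 else dim_inner
        d_out = dim_out if i == num_layers - 1 else dim_inner
        dims.append((d_in, d_out))
    return dims


print(stack_dims(3, dim_in=32, dim_out=7, dim_inner=64))
# [(32, 64), (64, 64), (64, 7)]
```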
- class Linear(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶
Basic Linear layer.
- class BatchNorm1dNode(layer_config: torch_geometric.graphgym.models.layer.LayerConfig)[source]¶
BatchNorm for node feature.
- Parameters
dim_in (int) – Input dimension
- class BatchNorm1dEdge(layer_config: torch_geometric.graphgym.models.layer.LayerConfig)[source]¶
BatchNorm for edge feature.
- Parameters
dim_in (int) – Input dimension
- class MLP(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶
Basic MLP model. Here, a 1-layer MLP is equivalent to a Linear layer.
- class GCNConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶
Graph Convolutional Network (GCN) layer
- class SAGEConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶
GraphSAGE Conv layer
- class GATConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶
Graph Attention Network (GAT) layer
- class GINConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶
Graph Isomorphism Network (GIN) layer
- class SplineConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶
SplineCNN layer
- class GeneralConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶
A general GNN layer
- class GeneralEdgeConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶
A general GNN layer that supports edge features as well
- class GeneralSampleEdgeConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶
A general GNN layer that supports edge features and edge sampling
- global_add_pool(x, batch, size=None)[source]¶
Globally pool node embeddings into graph embeddings, via elementwise sum. Pooling function takes in node embedding [num_nodes x emb_dim] and batch (indices) and outputs graph embedding [num_graphs x emb_dim].
- Parameters
x (torch.tensor) – Input node embeddings
batch (torch.tensor) – Batch tensor indicating which graph each node belongs to
size (optional) – Total number of graphs. Can be auto-inferred.
Returns: Pooled graph embeddings
- global_mean_pool(x, batch, size=None)[source]¶
Globally pool node embeddings into graph embeddings, via elementwise mean. Pooling function takes in node embedding [num_nodes x emb_dim] and batch (indices) and outputs graph embedding [num_graphs x emb_dim].
- Parameters
x (torch.tensor) – Input node embeddings
batch (torch.tensor) – Batch tensor indicating which graph each node belongs to
size (optional) – Total number of graphs. Can be auto-inferred.
Returns: Pooled graph embeddings
- global_max_pool(x, batch, size=None)[source]¶
Globally pool node embeddings into graph embeddings, via elementwise max. Pooling function takes in node embedding [num_nodes x emb_dim] and batch (indices) and outputs graph embedding [num_graphs x emb_dim].
- Parameters
x (torch.tensor) – Input node embeddings
batch (torch.tensor) – Batch tensor indicating which graph each node belongs to
size (optional) – Total number of graphs. Can be auto-inferred.
Returns: Pooled graph embeddings
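All three pooling functions share the same scatter pattern: group node values by their batch index, then reduce each group. A pure-Python sketch of that semantics, using scalar node embeddings for brevity (the real functions operate on [num_nodes x emb_dim] tensors):

```python
def scatter_pool(x, batch, op="add"):
    """Pool per-node values x into per-graph values, grouped by batch index."""
    num_graphs = max(batch) + 1
    groups = [[] for _ in range(num_graphs)]
    for value, graph_id in zip(x, batch):
        groups[graph_id].append(value)
    if op == "add":
        return [sum(g) for g in groups]
    if op == "mean":
        return [sum(g) / len(g) for g in groups]
    if op == "max":
        return [max(g) for g in groups]
    raise ValueError(op)


x = [1.0, 2.0, 3.0, 4.0]
batch = [0, 0, 1, 1]  # nodes 0-1 belong to graph 0, nodes 2-3 to graph 1
print(scatter_pool(x, batch, "add"))   # [3.0, 7.0]
print(scatter_pool(x, batch, "mean"))  # [1.5, 3.5]
print(scatter_pool(x, batch, "max"))   # [2.0, 4.0]
```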
Register Modules¶
- Base function for registering a customized module to a module dictionary
- Register a customized activation function.
- Register a customized node feature encoder.
- Register a customized edge feature encoder.
- Register a customized GNN stage (consisting of multiple layers).
- Register a customized GNN prediction head.
- Register a customized GNN layer.
- Register a customized GNN pooling layer (for graph classification).
- Register a customized GNN model.
- Register a customized configuration group.
- Register a customized PyG data loader.
- Register a customized optimizer.
- Register a customized learning rate scheduler.
- Register a customized loss function.
- Register a customized training function.
- register_base(key, module, module_dict)[source]¶
Base function for registering a customized module to a module dictionary
- Parameters
key (string) – Name of the module
module – PyTorch module
module_dict (dict) – Python dictionary, hosting all the registered modules
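Each register_* function below is a thin wrapper over this base function, bound to one module dictionary. A minimal sketch of the pattern (the duplicate-key error message is an assumption, not GraphGym's exact behavior):

```python
act_dict = {}  # module dictionary, e.g. for activation functions


def register_base(key, module, module_dict):
    """Insert module under key, refusing to silently overwrite an entry."""
    if key in module_dict:
        raise KeyError(f"Module with key {key!r} is already registered")
    module_dict[key] = module


def register_act(key, module):
    register_base(key, module, act_dict)


register_act("identity", lambda x: x)
print(sorted(act_dict))  # ['identity']
```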
- register_act(key, module)[source]¶
Register a customized activation function. After registration, the module can be directly called by GraphGym.
- Parameters
key (string) – Name of the module
module – PyTorch module
- register_node_encoder(key, module)[source]¶
Register a customized node feature encoder. After registration, the module can be directly called by GraphGym.
- Parameters
key (string) – Name of the module
module – PyTorch module
- register_edge_encoder(key, module)[source]¶
Register a customized edge feature encoder. After registration, the module can be directly called by GraphGym.
- Parameters
key (string) – Name of the module
module – PyTorch module
- register_stage(key, module)[source]¶
Register a customized GNN stage (consisting of multiple layers). After registration, the module can be directly called by GraphGym.
- Parameters
key (string) – Name of the module
module – PyTorch module
- register_head(key, module)[source]¶
Register a customized GNN prediction head. After registration, the module can be directly called by GraphGym.
- Parameters
key (string) – Name of the module
module – PyTorch module
- register_layer(key, module)[source]¶
Register a customized GNN layer. After registration, the module can be directly called by GraphGym.
- Parameters
key (string) – Name of the module
module – PyTorch module
- register_pooling(key, module)[source]¶
Register a customized GNN pooling layer (for graph classification). After registration, the module can be directly called by GraphGym.
- Parameters
key (string) – Name of the module
module – PyTorch module
- register_network(key, module)[source]¶
Register a customized GNN model. After registration, the module can be directly called by GraphGym.
- Parameters
key (string) – Name of the module
module – PyTorch module
- register_config(key, module)[source]¶
Register a customized configuration group. After registration, the module can be directly called by GraphGym.
- Parameters
key (string) – Name of the module
module – PyTorch module
- register_loader(key, module)[source]¶
Register a customized PyG data loader. After registration, the module can be directly called by GraphGym.
- Parameters
key (string) – Name of the module
module – PyTorch module
- register_optimizer(key, module)[source]¶
Register a customized optimizer. After registration, the module can be directly called by GraphGym.
- Parameters
key (string) – Name of the module
module – PyTorch module
- register_scheduler(key, module)[source]¶
Register a customized learning rate scheduler. After registration, the module can be directly called by GraphGym.
- Parameters
key (string) – Name of the module
module – PyTorch module
Utility Modules¶
- Aggregate over different random seeds of a single experiment
- Aggregate results from multiple experiments via grid search
- Computes the number of parameters.
- Match the computational budget of a given baseline model.
- Get the current GPU memory usage.
- Auto select the device for the experiment.
- Determines if the model should be evaluated at the current epoch.
- Determines if the model should be evaluated at the current epoch.
- Dump a Python dictionary to a JSON file
- Dump a list of Python dictionaries to a JSON file
- Add a dictionary of statistics to a TensorBoard writer
- Make a directory, removing any existing data.
- Default context manager that does nothing
- agg_runs(dir, metric_best='auto')[source]¶
Aggregate over different random seeds of a single experiment
- agg_batch(dir, metric_best='auto')[source]¶
Aggregate across results from multiple experiments via grid search
- params_count(model)[source]¶
Computes the number of parameters.
- Parameters
model (nn.Module) – PyTorch model
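Parameter counting is just a sum of element counts over all parameter tensors (in PyTorch terms, summing `numel()` over `model.parameters()`). A torch-free sketch of the same arithmetic, where the list of shapes stands in for the model's parameters:

```python
import math


def params_count(param_shapes):
    """Sum the number of elements across all parameter tensors."""
    return sum(math.prod(shape) for shape in param_shapes)


# e.g. a linear layer mapping 16 -> 4: weight (4, 16) plus bias (4,)
print(params_count([(4, 16), (4,)]))  # 68
```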
- match_baseline_cfg(cfg_dict, cfg_dict_baseline, verbose=True)[source]¶
Match the computational budget of a given baseline model. The current configuration dictionary will be modified and returned.
- auto_select_device(memory_max=8000, memory_bias=200, strategy='random')[source]¶
Auto select device for the experiment. Useful when having multiple GPUs.
- Parameters
memory_max (int) – Threshold of existing GPU memory usage. GPUs with memory usage beyond this threshold will be deprioritized.
memory_bias (int) – A bias of GPU memory usage added to all the GPUs, to avoid divide-by-zero errors.
strategy (str, optional) – 'random' (randomly select a GPU) or 'greedy' (greedily select a GPU)
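The two strategies can be sketched over a plain list of per-GPU memory usage. The real function inspects actual GPU memory on the machine, so the `memory_raw` input and the `select_gpu` helper below are hypothetical stand-ins:

```python
import random


def select_gpu(memory_raw, memory_max=8000, memory_bias=200, strategy="greedy"):
    """Pick a GPU index given current memory usage per GPU (in MB)."""
    memory = [m + memory_bias for m in memory_raw]  # bias avoids divide-by-zero
    if strategy == "greedy":
        # pick the GPU with the least memory in use
        return min(range(len(memory)), key=lambda i: memory[i])
    if strategy == "random":
        # sample inversely proportional to usage, excluding over-threshold GPUs
        weights = [1.0 / m if raw < memory_max else 0.0
                   for m, raw in zip(memory, memory_raw)]
        return random.choices(range(len(memory)), weights=weights)[0]
    raise ValueError(strategy)


print(select_gpu([1200, 300, 9000], strategy="greedy"))  # index 1
```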
- dict_list_to_json(dict_list, fname)[source]¶
Dump a list of Python dictionaries to JSON file
- Parameters
dict_list (list of dict) – List of Python dictionaries
fname (str) – Output file name