torch_geometric.graphgym

Workflow and Register Modules

load_ckpt

Loads the model checkpoint at a given epoch.

save_ckpt

Saves the model checkpoint at a given epoch.

remove_ckpt

Removes the model checkpoint at a given epoch.

clean_ckpt

Removes all but the last model checkpoint.

parse_args

Parses the command line arguments.

cfg

set_cfg

This function sets the default config value.

load_cfg

Load configurations from file system and command line.

dump_cfg

Dumps the config to the output directory specified in cfg.out_dir.

set_run_dir

Create the directory for each random seed experiment run.

set_out_dir

Create the directory for full experiment run.

get_fname

Extracts the file name from a file path.

init_weights

Performs weight initialization.

create_loader

Create data loader object.

set_printing

Set up printing options.

create_logger

Create logger for the experiment.

compute_loss

Compute loss and prediction score.

create_model

Create model for graph machine learning.

create_optimizer

Creates a config-driven optimizer.

create_scheduler

Creates a config-driven learning rate scheduler.

train

Trains a GraphGym model using PyTorch Lightning.

register_base

Base function for registering a module in GraphGym.

register_act

Registers an activation function in GraphGym.

register_node_encoder

Registers a node feature encoder in GraphGym.

register_edge_encoder

Registers an edge feature encoder in GraphGym.

register_stage

Registers a customized GNN stage in GraphGym.

register_head

Registers a GNN prediction head in GraphGym.

register_layer

Registers a GNN layer in GraphGym.

register_pooling

Registers a GNN global pooling/readout layer in GraphGym.

register_network

Registers a GNN model in GraphGym.

register_config

Registers a configuration group in GraphGym.

register_dataset

Registers a dataset in GraphGym.

register_loader

Registers a data loader in GraphGym.

register_optimizer

Registers an optimizer in GraphGym.

register_scheduler

Registers a learning rate scheduler in GraphGym.

register_loss

Registers a loss function in GraphGym.

register_train

Registers a training function in GraphGym.

register_metric

Registers a metric function in GraphGym.

load_ckpt(model: Module, optimizer: Optional[Optimizer] = None, scheduler: Optional[Any] = None, epoch: int = -1) -> int [source]

Loads the model checkpoint at a given epoch.

save_ckpt(model: Module, optimizer: Optional[Optimizer] = None, scheduler: Optional[Any] = None, epoch: int = 0)[source]

Saves the model checkpoint at a given epoch.

remove_ckpt(epoch: int = -1)[source]

Removes the model checkpoint at a given epoch.

clean_ckpt()[source]

Removes all but the last model checkpoint.
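The pruning idea behind clean_ckpt can be sketched with the standard library alone. This is only an illustration: the `<epoch>.ckpt` file naming and the helper names `list_epochs` and `clean_all_but_last` are assumptions made for this example, not GraphGym's actual on-disk layout or API.

```python
import os
import tempfile


def list_epochs(ckpt_dir):
    """Return the sorted epoch numbers of all '<epoch>.ckpt' files in ckpt_dir."""
    epochs = []
    for name in os.listdir(ckpt_dir):
        stem, ext = os.path.splitext(name)
        if ext == ".ckpt" and stem.isdigit():
            epochs.append(int(stem))
    return sorted(epochs)


def clean_all_but_last(ckpt_dir):
    """Delete every checkpoint except the one with the highest epoch."""
    for epoch in list_epochs(ckpt_dir)[:-1]:
        os.remove(os.path.join(ckpt_dir, f"{epoch}.ckpt"))


# Usage: create three dummy checkpoints (assumed naming), then prune.
with tempfile.TemporaryDirectory() as ckpt_dir:
    for epoch in (0, 5, 10):
        with open(os.path.join(ckpt_dir, f"{epoch}.ckpt"), "wb") as f:
            f.write(b"dummy")
    clean_all_but_last(ckpt_dir)
    remaining = list_epochs(ckpt_dir)

print(remaining)  # [10]
```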

parse_args() -> Namespace [source]

Parses the command line arguments.

set_cfg(cfg)[source]

This function sets the default config value.

  1. Note that for an experiment, only some of the arguments will be used. The remaining unused arguments won’t affect anything, so feel free to register any argument in graphgym.contrib.config.

  2. We support at most two levels of configs, e.g., cfg.dataset.name.

Returns:

Configuration used by the experiment.
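The two-level limit from the notes above has the following shape. The sketch uses `types.SimpleNamespace` purely as a stand-in for the real config node, and the values are placeholders chosen for this example.

```python
from types import SimpleNamespace

# Stand-in for a config object; this is NOT the CfgNode class GraphGym uses.
cfg = SimpleNamespace(
    out_dir="results",                     # level 1: a top-level option
    dataset=SimpleNamespace(name="Cora"),  # level 2: group.option
)

print(cfg.out_dir)       # results
print(cfg.dataset.name)  # Cora
```

A third level, such as a hypothetical `cfg.dataset.split.ratio`, would exceed the supported depth.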

load_cfg(cfg, args)[source]

Load configurations from file system and command line.

Parameters:
  • cfg (CfgNode) – Configuration node

  • args (ArgumentParser) – Command argument parser

dump_cfg(cfg)[source]

Dumps the config to the output directory specified in cfg.out_dir.

Parameters:

cfg (CfgNode) – Configuration node

set_run_dir(out_dir)[source]

Create the directory for each random seed experiment run.

Parameters:

out_dir (str) – Directory for output, specified in cfg.out_dir

set_out_dir(out_dir, fname)[source]

Create the directory for full experiment run.

Parameters:
  • out_dir (str) – Directory for output, specified in cfg.out_dir

  • fname (str) – Filename for the yaml format configuration file

get_fname(fname)[source]

Extracts the file name from a file path.

Parameters:

fname (str) – Filename for the yaml format configuration file
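A rough standard-library sketch of this kind of file-name extraction follows. The helper `fname_stem` is hypothetical, and the exact behavior of get_fname (for example, which extensions it strips) may differ.

```python
import os


def fname_stem(path):
    """Drop the directory part and the final extension of a path."""
    base = os.path.basename(path)
    stem, _ext = os.path.splitext(base)
    return stem


print(fname_stem("configs/example.yaml"))  # example
```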

init_weights(m)[source]

Performs weight initialization.

Parameters:

m (nn.Module) – PyTorch module

create_loader()[source]

Create data loader object.

Returns: List of PyTorch data loaders

set_printing()[source]

Set up printing options.

create_logger()[source]

Create logger for the experiment.

compute_loss(pred, true)[source]

Compute loss and prediction score.

Parameters:
  • pred (torch.tensor) – Unnormalized prediction

  • true (torch.tensor) – Ground truth

Returns: Loss, normalized prediction score
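To make "unnormalized prediction" and "normalized prediction score" concrete, here is a plain-Python sketch for a single binary-classification logit. It assumes a sigmoid normalization and a binary cross-entropy loss; which loss and normalization compute_loss actually applies depends on the configuration, and the helper name `binary_loss_and_score` is hypothetical.

```python
import math


def binary_loss_and_score(logit, label):
    """Return (binary cross-entropy loss, sigmoid score) for one raw logit."""
    score = 1.0 / (1.0 + math.exp(-logit))  # normalize the logit to (0, 1)
    loss = -(label * math.log(score) + (1 - label) * math.log(1.0 - score))
    return loss, score


loss, score = binary_loss_and_score(logit=0.0, label=1)
print(round(score, 4))  # 0.5
print(round(loss, 4))   # 0.6931
```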

create_model(to_device=True, dim_in=None, dim_out=None) -> GraphGymModule [source]

Create model for graph machine learning.

Parameters:
  • to_device (bool, optional) – Whether to transfer the model to the specified device. (default: True)

  • dim_in (int, optional) – Input dimension to the model

  • dim_out (int, optional) – Output dimension to the model

create_optimizer(params: Iterator[Parameter], cfg: Any) -> Any [source]

Creates a config-driven optimizer.

create_scheduler(optimizer: Optimizer, cfg: Any) -> Any [source]

Creates a config-driven learning rate scheduler.

train(model: GraphGymModule, datamodule: GraphGymDataModule, logger: bool = True, trainer_config: Optional[Dict[str, Any]] = None)[source]

Trains a GraphGym model using PyTorch Lightning.

Parameters:
  • model (GraphGymModule) – The GraphGym model.

  • datamodule (GraphGymDataModule) – The GraphGym data module.

  • logger (bool, optional) – Whether to enable logging during training. (default: True)

  • trainer_config (dict, optional) – Additional trainer configuration.

register_base(mapping: Dict[str, Any], key: str, module: Optional[Any] = None) -> Union[None, Callable] [source]

Base function for registering a module in GraphGym.

Parameters:
  • mapping (dict) – The dictionary hosting all the registered modules, in which the module will be registered.

  • key (str) – The name of the module.

  • module (any, optional) – The module. If set to None, will return a decorator to register a module.
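The "register directly, or return a decorator when module is None" behavior described above is a common registry pattern. The following is a minimal, dependency-free sketch of that pattern, not GraphGym's implementation: the function name `register`, the registry `act_dict`, and the duplicate-key check are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, Optional, Union


def register(mapping: Dict[str, Any], key: str,
             module: Optional[Any] = None) -> Union[None, Callable]:
    """Register `module` under `key`, or return a decorator if module is None."""
    if module is not None:
        if key in mapping:  # assumed policy: refuse to overwrite an entry
            raise KeyError(f"'{key}' is already registered")
        mapping[key] = module
        return None

    def decorator(obj: Any) -> Any:
        register(mapping, key, obj)
        return obj  # hand the object back so the decorated name stays usable

    return decorator


act_dict: Dict[str, Any] = {}  # hypothetical registry for activations

# Direct call: pass the module explicitly.
register(act_dict, "identity", lambda x: x)


# Decorator form: omit the module.
@register(act_dict, "double")
def double(x):
    return 2 * x


print(sorted(act_dict))       # ['double', 'identity']
print(act_dict["double"](3))  # 6
```

Returning the decorated object unchanged keeps the original name importable, which is why the decorator form can sit directly on a class or function definition.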

register_act(key: str, module: Optional[Any] = None)[source]

Registers an activation function in GraphGym.

register_node_encoder(key: str, module: Optional[Any] = None)[source]

Registers a node feature encoder in GraphGym.

register_edge_encoder(key: str, module: Optional[Any] = None)[source]

Registers an edge feature encoder in GraphGym.

register_stage(key: str, module: Optional[Any] = None)[source]

Registers a customized GNN stage in GraphGym.

register_head(key: str, module: Optional[Any] = None)[source]

Registers a GNN prediction head in GraphGym.

register_layer(key: str, module: Optional[Any] = None)[source]

Registers a GNN layer in GraphGym.

register_pooling(key: str, module: Optional[Any] = None)[source]

Registers a GNN global pooling/readout layer in GraphGym.

register_network(key: str, module: Optional[Any] = None)[source]

Registers a GNN model in GraphGym.

register_config(key: str, module: Optional[Any] = None)[source]

Registers a configuration group in GraphGym.

register_dataset(key: str, module: Optional[Any] = None)[source]

Registers a dataset in GraphGym.

register_loader(key: str, module: Optional[Any] = None)[source]

Registers a data loader in GraphGym.

register_optimizer(key: str, module: Optional[Any] = None)[source]

Registers an optimizer in GraphGym.

register_scheduler(key: str, module: Optional[Any] = None)[source]

Registers a learning rate scheduler in GraphGym.

register_loss(key: str, module: Optional[Any] = None)[source]

Registers a loss function in GraphGym.

register_train(key: str, module: Optional[Any] = None)[source]

Registers a training function in GraphGym.

register_metric(key: str, module: Optional[Any] = None)[source]

Registers a metric function in GraphGym.

Model Modules

IntegerFeatureEncoder

Provides an encoder for integer node features.

AtomEncoder

The atom encoder used in OGB molecule dataset.

BondEncoder

The bond encoder used in OGB molecule dataset.

GNNLayer

Creates a GNN layer, given the specified input and output dimensions and the underlying configuration in cfg.

GNNPreMP

Creates a NN layer used before message passing, given the specified input and output dimensions and the underlying configuration in cfg.

GNNStackStage

Stacks a number of GNN layers.

FeatureEncoder

Encodes node and edge features, given the specified input dimension and the underlying configuration in cfg.

GNN

A general Graph Neural Network (GNN) model.

GNNNodeHead

A GNN prediction head for node-level prediction tasks.

GNNEdgeHead

A GNN prediction head for edge-level/link-level prediction tasks.

GNNGraphHead

A GNN prediction head for graph-level prediction tasks.

GeneralLayer

A general wrapper for layers.

GeneralMultiLayer

A general wrapper class for stacking multiple NN layers.

Linear

A basic Linear layer.

BatchNorm1dNode

A batch normalization layer for node-level features.

BatchNorm1dEdge

A batch normalization layer for edge-level features.

MLP

A basic MLP model.

GCNConv

A Graph Convolutional Network (GCN) layer.

SAGEConv

A GraphSAGE layer.

GATConv

A Graph Attention Network (GAT) layer.

GINConv

A Graph Isomorphism Network (GIN) layer.

SplineConv

A SplineCNN layer.

GeneralConv

A general GNN layer.

GeneralEdgeConv

A general GNN layer with edge feature support.

GeneralSampleEdgeConv

A general GNN layer that supports edge features and edge sampling.

global_add_pool

Returns batch-wise graph-level-outputs by adding node features across the node dimension.

global_mean_pool

Returns batch-wise graph-level-outputs by averaging node features across the node dimension.

global_max_pool

Returns batch-wise graph-level-outputs by taking the channel-wise maximum across the node dimension.

class IntegerFeatureEncoder(emb_dim: int, num_classes: int)[source]

Provides an encoder for integer node features.

Parameters:
  • emb_dim (int) – The output embedding dimension.

  • num_classes (int) – The number of classes/integers.

Example

>>> encoder = IntegerFeatureEncoder(emb_dim=16, num_classes=10)
>>> batch = torch.randint(0, 10, (10, 2))
>>> encoder(batch).size()
torch.Size([10, 16])

class AtomEncoder(emb_dim, *args, **kwargs)[source]

The atom encoder used in OGB molecule dataset.

Parameters:

emb_dim (int) – The output embedding dimension.

Example

>>> encoder = AtomEncoder(emb_dim=16)
>>> batch = torch.randint(0, 10, (10, 3))
>>> encoder(batch).size()
torch.Size([10, 16])

class BondEncoder(emb_dim: int)[source]

The bond encoder used in OGB molecule dataset.

Parameters:

emb_dim (int) – The output embedding dimension.

Example

>>> encoder = BondEncoder(emb_dim=16)
>>> batch = torch.randint(0, 10, (10, 3))
>>> encoder(batch).size()
torch.Size([10, 16])

GNNLayer(dim_in: int, dim_out: int, has_act: bool = True) -> GeneralLayer [source]

Creates a GNN layer, given the specified input and output dimensions and the underlying configuration in cfg.

Parameters:
  • dim_in (int) – The input dimension.

  • dim_out (int) – The output dimension.

  • has_act (bool, optional) – Whether to apply an activation function after the layer. (default: True)

GNNPreMP(dim_in: int, dim_out: int, num_layers: int) -> GeneralMultiLayer [source]

Creates a NN layer used before message passing, given the specified input and output dimensions and the underlying configuration in cfg.

Parameters:
  • dim_in (int) – The input dimension.

  • dim_out (int) – The output dimension.

  • num_layers (int) – The number of layers.

class GNNStackStage(dim_in, dim_out, num_layers)[source]

Stacks a number of GNN layers.

Parameters:
  • dim_in (int) – The input dimension.

  • dim_out (int) – The output dimension.

  • num_layers (int) – The number of layers.

class FeatureEncoder(dim_in: int)[source]

Encodes node and edge features, given the specified input dimension and the underlying configuration in cfg.

Parameters:

dim_in (int) – The input feature dimension.

class GNN(dim_in: int, dim_out: int, **kwargs)[source]

A general Graph Neural Network (GNN) model.

The GNN model consists of three main components:

  1. An encoder to transform input features into a fixed-size embedding space.

  2. A processing or message passing stage for information exchange between nodes.

  3. A head to produce the final output features/predictions.

The configuration of each component is determined by the underlying configuration in cfg.

Parameters:
  • dim_in (int) – The input feature dimension.

  • dim_out (int) – The output feature dimension.

  • **kwargs (optional) – Additional keyword arguments.
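The encoder, message passing, and head stages listed above can be pictured with a toy, dependency-free pipeline. This sketch uses scalar node features and fixed functions in place of learned layers; the function names and the threshold are hypothetical, and it does not reflect how the GNN class itself is implemented.

```python
def encode(x, scale=2.0):
    """Stage 1 (encoder): map each raw node feature to an embedding."""
    return [scale * v for v in x]


def message_pass(h, edges):
    """Stage 2 (message passing): add the sum of neighbor values to each node."""
    out = list(h)
    for src, dst in edges:
        out[dst] += h[src]
    return out


def head(h, threshold=8.0):
    """Stage 3 (head): produce one boolean prediction per node."""
    return [v > threshold for v in h]


x = [1.0, 2.0, 3.0]                       # one scalar feature per node
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]  # undirected path 0 - 1 - 2

h = encode(x)               # [2.0, 4.0, 6.0]
h = message_pass(h, edges)  # [6.0, 12.0, 10.0]
print(head(h))              # [False, True, True]
```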

class GNNNodeHead(dim_in: int, dim_out: int)[source]

A GNN prediction head for node-level prediction tasks.

Parameters:
  • dim_in (int) – The input feature dimension.

  • dim_out (int) – The output feature dimension.

class GNNEdgeHead(dim_in: int, dim_out: int)[source]

A GNN prediction head for edge-level/link-level prediction tasks.

Parameters:
  • dim_in (int) – The input feature dimension.

  • dim_out (int) – The output feature dimension.

class GNNGraphHead(dim_in: int, dim_out: int)[source]

A GNN prediction head for graph-level prediction tasks. A post message passing layer (as specified by cfg.gnn.post_mp) is used to transform the pooled graph-level embeddings using an MLP.

Parameters:
  • dim_in (int) – The input feature dimension.

  • dim_out (int) – The output feature dimension.

class GeneralLayer(name, layer_config: LayerConfig, **kwargs)[source]

A general wrapper for layers.

Parameters:
  • name (str) – The registered name of the layer.

  • layer_config (LayerConfig) – The configuration of the layer.

  • **kwargs (optional) – Additional keyword arguments.

class GeneralMultiLayer(name, layer_config: LayerConfig, **kwargs)[source]

A general wrapper class for stacking multiple NN layers.

Parameters:
  • name (str) – The registered name of the layer.

  • layer_config (LayerConfig) – The configuration of the layer.

  • **kwargs (optional) – Additional keyword arguments.

class Linear(layer_config: LayerConfig, **kwargs)[source]

A basic Linear layer.

Parameters:
  • layer_config (LayerConfig) – The configuration of the layer.

  • **kwargs (optional) – Additional keyword arguments.

class BatchNorm1dNode(layer_config: LayerConfig)[source]

A batch normalization layer for node-level features.

Parameters:

layer_config (LayerConfig) – The configuration of the layer.

class BatchNorm1dEdge(layer_config: LayerConfig)[source]

A batch normalization layer for edge-level features.

Parameters:

layer_config (LayerConfig) – The configuration of the layer.

class MLP(layer_config: LayerConfig, **kwargs)[source]

A basic MLP model.

Parameters:
  • layer_config (LayerConfig) – The configuration of the layer.

  • **kwargs (optional) – Additional keyword arguments.

class GCNConv(layer_config: LayerConfig, **kwargs)[source]

A Graph Convolutional Network (GCN) layer.

class SAGEConv(layer_config: LayerConfig, **kwargs)[source]

A GraphSAGE layer.

class GATConv(layer_config: LayerConfig, **kwargs)[source]

A Graph Attention Network (GAT) layer.

class GINConv(layer_config: LayerConfig, **kwargs)[source]

A Graph Isomorphism Network (GIN) layer.

class SplineConv(layer_config: LayerConfig, **kwargs)[source]

A SplineCNN layer.

class GeneralConv(layer_config: LayerConfig, **kwargs)[source]

A general GNN layer.

class GeneralEdgeConv(layer_config: LayerConfig, **kwargs)[source]

A general GNN layer with edge feature support.

class GeneralSampleEdgeConv(layer_config: LayerConfig, **kwargs)[source]

A general GNN layer that supports edge features and edge sampling.

global_add_pool(x: Tensor, batch: Optional[Tensor], size: Optional[int] = None) -> Tensor [source]

Returns batch-wise graph-level-outputs by adding node features across the node dimension.

For a single graph \(\mathcal{G}_i\), its output is computed by

\[\mathbf{r}_i = \sum_{n=1}^{N_i} \mathbf{x}_n.\]

Functional method of the SumAggregation module.

Parameters:
  • x (torch.Tensor) – Node feature matrix \(\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}\).

  • batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each node to a specific example.

  • size (int, optional) – The number of examples \(B\). Automatically calculated if not given. (default: None)
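As a worked illustration of the sum formula and the role of the batch vector, here is a pure-Python sketch on nested lists. The helper `add_pool` is hypothetical and only mirrors the arithmetic; it is not the library function, which operates on tensors.

```python
def add_pool(x, batch, size=None):
    """Sum the feature rows of x per graph, as indicated by the batch vector."""
    if size is None:
        size = max(batch) + 1 if batch else 0
    num_features = len(x[0]) if x else 0
    out = [[0.0] * num_features for _ in range(size)]
    for row, graph_idx in zip(x, batch):
        for f, value in enumerate(row):
            out[graph_idx][f] += value
    return out


# Two graphs: nodes 0-1 belong to graph 0, nodes 2-4 to graph 1.
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
batch = [0, 0, 1, 1, 1]

print(add_pool(x, batch))  # [[4.0, 6.0], [21.0, 24.0]]
```

Passing a larger `size` than the number of graphs present yields extra all-zero rows in this sketch.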

global_mean_pool(x: Tensor, batch: Optional[Tensor], size: Optional[int] = None) -> Tensor [source]

Returns batch-wise graph-level-outputs by averaging node features across the node dimension.

For a single graph \(\mathcal{G}_i\), its output is computed by

\[\mathbf{r}_i = \frac{1}{N_i} \sum_{n=1}^{N_i} \mathbf{x}_n.\]

Functional method of the MeanAggregation module.

Parameters:
  • x (torch.Tensor) – Node feature matrix \(\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}\).

  • batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each node to a specific example.

  • size (int, optional) – The number of examples \(B\). Automatically calculated if not given. (default: None)
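The averaging formula can be checked by hand with a small pure-Python sketch. The helper `mean_pool` is hypothetical and works on nested lists rather than tensors; returning zeros for a graph with no nodes is a choice made for this example only.

```python
def mean_pool(x, batch, size=None):
    """Average the feature rows of x per graph, as indicated by batch."""
    if size is None:
        size = max(batch) + 1 if batch else 0
    num_features = len(x[0]) if x else 0
    sums = [[0.0] * num_features for _ in range(size)]
    counts = [0] * size
    for row, graph_idx in zip(x, batch):
        counts[graph_idx] += 1
        for f, value in enumerate(row):
            sums[graph_idx][f] += value
    # Divide by at least 1 so that an empty graph yields zeros, not an error.
    return [[s / max(c, 1) for s in row] for row, c in zip(sums, counts)]


# Two graphs: nodes 0-1 belong to graph 0, nodes 2-4 to graph 1.
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
batch = [0, 0, 1, 1, 1]

print(mean_pool(x, batch))  # [[2.0, 3.0], [7.0, 8.0]]
```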

global_max_pool(x: Tensor, batch: Optional[Tensor], size: Optional[int] = None) -> Tensor [source]

Returns batch-wise graph-level-outputs by taking the channel-wise maximum across the node dimension.

For a single graph \(\mathcal{G}_i\), its output is computed by

\[\mathbf{r}_i = \mathrm{max}_{n=1}^{N_i} \, \mathbf{x}_n.\]

Functional method of the MaxAggregation module.

Parameters:
  • x (torch.Tensor) – Node feature matrix \(\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}\).

  • batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each element to a specific example.

  • size (int, optional) – The number of examples \(B\). Automatically calculated if not given. (default: None)
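"Channel-wise" means the maximum is taken independently for each feature column, so the resulting row need not equal any single node's features. A pure-Python sketch follows, with a hypothetical helper `max_pool` on nested lists; in this sketch a graph without nodes would keep the `-inf` initial value.

```python
def max_pool(x, batch, size=None):
    """Take the per-feature maximum of the rows of x for each graph."""
    if size is None:
        size = max(batch) + 1 if batch else 0
    num_features = len(x[0]) if x else 0
    out = [[float("-inf")] * num_features for _ in range(size)]
    for row, graph_idx in zip(x, batch):
        for f, value in enumerate(row):
            out[graph_idx][f] = max(out[graph_idx][f], value)
    return out


# Two graphs with two nodes each.
x = [[1.0, 8.0], [3.0, 4.0], [5.0, 0.0], [2.0, 6.0]]
batch = [0, 0, 1, 1]

# Graph 0 takes 3.0 from node 1 and 8.0 from node 0.
print(max_pool(x, batch))  # [[3.0, 8.0], [5.0, 6.0]]
```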

Utility Modules

agg_runs

Aggregate over different random seeds of a single experiment.

agg_batch

Aggregate across results from multiple experiments via grid search.

params_count

Computes the number of parameters.

match_baseline_cfg

Match the computational budget of a given baseline model.

get_current_gpu_usage

Get the current GPU memory usage.

auto_select_device

Auto select device for the current experiment.

is_eval_epoch

Determines if the model should be evaluated at the current epoch.

is_ckpt_epoch

Determines if the model should be checkpointed at the current epoch.

dict_to_json

Dump a dictionary to a JSON file.

dict_list_to_json

Dump a list of dictionaries to a JSON file.

dict_to_tb

Add a dictionary of statistics to a Tensorboard writer.

makedirs_rm_exist

Make a directory, remove any existing data.

dummy_context

Default context manager that does nothing.

agg_runs(dir, metric_best='auto')[source]

Aggregate over different random seeds of a single experiment.

Parameters:
  • dir (str) – Directory of the results, containing 1 experiment

  • metric_best (str, optional) – The metric for selecting the best validation performance. Options: auto, accuracy, auc.
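Aggregating over random seeds usually means summarizing one metric across repeated runs. The sketch below shows that idea with the standard library; the per-seed numbers are made up, and the choice of mean and sample standard deviation is an assumption, since the statistics agg_runs actually reports are not specified here.

```python
from statistics import mean, stdev

# Hypothetical final validation accuracies, keyed by random seed.
runs = {0: 0.80, 1: 0.82, 2: 0.84}


def aggregate(values):
    """Return (mean, sample standard deviation) over per-seed results."""
    values = list(values)
    spread = stdev(values) if len(values) > 1 else 0.0
    return mean(values), spread


avg, std = aggregate(runs.values())
print(round(avg, 4), round(std, 4))  # 0.82 0.02
```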

agg_batch(dir, metric_best='auto')[source]

Aggregate across results from multiple experiments via grid search.

Parameters:
  • dir (str) – Directory of the results, containing multiple experiments

  • metric_best (str, optional) – The metric for selecting the best validation performance. Options: auto, accuracy, auc.

params_count(model)[source]

Computes the number of parameters.

Parameters:

model (nn.Module) – PyTorch model

match_baseline_cfg(cfg_dict, cfg_dict_baseline, verbose=True)[source]

Match the computational budget of a given baseline model. The current configuration dictionary will be modified and returned.

Parameters:
  • cfg_dict (dict) – Current experiment’s configuration

  • cfg_dict_baseline (dict) – Baseline configuration

  • verbose (bool, optional) – Whether to print the matched parameter counts. (default: True)

get_current_gpu_usage()[source]

Get the current GPU memory usage.

auto_select_device()[source]

Auto select device for the current experiment.

is_eval_epoch(cur_epoch)[source]

Determines if the model should be evaluated at the current epoch.

is_ckpt_epoch(cur_epoch)[source]

Determines if the model should be checkpointed at the current epoch.
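Epoch predicates like these are typically periodic checks. The sketch below shows one plausible rule, firing on every `period`-th epoch and on the final epoch. Both the rule and the explicit `period` and `max_epoch` parameters are assumptions for illustration; the documented functions take only `cur_epoch`.

```python
def is_period_epoch(cur_epoch, period, max_epoch):
    """True on every `period`-th epoch (counting from 1) and on the last epoch."""
    return (cur_epoch + 1) % period == 0 or (cur_epoch + 1) == max_epoch


# Epochs are 0-based; with period 4 and 10 epochs in total:
selected = [e for e in range(10) if is_period_epoch(e, period=4, max_epoch=10)]
print(selected)  # [3, 7, 9]
```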

dict_to_json(dict, fname)[source]

Dump a dictionary to a JSON file.

Parameters:
  • dict (dict) – The dictionary.

  • fname (str) – The output file name.

dict_list_to_json(dict_list, fname)[source]

Dump a list of dictionaries to a JSON file.

Parameters:
  • dict_list (list of dict) – List of dictionaries.

  • fname (str) – The output file name.
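A standard-library sketch of dumping a list of dictionaries follows. Writing one JSON object per line is an assumption made for this example, and the helper name `write_json_lines` is hypothetical; the file layout produced by dict_list_to_json may differ.

```python
import json
import os
import tempfile


def write_json_lines(dict_list, fname):
    """Write each dictionary as one JSON object per line."""
    with open(fname, "w") as f:
        for d in dict_list:
            json.dump(d, f)
            f.write("\n")


stats = [{"epoch": 0, "loss": 1.5}, {"epoch": 1, "loss": 1.2}]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stats.json")
    write_json_lines(stats, path)
    # Read the file back line by line to confirm the round trip.
    with open(path) as f:
        records = [json.loads(line) for line in f]

print(records[1]["loss"])  # 1.2
```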

dict_to_tb(dict, writer, epoch)[source]

Add a dictionary of statistics to a Tensorboard writer.

Parameters:
  • dict (dict) – Statistics of experiments; the keys are attribute names, the values are the attribute values.

  • writer – Tensorboard writer object

  • epoch (int) – The current epoch

makedirs_rm_exist(dir)[source]

Make a directory, remove any existing data.

Parameters:

dir (str) – The directory to be created.
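The "create, but wipe anything already there" behavior can be sketched with `os` and `shutil`. The helper name `fresh_dir` is hypothetical, and this is an illustration of the idea rather than the library's code.

```python
import os
import shutil
import tempfile


def fresh_dir(path):
    """Create `path` as an empty directory, deleting anything already there."""
    if os.path.isdir(path):
        shutil.rmtree(path)
    os.makedirs(path, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "run")
    os.makedirs(target)
    # Leave a stale file behind to show that it gets removed.
    with open(os.path.join(target, "old.txt"), "w") as f:
        f.write("stale")
    fresh_dir(target)
    contents = os.listdir(target)

print(contents)  # []
```

Because existing data is deleted without confirmation, a helper like this should only be pointed at directories that hold disposable outputs.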

class dummy_context[source]

Default context manager that does nothing.
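A do-nothing context manager is useful as a default when a `with` block should run the same way whether or not a real context is supplied. The sketch below defines a hypothetical `NoOpContext` class to show the idea and compares it with the standard library's `contextlib.nullcontext`; the `run` function is likewise made up for this example.

```python
from contextlib import nullcontext


class NoOpContext:
    """A context manager whose enter and exit steps do nothing."""

    def __enter__(self):
        return None

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # do not suppress exceptions raised in the block


def run(ctx=None):
    # Fall back to the no-op context when no real one is supplied.
    with (ctx if ctx is not None else NoOpContext()):
        return "done"


print(run())               # done
print(run(nullcontext()))  # done
```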