torch_geometric.graphgym

Workflow and Register Modules

load_ckpt

Loads the model checkpoint at a given epoch.

save_ckpt

Saves the model checkpoint at a given epoch.

remove_ckpt

Removes the model checkpoint at a given epoch.

clean_ckpt

Removes all but the last model checkpoint.

parse_args

Parses the command line arguments.

cfg

set_cfg

Sets the default config values.

load_cfg

Load configurations from file system and command line

dump_cfg

Dumps the config to the output directory specified in cfg.out_dir

set_run_dir

Create the directory for each random seed experiment run

set_out_dir

Create the directory for full experiment run

get_fname

Extract filename from file name path

init_weights

Performs weight initialization

create_loader

Create data loader object

set_printing

Set up printing options

create_logger

Create logger for the experiment.

compute_loss

Compute loss and prediction score

create_model

Create model for graph machine learning.

create_optimizer

Creates a config-driven optimizer.

create_scheduler

Creates a config-driven learning rate scheduler.

train

register_base

Base function for registering a module in GraphGym.

register_act

Registers an activation function in GraphGym.

register_node_encoder

Registers a node feature encoder in GraphGym.

register_edge_encoder

Registers an edge feature encoder in GraphGym.

register_stage

Registers a customized GNN stage in GraphGym.

register_head

Registers a GNN prediction head in GraphGym.

register_layer

Registers a GNN layer in GraphGym.

register_pooling

Registers a GNN global pooling/readout layer in GraphGym.

register_network

Registers a GNN model in GraphGym.

register_config

Registers a configuration group in GraphGym.

register_dataset

Registers a dataset in GraphGym.

register_loader

Registers a data loader in GraphGym.

register_optimizer

Registers an optimizer in GraphGym.

register_scheduler

Registers a learning rate scheduler in GraphGym.

register_loss

Registers a loss function in GraphGym.

register_train

Registers a training function in GraphGym.

register_metric

Registers a metric function in GraphGym.

load_ckpt(model: Module, optimizer: Optional[Optimizer] = None, scheduler: Optional[Any] = None, epoch: int = -1) → int[source]

Loads the model checkpoint at a given epoch.

save_ckpt(model: Module, optimizer: Optional[Optimizer] = None, scheduler: Optional[Any] = None, epoch: int = 0)[source]

Saves the model checkpoint at a given epoch.

remove_ckpt(epoch: int = -1)[source]

Removes the model checkpoint at a given epoch.

clean_ckpt()[source]

Removes all but the last model checkpoint.
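
As a rough illustration of the "keep only the last checkpoint" behaviour, the sketch below operates on a plain directory. The `<epoch>.ckpt` file naming and the helper name `clean_ckpt_sketch` are assumptions made for this example; they are not taken from the GraphGym source.

```python
import os
import tempfile


def clean_ckpt_sketch(ckpt_dir):
    """Remove every '<epoch>.ckpt' file in ckpt_dir except the latest one.

    Hypothetical helper: the '<epoch>.ckpt' naming is an assumption.
    Returns the sorted list of epochs whose files were removed.
    """
    suffix = ".ckpt"
    epochs = sorted(
        int(name[:-len(suffix)])
        for name in os.listdir(ckpt_dir)
        if name.endswith(suffix) and name[:-len(suffix)].isdigit()
    )
    removed = epochs[:-1]  # everything except the highest epoch
    for epoch in removed:
        os.remove(os.path.join(ckpt_dir, f"{epoch}{suffix}"))
    return removed


# Usage: create three dummy checkpoints, then keep only the last one.
with tempfile.TemporaryDirectory() as tmp:
    for epoch in (0, 5, 10):
        open(os.path.join(tmp, f"{epoch}.ckpt"), "w").close()
    removed_epochs = clean_ckpt_sketch(tmp)
    remaining_files = sorted(os.listdir(tmp))
```

Sorting by the integer epoch rather than by file name matters here: a plain string sort would place `10.ckpt` before `5.ckpt`.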

parse_args() → Namespace[source]

Parses the command line arguments.

set_cfg(cfg)[source]

Sets the default config values.

1) For a given experiment, only some of the arguments are used. The remaining unused arguments have no effect, so feel free to register any argument in graphgym.contrib.config.

2) At most two levels of config are supported, e.g., cfg.dataset.name.

Returns

Configuration used by the experiment.

load_cfg(cfg, args)[source]

Load configurations from file system and command line

Parameters
  • cfg (CfgNode) – Configuration node

  • args (ArgumentParser) – Command argument parser

dump_cfg(cfg)[source]

Dumps the config to the output directory specified in cfg.out_dir

Parameters

cfg (CfgNode) – Configuration node

set_run_dir(out_dir)[source]

Create the directory for each random seed experiment run

Parameters

out_dir (string) – Directory for output, specified in cfg.out_dir

set_out_dir(out_dir, fname)[source]

Create the directory for full experiment run

Parameters
  • out_dir (string) – Directory for output, specified in cfg.out_dir

  • fname (string) – Filename for the yaml format configuration file

get_fname(fname)[source]

Extract filename from file name path

Parameters

fname (string) – Filename for the yaml format configuration file
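
A minimal sketch of what extracting a name from a configuration path can look like, using only `os.path`. The helper name `get_fname_sketch` and the choice to strip only `.yaml` and `.yml` extensions are assumptions for illustration, not a statement of how the library implements `get_fname`.

```python
import os


def get_fname_sketch(fname):
    """Return the base name of a config path without its YAML extension."""
    base = os.path.basename(fname)
    stem, ext = os.path.splitext(base)
    # Only strip extensions that look like YAML; keep anything else intact.
    return stem if ext in (".yaml", ".yml") else base


example_name = get_fname_sketch("configs/example.yaml")  # "example"
```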

init_weights(m)[source]

Performs weight initialization

Parameters

m (nn.Module) – PyTorch module

create_loader()[source]

Create data loader object

Returns: List of PyTorch data loaders

set_printing()[source]

Set up printing options

create_logger()[source]

Create logger for the experiment.

compute_loss(pred, true)[source]

Compute loss and prediction score

Parameters
  • pred (torch.tensor) – Unnormalized prediction

  • true (torch.tensor) – Ground truth

Returns: Loss, normalized prediction score

create_model(to_device=True, dim_in=None, dim_out=None) → GraphGymModule[source]

Create model for graph machine learning.

Parameters
  • to_device (string) – The device that the model will be transferred to

  • dim_in (int, optional) – Input dimension to the model

  • dim_out (int, optional) – Output dimension to the model

create_optimizer(params: Iterator[Parameter], cfg: Any) → Any[source]

Creates a config-driven optimizer.

create_scheduler(optimizer: Optimizer, cfg: Any) → Any[source]

Creates a config-driven learning rate scheduler.

register_base(mapping: Dict[str, Any], key: str, module: Optional[Any] = None) → Union[None, Callable][source]

Base function for registering a module in GraphGym.

Parameters
  • mapping (dict) – Python dictionary hosting all the registered modules, into which the module is registered.

  • key (string) – The name of the module.

  • module (any, optional) – The module. If set to None, will return a decorator to register a module.
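
The two calling conventions described above (pass `module` directly, or omit it and use the returned decorator) can be sketched with a small stand-alone function. This is an illustrative re-implementation under assumptions, not the library's code: the name `register_base_sketch`, the example registry `act_dict`, and the choice to raise `KeyError` on a duplicate key are all hypothetical.

```python
from typing import Any, Callable, Dict, Optional, Union


def register_base_sketch(
    mapping: Dict[str, Any],
    key: str,
    module: Optional[Any] = None,
) -> Union[None, Callable]:
    """Register `module` under `key`, or return a decorator if it is None."""
    if module is not None:
        if key in mapping:
            # Assumption of this sketch: duplicate keys are rejected.
            raise KeyError(f"Module with '{key}' already defined")
        mapping[key] = module
        return None

    # No module given: hand back a decorator that registers its argument.
    def bootstrap(target):
        register_base_sketch(mapping, key, target)
        return target

    return bootstrap


act_dict: Dict[str, Any] = {}  # hypothetical registry for this example

# Direct call: pass the object to register.
register_base_sketch(act_dict, "identity", lambda x: x)


# Decorator form: omit `module`.
@register_base_sketch(act_dict, "double")
def double(x):
    return 2 * x
```

The specialised functions listed below (`register_act`, `register_layer`, and so on) take only `key` and `module`, which suggests that each one supplies its own registry dictionary.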

register_act(key: str, module: Optional[Any] = None)[source]

Registers an activation function in GraphGym.

register_node_encoder(key: str, module: Optional[Any] = None)[source]

Registers a node feature encoder in GraphGym.

register_edge_encoder(key: str, module: Optional[Any] = None)[source]

Registers an edge feature encoder in GraphGym.

register_stage(key: str, module: Optional[Any] = None)[source]

Registers a customized GNN stage in GraphGym.

register_head(key: str, module: Optional[Any] = None)[source]

Registers a GNN prediction head in GraphGym.

register_layer(key: str, module: Optional[Any] = None)[source]

Registers a GNN layer in GraphGym.

register_pooling(key: str, module: Optional[Any] = None)[source]

Registers a GNN global pooling/readout layer in GraphGym.

register_network(key: str, module: Optional[Any] = None)[source]

Registers a GNN model in GraphGym.

register_config(key: str, module: Optional[Any] = None)[source]

Registers a configuration group in GraphGym.

register_dataset(key: str, module: Optional[Any] = None)[source]

Registers a dataset in GraphGym.

register_loader(key: str, module: Optional[Any] = None)[source]

Registers a data loader in GraphGym.

register_optimizer(key: str, module: Optional[Any] = None)[source]

Registers an optimizer in GraphGym.

register_scheduler(key: str, module: Optional[Any] = None)[source]

Registers a learning rate scheduler in GraphGym.

register_loss(key: str, module: Optional[Any] = None)[source]

Registers a loss function in GraphGym.

register_train(key: str, module: Optional[Any] = None)[source]

Registers a training function in GraphGym.

register_metric(key: str, module: Optional[Any] = None)[source]

Registers a metric function in GraphGym.

Model Modules

IntegerFeatureEncoder

Provides an encoder for integer node features.

AtomEncoder

The atom Encoder used in OGB molecule dataset.

BondEncoder

The bond Encoder used in OGB molecule dataset.

GNNLayer

Wrapper for a GNN layer

GNNPreMP

Wrapper for NN layer before GNN message passing

GNNStackStage

Simple stage that stacks GNN layers

FeatureEncoder

Encoding node and edge features

GNN

General GNN model: encoder + stage + head

GNNNodeHead

GNN prediction head for node prediction tasks.

GNNEdgeHead

GNN prediction head for edge/link prediction tasks.

GNNGraphHead

GNN prediction head for graph prediction tasks.

GeneralLayer

General wrapper for layers

GeneralMultiLayer

General wrapper for a stack of multiple layers

Linear

Basic Linear layer.

BatchNorm1dNode

BatchNorm for node feature.

BatchNorm1dEdge

BatchNorm for edge feature.

MLP

Basic MLP model.

GCNConv

Graph Convolutional Network (GCN) layer

SAGEConv

GraphSAGE Conv layer

GATConv

Graph Attention Network (GAT) layer

GINConv

Graph Isomorphism Network (GIN) layer

SplineConv

SplineCNN layer

GeneralConv

A general GNN layer

GeneralEdgeConv

A general GNN layer that supports edge features as well

GeneralSampleEdgeConv

A general GNN layer that supports edge features and edge sampling

global_add_pool

Returns batch-wise graph-level-outputs by adding node features across the node dimension, so that for a single graph \(\mathcal{G}_i\) its output is computed by

global_mean_pool

Returns batch-wise graph-level-outputs by averaging node features across the node dimension, so that for a single graph \(\mathcal{G}_i\) its output is computed by

global_max_pool

Returns batch-wise graph-level-outputs by taking the channel-wise maximum across the node dimension, so that for a single graph \(\mathcal{G}_i\) its output is computed by

class IntegerFeatureEncoder(emb_dim, num_classes=None)[source]

Provides an encoder for integer node features.

Parameters
  • emb_dim (int) – Output embedding dimension

  • num_classes (int) – The number of classes for the embedding mapping to learn from

class AtomEncoder(emb_dim, num_classes=None)[source]

The atom Encoder used in OGB molecule dataset.

Parameters
  • emb_dim (int) – Output embedding dimension

  • num_classes – None

class BondEncoder(emb_dim)[source]

The bond Encoder used in OGB molecule dataset.

Parameters

emb_dim (int) – Output edge embedding dimension

GNNLayer(dim_in, dim_out, has_act=True)[source]

Wrapper for a GNN layer

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • has_act (bool) – Whether an activation function follows the layer

GNNPreMP(dim_in, dim_out, num_layers)[source]

Wrapper for NN layer before GNN message passing

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • num_layers (int) – Number of layers

class GNNStackStage(dim_in, dim_out, num_layers)[source]

Simple stage that stacks GNN layers

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • num_layers (int) – Number of GNN layers

class FeatureEncoder(dim_in)[source]

Encoding node and edge features

Parameters

dim_in (int) – Input feature dimension

class GNN(dim_in, dim_out, **kwargs)[source]

General GNN model: encoder + stage + head

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • **kwargs (optional) – Optional additional args

class GNNNodeHead(dim_in, dim_out)[source]

GNN prediction head for node prediction tasks.

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension. For binary prediction, dim_out=1.

class GNNEdgeHead(dim_in, dim_out)[source]

GNN prediction head for edge/link prediction tasks.

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension. For binary prediction, dim_out=1.

class GNNGraphHead(dim_in, dim_out)[source]

GNN prediction head for graph prediction tasks. The optional post_mp layer (specified by cfg.gnn.post_mp) is used to transform the pooled embedding using an MLP.

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension. For binary prediction, dim_out=1.

class GeneralLayer(name, layer_config: LayerConfig, **kwargs)[source]

General wrapper for layers

Parameters
  • name (string) – Name of the layer in registered layer_dict

  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • has_act (bool) – Whether an activation follows the layer

  • has_bn (bool) – Whether the layer includes BatchNorm

  • has_l2norm (bool) – Whether L2 normalization follows the layer

  • **kwargs (optional) – Additional args

class GeneralMultiLayer(name, layer_config: LayerConfig, **kwargs)[source]

General wrapper for a stack of multiple layers

Parameters
  • name (string) – Name of the layer in registered layer_dict

  • num_layers (int) – Number of layers in the stack

  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • dim_inner (int) – The dimension for the inner layers

  • final_act (bool) – Whether an activation follows the layer stack

  • **kwargs (optional) – Additional args

class Linear(layer_config: LayerConfig, **kwargs)[source]

Basic Linear layer.

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • bias (bool) – Whether the layer has a bias term

  • **kwargs (optional) – Additional args

class BatchNorm1dNode(layer_config: LayerConfig)[source]

BatchNorm for node feature.

Parameters

dim_in (int) – Input dimension

class BatchNorm1dEdge(layer_config: LayerConfig)[source]

BatchNorm for edge feature.

Parameters

dim_in (int) – Input dimension

class MLP(layer_config: LayerConfig, **kwargs)[source]

Basic MLP model. Here, a 1-layer MLP is equivalent to a Linear layer.

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • bias (bool) – Whether the layer has a bias term

  • dim_inner (int) – The dimension for the inner layers

  • num_layers (int) – Number of layers in the stack

  • **kwargs (optional) – Additional args

class GCNConv(layer_config: LayerConfig, **kwargs)[source]

Graph Convolutional Network (GCN) layer

class SAGEConv(layer_config: LayerConfig, **kwargs)[source]

GraphSAGE Conv layer

class GATConv(layer_config: LayerConfig, **kwargs)[source]

Graph Attention Network (GAT) layer

class GINConv(layer_config: LayerConfig, **kwargs)[source]

Graph Isomorphism Network (GIN) layer

class SplineConv(layer_config: LayerConfig, **kwargs)[source]

SplineCNN layer

class GeneralConv(layer_config: LayerConfig, **kwargs)[source]

A general GNN layer

class GeneralEdgeConv(layer_config: LayerConfig, **kwargs)[source]

A general GNN layer that supports edge features as well

class GeneralSampleEdgeConv(layer_config: LayerConfig, **kwargs)[source]

A general GNN layer that supports edge features and edge sampling

global_add_pool(x: Tensor, batch: Optional[Tensor], size: Optional[int] = None) → Tensor[source]

Returns batch-wise graph-level-outputs by adding node features across the node dimension, so that for a single graph \(\mathcal{G}_i\) its output is computed by

\[\mathbf{r}_i = \sum_{n=1}^{N_i} \mathbf{x}_n\]
Parameters
  • x (Tensor) – Node feature matrix \(\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}\).

  • batch (LongTensor, optional) – Batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each node to a specific example.

  • size (int, optional) – Batch-size \(B\). Automatically calculated if not given. (default: None)

global_mean_pool(x: Tensor, batch: Optional[Tensor], size: Optional[int] = None) → Tensor[source]

Returns batch-wise graph-level-outputs by averaging node features across the node dimension, so that for a single graph \(\mathcal{G}_i\) its output is computed by

\[\mathbf{r}_i = \frac{1}{N_i} \sum_{n=1}^{N_i} \mathbf{x}_n\]
Parameters
  • x (Tensor) – Node feature matrix \(\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}\).

  • batch (LongTensor, optional) – Batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each node to a specific example.

  • size (int, optional) – Batch-size \(B\). Automatically calculated if not given. (default: None)

global_max_pool(x: Tensor, batch: Optional[Tensor], size: Optional[int] = None) → Tensor[source]

Returns batch-wise graph-level-outputs by taking the channel-wise maximum across the node dimension, so that for a single graph \(\mathcal{G}_i\) its output is computed by

\[\mathbf{r}_i = \mathrm{max}_{n=1}^{N_i} \, \mathbf{x}_n\]
Parameters
  • x (Tensor) – Node feature matrix \(\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}\).

  • batch (LongTensor, optional) – Batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each node to a specific example.

  • size (int, optional) – Batch-size \(B\). Automatically calculated if not given. (default: None)
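
The three readout formulas above (sum, mean, and channel-wise maximum over the nodes of each graph) can be checked by hand with a small sketch. To stay self-contained it uses plain Python lists instead of tensors; the function name `global_pool_sketch` and its `reduce` argument are hypothetical and only mirror the `x`, `batch` and `size` parameters documented here.

```python
def global_pool_sketch(x, batch, reduce="add", size=None):
    """Pool node feature rows in x into one row per graph.

    x:     list of N feature rows (each a list of F numbers)
    batch: list of N graph indices in {0, ..., B-1}
    size:  number of graphs B; inferred from batch if not given
    """
    num_graphs = size if size is not None else max(batch) + 1
    groups = [[] for _ in range(num_graphs)]
    for row, graph_idx in zip(x, batch):
        groups[graph_idx].append(row)

    out = []
    for rows in groups:
        # Channel-wise view of one graph; a graph without nodes
        # yields an empty row in this sketch.
        columns = list(zip(*rows))
        if reduce == "add":
            out.append([sum(col) for col in columns])
        elif reduce == "mean":
            out.append([sum(col) / len(col) for col in columns])
        elif reduce == "max":
            out.append([max(col) for col in columns])
        else:
            raise ValueError(f"unknown reduce: {reduce}")
    return out


# Two graphs: nodes 0-1 belong to graph 0, nodes 2-4 to graph 1.
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 0.0], [1.0, 1.0], [0.0, 8.0]]
batch = [0, 0, 1, 1, 1]

added = global_pool_sketch(x, batch, "add")      # [[4.0, 6.0], [6.0, 9.0]]
averaged = global_pool_sketch(x, batch, "mean")  # [[2.0, 3.0], [2.0, 3.0]]
maxed = global_pool_sketch(x, batch, "max")      # [[3.0, 4.0], [5.0, 8.0]]
```

Each output row corresponds to one \(\mathbf{r}_i\), so the result has \(B\) rows and \(F\) columns.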

Utility Modules

agg_runs

Aggregate over different random seeds of a single experiment

agg_batch

Aggregate across results from multiple experiments via grid search

params_count

Computes the number of parameters.

match_baseline_cfg

Match the computational budget of a given baseline model.

get_current_gpu_usage

Get the current GPU memory usage.

auto_select_device

Auto select device for the experiment.

is_eval_epoch

Determines if the model should be evaluated at the current epoch.

is_ckpt_epoch

Determines if the model should be checkpointed at the current epoch.

dict_to_json

Dump a Python dictionary to JSON file

dict_list_to_json

Dump a list of Python dictionaries to JSON file

dict_to_tb

Add a dictionary of statistics to a Tensorboard writer

makedirs_rm_exist

Make a directory, remove any existing data.

dummy_context

Default context manager that does nothing

agg_runs(dir, metric_best='auto')[source]

Aggregate over different random seeds of a single experiment

Parameters
  • dir (str) – Directory of the results, containing 1 experiment

  • metric_best (str, optional) – The metric for selecting the best validation performance. Options: auto, accuracy, auc.

agg_batch(dir, metric_best='auto')[source]

Aggregate across results from multiple experiments via grid search

Parameters
  • dir (str) – Directory of the results, containing multiple experiments

  • metric_best (str, optional) – The metric for selecting the best validation performance. Options: auto, accuracy, auc.

params_count(model)[source]

Computes the number of parameters.

Parameters

model (nn.Module) – PyTorch model

match_baseline_cfg(cfg_dict, cfg_dict_baseline, verbose=True)[source]

Match the computational budget of a given baseline model. The current configuration dictionary will be modified and returned.

Parameters
  • cfg_dict (dict) – Current experiment’s configuration

  • cfg_dict_baseline (dict) – Baseline configuration

  • verbose (bool, optional) – Whether to print the matched parameter counts

get_current_gpu_usage()[source]

Get the current GPU memory usage.

auto_select_device(memory_max=8000, memory_bias=200, strategy='random')[source]

Auto select device for the experiment. Useful when having multiple GPUs.

Parameters
  • memory_max (int) – Threshold of existing GPU memory usage. GPUs with memory usage beyond this threshold will be deprioritized.

  • memory_bias (int) – A bias GPU memory usage added to all the GPUs. Avoids a divide-by-zero error.

  • strategy (str, optional) – ‘random’ (randomly select a GPU) or ‘greedy’ (greedily select a GPU)
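
One way to read the three parameters together is sketched below. It takes a list of per-GPU memory readings instead of querying real devices. The inverse-usage weighting, the zero weight for GPUs above the threshold, the fallback to the least-loaded GPU when every GPU is above the threshold, and the names `select_gpu_sketch` and `rng` are assumptions of this sketch rather than documented behaviour.

```python
import random


def select_gpu_sketch(memory_used, memory_max=8000, memory_bias=200,
                      strategy="random", rng=random):
    """Pick a GPU index from a list of per-GPU memory usages.

    Hypothetical stand-in: the weighting scheme below is an assumption.
    """
    all_busy = all(used > memory_max for used in memory_used)
    if strategy == "greedy" or all_busy:
        # The least-loaded GPU wins.
        return min(range(len(memory_used)), key=memory_used.__getitem__)
    if strategy != "random":
        raise ValueError(f"unknown strategy: {strategy}")
    # Weight each GPU by 1 / (usage + bias); the bias keeps an idle GPU
    # (usage 0) from causing a division by zero. GPUs above the
    # threshold get weight 0, so they are never drawn in this branch.
    weights = [
        0.0 if used > memory_max else 1.0 / (used + memory_bias)
        for used in memory_used
    ]
    return rng.choices(range(len(memory_used)), weights=weights, k=1)[0]


usage = [9000, 0, 4000]  # made-up readings for three GPUs
greedy_choice = select_gpu_sketch(usage, strategy="greedy")  # index 1
random_choice = select_gpu_sketch(usage, strategy="random",
                                  rng=random.Random(0))
```

With these readings the random strategy can only return index 1 or 2, and index 1 is far more likely because its weight is 1/200 against 1/4200.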

is_eval_epoch(cur_epoch)[source]

Determines if the model should be evaluated at the current epoch.

is_ckpt_epoch(cur_epoch)[source]

Determines if the model should be checkpointed at the current epoch.

dict_to_json(dict, fname)[source]

Dump a Python dictionary to JSON file

Parameters
  • dict (dict) – Python dictionary

  • fname (str) – Output file name

dict_list_to_json(dict_list, fname)[source]

Dump a list of Python dictionaries to JSON file

Parameters
  • dict_list (list of dict) – List of Python dictionaries

  • fname (str) – Output file name
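
For orientation, both JSON helpers can be approximated with the standard `json` module. The function names with the `_sketch` suffix are hypothetical, and writing one JSON object per line for the list variant is an assumption of this example, not a documented file layout.

```python
import json
import os
import tempfile


def dict_to_json_sketch(stats, fname):
    """Write one dictionary to fname as a single JSON document."""
    with open(fname, "w") as f:
        json.dump(stats, f)


def dict_list_to_json_sketch(stats_list, fname):
    """Write a list of dictionaries to fname, one JSON object per line.

    The one-object-per-line layout is an assumption of this sketch.
    """
    with open(fname, "w") as f:
        for stats in stats_list:
            f.write(json.dumps(stats) + "\n")


# Round trip: write two epochs of made-up statistics and read them back.
records = [{"epoch": 0, "loss": 0.9}, {"epoch": 1, "loss": 0.7}]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stats.json")
    dict_list_to_json_sketch(records, path)
    with open(path) as f:
        loaded = [json.loads(line) for line in f]

    single_path = os.path.join(tmp, "best.json")
    dict_to_json_sketch(records[-1], single_path)
    with open(single_path) as f:
        loaded_single = json.load(f)
```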

dict_to_tb(dict, writer, epoch)[source]

Add a dictionary of statistics to a Tensorboard writer

Parameters
  • dict (dict) – Statistics of experiments, the keys are attribute names, the values are the attribute values

  • writer – Tensorboard writer object

  • epoch (int) – The current epoch

makedirs_rm_exist(dir)[source]

Make a directory, remove any existing data.

Parameters

dir (str) – The directory to be created.
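
The "make a directory, remove any existing data" behaviour can be sketched with `shutil` and `os`. The helper name `makedirs_rm_exist_sketch` is hypothetical, and the code is an illustration of the idea rather than the library's implementation.

```python
import os
import shutil
import tempfile


def makedirs_rm_exist_sketch(dir_path):
    """Create dir_path as an empty directory, deleting any old contents."""
    if os.path.isdir(dir_path):
        shutil.rmtree(dir_path)
    os.makedirs(dir_path, exist_ok=True)


# Usage: a directory holding a stale file comes back empty.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "results")
    os.makedirs(target)
    open(os.path.join(target, "stale.txt"), "w").close()
    makedirs_rm_exist_sketch(target)
    contents_after = os.listdir(target)
```

Because existing data is deleted without confirmation, a function of this kind should only be pointed at directories that hold disposable experiment output.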

class dummy_context[source]

Default context manager that does nothing
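
A context manager that does nothing is useful as a placeholder wherever a `with` block is required but no setup or teardown is needed. The class below is a minimal sketch of that idea with a hypothetical name; the standard library offers `contextlib.nullcontext` for the same purpose.

```python
import contextlib


class DummyContextSketch:
    """Context manager whose enter and exit steps do nothing."""

    def __enter__(self):
        return None

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # do not swallow exceptions raised in the body


events = []

# The body runs exactly as it would without the `with` statement.
with DummyContextSketch():
    events.append("custom no-op")

# Standard-library counterpart.
with contextlib.nullcontext():
    events.append("nullcontext")
```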