torch_geometric.graphgym

Workflow Modules

load_ckpt

Load latest model checkpoint

save_ckpt

Save model checkpoint at given epoch

clean_ckpt

Only keep the latest model checkpoint, remove all the older checkpoints

parse_args

Parses the command line arguments.

cfg

CfgNode represents an internal node in the configuration tree.

set_cfg

This function sets the default config value.

load_cfg

Load configurations from file system and command line

dump_cfg

Dumps the config to the output directory specified in cfg.out_dir

set_run_dir

Create the directory for each random seed experiment run

set_agg_dir

Create the directory for aggregated results over all the random seeds

init_weights

Performs weight initialization

create_loader

Create data loader object

set_printing

Set up printing options

create_logger

Create logger for the experiment

compute_loss

Compute loss and prediction score

create_model

Create model for graph machine learning

create_optimizer

Create optimizer for the model

create_scheduler

Create learning rate scheduler for the optimizer

train

The core training pipeline

load_ckpt(model, optimizer=None, scheduler=None)[source]

Load latest model checkpoint

Parameters
  • model (torch.nn.Module) – The model that will be loaded

  • optimizer (torch.optim, optional) – The optimizer that will be loaded

  • scheduler (torch.optim, optional) – The scheduler that will be loaded

Returns

Epoch count after loading the model

save_ckpt(model, optimizer, scheduler, epoch)[source]

Save model checkpoint at given epoch

Parameters
  • model (torch.nn.Module) – The model that will be saved

  • optimizer (torch.optim) – The optimizer that will be saved

  • scheduler (torch.optim) – The scheduler that will be saved

  • epoch (int) – The epoch when the model is saved

clean_ckpt()[source]

Only keep the latest model checkpoint, remove all the older checkpoints
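The "keep only the latest checkpoint" behavior can be illustrated with plain files named by epoch. This is a minimal sketch under stated assumptions: the `<epoch>.ckpt` naming scheme and the helpers `epoch_of`, `latest_epoch` and `clean_old` are hypothetical and do not describe GraphGym's actual file layout or API.

```python
import os
import tempfile


def epoch_of(name):
    """Return the epoch encoded in '<epoch>.ckpt', or None for other files."""
    stem, ext = os.path.splitext(name)
    return int(stem) if ext == ".ckpt" and stem.isdigit() else None


def latest_epoch(ckpt_dir):
    """Return the largest checkpoint epoch in ckpt_dir, or None if empty."""
    epochs = [e for e in map(epoch_of, os.listdir(ckpt_dir)) if e is not None]
    return max(epochs) if epochs else None


def clean_old(ckpt_dir):
    """Delete every '<epoch>.ckpt' file except the most recent one."""
    newest = latest_epoch(ckpt_dir)
    for name in os.listdir(ckpt_dir):
        epoch = epoch_of(name)
        if epoch is not None and epoch != newest:
            os.remove(os.path.join(ckpt_dir, name))


# Hypothetical demo: write three empty checkpoint files, then prune.
demo_dir = tempfile.mkdtemp()
for epoch in (0, 10, 20):
    open(os.path.join(demo_dir, f"{epoch}.ckpt"), "w").close()
clean_old(demo_dir)
remaining = sorted(os.listdir(demo_dir))
```

After pruning, resuming would start from the epoch reported by `latest_epoch`.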

parse_args()[source]

Parses the command line arguments.

set_cfg(cfg)[source]

This function sets the default config value.

1) Note that for an experiment, only part of the arguments will be used. The remaining unused arguments won’t affect anything, so feel free to register any argument in graphgym.contrib.config.

2) We support at most two levels of configs, e.g., cfg.dataset.name.

Returns

Configuration used by the experiment.
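The two-level layout described for set_cfg can be pictured with nested namespaces. This is only a sketch: it uses `types.SimpleNamespace` rather than the real CfgNode, and the default values and the `gnn.hidden_dim` option are made up for illustration.

```python
from types import SimpleNamespace


def make_default_cfg():
    """Build a small two-level config tree with default values."""
    cfg = SimpleNamespace()
    cfg.out_dir = "results"          # top-level option (illustrative default)
    cfg.dataset = SimpleNamespace()  # first-level group
    cfg.dataset.name = "Cora"        # second-level option (illustrative default)
    cfg.gnn = SimpleNamespace()
    cfg.gnn.hidden_dim = 64          # hypothetical option, unused below
    return cfg


cfg = make_default_cfg()
cfg.dataset.name = "CiteSeer"  # an experiment overrides only what it needs
```

Options that an experiment never reads, such as `gnn.hidden_dim` here, simply keep their defaults.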

load_cfg(cfg, args)[source]

Load configurations from file system and command line

Parameters
  • cfg (CfgNode) – Configuration node

  • args (ArgumentParser) – Command argument parser

dump_cfg(cfg)[source]

Dumps the config to the output directory specified in cfg.out_dir

Parameters

cfg (CfgNode) – Configuration node

set_run_dir(out_dir, fname)[source]

Create the directory for each random seed experiment run

Parameters
  • out_dir (string) – Directory for output, specified in cfg.out_dir

  • fname (string) – Filename for the yaml format configuration file

set_agg_dir(out_dir, fname)[source]

Create the directory for aggregated results over all the random seeds

Parameters
  • out_dir (string) – Directory for output, specified in cfg.out_dir

  • fname (string) – Filename for the yaml format configuration file
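One plausible way to derive a per-seed directory from the arguments of set_run_dir and set_agg_dir is sketched below. The `<out_dir>/<config name>/<seed>` layout, the helper `run_dir_for` and its explicit `seed` argument are assumptions for illustration and are not confirmed by this reference.

```python
import os
import tempfile


def run_dir_for(out_dir, fname, seed):
    """Create and return '<out_dir>/<config name>/<seed>'."""
    # Strip the directory and the '.yaml' extension from the config file name.
    stem = os.path.splitext(os.path.basename(fname))[0]
    path = os.path.join(out_dir, stem, str(seed))
    os.makedirs(path, exist_ok=True)
    return path


base = tempfile.mkdtemp()
paths = [run_dir_for(base, "configs/example.yaml", seed) for seed in (0, 1)]
```

Under this assumed layout, results aggregated over seeds would sit next to the per-seed folders, under the same config name.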

init_weights(m)[source]

Performs weight initialization

Parameters

m (nn.Module) – PyTorch module

create_loader()[source]

Create data loader object

Returns: List of PyTorch data loaders

set_printing()[source]

Set up printing options

create_logger()[source]

Create logger for the experiment

Returns: List of logger objects

compute_loss(pred, true)[source]

Compute loss and prediction score

Parameters
  • pred (torch.tensor) – Unnormalized prediction

  • true (torch.tensor) – Ground truth

Returns: Loss, normalized prediction score
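To make "unnormalized prediction" and "normalized prediction score" concrete, here is a pure-Python sketch for the binary case: a sigmoid turns raw logits into scores, and binary cross-entropy gives the loss. The helper `binary_loss_and_score` is hypothetical, and which loss GraphGym actually applies depends on configuration that is not shown in this reference.

```python
import math


def binary_loss_and_score(pred, true):
    """Return (mean binary cross-entropy, sigmoid scores) for raw logits."""
    scores = [1.0 / (1.0 + math.exp(-p)) for p in pred]
    eps = 1e-12  # guards against log(0)
    losses = [
        -(t * math.log(s + eps) + (1 - t) * math.log(1 - s + eps))
        for s, t in zip(scores, true)
    ]
    return sum(losses) / len(losses), scores


# Made-up logits and labels for two samples.
loss, score = binary_loss_and_score([0.0, 2.0], [0, 1])
```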

create_model(to_device=True, dim_in=None, dim_out=None)[source]

Create model for graph machine learning

Parameters
  • to_device (string) – The device that the model will be transferred to

  • dim_in (int, optional) – Input dimension to the model

  • dim_out (int, optional) – Output dimension to the model

create_optimizer(params, optimizer_config: torch_geometric.graphgym.optimizer.OptimizerConfig)[source]

Create optimizer for the model

Parameters

params – PyTorch model parameters

Returns: PyTorch optimizer

create_scheduler(optimizer, scheduler_config: torch_geometric.graphgym.optimizer.SchedulerConfig)[source]

Create learning rate scheduler for the optimizer

Parameters

optimizer – PyTorch optimizer

Returns: PyTorch scheduler

train(loggers, loaders, model, optimizer, scheduler)[source]

The core training pipeline

Parameters
  • loggers – List of loggers

  • loaders – List of loaders

  • model – GNN model

  • optimizer – PyTorch optimizer

  • scheduler – PyTorch learning rate scheduler

Model Modules

IntegerFeatureEncoder

Provides an encoder for integer node features.

AtomEncoder

The atom Encoder used in OGB molecule dataset.

BondEncoder

The bond Encoder used in OGB molecule dataset.

GNNLayer

Wrapper for a GNN layer

GNNPreMP

Wrapper for NN layer before GNN message passing

GNNStackStage

Simple stage that stacks GNN layers

FeatureEncoder

Encoding node and edge features

GNN

General GNN model: encoder + stage + head

GNNNodeHead

GNN prediction head for node prediction tasks.

GNNEdgeHead

GNN prediction head for edge/link prediction tasks.

GNNGraphHead

GNN prediction head for graph prediction tasks.

GeneralLayer

General wrapper for layers

GeneralMultiLayer

General wrapper for a stack of multiple layers

Linear

Basic Linear layer.

BatchNorm1dNode

BatchNorm for node feature.

BatchNorm1dEdge

BatchNorm for edge feature.

MLP

Basic MLP model.

GCNConv

Graph Convolutional Network (GCN) layer

SAGEConv

GraphSAGE Conv layer

GATConv

Graph Attention Network (GAT) layer

GINConv

Graph Isomorphism Network (GIN) layer

SplineConv

SplineCNN layer

GeneralConv

A general GNN layer

GeneralEdgeConv

A general GNN layer that supports edge features as well

GeneralSampleEdgeConv

A general GNN layer that supports edge features and edge sampling

global_add_pool

Globally pool node embeddings into graph embeddings, via elementwise sum.

global_mean_pool

Globally pool node embeddings into graph embeddings, via elementwise mean.

global_max_pool

Globally pool node embeddings into graph embeddings, via elementwise max.

class IntegerFeatureEncoder(emb_dim, num_classes=None)[source]

Provides an encoder for integer node features.

Parameters
  • emb_dim (int) – Output embedding dimension

  • num_classes (int) – The number of classes for the embedding mapping to learn from

class AtomEncoder(emb_dim, num_classes=None)[source]

The atom Encoder used in OGB molecule dataset.

Parameters
  • emb_dim (int) – Output embedding dimension

  • num_classes – None

class BondEncoder(emb_dim)[source]

The bond Encoder used in OGB molecule dataset.

Parameters

emb_dim (int) – Output edge embedding dimension

GNNLayer(dim_in, dim_out, has_act=True)[source]

Wrapper for a GNN layer

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • has_act (bool) – Whether to apply an activation function after the layer

GNNPreMP(dim_in, dim_out)[source]

Wrapper for NN layer before GNN message passing

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

class GNNStackStage(dim_in, dim_out, num_layers)[source]

Simple stage that stacks GNN layers

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • num_layers (int) – Number of GNN layers

class FeatureEncoder(dim_in)[source]

Encoding node and edge features

Parameters

dim_in (int) – Input feature dimension

class GNN(dim_in, dim_out, **kwargs)[source]

General GNN model: encoder + stage + head

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • **kwargs (optional) – Optional additional args

class GNNNodeHead(dim_in, dim_out)[source]

GNN prediction head for node prediction tasks.

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension. For binary prediction, dim_out=1.

class GNNEdgeHead(dim_in, dim_out)[source]

GNN prediction head for edge/link prediction tasks.

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension. For binary prediction, dim_out=1.

class GNNGraphHead(dim_in, dim_out)[source]

GNN prediction head for graph prediction tasks. The optional post_mp layer (specified by cfg.gnn.post_mp) is used to transform the pooled embedding using an MLP.

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension. For binary prediction, dim_out=1.

class GeneralLayer(name, dim_in, dim_out, has_act=True, has_bn=True, has_l2norm=False, **kwargs)[source]

General wrapper for layers

Parameters
  • name (string) – Name of the layer in registered layer_dict

  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • has_act (bool) – Whether to apply an activation after the layer

  • has_bn (bool) – Whether to apply BatchNorm in the layer

  • has_l2norm (bool) – Whether to apply L2 normalization after the layer

  • **kwargs (optional) – Additional args

class GeneralMultiLayer(name, num_layers, dim_in, dim_out, dim_inner=None, final_act=True, **kwargs)[source]

General wrapper for a stack of multiple layers

Parameters
  • name (string) – Name of the layer in registered layer_dict

  • num_layers (int) – Number of layers in the stack

  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • dim_inner (int) – The dimension for the inner layers

  • final_act (bool) – Whether to apply an activation after the layer stack

  • **kwargs (optional) – Additional args
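How dim_in, dim_inner and dim_out might be distributed over a stack of num_layers layers can be sketched as follows. The helper `layer_dims` is hypothetical, and the fallback of dim_inner to dim_in when it is None is an assumption, not something this reference states.

```python
def layer_dims(num_layers, dim_in, dim_out, dim_inner=None):
    """List the (input, output) dimension of each layer in a stack."""
    # Assumption: inner layers reuse dim_in when no inner width is given.
    dim_inner = dim_in if dim_inner is None else dim_inner
    dims = []
    for i in range(num_layers):
        d_in = dim_in if i == 0 else dim_inner
        d_out = dim_out if i == num_layers - 1 else dim_inner
        dims.append((d_in, d_out))
    return dims


dims = layer_dims(num_layers=3, dim_in=16, dim_out=4, dim_inner=32)
single = layer_dims(num_layers=1, dim_in=16, dim_out=4)
```

With a single layer the stack reduces to one dim_in to dim_out mapping, and dim_inner is never used.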

class Linear(dim_in, dim_out, bias=False, **kwargs)[source]

Basic Linear layer.

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • bias (bool) – Whether to include a bias term

  • **kwargs (optional) – Additional args

class BatchNorm1dNode(dim_in)[source]

BatchNorm for node feature.

Parameters

dim_in (int) – Input dimension

class BatchNorm1dEdge(dim_in)[source]

BatchNorm for edge feature.

Parameters

dim_in (int) – Input dimension

class MLP(dim_in, dim_out, bias=True, dim_inner=None, num_layers=2, **kwargs)[source]

Basic MLP model. Here 1-layer MLP is equivalent to a Linear layer.

Parameters
  • dim_in (int) – Input dimension

  • dim_out (int) – Output dimension

  • bias (bool) – Whether to include a bias term

  • dim_inner (int) – The dimension for the inner layers

  • num_layers (int) – Number of layers in the stack

  • **kwargs (optional) – Additional args

class GCNConv(dim_in, dim_out, bias=False, **kwargs)[source]

Graph Convolutional Network (GCN) layer

class SAGEConv(dim_in, dim_out, bias=False, **kwargs)[source]

GraphSAGE Conv layer

class GATConv(dim_in, dim_out, bias=False, **kwargs)[source]

Graph Attention Network (GAT) layer

class GINConv(dim_in, dim_out, bias=False, **kwargs)[source]

Graph Isomorphism Network (GIN) layer

class SplineConv(dim_in, dim_out, bias=False, **kwargs)[source]

SplineCNN layer

class GeneralConv(dim_in, dim_out, bias=False, **kwargs)[source]

A general GNN layer

class GeneralEdgeConv(dim_in, dim_out, bias=False, **kwargs)[source]

A general GNN layer that supports edge features as well

class GeneralSampleEdgeConv(dim_in, dim_out, bias=False, **kwargs)[source]

A general GNN layer that supports edge features and edge sampling

global_add_pool(x, batch, size=None)[source]

Globally pool node embeddings into graph embeddings, via elementwise sum. Pooling function takes in node embedding [num_nodes x emb_dim] and batch (indices) and outputs graph embedding [num_graphs x emb_dim].

Parameters
  • x (torch.tensor) – Input node embeddings

  • batch (torch.tensor) – Batch tensor that indicates which node belongs to which graph

  • size (optional) – Total number of graphs. Can be auto-inferred.

Returns: Pooled graph embeddings

global_mean_pool(x, batch, size=None)[source]

Globally pool node embeddings into graph embeddings, via elementwise mean. Pooling function takes in node embedding [num_nodes x emb_dim] and batch (indices) and outputs graph embedding [num_graphs x emb_dim].

Parameters
  • x (torch.tensor) – Input node embeddings

  • batch (torch.tensor) – Batch tensor that indicates which node belongs to which graph

  • size (optional) – Total number of graphs. Can be auto-inferred.

Returns: Pooled graph embeddings

global_max_pool(x, batch, size=None)[source]

Globally pool node embeddings into graph embeddings, via elementwise max. Pooling function takes in node embedding [num_nodes x emb_dim] and batch (indices) and outputs graph embedding [num_graphs x emb_dim].

Parameters
  • x (torch.tensor) – Input node embeddings

  • batch (torch.tensor) – Batch tensor that indicates which node belongs to which graph

  • size (optional) – Total number of graphs. Can be auto-inferred.

Returns: Pooled graph embeddings
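The three pooling functions differ only in the reduction applied to the nodes of each graph. The pure-Python sketch below shows the shape contract, [num_nodes x emb_dim] in and [num_graphs x emb_dim] out, using nested lists instead of tensors. The helper `global_pool` and its `reduce` argument are hypothetical and are not part of the library.

```python
def global_pool(x, batch, reduce="sum", size=None):
    """Pool per-node rows of x into one row per graph.

    x is a list of num_nodes rows; batch[i] is the graph index of node i.
    """
    size = max(batch) + 1 if size is None else size  # auto-infer num_graphs
    groups = [[] for _ in range(size)]
    for row, graph in zip(x, batch):
        groups[graph].append(row)
    out = []
    for rows in groups:
        cols = list(zip(*rows))  # transpose: one tuple per embedding dimension
        if reduce == "sum":
            out.append([sum(c) for c in cols])
        elif reduce == "mean":
            out.append([sum(c) / len(c) for c in cols])
        elif reduce == "max":
            out.append([max(c) for c in cols])
        else:
            raise ValueError(f"unknown reduce: {reduce}")
    return out


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 nodes, emb_dim = 2
batch = [0, 0, 1]  # nodes 0 and 1 belong to graph 0, node 2 to graph 1
pooled_sum = global_pool(x, batch, "sum")
pooled_mean = global_pool(x, batch, "mean")
pooled_max = global_pool(x, batch, "max")
```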

Register Modules

register

Base function for registering a customized module to a module dictionary

register_act

Register a customized activation function.

register_node_encoder

Register a customized node feature encoder.

register_edge_encoder

Register a customized edge feature encoder.

register_stage

Register a customized GNN stage (consists of multiple layers).

register_head

Register a customized GNN prediction head.

register_layer

Register a customized GNN layer.

register_pooling

Register a customized GNN pooling layer (for graph classification).

register_network

Register a customized GNN model.

register_config

Register a customized configuration group.

register_loader

Register a customized PyG data loader.

register_optimizer

Register a customized optimizer.

register_scheduler

Register a customized learning rate scheduler.

register_loss

Register a customized loss function.

register_train

Register a customized training function.

register(key, module, module_dict)[source]

Base function for registering a customized module to a module dictionary

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

  • module_dict (dict) – Python dictionary, hosting all the registered modules
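The registration pattern amounts to storing a module in a dictionary under a name so it can be looked up later. The stand-alone sketch below mirrors the names `register` and `register_act` for readability but is not the library's code; the `act_dict` dictionary, the refusal to overwrite an existing key, and the `leaky` function are assumptions for illustration.

```python
act_dict = {}  # hypothetical module dictionary for activations


def register(key, module, module_dict):
    """Add module under key, refusing to overwrite an existing entry."""
    if key in module_dict:
        raise KeyError(f"Key {key} is already registered.")
    module_dict[key] = module


def register_act(key, module):
    """Thin wrapper that targets the activation dictionary."""
    register(key, module, act_dict)


def leaky(x):
    """Toy activation: identity for positive inputs, small slope otherwise."""
    return x if x > 0 else 0.01 * x


register_act("leaky_demo", leaky)
activation = act_dict["leaky_demo"]  # later lookup by name
```

Each specialized register_* function can follow the same shape, differing only in the dictionary it writes to.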

register_act(key, module)[source]

Register a customized activation function. After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_node_encoder(key, module)[source]

Register a customized node feature encoder. After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_edge_encoder(key, module)[source]

Register a customized edge feature encoder. After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_stage(key, module)[source]

Register a customized GNN stage (consists of multiple layers). After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_head(key, module)[source]

Register a customized GNN prediction head. After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_layer(key, module)[source]

Register a customized GNN layer. After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_pooling(key, module)[source]

Register a customized GNN pooling layer (for graph classification). After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_network(key, module)[source]

Register a customized GNN model. After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_config(key, module)[source]

Register a customized configuration group. After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_loader(key, module)[source]

Register a customized PyG data loader. After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_optimizer(key, module)[source]

Register a customized optimizer. After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_scheduler(key, module)[source]

Register a customized learning rate scheduler. After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_loss(key, module)[source]

Register a customized loss function. After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

register_train(key, module)[source]

Register a customized training function. After registration, the module can be directly called by GraphGym.

Parameters
  • key (string) – Name of the module

  • module – PyTorch module

Utility Modules

agg_runs

Aggregate over different random seeds of a single experiment

agg_batch

Aggregate across results from multiple experiments via grid search

params_count

Computes the number of parameters.

match_baseline_cfg

Match the computational budget of a given baseline model.

get_current_gpu_usage

Get the current GPU memory usage.

auto_select_device

Auto select device for the experiment.

is_eval_epoch

Determines if the model should be evaluated at the current epoch.

is_ckpt_epoch

Determines if the model should be checkpointed at the current epoch.

dict_to_json

Dump a Python dictionary to JSON file

dict_list_to_json

Dump a list of Python dictionaries to JSON file

dict_to_tb

Add a dictionary of statistics to a Tensorboard writer

makedirs_rm_exist

Make a directory, remove any existing data.

dummy_context

Default context manager that does nothing

agg_runs(dir, metric_best='auto')[source]

Aggregate over different random seeds of a single experiment

Parameters
  • dir (str) – Directory of the results, containing 1 experiment

  • metric_best (str, optional) – The metric for selecting the best validation performance. Options: auto, accuracy, auc.
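Aggregating over random seeds can be pictured as picking each run's best epoch and then summarizing across runs. The sketch below is a simplification under stated assumptions: the helper `aggregate_seeds`, the in-memory list-of-dicts input and the numbers are all made up, whereas the real function reads results from the directory passed as dir.

```python
from statistics import mean, pstdev


def aggregate_seeds(runs, metric="accuracy"):
    """Take each run's best value of metric, then report mean and std."""
    best = [max(epoch[metric] for epoch in run) for run in runs]
    return {"mean": mean(best), "std": pstdev(best)}


# Made-up per-epoch statistics for two random seeds.
runs = [
    [{"accuracy": 0.70}, {"accuracy": 0.80}],  # seed 0
    [{"accuracy": 0.60}, {"accuracy": 0.90}],  # seed 1
]
summary = aggregate_seeds(runs)
```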

agg_batch(dir, metric_best='auto')[source]

Aggregate across results from multiple experiments via grid search

Parameters
  • dir (str) – Directory of the results, containing multiple experiments

  • metric_best (str, optional) – The metric for selecting the best validation performance. Options: auto, accuracy, auc.

params_count(model)[source]

Computes the number of parameters.

Parameters

model (nn.Module) – PyTorch model

match_baseline_cfg(cfg_dict, cfg_dict_baseline, verbose=True)[source]

Match the computational budget of a given baseline model. The current configuration dictionary will be modified and returned.

Parameters
  • cfg_dict (dict) – Current experiment’s configuration

  • cfg_dict_baseline (dict) – Baseline configuration

  • verbose (bool, optional) – Whether to print the matched parameter counts

get_current_gpu_usage()[source]

Get the current GPU memory usage.

auto_select_device(memory_max=8000, memory_bias=200, strategy='random')[source]

Auto select device for the experiment. Useful when having multiple GPUs.

Parameters
  • memory_max (int) – Threshold of existing GPU memory usage. GPUs with memory usage beyond this threshold will be deprioritized.

  • memory_bias (int) – A bias GPU memory usage added to all the GPUs. Avoids a division-by-zero error.

  • strategy (str, optional) – ‘random’ (randomly select GPU) or ‘greedy’ (greedily select GPU)
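The interplay of memory_max, memory_bias and strategy can be sketched on a plain list of per-GPU memory usage values. This is one possible reading, not the library's implementation: the helper `select_gpu`, the inverse-usage weighting, the 0.01 penalty factor and the usage numbers are all assumptions.

```python
import random


def select_gpu(memory_used, memory_max=8000, memory_bias=200,
               strategy="random", rng=random):
    """Pick a GPU index from a list of per-GPU memory usage values."""
    if strategy == "greedy":
        # Take the GPU with the least memory in use.
        return min(range(len(memory_used)), key=memory_used.__getitem__)
    # Random strategy: weight each GPU by the inverse of its biased usage.
    # The bias keeps the division safe when a GPU reports zero usage.
    weights = []
    for used in memory_used:
        w = 1.0 / (used + memory_bias)
        if used > memory_max:
            w *= 0.01  # deprioritize heavily loaded GPUs (arbitrary factor)
        weights.append(w)
    return rng.choices(range(len(memory_used)), weights=weights, k=1)[0]


usage = [7000, 0, 9000]  # made-up usage numbers
greedy_choice = select_gpu(usage, strategy="greedy")
random_choice = select_gpu(usage, strategy="random", rng=random.Random(0))
```

The random strategy spreads concurrent jobs across lightly used GPUs, while the greedy one always picks the single least used device.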

is_eval_epoch(cur_epoch)[source]

Determines if the model should be evaluated at the current epoch.

is_ckpt_epoch(cur_epoch)[source]

Determines if the model should be checkpointed at the current epoch.
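Predicates such as is_eval_epoch and is_ckpt_epoch are typically periodic checks on the epoch counter. The sketch below shows one common form; the helper `is_period_epoch` and its `period` and `max_epoch` arguments are hypothetical, since this reference does not say where the real functions read those values from.

```python
def is_period_epoch(cur_epoch, period, max_epoch):
    """True every `period` epochs and always on the final epoch."""
    return (cur_epoch + 1) % period == 0 or cur_epoch + 1 == max_epoch


# Epochs are zero-based: with period 5 and 12 epochs in total,
# the check fires after the 5th, 10th and final (12th) epoch.
flags = [is_period_epoch(e, period=5, max_epoch=12) for e in range(12)]
firing = [e for e, flag in enumerate(flags) if flag]
```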

dict_to_json(dict, fname)[source]

Dump a Python dictionary to JSON file

Parameters
  • dict (dict) – Python dictionary

  • fname (str) – Output file name

dict_list_to_json(dict_list, fname)[source]

Dump a list of Python dictionaries to JSON file

Parameters
  • dict_list (list of dict) – List of Python dictionaries

  • fname (str) – Output file name
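A minimal sketch of dumping statistics dictionaries with the standard json module follows. Writing one JSON object per line and opening the file in append mode are assumptions, and the helpers `dump_dict` and `dump_dict_list` are hypothetical stand-ins, not the library functions.

```python
import json
import os
import tempfile


def dump_dict(d, fname):
    """Append one dictionary to fname as a single JSON line."""
    with open(fname, "a") as f:
        json.dump(d, f)
        f.write("\n")


def dump_dict_list(dict_list, fname):
    """Append each dictionary in dict_list to fname as its own JSON line."""
    with open(fname, "a") as f:
        for d in dict_list:
            json.dump(d, f)
            f.write("\n")


# Made-up statistics written to a temporary file, then read back.
path = os.path.join(tempfile.mkdtemp(), "stats.json")
dump_dict({"epoch": 0, "loss": 0.9}, path)
dump_dict_list([{"epoch": 1, "loss": 0.7}, {"epoch": 2, "loss": 0.5}], path)
with open(path) as f:
    rows = [json.loads(line) for line in f]
```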

dict_to_tb(dict, writer, epoch)[source]

Add a dictionary of statistics to a Tensorboard writer

Parameters
  • dict (dict) – Statistics of experiments, the keys are attribute names,

  • values are the attribute values (the) –

  • writer – Tensorboard writer object

  • epoch (int) – The current epoch

makedirs_rm_exist(dir)[source]

Make a directory, remove any existing data.

Parameters

dir (str) – The directory to be created.
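"Make a directory, remove any existing data" can be sketched with the standard library as shown below. The helper `fresh_dir` is a hypothetical stand-in, and the real function may differ in details such as error handling.

```python
import os
import shutil
import tempfile


def fresh_dir(path):
    """Create path as an empty directory, deleting any previous contents."""
    if os.path.isdir(path):
        shutil.rmtree(path)
    os.makedirs(path, exist_ok=True)


# Demo: a directory holding a stale file ends up empty.
target = os.path.join(tempfile.mkdtemp(), "run")
os.makedirs(target)
open(os.path.join(target, "old.txt"), "w").close()
fresh_dir(target)
contents = os.listdir(target)
```

Because the previous contents are deleted without confirmation, this pattern is only safe for directories that hold reproducible outputs.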

class dummy_context[source]

Default context manager that does nothing
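A do-nothing context manager is useful as a default wherever a `with` block optionally wraps some work, for example a timer or profiler. The sketch below uses hypothetical names (`NoOpContext`, `run`) and is not the library's class; the standard library's `contextlib.nullcontext` serves the same purpose.

```python
class NoOpContext:
    """Context manager whose enter and exit steps do nothing."""

    def __enter__(self):
        return None

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # do not suppress exceptions


def run(step, ctx=None):
    """Run step() inside ctx, or inside a no-op context if none is given."""
    with (ctx if ctx is not None else NoOpContext()):
        return step()


result = run(lambda: 1 + 1)
```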