torch_geometric.graphgym¶

Contents

Workflow Modules
Model Modules
Register Modules
Utility Modules

Workflow Modules ¶

`load_ckpt`	Load latest model checkpoint
`save_ckpt`	Save model checkpoint at given epoch
`clean_ckpt`	Only keep the latest model checkpoint, remove all the older checkpoints
`parse_args`	Parses the command line arguments.
`cfg`	CfgNode represents an internal node in the configuration tree.
`set_cfg`	This function sets the default config value.
`load_cfg`	Load configurations from file system and command line
`dump_cfg`	Dumps the config to the output directory specified in `cfg.out_dir`
`set_run_dir`	Create the directory for each random seed experiment run
`set_agg_dir`	Create the directory for aggregated results over all the random seeds
`init_weights`	Performs weight initialization
`create_loader`	Create data loader object
`set_printing`	Set up printing options
`create_logger`	Create logger for the experiment
`compute_loss`	Compute loss and prediction score
`create_model`	Create model for graph machine learning
`create_optimizer`	Create optimizer for the model
`create_scheduler`	Create learning rate scheduler for the optimizer
`train`	The core training pipeline

load_ckpt(model, optimizer=None, scheduler=None)[source]¶

Load latest model checkpoint

Parameters

model (torch.nn.Module) – The model that will be loaded
optimizer (torch.optim, optional) – The optimizer that will be loaded
scheduler (torch.optim, optional) – The scheduler that will be loaded

Returns

Epoch count after loading the model

save_ckpt(model, optimizer, scheduler, epoch)[source]¶

Save model checkpoint at given epoch

Parameters

model (torch.nn.Module) – The model that will be saved
optimizer (torch.optim) – The optimizer that will be saved
scheduler (torch.optim) – The scheduler that will be saved
epoch (int) – The epoch when the model is saved

clean_ckpt()[source]¶: Only keep the latest model checkpoint, remove all the older checkpoints
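
The "keep only the latest" behaviour can be illustrated with a minimal sketch. It assumes checkpoints are stored as `<epoch>.ckpt` files in one directory; that file layout and the name `clean_ckpt_sketch` are hypothetical and not taken from GraphGym.

```python
import os
import tempfile


def clean_ckpt_sketch(ckpt_dir):
    """Delete every '<epoch>.ckpt' file except the one with the largest epoch."""
    # Assumption: checkpoints are named '<epoch>.ckpt' (hypothetical layout).
    suffix = ".ckpt"
    epochs = sorted(
        int(name[:-len(suffix)])
        for name in os.listdir(ckpt_dir)
        if name.endswith(suffix) and name[:-len(suffix)].isdigit()
    )
    for epoch in epochs[:-1]:  # everything but the latest
        os.remove(os.path.join(ckpt_dir, f"{epoch}{suffix}"))
    return epochs[-1] if epochs else None


def demo_clean_ckpt():
    """Create three placeholder checkpoints, clean up, and report what is left."""
    with tempfile.TemporaryDirectory() as ckpt_dir:
        for epoch in (0, 9, 19):
            with open(os.path.join(ckpt_dir, f"{epoch}.ckpt"), "w") as f:
                f.write("placeholder")
        latest = clean_ckpt_sketch(ckpt_dir)
        return latest, sorted(os.listdir(ckpt_dir))
```

Epochs are compared as integers rather than as strings, so `9.ckpt` is not mistaken for being newer than `19.ckpt`.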

parse_args()[source]¶: Parses the command line arguments.

set_cfg(cfg)[source]¶

This function sets the default config values.

1) For a given experiment, only some of the arguments will be used; the remaining unused arguments won’t affect anything, so feel free to register any argument in graphgym.contrib.config.
2) At most two levels of configs are supported, e.g., cfg.dataset.name.

Returns: Configuration used by the experiment.
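
The two-level layout (for example cfg.dataset.name) can be pictured with a small sketch. It uses `types.SimpleNamespace` as a stand-in for CfgNode, and every option name and value below is an illustrative placeholder rather than one of GraphGym's defaults.

```python
from types import SimpleNamespace


def set_cfg_sketch(cfg):
    """Populate a config object with two-level defaults, e.g. cfg.dataset.name."""
    # All names and values are illustrative placeholders (assumptions).
    cfg.out_dir = "results"
    cfg.dataset = SimpleNamespace(name="Cora", task="node")
    cfg.optim = SimpleNamespace(base_lr=0.01, max_epoch=200)
    return cfg


cfg_demo = set_cfg_sketch(SimpleNamespace())
# An experiment reads only the options it needs; unused ones have no effect.
dataset_name = cfg_demo.dataset.name
```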

load_cfg(cfg, args)[source]¶

Load configurations from file system and command line

Parameters

cfg (CfgNode) – Configuration node
args (ArgumentParser) – Command argument parser

dump_cfg(cfg)[source]¶

Dumps the config to the output directory specified in cfg.out_dir

Parameters: cfg (CfgNode) – Configuration node

set_run_dir(out_dir, fname)[source]¶

Create the directory for each random seed experiment run

Parameters

out_dir (string) – Directory for output, specified in cfg.out_dir
fname (string) – Filename for the yaml format configuration file

set_agg_dir(out_dir, fname)[source]¶

Create the directory for aggregated results over all the random seeds

Parameters

out_dir (string) – Directory for output, specified in cfg.out_dir
fname (string) – Filename for the yaml format configuration file

init_weights(m)[source]¶

Performs weight initialization

Parameters: m (nn.Module) – PyTorch module

create_loader()[source]¶

Create data loader object

Returns: List of PyTorch data loaders

set_printing()[source]¶: Set up printing options

create_logger()[source]¶

Create logger for the experiment

Returns: List of logger objects

compute_loss(pred, true)[source]¶

Compute loss and prediction score

Parameters

pred (torch.tensor) – Unnormalized prediction
true (torch.tensor) – Ground truth labels

Returns: Loss, normalized prediction score
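
To make "unnormalized prediction" and "normalized prediction score" concrete, here is a minimal sketch for one possible case. It assumes binary classification with one logit per example, a sigmoid for normalization, and mean binary cross-entropy as the loss; plain Python lists stand in for tensors, and the choice of loss is an assumption rather than GraphGym's configured behaviour.

```python
import math


def compute_loss_sketch(pred, true):
    """Return (mean binary cross-entropy, sigmoid scores) for raw logits."""
    # Assumption: binary task, one logit per example, labels in {0, 1}.
    scores = [1.0 / (1.0 + math.exp(-p)) for p in pred]
    losses = [
        -(t * math.log(s) + (1 - t) * math.log(1 - s))
        for s, t in zip(scores, true)
    ]
    return sum(losses) / len(losses), scores


loss_demo, scores_demo = compute_loss_sketch([0.0, 2.0], [0, 1])
```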

create_model(to_device=True, dim_in=None, dim_out=None)[source]¶

Create model for graph machine learning

Parameters

to_device (bool) – Whether the model will be transferred to the device
dim_in (int, optional) – Input dimension to the model
dim_out (int, optional) – Output dimension to the model

create_optimizer(params, optimizer_config: torch_geometric.graphgym.optimizer.OptimizerConfig)[source]¶

Create optimizer for the model

Parameters: params – PyTorch model parameters

Returns: PyTorch optimizer

create_scheduler(optimizer, scheduler_config: torch_geometric.graphgym.optimizer.SchedulerConfig)[source]¶

Create learning rate scheduler for the optimizer

Parameters: optimizer – PyTorch optimizer

Returns: PyTorch scheduler

train(loggers, loaders, model, optimizer, scheduler)[source]¶

The core training pipeline

Parameters

loggers – List of loggers
loaders – List of loaders
model – GNN model
optimizer – PyTorch optimizer
scheduler – PyTorch learning rate scheduler

Model Modules ¶

`IntegerFeatureEncoder`	Provides an encoder for integer node features.
`AtomEncoder`	The atom Encoder used in OGB molecule dataset.
`BondEncoder`	The bond Encoder used in OGB molecule dataset.
`GNNLayer`	Wrapper for a GNN layer
`GNNPreMP`	Wrapper for NN layer before GNN message passing
`GNNStackStage`	Simple Stage that stacks GNN layers
`FeatureEncoder`	Encoding node and edge features
`GNN`	General GNN model: encoder + stage + head
`GNNNodeHead`	GNN prediction head for node prediction tasks.
`GNNEdgeHead`	GNN prediction head for edge/link prediction tasks.
`GNNGraphHead`	GNN prediction head for graph prediction tasks.
`GeneralLayer`	General wrapper for layers
`GeneralMultiLayer`	General wrapper for a stack of multiple layers
`Linear`	Basic Linear layer.
`BatchNorm1dNode`	BatchNorm for node feature.
`BatchNorm1dEdge`	BatchNorm for edge feature.
`MLP`	Basic MLP model.
`GCNConv`	Graph Convolutional Network (GCN) layer
`SAGEConv`	GraphSAGE Conv layer
`GATConv`	Graph Attention Network (GAT) layer
`GINConv`	Graph Isomorphism Network (GIN) layer
`SplineConv`	SplineCNN layer
`GeneralConv`	A general GNN layer
`GeneralEdgeConv`	A general GNN layer that supports edge features as well
`GeneralSampleEdgeConv`	A general GNN layer that supports edge features and edge sampling
`global_add_pool`	Globally pool node embeddings into graph embeddings, via elementwise sum.
`global_mean_pool`	Globally pool node embeddings into graph embeddings, via elementwise mean.
`global_max_pool`	Globally pool node embeddings into graph embeddings, via elementwise max.

class IntegerFeatureEncoder(emb_dim, num_classes=None)[source]¶

Provides an encoder for integer node features.

Parameters

emb_dim (int) – Output embedding dimension
num_classes (int) – The number of classes for the embedding mapping to learn from

class AtomEncoder(emb_dim, num_classes=None)[source]¶

The atom Encoder used in OGB molecule dataset.

Parameters

emb_dim (int) – Output embedding dimension
num_classes – None

class BondEncoder(emb_dim)[source]¶

The bond Encoder used in OGB molecule dataset.

Parameters: emb_dim (int) – Output edge embedding dimension

GNNLayer(dim_in, dim_out, has_act=True)[source]¶

Wrapper for a GNN layer

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension
has_act (bool) – Whether to apply an activation function after the layer

GNNPreMP(dim_in, dim_out)[source]¶

Wrapper for NN layer before GNN message passing

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension

class GNNStackStage(dim_in, dim_out, num_layers)[source]¶

Simple Stage that stacks GNN layers

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension
num_layers (int) – Number of GNN layers

class FeatureEncoder(dim_in)[source]¶

Encoding node and edge features

Parameters: dim_in (int) – Input feature dimension

class GNN(dim_in, dim_out, **kwargs)[source]¶

General GNN model: encoder + stage + head

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension
**kwargs (optional) – Optional additional args

class GNNNodeHead(dim_in, dim_out)[source]¶

GNN prediction head for node prediction tasks.

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension. For binary prediction, dim_out=1.

class GNNEdgeHead(dim_in, dim_out)[source]¶

GNN prediction head for edge/link prediction tasks.

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension. For binary prediction, dim_out=1.

class GNNGraphHead(dim_in, dim_out)[source]¶

GNN prediction head for graph prediction tasks. The optional post_mp layer (specified by cfg.gnn.post_mp) is used to transform the pooled embedding using an MLP.

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension. For binary prediction, dim_out=1.

class GeneralLayer(name, dim_in, dim_out, has_act=True, has_bn=True, has_l2norm=False, **kwargs)[source]¶

General wrapper for layers

Parameters

name (string) – Name of the layer in registered layer_dict
dim_in (int) – Input dimension
dim_out (int) – Output dimension
has_act (bool) – Whether to apply an activation after the layer
has_bn (bool) – Whether to apply BatchNorm in the layer
has_l2norm (bool) – Whether to apply L2 normalization after the layer
**kwargs (optional) – Additional args

class GeneralMultiLayer(name, num_layers, dim_in, dim_out, dim_inner=None, final_act=True, **kwargs)[source]¶

General wrapper for a stack of multiple layers

Parameters

name (string) – Name of the layer in registered layer_dict
num_layers (int) – Number of layers in the stack
dim_in (int) – Input dimension
dim_out (int) – Output dimension
dim_inner (int) – The dimension for the inner layers
final_act (bool) – Whether to apply an activation after the layer stack
**kwargs (optional) – Additional args
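
How dim_in, dim_inner, dim_out and final_act interact across a stack can be sketched as a per-layer plan. This is an illustration only: the function name `plan_layer_dims` is hypothetical, and falling back to dim_in when dim_inner is omitted is an assumption.

```python
def plan_layer_dims(num_layers, dim_in, dim_out, dim_inner=None, final_act=True):
    """List (d_in, d_out, has_act) for each layer in a stack."""
    # Assumption: when dim_inner is omitted, hidden layers use dim_in.
    dim_inner = dim_in if dim_inner is None else dim_inner
    plan = []
    for i in range(num_layers):
        last = i == num_layers - 1
        d_in = dim_in if i == 0 else dim_inner
        d_out = dim_out if last else dim_inner
        # Only the last layer's activation is controlled by final_act.
        has_act = final_act if last else True
        plan.append((d_in, d_out, has_act))
    return plan
```

With `num_layers=1` the plan collapses to a single `(dim_in, dim_out, final_act)` entry, so dim_inner is never used.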

class Linear(dim_in, dim_out, bias=False, **kwargs)[source]¶

Basic Linear layer.

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension
bias (bool) – Whether to include a bias term
**kwargs (optional) – Additional args

class BatchNorm1dNode(dim_in)[source]¶

BatchNorm for node feature.

Parameters: dim_in (int) – Input dimension

class BatchNorm1dEdge(dim_in)[source]¶

BatchNorm for edge feature.

Parameters: dim_in (int) – Input dimension

class MLP(dim_in, dim_out, bias=True, dim_inner=None, num_layers=2, **kwargs)[source]¶

Basic MLP model. A 1-layer MLP is equivalent to a Linear layer.

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension
bias (bool) – Whether to include a bias term
dim_inner (int) – The dimension for the inner layers
num_layers (int) – Number of layers in the stack
**kwargs (optional) – Additional args

class GCNConv(dim_in, dim_out, bias=False, **kwargs)[source]¶: Graph Convolutional Network (GCN) layer

class SAGEConv(dim_in, dim_out, bias=False, **kwargs)[source]¶: GraphSAGE Conv layer

class GATConv(dim_in, dim_out, bias=False, **kwargs)[source]¶: Graph Attention Network (GAT) layer

class GINConv(dim_in, dim_out, bias=False, **kwargs)[source]¶: Graph Isomorphism Network (GIN) layer

class SplineConv(dim_in, dim_out, bias=False, **kwargs)[source]¶: SplineCNN layer

class GeneralConv(dim_in, dim_out, bias=False, **kwargs)[source]¶: A general GNN layer

class GeneralEdgeConv(dim_in, dim_out, bias=False, **kwargs)[source]¶: A general GNN layer that supports edge features as well

class GeneralSampleEdgeConv(dim_in, dim_out, bias=False, **kwargs)[source]¶: A general GNN layer that supports edge features and edge sampling

global_add_pool(x, batch, size=None)[source]¶

Globally pool node embeddings into graph embeddings, via elementwise sum. Pooling function takes in node embedding [num_nodes x emb_dim] and batch (indices) and outputs graph embedding [num_graphs x emb_dim].

Parameters

x (torch.tensor) – Input node embeddings
batch (torch.tensor) – Batch tensor that indicates which node belongs to which graph
size (optional) – Total number of graphs. Can be auto-inferred.

Returns: Pooled graph embeddings

global_mean_pool(x, batch, size=None)[source]¶

Globally pool node embeddings into graph embeddings, via elementwise mean. Pooling function takes in node embedding [num_nodes x emb_dim] and batch (indices) and outputs graph embedding [num_graphs x emb_dim].

Parameters

x (torch.tensor) – Input node embeddings
batch (torch.tensor) – Batch tensor that indicates which node belongs to which graph
size (optional) – Total number of graphs. Can be auto-inferred.

Returns: Pooled graph embeddings

global_max_pool(x, batch, size=None)[source]¶

Globally pool node embeddings into graph embeddings, via elementwise max. Pooling function takes in node embedding [num_nodes x emb_dim] and batch (indices) and outputs graph embedding [num_graphs x emb_dim].

Parameters

x (torch.tensor) – Input node embeddings
batch (torch.tensor) – Batch tensor that indicates which node belongs to which graph
size (optional) – Total number of graphs. Can be auto-inferred.

Returns: Pooled graph embeddings
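
The three pooling functions differ only in the elementwise reduction applied to the nodes of each graph. The sketch below shows the computation with plain Python lists standing in for tensors; `global_pool_sketch` and its `reduce` argument are hypothetical names, and it assumes every graph index owns at least one node.

```python
def global_pool_sketch(x, batch, reduce="add", size=None):
    """Pool node embeddings (a list of rows) into one row per graph."""
    # size is inferred from the largest graph index when omitted.
    num_graphs = max(batch) + 1 if size is None else size
    groups = [[] for _ in range(num_graphs)]
    for row, graph_id in zip(x, batch):
        groups[graph_id].append(row)
    reducers = {
        "add": sum,
        "mean": lambda col: sum(col) / len(col),
        "max": max,
    }
    fn = reducers[reduce]
    # zip(*rows) walks over embedding dimensions, so each reduction is elementwise.
    return [[fn(col) for col in zip(*rows)] for rows in groups]


# Three nodes with emb_dim = 2; the first two belong to graph 0, the last to graph 1.
x_demo = [[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]]
batch_demo = [0, 0, 1]
```

The input has shape [num_nodes x emb_dim] and the output has shape [num_graphs x emb_dim], matching the description above.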

Register Modules ¶

`register`	Base function for registering a customized module to a module dictionary
`register_act`	Register a customized activation function.
`register_node_encoder`	Register a customized node feature encoder.
`register_edge_encoder`	Register a customized edge feature encoder.
`register_stage`	Register a customized GNN stage (consists of multiple layers).
`register_head`	Register a customized GNN prediction head.
`register_layer`	Register a customized GNN layer.
`register_pooling`	Register a customized GNN pooling layer (for graph classification).
`register_network`	Register a customized GNN model.
`register_config`	Register a customized configuration group.
`register_loader`	Register a customized PyG data loader.
`register_optimizer`	Register a customized optimizer.
`register_scheduler`	Register a customized learning rate scheduler.
`register_loss`	Register a customized loss function.
`register_train`	Register a customized training function.

register(key, module, module_dict)[source]¶

Base function for registering a customized module to a module dictionary

Parameters

key (string) – Name of the module
module – PyTorch module
module_dict (dict) – Python dictionary, hosting all the registered modules
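
The registry pattern behind these functions can be sketched in a few lines. This is a simplified illustration under stated assumptions: duplicate keys are treated as an error, and `act_dict_demo`, `register_act_sketch` and `clip_act` are hypothetical names introduced only for this example.

```python
def register_sketch(key, module, module_dict):
    """Store module under key, refusing to overwrite an existing entry."""
    # Assumption: registering the same key twice is an error.
    if key in module_dict:
        raise KeyError(f"Key {key!r} is already registered")
    module_dict[key] = module


act_dict_demo = {}  # hypothetical registry for activation functions


def register_act_sketch(key, module):
    """Specialised helper that always writes to the activation registry."""
    register_sketch(key, module, act_dict_demo)


def clip_act(value):
    """Toy activation: clamp a number to the range [-1, 1]."""
    return max(-1.0, min(1.0, value))


register_act_sketch("clip", clip_act)
```

Once registered, the module is looked up by name, for example `act_dict_demo["clip"]`, which is what lets a configuration file refer to it by a string key.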

register_act(key, module)[source]¶

Register a customized activation function. After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_node_encoder(key, module)[source]¶

Register a customized node feature encoder. After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_edge_encoder(key, module)[source]¶

Register a customized edge feature encoder. After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_stage(key, module)[source]¶

Register a customized GNN stage (consists of multiple layers). After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_head(key, module)[source]¶

Register a customized GNN prediction head. After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_layer(key, module)[source]¶

Register a customized GNN layer. After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_pooling(key, module)[source]¶

Register a customized GNN pooling layer (for graph classification). After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_network(key, module)[source]¶

Register a customized GNN model. After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_config(key, module)[source]¶

Register a customized configuration group. After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_loader(key, module)[source]¶

Register a customized PyG data loader. After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_optimizer(key, module)[source]¶

Register a customized optimizer. After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_scheduler(key, module)[source]¶

Register a customized learning rate scheduler. After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_loss(key, module)[source]¶

Register a customized loss function. After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

register_train(key, module)[source]¶

Register a customized training function. After registration, the module can be directly called by GraphGym.

Parameters

key (string) – Name of the module
module – PyTorch module

Utility Modules ¶

`agg_runs`	Aggregate over different random seeds of a single experiment
`agg_batch`	Aggregate across results from multiple experiments via grid search
`params_count`	Computes the number of parameters.
`match_baseline_cfg`	Match the computational budget of a given baseline model.
`get_current_gpu_usage`	Get the current GPU memory usage.
`auto_select_device`	Auto select device for the experiment.
`is_eval_epoch`	Determines if the model should be evaluated at the current epoch.
`is_ckpt_epoch`	Determines if a model checkpoint should be saved at the current epoch.
`dict_to_json`	Dump a Python dictionary to JSON file
`dict_list_to_json`	Dump a list of Python dictionaries to JSON file
`dict_to_tb`	Add a dictionary of statistics to a Tensorboard writer
`makedirs_rm_exist`	Make a directory, remove any existing data.
`dummy_context`	Default context manager that does nothing

agg_runs(dir, metric_best='auto')[source]¶

Aggregate over different random seeds of a single experiment

Parameters

dir (str) – Directory of the results, containing 1 experiment
metric_best (str, optional) – The metric for selecting the best validation performance. Options: auto, accuracy, auc.
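
Aggregating over random seeds can be pictured as: pick the best epoch of each seed according to the chosen metric, then summarise across seeds. The sketch below works under that assumption; the in-memory input layout (one list of per-epoch stats dictionaries per seed) and the name `agg_runs_sketch` are hypothetical, and the real function reads results from the directory instead.

```python
from statistics import mean, pstdev


def agg_runs_sketch(runs, metric="accuracy"):
    """Pick each seed's best epoch by `metric`, then average across seeds."""
    # runs: one list of per-epoch stats dicts per random seed (assumed layout).
    best = [max(epochs, key=lambda stats: stats[metric]) for epochs in runs]
    values = [stats[metric] for stats in best]
    return {"mean": mean(values), "std": pstdev(values), "num_seeds": len(values)}


runs_demo = [
    [{"accuracy": 0.5}, {"accuracy": 0.75}],   # seed 0, two epochs
    [{"accuracy": 0.25}, {"accuracy": 0.5}],   # seed 1, two epochs
]
summary_demo = agg_runs_sketch(runs_demo)
```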

agg_batch(dir, metric_best='auto')[source]¶

Aggregate across results from multiple experiments via grid search

Parameters

dir (str) – Directory of the results, containing multiple experiments
metric_best (str, optional) – The metric for selecting the best validation performance. Options: auto, accuracy, auc.

params_count(model)[source]¶

Computes the number of parameters.

Parameters: model (nn.Module) – PyTorch model

match_baseline_cfg(cfg_dict, cfg_dict_baseline, verbose=True)[source]¶

Match the computational budget of a given baseline model. The current configuration dictionary will be modified and returned.

Parameters

cfg_dict (dict) – Current experiment’s configuration
cfg_dict_baseline (dict) – Baseline configuration
verbose (bool, optional) – Whether to print the matched parameter counts

get_current_gpu_usage()[source]¶: Get the current GPU memory usage.

auto_select_device(memory_max=8000, memory_bias=200, strategy='random')[source]¶

Auto select device for the experiment. Useful when having multiple GPUs.

Parameters

memory_max (int) – Threshold of existing GPU memory usage. GPUs with memory usage beyond this threshold will be deprioritized.
memory_bias (int) – A bias GPU memory usage added to all the GPUs. Avoids a division-by-zero error.
strategy (str, optional) – ‘random’ (randomly select a GPU) or ‘greedy’ (greedily select a GPU)
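
The roles of the three parameters can be illustrated with a sketch that takes per-GPU memory usage (in MB) as a plain list instead of querying the hardware. Several details are assumptions: GPUs above memory_max are skipped unless all of them exceed it, and the 'random' strategy samples with weights inversely proportional to usage plus bias. The name `select_gpu_sketch` and the `rng` argument are hypothetical.

```python
import random


def select_gpu_sketch(memory_used, memory_max=8000, memory_bias=200,
                      strategy="random", rng=None):
    """Return the index of a GPU given per-GPU memory usage in MB."""
    if strategy not in ("random", "greedy"):
        raise ValueError(f"Unknown strategy: {strategy!r}")
    rng = rng or random.Random()
    # Assumption: GPUs above memory_max are skipped unless every GPU is above it.
    candidates = [i for i, m in enumerate(memory_used) if m <= memory_max]
    if not candidates:
        candidates = list(range(len(memory_used)))
    if strategy == "greedy":
        return min(candidates, key=lambda i: memory_used[i])
    # The bias keeps the weight finite when a GPU reports zero usage.
    weights = [1.0 / (memory_used[i] + memory_bias) for i in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

The random strategy spreads concurrent experiments across lightly used GPUs, whereas the greedy strategy always picks the least used one.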

is_eval_epoch(cur_epoch)[source]¶: Determines if the model should be evaluated at the current epoch.

is_ckpt_epoch(cur_epoch)[source]¶: Determines if a model checkpoint should be saved at the current epoch.
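
Epoch checks of this kind are usually periodic tests on the epoch index. A minimal sketch, assuming zero-based epoch indices and that the period and the maximum epoch count come from the configuration; `is_period_epoch`, `period` and `max_epoch` are hypothetical names, and the exact conditions used by GraphGym may differ.

```python
def is_period_epoch(cur_epoch, period, max_epoch):
    """True every `period` epochs (counting from 1) and on the final epoch."""
    # cur_epoch is zero-based, so epoch index 3 is the 4th epoch.
    return (cur_epoch + 1) % period == 0 or (cur_epoch + 1) == max_epoch


# Epoch indices at which the action would fire for period=4 over 10 epochs.
fire_epochs_demo = [e for e in range(10) if is_period_epoch(e, 4, 10)]
```

Including the final epoch guarantees that the last model state is always covered, even when the total number of epochs is not a multiple of the period.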

dict_to_json(dict, fname)[source]¶

Dump a Python dictionary to JSON file

Parameters

dict (dict) – Python dictionary
fname (str) – Output file name

dict_list_to_json(dict_list, fname)[source]¶

Dump a list of Python dictionaries to JSON file

Parameters

dict_list (list of dict) – List of Python dictionaries
fname (str) – Output file name
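
Both helpers can be sketched with the standard `json` module. The one-object-per-line layout used for the list variant is an assumption about the output format, and the `_sketch` function names are hypothetical.

```python
import json
import os
import tempfile


def dict_to_json_sketch(stats, fname):
    """Write one dictionary as a single JSON document."""
    with open(fname, "w") as f:
        json.dump(stats, f)


def dict_list_to_json_sketch(stats_list, fname):
    """Write one JSON object per line (an assumed layout)."""
    with open(fname, "w") as f:
        for stats in stats_list:
            f.write(json.dumps(stats) + "\n")


def demo_json_roundtrip():
    """Write two per-epoch records and read them back line by line."""
    records = [{"epoch": 0, "loss": 1.5}, {"epoch": 1, "loss": 1.0}]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stats.json")
        dict_list_to_json_sketch(records, path)
        with open(path) as f:
            return [json.loads(line) for line in f]
```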

dict_to_tb(dict, writer, epoch)[source]¶

Add a dictionary of statistics to a Tensorboard writer

Parameters

dict (dict) – Statistics of experiments; the keys are attribute names and the values are the attribute values
writer – Tensorboard writer object
epoch (int) – The current epoch

makedirs_rm_exist(dir)[source]¶

Make a directory, remove any existing data.

Parameters: dir (str) – The directory to be created.
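
A minimal sketch of "make a directory, remove any existing data" using only the standard library. The name `makedirs_rm_exist_sketch` is hypothetical and this is not necessarily how GraphGym implements it.

```python
import os
import shutil
import tempfile


def makedirs_rm_exist_sketch(path):
    """Create an empty directory at path, deleting any previous contents."""
    if os.path.isdir(path):
        shutil.rmtree(path)
    os.makedirs(path, exist_ok=True)


def demo_makedirs():
    """Put a stale file in a directory, recreate it, and report the result."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "run", "0")
        os.makedirs(target)
        with open(os.path.join(target, "stale.txt"), "w") as f:
            f.write("old result")
        makedirs_rm_exist_sketch(target)
        return os.path.isdir(target), os.listdir(target)
```

Because existing contents are deleted without confirmation, a helper like this should only be pointed at directories that hold reproducible outputs.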


class dummy_context[source]¶: Default context manager that does nothing
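
A do-nothing context manager is useful as a default when a real one (for example a timer or profiler) is optional. The sketch below shows the idea with `contextlib`; `dummy_context_sketch` and `run_step` are hypothetical names used only for illustration.

```python
from contextlib import contextmanager


@contextmanager
def dummy_context_sketch():
    """A context manager with no setup and no teardown."""
    yield None


def run_step(step, context=dummy_context_sketch):
    """Run step() inside the given context; the default context does nothing."""
    # Callers can pass a real context manager factory instead of the default.
    with context():
        return step()
```

The calling code keeps a single `with` statement regardless of whether a real context manager is supplied.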
