torch_geometric.graphgym¶

Contents

Workflow and Register Modules
Model Modules
Utility Modules

Workflow and Register Modules ¶

`load_ckpt`	Load latest model checkpoint
`save_ckpt`	Save model checkpoint at given epoch
`clean_ckpt`	Only keep the latest model checkpoint, remove all the older checkpoints
`parse_args`	Parses the command line arguments.
`cfg`	CfgNode represents an internal node in the configuration tree.
`set_cfg`	This function sets the default config value.
`load_cfg`	Load configurations from file system and command line
`dump_cfg`	Dumps the config to the output directory specified in `cfg.out_dir`
`set_run_dir`	Create the directory for each random seed experiment run
`set_agg_dir`	Create the directory for aggregated results over all the random seeds
`get_fname`	Extract filename from file name path
`init_weights`	Performs weight initialization
`create_loader`	Create data loader object
`set_printing`	Set up printing options
`create_logger`	Create logger for the experiment
`compute_loss`	Compute loss and prediction score
`create_model`	Create model for graph machine learning
`create_optimizer`	Create optimizer for the model
`create_scheduler`	Create learning rate scheduler for the optimizer
`train`	The core training pipeline
`register_base`	Base function for registering a module in GraphGym.
`register_act`	Registers an activation function in GraphGym.
`register_node_encoder`	Registers a node feature encoder in GraphGym.
`register_edge_encoder`	Registers an edge feature encoder in GraphGym.
`register_stage`	Registers a customized GNN stage in GraphGym.
`register_head`	Registers a GNN prediction head in GraphGym.
`register_layer`	Registers a GNN layer in GraphGym.
`register_pooling`	Registers a GNN global pooling/readout layer in GraphGym.
`register_network`	Registers a GNN model in GraphGym.
`register_config`	Registers a configuration group in GraphGym.
`register_loader`	Registers a data loader in GraphGym.
`register_optimizer`	Registers an optimizer in GraphGym.
`register_scheduler`	Registers a learning rate scheduler in GraphGym.
`register_loss`	Registers a loss function in GraphGym.
`register_train`	Registers a training function in GraphGym.

load_ckpt(model, optimizer=None, scheduler=None)[source]¶

Load latest model checkpoint

Parameters

model (torch.nn.Module) – The model that will be loaded
optimizer (torch.optim, optional) – The optimizer that will be loaded
scheduler (torch.optim, optional) – The scheduler that will be loaded

Returns

Epoch count after loading the model

save_ckpt(model, optimizer, scheduler, epoch)[source]¶

Save model checkpoint at given epoch

Parameters

model (torch.nn.Module) – The model that will be saved
optimizer (torch.optim) – The optimizer that will be saved
scheduler (torch.optim) – The scheduler that will be saved
epoch (int) – The epoch when the model is saved

clean_ckpt()[source]¶: Only keep the latest model checkpoint, remove all the older checkpoints
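
The pruning behaviour described for clean_ckpt can be sketched with the standard library alone. The helper name `keep_latest_ckpt` and the `<epoch>.ckpt` file naming are assumptions made for illustration; they are not taken from GraphGym's source.

```python
import os
import tempfile


def keep_latest_ckpt(ckpt_dir):
    """Delete every '<epoch>.ckpt' file except the one with the highest epoch.

    Hypothetical helper: the '<epoch>.ckpt' naming scheme is an assumption.
    """
    suffix = ".ckpt"
    epochs = sorted(
        int(name[:-len(suffix)])
        for name in os.listdir(ckpt_dir)
        if name.endswith(suffix) and name[:-len(suffix)].isdigit()
    )
    for epoch in epochs[:-1]:  # everything but the latest epoch
        os.remove(os.path.join(ckpt_dir, f"{epoch}{suffix}"))
    return epochs[-1] if epochs else None


with tempfile.TemporaryDirectory() as tmp:
    for epoch in (0, 10, 20):
        open(os.path.join(tmp, f"{epoch}.ckpt"), "w").close()
    latest = keep_latest_ckpt(tmp)
    remaining = os.listdir(tmp)
```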

parse_args()[source]¶: Parses the command line arguments.

set_cfg(cfg)[source]¶

This function sets the default config value.

1) For an experiment, only part of the arguments will be used. The remaining unused arguments won’t affect anything, so feel free to register any argument in graphgym.contrib.config.
2) At most two levels of configs are supported, e.g., cfg.dataset.name.

Returns: configuration used by the experiment.
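
As a rough illustration of the two-level layout, the sketch below uses `types.SimpleNamespace` as a stand-in for CfgNode. The option names and values are placeholders chosen for the example, not GraphGym defaults.

```python
from types import SimpleNamespace

# Stand-in for a config tree: top-level options plus one level of groups.
# All names and values here are illustrative assumptions.
cfg = SimpleNamespace(
    out_dir="results",                           # first-level option
    dataset=SimpleNamespace(name="Cora"),        # second level: cfg.dataset.name
    unused_group=SimpleNamespace(flag=False),    # registered but never read
)

dataset_name = cfg.dataset.name
```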

load_cfg(cfg, args)[source]¶

Load configurations from file system and command line

Parameters

cfg (CfgNode) – Configuration node
args (ArgumentParser) – Command argument parser

dump_cfg(cfg)[source]¶

Dumps the config to the output directory specified in cfg.out_dir

Parameters: cfg (CfgNode) – Configuration node

set_run_dir(out_dir, fname)[source]¶

Create the directory for each random seed experiment run

Parameters

out_dir (string) – Directory for output, specified in cfg.out_dir
fname (string) – Filename for the yaml format configuration file

set_agg_dir(out_dir, fname)[source]¶

Create the directory for aggregated results over all the random seeds

Parameters

out_dir (string) – Directory for output, specified in cfg.out_dir
fname (string) – Filename for the yaml format configuration file

get_fname(fname)[source]¶

Extract filename from file name path

Parameters: fname (string) – Filename for the yaml format configuration file
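
A minimal sketch of this kind of extraction, using only `os.path`. The helper name `extract_fname` and the example path are hypothetical, and the real function may treat extensions differently.

```python
import os


def extract_fname(path):
    """Return the file name without its directory or extension (hypothetical helper)."""
    base = os.path.basename(path)       # drop the directory part
    stem, _ext = os.path.splitext(base)  # drop the extension, e.g. '.yaml'
    return stem


name = extract_fname("configs/example.yaml")
```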

init_weights(m)[source]¶

Performs weight initialization

Parameters: m (nn.Module) – PyTorch module

create_loader()[source]¶

Create data loader object

Returns: List of PyTorch data loaders

set_printing()[source]¶: Set up printing options

create_logger()[source]¶

Create logger for the experiment

Returns: List of logger objects

compute_loss(pred, true)[source]¶

Compute loss and prediction score

Parameters

pred (torch.tensor) – Unnormalized prediction
true (torch.tensor) – Ground truth label

Returns: Loss, normalized prediction score

create_model(to_device=True, dim_in=None, dim_out=None)[source]¶

Create model for graph machine learning

Parameters

to_device (string) – The device that the model will be transferred to
dim_in (int, optional) – Input dimension to the model
dim_out (int, optional) – Output dimension to the model

create_optimizer(params, optimizer_config: torch_geometric.graphgym.optimizer.OptimizerConfig)[source]¶

Create optimizer for the model

Parameters: params – PyTorch model parameters

Returns: PyTorch optimizer

create_scheduler(optimizer, scheduler_config: torch_geometric.graphgym.optimizer.SchedulerConfig)[source]¶

Create learning rate scheduler for the optimizer

Parameters: optimizer – PyTorch optimizer

Returns: PyTorch scheduler

train(loggers, loaders, model, optimizer, scheduler)[source]¶

The core training pipeline

Parameters

loggers – List of loggers
loaders – List of loaders
model – GNN model
optimizer – PyTorch optimizer
scheduler – PyTorch learning rate scheduler

register_base(mapping: Dict[str, Any], key: str, module: Optional[Any] = None) → Union[None, Callable][source]¶

Base function for registering a module in GraphGym.

Parameters

mapping (dict) – Python dictionary hosting all the registered modules, in which the module is registered.
key (string) – The name of the module.
module (any, optional) – The module. If set to None, will return a decorator to register a module.
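
The two calling conventions described above (direct registration, or a decorator when module is None) can be sketched as follows. This is an illustrative reimplementation under assumptions, not GraphGym's code: the name `register_base_sketch`, the `act_dict` registry, and the rejection of duplicate keys are all hypothetical.

```python
from typing import Any, Callable, Dict, Optional, Union


def register_base_sketch(mapping: Dict[str, Any], key: str,
                         module: Optional[Any] = None) -> Union[None, Callable]:
    """Store `module` under `key`, or return a decorator when `module` is None."""
    if module is not None:
        if key in mapping:  # assumption: duplicate keys are rejected
            raise KeyError(f"'{key}' is already registered")
        mapping[key] = module
        return None

    def decorator(obj):
        register_base_sketch(mapping, key, obj)
        return obj  # hand the object back so the decorated name stays usable

    return decorator


act_dict: Dict[str, Any] = {}  # hypothetical registry

# Direct call: pass the module explicitly.
register_base_sketch(act_dict, "identity", lambda x: x)


# Decorator form: omit the module.
@register_base_sketch(act_dict, "double")
def double(x):
    return 2 * x
```

The specialised register_* functions listed after this one share the (key, module) signature, so the same two usage styles would apply to them.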

register_act(key: str, module: Optional[Any] = None)[source]¶: Registers an activation function in GraphGym.

register_node_encoder(key: str, module: Optional[Any] = None)[source]¶: Registers a node feature encoder in GraphGym.

register_edge_encoder(key: str, module: Optional[Any] = None)[source]¶: Registers an edge feature encoder in GraphGym.

register_stage(key: str, module: Optional[Any] = None)[source]¶: Registers a customized GNN stage in GraphGym.

register_head(key: str, module: Optional[Any] = None)[source]¶: Registers a GNN prediction head in GraphGym.

register_layer(key: str, module: Optional[Any] = None)[source]¶: Registers a GNN layer in GraphGym.

register_pooling(key: str, module: Optional[Any] = None)[source]¶: Registers a GNN global pooling/readout layer in GraphGym.

register_network(key: str, module: Optional[Any] = None)[source]¶: Registers a GNN model in GraphGym.

register_config(key: str, module: Optional[Any] = None)[source]¶: Registers a configuration group in GraphGym.

register_loader(key: str, module: Optional[Any] = None)[source]¶: Registers a data loader in GraphGym.

register_optimizer(key: str, module: Optional[Any] = None)[source]¶: Registers an optimizer in GraphGym.

register_scheduler(key: str, module: Optional[Any] = None)[source]¶: Registers a learning rate scheduler in GraphGym.

register_loss(key: str, module: Optional[Any] = None)[source]¶: Registers a loss function in GraphGym.

register_train(key: str, module: Optional[Any] = None)[source]¶: Registers a training function in GraphGym.

Model Modules ¶

`IntegerFeatureEncoder`	Provides an encoder for integer node features.
`AtomEncoder`	The atom Encoder used in OGB molecule dataset.
`BondEncoder`	The bond Encoder used in OGB molecule dataset.
`GNNLayer`	Wrapper for a GNN layer
`GNNPreMP`	Wrapper for NN layer before GNN message passing
`GNNStackStage`	Simple Stage that stacks GNN layers
`FeatureEncoder`	Encoding node and edge features
`GNN`	General GNN model: encoder + stage + head
`GNNNodeHead`	GNN prediction head for node prediction tasks.
`GNNEdgeHead`	GNN prediction head for edge/link prediction tasks.
`GNNGraphHead`	GNN prediction head for graph prediction tasks.
`GeneralLayer`	General wrapper for layers
`GeneralMultiLayer`	General wrapper for a stack of multiple layers
`Linear`	Basic Linear layer.
`BatchNorm1dNode`	BatchNorm for node feature.
`BatchNorm1dEdge`	BatchNorm for edge feature.
`MLP`	Basic MLP model.
`GCNConv`	Graph Convolutional Network (GCN) layer
`SAGEConv`	GraphSAGE Conv layer
`GATConv`	Graph Attention Network (GAT) layer
`GINConv`	Graph Isomorphism Network (GIN) layer
`SplineConv`	SplineCNN layer
`GeneralConv`	A general GNN layer
`GeneralEdgeConv`	A general GNN layer that supports edge features as well
`GeneralSampleEdgeConv`	A general GNN layer that supports edge features and edge sampling
`global_add_pool`	Returns batch-wise graph-level-outputs by adding node features across the node dimension, so that for a single graph \(\mathcal{G}_i\) its output is computed by
`global_mean_pool`	Returns batch-wise graph-level-outputs by averaging node features across the node dimension, so that for a single graph \(\mathcal{G}_i\) its output is computed by
`global_max_pool`	Returns batch-wise graph-level-outputs by taking the channel-wise maximum across the node dimension, so that for a single graph \(\mathcal{G}_i\) its output is computed by

class IntegerFeatureEncoder(emb_dim, num_classes=None)[source]¶

Provides an encoder for integer node features.

Parameters

emb_dim (int) – Output embedding dimension
num_classes (int) – The number of classes for the embedding mapping to learn from

class AtomEncoder(emb_dim, num_classes=None)[source]¶

The atom Encoder used in OGB molecule dataset.

Parameters

emb_dim (int) – Output embedding dimension
num_classes – None

class BondEncoder(emb_dim)[source]¶

The bond Encoder used in OGB molecule dataset.

Parameters: emb_dim (int) – Output edge embedding dimension

GNNLayer(dim_in, dim_out, has_act=True)[source]¶

Wrapper for a GNN layer

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension
has_act (bool) – Whether has activation function after the layer

GNNPreMP(dim_in, dim_out, num_layers)[source]¶

Wrapper for NN layer before GNN message passing

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension
num_layers (int) – Number of layers

class GNNStackStage(dim_in, dim_out, num_layers)[source]¶

Simple Stage that stacks GNN layers

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension
num_layers (int) – Number of GNN layers

class FeatureEncoder(dim_in)[source]¶

Encoding node and edge features

Parameters: dim_in (int) – Input feature dimension

class GNN(dim_in, dim_out, **kwargs)[source]¶

General GNN model: encoder + stage + head

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension
**kwargs (optional) – Optional additional args

class GNNNodeHead(dim_in, dim_out)[source]¶

GNN prediction head for node prediction tasks.

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension. For binary prediction, dim_out=1.

class GNNEdgeHead(dim_in, dim_out)[source]¶

GNN prediction head for edge/link prediction tasks.

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension. For binary prediction, dim_out=1.

class GNNGraphHead(dim_in, dim_out)[source]¶

GNN prediction head for graph prediction tasks. The optional post_mp layer (specified by cfg.gnn.post_mp) is used to transform the pooled embedding using an MLP.

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension. For binary prediction, dim_out=1.

class GeneralLayer(name, layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶

General wrapper for layers

Parameters

name (string) – Name of the layer in registered layer_dict
dim_in (int) – Input dimension
dim_out (int) – Output dimension
has_act (bool) – Whether has activation after the layer
has_bn (bool) – Whether has BatchNorm in the layer
has_l2norm (bool) – Whether has L2 normalization after the layer
**kwargs (optional) – Additional args

class GeneralMultiLayer(name, layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶

General wrapper for a stack of multiple layers

Parameters

name (string) – Name of the layer in registered layer_dict
num_layers (int) – Number of layers in the stack
dim_in (int) – Input dimension
dim_out (int) – Output dimension
dim_inner (int) – The dimension for the inner layers
final_act (bool) – Whether has activation after the layer stack
**kwargs (optional) – Additional args

class Linear(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶

Basic Linear layer.

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension
bias (bool) – Whether has bias term
**kwargs (optional) – Additional args

class BatchNorm1dNode(layer_config: torch_geometric.graphgym.models.layer.LayerConfig)[source]¶

BatchNorm for node feature.

Parameters: dim_in (int) – Input dimension

class BatchNorm1dEdge(layer_config: torch_geometric.graphgym.models.layer.LayerConfig)[source]¶

BatchNorm for edge feature.

Parameters: dim_in (int) – Input dimension

class MLP(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶

Basic MLP model. Here a 1-layer MLP is equivalent to a Linear layer.

Parameters

dim_in (int) – Input dimension
dim_out (int) – Output dimension
bias (bool) – Whether has bias term
dim_inner (int) – The dimension for the inner layers
num_layers (int) – Number of layers in the stack
**kwargs (optional) – Additional args

class GCNConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶: Graph Convolutional Network (GCN) layer

class SAGEConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶: GraphSAGE Conv layer

class GATConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶: Graph Attention Network (GAT) layer

class GINConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶: Graph Isomorphism Network (GIN) layer

class SplineConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶: SplineCNN layer

class GeneralConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶: A general GNN layer

class GeneralEdgeConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶: A general GNN layer that supports edge features as well

class GeneralSampleEdgeConv(layer_config: torch_geometric.graphgym.models.layer.LayerConfig, **kwargs)[source]¶: A general GNN layer that supports edge features and edge sampling

global_add_pool(x, batch, size: Optional[int] = None)[source]¶

Returns batch-wise graph-level-outputs by adding node features across the node dimension, so that for a single graph \(\mathcal{G}_i\) its output is computed by

\[\mathbf{r}_i = \sum_{n=1}^{N_i} \mathbf{x}_n\]

Parameters

x (Tensor) – Node feature matrix \(\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}\).
batch (LongTensor) – Batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each node to a specific example.
size (int, optional) – Batch-size \(B\). Automatically calculated if not given. (default: None)

Return type

Tensor
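
To make the summation concrete, here is a plain-Python sketch of the formula using nested lists in place of tensors. The function name `add_pool` is hypothetical, and the sketch ignores devices, dtypes and autograd.

```python
def add_pool(x, batch, size=None):
    """Sum node feature rows per graph (plain-list stand-in for the tensor version)."""
    if size is None:
        size = max(batch) + 1 if batch else 0  # infer B from the batch vector
    num_features = len(x[0]) if x else 0
    out = [[0.0] * num_features for _ in range(size)]
    for row, graph_id in zip(x, batch):
        for f, value in enumerate(row):
            out[graph_id][f] += value
    return out


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # three nodes, F = 2
batch = [0, 0, 1]                         # nodes 0-1 in graph 0, node 2 in graph 1
pooled = add_pool(x, batch)
```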

global_mean_pool(x, batch, size: Optional[int] = None)[source]¶

Returns batch-wise graph-level-outputs by averaging node features across the node dimension, so that for a single graph \(\mathcal{G}_i\) its output is computed by

\[\mathbf{r}_i = \frac{1}{N_i} \sum_{n=1}^{N_i} \mathbf{x}_n\]

Parameters

x (Tensor) – Node feature matrix \(\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}\).
batch (LongTensor) – Batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each node to a specific example.
size (int, optional) – Batch-size \(B\). Automatically calculated if not given. (default: None)

Return type

Tensor
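
The averaging formula can be sketched the same way with nested lists standing in for tensors. The name `mean_pool` is hypothetical, and returning zeros for a graph with no nodes is an assumption of this sketch.

```python
def mean_pool(x, batch, size=None):
    """Average node feature rows per graph (plain-list stand-in)."""
    if size is None:
        size = max(batch) + 1 if batch else 0
    num_features = len(x[0]) if x else 0
    sums = [[0.0] * num_features for _ in range(size)]
    counts = [0] * size  # N_i for each graph
    for row, graph_id in zip(x, batch):
        counts[graph_id] += 1
        for f, value in enumerate(row):
            sums[graph_id][f] += value
    # Divide each sum by N_i; empty graphs keep zeros (assumption).
    return [[s / max(n, 1) for s in row] for row, n in zip(sums, counts)]


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
batch = [0, 0, 1]
pooled = mean_pool(x, batch)
```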

global_max_pool(x, batch, size: Optional[int] = None)[source]¶

Returns batch-wise graph-level-outputs by taking the channel-wise maximum across the node dimension, so that for a single graph \(\mathcal{G}_i\) its output is computed by

\[\mathbf{r}_i = \mathrm{max}_{n=1}^{N_i} \, \mathbf{x}_n\]

Parameters

x (Tensor) – Node feature matrix \(\mathbf{X} \in \mathbb{R}^{(N_1 + \ldots + N_B) \times F}\).
batch (LongTensor) – Batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each node to a specific example.
size (int, optional) – Batch-size \(B\). Automatically calculated if not given. (default: None)

Return type

Tensor
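
A plain-list sketch of the channel-wise maximum, again with a hypothetical name (`max_pool`). Returning None for a graph without nodes is an assumption of the sketch, not documented behaviour.

```python
def max_pool(x, batch, size=None):
    """Take the per-feature maximum over the nodes of each graph (plain-list stand-in)."""
    if size is None:
        size = max(batch) + 1 if batch else 0
    out = [None] * size
    for row, graph_id in zip(x, batch):
        if out[graph_id] is None:
            out[graph_id] = list(row)  # first node seen for this graph
        else:
            out[graph_id] = [max(a, b) for a, b in zip(out[graph_id], row)]
    return out


x = [[1.0, 9.0], [3.0, 4.0], [5.0, 6.0]]
batch = [0, 0, 1]
pooled = max_pool(x, batch)
```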

Utility Modules ¶

`agg_runs`	Aggregate over different random seeds of a single experiment
`agg_batch`	Aggregate across results from multiple experiments via grid search
`params_count`	Computes the number of parameters.
`match_baseline_cfg`	Match the computational budget of a given baseline model.
`get_current_gpu_usage`	Get the current GPU memory usage.
`auto_select_device`	Auto select device for the experiment.
`is_eval_epoch`	Determines if the model should be evaluated at the current epoch.
`is_ckpt_epoch`	Determines if the model should be checkpointed at the current epoch.
`dict_to_json`	Dump a Python dictionary to JSON file
`dict_list_to_json`	Dump a list of Python dictionaries to JSON file
`dict_to_tb`	Add a dictionary of statistics to a Tensorboard writer
`makedirs_rm_exist`	Make a directory, remove any existing data.
`dummy_context`	Default context manager that does nothing

agg_runs(dir, metric_best='auto')[source]¶

Aggregate over different random seeds of a single experiment

Parameters

dir (str) – Directory of the results, containing 1 experiment
metric_best (str, optional) – The metric for selecting the best validation performance. Options: auto, accuracy, auc.
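
One plausible reading of "aggregate over random seeds" is a mean and standard deviation per metric. The sketch below shows that reading with made-up numbers; the statistics actually reported and the on-disk layout are not specified here, so treat both as assumptions.

```python
import statistics

# Hypothetical best-epoch validation accuracy for three random seeds.
per_seed_accuracy = {1: 0.80, 2: 0.84, 3: 0.82}

values = list(per_seed_accuracy.values())
summary = {
    "accuracy": statistics.mean(values),         # average across seeds
    "accuracy_std": statistics.pstdev(values),   # spread across seeds
}
```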

agg_batch(dir, metric_best='auto')[source]¶

Aggregate across results from multiple experiments via grid search

Parameters

dir (str) – Directory of the results, containing multiple experiments
metric_best (str, optional) – The metric for selecting the best validation performance. Options: auto, accuracy, auc.

params_count(model)[source]¶

Computes the number of parameters.

Parameters: model (nn.Module) – PyTorch model

match_baseline_cfg(cfg_dict, cfg_dict_baseline, verbose=True)[source]¶

Match the computational budget of a given baseline model. The current configuration dictionary will be modified and returned.

Parameters

cfg_dict (dict) – Current experiment’s configuration
cfg_dict_baseline (dict) – Baseline configuration
verbose (str, optional) – Whether to print the matched parameter counts

get_current_gpu_usage()[source]¶: Get the current GPU memory usage.

auto_select_device(memory_max=8000, memory_bias=200, strategy='random')[source]¶

Auto select device for the experiment. Useful when having multiple GPUs.

Parameters

memory_max (int) – Threshold of existing GPU memory usage. GPUs with memory usage beyond this threshold will be deprioritized.
memory_bias (int) – A bias GPU memory usage added to all the GPUs. Avoids a divide-by-zero error.
strategy (str, optional) – ‘random’ (randomly select a GPU) or ‘greedy’ (greedily select a GPU)
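
The interplay of the three parameters can be sketched as below. This is an illustration under assumptions rather than the library's logic: the helper `pick_gpu` is hypothetical, memory usage is passed in as a list instead of being queried from the GPUs, the random strategy is assumed to weight each GPU by 1 / (usage + memory_bias), and GPUs above memory_max are assumed to get weight zero.

```python
import random


def pick_gpu(memory_used, memory_max=8000, memory_bias=200,
             strategy="random", rng=random):
    """Choose a GPU index from per-GPU memory usage (hypothetical helper)."""
    indices = range(len(memory_used))
    all_busy = all(m > memory_max for m in memory_used)
    if strategy == "greedy" or all_busy:
        # Greedy: take the GPU with the least memory in use.
        return min(indices, key=lambda i: memory_used[i])
    if strategy != "random":
        raise ValueError(f"unknown strategy: {strategy!r}")
    # Random: weight by inverse usage; the bias keeps the denominator
    # non-zero for idle GPUs. GPUs over the threshold get weight 0.
    weights = [
        0.0 if m > memory_max else 1.0 / (m + memory_bias)
        for m in memory_used
    ]
    return rng.choices(list(indices), weights=weights, k=1)[0]


usage = [7000, 0, 9000]  # made-up memory usage per GPU
greedy_choice = pick_gpu(usage, strategy="greedy")
random_choice = pick_gpu(usage, strategy="random", rng=random.Random(0))
```

With memory_bias set to 0, the idle GPU in this example would produce a division by zero, which is the case the bias is described as guarding against.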

is_eval_epoch(cur_epoch)[source]¶: Determines if the model should be evaluated at the current epoch.
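
A periodic evaluation schedule of this kind might look like the sketch below. The explicit `eval_period` and `max_epoch` parameters are assumptions (the documented function takes only cur_epoch and presumably reads such settings from the config), as is the choice to always evaluate on the first and last epoch.

```python
def is_eval_epoch_sketch(cur_epoch, eval_period, max_epoch):
    """True on the first epoch, every `eval_period`-th epoch, and the last epoch."""
    return (
        (cur_epoch + 1) % eval_period == 0
        or cur_epoch == 0
        or (cur_epoch + 1) == max_epoch
    )


eval_epochs = [
    epoch for epoch in range(25)
    if is_eval_epoch_sketch(epoch, eval_period=10, max_epoch=25)
]
```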

is_ckpt_epoch(cur_epoch)[source]¶: Determines if the model should be checkpointed at the current epoch.

dict_to_json(dict, fname)[source]¶

Dump a Python dictionary to JSON file

Parameters

dict (dict) – Python dictionary
fname (str) – Output file name
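
A minimal sketch of dumping a dictionary to a JSON file with the standard `json` module. The helper name `dump_dict` and the example statistics are hypothetical.

```python
import json
import os
import tempfile


def dump_dict(stats, fname):
    """Write one dictionary to `fname` as JSON (hypothetical helper)."""
    with open(fname, "w") as f:
        json.dump(stats, f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stats.json")
    dump_dict({"epoch": 3, "loss": 0.25}, path)
    with open(path) as f:
        loaded = json.load(f)  # read it back to check the round trip
```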

dict_list_to_json(dict_list, fname)[source]¶

Dump a list of Python dictionaries to JSON file

Parameters

dict_list (list of dict) – List of Python dictionaries
fname (str) – Output file name
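
For a list of dictionaries, one common layout is one JSON object per line. The sketch below assumes that layout; the format actually written by this function is not stated here, and the helper name `dump_dict_list` is hypothetical.

```python
import json
import os
import tempfile


def dump_dict_list(dict_list, fname):
    """Write each dictionary as one JSON line (assumed layout, hypothetical helper)."""
    with open(fname, "w") as f:
        for entry in dict_list:
            f.write(json.dumps(entry) + "\n")


records = [{"epoch": 0, "loss": 1.0}, {"epoch": 1, "loss": 0.5}]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stats.json")
    dump_dict_list(records, path)
    with open(path) as f:
        loaded = [json.loads(line) for line in f]
```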

dict_to_tb(dict, writer, epoch)[source]¶

Add a dictionary of statistics to a Tensorboard writer

Parameters

dict (dict) – Statistics of experiments; the keys are attribute names, the values are the attribute values
writer – Tensorboard writer object
epoch (int) – The current epoch
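
The logging step can be pictured as one `add_scalar` call per dictionary entry. The sketch uses a small recording stand-in instead of a real Tensorboard writer so that it runs without extra dependencies; `RecordingWriter` and `log_stats` are hypothetical names, and the per-key loop is an assumption about how the statistics are written.

```python
class RecordingWriter:
    """Stand-in exposing an add_scalar(tag, scalar_value, global_step) method."""

    def __init__(self):
        self.scalars = []

    def add_scalar(self, tag, scalar_value, global_step=None):
        self.scalars.append((tag, scalar_value, global_step))


def log_stats(stats, writer, epoch):
    """Write every statistic as a scalar tagged with its key (hypothetical helper)."""
    for name, value in stats.items():
        writer.add_scalar(name, value, epoch)


writer = RecordingWriter()
log_stats({"loss": 0.5, "accuracy": 0.9}, writer, epoch=3)
```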

makedirs_rm_exist(dir)[source]¶

Make a directory, remove any existing data.

Parameters: dir (str) – The directory to be created.
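
A minimal sketch of "make a directory, remove any existing data" using `shutil` and `os`. The helper name `fresh_dir` is hypothetical. Note that this pattern deletes whatever is already at the path.

```python
import os
import shutil
import tempfile


def fresh_dir(path):
    """Create `path` as an empty directory, deleting previous contents (hypothetical helper)."""
    if os.path.isdir(path):
        shutil.rmtree(path)  # destructive: removes everything under path
    os.makedirs(path, exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "run")
    os.makedirs(target)
    open(os.path.join(target, "old.txt"), "w").close()
    fresh_dir(target)
    contents = os.listdir(target)
```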

class dummy_context[source]¶: Default context manager that does nothing
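
A do-nothing context manager is useful as a default when a `with` block is sometimes given a real context and sometimes not. The sketch below shows the general pattern with hypothetical names (`NoOpContext`, `run`); the standard library's `contextlib.nullcontext` serves a similar purpose.

```python
class NoOpContext:
    """Context manager whose enter and exit steps do nothing."""

    def __enter__(self):
        return None

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # do not swallow exceptions


def run(step, ctx_factory=NoOpContext):
    """Run `step` inside a context; the default context has no effect."""
    with ctx_factory():
        return step()


result = run(lambda: 1 + 1)
```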
