torch_geometric.loader
A data loader which merges data objects from a torch_geometric.data.Dataset to a minibatch.

A data loader that performs minibatch sampling from node information, using a generic BaseSampler implementation.

A data loader that performs minibatch sampling from link information, using a generic BaseSampler implementation.

A data loader that performs neighbor sampling as introduced in the "Inductive Representation Learning on Large Graphs" paper.

A link-based data loader derived as an extension of the node-based torch_geometric.loader.NeighborLoader.

The Heterogeneous Graph Sampler from the "Heterogeneous Graph Transformer" paper. 

Clusters/partitions a graph data object into multiple subgraphs, as motivated by the "Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks" paper.

The data loader scheme from the "Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks" paper which merges partitioned subgraphs and their between-cluster links from a large-scale graph data object to form a minibatch.

The GraphSAINT sampler base class from the "GraphSAINT: Graph Sampling Based Inductive Learning Method" paper. 

The GraphSAINT node sampler class (see 

The GraphSAINT edge sampler class (see 

The GraphSAINT random walk sampler class (see 

The ShaDow \(k\)-hop sampler from the "Decoupling the Depth and Scope of Graph Neural Networks" paper.

A data loader that randomly samples nodes within a graph and returns their induced subgraph. 

A loader that returns a tuple of data objects by sampling from multiple 

A data loader which batches data objects from a 

A data loader which batches data objects from a 

A data loader which merges successive events of a 

The neighbor sampler from the "Inductive Representation Learning on Large Graphs" paper, which allows for minibatch training of GNNs on large-scale graphs where full-batch training is not feasible.

A weighted random sampler that randomly samples elements according to class distribution. 

Dynamically adds samples to a minibatch up to a maximum size (either based on number of nodes or number of edges). 

A GPU prefetcher class for asynchronously transferring data of a 

A loader to cache minibatch outputs, e.g., obtained during 

A context manager to enable CPU affinity for data loader workers (only used when running on CPU devices). 
class DataLoader(dataset: Union[Dataset, Sequence[BaseData], DatasetAdapter], batch_size: int = 1, shuffle: bool = False, follow_batch: Optional[List[str]] = None, exclude_keys: Optional[List[str]] = None, **kwargs)
A data loader which merges data objects from a torch_geometric.data.Dataset to a minibatch. Data objects can be either of type Data or HeteroData.
Parameters:
dataset (Dataset) – The dataset from which to load the data.
batch_size (int, optional) – How many samples per batch to load. (default: 1)
shuffle (bool, optional) – If set to True, the data will be reshuffled at every epoch. (default: False)
follow_batch (List[str], optional) – Creates assignment batch vectors for each key in the list. (default: None)
exclude_keys (List[str], optional) – Will exclude each key in the list. (default: None)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader.
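To make "merging data objects to a minibatch" concrete, the following is a minimal pure-Python sketch of the idea. It is not the library's implementation: the function name collate_graphs and the dictionary layout are hypothetical and chosen only for illustration. The sketch concatenates node features, shifts edge indices by the number of nodes merged so far, and builds an assignment batch vector of the kind that follow_batch requests for additional keys.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: {"x": [...node features...], "edges": [(src, dst), ...]}
Graph = Dict[str, list]


def collate_graphs(graphs: List[Graph]) -> Dict[str, list]:
    """Merge several small graphs into one large disconnected graph."""
    x: list = []
    edges: List[Tuple[int, int]] = []
    batch: List[int] = []
    for graph_idx, graph in enumerate(graphs):
        offset = len(x)  # number of nodes merged so far
        x.extend(graph["x"])
        # Shift node indices so they point into the merged node list.
        edges.extend((src + offset, dst + offset) for src, dst in graph["edges"])
        # Record which input graph each node came from.
        batch.extend([graph_idx] * len(graph["x"]))
    return {"x": x, "edges": edges, "batch": batch}


g1 = {"x": [0.1, 0.2], "edges": [(0, 1)]}
g2 = {"x": [0.3, 0.4, 0.5], "edges": [(0, 2), (1, 2)]}
merged = collate_graphs([g1, g2])
print(merged["edges"])  # [(0, 1), (2, 4), (3, 4)]
print(merged["batch"])  # [0, 0, 1, 1, 1]
```

The batch vector is what lets downstream code pool node-level values back into one value per input graph.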
class NodeLoader(data: Union[Data, HeteroData, Tuple[FeatureStore, GraphStore]], node_sampler: BaseSampler, input_nodes: Union[Tensor, None, str, Tuple[str, Optional[Tensor]]] = None, input_time: Optional[Tensor] = None, transform: Optional[Callable] = None, transform_sampler_output: Optional[Callable] = None, filter_per_worker: Optional[bool] = None, custom_cls: Optional[HeteroData] = None, input_id: Optional[Tensor] = None, **kwargs)
A data loader that performs minibatch sampling from node information, using a generic BaseSampler implementation that defines a sample_from_nodes() function and is supported on the provided input data object.
Parameters:
data (Any) – A Data, HeteroData, or (FeatureStore, GraphStore) data object.
node_sampler (torch_geometric.sampler.BaseSampler) – The sampler implementation to be used with this loader. Needs to implement sample_from_nodes(). The sampler implementation must be compatible with the input data object.
input_nodes (torch.Tensor or str or Tuple[str, torch.Tensor]) – The indices of seed nodes to start sampling from. Needs to be given as either a torch.LongTensor or a torch.BoolTensor. If set to None, all nodes will be considered. In heterogeneous graphs, needs to be passed as a tuple that holds the node type and node indices. (default: None)
input_time (torch.Tensor, optional) – Optional values to override the timestamp for the input nodes given in input_nodes. If not set, will use the timestamps in time_attr as default (if present). The time_attr needs to be set for this to work. (default: None)
transform (callable, optional) – A function/transform that takes in a sampled minibatch and returns a transformed version. (default: None)
transform_sampler_output (callable, optional) – A function/transform that takes in a torch_geometric.sampler.SamplerOutput and returns a transformed version. (default: None)
filter_per_worker (bool, optional) – If set to True, will filter the returned data in each worker's subprocess. If set to False, will filter the returned data in the main process. If set to None, will automatically infer the decision based on whether data partially lives on the GPU (filter_per_worker=True) or entirely on the CPU (filter_per_worker=False). There exist different trade-offs for setting this option. Specifically, setting this option to True for in-memory datasets will move all features to shared memory, which may result in too many open file handles. (default: None)
custom_cls (HeteroData, optional) – A custom HeteroData class to return for minibatches in case of remote backends. (default: None)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as batch_size, shuffle, drop_last or num_workers.
collate_fn(index: Union[Tensor, List[int]]) → Any
Samples a subgraph from a batch of input nodes.
filter_fn(out: Union[SamplerOutput, HeteroSamplerOutput]) → Union[Data, HeteroData]
Joins the sampled nodes with their corresponding features, returning the resulting Data or HeteroData object to be used downstream.
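The join that filter_fn describes can be pictured with a small pure-Python sketch. This is only an illustration under stated assumptions: join_features is a hypothetical helper, and the real sampler output carries more fields than the two lists used here. Given the global IDs of the sampled nodes and the sampled edges, the sketch gathers the matching feature rows and relabels the edges to local indices.

```python
from typing import Dict, List, Tuple


def join_features(
    features: List[List[float]],
    sampled_nodes: List[int],
    sampled_edges: List[Tuple[int, int]],
) -> Dict[str, list]:
    """Gather features for sampled nodes and relabel edges to local indices."""
    # Position of each global node ID within the sampled subgraph.
    local_id = {node: i for i, node in enumerate(sampled_nodes)}
    return {
        "n_id": list(sampled_nodes),
        "x": [features[node] for node in sampled_nodes],
        "edges": [(local_id[s], local_id[d]) for s, d in sampled_edges],
    }


features = [[0.0], [1.0], [2.0], [3.0], [4.0]]
sub = join_features(features, sampled_nodes=[3, 0, 4], sampled_edges=[(0, 3), (4, 3)])
print(sub["x"])      # [[3.0], [0.0], [4.0]]
print(sub["edges"])  # [(1, 0), (2, 0)]
```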
class LinkLoader(data: Union[Data, HeteroData, Tuple[FeatureStore, GraphStore]], link_sampler: BaseSampler, edge_label_index: Union[Tensor, None, Tuple[str, str, str], Tuple[Tuple[str, str, str], Optional[Tensor]]] = None, edge_label: Optional[Tensor] = None, edge_label_time: Optional[Tensor] = None, neg_sampling: Optional[NegativeSampling] = None, neg_sampling_ratio: Optional[Union[int, float]] = None, transform: Optional[Callable] = None, transform_sampler_output: Optional[Callable] = None, filter_per_worker: Optional[bool] = None, custom_cls: Optional[HeteroData] = None, input_id: Optional[Tensor] = None, **kwargs)
A data loader that performs minibatch sampling from link information, using a generic BaseSampler implementation that defines a sample_from_edges() function and is supported on the provided input data object.
Note
Negative sampling is currently implemented in an approximate way, i.e. negative edges may contain false negatives.
Parameters:
data (Any) – A Data, HeteroData, or (FeatureStore, GraphStore) data object.
link_sampler (torch_geometric.sampler.BaseSampler) – The sampler implementation to be used with this loader. Needs to implement sample_from_edges(). The sampler implementation must be compatible with the input data object.
edge_label_index (Tensor or EdgeType or Tuple[EdgeType, Tensor]) – The edge indices, holding source and destination nodes to start sampling from. If set to None, all edges will be considered. In heterogeneous graphs, needs to be passed as a tuple that holds the edge type and corresponding edge indices. (default: None)
edge_label (Tensor, optional) – The labels of edge indices from which to start sampling. Must be the same length as edge_label_index. (default: None)
edge_label_time (Tensor, optional) – The timestamps of edge indices from which to start sampling. Must be the same length as edge_label_index. If set, temporal sampling will be used such that neighbors are guaranteed to fulfill temporal constraints, i.e., neighbors have an earlier timestamp than the output edge. The time_attr needs to be set for this to work. (default: None)
neg_sampling (NegativeSampling, optional) – The negative sampling configuration. For negative sampling mode "binary", samples can be accessed via the attributes edge_label_index and edge_label in the respective edge type of the returned minibatch. In case edge_label does not exist, it will be automatically created and represents a binary classification task (0 = negative edge, 1 = positive edge). In case edge_label does exist, it has to be a categorical label from 0 to num_classes - 1. After negative sampling, label 0 represents negative edges, and labels 1 to num_classes represent the labels of positive edges. Note that returned labels are of type torch.float for binary classification (to facilitate the ease-of-use of F.binary_cross_entropy()) and of type torch.long for multi-class classification (to facilitate the ease-of-use of F.cross_entropy()). For negative sampling mode "triplet", samples can be accessed via the attributes src_index, dst_pos_index and dst_neg_index in the respective node types of the returned minibatch. edge_label needs to be None for "triplet" negative sampling mode. If set to None, no negative sampling strategy is applied. (default: None)
neg_sampling_ratio (int or float, optional) – The ratio of sampled negative edges to the number of positive edges. Deprecated in favor of the neg_sampling argument. (default: None)
transform (callable, optional) – A function/transform that takes in a sampled minibatch and returns a transformed version. (default: None)
transform_sampler_output (callable, optional) – A function/transform that takes in a torch_geometric.sampler.SamplerOutput and returns a transformed version. (default: None)
filter_per_worker (bool, optional) – If set to True, will filter the returned data in each worker's subprocess. If set to False, will filter the returned data in the main process. If set to None, will automatically infer the decision based on whether data partially lives on the GPU (filter_per_worker=True) or entirely on the CPU (filter_per_worker=False). There exist different trade-offs for setting this option. Specifically, setting this option to True for in-memory datasets will move all features to shared memory, which may result in too many open file handles. (default: None)
custom_cls (HeteroData, optional) – A custom HeteroData class to return for minibatches in case of remote backends. (default: None)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as batch_size, shuffle, drop_last or num_workers.
collate_fn(index: Union[Tensor, List[int]]) → Any
Samples a subgraph from a batch of input edges.
filter_fn(out: Union[SamplerOutput, HeteroSamplerOutput]) → Union[Data, HeteroData]
Joins the sampled nodes with their corresponding features, returning the resulting Data or HeteroData object to be used downstream.
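The label conventions of the "binary" negative sampling mode can be sketched in plain Python. The helper add_binary_negatives and its arguments are hypothetical and only mirror the behaviour described in the parameter text; this is not how the library samples. Negatives are drawn at random without checking them against the positive edges, which is why false negatives are possible.

```python
import random
from typing import List, Optional, Tuple

Edge = Tuple[int, int]


def add_binary_negatives(
    pos_edges: List[Edge],
    num_nodes: int,
    edge_label: Optional[List[int]] = None,
    ratio: float = 1.0,
    seed: int = 0,
) -> Tuple[List[Edge], list]:
    """Append random negative edges and build the combined label list."""
    rng = random.Random(seed)
    num_neg = int(round(ratio * len(pos_edges)))
    # Approximate sampling: a drawn pair is never compared with the
    # positive edges, so a "negative" may in fact be a real edge.
    neg_edges = [(rng.randrange(num_nodes), rng.randrange(num_nodes))
                 for _ in range(num_neg)]
    if edge_label is None:
        # Binary task: float labels, 1.0 = positive, 0.0 = negative.
        labels = [1.0] * len(pos_edges) + [0.0] * num_neg
    else:
        # Multi-class task: shift positives by one so 0 can mean "negative".
        labels = [label + 1 for label in edge_label] + [0] * num_neg
    return pos_edges + neg_edges, labels


pos = [(0, 1), (1, 2), (2, 3)]
all_binary, binary = add_binary_negatives(pos, num_nodes=4)
all_multi, multi = add_binary_negatives(pos, num_nodes=4, edge_label=[0, 2, 1])
print(binary)  # [1.0, 1.0, 1.0, 0.0, 0.0, 0.0]
print(multi)   # [1, 3, 2, 0, 0, 0]
```

Float labels suit a binary cross-entropy loss, while the shifted integer labels suit a cross-entropy loss over num_classes + 1 classes.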
class NeighborLoader(data: Union[Data, HeteroData, Tuple[FeatureStore, GraphStore]], num_neighbors: Union[List[int], Dict[Tuple[str, str, str], List[int]]], input_nodes: Union[Tensor, None, str, Tuple[str, Optional[Tensor]]] = None, input_time: Optional[Tensor] = None, replace: bool = False, subgraph_type: Union[SubgraphType, str] = 'directional', disjoint: bool = False, temporal_strategy: str = 'uniform', time_attr: Optional[str] = None, weight_attr: Optional[str] = None, transform: Optional[Callable] = None, transform_sampler_output: Optional[Callable] = None, is_sorted: bool = False, filter_per_worker: Optional[bool] = None, neighbor_sampler: Optional[NeighborSampler] = None, directed: bool = True, **kwargs)
A data loader that performs neighbor sampling as introduced in the “Inductive Representation Learning on Large Graphs” paper. This loader allows for minibatch training of GNNs on large-scale graphs where full-batch training is not feasible.
More specifically, num_neighbors denotes how many neighbors are sampled for each node in each iteration. NeighborLoader takes in this list of num_neighbors and iteratively samples num_neighbors[i] for each node involved in iteration i - 1.
Sampled nodes are sorted based on the order in which they were sampled. In particular, the first batch_size nodes represent the set of original minibatch nodes.
    from torch_geometric.datasets import Planetoid
    from torch_geometric.loader import NeighborLoader

    data = Planetoid(path, name='Cora')[0]

    loader = NeighborLoader(
        data,
        # Sample 30 neighbors for each node for 2 iterations
        num_neighbors=[30] * 2,
        # Use a batch size of 128 for sampling training nodes
        batch_size=128,
        input_nodes=data.train_mask,
    )

    sampled_data = next(iter(loader))
    print(sampled_data.batch_size)
    >>> 128
By default, the data loader will only include the edges that were originally sampled (directed = True). This option should only be used in case the number of hops is equivalent to the number of GNN layers. In case the number of GNN layers is greater than the number of hops, consider setting directed = False, which will include all edges between all sampled nodes (but is slightly slower as a result).
Furthermore, NeighborLoader works for both homogeneous graphs stored via Data as well as heterogeneous graphs stored via HeteroData. When operating in heterogeneous graphs, up to num_neighbors neighbors will be sampled for each edge_type. However, more fine-grained control over the amount of sampled neighbors of individual edge types is possible:
    from torch_geometric.datasets import OGB_MAG
    from torch_geometric.loader import NeighborLoader

    hetero_data = OGB_MAG(path)[0]

    loader = NeighborLoader(
        hetero_data,
        # Sample 30 neighbors for each node and edge type for 2 iterations
        num_neighbors={key: [30] * 2 for key in hetero_data.edge_types},
        # Use a batch size of 128 for sampling training nodes of type paper
        batch_size=128,
        input_nodes=('paper', hetero_data['paper'].train_mask),
    )

    sampled_hetero_data = next(iter(loader))
    print(sampled_hetero_data['paper'].batch_size)
    >>> 128
Note
For an example of using NeighborLoader, see examples/hetero/to_hetero_mag.py.
The NeighborLoader will return subgraphs where global node indices are mapped to local indices corresponding to this specific subgraph. However, it is often desirable to map the nodes of the current subgraph back to the global node indices. The NeighborLoader will include this mapping as part of the data object:
    loader = NeighborLoader(data, ...)
    sampled_data = next(iter(loader))
    print(sampled_data.n_id)  # Global node index of each node in batch.
In particular, the data loader will add the following attributes to the returned minibatch:
batch_size: The number of seed nodes (first nodes in the batch)
n_id: The global node index for every sampled node
e_id: The global edge index for every sampled edge
input_id: The global index of the input_nodes
num_sampled_nodes: The number of sampled nodes in each hop
num_sampled_edges: The number of sampled edges in each hop
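The iterative sampling and the n_id and num_sampled_nodes attributes listed above can be illustrated with a small pure-Python sketch. It is a simplified stand-in rather than the library's sampler: sample_neighbors and the in_neighbors dictionary are hypothetical, the graph is homogeneous, and sampling is without replacement. Seed nodes come first in n_id, later nodes follow in the order they were first sampled, and edges are stored with local indices.

```python
import random
from typing import Dict, List, Tuple


def sample_neighbors(
    in_neighbors: Dict[int, List[int]],
    seeds: List[int],
    num_neighbors: List[int],
    seed: int = 0,
) -> Tuple[List[int], List[Tuple[int, int]], List[int]]:
    """Return (n_id, local edges, number of newly sampled nodes per hop)."""
    rng = random.Random(seed)
    n_id = list(seeds)  # seed nodes always come first
    local = {node: i for i, node in enumerate(n_id)}
    edges: List[Tuple[int, int]] = []
    num_sampled_nodes = [len(seeds)]
    frontier = list(seeds)
    for fanout in num_neighbors:
        next_frontier: List[int] = []
        for dst in frontier:
            candidates = in_neighbors.get(dst, [])
            # A fanout of -1 keeps every neighbor.
            if fanout < 0 or fanout >= len(candidates):
                picked = list(candidates)
            else:
                picked = rng.sample(candidates, fanout)
            for src in picked:
                if src not in local:  # first time this node is seen
                    local[src] = len(n_id)
                    n_id.append(src)
                    next_frontier.append(src)
                edges.append((local[src], local[dst]))
        num_sampled_nodes.append(len(next_frontier))
        frontier = next_frontier
    return n_id, edges, num_sampled_nodes


graph = {0: [1, 2], 1: [3], 2: [3, 4], 3: [], 4: [0]}
n_id, edges, per_hop = sample_neighbors(graph, seeds=[0], num_neighbors=[-1, -1])
print(n_id)     # [0, 1, 2, 3, 4]
print(edges)    # [(1, 0), (2, 0), (3, 1), (3, 2), (4, 2)]
print(per_hop)  # [1, 2, 2]
```

Indexing n_id with a local node index recovers the global index, which is the mapping the loader exposes.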
Parameters:
data (Any) – A Data, HeteroData, or (FeatureStore, GraphStore) data object.
num_neighbors (List[int] or Dict[Tuple[str, str, str], List[int]]) – The number of neighbors to sample for each node in each iteration. If an entry is set to -1, all neighbors will be included. In heterogeneous graphs, may also take in a dictionary denoting the amount of neighbors to sample for each individual edge type.
input_nodes (torch.Tensor or str or Tuple[str, torch.Tensor]) – The indices of nodes for which neighbors are sampled to create minibatches. Needs to be given as either a torch.LongTensor or a torch.BoolTensor. If set to None, all nodes will be considered. In heterogeneous graphs, needs to be passed as a tuple that holds the node type and node indices. (default: None)
input_time (torch.Tensor, optional) – Optional values to override the timestamp for the input nodes given in input_nodes. If not set, will use the timestamps in time_attr as default (if present). The time_attr needs to be set for this to work. (default: None)
replace (bool, optional) – If set to True, will sample with replacement. (default: False)
subgraph_type (SubgraphType or str, optional) – The type of the returned subgraph. If set to "directional", the returned subgraph only holds the sampled (directed) edges which are necessary to compute representations for the sampled seed nodes. If set to "bidirectional", sampled edges are converted to bidirectional edges. If set to "induced", the returned subgraph contains the induced subgraph of all sampled nodes. (default: "directional")
disjoint (bool, optional) – If set to True, each seed node will create its own disjoint subgraph, and minibatch outputs will have a batch vector holding the mapping of nodes to their respective subgraph. Will get automatically set to True in case of temporal sampling. (default: False)
temporal_strategy (str, optional) – The sampling strategy when using temporal sampling ("uniform", "last"). If set to "uniform", will sample uniformly across neighbors that fulfill temporal constraints. If set to "last", will sample the last num_neighbors that fulfill temporal constraints. (default: "uniform")
time_attr (str, optional) – The name of the attribute that denotes timestamps for either the nodes or edges in the graph. If set, temporal sampling will be used such that neighbors are guaranteed to fulfill temporal constraints, i.e. neighbors have a timestamp earlier than or equal to that of the center node. (default: None)
weight_attr (str, optional) – The name of the attribute that denotes edge weights in the graph. If set, weighted/biased sampling will be used such that neighbors are more likely to get sampled the higher their edge weights are. Edge weights do not need to sum to one, but must be non-negative, finite and have a non-zero sum within local neighborhoods. (default: None)
transform (callable, optional) – A function/transform that takes in a sampled minibatch and returns a transformed version. (default: None)
transform_sampler_output (callable, optional) – A function/transform that takes in a torch_geometric.sampler.SamplerOutput and returns a transformed version. (default: None)
is_sorted (bool, optional) – If set to True, assumes that edge_index is sorted by column. If time_attr is set, additionally requires that rows are sorted according to time within individual neighborhoods. This avoids internal re-sorting of the data and can improve runtime and memory efficiency. (default: False)
filter_per_worker (bool, optional) – If set to True, will filter the returned data in each worker's subprocess. If set to False, will filter the returned data in the main process. If set to None, will automatically infer the decision based on whether data partially lives on the GPU (filter_per_worker=True) or entirely on the CPU (filter_per_worker=False). There exist different trade-offs for setting this option. Specifically, setting this option to True for in-memory datasets will move all features to shared memory, which may result in too many open file handles. (default: None)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as batch_size, shuffle, drop_last or num_workers.
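The difference between the two temporal_strategy values can be sketched in a few lines of plain Python. The function temporal_sample and its (neighbor, timestamp) pair layout are hypothetical and only illustrate the constraint described for time_attr, namely that a neighbor's timestamp must not be later than that of the center node.

```python
import random
from typing import List, Tuple


def temporal_sample(
    neighbors: List[Tuple[int, int]],  # (neighbor id, timestamp) pairs
    seed_time: int,
    num_neighbors: int,
    strategy: str = "uniform",
    seed: int = 0,
) -> List[int]:
    """Pick up to num_neighbors neighbors with timestamp <= seed_time."""
    valid = [(node, t) for node, t in neighbors if t <= seed_time]
    if len(valid) <= num_neighbors:
        return [node for node, _ in valid]
    if strategy == "last":
        # Keep the most recent valid neighbors.
        valid.sort(key=lambda pair: pair[1])
        return [node for node, _ in valid[-num_neighbors:]]
    if strategy == "uniform":
        # Every valid neighbor is equally likely to be chosen.
        rng = random.Random(seed)
        return [node for node, _ in rng.sample(valid, num_neighbors)]
    raise ValueError(f"unknown strategy: {strategy}")


history = [(10, 1), (11, 4), (12, 2), (13, 9), (14, 3)]
print(temporal_sample(history, seed_time=4, num_neighbors=2, strategy="last"))
# [14, 11]
```

Neighbor 13 has timestamp 9 and is never returned for seed_time=4, whichever strategy is used.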
class LinkNeighborLoader(data: Union[Data, HeteroData, Tuple[FeatureStore, GraphStore]], num_neighbors: Union[List[int], Dict[Tuple[str, str, str], List[int]]], edge_label_index: Union[Tensor, None, Tuple[str, str, str], Tuple[Tuple[str, str, str], Optional[Tensor]]] = None, edge_label: Optional[Tensor] = None, edge_label_time: Optional[Tensor] = None, replace: bool = False, subgraph_type: Union[SubgraphType, str] = 'directional', disjoint: bool = False, temporal_strategy: str = 'uniform', neg_sampling: Optional[NegativeSampling] = None, neg_sampling_ratio: Optional[Union[int, float]] = None, time_attr: Optional[str] = None, weight_attr: Optional[str] = None, transform: Optional[Callable] = None, transform_sampler_output: Optional[Callable] = None, is_sorted: bool = False, filter_per_worker: Optional[bool] = None, neighbor_sampler: Optional[NeighborSampler] = None, directed: bool = True, **kwargs)
A link-based data loader derived as an extension of the node-based torch_geometric.loader.NeighborLoader. This loader allows for minibatch training of GNNs on large-scale graphs where full-batch training is not feasible.
More specifically, this loader first selects a sample of edges from the set of input edges edge_label_index (which may or may not be edges in the original graph) and then constructs a subgraph from all the nodes present in this list by sampling num_neighbors neighbors in each iteration.
    from torch_geometric.datasets import Planetoid
    from torch_geometric.loader import LinkNeighborLoader

    data = Planetoid(path, name='Cora')[0]

    loader = LinkNeighborLoader(
        data,
        # Sample 30 neighbors for each node for 2 iterations
        num_neighbors=[30] * 2,
        # Use a batch size of 128 for sampling training edges
        batch_size=128,
        edge_label_index=data.edge_index,
    )

    sampled_data = next(iter(loader))
    print(sampled_data)
    >>> Data(x=[1368, 1433], edge_index=[2, 3103], y=[1368], train_mask=[1368], val_mask=[1368], test_mask=[1368], edge_label_index=[2, 128])
It is additionally possible to provide edge labels for sampled edges, which are then added to the batch:
    import torch

    loader = LinkNeighborLoader(
        data,
        num_neighbors=[30] * 2,
        batch_size=128,
        edge_label_index=data.edge_index,
        edge_label=torch.ones(data.edge_index.size(1)),
    )

    sampled_data = next(iter(loader))
    print(sampled_data)
    >>> Data(x=[1368, 1433], edge_index=[2, 3103], y=[1368], train_mask=[1368], val_mask=[1368], test_mask=[1368], edge_label_index=[2, 128], edge_label=[128])
The rest of the functionality mirrors that of NeighborLoader, including support for heterogeneous graphs. In particular, the data loader will add the following attributes to the returned minibatch:
n_id: The global node index for every sampled node
e_id: The global edge index for every sampled edge
input_id: The global index of the edge_label_index
num_sampled_nodes: The number of sampled nodes in each hop
num_sampled_edges: The number of sampled edges in each hop
Note
Negative sampling is currently implemented in an approximate way, i.e. negative edges may contain false negatives.
Warning
Note that the sampling scheme is independent from the edge we are making a prediction for. That is, by default supervision edges in edge_label_index will not get masked out during sampling. In case there exists an overlap between message passing edges in data.edge_index and supervision edges in edge_label_index, you might end up sampling an edge you are making a prediction for. You can generally avoid this behavior (if desired) by making data.edge_index and edge_label_index two disjoint sets of edges, e.g., via the RandomLinkSplit transformation and its disjoint_train_ratio argument.
Parameters:
data (Any) – A Data, HeteroData, or (FeatureStore, GraphStore) data object.
num_neighbors (List[int] or Dict[Tuple[str, str, str], List[int]]) – The number of neighbors to sample for each node in each iteration. If an entry is set to -1, all neighbors will be included. In heterogeneous graphs, may also take in a dictionary denoting the amount of neighbors to sample for each individual edge type.
edge_label_index (Tensor or EdgeType or Tuple[EdgeType, Tensor]) – The edge indices for which neighbors are sampled to create minibatches. If set to None, all edges will be considered. In heterogeneous graphs, needs to be passed as a tuple that holds the edge type and corresponding edge indices. (default: None)
edge_label (Tensor, optional) – The labels of edge indices for which neighbors are sampled. Must be the same length as edge_label_index. If set to None, it is set to torch.zeros(…) internally. (default: None)
edge_label_time (Tensor, optional) – The timestamps for edge indices for which neighbors are sampled. Must be the same length as edge_label_index. If set, temporal sampling will be used such that neighbors are guaranteed to fulfill temporal constraints, i.e., neighbors have an earlier timestamp than the output edge. The time_attr needs to be set for this to work. (default: None)
replace (bool, optional) – If set to True, will sample with replacement. (default: False)
subgraph_type (SubgraphType or str, optional) – The type of the returned subgraph. If set to "directional", the returned subgraph only holds the sampled (directed) edges which are necessary to compute representations for the sampled seed nodes. If set to "bidirectional", sampled edges are converted to bidirectional edges. If set to "induced", the returned subgraph contains the induced subgraph of all sampled nodes. (default: "directional")
disjoint (bool, optional) – If set to True, each seed node will create its own disjoint subgraph, and minibatch outputs will have a batch vector holding the mapping of nodes to their respective subgraph. Will get automatically set to True in case of temporal sampling. (default: False)
temporal_strategy (str, optional) – The sampling strategy when using temporal sampling ("uniform", "last"). If set to "uniform", will sample uniformly across neighbors that fulfill temporal constraints. If set to "last", will sample the last num_neighbors that fulfill temporal constraints. (default: "uniform")
neg_sampling (NegativeSampling, optional) – The negative sampling configuration. For negative sampling mode "binary", samples can be accessed via the attributes edge_label_index and edge_label in the respective edge type of the returned minibatch. In case edge_label does not exist, it will be automatically created and represents a binary classification task (0 = negative edge, 1 = positive edge). In case edge_label does exist, it has to be a categorical label from 0 to num_classes - 1. After negative sampling, label 0 represents negative edges, and labels 1 to num_classes represent the labels of positive edges. Note that returned labels are of type torch.float for binary classification (to facilitate the ease-of-use of F.binary_cross_entropy()) and of type torch.long for multi-class classification (to facilitate the ease-of-use of F.cross_entropy()). For negative sampling mode "triplet", samples can be accessed via the attributes src_index, dst_pos_index and dst_neg_index in the respective node types of the returned minibatch. edge_label needs to be None for "triplet" negative sampling mode. If set to None, no negative sampling strategy is applied. (default: None)
neg_sampling_ratio (int or float, optional) – The ratio of sampled negative edges to the number of positive edges. Deprecated in favor of the neg_sampling argument. (default: None)
time_attr (str, optional) – The name of the attribute that denotes timestamps for either the nodes or edges in the graph. If set, temporal sampling will be used such that neighbors are guaranteed to fulfill temporal constraints, i.e. neighbors have a timestamp earlier than or equal to that of the center node. Only used if edge_label_time is set. (default: None)
weight_attr (str, optional) – The name of the attribute that denotes edge weights in the graph. If set, weighted/biased sampling will be used such that neighbors are more likely to get sampled the higher their edge weights are. Edge weights do not need to sum to one, but must be non-negative, finite and have a non-zero sum within local neighborhoods. (default: None)
transform (callable, optional) – A function/transform that takes in a sampled minibatch and returns a transformed version. (default: None)
transform_sampler_output (callable, optional) – A function/transform that takes in a torch_geometric.sampler.SamplerOutput and returns a transformed version. (default: None)
is_sorted (bool, optional) – If set to True, assumes that edge_index is sorted by column. If time_attr is set, additionally requires that rows are sorted according to time within individual neighborhoods. This avoids internal re-sorting of the data and can improve runtime and memory efficiency. (default: False)
filter_per_worker (bool, optional) – If set to True, will filter the returned data in each worker's subprocess. If set to False, will filter the returned data in the main process. If set to None, will automatically infer the decision based on whether data partially lives on the GPU (filter_per_worker=True) or entirely on the CPU (filter_per_worker=False). There exist different trade-offs for setting this option. Specifically, setting this option to True for in-memory datasets will move all features to shared memory, which may result in too many open file handles. (default: None)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as batch_size, shuffle, drop_last or num_workers.
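The warning for this loader recommends keeping message passing edges and supervision edges disjoint. The pure-Python sketch below shows the idea only; it is not the RandomLinkSplit transform, and split_supervision_edges with its supervision_ratio argument is a hypothetical helper. In terms of the loader arguments, the first returned list would play the role of data.edge_index and the second that of edge_label_index.

```python
import random
from typing import List, Tuple

Edge = Tuple[int, int]


def split_supervision_edges(
    edges: List[Edge],
    supervision_ratio: float,
    seed: int = 0,
) -> Tuple[List[Edge], List[Edge]]:
    """Split edges into disjoint message passing and supervision sets."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    num_supervision = int(len(shuffled) * supervision_ratio)
    supervision = shuffled[:num_supervision]
    message_passing = shuffled[num_supervision:]
    return message_passing, supervision


all_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0),
             (0, 2), (1, 3), (2, 4), (3, 0), (4, 1)]
mp_edges, sup_edges = split_supervision_edges(all_edges, supervision_ratio=0.5)
print(len(mp_edges), len(sup_edges))  # 5 5
print(set(mp_edges) & set(sup_edges))  # set()
```

Because no edge appears in both lists, neighbor sampling over the message passing edges cannot return an edge that is also a prediction target.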
class HGTLoader(data: Union[HeteroData, Tuple[FeatureStore, GraphStore]], num_samples: Union[List[int], Dict[str, List[int]]], input_nodes: Union[str, Tuple[str, Optional[Tensor]]], is_sorted: bool = False, transform: Optional[Callable] = None, transform_sampler_output: Optional[Callable] = None, filter_per_worker: Optional[bool] = None, **kwargs)
The Heterogeneous Graph Sampler from the “Heterogeneous Graph Transformer” paper. This loader allows for minibatch training of GNNs on large-scale graphs where full-batch training is not feasible.
HGTLoader tries to (1) keep a similar number of nodes and edges for each type and (2) keep the sampled subgraph dense to minimize the information loss and reduce the sample variance.
Methodically, HGTLoader keeps track of a node budget for each node type, which is then used to determine the sampling probability of a node. In particular, the probability of sampling a node is determined by the number of connections to already sampled nodes and their node degrees. With this, HGTLoader will sample a fixed amount of neighbors for each node type in each iteration, as given by the num_samples argument.
Sampled nodes are sorted based on the order in which they were sampled. In particular, the first batch_size nodes represent the set of original minibatch nodes.
Note
For an example of using HGTLoader, see examples/hetero/to_hetero_mag.py.
    from torch_geometric.loader import HGTLoader
    from torch_geometric.datasets import OGB_MAG

    hetero_data = OGB_MAG(path)[0]

    loader = HGTLoader(
        hetero_data,
        # Sample 512 nodes per type and per iteration for 4 iterations
        num_samples={key: [512] * 4 for key in hetero_data.node_types},
        # Use a batch size of 128 for sampling training nodes of type paper
        batch_size=128,
        input_nodes=('paper', hetero_data['paper'].train_mask),
    )

    sampled_hetero_data = next(iter(loader))
    print(sampled_hetero_data['paper'].batch_size)
    >>> 128
Parameters:
data (Any) – A Data, HeteroData, or (FeatureStore, GraphStore) data object.
num_samples (List[int] or Dict[str, List[int]]) – The number of nodes to sample in each iteration and for each node type. If given as a list, will sample the same amount of nodes for each node type.
input_nodes (str or Tuple[str, torch.Tensor]) – The indices of nodes for which neighbors are sampled to create minibatches. Needs to be passed as a tuple that holds the node type and corresponding node indices. Node indices need to be given as either a torch.LongTensor or a torch.BoolTensor. If node indices are set to None, all nodes of this specific type will be considered.
transform (callable, optional) – A function/transform that takes in a sampled minibatch and returns a transformed version. (default: None)
transform_sampler_output (callable, optional) – A function/transform that takes in a torch_geometric.sampler.SamplerOutput and returns a transformed version. (default: None)
is_sorted (bool, optional) – If set to True, assumes that edge_index is sorted by column. This avoids internal re-sorting of the data and can improve runtime and memory efficiency. (default: False)
filter_per_worker (bool, optional) – If set to True, will filter the returned data in each worker's subprocess. If set to False, will filter the returned data in the main process. If set to None, will automatically infer the decision based on whether data partially lives on the GPU (filter_per_worker=True) or entirely on the CPU (filter_per_worker=False). There exist different trade-offs for setting this option. Specifically, setting this option to True for in-memory datasets will move all features to shared memory, which may result in too many open file handles. (default: None)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as batch_size, shuffle, drop_last or num_workers.
 class ClusterData(data, num_parts: int, recursive: bool = False, save_dir: Optional[str] = None, log: bool = True, keep_inter_cluster_edges: bool = False, sparse_format: Literal['csr', 'csc'] = 'csr')[source]
Clusters/partitions a graph data object into multiple subgraphs, as motivated by the “Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks” paper.
Note
The underlying METIS algorithm requires undirected graphs as input.
 Parameters:
data (torch_geometric.data.Data) – The graph data object.
num_parts (int) – The number of partitions.
recursive (bool, optional) – If set to True, will use multilevel recursive bisection instead of multilevel k-way partitioning. (default: False)
save_dir (str, optional) – If set, will save the partitioned data to the save_dir directory for faster re-use. (default: None)
log (bool, optional) – If set to False, will not log any progress. (default: True)
keep_inter_cluster_edges (bool, optional) – If set to True, will keep inter-cluster edge connections. (default: False)
sparse_format (str, optional) – The sparse format to use for computing partitions. (default: "csr")
 class ClusterLoader(cluster_data, **kwargs)[source]
The data loader scheme from the “Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks” paper which merges partitioned subgraphs and their between-cluster links from a large-scale graph data object to form a mini-batch.
Note
Use ClusterData and ClusterLoader in conjunction to form mini-batches of clusters. For an example of using Cluster-GCN, see examples/cluster_gcn_reddit.py or examples/cluster_gcn_ppi.py.
 Parameters:
cluster_data (torch_geometric.loader.ClusterData) – The already partitioned data object.
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as batch_size, shuffle, drop_last or num_workers.
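To make the merging step concrete, here is a minimal pure-Python sketch of how a mini-batch could be formed from several clusters. It is not the library's implementation: the function name merge_clusters, the node_to_cluster mapping, and the toy graph are all hypothetical, and the partition is hard-coded instead of being computed by METIS.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]


def merge_clusters(
    edges: List[Edge],
    node_to_cluster: Dict[int, int],
    batch_clusters: Set[int],
) -> Tuple[List[int], List[Edge]]:
    """Return nodes and edges of the subgraph induced by the chosen clusters."""
    # Nodes of the mini-batch: every node assigned to one of the chosen clusters.
    nodes = sorted(n for n, c in node_to_cluster.items() if c in batch_clusters)
    node_set = set(nodes)
    # Keep an edge if both endpoints are in the mini-batch. This retains the
    # within-cluster edges and the links between the chosen clusters.
    kept = [(u, v) for u, v in edges if u in node_set and v in node_set]
    return nodes, kept


# Toy undirected path graph 0-1-2-3-4-5, stored as directed edge pairs.
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2),
         (3, 4), (4, 3), (4, 5), (5, 4)]
# Hypothetical partition into three clusters of two nodes each.
node_to_cluster = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}

nodes, kept = merge_clusters(edges, node_to_cluster, batch_clusters={0, 1})
```

In this toy run, the edge between nodes 1 and 2 crosses clusters 0 and 1 and is kept because both clusters are in the batch, while the edge between nodes 3 and 4 is dropped because cluster 2 is not.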
 class GraphSAINTSampler(data, batch_size: int, num_steps: int = 1, sample_coverage: int = 0, save_dir: Optional[str] = None, log: bool = True, **kwargs)[source]
The GraphSAINT sampler base class from the “GraphSAINT: Graph Sampling Based Inductive Learning Method” paper. Given a graph in a data object, this class samples nodes and constructs subgraphs that can be processed in a mini-batch fashion. Normalization coefficients for each mini-batch are given via node_norm and edge_norm data attributes.
Note
See GraphSAINTNodeSampler, GraphSAINTEdgeSampler and GraphSAINTRandomWalkSampler for currently supported samplers. For an example of using GraphSAINT sampling, see examples/graph_saint.py.
 Parameters:
data (torch_geometric.data.Data) – The graph data object.
batch_size (int) – The approximate number of samples per batch.
num_steps (int, optional) – The number of iterations per epoch. (default: 1)
sample_coverage (int) – How many samples per node should be used to compute normalization statistics. (default: 0)
save_dir (str, optional) – If set, will save normalization statistics to the save_dir directory for faster re-use. (default: None)
log (bool, optional) – If set to False, will not log any pre-processing progress. (default: True)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as batch_size or num_workers.
 class GraphSAINTNodeSampler(data, batch_size: int, num_steps: int = 1, sample_coverage: int = 0, save_dir: Optional[str] = None, log: bool = True, **kwargs)[source]
The GraphSAINT node sampler class (see GraphSAINTSampler).
 class GraphSAINTEdgeSampler(data, batch_size: int, num_steps: int = 1, sample_coverage: int = 0, save_dir: Optional[str] = None, log: bool = True, **kwargs)[source]
The GraphSAINT edge sampler class (see GraphSAINTSampler).
 class GraphSAINTRandomWalkSampler(data, batch_size: int, walk_length: int, num_steps: int = 1, sample_coverage: int = 0, save_dir: Optional[str] = None, log: bool = True, **kwargs)[source]
The GraphSAINT random walk sampler class (see GraphSAINTSampler).
 Parameters:
walk_length (int) – The length of each random walk.
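As a rough illustration of random-walk-based node selection, the sketch below starts a number of walks from random root nodes and collects every visited node. It is only a pure-Python approximation under stated assumptions: the helper random_walk_nodes, the adjacency-dict format, and the choice to draw roots with replacement are hypothetical and do not describe the library's internals.

```python
import random
from typing import Dict, List, Set


def random_walk_nodes(
    adj: Dict[int, List[int]],
    num_roots: int,
    walk_length: int,
    rng: random.Random,
) -> Set[int]:
    """Collect all nodes visited by `num_roots` walks of `walk_length` steps."""
    all_nodes = list(adj)
    # Assumption: roots are drawn uniformly at random, with replacement.
    roots = [rng.choice(all_nodes) for _ in range(num_roots)]
    visited = set(roots)
    for node in roots:
        for _ in range(walk_length):
            neighbors = adj[node]
            if not neighbors:  # Dead end: stop this walk early.
                break
            node = rng.choice(neighbors)
            visited.add(node)
    return visited


# Toy ring graph with 8 nodes; each node is linked to its two ring neighbors.
adj = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
visited = random_walk_nodes(adj, num_roots=3, walk_length=2,
                            rng=random.Random(0))
```

The subgraph induced by the visited nodes would then serve as one mini-batch. Each walk contributes at most walk_length + 1 distinct nodes, so the size of a batch is only approximately controlled.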
 class ShaDowKHopSampler(data: Data, depth: int, num_neighbors: int, node_idx: Optional[Tensor] = None, replace: bool = False, **kwargs)[source]
The ShaDow \(k\)-hop sampler from the “Decoupling the Depth and Scope of Graph Neural Networks” paper. Given a graph in a data object, the sampler will create shallow, localized subgraphs. A deep GNN on this local graph then smooths the informative local signals.
Note
For an example of using ShaDowKHopSampler, see examples/shadow.py.
 Parameters:
data (torch_geometric.data.Data) – The graph data object.
depth (int) – The depth/number of hops of the localized subgraph.
num_neighbors (int) – The number of neighbors to sample for each node in each hop.
node_idx (LongTensor or BoolTensor, optional) – The nodes that should be considered for creating mini-batches. If set to None, all nodes will be considered.
replace (bool, optional) – If set to True, will sample neighbors with replacement. (default: False)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as batch_size or num_workers.
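The interplay of depth, num_neighbors and replace can be sketched in plain Python as follows. This is a simplified illustration, not the library's implementation: the function khop_subgraph and the adjacency-dict graph format are hypothetical.

```python
import random
from typing import Dict, List, Set, Tuple


def khop_subgraph(
    adj: Dict[int, List[int]],
    root: int,
    depth: int,
    num_neighbors: int,
    rng: random.Random,
    replace: bool = False,
) -> Tuple[Set[int], List[Tuple[int, int]]]:
    """Sample a localized subgraph of at most `depth` hops around `root`."""
    nodes = {root}
    frontier = [root]
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            neighbors = adj[node]
            if not neighbors:
                continue
            if replace:
                # With replacement: the same neighbor may be drawn repeatedly.
                picked = [rng.choice(neighbors) for _ in range(num_neighbors)]
            else:
                # Without replacement: at most all neighbors are drawn.
                k = min(num_neighbors, len(neighbors))
                picked = rng.sample(neighbors, k)
            for neighbor in picked:
                if neighbor not in nodes:
                    nodes.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    # The localized subgraph is induced by all collected nodes.
    edges = [(u, v) for u in sorted(nodes) for v in adj[u] if v in nodes]
    return nodes, edges


# Toy path graph 0-1-2-3-4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
nodes_1, edges_1 = khop_subgraph(adj, root=2, depth=1, num_neighbors=2,
                                 rng=random.Random(0))
nodes_2, edges_2 = khop_subgraph(adj, root=2, depth=2, num_neighbors=2,
                                 rng=random.Random(0))
```

Because every node in this toy graph has at most two neighbors, sampling two neighbors without replacement is deterministic here: one hop around node 2 yields nodes 1, 2 and 3, and two hops cover the whole path.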
 class RandomNodeLoader(data: Union[Data, HeteroData], num_parts: int, **kwargs)[source]
A data loader that randomly samples nodes within a graph and returns their induced subgraph.
Note
For an example of using RandomNodeLoader, see examples/ogbn_proteins_deepgcn.py.
 Parameters:
data (torch_geometric.data.Data or torch_geometric.data.HeteroData) – The Data or HeteroData graph object.
num_parts (int) – The number of partitions.
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as num_workers.
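A minimal sketch of the idea, assuming that nodes are shuffled and split into num_parts chunks of (almost) equal size: each chunk then defines one induced subgraph. The helpers random_node_parts and induced_edges are hypothetical names and operate on plain Python lists, not on Data objects.

```python
import random
from typing import List, Tuple

Edge = Tuple[int, int]


def random_node_parts(
    num_nodes: int, num_parts: int, rng: random.Random
) -> List[List[int]]:
    """Split a random permutation of the nodes into `num_parts` chunks."""
    perm = list(range(num_nodes))
    rng.shuffle(perm)
    part_size = -(-num_nodes // num_parts)  # Ceiling division.
    return [perm[i:i + part_size] for i in range(0, num_nodes, part_size)]


def induced_edges(edges: List[Edge], part: List[int]) -> List[Edge]:
    """Keep only the edges whose two endpoints both lie in `part`."""
    keep = set(part)
    return [(u, v) for u, v in edges if u in keep and v in keep]


# Toy directed ring with 10 nodes.
edges = [(i, (i + 1) % 10) for i in range(10)]
parts = random_node_parts(num_nodes=10, num_parts=3, rng=random.Random(0))
subgraphs = [induced_edges(edges, part) for part in parts]
```

With 10 nodes and 3 parts, the chunks hold 4, 4 and 2 nodes. Edges whose endpoints fall into different chunks are absent from every subgraph in that epoch.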
 class ZipLoader(loaders: Union[List[NodeLoader], List[LinkLoader]], filter_per_worker: Optional[bool] = None, **kwargs)[source]
A loader that returns a tuple of data objects by sampling from multiple NodeLoader or LinkLoader instances.
 Parameters:
loaders (List[NodeLoader] or List[LinkLoader]) – The loader instances.
filter_per_worker (bool, optional) – If set to True, will filter the returned data in each worker’s subprocess. If set to False, will filter the returned data in the main process. If set to None, will automatically infer the decision based on whether data partially lives on the GPU (filter_per_worker=True) or entirely on the CPU (filter_per_worker=False). There exist different trade-offs for setting this option. Specifically, setting this option to True for in-memory datasets will move all features to shared memory, which may result in too many open file handles. (default: None)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as batch_size, shuffle, drop_last or num_workers.
 class DataListLoader(dataset: Union[Dataset, List[BaseData]], batch_size: int = 1, shuffle: bool = False, **kwargs)[source]
A data loader which batches data objects from a torch_geometric.data.dataset to a Python list. Data objects can be either of type Data or HeteroData.
Note
This data loader should be used for multi-GPU support via torch_geometric.nn.DataParallel.
 Parameters:
dataset (Dataset) – The dataset from which to load the data.
batch_size (int, optional) – How many samples per batch to load. (default: 1)
shuffle (bool, optional) – If set to True, the data will be reshuffled at every epoch. (default: False)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as drop_last or num_workers.
 class DenseDataLoader(dataset: Union[Dataset, List[Data]], batch_size: int = 1, shuffle: bool = False, **kwargs)[source]
A data loader which batches data objects from a torch_geometric.data.dataset to a torch_geometric.data.Batch object by stacking all attributes in a new dimension.
Note
To make use of this data loader, all graph attributes in the dataset need to have the same shape. In particular, this data loader should only be used when working with dense adjacency matrices.
 Parameters:
dataset (Dataset) – The dataset from which to load the data.
batch_size (int, optional) – How many samples per batch to load. (default: 1)
shuffle (bool, optional) – If set to True, the data will be reshuffled at every epoch. (default: False)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as drop_last or num_workers.
 class TemporalDataLoader(data: TemporalData, batch_size: int = 1, neg_sampling_ratio: float = 0.0, **kwargs)[source]
A data loader which merges successive events of a torch_geometric.data.TemporalData to a mini-batch.
 Parameters:
data (TemporalData) – The TemporalData from which to load the data.
batch_size (int, optional) – How many samples per batch to load. (default: 1)
neg_sampling_ratio (float, optional) – The ratio of sampled negative destination nodes to the number of positive destination nodes. (default: 0.0)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader.
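The sketch below illustrates what merging successive events and sampling negative destinations by ratio could look like. It is an assumption-laden simplification, not the library's code: the generator temporal_batches, the (src, dst, timestamp) tuple format, and the choice to draw negatives uniformly between the smallest and largest observed destination ID are all hypothetical.

```python
import random
from typing import Iterator, List, Tuple

Event = Tuple[int, int, float]  # (src, dst, timestamp)


def temporal_batches(
    events: List[Event],
    batch_size: int,
    neg_sampling_ratio: float,
    rng: random.Random,
) -> Iterator[Tuple[List[Event], List[int]]]:
    """Yield consecutive chunks of events plus sampled negative destinations."""
    if not events:
        return
    dsts = [dst for _, dst, _ in events]
    low, high = min(dsts), max(dsts)
    # Events are consumed in their given (chronological) order, never shuffled.
    for start in range(0, len(events), batch_size):
        batch = events[start:start + batch_size]
        # Assumption: the number of negatives scales with the batch length.
        num_neg = round(neg_sampling_ratio * len(batch))
        neg_dst = [rng.randint(low, high) for _ in range(num_neg)]
        yield batch, neg_dst


events = [(0, 5, 1.0), (1, 6, 2.0), (0, 7, 3.0), (2, 5, 4.0), (1, 7, 5.0)]
batches = list(temporal_batches(events, batch_size=2, neg_sampling_ratio=1.0,
                                rng=random.Random(0)))
```

With a ratio of 1.0, each batch receives as many negative destinations as it has events. With the default ratio of 0.0, no negatives are drawn.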
 class NeighborSampler(edge_index: Union[Tensor, SparseTensor], sizes: List[int], node_idx: Optional[Tensor] = None, num_nodes: Optional[int] = None, return_e_id: bool = True, transform: Optional[Callable] = None, **kwargs)[source]
The neighbor sampler from the “Inductive Representation Learning on Large Graphs” paper, which allows for mini-batch training of GNNs on large-scale graphs where full-batch training is not feasible.
Given a GNN with \(L\) layers and a specific mini-batch of nodes node_idx for which we want to compute embeddings, this module iteratively samples neighbors and constructs bipartite graphs that simulate the actual computation flow of GNNs.
More specifically, sizes denotes how many neighbors we want to sample for each node in each layer. This module then takes in these sizes and iteratively samples sizes[l] neighbors for each node involved in layer l. In the next layer, sampling is repeated for the union of nodes that were already encountered. The actual computation graphs are then returned in reverse-mode, meaning that we pass messages from a larger set of nodes to a smaller one, until we reach the nodes for which we originally wanted to compute embeddings.
Hence, an item returned by NeighborSampler holds the current batch_size, the IDs n_id of all nodes involved in the computation, and a list of bipartite graph objects via the tuple (edge_index, e_id, size), where edge_index represents the bipartite edges between source and target nodes, e_id denotes the IDs of original edges in the full graph, and size holds the shape of the bipartite graph. For each bipartite graph, target nodes are also included at the beginning of the list of source nodes so that one can easily apply skip-connections or add self-loops.
Warning
NeighborSampler is deprecated and will be removed in a future release. Use torch_geometric.loader.NeighborLoader instead.
Note
For an example of using NeighborSampler, see examples/reddit.py or examples/ogbn_products_sage.py.
 Parameters:
edge_index (Tensor or SparseTensor) – A torch.LongTensor or a torch_sparse.SparseTensor that defines the underlying graph connectivity/message passing flow. edge_index holds the indices of a (sparse) symmetric adjacency matrix. If edge_index is of type torch.LongTensor, its shape must be defined as [2, num_edges], where messages from nodes edge_index[0] are sent to nodes in edge_index[1] (in case flow="source_to_target"). If edge_index is of type torch_sparse.SparseTensor, its sparse indices (row, col) should relate to row = edge_index[1] and col = edge_index[0]. The major difference between both formats is that we need to input the transposed sparse adjacency matrix.
sizes ([int]) – The number of neighbors to sample for each node in each layer. If set to sizes[l] = -1, all neighbors are included in layer l.
node_idx (LongTensor, optional) – The nodes that should be considered for creating mini-batches. If set to None, all nodes will be considered.
num_nodes (int, optional) – The number of nodes in the graph. (default: None)
return_e_id (bool, optional) – If set to False, will not return original edge indices of sampled edges. This is only useful when operating on graphs without edge features, in order to save memory. (default: True)
transform (callable, optional) – A function/transform that takes in a sampled mini-batch and returns a transformed version. (default: None)
**kwargs (optional) – Additional arguments of torch.utils.data.DataLoader, such as batch_size, shuffle, drop_last or num_workers.
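The layer-wise procedure described above can be sketched in plain Python. This is a simplified illustration and not the library's implementation: the function sample_adjs, the in_neighbors dictionary (mapping each node to the nodes that send it messages), and the returned list-of-tuples format are hypothetical, and original edge IDs (e_id) are omitted.

```python
import random
from typing import Dict, List, Tuple

LocalEdge = Tuple[int, int]  # (local source index, local target index)
Adj = Tuple[List[LocalEdge], Tuple[int, int]]  # (edges, (num_src, num_dst))


def sample_adjs(
    in_neighbors: Dict[int, List[int]],
    batch: List[int],
    sizes: List[int],
    rng: random.Random,
) -> Tuple[int, List[int], List[Adj]]:
    """Sample one bipartite graph per entry of `sizes`, outermost hop first."""
    n_id = list(batch)
    adjs = []
    for size in sizes:
        targets = n_id
        # Targets come first in the source list, which keeps their local
        # indices stable across the source and target side.
        sources = list(targets)
        index = {node: i for i, node in enumerate(sources)}
        edges = []
        for t_pos, target in enumerate(targets):
            neighbors = in_neighbors.get(target, [])
            if size != -1 and len(neighbors) > size:
                neighbors = rng.sample(neighbors, size)
            for u in neighbors:
                if u not in index:
                    index[u] = len(sources)
                    sources.append(u)
                edges.append((index[u], t_pos))
        adjs.append((edges, (len(sources), len(targets))))
        n_id = sources  # The next hop samples around all nodes seen so far.
    # Reverse so that messages flow from the largest node set to the batch.
    return len(batch), n_id, adjs[::-1]


# Toy undirected path graph 0-1-2-3; -1 keeps all neighbors in a layer.
in_neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
batch_size, n_id, adjs = sample_adjs(in_neighbors, batch=[0], sizes=[-1, -1],
                                     rng=random.Random(0))
```

For the seed node 0 and two layers, the sketch reaches nodes 0, 1 and 2. The first returned bipartite graph maps 3 source nodes to 2 target nodes, and the second maps 2 source nodes to the single seed node.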
 class ImbalancedSampler(dataset: Union[Dataset, Data, List[Data], Tensor], input_nodes: Optional[Tensor] = None, num_samples: Optional[int] = None)[source]
A weighted random sampler that randomly samples elements according to class distribution. As such, it will either remove samples from the majority class (undersampling) or add more examples from the minority class (oversampling).
Graph-level sampling:
    from torch_geometric.loader import DataLoader, ImbalancedSampler

    sampler = ImbalancedSampler(dataset)
    loader = DataLoader(dataset, batch_size=64, sampler=sampler, ...)
Node-level sampling:
    from torch_geometric.loader import NeighborLoader, ImbalancedSampler

    sampler = ImbalancedSampler(data, input_nodes=data.train_mask)
    loader = NeighborLoader(data, input_nodes=data.train_mask,
                            batch_size=64, num_neighbors=[-1, -1],
                            sampler=sampler, ...)
You can also pass in the class labels directly as a torch.Tensor:
    from torch_geometric.loader import NeighborLoader, ImbalancedSampler

    sampler = ImbalancedSampler(data.y)
    loader = NeighborLoader(data, input_nodes=data.train_mask,
                            batch_size=64, num_neighbors=[-1, -1],
                            sampler=sampler, ...)
 Parameters:
dataset (Dataset or Data or Tensor) – The dataset or class distribution from which to sample the data, given either as a Dataset, Data, or torch.Tensor object.
input_nodes (Tensor, optional) – The indices of nodes that are used by the corresponding loader, e.g., by NeighborLoader. If set to None, all nodes will be considered. This argument should only be set for node-level loaders and does not have any effect when operating on a set of graphs as given by Dataset. (default: None)
num_samples (int, optional) – The number of samples to draw for a single epoch. If set to None, will sample as many elements as there exist in the underlying data. (default: None)
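One common way to sample according to class distribution is to weight each element by the inverse frequency of its class, so that every class receives the same total probability mass. The pure-Python sketch below shows that weighting. The helper class_balanced_weights and the toy labels are hypothetical, and the sketch does not claim to mirror the library's internals.

```python
import random
from collections import Counter
from typing import List, Sequence


def class_balanced_weights(labels: Sequence[int]) -> List[float]:
    """Weight each element by the inverse frequency of its class."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


# Toy label vector: 8 elements of class 0 and 2 elements of class 1.
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
weights = class_balanced_weights(labels)

# Draw one epoch of indices with replacement, proportional to the weights.
rng = random.Random(0)
indices = rng.choices(range(len(labels)), weights=weights, k=len(labels))
```

Each majority-class element gets weight 1/8 and each minority-class element gets weight 1/2, so both classes are drawn with equal probability overall. Minority elements are therefore repeated (oversampling) while some majority elements are skipped (undersampling).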
 class DynamicBatchSampler(dataset: Dataset, max_num: int, mode: str = 'node', shuffle: bool = False, skip_too_big: bool = False, num_steps: Optional[int] = None)[source]
Dynamically adds samples to a mini-batch up to a maximum size (either based on number of nodes or number of edges). When data samples have a wide range in sizes, specifying a mini-batch size in terms of number of samples is not ideal and can cause CUDA OOM errors.
Within the DynamicBatchSampler, the number of steps per epoch is ambiguous, depending on the order of the samples. By default, __len__() will be undefined. This is fine for most cases, but progress bars will be infinite. Alternatively, num_steps can be supplied to cap the number of mini-batches produced by the sampler.
    from torch_geometric.loader import DataLoader, DynamicBatchSampler

    sampler = DynamicBatchSampler(dataset, max_num=10000, mode="node")
    loader = DataLoader(dataset, batch_sampler=sampler, ...)
 Parameters:
dataset (Dataset) – Dataset to sample from.
max_num (int) – Size of mini-batch to aim for in number of nodes or edges.
mode (str, optional) – "node" or "edge" to measure batch size. (default: "node")
shuffle (bool, optional) – If set to True, will have the data reshuffled at every epoch. (default: False)
skip_too_big (bool, optional) – If set to True, skip samples which cannot fit in a batch by themselves. (default: False)
num_steps (int, optional) – The number of mini-batches to draw for a single epoch. If set to None, will iterate through all the underlying examples, but __len__() will be None since it is ambiguous. (default: None)
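The size-based batching idea can be sketched as a greedy loop over per-sample sizes. This is a hedged, pure-Python approximation: the generator dynamic_batches and the sizes list (number of nodes or edges per sample) are hypothetical, and details such as shuffling and num_steps are left out.

```python
from typing import Iterator, List, Sequence


def dynamic_batches(
    sizes: Sequence[int], max_num: int, skip_too_big: bool = False
) -> Iterator[List[int]]:
    """Greedily group sample indices so each batch stays within `max_num`."""
    batch: List[int] = []
    total = 0
    for idx, size in enumerate(sizes):
        if size > max_num and skip_too_big:
            continue  # This sample can never fit, so drop it.
        if batch and total + size > max_num:
            # Adding this sample would exceed the budget: emit the batch.
            yield batch
            batch, total = [], 0
        batch.append(idx)
        total += size
    if batch:
        yield batch


# Hypothetical per-sample node counts; sample 3 alone exceeds the budget.
sizes = [4, 5, 3, 12, 2]
kept_all = list(dynamic_batches(sizes, max_num=10))
skipped = list(dynamic_batches(sizes, max_num=10, skip_too_big=True))
```

In this sketch, an oversized sample ends up in a batch of its own unless skip_too_big is set. The number of batches depends on the sample order, which is why the number of steps per epoch cannot be known in advance.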
 class PrefetchLoader(loader: DataLoader, device: Optional[device] = None)[source]
A GPU prefetcher class for asynchronously transferring data of a torch.utils.data.DataLoader from host memory to device memory.
 Parameters:
loader (torch.utils.data.DataLoader) – The data loader.
device (torch.device, optional) – The device to load the data to. (default: None)
 class CachedLoader(loader: DataLoader, device: Optional[device] = None, transform: Optional[Callable] = None)[source]
A loader to cache mini-batch outputs, e.g., obtained during NeighborLoader iterations.
 Parameters:
loader (torch.utils.data.DataLoader) – The data loader.
device (torch.device, optional) – The device to load the data to. (default: None)
transform (callable, optional) – A function/transform that takes in a sampled mini-batch and returns a transformed version. (default: None)
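The caching behaviour can be pictured with a small wrapper that records batches during the first full pass and replays them afterwards. The class CachingLoader below is a hypothetical, pure-Python sketch: it assumes the transform is applied before a batch is stored, and it ignores device transfer entirely.

```python
from typing import Any, Callable, Iterable, Iterator, List, Optional


class CachingLoader:
    """Replay the batches of a wrapped iterable after the first full pass."""

    def __init__(
        self,
        loader: Iterable[Any],
        transform: Optional[Callable[[Any], Any]] = None,
    ) -> None:
        self.loader = loader
        self.transform = transform
        self._cache: List[Any] = []
        self._filled = False

    def __iter__(self) -> Iterator[Any]:
        if self._filled:
            # Later epochs: serve the stored batches, skip the wrapped loader.
            yield from self._cache
            return
        cache = []
        for batch in self.loader:
            if self.transform is not None:
                batch = self.transform(batch)
            cache.append(batch)
            yield batch
        # Only mark the cache as valid once a pass has completed.
        self._cache = cache
        self._filled = True


class CountingSource:
    """Toy stand-in for a data loader that counts how often it is iterated."""

    def __init__(self) -> None:
        self.passes = 0

    def __iter__(self) -> Iterator[int]:
        self.passes += 1
        return iter([1, 2, 3])


source = CountingSource()
cached = CachingLoader(source, transform=lambda batch: batch * 10)
first_epoch = list(cached)
second_epoch = list(cached)
```

Caching trades memory for speed and also freezes the sampled mini-batches, so every epoch after the first sees identical batches in identical order.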
 class AffinityMixin[source]
A context manager to enable CPU affinity for data loader workers (only used when running on CPU devices).
Affinitization places data loader worker threads on specific CPU cores. In effect, it allows for more efficient local memory allocation and reduces remote memory calls. Every time a process or thread moves from one core to another, registers and caches need to be flushed and reloaded. This can become very costly if it happens often, and our threads may also no longer be close to their data, or be able to share data in a cache.
See here for the accompanying tutorial.
Warning
To correctly affinitize compute threads (i.e. with KMP_AFFINITY), please make sure that you exclude loader_cores from the list of cores available for the main process. Otherwise, cores will be oversubscribed and performance will degrade.
    loader = NeighborLoader(data, num_workers=3)
    with loader.enable_cpu_affinity(loader_cores=[0, 1, 2]):
        for batch in loader:
            pass
 enable_cpu_affinity(loader_cores: Optional[Union[List[List[int]], List[int]]] = None) → None [source]
Enables CPU affinity.
 Parameters:
loader_cores ([int], optional) – List of CPU cores to which data loader workers should affinitize. By default, it will affinitize to numa0 cores. If used with the "spawn" multiprocessing context, it will automatically enable multi-threading and use multiple cores per worker.