torch_geometric.data.Data
- class Data(x: Optional[Tensor] = None, edge_index: Optional[Tensor] = None, edge_attr: Optional[Tensor] = None, y: Optional[Tensor] = None, pos: Optional[Tensor] = None, **kwargs)
Bases: BaseData, FeatureStore, GraphStore
A data object describing a homogeneous graph. The data object can hold node-level, link-level and graph-level attributes. In general, Data tries to mimic the behavior of a regular Python dictionary. In addition, it provides useful functionality for analyzing graph structures, and provides basic PyTorch tensor functionalities. See the accompanying tutorial for details.

    import torch
    from torch_geometric.data import Data

    data = Data(x=x, edge_index=edge_index, ...)

    # Add additional arguments to `data`:
    data.train_idx = torch.tensor([...], dtype=torch.long)
    data.test_mask = torch.tensor([...], dtype=torch.bool)

    # Analyzing the graph structure:
    data.num_nodes
    >>> 23

    data.is_directed()
    >>> False

    # PyTorch tensor functionality:
    data = data.pin_memory()
    data = data.to('cuda:0', non_blocking=True)
- Parameters
x (torch.Tensor, optional) – Node feature matrix with shape [num_nodes, num_node_features]. (default: None)
edge_index (LongTensor, optional) – Graph connectivity in COO format with shape [2, num_edges]. (default: None)
edge_attr (torch.Tensor, optional) – Edge feature matrix with shape [num_edges, num_edge_features]. (default: None)
y (torch.Tensor, optional) – Graph-level or node-level ground-truth labels with arbitrary shape. (default: None)
pos (torch.Tensor, optional) – Node position matrix with shape [num_nodes, num_dimensions]. (default: None)
**kwargs (optional) – Additional attributes.
- to_namedtuple() → NamedTuple
Returns a NamedTuple of stored key/value pairs.
- update(data: Union[Data, Dict[str, Any]]) → Data
Updates the data object with the elements from another data object.
- __cat_dim__(key: str, value: Any, *args, **kwargs) → Any
Returns the dimension along which the value value of the attribute key will get concatenated when creating mini-batches using torch_geometric.loader.DataLoader.
Note
This method is for internal use only, and should only be overridden in case the mini-batch creation process is corrupted for a specific attribute.
- __inc__(key: str, value: Any, *args, **kwargs) → Any
Returns the incremental count by which the value value of the attribute key is cumulatively increased when creating mini-batches using torch_geometric.loader.DataLoader.
Note
This method is for internal use only, and should only be overridden in case the mini-batch creation process is corrupted for a specific attribute.
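The roles of the concatenation dimension and the incremental count can be illustrated with a minimal plain-Python sketch. It assumes that edge indices are concatenated along their last dimension and shifted by the number of nodes of the preceding graphs; the helper name collate_edge_index and the list-based graph representation are hypothetical and are not part of the library.

```python
from typing import List, Tuple

# Hypothetical stand-in for a graph: (num_nodes, edge_index as [sources, targets]).
Graph = Tuple[int, List[List[int]]]


def collate_edge_index(graphs: List[Graph]) -> Tuple[List[List[int]], int]:
    """Concatenate the edge indices of several graphs into one batched edge index."""
    batched: List[List[int]] = [[], []]
    offset = 0  # cumulative increment (the role played by the `__inc__` value)
    for num_nodes, edge_index in graphs:
        for row in (0, 1):
            # Append along the last dimension (the role of `__cat_dim__`) and
            # shift node indices so they stay unique within the batch.
            batched[row].extend(i + offset for i in edge_index[row])
        offset += num_nodes
    return batched, offset


g1 = (3, [[0, 1, 2], [1, 2, 0]])  # triangle on 3 nodes
g2 = (2, [[0, 1], [1, 0]])        # single bidirectional edge on 2 nodes
batched_edge_index, total_nodes = collate_edge_index([g1, g2])
print(batched_edge_index)  # node indices of g2 are shifted by 3
print(total_nodes)
```

Without the increment, the edges of the second graph would point at nodes of the first one, which is the kind of corrupted mini-batch the note above refers to.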
- is_node_attr(key: str) → bool
Returns True if the object at key key denotes a node-level tensor attribute.
- is_edge_attr(key: str) → bool
Returns True if the object at key key denotes an edge-level tensor attribute.
- subgraph(subset: Tensor) → Data
Returns the induced subgraph given by the node indices subset.
- Parameters
subset (LongTensor or BoolTensor) – The nodes to keep.
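As a rough illustration of what an induced subgraph is, the following plain-Python sketch keeps only edges whose endpoints both lie in the chosen node set. The helper name induced_subgraph is hypothetical, and relabeling the kept nodes to consecutive indices is an assumption of this sketch rather than a statement about the library's exact behavior.

```python
from typing import List


def induced_subgraph(subset: List[int], edge_index: List[List[int]]) -> List[List[int]]:
    """Keep edges with both endpoints in `subset`; relabel nodes to 0..len(subset)-1."""
    new_id = {old: new for new, old in enumerate(subset)}
    src, dst = edge_index
    kept = [
        (new_id[s], new_id[d])
        for s, d in zip(src, dst)
        if s in new_id and d in new_id
    ]
    return [[s for s, _ in kept], [d for _, d in kept]]


# Path graph 0 - 1 - 2 - 3 stored with both edge directions.
edge_index = [[0, 1, 1, 2, 2, 3], [1, 0, 2, 1, 3, 2]]
sub_edge_index = induced_subgraph([1, 2], edge_index)
print(sub_edge_index)
```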
- edge_subgraph(subset: Tensor) → Data
Returns the induced subgraph given by the edge indices subset. Will currently preserve all the nodes in the graph, even if they are isolated after subgraph computation.
- Parameters
subset (LongTensor or BoolTensor) – The edges to keep.
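The node-preserving behavior described above can be sketched in plain Python. The helper name edge_subgraph_sketch and the boolean-mask input are illustrative assumptions; lists stand in for tensors.

```python
from typing import List, Set, Tuple


def edge_subgraph_sketch(
    mask: List[bool], edge_index: List[List[int]], num_nodes: int
) -> Tuple[List[List[int]], int]:
    """Keep the edges selected by `mask`; the node count is left untouched."""
    src, dst = edge_index
    kept = [(s, d) for keep, s, d in zip(mask, src, dst) if keep]
    return [[s for s, _ in kept], [d for _, d in kept]], num_nodes


edge_index = [[0, 1, 1, 2], [1, 0, 2, 1]]
new_edge_index, new_num_nodes = edge_subgraph_sketch(
    [True, True, False, False], edge_index, num_nodes=3
)

# Nodes that no longer touch any edge are still part of the graph.
touched: Set[int] = set(new_edge_index[0]) | set(new_edge_index[1])
isolated = sorted(set(range(new_num_nodes)) - touched)
print(new_edge_index, new_num_nodes, isolated)
```

Here node 2 loses all of its edges but is still counted, so node-level attributes keep their original shape.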
- to_heterogeneous(node_type: Optional[Tensor] = None, edge_type: Optional[Tensor] = None, node_type_names: Optional[List[str]] = None, edge_type_names: Optional[List[Tuple[str, str, str]]] = None)
Converts a Data object to a heterogeneous HeteroData object. For this, node and edge attributes are split according to the node-level and edge-level vectors node_type and edge_type, respectively. node_type_names and edge_type_names can be used to give meaningful node and edge type names, respectively. That is, the node_type 0 is given by node_type_names[0]. If the Data object was constructed via to_homogeneous(), the object can be reconstructed without any need to pass in additional arguments.
- Parameters
node_type (torch.Tensor, optional) – A node-level vector denoting the type of each node. (default: None)
edge_type (torch.Tensor, optional) – An edge-level vector denoting the type of each edge. (default: None)
node_type_names (List[str], optional) – The names of node types. (default: None)
edge_type_names (List[Tuple[str, str, str]], optional) – The names of edge types. (default: None)
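The idea of splitting attributes by a type vector can be sketched in plain Python. This is only an illustration of the grouping step under simplifying assumptions (lists instead of tensors, node attributes only); the helper name split_by_type and the example type names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def split_by_type(
    values: List[Any], type_vector: List[int], type_names: List[str]
) -> Dict[str, List[Any]]:
    """Group per-node values by the type index stored in `type_vector`."""
    groups: Dict[str, List[Any]] = defaultdict(list)
    for value, type_id in zip(values, type_vector):
        # Type index `i` is named by `type_names[i]`.
        groups[type_names[type_id]].append(value)
    return dict(groups)


node_features = [[1.0], [2.0], [3.0], [4.0]]
node_type = [0, 1, 0, 1]
node_type_names = ["paper", "author"]  # illustrative names
per_type = split_by_type(node_features, node_type, node_type_names)
print(per_type)
```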
- classmethod from_dict(mapping: Dict[str, Any]) → Data
Creates a Data object from a Python dictionary.
- property num_features: int
Returns the number of features per node in the graph. Alias for num_node_features.
- get_all_tensor_attrs() → List[TensorAttr]
Obtains all feature attributes stored in Data.
- apply(func: Callable, *args: List[str])
Applies the function func, either to all attributes or only the ones given in *args.
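The "all attributes or only the ones given in *args" convention, which recurs across the methods below, can be sketched with a plain dictionary. The helper name apply_to_attrs is hypothetical, and the sketch ignores details such as nested containers or non-tensor values.

```python
from typing import Any, Callable, Dict


def apply_to_attrs(
    store: Dict[str, Any], func: Callable[[Any], Any], *keys: str
) -> Dict[str, Any]:
    """Apply `func` to every attribute, or only to those named in `keys`."""
    selected = set(keys) if keys else set(store)
    return {k: func(v) if k in selected else v for k, v in store.items()}


def double(values):
    return [2 * v for v in values]


store = {"x": [1.0, 2.0], "y": [0, 1]}
doubled_all = apply_to_attrs(store, double)     # no keys: every attribute
doubled_x = apply_to_attrs(store, double, "x")  # only the attribute "x"
print(doubled_all)
print(doubled_x)
```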
- apply_(func: Callable, *args: List[str])
Applies the in-place function func, either to all attributes or only the ones given in *args.
- clone(*args: List[str])
Performs cloning of tensors, either for all attributes or only the ones given in *args.
- coalesce()
Sorts and removes duplicated entries from edge indices edge_index.
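A minimal plain-Python sketch of coalescing, assuming edges are sorted lexicographically by (source, target) and that duplicates are simply dropped. The helper names are hypothetical, and handling of edge attributes attached to duplicates is out of scope here.

```python
from typing import List


def coalesce_sketch(edge_index: List[List[int]]) -> List[List[int]]:
    """Sort edges by (source, target) and drop duplicate entries."""
    pairs = sorted(set(zip(edge_index[0], edge_index[1])))
    return [[s for s, _ in pairs], [d for _, d in pairs]]


def is_coalesced_sketch(edge_index: List[List[int]]) -> bool:
    """True if edges are strictly increasing in (source, target) order."""
    pairs = list(zip(edge_index[0], edge_index[1]))
    return all(a < b for a, b in zip(pairs, pairs[1:]))


edge_index = [[1, 0, 1, 0], [0, 1, 0, 2]]  # unsorted, edge (1, 0) appears twice
coalesced = coalesce_sketch(edge_index)
print(coalesced)
print(is_coalesced_sketch(edge_index), is_coalesced_sketch(coalesced))
```

The strict comparison in the check covers both conditions at once: equal neighbors indicate a duplicate, and a decrease indicates unsorted input.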
- contiguous(*args: List[str])
Ensures a contiguous memory layout, either for all attributes or only the ones given in *args.
- coo(edge_types: Optional[List[Any]] = None, store: bool = False) → Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Obtains the edge indices in the GraphStore in COO format.
- Parameters
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to None, will return the edge indices of all existing edge types. (default: None)
store (bool, optional) – Whether to store converted edge indices in the GraphStore. (default: False)
- cpu(*args: List[str])
Copies attributes to CPU memory, either for all attributes or only the ones given in *args.
- csc(edge_types: Optional[List[Any]] = None, store: bool = False) → Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Obtains the edge indices in the GraphStore in CSC format.
- Parameters
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to None, will return the edge indices of all existing edge types. (default: None)
store (bool, optional) – Whether to store converted edge indices in the GraphStore. (default: False)
- csr(edge_types: Optional[List[Any]] = None, store: bool = False) → Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Obtains the edge indices in the GraphStore in CSR format.
- Parameters
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to None, will return the edge indices of all existing edge types. (default: None)
store (bool, optional) – Whether to store converted edge indices in the GraphStore. (default: False)
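To make the difference between the COO and CSR layouts concrete, here is a small plain-Python conversion for a single edge type. The helper name coo_to_csr and its return convention (row pointer, sorted columns, permutation) are assumptions of this sketch; the documented methods return dictionaries keyed by edge type instead.

```python
from typing import List, Tuple


def coo_to_csr(
    row: List[int], col: List[int], num_nodes: int
) -> Tuple[List[int], List[int], List[int]]:
    """Convert COO (row, col) pairs to CSR (rowptr, col) plus the sort permutation."""
    # Permutation that sorts edges by source node, then by target node.
    perm = sorted(range(len(row)), key=lambda e: (row[e], col[e]))
    # rowptr[i]..rowptr[i + 1] delimits the slice of `col` owned by source node i.
    rowptr = [0] * (num_nodes + 1)
    for r in row:
        rowptr[r + 1] += 1
    for i in range(num_nodes):
        rowptr[i + 1] += rowptr[i]
    return rowptr, [col[e] for e in perm], perm


row = [2, 0, 1, 0]
col = [0, 1, 2, 2]
rowptr, sorted_col, perm = coo_to_csr(row, col, num_nodes=3)
print(rowptr, sorted_col, perm)

# Neighbors of node 0 are read off as a contiguous slice.
neighbors_of_0 = sorted_col[rowptr[0]:rowptr[1]]
```

CSC is the same construction with the roles of row and col swapped, so that the incoming edges of each target node are contiguous.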
- cuda(device: Optional[Union[int, str]] = None, *args: List[str], non_blocking: bool = False)
Copies attributes to CUDA memory, either for all attributes or only the ones given in *args.
- detach(*args: List[str])
Detaches attributes from the computation graph by creating a new tensor, either for all attributes or only the ones given in *args.
- detach_(*args: List[str])
Detaches attributes from the computation graph, either for all attributes or only the ones given in *args.
- generate_ids()
Generates and sets n_id and e_id attributes to assign each node and edge to a continuously ascending and unique ID.
- get_edge_index(*args, **kwargs) → Tuple[Tensor, Tensor]
Synchronously obtains an edge_index tuple from the GraphStore.
- get_tensor(*args, convert_type: bool = False, **kwargs) → Union[Tensor, ndarray]
Synchronously obtains a tensor from the FeatureStore.
- Parameters
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the TensorAttr documentation for required and optional attributes.
convert_type (bool, optional) – Whether to convert the type of the output tensor to the type of the attribute index. (default: False)
- Raises
ValueError – If the input TensorAttr is not fully specified.
KeyError – If the tensor corresponding to the input TensorAttr was not found.
- get_tensor_size(*args, **kwargs) → Optional[Tuple[int, ...]]
Obtains the size of a tensor given its TensorAttr, or None if the tensor does not exist.
- is_coalesced() → bool
Returns True if edge indices edge_index are sorted and do not contain duplicate entries.
- property is_cuda: bool
Returns True if any torch.Tensor attribute is stored on the GPU, False otherwise.
- multi_get_tensor(attrs: List[TensorAttr], convert_type: bool = False) → List[Union[Tensor, ndarray]]
Synchronously obtains a list of tensors from the FeatureStore for each tensor associated with the attributes in attrs.
Note
The default implementation simply iterates over all calls to get_tensor(). Implementor classes that can provide additional, more performant functionality are recommended to override this method.
- Parameters
attrs (List[TensorAttr]) – A list of input TensorAttr objects that identify the tensors to obtain.
convert_type (bool, optional) – Whether to convert the type of the output tensor to the type of the attribute index. (default: False)
- Raises
ValueError – If any input TensorAttr is not fully specified.
KeyError – If any of the tensors corresponding to the input TensorAttr was not found.
- property num_edges: int
Returns the number of edges in the graph. For undirected graphs, this will return the number of bi-directional edges, which is double the amount of unique edges.
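The doubling for undirected graphs follows from storing each undirected edge once per direction. A small plain-Python illustration, with lists standing in for the edge_index tensor:

```python
# Two unique undirected edges: 0 - 1 and 1 - 2.
undirected_edges = [(0, 1), (1, 2)]

# Each undirected edge is stored in both directions.
src = [u for u, v in undirected_edges] + [v for u, v in undirected_edges]
dst = [v for u, v in undirected_edges] + [u for u, v in undirected_edges]
edge_index = [src, dst]

num_edges = len(edge_index[0])  # counts stored (directed) entries
num_unique_edges = len(undirected_edges)
print(num_edges, num_unique_edges)
```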
- property num_nodes: Optional[int]
Returns the number of nodes in the graph.
Note
The number of nodes in the data object is automatically inferred in case node-level attributes are present, e.g., data.x. In some cases, however, a graph may only be given without any node-level attributes. PyG then guesses the number of nodes according to edge_index.max().item() + 1. However, if there are isolated nodes, this number may be incorrect, which can result in unexpected behavior. Thus, we recommend setting the number of nodes in your data object explicitly via data.num_nodes = .... You will be given a warning that requests you to do so.
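Why the guess can be wrong is easy to see in a plain-Python sketch of the max-index rule quoted above. The helper name infer_num_nodes is hypothetical and lists stand in for tensors.

```python
from typing import List


def infer_num_nodes(edge_index: List[List[int]]) -> int:
    """Guess the node count as the largest node index appearing in any edge, plus one."""
    if not edge_index[0]:
        return 0
    return max(max(edge_index[0]), max(edge_index[1])) + 1


# A graph with 4 nodes in which nodes 2 and 3 are isolated.
true_num_nodes = 4
edge_index = [[0, 1], [1, 0]]

guessed_num_nodes = infer_num_nodes(edge_index)
print(guessed_num_nodes, true_num_nodes)  # the guess misses the isolated nodes
```

Isolated nodes with the highest indices never appear in edge_index, so the guess undercounts them; setting the number of nodes explicitly avoids this.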
- pin_memory(*args: List[str])
Copies attributes to pinned memory, either for all attributes or only the ones given in *args.
- put_edge_index(edge_index: Tuple[Tensor, Tensor], *args, **kwargs) → bool
Synchronously adds an edge_index tuple to the GraphStore. Returns whether insertion was successful.
- Parameters
edge_index (Tuple[torch.Tensor, torch.Tensor]) – The edge_index tuple in a format specified in EdgeAttr.
**kwargs (EdgeAttr) – Any relevant edge attributes that correspond to the edge_index tuple. See the EdgeAttr documentation for required and optional attributes.
- put_tensor(tensor: Union[Tensor, ndarray], *args, **kwargs) → bool
Synchronously adds a tensor to the FeatureStore. Returns whether insertion was successful.
- Parameters
tensor (torch.Tensor or np.ndarray) – The feature tensor to be added.
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the TensorAttr documentation for required and optional attributes.
- Raises
ValueError – If the input TensorAttr is not fully specified.
- record_stream(stream: Stream, *args: List[str])
Ensures that the tensor memory is not reused for another tensor until all current work queued on stream has been completed, either for all attributes or only the ones given in *args.
- remove_edge_index(*args, **kwargs) → bool
Synchronously deletes an edge_index tuple from the GraphStore. Returns whether deletion was successful.
- remove_tensor(*args, **kwargs) → bool
Removes a tensor from the FeatureStore. Returns whether deletion was successful.
- Parameters
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the TensorAttr documentation for required and optional attributes.
- Raises
ValueError – If the input TensorAttr is not fully specified.
- requires_grad_(*args: List[str], requires_grad: bool = True)
Tracks gradient computation, either for all attributes or only the ones given in *args.
- share_memory_(*args: List[str])
Moves attributes to shared memory, either for all attributes or only the ones given in *args.
- size(dim: Optional[int] = None) → Optional[Union[Tuple[Optional[int], Optional[int]], int]]
Returns the size of the adjacency matrix induced by the graph.
- to(device: Union[int, str], *args: List[str], non_blocking: bool = False)
Performs tensor device conversion, either for all attributes or only the ones given in *args.
- update_tensor(tensor: Union[Tensor, ndarray], *args, **kwargs) → bool
Updates a tensor in the FeatureStore with a new value. Returns whether the update was successful.
Note
Implementor classes can choose to define more efficient update methods; the default performs a removal and insertion.
- Parameters
tensor (torch.Tensor or np.ndarray) – The feature tensor to be updated.
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the TensorAttr documentation for required and optional attributes.
- view(*args, **kwargs) → AttrView
Returns a view of the FeatureStore given a not yet fully-specified TensorAttr.
- get_all_edge_attrs() → List[EdgeAttr]
Obtains all edge attributes stored in the GraphStore.