torch_geometric.transforms.AddMetaPaths

class AddMetaPaths(metapaths: List[List[Tuple[str, str, str]]], drop_orig_edge_types: bool = False, keep_same_node_type: bool = False, drop_unconnected_node_types: bool = False, max_sample: Optional[int] = None, weighted: bool = False, **kwargs: bool)

Bases: BaseTransform

Adds additional edge types to a HeteroData object between the source node type and the destination node type of a given metapath, as described in the “Heterogeneous Graph Attention Network” paper (functional name: add_metapaths).

Metapath-based neighbors can exploit different aspects of the structural information in heterogeneous graphs. Formally, a metapath is a path of the form

\[\mathcal{V}_1 \xrightarrow{R_1} \mathcal{V}_2 \xrightarrow{R_2} \ldots \xrightarrow{R_{\ell-1}} \mathcal{V}_{\ell}\]

in which \(\mathcal{V}_i\) represents a node type and \(R_j\) represents the edge type connecting \(\mathcal{V}_j\) to \(\mathcal{V}_{j+1}\). The added edge type is given by the sequential multiplication of the adjacency matrices along the metapath, and is added to the HeteroData object as edge type (src_node_type, "metapath_*", dst_node_type), where src_node_type and dst_node_type denote \(\mathcal{V}_1\) and \(\mathcal{V}_{\ell}\), respectively.
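As a rough illustration of the adjacency-matrix product described above, the following pure-Python sketch composes two toy adjacency matrices along a paper → conference → paper metapath. It is not the library's implementation: the helpers `matmul` and `metapath_edges` and the toy matrices are assumptions made up for this example. Each nonzero entry of the product corresponds to a metapath edge, and its value counts the paths between the two endpoints, which is the quantity the `weighted` option is documented to store.

```python
def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def metapath_edges(adjs):
    """Multiply adjacency matrices along a metapath (hypothetical helper).

    Returns (src, dst, num_paths) triples for every nonzero entry.
    """
    prod = adjs[0]
    for adj in adjs[1:]:
        prod = matmul(prod, adj)
    return [(i, j, count)
            for i, row in enumerate(prod)
            for j, count in enumerate(row)
            if count > 0]


# Toy data (made up): 3 papers, 2 conferences.
# Papers 0 and 1 appear at conference 0; paper 2 appears at conference 1.
paper_to_conf = [[1, 0],
                 [1, 0],
                 [0, 1]]
# The reverse edge type is the transpose of the forward one.
conf_to_paper = [list(col) for col in zip(*paper_to_conf)]

edges = metapath_edges([paper_to_conf, conf_to_paper])
print(edges)
```

In this toy graph, papers 0 and 1 become connected because they share a conference, while paper 2 is only connected to itself.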

In addition, a metapath_dict object is added to the HeteroData object which maps the metapath-based edge type to its original metapath.

from torch_geometric.datasets import DBLP
from torch_geometric.transforms import AddMetaPaths

root = "data/DBLP"  # placeholder: any local directory for the dataset download
data = DBLP(root)[0]
# 4 node types: "paper", "author", "conference", and "term"
# 6 edge types: ("paper", "author"), ("author", "paper"),
#               ("paper", "term"), ("paper", "conference"),
#               ("term", "paper"), ("conference", "paper")

# Add two metapaths:
# 1. From "paper" to "paper" through "conference"
# 2. From "author" to "conference" through "paper"
metapaths = [[("paper", "conference"), ("conference", "paper")],
             [("author", "paper"), ("paper", "conference")]]
data = AddMetaPaths(metapaths)(data)

print(data.edge_types)
# [("author", "to", "paper"), ("paper", "to", "author"),
#  ("paper", "to", "term"), ("paper", "to", "conference"),
#  ("term", "to", "paper"), ("conference", "to", "paper"),
#  ("paper", "metapath_0", "paper"),
#  ("author", "metapath_1", "conference")]

print(data.metapath_dict)
# {("paper", "metapath_0", "paper"): [("paper", "conference"),
#                                     ("conference", "paper")],
#  ("author", "metapath_1", "conference"): [("author", "paper"),
#                                           ("paper", "conference")]}

Parameters:
  • metapaths (List[List[Tuple[str, str, str]]]) – The metapaths described by a list of lists of (src_node_type, rel_type, dst_node_type) tuples.

  • drop_orig_edge_types (bool, optional) – If set to True, existing edge types will be dropped. (default: False)

  • keep_same_node_type (bool, optional) – If set to True, existing edge types whose source and destination node types are the same are kept even if drop_orig_edge_types is set to True. (default: False)

  • drop_unconnected_node_types (bool, optional) – If set to True, will drop node types not connected by any edge type. (default: False)

  • max_sample (int, optional) – If set, samples at most max_sample neighbors within metapaths. Useful for keeping very dense metapath edges tractable. (default: None)

  • weighted (bool, optional) – If set to True, computes a weight for each metapath edge and stores the weights in edge_weight. The weight of a metapath edge is the number of metapaths from its start node to its end node. (default: False)
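The interaction of the three drop options can be sketched in plain Python, following only the parameter descriptions above. The function `filter_types` is a hypothetical helper written for this illustration and is not part of torch_geometric; it operates on lists of type names rather than on a HeteroData object.

```python
def filter_types(node_types, orig_edge_types, metapath_edge_types,
                 drop_orig_edge_types=False, keep_same_node_type=False,
                 drop_unconnected_node_types=False):
    """Return the (node_types, edge_types) left after applying the drop options.

    Hypothetical helper: mirrors the documented semantics, not library code.
    """
    kept = []
    for src, rel, dst in orig_edge_types:
        # An original edge type is dropped unless it is protected by
        # keep_same_node_type (same source and destination node type).
        if drop_orig_edge_types and not (keep_same_node_type and src == dst):
            continue
        kept.append((src, rel, dst))
    edge_types = kept + list(metapath_edge_types)

    if drop_unconnected_node_types:
        # Keep only node types that appear in at least one remaining edge type.
        connected = {t for src, _, dst in edge_types for t in (src, dst)}
        node_types = [n for n in node_types if n in connected]
    return list(node_types), edge_types


node_types = ["author", "paper", "term", "conference"]
orig = [("author", "to", "paper"), ("paper", "to", "author"),
        ("paper", "to", "term"), ("paper", "to", "conference"),
        ("term", "to", "paper"), ("conference", "to", "paper")]
meta = [("paper", "metapath_0", "paper"),
        ("author", "metapath_1", "conference")]

nodes, edges = filter_types(node_types, orig, meta,
                            drop_orig_edge_types=True,
                            drop_unconnected_node_types=True)
print(nodes)
print(edges)
```

With these settings the sketch keeps only the two metapath edge types, and "term" is removed because no remaining edge type touches it.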