class SAGPooling(in_channels: int, ratio: Union[float, int] = 0.5, GNN: torch.nn.Module = torch_geometric.nn.conv.GraphConv, min_score: Optional[float] = None, multiplier: float = 1.0, nonlinearity: Union[str, Callable] = 'tanh', **kwargs)

Bases: Module

The self-attention pooling operator from the “Self-Attention Graph Pooling” and “Understanding Attention and Generalization in Graph Neural Networks” papers.

If min_score \(\tilde{\alpha}\) is None, computes:

\[ \begin{align}\begin{aligned}\mathbf{y} &= \textrm{GNN}(\mathbf{X}, \mathbf{A})\\\mathbf{i} &= \mathrm{top}_k(\mathbf{y})\\\mathbf{X}^{\prime} &= (\mathbf{X} \odot \mathrm{tanh}(\mathbf{y}))_{\mathbf{i}}\\\mathbf{A}^{\prime} &= \mathbf{A}_{\mathbf{i},\mathbf{i}}\end{aligned}\end{align} \]

If min_score \(\tilde{\alpha}\) is a value in [0, 1], computes:

\[ \begin{align}\begin{aligned}\mathbf{y} &= \mathrm{softmax}(\textrm{GNN}(\mathbf{X},\mathbf{A}))\\\mathbf{i} &= \mathbf{y}_i > \tilde{\alpha}\\\mathbf{X}^{\prime} &= (\mathbf{X} \odot \mathbf{y})_{\mathbf{i}}\\\mathbf{A}^{\prime} &= \mathbf{A}_{\mathbf{i},\mathbf{i}}.\end{aligned}\end{align} \]

Projection scores are learned by a graph neural network layer.
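The top-k branch (min_score is None) can be traced by hand on a tiny graph. The following is a minimal pure-Python sketch for a single graph, not the library's implementation: the helper name sag_pool_topk is hypothetical, and the scores list stands in for the output of the GNN layer, which is learned in practice.

```python
import math


def sag_pool_topk(x, edges, scores, ratio=0.5):
    """Illustrative top-k pooling for one graph (hypothetical helper).

    x      : list of N feature rows (lists of floats)
    edges  : list of (source, target) node-index pairs
    scores : list of N scalars standing in for y = GNN(X, A)
    """
    n = len(x)
    k = math.ceil(ratio * n)
    # i = top_k(y): indices of the k largest scores, highest first
    perm = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    # X' = (X * tanh(y))_i: gate each kept row by tanh of its own score
    x_out = [[v * math.tanh(scores[i]) for v in x[i]] for i in perm]
    # A' = A_{i,i}: keep edges whose endpoints both survive, then relabel
    new_index = {old: new for new, old in enumerate(perm)}
    edges_out = [
        (new_index[s], new_index[t])
        for s, t in edges
        if s in new_index and t in new_index
    ]
    return x_out, edges_out, perm


x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
edges = [(0, 1), (1, 3), (3, 1), (2, 3)]
scores = [0.1, 2.0, -1.0, 0.5]
x_out, edges_out, perm = sag_pool_topk(x, edges, scores, ratio=0.5)
print(perm, edges_out)
```

With ratio=0.5 and four nodes, k is 2, so only the two highest-scoring nodes and the edges between them remain.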

  • in_channels (int) – Size of each input sample.

  • ratio (float or int) – Graph pooling ratio, which is used to compute \(k = \lceil \mathrm{ratio} \cdot N \rceil\), or the value of \(k\) itself, depending on whether the type of ratio is float or int. This value is ignored if min_score is not None. (default: 0.5)

  • GNN (torch.nn.Module, optional) – A graph neural network layer for calculating projection scores (one of torch_geometric.nn.conv.GraphConv, torch_geometric.nn.conv.GCNConv, torch_geometric.nn.conv.GATConv or torch_geometric.nn.conv.SAGEConv). (default: torch_geometric.nn.conv.GraphConv)

  • min_score (float, optional) – Minimal node score \(\tilde{\alpha}\) which is used to compute indices of pooled nodes \(\mathbf{i} = \mathbf{y}_i > \tilde{\alpha}\). When this value is not None, the ratio argument is ignored. (default: None)

  • multiplier (float, optional) – Coefficient by which features get multiplied after pooling. This can be useful for large graphs and when min_score is used. (default: 1)

  • nonlinearity (str or callable, optional) – The non-linearity to use. (default: "tanh")

  • **kwargs (optional) – Additional parameters for initializing the graph neural network layer.
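How ratio and min_score interact in the parameter list above can be summarized in a few lines. This is a hedged pure-Python sketch for a single graph: select_nodes is a hypothetical helper, the scores are made up, and clamping an integer ratio to the number of nodes is an assumption rather than something stated in this reference.

```python
import math


def select_nodes(scores, ratio=0.5, min_score=None):
    """Return indices of the kept nodes for one graph (hypothetical helper)."""
    n = len(scores)
    if min_score is not None:
        # ratio is ignored: softmax-normalize the scores, keep y_i > min_score
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        y = [e / total for e in exps]
        return [i for i in range(n) if y[i] > min_score]
    if isinstance(ratio, int):
        # An int ratio is the value of k itself.
        # Assumption: k is clamped so it never exceeds N.
        k = min(ratio, n)
    else:
        # A float ratio gives k = ceil(ratio * N)
        k = math.ceil(ratio * n)
    return sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]


scores = [0.1, 2.0, -1.0, 0.5]
kept_by_fraction = select_nodes(scores, ratio=0.5)
kept_by_count = select_nodes(scores, ratio=3)
kept_by_threshold = select_nodes(scores, ratio=3, min_score=0.2)
print(kept_by_fraction, kept_by_count, kept_by_threshold)
```

Note that the threshold branch can keep any number of nodes, since it depends only on how the softmax mass is distributed.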


reset_parameters()

Resets all learnable parameters of the module.

forward(x: Tensor, edge_index: Tensor, edge_attr: Optional[Tensor] = None, batch: Optional[Tensor] = None, attn: Optional[Tensor] = None) → Tuple[Tensor, Tensor, Optional[Tensor], Tensor, Tensor, Tensor]
  • x (torch.Tensor) – The node feature matrix.

  • edge_index (torch.Tensor) – The edge indices.

  • edge_attr (torch.Tensor, optional) – The edge features. (default: None)

  • batch (torch.Tensor, optional) – The batch vector \(\mathbf{b} \in {\{ 0, \ldots, B-1\}}^N\), which assigns each node to a specific example. (default: None)

  • attn (torch.Tensor, optional) – Optional node-level matrix to use for computing attention scores instead of using the node feature matrix x. (default: None)
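When a batch vector is given, node selection is carried out within each example rather than across the whole mini-batch. The sketch below illustrates that idea in pure Python under stated assumptions: topk_per_graph is a hypothetical helper, the scores are invented, and the real operator works on tensors rather than lists.

```python
import math
from collections import defaultdict


def topk_per_graph(scores, batch, ratio=0.5):
    """Keep ceil(ratio * N_g) top-scoring nodes inside each graph g."""
    # Group node indices by the graph they belong to
    groups = defaultdict(list)
    for node, graph in enumerate(batch):
        groups[graph].append(node)

    perm = []
    for graph in sorted(groups):
        nodes = groups[graph]
        k = math.ceil(ratio * len(nodes))
        # Rank only the nodes of this graph, highest score first
        ranked = sorted(nodes, key=lambda i: scores[i], reverse=True)
        perm.extend(ranked[:k])

    # The pooled batch vector maps each kept node to its original graph
    return perm, [batch[i] for i in perm]


scores = [0.3, 0.9, 0.1, 0.8, 0.2]
batch = [0, 0, 0, 1, 1]  # nodes 0-2 form graph 0, nodes 3-4 form graph 1
perm, batch_out = topk_per_graph(scores, batch, ratio=0.5)
print(perm, batch_out)
```

Here graph 0 keeps two of its three nodes and graph 1 keeps one of its two, so every example survives pooling with at least one node.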