dense_diff_pool(x: Tensor, adj: Tensor, s: Tensor, mask: Optional[Tensor] = None, normalize: bool = True) → Tuple[Tensor, Tensor, Tensor, Tensor]

The differentiable pooling operator from the “Hierarchical Graph Representation Learning with Differentiable Pooling” paper.

\[\begin{aligned}\mathbf{X}^{\prime} &= {\mathrm{softmax}(\mathbf{S})}^{\top} \cdot \mathbf{X}\\ \mathbf{A}^{\prime} &= {\mathrm{softmax}(\mathbf{S})}^{\top} \cdot \mathbf{A} \cdot \mathrm{softmax}(\mathbf{S})\end{aligned}\]

based on dense learned assignments \(\mathbf{S} \in \mathbb{R}^{B \times N \times C}\). Returns the pooled node feature matrix \(\mathbf{X}^{\prime}\), the coarsened adjacency matrix \(\mathbf{A}^{\prime}\), and two auxiliary objectives: (1) the link prediction loss

\[\mathcal{L}_{LP} = {\| \mathbf{A} - \mathrm{softmax}(\mathbf{S}) {\mathrm{softmax}(\mathbf{S})}^{\top} \|}_F,\]

and (2) the entropy regularization

\[\mathcal{L}_E = \frac{1}{N} \sum_{n=1}^N H(\mathbf{S}_n).\]
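
The formulas above can be sketched in plain PyTorch. This is a minimal illustration, not the library's implementation: the function name `diff_pool_sketch` and the `eps` constant are hypothetical, and it assumes that the softmax runs over the cluster dimension \(C\), that the Frobenius norm is taken over the whole batched tensor, and that the entropy is averaged over all nodes in the batch. Masking is omitted.

```python
import torch


def diff_pool_sketch(x, adj, s, normalize=True, eps=1e-15):
    """Illustrative re-derivation of the pooling formulas (hypothetical helper)."""
    # Assumption: each node's scores are normalized over the C clusters.
    s = torch.softmax(s, dim=-1)          # [B, N, C]
    st = s.transpose(1, 2)                # [B, C, N]

    x_pooled = st @ x                     # X' = softmax(S)^T X        -> [B, C, F]
    adj_pooled = st @ adj @ s             # A' = softmax(S)^T A softmax(S) -> [B, C, C]

    # Link prediction loss: Frobenius norm of A - S S^T (over the whole batch).
    link_loss = torch.linalg.norm((adj - s @ st).flatten())
    if normalize:
        link_loss = link_loss / adj.numel()

    # Entropy regularization: mean entropy of the per-node assignment rows.
    # eps is an assumed guard against log(0).
    ent_loss = (-s * torch.log(s + eps)).sum(dim=-1).mean()
    return x_pooled, adj_pooled, link_loss, ent_loss


torch.manual_seed(0)
B, N, F, C = 2, 5, 4, 3
x = torch.randn(B, N, F)
adj = (torch.rand(B, N, N) > 0.5).float()
s = torch.randn(B, N, C)                  # raw scores, no softmax applied yet

x_pooled, adj_pooled, link_loss, ent_loss = diff_pool_sketch(x, adj, s)
print(x_pooled.shape, adj_pooled.shape)
```

Each graph is reduced from \(N\) nodes to \(C\) clusters, so the pooled features have shape \(B \times C \times F\) and the coarsened adjacency has shape \(B \times C \times C\).
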

Parameters:

  • x (torch.Tensor) – Node feature tensor \(\mathbf{X} \in \mathbb{R}^{B \times N \times F}\), with batch-size \(B\), (maximum) number of nodes \(N\) for each graph, and feature dimension \(F\).

  • adj (torch.Tensor) – Adjacency tensor \(\mathbf{A} \in \mathbb{R}^{B \times N \times N}\).

  • s (torch.Tensor) – Assignment tensor \(\mathbf{S} \in \mathbb{R}^{B \times N \times C}\) with number of clusters \(C\). The softmax does not have to be applied beforehand, since it is executed within this method.

  • mask (torch.Tensor, optional) – Mask matrix \(\mathbf{M} \in {\{ 0, 1 \}}^{B \times N}\) indicating the valid nodes for each graph. (default: None)

  • normalize (bool, optional) – If set to False, the link prediction loss is not divided by adj.numel(). (default: True)

Return type:

(torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor)