class GraphMultisetTransformer(channels: int, k: int, num_encoder_blocks: int = 1, heads: int = 1, layer_norm: bool = False, dropout: float = 0.0)[source]

Bases: Aggregation

The Graph Multiset Transformer pooling operator from the “Accurate Learning of Graph Representations with Graph Multiset Pooling” paper.

The GraphMultisetTransformer aggregates elements into \(k\) representative elements via attention-based pooling, computes the interaction among them via num_encoder_blocks self-attention blocks, and finally pools the representative elements via attention-based pooling into a single cluster.

Parameters:

  • channels (int) – Size of each input sample.

  • k (int) – Number of representative nodes \(k\) after pooling.

  • num_encoder_blocks (int, optional) – Number of Set Attention Blocks (SABs) between the two pooling blocks. (default: 1)

  • heads (int, optional) – Number of attention heads. (default: 1)

  • layer_norm (bool, optional) – If set to True, will apply layer normalization. (default: False)

  • dropout (float, optional) – Dropout probability of attention weights. (default: 0.0)
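
A minimal usage sketch (the channel size, value of k, number of heads, and the node-to-graph assignment below are illustrative assumptions, not values prescribed by this class; the import path assumes a recent torch_geometric release):

    import torch
    from torch_geometric.nn.aggr import GraphMultisetTransformer

    aggr = GraphMultisetTransformer(channels=64, k=10, heads=4)

    x = torch.randn(50, 64)                    # 50 node features of size 64
    index = torch.zeros(50, dtype=torch.long)  # all 50 nodes belong to graph 0
    out = aggr(x, index)                       # pooled graph embedding of shape [1, 64]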


reset_parameters()[source]

Resets all learnable parameters of the module.

forward(x: Tensor, index: Optional[Tensor] = None, ptr: Optional[Tensor] = None, dim_size: Optional[int] = None, dim: int = -2) → Tensor[source]

Parameters:

  • x (torch.Tensor) – The source tensor.

  • index (torch.Tensor, optional) – The indices of elements for applying the aggregation. One of index or ptr must be defined. (default: None)

  • ptr (torch.Tensor, optional) – If given, computes the aggregation based on sorted inputs in CSR representation. One of index or ptr must be defined. (default: None)

  • dim_size (int, optional) – The size of the output tensor at dimension dim after aggregation. (default: None)

  • dim (int, optional) – The dimension in which to aggregate. (default: -2)
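
A short call sketch for forward() (the tensor shapes and the two-graph index assignment are assumptions for illustration): elements that share an index value are pooled into one output row along dim.

    import torch
    from torch_geometric.nn.aggr import GraphMultisetTransformer

    aggr = GraphMultisetTransformer(channels=32, k=4)

    x = torch.randn(5, 32)                    # 5 source elements (nodes)
    index = torch.tensor([0, 0, 0, 1, 1])     # sorted element-to-graph assignment
    out = aggr(x, index=index, dim_size=2)    # one pooled vector per graph: [2, 32]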