torch_geometric.nn.conv.GATConv
- class GATConv(in_channels: Union[int, Tuple[int, int]], out_channels: int, heads: int = 1, concat: bool = True, negative_slope: float = 0.2, dropout: float = 0.0, add_self_loops: bool = True, edge_dim: Optional[int] = None, fill_value: Union[float, Tensor, str] = 'mean', bias: bool = True, **kwargs)
Bases: MessagePassing
The graph attentional operator from the “Graph Attention Networks” paper:
\[\mathbf{x}^{\prime}_i = \alpha_{i,i}\mathbf{\Theta}\mathbf{x}_{i} + \sum_{j \in \mathcal{N}(i)} \alpha_{i,j}\mathbf{\Theta}\mathbf{x}_{j},\]
where the attention coefficients \(\alpha_{i,j}\) are computed as
\[\alpha_{i,j} = \frac{ \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top} [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_j] \right)\right)} {\sum_{k \in \mathcal{N}(i) \cup \{ i \}} \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top} [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_k] \right)\right)}.\]
If the graph has multi-dimensional edge features \(\mathbf{e}_{i,j}\), the attention coefficients \(\alpha_{i,j}\) are computed as
\[\alpha_{i,j} = \frac{ \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top} [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_j \, \Vert \, \mathbf{\Theta}_{e} \mathbf{e}_{i,j}]\right)\right)} {\sum_{k \in \mathcal{N}(i) \cup \{ i \}} \exp\left(\mathrm{LeakyReLU}\left(\mathbf{a}^{\top} [\mathbf{\Theta}\mathbf{x}_i \, \Vert \, \mathbf{\Theta}\mathbf{x}_k \, \Vert \, \mathbf{\Theta}_{e} \mathbf{e}_{i,k}]\right)\right)}.\]
- Parameters
in_channels (int or tuple) – Size of each input sample, or -1 to derive the size from the first input(s) to the forward method. A tuple corresponds to the sizes of source and target dimensionalities.
out_channels (int) – Size of each output sample.
heads (int, optional) – Number of multi-head-attentions. (default: 1)
concat (bool, optional) – If set to False, the multi-head attentions are averaged instead of concatenated. (default: True)
negative_slope (float, optional) – LeakyReLU angle of the negative slope. (default: 0.2)
dropout (float, optional) – Dropout probability of the normalized attention coefficients which exposes each node to a stochastically sampled neighborhood during training. (default: 0)
add_self_loops (bool, optional) – If set to False, will not add self-loops to the input graph. (default: True)
edge_dim (int, optional) – Edge feature dimensionality (in case there are any). (default: None)
fill_value (float or torch.Tensor or str, optional) – The way to generate edge features of self-loops (in case edge_dim != None). If given as float or torch.Tensor, edge features of self-loops will be directly given by fill_value. If given as str, edge features of self-loops are computed by aggregating all features of edges that point to the specific node, according to a reduce operation ("add", "mean", "min", "max", "mul"). (default: "mean")
bias (bool, optional) – If set to False, the layer will not learn an additive bias. (default: True)
**kwargs (optional) – Additional arguments of torch_geometric.nn.conv.MessagePassing.
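To make the attention formulas and the role of negative_slope concrete, the following is a minimal single-head sketch in plain Python. It is not the library's implementation. It assumes the features have already been transformed (h[i] stands for \(\mathbf{\Theta}\mathbf{x}_i\)), has no edge features, and uses hypothetical toy values for h, neighbors, and the attention vector a.

```python
import math


def leaky_relu(z, negative_slope=0.2):
    # Identity for non-negative inputs, scaled by negative_slope otherwise.
    return z if z >= 0 else negative_slope * z


def attention_coefficients(h, neighbors, a, i, negative_slope=0.2):
    """Return {j: alpha_ij} for j in N(i) united with {i} (single head)."""
    candidates = sorted(set(neighbors[i]) | {i})  # the self-loop is always included
    scores = {}
    for j in candidates:
        concat = h[i] + h[j]  # list concatenation plays the role of [h_i || h_j]
        raw = sum(w * v for w, v in zip(a, concat))  # a^T [h_i || h_j]
        scores[j] = leaky_relu(raw, negative_slope)
    # Softmax over the candidates; subtracting the max only improves stability.
    top = max(scores.values())
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}


def update_node(h, neighbors, a, i, negative_slope=0.2):
    """Return x'_i as the attention-weighted sum of h_j over N(i) united with {i}."""
    alpha = attention_coefficients(h, neighbors, a, i, negative_slope)
    dim = len(h[i])
    return [sum(alpha[j] * h[j][d] for j in alpha) for d in range(dim)]


# Hypothetical toy graph: neighbors[i] lists the nodes j with an edge j -> i.
h = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbors = {0: [1, 2], 1: [0], 2: [0]}
a = [0.5, -0.5, 1.0, -1.0]  # length is twice the feature size

alpha_0 = attention_coefficients(h, neighbors, a, 0)
x_new_0 = update_node(h, neighbors, a, 0)
print(alpha_0, x_new_0)
```

The coefficients for each target node sum to one, and an all-zero attention vector gives uniform weights over the node and its neighbors. With heads > 1, the layer repeats this computation per head with separate parameters and then concatenates or averages the results, depending on concat.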
- Shapes:
input: node features \((|\mathcal{V}|, F_{in})\) or \(((|\mathcal{V}_s|, F_{s}), (|\mathcal{V}_t|, F_{t}))\) if bipartite, edge indices \((2, |\mathcal{E}|)\), edge features \((|\mathcal{E}|, D)\) (optional)
output: node features \((|\mathcal{V}|, H * F_{out})\) or \((|\mathcal{V}_t|, H * F_{out})\) if bipartite. If return_attention_weights=True, then \(((|\mathcal{V}|, H * F_{out}), ((2, |\mathcal{E}|), (|\mathcal{E}|, H)))\) or \(((|\mathcal{V}_t|, H * F_{out}), ((2, |\mathcal{E}|), (|\mathcal{E}|, H)))\) if bipartite