torch_geometric.datasets.GDELTLite

class GDELTLite(root: str, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, force_reload: bool = False)

Bases: InMemoryDataset

The (reduced) version of the Global Database of Events, Language, and Tone (GDELT) dataset used in the “Do We Really Need Complicated Model Architectures for Temporal Networks?” paper, consisting of events collected from 2016 to 2020.

Each node (actor) holds a 413-dimensional multi-hot feature vector that represents the CAMEO codes attached to the corresponding actor.

Each edge (event) holds a timestamp and a 186-dimensional multi-hot feature vector that represents the CAMEO codes attached to the corresponding event.
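As an illustration of the multi-hot encoding described above, the following minimal sketch builds such a vector in plain Python. The helper name `multi_hot` and the index values are hypothetical and do not correspond to real CAMEO codes; only the dimensionality (186 for events) comes from the description.

```python
def multi_hot(active_indices, dim):
    """Return a list of length `dim` with 1 at every active index, else 0."""
    vec = [0] * dim
    for i in active_indices:
        vec[i] = 1
    return vec


# Hypothetical event carrying three codes at made-up positions 3, 17 and 185.
event_vec = multi_hot([3, 17, 185], dim=186)
```

Unlike a one-hot vector, several positions can be set at once, which is how a single actor or event can carry more than one code.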

Parameters:
  • root (str) – Root directory where the dataset should be saved.

  • transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)

  • pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)

  • force_reload (bool, optional) – Whether to re-process the dataset. (default: False)
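A minimal usage sketch for the parameters above, assuming `torch_geometric` is installed and that the dataset files are downloaded on first use. The helper name `load_gdelt_lite` and the root path `data/GDELTLite` are illustrative choices, not part of the library.

```python
def load_gdelt_lite(root="data/GDELTLite"):
    """Load GDELTLite and return its graph object.

    `root` is an arbitrary example path. The import is done lazily so that
    defining this helper does not itself require torch_geometric.
    """
    from torch_geometric.datasets import GDELTLite

    # transform / pre_transform / force_reload keep their defaults here.
    dataset = GDELTLite(root=root)
    # Assumption: the dataset exposes one graph, accessed by index 0.
    return dataset[0]
```

Calling `data = load_gdelt_lite()` and printing `data` should show the tensors stored on the graph. The attribute names are not given in this page; in the usual `Data` layout one would expect node features under `data.x` and event features under `data.edge_attr`, but treat those names, and wherever the timestamps are stored, as assumptions to verify against the printed object.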

STATS:

#nodes    #edges       #features    #classes
8,831     1,912,909    413