torch_geometric.profile

GNN profiling package.

- profileit: A decorator to facilitate profiling a function, e.g., obtaining training runtime and memory statistics of a specific model on a specific dataset.
- timeit: A context decorator to facilitate timing a function, e.g., obtaining the runtime of a specific model on a specific dataset.
- get_stats_summary: Creates a summary of collected runtime and memory statistics.
- count_parameters: Given a torch.nn.Module, count its trainable parameters.
- get_model_size: Given a torch.nn.Module, get its actual disk size in bytes.
- get_data_size: Given a torch_geometric.data.Data object, get its theoretical memory usage in bytes.
- get_cpu_memory_from_gc: Returns the used CPU memory in bytes, as reported by the Python garbage collector.
- get_gpu_memory_from_gc: Returns the used GPU memory in bytes, as reported by the Python garbage collector.
- get_gpu_memory_from_nvidia_smi: Returns the free and used GPU memory in megabytes, as reported by nvidia-smi.
- get_gpu_memory_from_ipex: Returns the XPU memory statistics.
- benchmark: Benchmark a list of functions funcs that receive the same set of arguments args.
- profileit(device: str)[source]
A decorator to facilitate profiling a function, e.g., obtaining training runtime and memory statistics of a specific model on a specific dataset. Returns a GPUStats object if device is xpu, or an extended CUDAStats object if device is cuda.
- Parameters:
device (str) – Target device for profiling. Options are: cuda and xpu.

    @profileit("cuda")
    def train(model, optimizer, x, edge_index, y):
        optimizer.zero_grad()
        out = model(x, edge_index)
        loss = criterion(out, y)
        loss.backward()
        optimizer.step()
        return float(loss)

    loss, stats = train(model, optimizer, x, edge_index, y)
- class timeit(log: bool = True, avg_time_divisor: int = 0)[source]
A context decorator to facilitate timing a function, e.g., obtaining the runtime of a specific model on a specific dataset.

    @torch.no_grad()
    def test(model, x, edge_index):
        return model(x, edge_index)

    with timeit() as t:
        z = test(model, x, edge_index)
    time = t.duration

- Parameters:
log (bool, optional) – If set to False, will not log any runtime to the console. (default: True)
avg_time_divisor (int, optional) – If set to a value greater than 0, the total time will be divided by this value, e.g., to obtain the average runtime within a loop. (default: 0)
- get_stats_summary(stats_list: Union[List[GPUStats], List[CUDAStats]]) → Union[GPUStatsSummary, CUDAStatsSummary][source]
Creates a summary of collected runtime and memory statistics. Returns a GPUStatsSummary if a list of GPUStats was passed, otherwise (if a list of CUDAStats was passed) returns a CUDAStatsSummary.
- Parameters:
stats_list (Union[List[GPUStats], List[CUDAStats]]) – A list of GPUStats or CUDAStats objects, as returned by profileit().
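A minimal sketch of how profileit() and get_stats_summary() fit together. It reuses the decorated train() function from the profileit() example above, and assumes model, optimizer, and the data tensors already exist:

    from torch_geometric.profile import get_stats_summary

    # Collect one stats object per training epoch.
    stats_list = []
    for epoch in range(100):
        loss, stats = train(model, optimizer, x, edge_index, y)
        stats_list.append(stats)

    # Aggregate the per-epoch statistics into a single summary.
    summary = get_stats_summary(stats_list)
    print(summary)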
- count_parameters(model: Module) → int[source]
Given a torch.nn.Module, count its trainable parameters.
- Parameters:
model (torch.nn.Module) – The model.
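For illustration, a small sketch; the expected count assumes GCNConv holds a 16×32 weight matrix plus a bias vector of size 32:

    from torch_geometric.nn import GCNConv
    from torch_geometric.profile import count_parameters

    conv = GCNConv(16, 32)
    # 16 * 32 weights + 32 bias terms should yield 544 trainable parameters.
    print(count_parameters(conv))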
- get_model_size(model: Module) → int[source]
Given a torch.nn.Module, get its actual disk size in bytes.
- Parameters:
model (torch.nn.Module) – The model.
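A short usage sketch; model is assumed to be any torch.nn.Module instance:

    from torch_geometric.profile import get_model_size

    size_bytes = get_model_size(model)
    print(f'{size_bytes / 1024**2:.2f} MB on disk')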
- get_data_size(data: BaseData) → int[source]
Given a torch_geometric.data.Data object, get its theoretical memory usage in bytes.
- Parameters:
data (torch_geometric.data.Data or torch_geometric.data.HeteroData) – The Data or HeteroData graph object.
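A minimal sketch with a randomly generated graph; the shapes are arbitrary placeholders:

    import torch
    from torch_geometric.data import Data
    from torch_geometric.profile import get_data_size

    data = Data(
        x=torch.randn(100, 16),                      # 100 nodes, 16 features
        edge_index=torch.randint(0, 100, (2, 500)),  # 500 random edges
    )
    print(get_data_size(data))  # theoretical memory usage in bytes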
- get_cpu_memory_from_gc() → int[source]
Returns the used CPU memory in bytes, as reported by the Python garbage collector.
- get_gpu_memory_from_gc(device: int = 0) → int[source]
Returns the used GPU memory in bytes, as reported by the Python garbage collector.
- Parameters:
device (int, optional) – The GPU device identifier. (default: 0)
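A hedged sketch of how both garbage-collector-based helpers might be used, sampling memory before and after an operation to estimate its footprint; model and the input tensors are assumed placeholders:

    from torch_geometric.profile import (
        get_cpu_memory_from_gc,
        get_gpu_memory_from_gc,
    )

    cpu_before = get_cpu_memory_from_gc()
    gpu_before = get_gpu_memory_from_gc(device=0)

    out = model(x, edge_index)  # the operation to measure (placeholder)

    print(get_cpu_memory_from_gc() - cpu_before)        # CPU bytes attributable to the call
    print(get_gpu_memory_from_gc(device=0) - gpu_before)  # GPU bytes attributable to the call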
- get_gpu_memory_from_nvidia_smi(device: int = 0, digits: int = 2) → Tuple[float, float][source]
Returns the free and used GPU memory in megabytes, as reported by nvidia-smi.
Note: nvidia-smi will generally overestimate the amount of memory used by the actual program.
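Usage sketch; the device index is assumed to match the ordering that nvidia-smi reports:

    from torch_geometric.profile import get_gpu_memory_from_nvidia_smi

    free_mb, used_mb = get_gpu_memory_from_nvidia_smi(device=0, digits=2)
    print(f'free: {free_mb} MB, used: {used_mb} MB')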
- get_gpu_memory_from_ipex(device: int = 0, digits=2) → Tuple[float, float, float][source]
Returns the XPU memory statistics.
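A hedged sketch for Intel XPU devices. It assumes intel_extension_for_pytorch is installed, and the interpretation of the three returned values as peak memory figures in megabytes is an assumption not confirmed by this page:

    from torch_geometric.profile import get_gpu_memory_from_ipex

    # Assumed tuple ordering; verify against your PyG version.
    max_allocated_mb, max_reserved_mb, max_active_mb = get_gpu_memory_from_ipex(device=0)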
- benchmark(funcs: List[Callable], args: Union[Tuple[Any], List[Tuple[Any]]], num_steps: int, func_names: Optional[List[str]] = None, num_warmups: int = 10, backward: bool = False, per_step: bool = False, progress_bar: bool = False)[source]
Benchmark a list of functions funcs that receive the same set of arguments args.
- Parameters:
funcs ([Callable]) – The list of functions to benchmark.
args ((Any, ) or [(Any, )]) – The arguments to pass to the functions. Can be a list of arguments for each function in funcs in case their headers differ. Alternatively, you can pass in functions that generate arguments on-the-fly (e.g., useful for benchmarking models on various sizes).
num_steps (int) – The number of steps to run the benchmark.
func_names ([str], optional) – The names of the functions. If not given, will try to infer the name from the function itself. (default: None)
num_warmups (int, optional) – The number of warmup steps. (default: 10)
backward (bool, optional) – If set to True, will benchmark both forward and backward passes. (default: False)
per_step (bool, optional) – If set to True, will report runtimes per step. (default: False)
progress_bar (bool, optional) – If set to True, will print a progress bar during benchmarking. (default: False)
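A sketch of a typical invocation, comparing two hypothetical callables func_a and func_b that share the same signature; x and edge_index are assumed placeholders:

    from torch_geometric.profile import benchmark

    benchmark(
        funcs=[func_a, func_b],            # hypothetical implementations to compare
        func_names=['baseline', 'fused'],
        args=(x, edge_index),              # shared positional arguments
        num_steps=50,
        num_warmups=10,
        backward=False,                    # only time the forward pass
        progress_bar=True,
    )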