torch_geometric.profile

GNN profiling package.

- profileit – A decorator to facilitate profiling a function, e.g., obtaining training runtime and memory statistics of a specific model on a specific dataset.
- timeit – A context decorator to facilitate timing a function, e.g., obtaining the runtime of a specific model on a specific dataset.
- get_stats_summary – Creates a summary of collected runtime and memory statistics.
- count_parameters – Given a torch.nn.Module, counts its trainable parameters.
- get_model_size – Given a torch.nn.Module, gets its actual disk size in bytes.
- get_data_size – Given a torch_geometric.data.Data object, gets its theoretical memory usage in bytes.
- get_cpu_memory_from_gc – Returns the used CPU memory in bytes, as reported by the Python garbage collector.
- get_gpu_memory_from_gc – Returns the used GPU memory in bytes, as reported by the Python garbage collector.
- get_gpu_memory_from_nvidia_smi – Returns the free and used GPU memory in megabytes, as reported by nvidia-smi.
- get_gpu_memory_from_ipex – Returns the XPU memory statistics.
- benchmark – Benchmarks a list of functions that receive the same set of arguments.
- nvtxit – Enables NVTX profiling for a function.
- profileit(device: str)[source]
A decorator to facilitate profiling a function, e.g., obtaining training runtime and memory statistics of a specific model on a specific dataset. Returns a GPUStats object if device is xpu, or an extended CUDAStats object if device is cuda.
- Parameters:
device (str) – Target device for profiling. Options are: cuda and xpu.

    @profileit("cuda")
    def train(model, optimizer, x, edge_index, y):
        optimizer.zero_grad()
        out = model(x, edge_index)
        loss = criterion(out, y)
        loss.backward()
        optimizer.step()
        return float(loss)

    loss, stats = train(model, optimizer, x, edge_index, y)
- class timeit(log: bool = True, avg_time_divisor: int = 0)[source]
A context decorator to facilitate timing a function, e.g., obtaining the runtime of a specific model on a specific dataset.

    @torch.no_grad()
    def test(model, x, edge_index):
        return model(x, edge_index)

    with timeit() as t:
        z = test(model, x, edge_index)
    time = t.duration

- Parameters:
log (bool, optional) – If set to False, will not log any runtime to the console. (default: True)
avg_time_divisor (int, optional) – If set to a value greater than 1, will divide the total measured time by this value. Useful for calculating the average of runtimes within a for-loop. (default: 0)
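As a minimal sketch of the avg_time_divisor option (the GCNConv layer and the random inputs here are arbitrary stand-ins, not part of the API):

    import torch
    from torch_geometric.nn import GCNConv
    from torch_geometric.profile import timeit

    model = GCNConv(16, 32)
    x = torch.randn(100, 16)
    edge_index = torch.randint(0, 100, (2, 500))

    @torch.no_grad()
    def test(model, x, edge_index):
        return model(x, edge_index)

    # Run 100 forward passes; `avg_time_divisor=100` makes `t.duration`
    # report the average runtime per pass instead of the total.
    with timeit(avg_time_divisor=100) as t:
        for _ in range(100):
            test(model, x, edge_index)
    print(t.duration)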
- get_stats_summary(stats_list: Union[List[GPUStats], List[CUDAStats]]) → Union[GPUStatsSummary, CUDAStatsSummary][source]
Creates a summary of collected runtime and memory statistics. Returns a GPUStatsSummary if a list of GPUStats was passed, otherwise (a list of CUDAStats was passed) returns a CUDAStatsSummary.
- Parameters:
stats_list (Union[List[GPUStats], List[CUDAStats]]) – A list of GPUStats or CUDAStats objects, as returned by profileit().
- Return type:
Union[GPUStatsSummary, CUDAStatsSummary]
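For illustration, a sketch of how profileit() and get_stats_summary() compose over multiple epochs; it reuses the decorated train() function from the profileit() example above and assumes CUDA is available:

    from torch_geometric.profile import get_stats_summary

    stats_list = []
    for epoch in range(5):
        loss, stats = train(model, optimizer, x, edge_index, y)
        stats_list.append(stats)

    # Aggregate the per-epoch GPUStats/CUDAStats into a single summary:
    summary = get_stats_summary(stats_list)
    print(summary)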
- count_parameters(model: Module) → int[source]
Given a torch.nn.Module, count its trainable parameters.
- Parameters:
model (torch.nn.Module) – The model.
- Return type:
int
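For reference, the count matches summing numel() over all parameters that require gradients; a sketch, where the GCNConv layer is just an arbitrary example module:

    from torch_geometric.nn import GCNConv
    from torch_geometric.profile import count_parameters

    model = GCNConv(16, 32)  # weight: 16 * 32, bias: 32
    assert count_parameters(model) == sum(
        p.numel() for p in model.parameters() if p.requires_grad)  # 544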
- get_model_size(model: Module) → int[source]
Given a torch.nn.Module, get its actual disk size in bytes.
- Parameters:
model (torch.nn.Module) – The model.
- Return type:
int
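Conceptually this amounts to serializing the model and measuring the resulting file, roughly as in the following sketch (not the exact implementation):

    import os
    import torch
    from torch_geometric.nn import GCNConv
    from torch_geometric.profile import get_model_size

    model = GCNConv(16, 32)
    print(get_model_size(model))  # disk size in bytes

    # Roughly equivalent: save the state dict and measure the file size.
    torch.save(model.state_dict(), 'model.pt')
    print(os.path.getsize('model.pt'))
    os.remove('model.pt')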
- get_data_size(data: BaseData) → int[source]
Given a torch_geometric.data.Data object, get its theoretical memory usage in bytes.
- Parameters:
data (torch_geometric.data.Data or torch_geometric.data.HeteroData) – The Data or HeteroData graph object.
- Return type:
int
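A minimal usage sketch, building a random homogeneous graph (the sizes are arbitrary):

    import torch
    from torch_geometric.data import Data
    from torch_geometric.profile import get_data_size

    data = Data(
        x=torch.randn(100, 16),                      # 100 * 16 * 4 bytes
        edge_index=torch.randint(0, 100, (2, 500)),  # 2 * 500 * 8 bytes
    )
    print(get_data_size(data))  # theoretical memory usage in bytes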
- get_cpu_memory_from_gc() → int[source]
Returns the used CPU memory in bytes, as reported by the Python garbage collector.
- Return type:
int
- get_gpu_memory_from_gc(device: int = 0) → int[source]
Returns the used GPU memory in bytes, as reported by the Python garbage collector.
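Both garbage-collector-based helpers are plain calls; a sketch (the GPU variant assumes a CUDA device at index 0):

    from torch_geometric.profile import (
        get_cpu_memory_from_gc,
        get_gpu_memory_from_gc,
    )

    print(get_cpu_memory_from_gc())          # bytes held by CPU tensors
    print(get_gpu_memory_from_gc(device=0))  # bytes held by tensors on GPU 0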
- get_gpu_memory_from_nvidia_smi(device: int = 0, digits: int = 2) → Tuple[float, float][source]
Returns the free and used GPU memory in megabytes, as reported by nvidia-smi.

Note: nvidia-smi will generally overestimate the amount of memory used by the actual program.
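Usage is a single call, assuming the nvidia-smi binary is available on the PATH:

    from torch_geometric.profile import get_gpu_memory_from_nvidia_smi

    free_mb, used_mb = get_gpu_memory_from_nvidia_smi(device=0, digits=2)
    print(f'free: {free_mb} MB, used: {used_mb} MB')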
- get_gpu_memory_from_ipex(device: int = 0, digits: int = 2) → Tuple[float, float, float][source]
Returns the XPU memory statistics.
- benchmark(funcs: List[Callable], args: Union[Tuple[Any], List[Tuple[Any]]], num_steps: int, func_names: Optional[List[str]] = None, num_warmups: int = 10, backward: bool = False, per_step: bool = False, progress_bar: bool = False)[source]
Benchmark a list of functions funcs that receive the same set of arguments args.
- Parameters:
funcs ([Callable]) – The list of functions to benchmark.
args ((Any, ) or [(Any, )]) – The arguments to pass to the functions. Can be a list of arguments for each function in funcs in case their headers differ. Alternatively, you can pass in functions that generate arguments on-the-fly (e.g., useful for benchmarking models on various sizes).
num_steps (int) – The number of steps to run the benchmark.
func_names ([str], optional) – The names of the functions. If not given, will try to infer the name from the function itself. (default: None)
num_warmups (int, optional) – The number of warmup steps. (default: 10)
backward (bool, optional) – If set to True, will benchmark both forward and backward passes. (default: False)
per_step (bool, optional) – If set to True, will report runtimes per step. (default: False)
progress_bar (bool, optional) – If set to True, will print a progress bar during benchmarking. (default: False)
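A usage sketch comparing two message-passing layers on the same random inputs (the layer choices and sizes are arbitrary; nn.Module instances work as funcs since they are callable):

    import torch
    from torch_geometric.nn import GATConv, GCNConv
    from torch_geometric.profile import benchmark

    x = torch.randn(1000, 64)
    edge_index = torch.randint(0, 1000, (2, 5000))

    benchmark(
        funcs=[GCNConv(64, 64), GATConv(64, 64)],
        args=(x, edge_index),   # shared by both functions
        num_steps=50,
        func_names=['GCN', 'GAT'],
        backward=True,          # also time the backward pass
    )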
- nvtxit(name: Optional[str] = None, n_warmups: int = 0, n_iters: Optional[int] = None)[source]
Enables NVTX profiling for a function.
- Parameters:
name (Optional[str], optional) – The name to give the NVTX reference frame for the wrapped function. Defaults to the function's name in code.
n_warmups (int, optional) – The number of times to call the function before recording starts. (default: 0)
n_iters (Optional[int], optional) – The number of calls of the function to record. Defaults to all of them.
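A sketch of the intended workflow, assuming a CUDA-capable setup: decorate the function, then run the script under an NVTX-aware profiler such as NVIDIA Nsight Systems (e.g., nsys profile python script.py). The first n_warmups calls are skipped and the next n_iters calls are recorded:

    import torch
    from torch_geometric.nn import GCNConv
    from torch_geometric.profile import nvtxit

    @nvtxit(name='gcn_forward', n_warmups=2, n_iters=10)
    def forward(model, x, edge_index):
        return model(x, edge_index)

    model = GCNConv(16, 32)
    x = torch.randn(100, 16)
    edge_index = torch.randint(0, 100, (2, 500))

    for _ in range(15):  # calls 1-2 warm up; calls 3-12 are recorded
        forward(model, x, edge_index)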