Profiling the executor

The DataExecutor exposes a profiler that you can use to measure its performance during a run.

The profiler currently supports the following activities:

| Activity | Description |
| --- | --- |
| total_samples | Total number of samples processed so far |
| total_time | Elapsed time so far, in seconds |
| rate | Client-side smoothed samples/second over all samples added since the last query |
| global_rate | Non-smoothed samples/second since the executor context was entered. For a more detailed explanation, see Measure throughput of your model |
| samples_per_sec | Non-smoothed samples/second since the executor context was entered. This value is the same as global_rate |
| flops_utilization | Real FLOPS utilization for the run |
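To make the difference between rate and global_rate concrete, here is a minimal standalone sketch. It is not the Cerebras implementation. It assumes that global_rate is total samples divided by total elapsed time, as the table describes. Purely for illustration, it models the smoothed rate as an exponential moving average of per-interval rates; the actual smoothing method is not specified here. The class name `ToyRateTracker`, its `add` method, and the smoothing factor `alpha` are hypothetical.

```python
class ToyRateTracker:
    """Illustrative tracker only; NOT the cstorch rate tracker."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha        # hypothetical smoothing factor (assumption)
        self.total_samples = 0    # samples seen so far
        self.total_time = 0.0     # elapsed seconds so far
        self.rate = 0.0           # smoothed samples/second

    def add(self, samples, seconds):
        """Record `samples` processed over an interval of `seconds`."""
        if seconds <= 0:
            raise ValueError("seconds must be positive")
        interval_rate = samples / seconds
        if self.total_time == 0.0:
            # First interval: nothing to smooth against yet.
            self.rate = interval_rate
        else:
            # Exponential moving average (assumed smoothing scheme).
            self.rate = self.alpha * interval_rate + (1 - self.alpha) * self.rate
        self.total_samples += samples
        self.total_time += seconds

    @property
    def global_rate(self):
        """Non-smoothed samples/second over the whole run."""
        if self.total_time == 0.0:
            return 0.0
        return self.total_samples / self.total_time

    @property
    def samples_per_sec(self):
        """Alias of global_rate, mirroring the table above."""
        return self.global_rate


tracker = ToyRateTracker(alpha=0.5)
tracker.add(100, 1.0)  # steady interval
tracker.add(100, 1.0)  # steady interval
tracker.add(400, 1.0)  # burst interval
print(f"Rate: {tracker.rate}")                # smoothed, reacts to the burst
print(f"Global rate: {tracker.global_rate}")  # average over the whole run
```

In this toy model the smoothed rate moves toward the most recent interval, while the global rate reflects the average over the whole run, which is why the two can differ mid-run.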

You can read each activity by name from the rate tracker of the DataExecutor profiler.

For example:

import cerebras.pytorch as cstorch

executor = cstorch.utils.data.DataExecutor(...)
...
# Read each activity from the profiler's rate tracker.
print(f"Total samples: {executor.profiler.rate_tracker.total_samples}")
print(f"Total time: {executor.profiler.rate_tracker.total_time}")
print(f"Rate: {executor.profiler.rate_tracker.rate}")
print(f"Global rate: {executor.profiler.rate_tracker.global_rate}")
print(f"Samples/sec: {executor.profiler.rate_tracker.samples_per_sec}")
print(f"Flops utilization: {executor.profiler.rate_tracker.flops_utilization}")
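If you log these values periodically, it can be convenient to gather them into a single dictionary. The helper below is a sketch under stated assumptions: `snapshot_metrics` and `METRIC_NAMES` are hypothetical names, not part of cstorch, and the helper relies only on the attribute names listed in the table above. It is demonstrated with a stand-in object holding placeholder values so that it runs without a Cerebras installation. In a real run you would pass executor.profiler.rate_tracker instead.

```python
from types import SimpleNamespace

# Attribute names taken from the activity table above.
METRIC_NAMES = (
    "total_samples",
    "total_time",
    "rate",
    "global_rate",
    "samples_per_sec",
    "flops_utilization",
)


def snapshot_metrics(tracker, names=METRIC_NAMES):
    """Return {name: value} for each name; missing attributes map to None."""
    return {name: getattr(tracker, name, None) for name in names}


# Stand-in object with placeholder values (not real measurements).
# flops_utilization is deliberately omitted to show the None fallback.
fake_tracker = SimpleNamespace(
    total_samples=1200,
    total_time=6.0,
    rate=210.0,
    global_rate=200.0,
    samples_per_sec=200.0,
)

metrics = snapshot_metrics(fake_tracker)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Using `getattr` with a default keeps the helper from raising if an activity is unavailable on the object it is given; whether that leniency is desirable depends on how strictly you want logging to fail.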