Performance Flags
The Cerebras PyTorch API provides a set of performance- and debugging-related flags. See Performance/Debug Flags for a comprehensive list of the available flags.
This page covers how to set these performance flags using the Trainer class.
Prerequisites
Make sure you have read Trainer Overview and Trainer Configuration Overview, which cover the basics of running Model Zoo models. This document uses the tools and configurations outlined in those pages.
Scoped Flags
While you can set the flags directly via the Cerebras PyTorch API, you will often want different flags for, say, training versus validation.
As such, we provide two callbacks to facilitate this:
- ScopedTrainFlags
- ScopedValidateFlags
With these callbacks, you can set different performance flags for training and validation.
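The idea of a scoped flag can be pictured with a small, self-contained sketch. This is not the Cerebras implementation: `FLAGS`, `scoped_flags`, and `current_micro_batch_size` are hypothetical names invented for illustration, and the sketch assumes that a scoped value applies only while its stage runs and that the previous value is restored afterwards.

```python
from contextlib import contextmanager

# Hypothetical stand-in for a global flag store; not the Cerebras API.
FLAGS = {"csx.performance.micro_batch_size": "auto"}


@contextmanager
def scoped_flags(**overrides):
    """Temporarily override entries in FLAGS, restoring them on exit."""
    # Remember the current values (raises KeyError for unknown flag names).
    previous = {name: FLAGS[name] for name in overrides}
    FLAGS.update(overrides)
    try:
        yield FLAGS
    finally:
        # Restore the old values even if the body raised an exception.
        FLAGS.update(previous)


def current_micro_batch_size():
    return FLAGS["csx.performance.micro_batch_size"]


# Stand-in for a "validation" stage that uses its own micro batch size.
with scoped_flags(**{"csx.performance.micro_batch_size": 2}):
    inside = current_micro_batch_size()  # override is visible here

outside = current_micro_batch_size()  # previous value is back in effect
```

In this sketch the override is confined to the `with` block, which is the behavior the word "scoped" suggests for the two callbacks.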
For example, one of the most important flags that you may want to set is the micro batch size (see working_with_microbatches for more details on micro batching).
While you could set it globally via the cerebras.pytorch.backends.csx.performance.micro_batch_size flag, you can set different micro batch sizes for training and validation as follows.
trainer:
  init:
    ...
    callbacks:
      - ScopedTrainFlags:
          csx.performance.micro_batch_size: auto
      - ScopedValidateFlags:
          csx.performance.micro_batch_size: 2
  ...

from cerebras.modelzoo import Trainer
from cerebras.modelzoo.trainer.callbacks import (
    ScopedTrainFlags,
    ScopedValidateFlags,
)

trainer = Trainer(
    ...,
    callbacks=[
        # Flags applied during training.
        ScopedTrainFlags(
            **{
                "csx.performance.micro_batch_size": "auto",
            }
        ),
        # Flags applied during validation.
        ScopedValidateFlags(
            **{
                "csx.performance.micro_batch_size": 2,
            }
        ),
    ]
)
...
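
The Python example passes the flags as an unpacked dictionary because a flag name such as csx.performance.micro_batch_size contains dots and so cannot be written as an ordinary keyword argument. The sketch below illustrates this with a hypothetical receiver, `collect_flags`, which is not part of the Cerebras API and only stands in for any callable that accepts arbitrary keyword arguments.

```python
def collect_flags(**flags):
    """Hypothetical receiver that accepts arbitrary flag names as keywords."""
    return dict(flags)


# collect_flags(csx.performance.micro_batch_size=2) would be a SyntaxError,
# because a keyword argument name cannot contain dots.
# Unpacking a dict with string keys passes the dotted name through unchanged.
flags = collect_flags(**{"csx.performance.micro_batch_size": 2})
```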
Conclusion
Setting performance flags through the Trainer helps you optimize and debug your training and validation runs. With the ScopedTrainFlags and ScopedValidateFlags callbacks, you can apply different settings to different stages of your workflow, such as distinct micro batch sizes for training and validation, so that each stage of a Model Zoo model run uses the configuration best suited to it.
Further Reading
To learn more about how you can extend the capabilities of the Trainer class, you can check out: