Jean Zay: TensorFlow and PyTorch profiling tools
Profiling is an indispensable step in code optimisation. Its goal is to identify the execution steps which are the most costly in time or memory, and to visualise the workload distribution between GPUs and CPUs.
Profiling an execution is itself a time-consuming operation. For this reason, it is generally done on only one or two iterations of your training steps.
The TensorFlow profiling tool
TensorFlow includes a profiling functionality called "TensorFlow Profiler".
The TensorFlow Profiler requires TensorFlow and TensorBoard versions 2.2 or higher. On Jean Zay, it is available in the TensorFlow 2.2.0 (or higher) versions by loading the suitable module. For example:
$ module load tensorflow-gpu/py3/2.2.0
Instrumentation of your TensorFlow code for profiling
The code must include a TensorBoard callback as explained on the page TensorBoard visualisation tool for TensorFlow and PyTorch.
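As an illustration, a minimal sketch of such an instrumentation is given below. The model, data, and log directory are hypothetical placeholders; only the profile_batch parameter of the Keras TensorBoard callback is specific to profiling (here it limits the profile to a single batch, in line with the advice above):

import tensorflow as tf

# Hypothetical minimal model and data, for illustration only
model = tf.keras.Sequential([tf.keras.layers.Dense(10, activation="softmax")])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

x = tf.random.normal((256, 32))
y = tf.random.uniform((256,), maxval=10, dtype=tf.int64)

# TensorBoard callback: profile_batch=2 profiles only the second batch
# (the default), keeping the profiled region down to a single training step
tb_callback = tf.keras.callbacks.TensorBoard(log_dir="logs/profile",
                                             profile_batch=2)

model.fit(x, y, batch_size=32, epochs=1, callbacks=[tb_callback])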
Visualisation of the TensorFlow code profile
Visualisation of the TensorFlow Profiler is possible via TensorBoard, in the PROFILE tab. Access to TensorBoard is described here.
The TensorBoard PROFILE tab opens onto the "Overview Page". This displays a summary of the execution-time performance of the different calculation steps, so you can quickly see whether it is the training, the data loading, or the data preprocessing which consumes the most execution time.
On the Trace Viewer page, you can view a more detailed description of the execution sequence, distinguishing between the operations executed on GPUs and on CPUs. For example:
In this example, we see that the GPUs (top) are idle most of the time compared to the CPUs (bottom). The blocks of colour show that the GPUs are only used at the end of each step, while the CPUs are used regularly on certain threads. An optimisation is certainly possible through a better distribution of work between GPUs and CPUs.
The PyTorch profiling tool
PyTorch includes a profiling functionality called "PyTorch Profiler".
Instrumentation of your PyTorch code for profiling
In the PyTorch code, you must:
- Import the profiler.
from torch.profiler import profile, record_function, ProfilerActivity
- Then, invoke the profiler as a context manager around the execution of the training function.
with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
             record_shapes=True) as prof:
    with record_function("training_function"):
        train()
Comments:
- ProfilerActivity.CUDA: allows recovering the CUDA events (linked to the GPUs).
- with record_function("$NAME"): allows adding a decorator (a tag associated with a name) to a block of functions. It is therefore also useful to add decorators in the training function for the sets of sub-functions. For example:
def train():
    for epoch in range(1, num_epochs+1):
        for i_step in range(1, total_step+1):
            # Obtain the batch.
            with record_function("load input batch"):
                images, captions = next(iter(data_loader))
            ...
            with record_function("Training step"):
                ...
                loss = criterion(outputs.view(-1, vocab_size), captions.view(-1))
                ...
- Beginning with PyTorch version 1.6.0, it is possible to profile the CPU and GPU memory footprint by adding the profile_memory=True argument to profile, as shown in the sketch below.
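For illustration, here is a minimal sketch of the previous call with memory profiling enabled; profile_memory is the actual PyTorch parameter, and "self_cpu_memory_usage" is one of the sort keys accepted by table():

with profile(activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
             record_shapes=True,
             profile_memory=True) as prof:  # profile_memory requires PyTorch >= 1.6.0
    with record_function("training_function"):
        train()

# Display the operations with the largest CPU memory footprint
print(prof.key_averages().table(sort_by="self_cpu_memory_usage", row_limit=10))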
Visualisation of the PyTorch code profile
It is not yet possible, in the current versions, to visualise the PyTorch Profiler results using the PROFILE functionality of TensorBoard. Below, we suggest other solutions documented in PyTorch.
Visualisation of a profiling table
To display the profile results after the training function has executed, run the following line:
print(prof.key_averages().table(sort_by="cpu_time", row_limit=10))
You will then obtain a list of all the functions tagged automatically, or tagged by yourself (via decorators), in descending order of total CPU time.
-------------------------------  ---------------  ------------  -------------  ---------------
Name                             Self CPU total   CPU total     CPU time avg   Number of Calls
-------------------------------  ---------------  ------------  -------------  ---------------
training_function                1.341s           62.089s       62.089s        1
load input batch                 57.357s          58.988s       14.747s        4
Training step                    1.177s           1.212s        303.103ms      4
EmbeddingBackward                51.355us         3.706s        231.632ms      16
embedding_backward               30.284us         3.706s        231.628ms      16
embedding_dense_backward         3.706s           3.706s        231.627ms      16
move to GPU                      5.967ms          546.398ms     136.599ms      4
stack                            760.467ms        760.467ms     95.058ms       8
BroadcastBackward                4.698ms          70.370ms      8.796ms        8
ReduceAddCoalesced               22.915ms         37.673ms      4.709ms        8
-------------------------------  ---------------  ------------  -------------  ---------------
The "Self CPU total" column corresponds to the time spent in the function itself, not in its sub-functions.
The "Number of Calls" column contains the number of times a function was called.
Here, we see that the image batch load time (load input batch) is much larger than the neural network training itself (Training step). Therefore, optimisation must target batch loading.
Visualisation of the profiling with the Chromium trace tool
To display a Trace Viewer equivalent to that of TensorBoard, you can also generate a JSON trace file with the following line:
prof.export_chrome_trace("trace.json")
This trace file is viewable with the Chromium project trace tool. From a Chrome (or Chromium) browser, enter the following address in the URL bar:
about:tracing
Here, we see the CPU and GPU usage distinctly, as with TensorBoard: the CPU functions are shown on top and the GPU functions on the bottom. There are 5 CPU tasks and 4 GPU tasks. Each block of colour represents a function or a sub-function. We are at the end of a data batch load, followed by a training iteration. In the lower part, we see the forward and backward passes and the synchronisation executed across multiple GPUs.
Official documentation
- Profiling on TensorBoard with TensorFlow: https://www.tensorflow.org/tensorboard/tensorboard_profiling_keras
- Profiling a PyTorch code: https://pytorch.org/tutorials/recipes/recipes/profiler.html, https://pytorch.org/docs/stable/autograd.html