Using Nsight Systems to profile a GPU workload

If I build PyTorch from source, cuDNN information shows up in the nsys traces.
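
For reference, here is a minimal sketch of how the profiling command could be assembled. It assumes `nsys` is on the PATH and that the workload lives in a script called `train.py`; that name, the `build_nsys_command` helper, and the report name are all hypothetical. The `--trace` list asks Nsight Systems for cuDNN and cuBLAS API tracing alongside CUDA and NVTX. Whether cuDNN rows actually appear may still depend on how PyTorch was built, as noted above.

```python
import shlex
import shutil
import sys


def build_nsys_command(script, output="pytorch_trace"):
    """Return an argv list that profiles `script` under Nsight Systems.

    `script` and `output` are illustrative placeholders, not values
    taken from the original post.
    """
    return [
        "nsys", "profile",
        "--trace=cuda,nvtx,cudnn,cublas",  # request cuDNN and cuBLAS API tracing
        "--output", output,                # base name of the report file
        sys.executable, script,            # run the script with the current interpreter
    ]


if __name__ == "__main__":
    cmd = build_nsys_command("train.py")  # hypothetical script name
    # Print the command so it can be copied into a shell.
    print(shlex.join(cmd))
    if shutil.which("nsys") is None:
        print("nsys not found on PATH; command printed only")
```

The helper only builds and prints the command, so it can be checked on a machine without a GPU. To run the profile, pass the list to `subprocess.run(cmd, check=True)`.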