If code in a .cu file calls CUDA Runtime API functions but contains no device code (no ‘__global__’ or ‘__device__’ functions and no kernel launches), rename the file to .cpp and compile it with the host compiler. Compilation gets faster, and the savings add up in a large project. […]
Learn how to use CUDA Graphs to make your application run faster and more efficiently. This video walkthrough shows you how to create CUDA Graphs with the stream capture method and with the explicit API method. Source code is included.
A 10-minute video on using nvprof from the command line.