Analyzing a Media Performance Trace Capture File
The media performance trace capture file generated with the Intel® GPA Monitor provides a system-wide picture of how your code works with Intel® Media SDK and Microsoft* DXVA2 and how media-related workloads execute on the GPU.
If your application is GPU-bound and the GPU is underutilized (less than approximately 90%):
- Create a media performance trace capture file. For details, see Creating a Media Performance Trace Capture File.
- Open it with the Intel® GPA Platform Analyzer.
- Zoom in and locate GPU idle periods, which appear as gaps in the gray GPU EU Queue and GPU MFX Queue tracks.
Queues become empty and the GPU goes idle when the application fails to submit the next operations into the queue while the GPU is still working on the current ones.
- Find the reason for the idle periods on the GPU and optimize your application accordingly:
  - If idle periods match a blocking function call on the CPU, such as the Intel® Media SDK function MFX_SyncOperation, rework the application to take advantage of asynchronous operation: submit further operations without waiting for the current operation to complete. Refer to the section Identifying Correlations between tasks on CPU and GPU to find out when operations were submitted by the application and when they were executed on the GPU. See the sample code from the Intel® Media SDK for examples of asynchronous design of a video processing pipeline.
  - If idle periods are caused by the Intel® Media SDK operations CopySystemToVideoMemory or CopyVideoToSystemMemory on the CPU, work with surfaces located in video memory, either Microsoft* DirectX* 9 surfaces or Intel® Media SDK opaque surfaces.
  - If idle periods are caused by other operations, such as audio processing and file operations, running in the same thread that submits video operations to the Intel® Media SDK or Microsoft* DirectX*, consider creating additional threads for better parallelization.
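The analysis described above can also be sketched in code. The following is a minimal Python sketch, assuming that the busy intervals of a GPU queue and the intervals of CPU calls have already been read off the capture as (start, end) times in milliseconds. The trace file format is not covered here, so the sample data, the timings, and the helper names (idle_gaps, utilization, calls_overlapping) are all hypothetical and only illustrate the reasoning: find the gaps in a queue track, compare utilization against the approximate 90% mark, and list the CPU calls that overlap each gap.

```python
from typing import List, Tuple

Interval = Tuple[float, float]          # (start_ms, end_ms)
CpuCall = Tuple[str, float, float]      # (name, start_ms, end_ms)

UNDERUTILIZED_BELOW = 0.90  # approximate threshold mentioned in the text


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def idle_gaps(gpu_tasks: List[Interval], window: Interval) -> List[Interval]:
    """Return the periods inside `window` during which the queue is empty."""
    w_start, w_end = window
    gaps: List[Interval] = []
    cursor = w_start
    for start, end in merge_intervals(gpu_tasks):
        start, end = max(start, w_start), min(end, w_end)
        if start >= end:
            continue  # task lies entirely outside the window
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < w_end:
        gaps.append((cursor, w_end))
    return gaps


def utilization(gpu_tasks: List[Interval], window: Interval) -> float:
    """Fraction of the window during which the queue is busy."""
    idle = sum(end - start for start, end in idle_gaps(gpu_tasks, window))
    return 1.0 - idle / (window[1] - window[0])


def calls_overlapping(gap: Interval, cpu_calls: List[CpuCall]) -> List[str]:
    """Names of CPU calls whose interval overlaps the given idle gap."""
    g_start, g_end = gap
    return [name for name, start, end in cpu_calls
            if start < g_end and end > g_start]


# Hypothetical data: a queue that goes idle for 10 ms after every 20 ms task.
WINDOW: Interval = (0, 90)
GPU_TASKS: List[Interval] = [(0, 20), (30, 50), (60, 80)]
CPU_CALLS: List[CpuCall] = [
    ("MFX_SyncOperation", 18, 24),
    ("CopySystemToVideoMemory", 24, 30),
    ("MFX_SyncOperation", 48, 54),
    ("CopySystemToVideoMemory", 54, 60),
]

if __name__ == "__main__":
    busy = utilization(GPU_TASKS, WINDOW)
    status = "underutilized" if busy < UNDERUTILIZED_BELOW else "ok"
    print(f"GPU queue utilization: {busy:.0%} ({status})")
    for gap in idle_gaps(GPU_TASKS, WINDOW):
        names = calls_overlapping(gap, CPU_CALLS) or ["no CPU call recorded"]
        print(f"idle {gap[0]}-{gap[1]} ms overlaps: {', '.join(names)}")
```

In this made-up data, the gaps that overlap a blocking call or a memory copy point to the first two cases in the list above, while a gap with no recorded CPU call would suggest looking at other work done in the submitting thread.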
See Also
Creating a Media Performance Trace Capture File
Loading a Trace Capture File
Monitoring Media Performance Metrics in Real Time
Introduction to the Intel® GPA Media Performance Analyzer
Analyzing Real-time Media Performance Metrics