The SLM (Shared Local Memory) Writes metric represents the number of bytes written to shared local memory from within compute shaders. Shared local memory is allocated by adding the groupshared storage-class modifier to an HLSL variable. This memory is then shared among all threads in a thread group. Each thread has write access only to its own region of the shared local memory, as indexed by SV_GroupIndex.
On Intel® HD Graphics 2500/4000: to access this metric, you must explicitly enable the Intel® Graphics Performance Analyzers option in your BIOS settings:
Select System Agent (SA) Configuration
Select Graphics Configuration
Enable the Intel® Graphics Performance Analyzers option
Reboot your machine
If the BIOS on your system does not include the Intel® Graphics Performance Analyzers option, update your BIOS to the latest version from Intel. After completing your performance monitoring activity, we recommend that you disable the Intel® Graphics Performance Analyzers BIOS option and reboot your machine.
A code example that results in SLM writes:
groupshared int my_shared_local_memory[NUM_THREADS_PER_GROUP];
[numthreads(NUM_THREADS_PER_GROUP, 1, 1)]
void MyComputeShader(…, uint grp_idx : SV_GroupIndex)
{
    // Each thread writes one int to its own slot in shared local memory.
    my_shared_local_memory[grp_idx] = 10;
}
Writes to shared local memory are much faster than writes to global buffers. In the typical usage model, each thread first initializes its portion of the shared memory buffer. Threads then execute based on data read from across the shared memory buffer and write their results back out to a global buffer. This model performs better than having each thread do multiple reads and writes to global buffers.
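The usage model described above might look roughly like the following sketch. The buffer names g_input and g_output, their register bindings, the group size of 64, and the neighbor-sum computation are all assumptions made for illustration; they are not part of the original example. The sketch also assumes a GroupMemoryBarrierWithGroupSync() call between the write and read phases, so that no thread reads a slot before its owner has written it.

```hlsl
// Illustrative sketch only: names, bindings, and group size are assumptions.
#define NUM_THREADS_PER_GROUP 64

StructuredBuffer<int>   g_input  : register(t0);  // hypothetical global input buffer
RWStructuredBuffer<int> g_output : register(u0);  // hypothetical global output buffer

groupshared int my_shared_local_memory[NUM_THREADS_PER_GROUP];

[numthreads(NUM_THREADS_PER_GROUP, 1, 1)]
void MyComputeShader(uint3 dispatch_id : SV_DispatchThreadID,
                     uint  grp_idx     : SV_GroupIndex)
{
    // 1. Each thread initializes its own slot (this is the SLM write).
    my_shared_local_memory[grp_idx] = g_input[dispatch_id.x];

    // 2. Wait until every thread in the group has completed its write.
    GroupMemoryBarrierWithGroupSync();

    // 3. Read from across the shared buffer: own value plus the next thread's.
    uint neighbor = (grp_idx + 1) % NUM_THREADS_PER_GROUP;
    int  result   = my_shared_local_memory[grp_idx]
                  + my_shared_local_memory[neighbor];

    // 4. Write the result back to the global buffer once.
    g_output[dispatch_id.x] = result;
}
```

In this sketch each thread touches the global buffers only twice (one read, one write), while the neighbor lookup is served from shared local memory.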
This metric is always a multiple of 64 because Intel® HD Graphics 2500/4000 handles SLM write transactions in terms of 64-byte cache lines.