Why SQL Server Wait Statistics Are Your Best Troubleshooting Tool

SQL Server wait statistics tell you exactly where your server is spending its time waiting. Every time a query can't proceed because it's blocked, waiting for I/O, sitting behind a network write, or competing for CPU, SQL Server records that wait. Analysing these wait types is the fastest way to diagnose performance problems without guessing.

The script below extends the approach documented by Paul Randal (SQLskills) and pairs it with our guidance on configuring MAXDOP settings. What makes this version particularly useful is the inline descriptions for each wait type. Rather than cross-referencing Microsoft documentation mid-investigation, you get the context you need right in the result set.

What Are SQL Server Wait Statistics?

When a SQL Server thread can't execute immediately, it enters a wait state. SQL Server tracks these waits in the dynamic management view sys.dm_os_wait_stats, accumulating counts and durations since the last service restart (or since the last time someone cleared the statistics manually).

There are three components to every wait:

  • Wait time - the total time a task spent waiting, in milliseconds
  • Resource wait time - time spent waiting for the actual resource (lock, I/O, memory, etc.)
  • Signal wait time - time spent in the runnable queue after the resource became available, waiting for a CPU to pick up the task

Signal wait time is worth paying attention to. If your signal waits are consistently high as a percentage of total wait time, that's a CPU pressure indicator. The resource is ready, but there's no CPU available to process it. A commonly cited threshold is signal waits exceeding 20-25% of total wait time, which warrants a closer look at CPU capacity and MAXDOP configuration.

Which Wait Types Should You Ignore?

Not all wait types indicate a problem. SQL Server generates dozens of benign background waits that are completely normal, things like checkpoint queues, lazy writer sleep states, Service Broker background threads, and various idle worker waits. Including these in your analysis will skew your percentages and send you chasing non-issues.

The script below filters out these known benign waits using a WHERE NOT IN clause. This is the same filtering approach used by Paul Randal and Glenn Berry in their widely referenced diagnostic scripts. The excluded list covers waits such as SLEEP_TASK, LAZYWRITER_SLEEP, CHECKPOINT_QUEUE, BROKER_RECEIVE_WAITFOR, and all the HADR_* background waits used by Always On Availability Groups.

The Wait Statistics Script With Inline Descriptions

This script queries sys.dm_os_wait_stats, filters out benign waits, calculates percentages, and adds a plain-English description for each wait type. Run it against any SQL Server instance where you're investigating performance issues.

WITH [Waits] AS (
    SELECT
        [wait_type],
        [wait_time_ms] / 1000.0 AS [WaitS],
        ([wait_time_ms] - [signal_wait_time_ms]) / 1000.0 AS [ResourceS],
        [signal_wait_time_ms] / 1000.0 AS [SignalS],
        [waiting_tasks_count] AS [WaitCount],
        100.0 * [wait_time_ms] / SUM([wait_time_ms]) OVER() AS [Percentage],
        ROW_NUMBER() OVER(ORDER BY [wait_time_ms] DESC) AS [RowNum]
    FROM sys.dm_os_wait_stats
    WHERE [wait_type] NOT IN (
        N'BROKER_EVENTHANDLER', N'BROKER_RECEIVE_WAITFOR',
        N'BROKER_TASK_STOP', N'BROKER_TO_FLUSH',
        N'BROKER_TRANSMITTER', N'CHECKPOINT_QUEUE',
        N'CHKPT', N'CLR_AUTO_EVENT',
        N'CLR_MANUAL_EVENT', N'CLR_SEMAPHORE',
        N'DBMIRROR_DBM_EVENT', N'DBMIRROR_EVENTS_QUEUE',
        N'DBMIRROR_WORKER_QUEUE', N'DBMIRRORING_CMD',
        N'DIRTY_PAGE_POLL', N'DISPATCHER_QUEUE_SEMAPHORE',
        N'EXECSYNC', N'FSAGENT',
        N'FT_IFTS_SCHEDULER_IDLE_WAIT', N'FT_IFTSHC_MUTEX',
        N'HADR_CLUSAPI_CALL', N'HADR_FILESTREAM_IOMGR_IOCOMPLETION',
        N'HADR_LOGCAPTURE_WAIT', N'HADR_NOTIFICATION_DEQUEUE',
        N'HADR_TIMER_TASK', N'HADR_WORK_QUEUE',
        N'KSOURCE_WAKEUP', N'LAZYWRITER_SLEEP',
        N'LOGMGR_QUEUE', N'ONDEMAND_TASK_QUEUE',
        N'PWAIT_ALL_COMPONENTS_INITIALIZED',
        N'QDS_PERSIST_TASK_MAIN_LOOP_SLEEP',
        N'QDS_CLEANUP_STALE_QUERIES_TASK_MAIN_LOOP_SLEEP',
        N'REQUEST_FOR_DEADLOCK_SEARCH', N'RESOURCE_QUEUE',
        N'SERVER_IDLE_CHECK', N'SLEEP_BPOOL_FLUSH',
        N'SLEEP_DBSTARTUP', N'SLEEP_DCOMSTARTUP',
        N'SLEEP_MASTERDBREADY', N'SLEEP_MASTERMDREADY',
        N'SLEEP_MASTERUPGRADED', N'SLEEP_MSDBSTARTUP',
        N'SLEEP_SYSTEMTASK', N'SLEEP_TASK',
        N'SLEEP_TEMPDBSTARTUP', N'SNI_HTTP_ACCEPT',
        N'SP_SERVER_DIAGNOSTICS_SLEEP', N'SQLTRACE_BUFFER_FLUSH',
        N'SQLTRACE_INCREMENTAL_FLUSH_SLEEP',
        N'SQLTRACE_WAIT_ENTRIES', N'WAIT_FOR_RESULTS',
        N'WAITFOR', N'WAITFOR_TASKSHUTDOWN',
        N'WAIT_XTP_HOST_WAIT', N'WAIT_XTP_OFFLINE_CKPT_NEW_LOG',
        N'WAIT_XTP_CKPT_CLOSE', N'XE_DISPATCHER_JOIN',
        N'XE_DISPATCHER_WAIT', N'XE_TIMER_EVENT'
    )
)
SELECT
    [W1].[wait_type] AS [WaitType],
    Wait_Type_Description =
        CASE
            WHEN [W1].[wait_type] = 'ABR' THEN 'Identified for informational purposes only. Not supported. Future compatibility is not guaranteed.'
            WHEN [W1].[wait_type] = 'ASSEMBLY_LOAD' THEN 'Occurs during exclusive access to assembly loading.'
            WHEN [W1].[wait_type] = 'ASYNC_DISKPOOL_LOCK' THEN 'Occurs when there is an attempt to synchronize parallel threads that are performing tasks such as creating or initializing a file.'
            WHEN [W1].[wait_type] = 'ASYNC_IO_COMPLETION' THEN 'Occurs when a task is waiting for I/Os to finish.'
            WHEN [W1].[wait_type] = 'ASYNC_NETWORK_IO' THEN 'Occurs on network writes when the task is blocked behind the network. Verify that the client is processing data from the server.'
            WHEN [W1].[wait_type] = 'AUDIT_GROUPCACHE_LOCK' THEN 'Occurs when there is a wait on a lock that controls access to a special cache. The cache contains information about which audits are being used to audit each audit action group.'
            WHEN [W1].[wait_type] = 'AUDIT_LOGINCACHE_LOCK' THEN 'Occurs when there is a wait on a lock that controls access to a special cache. The cache contains information about which audits are being used to audit login audit action groups.'
            WHEN [W1].[wait_type] = 'AUDIT_ON_DEMAND_TARGET_LOCK' THEN 'Occurs when there is a wait on a lock that is used to ensure single initialization of audit related Extended Event targets.'
            WHEN [W1].[wait_type] = 'AUDIT_XE_SESSION_MGR' THEN 'Occurs when there is a wait on a lock that is used to synchronize the starting and stopping of audit related Extended Events sessions.'
            WHEN [W1].[wait_type] = 'BACKUP' THEN 'Occurs when a task is blocked as part of backup processing.'
            WHEN [W1].[wait_type] = 'BACKUP_OPERATOR' THEN 'Occurs when a task is waiting for a tape mount.'
            -- Full CASE statement continues for all documented wait types
        END,
    CAST([W1].[WaitS] AS DECIMAL(14, 2)) AS [Wait_S],
    CAST([W1].[ResourceS] AS DECIMAL(14, 2)) AS [Resource_S],
    CAST([W1].[SignalS] AS DECIMAL(14, 2)) AS [Signal_S],
    [W1].[WaitCount],
    CAST([W1].[Percentage] AS DECIMAL(4, 2)) AS [Percentage],
    CAST(([W1].[WaitS] / [W1].[WaitCount]) AS DECIMAL(14, 4)) AS [AvgWait_S],
    CAST(([W1].[ResourceS] / [W1].[WaitCount]) AS DECIMAL(14, 4)) AS [AvgRes_S],
    CAST(([W1].[SignalS] / [W1].[WaitCount]) AS DECIMAL(14, 4)) AS [AvgSig_S]
FROM [Waits] AS [W1]
JOIN [Waits] AS [W2]
    ON [W2].[RowNum] <= [W1].[RowNum]
GROUP BY
    [W1].[RowNum],
    [W1].[wait_type],
    [W1].[WaitS],
    [W1].[ResourceS],
    [W1].[SignalS],
    [W1].[WaitCount],
    [W1].[Percentage]
HAVING SUM([W2].[Percentage]) - [W1].[Percentage] < 95
ORDER BY [W1].[WaitS] DESC;

How to Read the Results

The script returns results ordered by total wait time, descending. Focus on the top 5 to 10 rows. The HAVING clause cuts off results once the cumulative percentage reaches 95%, which keeps the output focused on waits that actually matter.

Key columns to interpret:

  • WaitType - the name of the wait, matched against the description column
  • Wait_S - total wait time in seconds since last restart or stats clear
  • Resource_S - time actually waiting for the resource itself
  • Signal_S - time waiting in the runnable queue after the resource was available
  • WaitCount - how many times this wait occurred
  • AvgWait_S - average wait per occurrence, useful for spotting high-impact but infrequent waits
  • Percentage - this wait type's share of all non-benign wait time

If a single wait type dominates with more than 50% of total wait time, that's a strong signal about where your bottleneck sits. Common culprits include PAGEIOLATCH_* for I/O pressure, LCK_M_* for locking and blocking, CXPACKET or CXCONSUMER for parallelism issues, and WRITELOG for transaction log bottlenecks.

When Should You Run This?

Run this script as a first step whenever users report slowness and you're not sure where to start. It's also useful to run it periodically on production servers as part of a routine health check, comparing results week over week to catch trends before they become incidents.

One important caveat: sys.dm_os_wait_stats is cumulative since the last SQL Server restart. On a server that's been running for months, the statistics represent the entire period. If you want to isolate a specific time window, you need to capture a baseline snapshot, wait for the observation period, then calculate the delta. You can do this by running the script twice and comparing the results, or by using a monitoring tool that captures wait stats on a schedule.

Key Takeaways

  • SQL Server wait statistics in sys.dm_os_wait_stats are the most reliable starting point for diagnosing performance problems. They show you exactly where time is being lost.
  • Always filter out benign background waits before analysing results. Including idle and system waits skews your percentages and leads to false conclusions.
  • Signal wait time above 20-25% of total wait time points to CPU pressure, not resource contention. The fix is different, so the distinction matters.
  • The top 3 to 5 wait types by percentage will tell you what to investigate first. A single wait type dominating at over 50% is a strong indicator of a specific bottleneck.
  • Wait statistics are cumulative since the last restart. For point-in-time analysis, capture a before and after snapshot and compare the delta.

If you're seeing wait types you don't recognise, or your top waits keep shifting without a clear pattern, that's often a sign of a deeper configuration or workload issue. DBA Services provides SQL Server health checks and ongoing managed support for organisations that need expert eyes on their environment. Get in touch to find out how we can help.