I use dbWarden alerts to tell me when a SQL Agent job is taking longer than normal to complete, and I got one this morning. Looking at the history, I noticed this purge job had been taking gradually longer to complete each day since I installed it again (see my last post on this):
| RunDate | RunTime | Duration | ExecutionStatus | JobName |
|---|---|---|---|---|
| 04/10/2013 | 02:00:01 | 00:00:45 | Succeded | mdw_purge_data_[MDW] |
| 05/10/2013 | 02:00:00 | 00:13:27 | Succeded | mdw_purge_data_[MDW] |
| 06/10/2013 | 02:00:00 | 00:17:03 | Succeded | mdw_purge_data_[MDW] |
| 07/10/2013 | 02:00:01 | 00:30:58 | Succeded | mdw_purge_data_[MDW] |
| 08/10/2013 | 02:00:01 | 01:04:25 | Succeded | mdw_purge_data_[MDW] |
| 09/10/2013 | 02:00:01 | 01:41:39 | Succeded | mdw_purge_data_[MDW] |
| 10/10/2013 | 02:00:01 | 02:35:17 | Succeded | mdw_purge_data_[MDW] |
| 11/10/2013 | 02:00:00 | 03:34:21 | Succeded | mdw_purge_data_[MDW] |
| 12/10/2013 | 02:00:01 | 04:12:51 | Succeded | mdw_purge_data_[MDW] |
| 13/10/2013 | 02:00:01 | 04:54:04 | Succeded | mdw_purge_data_[MDW] |
| 14/10/2013 | 02:00:00 | 05:38:58 | Succeded | mdw_purge_data_[MDW] |
| 15/10/2013 | 02:00:01 | 07:07:13 | Succeded | mdw_purge_data_[MDW] |
| 16/10/2013 | 02:00:00 | 08:21:11 | Cancelled | mdw_purge_data_[MDW] |
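The durations above come from the SQL Agent job history, so you can pull the same numbers yourself straight from msdb if you want to chart the trend. A rough sketch against the standard msdb tables (nothing dbWarden-specific):

```sql
-- Pull the run history for the purge job from the SQL Agent history tables
-- (run_date is YYYYMMDD and run_duration is HHMMSS, both stored as integers).
SELECT  j.name          AS JobName,
        h.run_date,
        h.run_time,
        h.run_duration,
        CASE h.run_status
            WHEN 0 THEN 'Failed'
            WHEN 1 THEN 'Succeeded'
            WHEN 3 THEN 'Cancelled'
            ELSE 'Other'
        END             AS ExecutionStatus
FROM    msdb.dbo.sysjobs       AS j
JOIN    msdb.dbo.sysjobhistory AS h
        ON h.job_id = j.job_id
WHERE   j.name = N'mdw_purge_data_[MDW]'
  AND   h.step_id = 0        -- job outcome rows only, not individual steps
ORDER BY h.run_date, h.run_time;
```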
I cancelled the last run so I could figure out what was going on. A quick Google took me to this page, which recommended adding a couple of indexes. Before I did so, I took a look at the size of both the tables mentioned in the article:
```sql
sp_spaceused '[snapshots].[query_stats]'
```

| name | rows | reserved | data | index_size | unused |
|---|---|---|---|---|---|
| query_stats | 5,702,294 | 599072 KB | 587616 KB | 11320 KB | 136 KB |
```sql
sp_spaceused '[snapshots].[notable_query_text]'
```

| name | rows | reserved | data | index_size | unused |
|---|---|---|---|---|---|
| notable_query_text | 26,383 | 34872 KB | 33616 KB | 272 KB | 984 KB |
...and at the execution plan of the following query before I made any changes:

```sql
SELECT qt.[sql_handle]
INTO   #tmp_notable_query_text_test
FROM   snapshots.notable_query_text AS qt
WHERE  NOT EXISTS (
           SELECT snapshot_id
           FROM   snapshots.query_stats AS qs
           WHERE  qs.[sql_handle] = qt.[sql_handle])
```
Execution Plan Before
You can clearly see that the scan of CIDX_query_stats is by far the most expensive operation in the query. So I applied the indexes recommended in the article and looked at the execution plan again:
```sql
USE [MDW]
GO

SET ANSI_PADDING ON
GO

CREATE NONCLUSTERED INDEX [IX_sql_handle]
    ON [snapshots].[query_stats] ([sql_handle] ASC)
GO

CREATE NONCLUSTERED INDEX [IX_sql_handle]
    ON [snapshots].[notable_query_text] ([sql_handle] ASC)
GO
```
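One small caveat: if you might re-run the script, or roll it out to more than one MDW instance, it is worth guarding each CREATE INDEX with an existence check. A minimal sketch, using the same index name as above:

```sql
-- Only create the index if it doesn't already exist (same name as the script above).
IF NOT EXISTS (SELECT 1
               FROM   sys.indexes
               WHERE  name = N'IX_sql_handle'
                 AND  object_id = OBJECT_ID(N'[snapshots].[query_stats]'))
BEGIN
    CREATE NONCLUSTERED INDEX [IX_sql_handle]
        ON [snapshots].[query_stats] ([sql_handle] ASC);
END
GO
```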
Execution Plan After
Various changes have now been made to the query plan, including a Nested Loops operator replacing the Hash Match and an Index Seek replacing the Index Scan. I ran the job manually and it completed much faster:
| RunDate | RunTime | Duration | ExecutionStatus | JobName |
|---|---|---|---|---|
| 16/10/2013 | 10:41:56 | 00:05:32 | Succeded | mdw_purge_data_[MDW] |
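(For reference, the manual run is just a case of starting the agent job yourself rather than waiting for the 02:00 schedule:)

```sql
-- Start the MDW purge job on demand instead of waiting for its schedule.
EXEC msdb.dbo.sp_start_job @job_name = N'mdw_purge_data_[MDW]';
```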
But much of the hard work would already have been done by the previous night's run, so I will need to wait until tonight to see whether the time really has been reduced.
**Update**
It looks like the indexes have done their job. The duration has greatly improved:
| RunDate | RunTime | Duration | ExecutionStatus | JobName |
|---|---|---|---|---|
| 17/10/2013 | 02:00:00 | 00:21:00 | Succeded | mdw_purge_data_[MDW] |
A word of warning
Even though the MDW has been active for only two weeks and I
am monitoring only one medium-busy server, nearly 14GB of data has been
generated:
*(screenshot of the MDW database size)*
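If you want to keep an eye on that growth without opening a report, a quick query against sys.master_files does the job (file size is stored in 8 KB pages, hence the conversion to MB):

```sql
-- Current size of the MDW data and log files, in MB (size is in 8 KB pages).
SELECT  name            AS FileName,
        type_desc       AS FileType,
        size * 8 / 1024 AS SizeMB
FROM    sys.master_files
WHERE   database_id = DB_ID(N'MDW');
```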