Friday, March 22, 2013

Which of Your Stored Procedures are Using the Most Resources?

When you start to look at tuning up your stored procedures where do you start?  If your database has only a few stored procedures, it might be fairly simple to figure this out by doing a quick test to determine which stored procedure is using the most resources.  But what if you have hundreds of them? How do you identify those that are using the bulk of the resources? In this article I will show you how to quickly identify the worst performing stored procedures on your server.

Getting Metrics Related to Stored Procedure Executions

How do you set about getting figures for the time taken by stored procedures, or the amount of resources used in their execution? There is more than one way to discover these execution metrics.  I am not going to mention all the different methods: Heck, I probably don’t even know all the different ways! Instead, I will focus in on how to use Dynamic Management Views and Function (DMVs and DMFs) to identify resources used by stored procedures.  DMVs and DMFs were introduced in SQL Server 2005 and then enhanced with the introduction SQL Server 2008.   I am sorry to say that, if you are running on a SQL Server version prior to 2005, then this article will not help you in identifying your worst performing stored procedures.

I’m going to show you two different DMV methods to get stored procedure execution metrics.  One method works in both SQL Server 2005 and SQL Server 2008, while the other method is only available in SQL Server 2008.  Let me explain both of these methods.

Before I show you these two methods, first let me explain a little about how SQL Server collects the DMV metrics for stored procedure executions.    These two different DMVs are able to obtain these figures only for those stored procedures that have execution plans in the procedure cache.  The SQL Server engine starts gathering information about the performance of a stored procedure the first time that  the plan is placed in the procedure cache.  As long as the stored procedure’s execution plan stays in the cache, SQL Server keeps updating the performance figures every time the stored procedure is executed. If the execution plan for a stored procedure is dropped from the procedure cache then the metrics for the stored procedure are also removed.  Therefore you will only be able to obtain metrics for those stored procedures that have cached plans, and those metrics will contain a summarization of the amount of resources used since the stored procedure was compiled, and its’ execution plan was placed in the procedure cache.

sys.dm_exec_procedure_stats

The first DMV that I’d like to discuss is sys.dm_exec_procedure_stats.  This DMV was introduced with SQL Server 2008.   By using this DMV, you can returns metrics for stored procedures, but it does not directly identify the stored procedure by name.  Instead, the DMV only identifies the object_id, and a database_id for each stored procedure:  Therefore to identify the actual stored procedure name, that the performance figures belong to, you should either join the output of this DMV with one of the system views within the appropriate database, or use a few metadata functions.  Here is an example of  how you might do that:

SELECT CASE WHEN database_id = 32767 then ‘Resource’ ELSE DB_NAME(database_id)END AS DBName

      ,OBJECT_SCHEMA_NAME(object_id,database_id) AS [SCHEMA_NAME] 

      ,OBJECT_NAME(object_id,database_id)AS [OBJECT_NAME] 

      ,*  FROM sys.dm_exec_procedure_stats 

You can see, by looking at this query, that I used the “database_id” and the “object_id” columns from the sys.dm_exec_procedure_stats and then used a couple of metadata functions to identify the name of the database, the schema_name, and the object name for each record returned from this DMV.  Run this query on one of your SQL Server instances and see what comes back.  Let me explore a couple of key columns values returned by this view, and explain what they mean and how you can use them. 

First, let’s look at the “cached_time” column.  This column contains the time in which the stored procedure was compiled and placed in the procedure cache.  You can use this column to identify how long the database engine has been collecting metrics for the identified stored procedure.  This is very useful when comparing the amount of resources that one stored procedure has consumed over another.   By knowing how long SQL Server has been collecting metrics will help you determine how skewed the resource usage metrics might be when comparing two different stored procedures.

The next column you might be interested in looking at is the “execution_count” .  This column identifies the number of times the stored procedure has been executed since it has been placed in the procedure cache.  When you are looking for tuning opportunities you need to consider tuning those store procedures that have been executed many times.  Making small performance gains in a stored procedure that has been executed 100,000 times might be much more beneficial than speeding up a stored procedure two fold if it has only been executed a few timea.   To understand what each column value represents in this DMV I would recommend you refer to Books Online

When I look at stored procedure metrics, I like to identify the average amount of resources used for a single execution of a stored procedure.   Calculating the average resources usage per execution allows you to more easily compare one stored procedure to another. To calculate these average resource usage numbers, I divide the total used number by the “execution_count” column.   Here is some code  where I used this technique to calculate average resources usage numbers for stored procedures that have metrics on my system:

SELECT CASE WHEN database_id = 32767 then ‘Resource’ ELSE DB_NAME(database_id)END AS DBName

      ,OBJECT_SCHEMA_NAME(object_id,database_id) AS [SCHEMA_NAME] 

      ,OBJECT_NAME(object_id,database_id)AS [OBJECT_NAME]

      ,cached_time

      ,last_execution_time

      ,execution_count

      ,total_worker_time / execution_count AS AVG_CPU

      ,total_elapsed_time / execution_count AS AVG_ELAPSED

      ,total_logical_reads / execution_count AS AVG_LOGICAL_READS

      ,total_logical_writes / execution_count AS AVG_LOGICAL_WRITES

      ,total_physical_reads  / execution_count AS AVG_PHYSICAL_READS

FROM sys.dm_exec_procedure_stats 

ORDER BY AVG_LOGICAL_READS DESC

Here I have ordered the results by the average logical reads per execution of a stored procedure.  By doing this I can tell which SP has performed the most logical I/O’s per execution.   When you run this query on your system, remember to keep an eye on the “execution_count” column.  This will help you identify those stored procedures that are executed frequently. By modifying the query above and ordering the results based on a different column you can compare each stored procedure based on different average resource usage.   Keep in mind when you are looking out the output from different queries, it might not be as beneficial to tune a procedure that is only executed once, over a procedure that is executed many time. 

If you don’t have SQL Server 2008, that doesn’t mean you are out of luck at finding stored procedure metrics.  In the next section I will show you how to get the similar results as the above query using DMVs and DMFs that are available in both SQL Server 2008 and SQL Server 2005.

sys.dm_exec_query_stats

The sys.dm_exec_query_stats DMV can be used to obtain statement level metrics.  Once again, these metrics are only available if the statement comes from a cached plan.  By using this DMV and summarizing the statement metrics up to the plan level, you are able to identify metrics for stored procedures.   The code below can be used in SQL Server 2005 and SQL Server 2008 to produce similar results as I did above using the sys.dm_exec_procedure_stats DMV method:

SELECT CASE WHEN dbid = 32767 then ‘Resource’ ELSE DB_NAME(dbid)END AS DBName

      ,OBJECT_SCHEMA_NAME(objectid,dbid) AS [SCHEMA_NAME] 

      ,OBJECT_NAME(objectid,dbid)AS [OBJECT_NAME]

      ,MAX(qs.creation_time) AS ‘cache_time’

      ,MAX(last_execution_time) AS ‘last_execution_time’

      ,MAX(usecounts) AS [execution_count]

      ,SUM(total_worker_time) / SUM(usecounts) AS AVG_CPU

      ,SUM(total_elapsed_time) / SUM(usecounts) AS AVG_ELAPSED

      ,SUM(total_logical_reads) / SUM(usecounts) AS AVG_LOGICAL_READS

      ,SUM(total_logical_writes) / SUM(usecounts) AS AVG_LOGICAL_WRITES

      ,SUM(total_physical_reads) / SUM(usecounts)AS AVG_PHYSICAL_READS       

FROM sys.dm_exec_query_stats qs 

   join sys.dm_exec_cached_plans cp on qs.plan_handle = cp.plan_handle

   CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)

WHERE objtype = ‘Proc’

  AND text

       NOT LIKE ‘%CREATE FUNC%’

       GROUP BY cp.plan_handle,DBID,objectid 

One thing to note about this code, the “last_execution_time”,  “AVG_CPU” and “AVG_ELAPSED” will have a slightly different values then using the sys.dm_exec_procedure_stats method.  It makes sense that these numbers will not be exactly the same since these values are calculated by rolling up the individual statement metrics to calculate the totals at the stored procedure level.  If you look closely at the different values produced by these two methods you will find a lower CPU and elapsed time value is reported using this method, and the last execution date will be a little close to the current time.   

Getting Stored Procedure Metrics

When doing performance analysis of your system it is good to be able to find those stored procedures that are using the most resources.  By know this information you can focus in and tune those stored procedures that provide you the biggest performance improvement for your effort.  Next time you are wondering which stored procedures are using the most resource you will now have a way to identify that regardless of whether you are  monitoring SQL Server 2005 or SQL Server 2008.

 

 

No comments:

Post a Comment