SQL Server User Defined Aggregates

March 5th, 2009 by Adi S.
Statistical calculations are quite important in the implementation of financial applications. Surprisingly SQL Server does not support many unless the process is running within the Analysis Services framework. Few months ago I was in a need of calculating a median in several stored procedures performing statistical risk calculations. While I could have used SQL approach like shown in the listing 1 below I would have to replicate the same piece of code in numerous stored procedures.
 
Listing 1: Median using T-SQL and SQL 2005 CTE, ranking
WITH data AS
(
    SELECT someID, someValue,
    ROW_NUMBER() OVER(PARTITION BY someID ORDER BY someValue) AS RowNum,
    COUNT(*) OVER(PARTITION BY someID) AS NumRows
    FROM dbo.someTable
)
SELECT someID, AVG(someValue) AS Median
FROM data
WHERE RowNum IN ((NumRows + 1) / 2, (NumRows + 2) / 2)
GROUP BY someID;
 
Instead I decided to write my own median function using CLR that can be used similar to the AVG built-in aggregate. Visual Studio makes things easy by providing a convenient template to get you started. In order to create an aggregate CLR function you first create a SQL Server Project and then add a new item/class based on the Aggregate template as shown below.
 
NewDBProject.JPG   NewAggregate.JPG
 
There are four methods listed below that the aggregate requires. The Init() method is called once per group and we can use it to instantiate our own variables, such as an array holding all of the values we are looking to find a median for. The Accumulate() method is called on every value addition of the group (think grouped records in a table) and Terminate() returns the aggregate value once the group aggregation completes. The Merge() method is used by the SQL engine itself which might decide to split the aggregation work to multiple threads and merge the results of which prior to executing Terminate() and returning the result.
 
public void Init()
public void Accumulate(double Value)
public void Merge(Median Group)
public double Terminate()
 
The Merge() method is especially interesting and will lead to further discussion in part 2 of this blog. If you are interested in my own implementation of percentile/median function please contact me via “Send a message” on my Spaces page and I would be happy to share my experiences with you in regards to this or any other user defined function.

Tags: , ,

One Response to “SQL Server User Defined Aggregates”

  1. Good dispatch and this post helped me alot in my college assignement. Thanks you on your information.

Leave a Reply