I wrote two blogs about Skewness and Kurtosis, one with T-SQL and one with C# UDA solution. I claimed the CLR solution performs better, because it needs only one pass through the data, and the T-SQL one needs two passes. This is correct, considering my solutions, the CLR one is twice as fast as the T_SQL one. But…

Steve Kass, and extraordinary SQL Server MVP (take a look at his site), has turned my attention on the fact I can use the same technique as I used for the UDA in T-SQL as well. And as always when he makes a remark, he is correct. Bellow you can see all three Skewness solutions together.

— Skewness

SELECT Skew = SUM(((Age*1.0m.mean)/m.[StDev])*((Age*1.0m.mean)/m.[StDev])*((Age*1.0m.mean)/m.[StDev])) * MIN(m.corrfact1)

FROM vTargetMail v CROSS JOIN

(SELECT mean = AVG(Age*1.0), [StDev] = STDEV(Age),

corrfact1 = COUNT(*)*1.0 / (COUNT(*)-1) / (COUNT(*)-2)

FROM vTargetMail v) AS

— Steve Kass

SELECT

(rx3 3*rx2*av + 3*rx*av*av rn*av*av*av)

/ (stdv*stdv*stdv) * rn / (rn1) / (rn2)

FROM (

SELECT

SUM(Age) AS rx,

SUM(Age*Age) AS rx2,

SUM(Age*Age*Age) AS rx3,

COUNT(Age) AS rn,

STDEV(Age) AS stdv,

AVG(Age) AS av

FROM (SELECT 1e0*Age AS Age FROM vTargetMail) AS s

) AS s

— UDA – note it has to be deployed first

SELECT dbo.Skew(Age)

FROM vTargetMail

GO

In SSMS, turn the execution plan on and run in the context of the AdventureWorksDW databases all three solutions in a single batch. just check the first number – query cost relative to the batch for each query. The first query (my first T-SQL solution) cost is 50% of the batch cost; (the second query (Steve’s T-SQL solution) cost is 25% of the batch cost; and the third query (my C# UDA) solution cost is 25% of the batch cost. So you can see you can get the same efectiveness in T-SQL as with UDA in this case. Finally, bellow is similar code for Kurtosis. You can check the performance of the solutions in same way as I showed for Skewness.

— Kurtosis

SELECT Kurt = SUM(SQUARE(SQUARE(((Age*1.0m.mean)/m.[StDev])))) * MIN(m.corrfact2) MIN(m.subfact)

FROM vTargetMail v CROSS JOIN

(SELECT mean = AVG(Age*1.0), [StDev] = STDEV(Age),

corrfact2 = COUNT(*)*1.0 * (COUNT(*)+1) / (COUNT(*)-1) / (COUNT(*)-2) / (COUNT(*)-3),

subfact = 3.0 * SQUARE((COUNT(*)-1)) / (COUNT(*)-2) / (COUNT(*)-3)

FROM vTargetMail v) AS

— Steve Kass

SELECT

(rx4 4*rx3*av + 6*rx2*av*av 4*rx*av*av*av + rn*av*av*av*av)

/ (stdv*stdv*stdv*stdv) * rn * (rn+1) / (rn1) / (rn2) / (rn3)

3e0 * (rn1) * (rn1) / (rn2) / (rn3)

FROM (

SELECT

SUM(Age) AS rx,

SUM(Age*Age) AS rx2,

SUM(Age*Age*Age) AS rx3,

SUM(Age*Age*Age*Age) AS rx4,

COUNT(Age) AS rn,

STDEV(Age) AS stdv,

AVG(Age) AS av

FROM (SELECT 1e0*Age AS Age FROM vTargetMail) AS s

) AS s

— UDA – note it has to be deployed first

SELECT dbo.Kurt(Age)

FROM vTargetMail

GO

### Dejan Sarka

Independent Consultant at Dejan Sarka S.P.
I am a trainer and consultant that focus on development of database & business intelligence applications.

Besides projects, I spend about half of the time on training and mentoring. I am the founder of the Slovenian SQL Server and .NET Users Group. and the main author or coauthor of eleven books about databases and SQL Server. I also developed many courses and seminars for SolidQ.