Sunday, May 10, 2009

Solution for TSQL Challenge 5

As I was browsing through blogs, I found an interesting blog where the author would post variety of TSQL challenges. I found it very interesting to solve those challenges. I never thought I would be in the winners list but as of now, i have been awarded 2 times by the authors. I would also encourage my blog readers to work on the TSQL challengs. This would definitely improve your approach in solving real world problems.

So here is my solution for the TSQL challenge 5 for which the author has awarded me 2nd place. The challenge can be found from http://beyondrelational.com/blogs/tc/archive/2009/04/24/tsql-challenge-5.aspx.


;with extract(id,name,tags,old_pos,new_pos)

as (

select id,name,substring(tags,0,case when charindex(',',tags,0) = 0 then len(tags) + 1 else charindex(',',tags,0) end),

0 as old_pos,charindex(',',tags,0) + 1 as new_pos from @blog --Anchor member function

Union all

select b.id ,b.name,substring(b.tags,new_pos,case when charindex(',',

b.tags,new_pos) = 0 then len(b.tags) else charindex(',',b.tags,new_pos) - new_pos end),

charindex(',',b.tags,new_pos) as old_pos,charindex(',',b.tags,new_pos) + 1 as new_pos

from @blog b inner join extract e on b.id = e.id where new_pos ! = 1 --Recursive member function

)

select f.id,f.data,e.name,b.tags,count(*) as relevance from @filter f,extract e,@blog b

where b.id = e.id and f.data like '%' + e.tags + '%'

group by f.id,f.data,e.name,b.tags

order by id,relevance desc


There are numerous ways to give solution for the challenge. I have used Recursive CTE (common table expression) as I thought this would make it simple and sweet.


After analysis of the challenge, I figured that 2 steps are required to solve this.

  1. To extract the common separated values from @blog table to multiple rows in a new table
  2. Join the new table with @filter table to populate results.


To extract the values form @blog table, I used recursive CTE.


The Anchor member function of the CTE is.

select id,name,substring(tags,0,case when charindex(',',tags,0) = 0 then len(tags) else charindex(',',tags,0) end),

0 as old_pos,charindex(',',tags,0) + 1 as new_pos from @blog


This query selects the first item from the tags column. In case we have sql,profiler,table' then sql is the first item in the list. Substring function takes 3 paramaters, first parameter takes column name, second parameter takes the start index of the string and thrid parameter takes the length of the string to be extracted.


So to get the first item I check the existance of comman using charindex(',',tags,0). If the function returns 0 then there are no more commas in the list so return the length of the string len(tags).


If there are more than one commas in the list then we need a column to hold the starting postion of the new item in the list. So I have used old_pos and new_pos, old_ps will hold the starting postion of the item and new_position will hold the postion of the next item in the list . 0 is the starting postion for all the items in the Anchor member function so I have hard coded 0 as old_pos. charindex(',',tags,0) + 1 as new_pos will return the next position of the comma in the list, 1 is added so that it is pointing to the next item immediately after the comma.


The recurisive member function of the CTE is.

select b.id ,b.name,substring(b.tags,new_pos,case when charindex(',',

b.tags,new_pos) = 0 then len(b.tags) else charindex(',',b.tags,new_pos) - new_pos end),charindex(',',b.tags,new_pos) as old_pos, charindex(',',b.tags,new_pos) + 1 as new_pos

from @blog b inner join extract e on b.id = e.id where new_pos ! = 1


This query is similar to the previous query with little changes. Here for getting the item from the list, we are using charindex(',',b.tags,new_pos) - new_pos which would return the length of the new item. For example, In a list ‘sql,performance,table’, to select the second item from the list, charindex(',',b.tags,new_pos) would return 16 and new_pos would be 5 from previous step. So 16 – 5 = 11 is the length for second item.


I am joining with the table extract with the filter condition new_pos != 1. This is because charindex(',',b.tags,new_pos) funcion will retrun 0 if no commas are found in the list and i am adding 1 for return value. So 1 would be the termination condition for the list.


The last query,

select f.id,f.data,e.name,b.tags,count(*) as relevance from @filter f,extract e,@blog b

where b.id = e.id and f.data like '%' + e.tags + '%'

group by f.id,f.data,e.name,b.tags

order by id,relevance desc


Here I am just grouping the items based upon the relevance. e.tags is passed dynamically for every row to get the count of that item in the data column of @filter table.

Monday, April 27, 2009

Careful when you use count() function

This was a silly mistake that I committed when writing a procedure.It took me great deal of time to figure out the reason. I thought I should share this.

create table #example (id_no int) --create a temporary table
insert into #example select 1
insert into #example select 2
insert into #example select 2
insert into #example select 4
insert into #example select null

when I execute a query like
select COUNT(id_no) from #example --output is 4

but when i execute the query like
select COUNT(*) from #example --output is 5

The reason I found was that if any row has a null value and if that columns is passed as parameter for count function then null values for that specific columns will be skipped. Only not null values will be taken into consideration.

using below query will result in taking only distinct not null values into consideration.
select COUNT(distinct id_no) from #example --output is 3

drop table #example

I would be glad to add more information on this topic if you give your valuable comments.

Wednesday, April 15, 2009

Basic SQL

I started learning basic SQL by creating tables and playing with it using different types of queries. For every question i came across, i wrote a query after analyzing the question. This helped me to learn simple stuff faster.

You can learn some quick SQL by reading Chapter 8: SQL - The Relational Database Standard database systems by Elmasri & Navathe.

A week after I joined my company, I was asked to write a SQL test and going though these basic queries helped me a lot.


Sunday, April 12, 2009

A perfect start

Just after my post graduation, I got a wonderful opportunity to start my career in data warehousing. Every day I learn new concept and I thought it would be useful if I share my knowledge to the people who are pursuing their career in data warehousing or planning to start their career in data warehouse.

Currently I am involved in ETL (Extarct Transform Load) Process where my role is to extract data from the CSV (comma separated values) file and transform the extracted data using pre written procedures and then load it into database.

After working of the ETL for some time, I understood that my knowledge in using SQL was shallow. So I started to learn SQL from basics, yes you got it right, I started from basic select queries.

Currently I am working on SQL Server 2005, So my upcoming blogs will contain SQL queries using SQL Server.