Wednesday, January 27, 2010

Lookup range of values in SSIS

     The Lookup transformation performs lookups by joining data in input columns with columns in a reference dataset. You use the lookup to access additional information in a related table that is based on values in common columns. We can also use this loopup component for performing lookup with range of values, means for suppose our source table contains all the dates and our destination table contains only range of values like startdate and enddate. Here we need to map the source table dates which are fall in the specified range, means we need to consider whatever the dates which are fall in between startdate and enddate.
          Here by using an example we can understand clearly about this feature. We are using three tables for this demo, they are student, grade, student_grade. The structure of these tables along with sample data are given below.

Table Scripts :

CREATE TABLE [dbo].[student](
[rno] [numeric](18, 0) NOT NULL,
[name] [varchar](50) NOT NULL,
[mark] [int] NOT NULL)

CREATE TABLE [dbo].[grade](
[min_mark] [int] NULL,
[max_mark] [int] NULL,
[grade] [varchar](10) NULL)

CREATE TABLE [dbo].[Student_Grade](
[rno] [numeric](18, 0) NULL,
[name] [varchar](50) NULL,
[mark] [int] NULL,
[grade] [varchar](10) NULL)

 Sample Data :


      By looking at the above tables, student table consists marks of every student and grade table consists grading pattern based on minimum and maximum marks. Now, we need to allocate grade to each student record by using lookup component, in the case of lookup we need to map columns between source table and lookup table, but here we need to check the marks fall in which set of range and assign corresponding grade to that record. Here we are using three components for this demo, the below figure will show the structure of ETL Package.


      In the OLEDB Source we need to connect student table and in the look up component we have to connect to grade table, the steps we have to perform in lookup component are as follows :

  • In the General tab select either 'Partial Cache' or 'No Cache'  (this logic won't work in full cache mode, incase if we select 'Full Cache' mode then the options in advanced tab will disabled.)




  • In the Connection tab, make a connection to your local server and select the table grade.

  • In the columns tab, map the mark column to either min_mark or max_mark and select grade column from lookup table.
   

  • In the Advanced tab, check the checkbox (it will allow us to modify the SQL statement) in the custom query box and paste the following query inside the textbox.

     select * from (select * from [dbo].[grade]) [refTable]  where ? >= min_mark               and ? < max_mark

        
         In the above query we are cheking whether the marks >=min_mark  and marks < max_mark, only   thing we have to do here it is need to pass the correct parameter (in the place ?) by using parameter button just below the query textbox.





  • Click the parameter button, here we need to assign column mark to two parameters (we need to pass two parameters (in the place of '?') to the query

     Finally connect the path from lookup to OLEDB destination.In the OLEDB destination we need to select Student_Grade table. After successful execution of this package, the final output will be shown as below :

                                           

Friday, January 8, 2010

Executing Stored Procedure & Insert Statement, Objects, Foreach Loop Container in SSIS

'Execute SQL Task' component is not only restricted to execute simple select queries, it can also execute stored procedures, insert and update statements as well.

In this post I would like to discuss following functionalities of SSIS :
       * Executing stored procedure using 'Execute SQL Task'
       * Full result set and Object variable
       * For each ADO Enumerator on 'For each loop container'



Executing stored procedure using 'Execute SQL Task' :

For implementing this demo I am using two tables, they are user and BIUsers. The structure and sample data of these tables are given below

CREATE TABLE [dbo].[User](
[UserID] [int],
[UserName] [nvarchar](50),
[DeptName] [nvarchar](50)
)

insert into [User] values(1,'murty','BI')
insert into [User] values(2,'dileep','BI')
insert into [User] values(3,'hareesh','IP')
insert into [User] values(4,'akash','IP')
insert into [User] values(5,'raghu','BI')

CREATE TABLE [dbo].[BIUsers](
[userid] [int],
[username] [nvarchar](50)
)

Table User consists all the user names along with their department names, here I am using one stored procedure to get all the users who are working in one particular department. For this Stored Procedure I need to pass deparment name as a parameter so that it will give all the users who are working under this department. Below is the defination of stored procedure :

create procedure [dbo].[UsersDept]
@dept nvarchar(50)
As
Begin
select UserID,UserName from [user] where DeptName=@dept
end

Once we created above two tables and stored procedure, now we need to execute the stored procedure using 'Execure SQL Task'. Before going to implementing this we have to create two package level variables, one is for storing department name and the other one is storing result set data. Below are the variables created at package level.



Set the connection properties for 'Execute SQL Task', sample figure is given below.



Full resultset and Object Variable

Here we need to set the result set property, it contains four different values among these four options we need to take 'Full Result Set', why because we are retriving all the users details corresponding to one particular department. Connection Type is OLEDB and Sql statement is as follows :

EXEC UsersDept ?

The symbol '?' (question mark) indicates it is accepting parameter, so we need to set parameter in the parameter mapping tab. Here we are using variable dept as input parameter, here the variable dept already it contains a value 'BI'. The below figure explains you how to set the parameter mapping.


On successful execution of the above stored procedure it will return one dataset (users data of one particular department), we need to set result set for catching this dataset. Here we are using our second variable result of type object.



For each ADO Enumerator on 'For each loop container'

Upto now we are storing data into object variable result, now we need to retrieve each and every record and store it in another table BIUsers. For taking every record we are using For Each Loop Container, and our data is lies in Object variable so we need to use For Each ADO Enumerator in the for each loop.



Here we are expecting two columns (userid, username) from object variable.
so we need to create two more variables for holding userid and username information. Here we created userid of type int and username of type string.



In the For each loop container under variable mapping section map these two variables as shown below.,



Finally we need to exeucte the insert statement by taking userid and username as parameters, check the below figure for how to set the properties in execute sql task for executing insert statement.



check the below figure for how to set the parameters for insert statement.


Wednesday, January 6, 2010

Conversion of String (12012009) to date in SSIS

Use the following expression for converting string to date, here date is maintained like string.

Ex: suppose date : 2009-12-01 is maintained like '12012009'

Expression :

For suppose your column name is d_record_end_date then you need to use the following expression to convert string to date format. We need to use this expression in derived column, it will create a new column of type DATE.

(DT_date)(SUBSTRING(d_record_end_date,5,4) + "-" + SUBSTRING(d_record_end_date,1,2) + "-" + SUBSTRING(d_record_end_date,3,2))

Thursday, December 17, 2009

Seaching a value in a Table

The below stored procedure is used for searching a particular string in a table, if the string exists it will return the column name and correct search value.

Stored Procedure :

create PROC SearchTable
(@SearchStr nvarchar(100),
@TableName nvarchar(256))
AS
BEGIN
CREATE TABLE #Results (ColumnName nvarchar(370), ColumnValue nvarchar(3630))
SET NOCOUNT ON
DECLARE @ColumnName nvarchar(128), @SearchStr2 nvarchar(110)
SET @SearchStr2 = QUOTENAME('%' + @SearchStr + '%','''')
SET @ColumnName = ''
WHILE (@TableName IS NOT NULL) AND (@ColumnName IS NOT NULL)
BEGIN
SET @ColumnName =(SELECT MIN(QUOTENAME(COLUMN_NAME)) FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = PARSENAME(@TableName, 2) AND TABLE_NAME = PARSENAME(@TableName, 1)
AND DATA_TYPE IN ('char', 'varchar', 'nchar', 'nvarchar')
AND QUOTENAME(COLUMN_NAME) > @ColumnName)
IF @ColumnName IS NOT NULL
BEGIN
INSERT INTO #Results
EXEC ('SELECT ''' + @TableName + '.' + @ColumnName + ''', LEFT(' + @ColumnName + ', 3630)
FROM '+ @TableName + ' (NOLOCK) ' +' WHERE ' + @ColumnName + ' LIKE ' + @SearchStr2)
END
END
SELECT ColumnName, ColumnValue FROM #Results
END

Output

Suppose we have a table called dbo.airways and the data it contains



here I would like to search the string 'amm', so my execute statement is like

EXEC SearchTable 'amm','dbo.airways'

Result :

Display all the table names from all the databases if they contains specific column name

The following query will display all the table names from all the databases if they contains specific column name. Here my column name is 'Y', so I want retrived all the table names from all the databases which are having 'Y' as one of the column. I am taking column name 'Y' in to one variable @Column_Val, so it will give all the database names and their corresponding table names which consists 'Y' as a column.


DECLARE @CMD VARCHAR(1000)

DECLARE @Column_Val VARCHAR(1000)

set @Column_Val='Y'

create table #TableNames ( DatabaseName sysname, TableName sysname)

SET @CMD = 'USE [?]; SELECT DB_NAME()DATABASE_NAME,A.name as Tablename from
( (select * from SYS.Tables) A join (SELECT * FROM SYS.columns where name like''' +@Column_Val+ ''') B on A.object_id=B.object_id)'

insert into #TableNames EXEC SP_MSFOREACHDB @CMD
SELECT * FROM #TableNames Where DatabaseName not in ( 'master','msdb', 'tempdb', 'model' )
drop table #TableNames

Friday, November 6, 2009

Extracting Cube data from SSAS using SSIS (SqlServer 2005)

Cube data can be extracted through MDX queries, we have three approaches for getting the results of an MDX query into an SSIS data stream. Out of these three approaches only last two approaches are working fine.

1. To setup a connection using the OLE DB provider for MSOLAP 9.0 and then put an MDX query in place of the SQL text. This seems to work when you Preview, but crashes SSIS when you run the job.

2. Use a linked server and an OpenQuery command. First we need to define linked server for our source catalog, then we can easily retrieve data by using open query method.
USE master
GO
/* Add new linked server */
EXEC sp_addlinkedserver
@server='LINKED_OLAP', -- local SQL name given to the linked server
@provider='MSOLAP.3', -- OLE DB provider (the .2 means the SQL2K version)
@datasrc='localhost', -- analysis server name (machine name)
@catalog='Adventure Works DW' -- default catalog/database
And then I wrapped my MDX statement in an openquery call.
Ex:
SELECT *
FROM OpenQuery (linked_olap,'SELECT --measures.members
{Measures.[Internet Sales Amount]} ON COLUMNS,
[Date].[Month].members ON ROWS
FROM [Adventure Works]')
Once we prepared this openquery then put this command in place of SQL Query. It works like a normal sql query without any errors.
3. Create a new ADO.Net Connection and from the .Net Providers\SQLClient Data Providers list choose the OLE DB Provider for AS9.0, then create a DataReader Source adapter connected to the new AS connection manager. Put the MDX into the SQLCommand property (be sure to map the external and output columns). This obviously flattens the cell set, but it doesn't limit what you put on the axes. The DataReader Source adapter turns all the columns into DT_NTEXT, which you then have to convert to DT_WSTR to be able to convert to anything else, but that's standard SSIS data type manipulation. Seems a lot slicker than creating a linked server.