Integration Services, which replaces Data Transformation Services (DTS) in SQL Server 2005, is a tool for extracting, transforming, and loading data. Common uses for Integration Services include: loading data into the database; changing data into to or out from your relational database structures; loading your data warehouse data; and taking data out of your database and moving it to other databases or types of storage.
Note: There are several wizards that come with SQL Server Management Studio to aid you in the import and export of data into and out of your database. I will not look at those wizards; I will focus on how you can build a package from scratch so that you don't have to rely on the wizards.
To begin the process, I open SQL Server Business Intelligence (BI) Development Studio, a front-end tool that is installed when you install SQL Server 2005. The BI Development Studio is a scaled down version of Visual Studio. Then I select New Integration Services Project and give the project a name.
When the project opens, you will see an environment that may look familiar to you if you have used SQL Server DTS; some of the items of the toolbox are the same. For the purposes of this project, I am interested in dragging the Data Flow task item from the toolbar into the Control Flow tab. (The idea of a Data Flow task is one of the major differences between DTS and SSIS packages. In an SSIS package, you can control the manner in which your package logic flows inside of the Control Flow tab. When you need to manage the data aspects of your project, you will use the Data Flow task. You can have several different Data Flow tasks in your project -- all of which will reside inside the Control Flow tab.)
Double-click the Data Flow task that you have dragged onto the Control Flow tab. The available options in the Toolbar have changed; I now have available Data Flow Sources, Data Flow Destinations, and Data Flow Transformations. Since I am going to import an Excel file into the database, I will drag the Excel Source item from the Toolbar onto the screen.
The Excel Source item represents an Excel file that I will import from somewhere on my network. Now I need somewhere to put the data. Since my plan is to put the data into the database, I will need a Data Flow Destination. For the purposes of this example, I will choose SQL Server Destination from the Data Flow Destination portion of the toolbar and drag it onto my Data Flow tab.
To designate which Excel file I want to import, I double-click the Excel Source item that I moved onto the screen. From there, I find the Excel file on the network that I want to import.
I also need to designate the sheet from the Excel file that I want to import, along with the columns from the sheet that I want to use.
Now that I have defined my Excel source, I need to define my SQL Server destination. Before doing that, I need to indicate the Data Flow Path from the Excel file to the SQL Server destination; this will allow me to use the structure of the data defined in the Excel Source to model my SQL Server table that I will import the data into. To do this, I click the Excel Source item and drag the green arrow onto the SQL Server Destination item.
To define the database server and database to import the data, double-click the SQL Server Destination item. I will define the server in which I will import the data, along with the database that the data will reside.
I also need to define the table that I will insert the Excel data into. I will create a new table named SalesHistoryExcelData.
Under the Mappings section, I define the relationship between the Input Columns (the Excel data) and the Destination Columns (my new SQL Server table).
Once I successfully define the inputs and outputs, my screen will look like the one below. All I need to do now is run the package and import the data into the new table by clicking the green arrow in the top-middle of the screen, which executes my package.
My package has successfully executed and 30,000 records from my Excel Source item have been transferred to my SQL Server destination.
You can download the Excel file I used for this article.
Tasks in SSIS packages
Importing and exporting data are some of the simplest, most useful tasks to accomplish in SQL Server. However, there are literally hundreds of other tasks that can easily be accomplished in SSIS packages that will take a significant amount of time to do by a different means. I plan to take a look at several more of these tasks in future articles.






1
Gabriel de Figueiredo - 27/02/08
This procedure does not work if there are mixed types in a column, e.g. a Serial Number may be alf-numeric, or numeric. When the package encounters all numerics after a series of alpha-numerics in the column, it imports the column for the row as a NULL.
This is the same issue as faced by me when I use C# with Jet 4.0. Even if IMEX=1 is used in the connect string, Jet 4.0 stupidly converts a long numeric string into scientific notation, or something that to Jet appears as a date (e.g. a serial number 02012005) to a date, when all I want is a string! Aargh!!
» Report offensive content
2
Daniel Gara - 18/03/08
I can use this. Thanks guys!
» Report offensive content
3
navin - 24/03/08
Nice
» Report offensive content
4
Julie - 26/03/08
Gabriel: When working with mixed data types; create a unique identifier in the base excel worksheet and import that with your dataset. A quick query will show you the records that failed the import. Bring the ones that failed in under a second pass.
» Report offensive content
5
Annette - 27/03/08
To Julie:
No, that would no do, as there are NO errors in then import - the cells that does not meet the datatypes that SSIS THINK the column is, is just imported as NULL!
Best advise is to change to CSV file and import, as you here CAN set the datatypes wothout having SSIS change it because it thinks columntypes different.
How is it that MS cannot integrate their own products?? Just my thoughts....
Regards Annette
» Report offensive content
6
gan - 27/03/08
Thxs. It's my 1st lesson on SSIS.
» Report offensive content
7
Mike - 28/03/08
NOTE: If you are using SQL 2005 64 bit there is no JET driver that allows Excel or Access to imported in this manner. You can do so but must use the 32 bit JET driver. From BITS there is an option when you right click on the project and select Properties. Turn off 64 bit mode to run from BITS. Calling this from DTEXEC is also a problem, you need to use the 32 bit DTEXEC which is available on the install disk.
» Report offensive content
8
Andy - 28/03/08
To Annette:
Thanks for the tip about converting to CSV. Funnily enough, I was trying to import from Excel to SQL Server just yesterday, and ran into this same problem.
I found the Import and Export wizard has the same limitation. Since this was just a one time task, I was able to "Import" my data by copying the rows from my Excel spreadsheet, and pasting them into my table in the Server Management Studio.
» Report offensive content
9
mike - 28/03/08
i tried the import excel using SSIS, no luck. i get the table name but no data.
SSIS package "Package.dtsx" starting.
Information: 0x4004300A at Data Flow Task, DTS.Pipeline: Validation phase is beginning.
Information: 0x4004300A at Data Flow Task, DTS.Pipeline: Validation phase is beginning.
Information: 0x40043006 at Data Flow Task, DTS.Pipeline: Prepare for Execute phase is beginning.
Information: 0x40043007 at Data Flow Task, DTS.Pipeline: Pre-Execute phase is beginning.
Error: 0xC0202009 at Data Flow Task, SQL Server Destination [694]: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80040E14.
An OLE DB record is available. Source: "Microsoft SQL Native Client" Hresult: 0x80040E14 Description: "Could not bulk load because SSIS file mapping object 'Global\DTSQLIMPORT ' could not be opened. Operating system error code 2(The system cannot find the file specified.). Make sure you are accessing a local server via Windows security.".
Error: 0xC0202071 at Data Flow Task, SQL Server Destination [694]: Unable to prepare the SSIS bulk insert for data insertion.
Error: 0xC004701A at Data Flow Task, DTS.Pipeline: component "SQL Server Destination" (694) failed the pre-execute phase and returned error code 0xC0202071.
Information: 0x40043008 at Data Flow Task, DTS.Pipeline: Post Execute phase is beginning.
Information: 0x40043009 at Data Flow Task, DTS.Pipeline: Cleanup phase is beginning.
Information: 0x4004300B at Data Flow Task, DTS.Pipeline: "component "SQL Server Destination" (694)" wrote 0 rows.
Task failed: Data Flow Task
Warning: 0x80019002 at Package: SSIS Warning Code DTS_W_MAXIMUMERRORCOUNTREACHED. The Execution method succeeded, but the number of errors raised (3) reached the maximum allowed (1); resulting in failure. This occurs when the number of errors reaches the number specified in MaximumErrorCount. Change the MaximumErrorCount or fix the errors.
SSIS package "Package.dtsx" finished: Failure.
The program '[3720] Package.dtsx: DTS' has exited with code 0 (0x0).
» Report offensive content
10
Dejim - 28/03/08
Great walkthrough article. Another issue to look out for is the 'TypeGuessRows' issue when importing Excel files with columns of data that have cells that contain data that is longer than 255 characters. The Microsoft Excel ODBC driver will, by default, scan the first 8 rows of your data to determine the type of data in each column. You need to go into the registry to change the 'TypeGuessRows' value to scan more columns in order for the driver to correctly determine the datatype of the column
» Report offensive content
11
Peter - 28/03/08
Quick note for those working with Excel files. Annette noted this already - any data that doesn't match what Excel thinks the column should be will be imported as NULL (or have some other wacky conversion done). Example: Zip+4 - if the first handful of Zip Codes are all numeric, that column is a float and all zip+4 (or postal codes from other countries) values are erased.
Also, when exporting to CSV, watch out for that last column. If it doesn't have data all the way down, you get "ragged right" CSV files (Thank you, MS). That means you either need a dummy column just to hold some value for every record in your sheet or ensure that the last column is fully populated. If you don't do this, you'll get all sorts of import errors because SSIS doesn't really handle your EOL character(s). It will import the EOL chars as part of the field until it hits the next comma/tab.
e.g.
col1, col2, col3
1,2,3
4,5,6
7,8
9,10,11
would be imported:
1,2,3
4,5,6
7,8<eol>9,10
^
this is where the error is likely to occur because 8CRLF9 is not a valid integer. Been bitten by this way too many times to trust Excel and the text import in SSIS.
On the other hand, if your data is all the same type, just importing directly from Excel can be a great time-saver, especially using the Wizard. The problems come when you get a file with mixed column values.
» Report offensive content
12
Mike Kear - 28/03/08
This is all very well, but the previous version's DTS allowed you to save it and run it any time you like. It's well and good to go through this process for a single spreadsheet or one table But it gets REALLY tiresome haveing to do his over and over again to import or export databases of 50-60 tables. You can save the SSIS as a Solution but it wont run again. You have to open it in the SQLServer Management Studio, wait for 5 minutes or more while it validates everything again (timing out once or twice, usually) then run it by clicking "debug".
When you click "debug", it takes 5 minutes or more to re-validate everything.
You can't run it unattended, like you could in the old DTS. Which means nightly transfers of information from an online database to a development environment are impossible. (Can't use synchronizing because in a shared environment you dont have server admin access. Also can't run backups because that requires server admin access too).
When you try to run a saved SSIS package, it looks like it's doing everything but no data actually transfers.
SSIS looks at first glance like a great thing.
But for anyone using shared web hosting it's a HUGE step backwards. I have 40 web site databases i have to manage. Once i took nightly copies to my development server. Now i cant afford the time - i'd have no time to actually do any development work.
I hate it. Pity Microsoft never thought about shared hosting or other apps where the database owner doesnt have admin access.
» Report offensive content
13
Stew Davis - 28/03/08
Mike,
I am not sure what kind of PC you are using, but I have packages that have more than 180 tables that validate in a few seconds (almost unnoticeable). My recommendation would be to set DelayValidation flag to True.
I have found that packages also run much faster in SSIS than in the old DTS. The configurable data flow tasks allow you to tweak the threading and concurrency of the package. The speeds are equivalent to BULK INSERT operations with the ability to validate on the fly. I also found it much more cumbersome in the old DTS to debug in one environment and roll-out to production.
I would agree with the previous post that getting around Jet support for 64 bit is a pain. Apparently, from what I have read, Microsoft has no plans to port the Jet Engine to 64 bit. If not, then why even have the icons in the 64bit BIDS UI.
» Report offensive content
14
Terrence - 01/04/08
na
» Report offensive content
15
Oscar Gonzalez - 19/04/08
Hi everyone. I did something that helps me to resolve the NULL issue. Apart of the modify the string connection of the excel file (add the IMEX=1), and additionally I modify the register key
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel
TypeGuessRows = 2500 in a decimal way
It means that when detect the data type of the column It takes 2500 rows to determine the data type.
It works for me, but maybe you need evaluate how long is your file in order to put the needed value.
» Report offensive content
16
Robert - 23/04/08
Peter,
Adding IMEX=1 to "Extended Properties" in Excel Connection Manager connection string solved the problem with zip-codes for me.
As for the CSV file import, I made a simple test. Edited the excel spreadsheet from this article and put some null's in the price column. My table in db allows nulls there too.
I've created two dataflow tasks connected together. First, will export the excel file to csv. Second will import that file to the table.
In exported file, when there's an empty value in last column, the line would look like:
"36,PoolTable," Note the coma at the end.
There are no problems importing that and in table the last column will have NULL value.
Maybe make sure that the format of flat file, in the flat file connection manager, is set to delimeted, not ragged right.
» Report offensive content
17
Myros - 27/05/08
Great post ... thanks :D
» Report offensive content
18
amit - 10/06/08
how to convert the data type for destination table
» Report offensive content
19
Brian - 30/06/08
Peter / Robert,
Inorder to load a excel that has different datatypes for one columns to database, staging into a delimited file and then load the file into table shall work if the excel do not contain any carriage return. If some cells has carriage returns then the single row excel data will be spanned to multiple rows in delimited file and this leading to incorrect read for the later part.
do you know any workaround for this ?
Brian
» Report offensive content
20
Mark - 01/08/08
Hi all,
I've enjoyed this page and also had challenges with importing excel speadsheets to SQL tables. First most, I'm using a oleb destination and have experimented with the SQL destination as well.
Maybe I missed something here...such as the conversion of Unicode to Ansi, which I'm using the Data Conversion Task to accomplish this conversion. Isn't that a given when doing this?
I'm processing many files, which 80 percent work and the others do not load all records into my SQL table. The funny thing is that if I put a data viewier object riight before the inport to my detination it outputs the correct number of records. But not all records are inserted into the target table. I'm still working on it and any insight would be greatly appreciated. Thanks.
» Report offensive content
21
Frank - 20/08/08
Wow... it just shouldn't be this hard...
Thanks! :)
» Report offensive content
22
Krn - 09/12/08
Oscar Gonzalez,
Your solution worked for me. thanks a lot.
» Report offensive content
23
Michael - 10/01/09
Thank you for this walk through, itis veryhelpful. Can you help me with soemthing else. I need to rename (serialize) the excel file afteer I import it, I know that SIS allows this in the properties area but I cannot get it to work. Any thoughts or guidance is much appreciated.
» Report offensive content
24
Gerson - 06/02/09
Hi everybody,
Because i also had problems with the "null" issue, and i will try "IMEX=1" and "TypeGuessRows = 2500", i would like to also add some usefull information to all of you.
When you want to import multiple files (with diferent names) with a DTSX, assuming they always have the same format, you can't directly import them with a wizard or a simple transformation task.
But, there is a way:
1) Create a shared folder on your server (for example \\MyServer\FilesToLoad) and a subfolder \History. In this folder you put every Excel files or CSV you want to import.
2) Make a copy of one of your Excel files (the one you prefer) into \History subfolder, and rename it into "temporal.xls" (or temporal.csv if its a CSV)
3) (OPTIONAL, but it would be faster) Using the import wizard of SQL Server 2005, create and save (don't execute) a DTSX to load "temporals.xls" into your database.
4) Edit this DTSX under VisualStudio (right click on DTSX package and select "edit")
- Initially, you will only have a dataflow that imports data from you temporal.xls to your SQL table and your 2 connections (one for Excel and one for SQL Server)
- Create a globalvariable (on a blank section of "ControlFlow", use right-click and "variables"):
Name: Scope: DataType: Value:
Item_file YourDTSXName String (nothing)
Temporal_file YoutDTSXName String \\MyServer\FilesToLoad\History\temporal.xls
HistoryFile YoutDTSXName String \\MyServer\FilesToLoad\History
- In [Control Flow], create a [Foreach Loop Container] and in
"collection" properties
Folder: \\MyServer\FilesToLoad
Files: *.xls (or *.csv)
Traverse subfloders NO (uncheck)
"Variable Mappings" properties, add
Variable: Index:
User::Item_file 0
> This control flow will read all *.xls files in your folder, and give you their names in "Item_file"
- In [Control Flow], inside your "Foreach Loop Container", create a "File System Task" and edit its properties:
"general" properties
IsDestionationPathVariable True
DestionationVariable User::Temporal_file
OverwriteDestionation True
Operation Copy file
IsSourcePathVaribale True
SourceVariable User::Item_file
- In [Control Flow], move your "DataFlowTask" (created by the Wizard) inside the "Foreach Loop Container"
and link "File System Task" succes event with your DataFlow
> This trick is necessary, since file "Connections" doesn't allow to use variables. It's always static.
(if you use "connections" from the "Connection Managers"; if you want to program everything in .Net scripts it's far more complicated)
- In [Control Flow], inside your "Foreach Loop Container", create another "File System Task" and link "DataFlowTask" succes event with it.
edit its properties
IsDestionationPathVariable True
DestionationVariable User::HistoryFile
OverwriteDestionation True
Operation Move file
IsSourcePathVaribale True
SourceVariable User::Item_file
Save your DTSX. Now, you should be able to load every xls file inside this shared folder, and loaded files be moved into "history" subfolder. :)
Remember that if you edit a DTSX, without being inside a Visual Studio Project, you can't test (debug) directly the DTSX; you must save it, and execute it externally (in the server, for example)
Also remember that "Excel Connection" ONLY works (due to comercial limitations) inside your SQL Server or inside a VisualStudioProject; that means that if you directly execute this DTSX form your workstation, you will get a cryptic error, because Excel connection doesn't work.
Finally, if you use "Script Task" you can also due a far more complex logics, but it would be veery long to explain how.
That's all; hope you will enjoy it!
» Report offensive content
25
Hemant - 08/07/09
Hi there. I have anather query in reading excel file. The case is-
first 5 rows are blank. Next row has format Date, and then again next rows are numbers. But when I read the file all numbers also show in date format. Ex- supose number is 1 then output shows 1900-01-01 00.00.0000. If I change the format of header row(Date) then it get converted into numbers Ex supose Date is "Jan-07" if I format cell manulay to text then it get converted into "39083".
Please help me.
» Report offensive content
26
Liam Lynch - 31/07/09
Hi there, i want to use a sql command for the data access mode to the excel spreadsheet. I only want to get one specific cell out, can do it with the following:
select * from [workbook1$a1:a1]
but i am wondering if there is a way of writing select (cell) A1 as result from workbook1$
Any help is greatly appreciated
» Report offensive content
27
Doug - 06/08/09
I am having the same problem with mixed types(text and numeric) in one colum (a primary ID column that is critical). I tried all of the follwing with varying results and STILL have not been successful:
1. Convert column to TEXT - imports text, numbers null
2. Create new column format as TEXT and copy/paste VALUES ONLY - imports text, numbers null
3. Set registry to 2500 from 8 - imports numbers, text null
4. Added IMEX=1 - imports both, but numbers are exponential format
I guess I will be forced to convert the Excel file to delimited before importing.
In any case, I am a huge Microsoft fan; however, things like this make you wonder if there is a disconnect between software development teams to allow for such a problem and not have it corrected (unless, perhaps, there is an updated version of Jet or other SP that fixes the problem???).
» Report offensive content
28
Martin - 18/09/09
Thank you for this note. This is my first contact with IS
» Report offensive content
29
Madhukar - 18/09/09
Hi , This not works for me. I a mgetting following error. Can any body of you help me out.
" Column "fcAddress1" cannot convert between unicode and non-unicode string data types. "
» Report offensive content
30
rasha - 20/10/09
i have this error message
» Report offensive content
31
Jeff - 23/10/09
For those that have issues with mix data type columns, I found that you have to use the "Data Conversion" in the data flow to force the data into one data type then from there you put the data into the table.
» Report offensive content