sql server days 2013 - create etl solutions faster with metadata driven development
DESCRIPTION
Working on your 20th staging package? Do you need to import 30 different flat files? Don’t have the time to create all those packages? By using Biml – Business Intelligence Markup Language – you can describe your BI solution in a declarative fashion. You can turn tedious, repetitive work into reusable scripts which generate SSIS packages for you, cutting back drastically on development time. This session will introduce you to Biml and get you started on building reusable patterns for common BI tasks.TRANSCRIPT
SQL SERVER DAYS 2013
CREATE ETL SOLUTIONS FASTERWITH METADATA DRIVEN DEVELOPMENT
KOEN VERBEECK
WHO AM I?
OUTLINE
Hello WorldIntroduction Read Flat File
Read Flat Files While Looping
Metadata driven development
Conclusion
OUTLINE
Hello WorldIntroduction Read Flat File
Read Flat Files While Looping
Metadata driven development
Conclusion
INTRODUCTION
INTRODUCTION
• large percentage of BI projects fail• Gartner - http://www.gartner.com/newsroom/id/492112
• one of the reasons is underestimating development effort ETL• Kimball: 70% time of building a DWH goes into ETL
http://www.informationweek.com/the-38-subsystems-of-etl/55300422
INTRODUCTION
“I choose a lazy person to do a hard job. Because a lazy person will find an easy way to do it”Bill Gates
INTRODUCTION
• a lot of SSIS packages are very similar• packages importing flat files• packages writing change data to staging tables• packages exporting data to excel (for some reason)• packages updating dimensions• …
• … but they take a lot of time to create
INTRODUCTION
• solution?• code reuse
o SSIS basically only supports copy-pasteo copy-paste has improved in SSIS 2012
• design patternso for example: incremental load packageo SQL Server 2012 Integration Services Design Patterns
• enable through templateso build a template packageo save it to C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE\
PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransformationItems
• but still requires you to edit each package!• (and what if you forget to edit a crucial piece?)
INTRODUCTION
• metadata driven development to the rescue!• (aka code generating code)• automate generation of common logic in SSIS packages
• first option is the “dynamic SSIS package”1. reads metadata from tables2. generates code
o usually outputs T-SQL or bcp commandso uses T-SQL or C#o for example: SELECT … INTO statements
3. loops over the generated code4. executes each statement
• disadvantages• complex project• no parallelism• difficult row based error handling• difficult to incorporate “business logic”
INTRODUCTION
• second option: BIML• started as a project at MS: http://vulcan.codeplex.com/• developer left to found company Varigence
http://www.varigence.com/o took the idea (not the code) and developed BIML
• BIML is a markup language and compilero translates metadata into business intelligence solutions for SQL Servero supports SSIS and SSASo Varigence made part of BIML available as open source
INTRODUCTION
• BIDS Helper has open source implementation of BIML• it’s free!• it’s already in the add-on you love!• it is available for SSIS 2005, 2008, 2008R2, 2012 (and 2014?)
• BIML offers• powerful code generation
o only some parts of the project deployment model are not supported
• reuse BI patterns and componentso create your pattern in BIML and generate all your packages with the same structureo BIML files can reference each other
• .NET based script languageo C# code can be incorporated into BIML to generate objects based on metadatao Intellisense (sometimes) available
• don’t like BIML?o generated packages are just SSIS packages, you can edit them using BIDS/SSDT/SSDTBIo no vendor lock-in
INTRODUCTION
• scenario for our demos• import different flat files
o exports from ERP systems, other database vendors, 3rd party providers, …
• each type of flat file has a different structureo no single SSIS package for all flat files
• the name of the flat files can changeo for example the name includes a timestamp
• this would normally require 1 SSIS package per flat file type• couple of hours/days work?
• let’s solve it with BIML!
OUTLINE
Hello WorldIntroduction Read Flat File
Read Flat Files While Looping
Metadata driven development
Conclusion
HELLO WORLD
• basic BML script structure
BIML
Connections
FileFormats
Packages
Tasks
Containers
Tasks
Dataflow
Precedence constraintsTransformations
You can also specify• events• log handlers• variables• parameters• custom tasks• script tasks/components• …
HELLO WORLD
• let’s take a look at a simple BIML script
HELLO WORLD
• BIML root node
• add connections
• add packages
HELLO WORLD
• specify Tasks
• specify specific properties
HELLO WORLD
• check for errors & generate package
• result
DEMO
show Hello World BIML
OUTLINE
Hello WorldIntroduction Read Flat File
Read Flat Files While Looping
Metadata driven development
Conclusion
READ FLAT FILE
• specify FlatFileFormat• columns: name, data type, size, delimiter (, code page)• what you’d normally specify in the flat file connection manager
• specify connection
READ FLAT FILE
• specify data flow with transformations• if no input/output connectors are specified, transformations are
connected in the order specified in the BIML file
• result
DEMO
import flat file with BIML
OUTLINE
Hello WorldIntroduction Read Flat File
Read Flat Files While Looping
Metadata driven development
Conclusion
READ FLAT FILE IN LOOP
• now let’s loop over a bunch of flat files• specify variables to hold path to current file and source
folder
• add an expression on the flat file connection manager
READ FLAT FILE IN LOOP
• add a for each loop• which has its own tasks child element
READ FLAT FILE IN LOOP
• result
DEMO
import flat file using for each loop with BIML
OUTLINE
Hello WorldIntroduction Read Flat File
Read Flat Files While Looping
Metadata driven development
Conclusion
METADATA DRIVEN DEVELOPMENT
• BIML is nice
• … but isn’t the GUI much faster to developer packages?
• time to enhance BIML with some C# goodness!• called BIMLScript• use C# to read metadata• loop over metadata and create multiple objects• entire website dedicated with tutorials and code snippets
http://bimlscript.com/• also has an online editor
METADATA DRIVEN DEVELOPMENT
• Add namespaces
• Declare variables
METADATA DRIVEN DEVELOPMENT
• Retrieve metadata (stored in a SQL Server table)
• Loop over metadata and create corresponding objects
METADATA DRIVEN DEVELOPMENT
• result
METADATA DRIVEN DEVELOPMENT
• remarks• make sure the code or the metadata doesn’t contain invalid XML
characterso < > “ &
• using C# can mess with the Intellisenseo Visual Studio thinks it’s not valid XML anymoreo color coding can disappear > right click file and choose Open With…o Intellisense can stop working in Visual Studio > use online editor
• beware of the protection levels
• some elements can only appear onceo do not put those in a loopo e.g. Connections, Packages
DEMO
generate multiple packages using BIMLScript
OUTLINE
Hello WorldIntroduction Read Flat File
Read Flat Files While Looping
Metadata driven development
Conclusion
CONCLUSION
• BIML can radically reduce SSIS development time• for frequently used package patterns• when combined with BIMLScript
• BIML supports all versions of SSIS• but some project deployment functionality is missing
• bit of a learning curve• good understanding of SSIS is necessary• basic C# skills needed• return of investment is in next projects
RESOURCES
• Official BIML• Varigence BIML product page
http://www.varigence.com/Products/Biml/Capabilities
• BIMLScript resource hubhttp://bimlscript.com/
• BIDS Helper on Codeplexhttp://bidshelper.codeplex.com/
• Blogs• Stairway to BIML by Andy Leonard
http://www.sqlservercentral.com/stairway/100550/
• BIML articles by Joost van Rossumhttp://microsoft-ssis.blogspot.be/search/label/BIML
• BIML articles by Marco Schreuderhttp://blog.in2bi.eu/tags/biml/
• BIML articles by John Welchhttp://agilebi.com/jwelch/tag/biml/
• Introduction to BIML part I by Koen Verbeeckhttp://www.mssqltips.com/sqlservertip/3094/introduction-to-business-intelligence-markup-language-biml-for-ssis/
SQL SERVER DAYS 2013
Q&A
SQL SERVER DAYS 2013
THANKS FOR LISTENING!
[email protected]@Ko_Verhttp://www.linkedin.com/in/kverbeeck
© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after
the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.