Search
Tuesday, February 07, 2012 ..:: Forums ::.. Register  Login
 Peter's Forums Minimize
SearchForum Home
     
  Public Postings to Forums  IT Toolbox Data Warehousing Postings  What is ETL for...
 What is ETL for?
 
 11/16/2006 8:43:22 PM
User is offlinePeterNolan
380 posts
3rd


What is ETL for?

>>And since it will make up 50% or more of the 5 year cost of ownership

>>of

the BI solution companies are well advised to take a careful look at what this will be built in...

 

>Interesting point!

 

Thank You...

I have been 'banging on' about the cost of ETL for 15 years now ever since

I realised it was such a significant portion of the actual costs of building a DW....

Metaphor used to charge around USD1M a shot for Design and ETL work for a sizable implementation and that always seemed a lot to me......

 

I wrote a presentation recently for a client about how expensive ETL used to be compared to today and the issues coming up in the 'thought leading'

organisations today around the costs of ETL.

I see some projects with 8-10 'ETL Programmers'....the most I have ever seen is around 50 ETL programmers on one project.....so ETL deve lopment and support is not a 'small cost' in many companies.

My experience is that it is 50%+ and I have seen it as high as 80% of 5 year cost of ownership where the database is smallers....

However, this is overlooked in almost every client I talk to.....

 

But it does not have to be that way...

We are building far more complex DWs as V1.0 now....Recently I d id a project where we had 80 fact tables and 115 dimension tables as V1.0 for

just billing and marketing for a telco. 4,000 fields were included in V1 .0.

10 years ago that was impossible because of the cost of writing the ETL.

...

What does 'thought leading' mean for ETL?

On that project we were able to write all the ETL into the mapping spreadsheet and generate a run time ETL subsystem for a full scale prototype....we then converted the ETL into DataStage using just o ne DataStage programmer....he wrote over 400 DataStage jobs...

With 'thought leading' ideas we have been able to cut the cost of ETL writing and support by 70-80%.....and since it is the most expen sive piece of the project that one can actually reduce the costs of.....tha t's not a bad effort....

Alas, we have done nothing on the area of 'understanding the data' yet.

Best Regards

Peter

 

 

 

 

Kalman Toth, Database, Data Warehouse & BI Architect

URL: http://www.sqlusa.com/orderdoubleheader/ - The Best SQL Server 20

05

Training in the World

 

----- Original Message -----

From: " pete... via dw-select" <dw-select@Groups.ITtoolbox.com>

To: "SQLUSA" <technicalsqlusa@earthlink.net>

Sent: Friday, August 04, 2006 6:02 PM

Subject: [dw-select] What is ETL for?

 

>

>

> In the 80s I used to present to clients that they would be best served

> by developing an enterprise model and building applications to that

> model so that their data was consistent....all on IBM mainframes

of

> course as I worked for IBM.

>

> We failed to convince the world that enterprise modeling and application

> development to that model was a good idea.....most people here

are too

> young to have even had heard of AD/Cycle.....

>

> The world bought packages.

>

> And in a large company with many packages that were developed by

> different companies there is a major difficulty to gain a consistent

> view of the business.

>

> Of our failure to persuade companies to treat information as a corporate

> asset and have it be captured and stored in a consistent manner my

> 'second career' of doing exactly that using things like DWs was born.

>

> ETL is about validating, integrating, and then presenting vastly

> different sets of data into a single coherent set of information on

> which to reasonably base multi-million dollar decisions....

>

> Be it cobol, stored procedures, code generated by tools large complex

> companies must do ETL...

>

> Any since it will make up 50% or more of the 5 year cost of ownership of

> the BI solution companies are well advised to take a careful look at

> what this will be built in...

>

> I've been doing large batch systems for nearly 25 years now and an ETL

> subsystem is, in many ways, just another large though more complicated

> batch system....though we are seeing more 'right time'.

>

> The only good reason to use an ETL tool is to deliver more money to the

> business.....if there is not a direct justifiable connection b

etween

> 'sustainable profit improvement' and buying an ETL tool I would advise

> companies not to buy one....

>

> Peter

>

>

>

> -----Original Message-----

> From: martyn.richard.jones via dw-select

> [mailto:dw-select@Groups.ITtoolbox.com]

> Sent: 03 August 2006 11:48

> To: pete...

> Subject: AW: [dw-select] Re: RE: RE: DTS - is it viable?

>

>

>

> Hello,

>

> I think there is a set of fundamental issues that need to be addressed

> in terms of Data Warehousing and ETL.

>

> As I see it, the reasons for using an ETL in the first place is not for

> maximum performance of the process, so a tool is chosen that will can be

>

> used to:

>

> a. develop the required extraction, transformation and loading process

es

>

> in the expected / required time-frame.

> b. to reliably and timely execute the ETL, which accurately loads data

> into the DW / Data Marts in the production environments, within the

> required processing windows

> c. generate most of the ETL code automatically

> d. easier to maintain, support of metadata, etc.

>

> The thing is, if your load process window is, for example from midnight

> to 7:00 AM then if you can execute everything by 4:00 AM or 3:00 AM is

> neither here nor there, increasing the load process efficiency won´t

> win you new customers, and having something that takes ten minutes

> rather than five minutes to run in the ETL part of the DW world, wont

> lose you customers either.

>

> I think these days a lot of the efficiency / performance requirements

> are still in getting data out of the DW, the querying of data in the DW,

>

> and questions of partitioning, striping, object dirtribution, indexing,

> statistics for cost/rule based query optimization, materialized views

> etc

>

> Even then, as most Data Warehousing should be focused on strategic

> decision support, even if your senior execs can´t use the DW in days,

> it shouldn´t really have any serious impact on the business. For dat

a

> warehouses that are used to provide data to compliment or even support

> CRM and Core systems, then one must take a different approach, but then

> again, what absolutely current mission critical data would one wish to

> keep in the DW rather than also in the core business applications?

>

> At the end of the day, if you can produce performance improvements of

> 100% there are questions that still needs to be asked, such as:

>

> a. So what?

> b. How much will it cost the business to implement?

> c What secondary impact and cost will the new tool have on the business?

> d How long will it take?

> e What are the risks, what are the likelihood of those risks occurring,

> how often, and how much?

> d. Just what is the bottom line business benefits if we decide to

> switch?

> e. Just what are the bottom line business disadvantges if we don´t

> switch?

>

> Everthing must be accompanied by figures: USD, dates, probabilities,

> etc.

>

> Hope this helps,

>

> Martyn Richard Jones

> Data Warehouse Architect

  Public Postings to Forums  IT Toolbox Data Warehousing Postings  What is ETL for...
Search  Forum Home       

Copyright 2002-2010 Peter Nolan   Terms Of Use  Privacy Statement
DotNetNuke® is copyright 2002-2012 by DotNetNuke Corporation