Search
Tuesday, February 07, 2012 ..:: Forums ::.. Register  Login
 Peter's Forums Minimize
SearchForum Home
     
  Public Postings to Forums  DW List Postings  The compelling ...
 The compelling business case for an archival non-lossy
 
 11/16/2006 7:59:35 PM
User is offlinePeterNolan
380 posts
3rd


The compelling business case for an archival non-lossy

Hi Gabriel,

It was my comment you are responding to....

SA=Stability Analysis. There's a detailed enough description of it in the 'Newsletters' portion of my web site here ->

http://pws.prserv.net/ieinet.pnolan/downloads.htm#bm003

Basically it is the process of splitting the data elements on a table by volatility to save disk...something those of us who are older remember doing..;-)...And some of us who don't feel like wasting disk still occasionally do......I was checking the other day and disk is now about 30,000 times cheaper than it was in 1991 when I did my first DW.

Bills was the first book that I read about it in. Others have also talked about it in similar time frames so who knows who was first. I wasn't.

On the statement:

"After all a 3NF TV+SA model is not meant to be queried by the user, so it does not need to be easy to query or easy to understand.... "

I remember specifically discussing this point in one of Bills seminars I attended in 1994 (or there abouts). As he was presenting the details of

3NF+TV+SA I asked/pointed out that no mere mortal end user could

possibly query it.....and no tools would understand it for years to come.

Bills response was:

"The appropriate query tool for this model is called a programmer."

To me, a person who was doing all the could to present information that was easy for a user to understand, this seemed a very strange position to take. For about an hour. Then it made sense.

These models are difficult to query properly. It is possible to build views over them to hide some of the complexity. And in 10 years the HW/SW has gotten much faster so we can hide more and few people bother with SA any more because disk is so cheap....but still, they are hard to query and if you don't know what you are doing you can get lost rows to Cartesian products just he same as in a normal 3NF model only more of them because of the time variance.

I've worked in Life Insurance for years as well as general, and investments and I'm well aware that lots of the guys in those areas like to query data for themselves...and many of them have great skills at it too.

I used to work in billing systems and I even audited a few systems so I know quite a bit about auditabilty.

I am amazed that the leading ETL tools do nothing for you in the space of auditing.

Auditing is important, and is becoming more so in the US/Europe.

We in Australia got the jump on the rest of the world in terms of 'business empires gone bad' and the Australian government introduced a law in 1992 that made directors financially and personally responsible for corporate losses. (We were surprised they were no up until that point in time.)

We got a lot of interest in audit-ability of reporting systems in 1993/4 but after a while everyone forgot auditing was important again....such is life, (Most people have very short attention spans, so they are doomed to repeat their very own mistakes quite often.)

And yes, you are right, give them just 3NF+TV+SA and it won't work....I have never heard anyone propose giving just the archive layer...

Dimensional models have had such an impact because they are intuitively obvious. A good idea whos time has come. But it has been coming for 30 years, not just the last few....;-)

 

Best Regards

 

Peter Nolan

Data Warehousing Consultant

Mobile: +353 879 581 732

Homepage: http://www.peternolan.com

 

-----Original Message-----

From: owner-dwlist@datawarehousing.com

[mailto:owner-dwlist@datawarehousing.com] On Behalf Of Tanase, Gabriel (GEI, GEIH)

Sent: 28 July 2004 10:34

To: 'dwlist@datawarehousing.com'

Subject: RE: dwlist: The compelling business case for an archival non-lossy

http://www.DataWarehousing.com is sponsored by DataMirror, a leading provider of real-time data integration and resiliency solutions.

Please visit our sponsor today at http://www.datamirror.com to access data warehousing white papers and best practices.

For help with list commands, send a message to <mailto:dwlist-request@datawarehousing.com> with the word "help" in the body of the message.

From: "Tanase, Gabriel (GEI, GEIH)" <Gabriel.Tanase@ge.com>

 

 

 

 

 

 

Hi Peter, Jim

Please excuse my ignorance, but what does "3NF TV+SA" mean?

I believe that TV means 'time variant', but at this moment it escapes me what SA is intended to mean.

On the statement that "after all a 3NF TV+SA model is not meant to be queried by the user, so it does not need to be easy to query or easy to understand.... " (don't know whether it is yours or Jim's), I am 95% sure that my auditors would not agree at all.

If the 3NF TV+SA model is used for loading higher-level marts, auditors want someone to be able to manually query it and independently (from the automated loading procedures) to confirm validity and auditability of figures loaded in the marts.

Believe, it's hard and long work to query a 3NF TV+SA to satisfy each way auditors want to scrutinize figures... It better be nice and queryable.

It's a SOX world nowadays, and the CEO puts his signature on the balance sheet and the profit & loss statements, which are fed directly or indirectly from some of the DW reports. Less than auditable accuracy and the CEO goes to jail (or so we're being told when rushing to deliver :-).

Besides auditors, in insurance one also has the actuaries. Another bunch of nice people who love querying themselves and taking big loads of data into their statistical packages. Give them just a 3NF TV+SA and they'll make sure the CEO soon knows you're the worst idiot in IT they ever met (OK, I'm exaggerating this a bit; I am in good terms with these people, but this is before our new DW goes live :-).

 

Best regards,

Gabriel

Gabriel Tanase

IT Systems Designer

GE Financial Insurance Europe

GEIS Shannon, Ireland

e-mail: gabriel.tanase@ge.com

Opinions expressed in this message are my own and do not represent the opinions or policies of GE or GEFI or any of its other employees, directors, officers, shareholders or affiliates.

 

 

 

> -----Original Message-----

> From: Peter Nolan [mailto:peter@peternolan.com]

> Sent: Tuesday, July 27, 2004 8:52 PM

> To: 'jim stagnitto'; DWLIST

> Subject: RE: dwlist: The compelling business case for an archival

> non-lossy

>

> >> I did say I believe a 3NF model like that is more able to capture

all

> versions of data than a type 2 dimension. And the reason I say that is

> because in a type 2 dimension where the table will be queried (a basic

> notion of dimensional modelling) will typically store data from

multiple

> different source files and be somewhat denormalised.....hence tracking

> changes to every field in the table creates a large volume of

redundant

> data than capturing the same information in a 3NF TV+SA model.

>

> >> And in fact, about 8 years ago I discovered that exactly the same

> code could be used to build a type 2 dimension as a 3NF TV+SA model if

> you are just careful enough about getting data into and out of the

> staging area...instead of combining data to go onto the type 2

dimension

> you split it by volatility (yes, I even put integer keys onto my TV

> models but I only use them as internal keys..).....so discussions

about

> using 3NF TV+SA models vs type 2 dimensions seems a waste of time to

me

> because a 3NF TV+SA entity is a type 2 dimension...it's just that it

is

> split based on data volatility and 3NF design rather than bringing

> together disparate data elements to improve query-ability of the

> tables.....after all a 3NF TV+SA model is not meant to be queried by

the

> user, so it does not need to be easy to query or easy to

> understand....

  Public Postings to Forums  DW List Postings  The compelling ...
Search  Forum Home       

Copyright 2002-2010 Peter Nolan   Terms Of Use  Privacy Statement
DotNetNuke® is copyright 2002-2012 by DotNetNuke Corporation