Hi Gabriel,
It was my comment you are responding to....
SA=Stability Analysis. There's a detailed enough description of it in the 'Newsletters' portion of my web site here ->
Basically it is the process of splitting the data elements on a table by volatility to save disk...something those of us who are older remember doing..;-)...And some of us who don't feel like wasting disk still occasionally do......I was checking the other day and disk is now about 30,000 times cheaper than it was in 1991 when I did my first DW.
Bills was the first book that I read about it in. Others have also talked about it in similar time frames so who knows who was first. I wasn't.
On the statement:
"After all a 3NF TV+SA model is not meant to be queried by the user, so it does not need to be easy to query or easy to understand.... "
I remember specifically discussing this point in one of Bills seminars I attended in 1994 (or there abouts). As he was presenting the details of
3NF+TV+SA I asked/pointed out that no mere mortal end user could
possibly query it.....and no tools would understand it for years to come.
Bills response was:
"The appropriate query tool for this model is called a programmer."
To me, a person who was doing all the could to present information that was easy for a user to understand, this seemed a very strange position to take. For about an hour. Then it made sense.
These models are difficult to query properly. It is possible to build views over them to hide some of the complexity. And in 10 years the HW/SW has gotten much faster so we can hide more and few people bother with SA any more because disk is so cheap....but still, they are hard to query and if you don't know what you are doing you can get lost rows to Cartesian products just he same as in a normal 3NF model only more of them because of the time variance.
I've worked in Life Insurance for years as well as general, and investments and I'm well aware that lots of the guys in those areas like to query data for themselves...and many of them have great skills at it too.
I used to work in billing systems and I even audited a few systems so I know quite a bit about auditabilty.
I am amazed that the leading ETL tools do nothing for you in the space of auditing.
Auditing is important, and is becoming more so in the US/Europe.
We in Australia got the jump on the rest of the world in terms of 'business empires gone bad' and the Australian government introduced a law in 1992 that made directors financially and personally responsible for corporate losses. (We were surprised they were no up until that point in time.)
We got a lot of interest in audit-ability of reporting systems in 1993/4 but after a while everyone forgot auditing was important again....such is life, (Most people have very short attention spans, so they are doomed to repeat their very own mistakes quite often.)
And yes, you are right, give them just 3NF+TV+SA and it won't work....I have never heard anyone propose giving just the archive layer...
Dimensional models have had such an impact because they are intuitively obvious. A good idea whos time has come. But it has been coming for 30 years, not just the last few....;-)
Best Regards
Peter Nolan
Data Warehousing Consultant
Mobile: +353 879 581 732
Homepage:
-----Original Message-----
From: owner-dwlist@datawarehousing.com
[
Sent: 28 July 2004 10:34
To: 'dwlist@datawarehousing.com'
Subject: RE: dwlist: The compelling business case for an archival non-lossy
Please visit our sponsor today at
For help with list commands, send a message to <
From: "Tanase, Gabriel (GEI, GEIH)" <Gabriel.Tanase@ge.com>
Hi Peter, Jim
Please excuse my ignorance, but what does "3NF TV+SA" mean?
I believe that TV means 'time variant', but at this moment it escapes me what SA is intended to mean.
On the statement that "after all a 3NF TV+SA model is not meant to be queried by the user, so it does not need to be easy to query or easy to understand.... " (don't know whether it is yours or Jim's), I am 95% sure that my auditors would not agree at all.
If the 3NF TV+SA model is used for loading higher-level marts, auditors want someone to be able to manually query it and independently (from the automated loading procedures) to confirm validity and auditability of figures loaded in the marts.
Believe, it's hard and long work to query a 3NF TV+SA to satisfy each way auditors want to scrutinize figures... It better be nice and queryable.
It's a SOX world nowadays, and the CEO puts his signature on the balance sheet and the profit & loss statements, which are fed directly or indirectly from some of the DW reports. Less than auditable accuracy and the CEO goes to jail (or so we're being told when rushing to deliver :-).
Besides auditors, in insurance one also has the actuaries. Another bunch of nice people who love querying themselves and taking big loads of data into their statistical packages. Give them just a 3NF TV+SA and they'll make sure the CEO soon knows you're the worst idiot in IT they ever met (OK, I'm exaggerating this a bit; I am in good terms with these people, but this is before our new DW goes live :-).
Best regards,
Gabriel
Gabriel Tanase
IT Systems Designer
GE Financial Insurance Europe
GEIS Shannon, Ireland
e-mail: gabriel.tanase@ge.com
Opinions expressed in this message are my own and do not represent the opinions or policies of GE or GEFI or any of its other employees, directors, officers, shareholders or affiliates.
> -----Original Message-----
> From: Peter Nolan [
> Sent: Tuesday, July 27, 2004 8:52 PM
> To: 'jim stagnitto'; DWLIST
> Subject: RE: dwlist: The compelling business case for an archival
> non-lossy
>
> >> I did say I believe a 3NF model like that is more able to capture
all
> versions of data than a type 2 dimension. And the reason I say that is
> because in a type 2 dimension where the table will be queried (a basic
> notion of dimensional modelling) will typically store data from
multiple
> different source files and be somewhat denormalised.....hence tracking
> changes to every field in the table creates a large volume of
redundant
> data than capturing the same information in a 3NF TV+SA model.
> >> And in fact, about 8 years ago I discovered that exactly the same
> code could be used to build a type 2 dimension as a 3NF TV+SA model if
> you are just careful enough about getting data into and out of the
> staging area...instead of combining data to go onto the type 2
dimension
> you split it by volatility (yes, I even put integer keys onto my TV
> models but I only use them as internal keys..).....so discussions
about
> using 3NF TV+SA models vs type 2 dimensions seems a waste of time to
me
> because a 3NF TV+SA entity is a type 2 dimension...it's just that it
is
> split based on data volatility and 3NF design rather than bringing
> together disparate data elements to improve query-ability of the
> tables.....after all a 3NF TV+SA model is not meant to be queried by
the
> user, so it does not need to be easy to query or easy to
> understand....