t last someone who actually understands how shared nothing works and why it is fundamentally not so compatible with a dimensional model which tends to join very large numbers of small tables to very small numbers of large tables......
Anyone who actually understanding how shared nothing works understands this...but there seems to be about 10 of us on the planet.......
-----Original Message-----
From: ngalemm... via dw-select [
Sent: 15 August 2006 19:41
To: pete...
Subject: RE:[dw-select] AW: RE: RE: Teradata Vs Oracle
The distribution of the data across AMPs is precisely my point as to why
dimensional designs do not perform as well as 3NF designs in a shared
nothing architecture such as Teradata. It is not a discussion of which
is a better design, but rather which gets the most out of Teradata's
architecture.
Defining common primary indexes across tables allows you to localize
related rows, such as order headers with order lines, to improve the
performance of a particular join. This is one of the key aspects of
tuning a Teradata model. Because the data is localized, it reduces the
amount of 'chatter' between AMPs.
In a dimensional design any one fact row has foreign keys to many
unrelated dimensions. There is no opportunity to localize dimensions as
they relate to facts. The only time localization does occur is on the
chance that the primary indexes hash to the same AMP. The probability
that a fact row can be resolved locally quickly goes to near zero as
more dimensions are involved in the query.
As a result, essentially all the result sets need to be shared among
all the AMPs to resolve the query. There is a excessive amount of
cross-talk between the AMPs effectively bottle-necking throughput. An
alternate approach is to define join indexes, which basically creates a
pre-joined flat table, with a huge cost in storage space. That is not
to say Teradata could still do the query 'fast', but simply that it is
not an effective use of the hardware. You can do better for the same
cost with other approaches.