Make changes to your data; and a relational database speed fallacy exposed
Speed and the relational model
SQL is built for set-based operations and, as a result, it is very fast for
certain operations.
It tends to be regarded as a well-known fact that the relational database model, as a whole, is inherently slow; but, like many well-known facts, this is a fallacy.
However, it is such an accepted part of database dogma that I know I’m going to have trouble convincing you.
So what I’ll do this month is give you all the arguments that are typically presented to prove that the relational model is slow – then I’ll give you some clues as to why these arguments are flawed.
Proof of the inherent slowness
Suppose you have a large database that is organised according to the relational
model and it is sitting on top of a relational database engine (Access, Oracle,
SQL Server, DB2, whatever).
You run this database (perhaps it is a finance system) on a certain server. You have tuned the database to the best of your ability (applied all the right indexes, and so on) and it now has an average query response time of, say, 20 seconds.
Now suppose you take exactly the same set of data and put it on exactly the same box, but this time you organise it as an Olap (Online Analytical Processing) cube.
I would expect to see the average response time fall to well under a second. Olap cubes are organised, not according to the relational model but to the multidimensional model of data.
Same data, same box; the only difference is the way in which the data is organised, so it must be the data model that makes the difference – relational slow, multidimensional fast. Simple.
If this isn’t true, then businesses all over the world are wasting huge sums of money that they are investing in BI (business intelligence) systems that do precisely this.
These systems reorganise the data in non-relational ways in order to get high query performance. Indeed, it is precisely this area in which I do a great deal of consultancy work.
This makes for a totally convincing argument. I know because I have heard it multiple times – it just happens to be wrong. It is absolutely true that restructuring the data as an Olap cube will give you the huge performance boost I describe above.
Why relational means slow
The relational model is based around the concept of the table. A table is a
two-dimensional structure – there are columns along the top and rows down the
side.
When we build a database that adheres to the relational model, we have to ‘normalise’ the data, which simply means each table stores data about a single class of entity.
So a Customer table stores information about customers, and Order table stores data about orders, and we don’t mix up the two types of data.
Of course, this means we end up with a large number of different tables; it is perfectly possible for a database to be made up of several hundred or even several thousand tables.
The data in the database is thus distributed over a large number of different tables. When we query the database we can bring the data we need back together by following the joins between the tables that are defined by the foreign keys and primary keys.
As many people will tell you, this is the reason relational databases are slow – because every time you run a query, the database has to follow several or dozens of these joins in order to be able to answer your question.
However, as a clue to where the fallacy lies in all this, you’ll notice that there is an implicit assumption that the tables shown here are stored exactly like this on the disk. Of course, the reality is that they aren’t.
Article tags
Related articles
Q.Why are some of the keys on my keyboard doing strange...
Q.Is my phone’s Bluetooth any use?
Q.Can I switch boot drives so that I can work on older...
St Helena, a 'small British village' in the mid-Atlantic, is seeking support and funding for a broadband connection
|
|
|
|
|
Computeractive Excel (2010) Online tutorialPrice: £19.99 |
Computeractive Word (2010) Online TutorialPrice: £19.99 |
Computeractive Powerpoint (2010) Online TutorialPrice: £19.99 |
Angry BirdsPrice: £9.99 |
Back Issue CD-Rom 14 (2011)Price: £15.99 |