Thursday, April 05, 2007

Hints And Pitfalls In Database Development (Part 2): Let The Database Enforce Data Constraints

Letting the database enforce data constraints might sound obvious, but as a matter of fact I had to listen to some people just a few months ago who advocated checking for referential integrity within business code instead of letting the database do so, because "foreign keys are slow" - I mean geez, what kind of nonsense is that?! I don't know, they probably forgot to define their primary keys as well, hence didn't have any index for fast lookup (*sarcasm*).

No other entity is better suited to safeguard data integrity than the database. It's closest to the data, it knows most about the data, and it's the single point every chunk of data has to pass through before being persisted.

And I am not only talking about primary keys and foreign key constraints. I put all kinds of checks on my databases, for example unique compound indices whenever I know that a certain combination of column values can only occur once. Or table constraints that enforce certain rules on the data which are being checked by the database on each insert and update. Triggers are another possibility, but care must be taken - they might be slower than table constraints.

Setting up constraints usually is quite easy, easier than implementing the same rules in business code. Of course performance matters, and it is important to balance costs and benefits. I do not recommend to code complex SQL-statements within constraint checks - they should be limited to simple logical expressions.

During application development, the database will scream immediately when data integrity rules are being violated. In the unlikely case that such a programming mistake slips through into production code, the customer will be confronted with some error message, but his data is unendangered, any you will find out about it immediately and can provide a fix at no time - not months or years later, when data corruption would have occurred, which then might have been impossible to repair. This has saved us on several occasions.

Plus you never know if other clients are going to be manipulating data in the future, clients that might not be aware of the existence of those data rules. Data normally lives longer than application code.

Previous Posts:

Follow-Ups: