By John Sonmez, December 9, 2009

Database Design: Dealing With Database Metadata

I recently ran across a very interesting database design problem: what is the best way to store database metadata, that is, data about the data in a database?

I have changed the table names to simplify the problem, so let me describe it this way:

I have two tables, fruit and vegetable. Each stores whether or not a fruit or vegetable is tasty. We can ignore the other columns, so each table boils down to a key plus an IsTasty column.
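A minimal sketch of the two starting tables, assuming SQLite through Python's built-in sqlite3 module. The column names (fruit_id, vegetable_id, name, is_tasty) and the sample rows are made up for illustration; the real tables have other columns that are left out here.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: only a key, a name, and the IsTasty flag are shown.
conn.executescript("""
    CREATE TABLE fruit (
        fruit_id INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        is_tasty INTEGER NOT NULL CHECK (is_tasty IN (0, 1))
    );
    CREATE TABLE vegetable (
        vegetable_id INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        is_tasty     INTEGER NOT NULL CHECK (is_tasty IN (0, 1))
    );
    INSERT INTO fruit (name, is_tasty) VALUES ('apple', 1);
    INSERT INTO vegetable (name, is_tasty) VALUES ('kale', 0);
""")

fruit_rows = conn.execute("SELECT name, is_tasty FROM fruit").fetchall()
vegetable_rows = conn.execute("SELECT name, is_tasty FROM vegetable").fetchall()
```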

Now let's say that someone keeps changing the IsTasty setting through the UI of my application, and management is demanding the ability to see when it was last changed and who is responsible for the change. The tricky part is that, although we are ignoring the other columns here, they do exist, and we don't want to track every change to the table, just changes to this one column.

Solution 1: Track the change in the table itself by adding "updated on" and "updated by" columns.

The drawback here is that we are adding two extra columns per table, and every other table where we wanted to do this would need the same. Imagine if we had more columns in this same table that we wanted to track changes on. Arghh…
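Here is roughly what Solution 1 could look like for the fruit table, again assuming SQLite via Python's sqlite3. The column names is_tasty_updated_on and is_tasty_updated_by and the helper set_is_tasty are hypothetical; the point is only that every tracked column drags two more columns along with it.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")

# Hypothetical schema: two tracking columns dedicated to is_tasty.
conn.execute("""
    CREATE TABLE fruit (
        fruit_id            INTEGER PRIMARY KEY,
        name                TEXT NOT NULL,
        is_tasty            INTEGER NOT NULL,
        is_tasty_updated_on TEXT,
        is_tasty_updated_by TEXT
    )
""")
conn.execute("INSERT INTO fruit (name, is_tasty) VALUES ('apple', 1)")


def set_is_tasty(conn, fruit_id, is_tasty, user):
    """Change the flag and stamp who changed it and when, in one statement."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "UPDATE fruit SET is_tasty = ?, is_tasty_updated_on = ?, "
        "is_tasty_updated_by = ? WHERE fruit_id = ?",
        (int(is_tasty), now, user, fruit_id),
    )


set_is_tasty(conn, 1, False, "alice")
row = conn.execute(
    "SELECT is_tasty, is_tasty_updated_by, is_tasty_updated_on "
    "FROM fruit WHERE fruit_id = 1"
).fetchone()
```

Note that this only remembers the most recent change; earlier changes are overwritten.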

Solution 2: Track changes to just that column in a separate table for each of the fruit and vegetable tables.

This seems a little cleaner and more extensible. We have created two extra tables and haven't added a bunch of extra columns to our existing tables, but if we wanted to track more columns we would have to add more tables that are copy-and-paste duplicates of these. The two tables are also identical to each other except for the foreign key.
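A sketch of Solution 2 under the same assumptions (SQLite via Python's sqlite3, invented table and column names such as fruit_is_tasty_audit). The two audit tables differ only in their foreign key, and the database enforces that key for us.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# Hypothetical schema: one audit table per tracked table.
conn.executescript("""
    CREATE TABLE fruit (
        fruit_id INTEGER PRIMARY KEY, name TEXT NOT NULL, is_tasty INTEGER NOT NULL
    );
    CREATE TABLE vegetable (
        vegetable_id INTEGER PRIMARY KEY, name TEXT NOT NULL, is_tasty INTEGER NOT NULL
    );
    CREATE TABLE fruit_is_tasty_audit (
        audit_id   INTEGER PRIMARY KEY,
        fruit_id   INTEGER NOT NULL REFERENCES fruit(fruit_id),
        updated_on TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_by TEXT NOT NULL
    );
    CREATE TABLE vegetable_is_tasty_audit (
        audit_id     INTEGER PRIMARY KEY,
        vegetable_id INTEGER NOT NULL REFERENCES vegetable(vegetable_id),
        updated_on   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_by   TEXT NOT NULL
    );
    INSERT INTO fruit (name, is_tasty) VALUES ('apple', 1);
    INSERT INTO fruit_is_tasty_audit (fruit_id, updated_by) VALUES (1, 'alice');
""")

audit_count = conn.execute(
    "SELECT COUNT(*) FROM fruit_is_tasty_audit WHERE fruit_id = 1"
).fetchone()[0]

# The foreign key rejects an audit row for a fruit that does not exist.
try:
    conn.execute(
        "INSERT INTO fruit_is_tasty_audit (fruit_id, updated_by) VALUES (999, 'bob')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```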

Solution 3: Create a new table that has 2 optional foreign keys

Here we are removing the duplication of tables that are almost exactly alike. It will scale, but we have optional foreign keys, and joins become very awkward. We also lose a clear statement of which table each row belongs to, and with it some of the automatic referential integrity the database provides (that is, the database's ability to know which tables are linked together and to make sure the references between them are valid). We could create a custom constraint, but each time we track a new column we would have to add a new key, and a different type of column would need another table.
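To make the trade-off concrete, here is a sketch of Solution 3 assuming SQLite via Python's sqlite3. The table name is_tasty_audit and its columns are invented, and the CHECK clause is just one possible form of the "custom constraint" mentioned above: it insists that exactly one of the two keys is filled in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default

# Hypothetical schema: one audit table, two nullable foreign keys.
conn.executescript("""
    CREATE TABLE fruit (
        fruit_id INTEGER PRIMARY KEY, name TEXT NOT NULL, is_tasty INTEGER NOT NULL
    );
    CREATE TABLE vegetable (
        vegetable_id INTEGER PRIMARY KEY, name TEXT NOT NULL, is_tasty INTEGER NOT NULL
    );
    CREATE TABLE is_tasty_audit (
        audit_id     INTEGER PRIMARY KEY,
        fruit_id     INTEGER REFERENCES fruit(fruit_id),
        vegetable_id INTEGER REFERENCES vegetable(vegetable_id),
        updated_on   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_by   TEXT NOT NULL,
        -- Custom constraint: exactly one of the two keys must be set.
        CHECK ((fruit_id IS NOT NULL) + (vegetable_id IS NOT NULL) = 1)
    );
    INSERT INTO fruit (name, is_tasty) VALUES ('apple', 1);
    INSERT INTO vegetable (name, is_tasty) VALUES ('kale', 0);
    INSERT INTO is_tasty_audit (fruit_id, updated_by) VALUES (1, 'alice');
    INSERT INTO is_tasty_audit (vegetable_id, updated_by) VALUES (1, 'bob');
""")

# Reading the audit trail needs one LEFT JOIN per possible parent table.
changes = conn.execute("""
    SELECT COALESCE(f.name, v.name), a.updated_by
    FROM is_tasty_audit AS a
    LEFT JOIN fruit AS f ON f.fruit_id = a.fruit_id
    LEFT JOIN vegetable AS v ON v.vegetable_id = a.vegetable_id
    ORDER BY a.audit_id
""").fetchall()

# A row that points at neither table is stopped by the CHECK, not by the keys.
try:
    conn.execute("INSERT INTO is_tasty_audit (updated_by) VALUES ('mallory')")
    ownerless_rejected = False
except sqlite3.IntegrityError:
    ownerless_rejected = True
```

Every additional tracked table would mean another nullable key, another term in the CHECK, and another LEFT JOIN in the query.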

Solution 4: Create a meta table to store information about columns in other tables

To me this seems like the best approach. When a column was changed really has nothing to do with a fruit or a vegetable. It is data about data, or metadata. A single table that stores metadata about any column in any table is very scalable. We don't have a foreign key, but we can either create triggers or use our DAO framework to automatically insert into the audit table.
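A rough sketch of Solution 4 using the DAO-style option, assuming SQLite via Python's sqlite3. The table column_audit, its columns, the TRACKED lookup, and the helpers set_is_tasty and last_change are all hypothetical names. The audit row identifies its target by table name, column name, and row id rather than by a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: one generic audit table for any table and column.
conn.executescript("""
    CREATE TABLE fruit (
        fruit_id INTEGER PRIMARY KEY, name TEXT NOT NULL, is_tasty INTEGER NOT NULL
    );
    CREATE TABLE vegetable (
        vegetable_id INTEGER PRIMARY KEY, name TEXT NOT NULL, is_tasty INTEGER NOT NULL
    );
    CREATE TABLE column_audit (
        audit_id    INTEGER PRIMARY KEY,
        table_name  TEXT NOT NULL,
        column_name TEXT NOT NULL,
        row_id      INTEGER NOT NULL,
        updated_on  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        updated_by  TEXT NOT NULL
    );
    INSERT INTO fruit (name, is_tasty) VALUES ('apple', 1);
    INSERT INTO vegetable (name, is_tasty) VALUES ('kale', 0);
""")

# Tracked tables and their key columns. Table names cannot be bound as SQL
# parameters, so only names from this fixed mapping are ever interpolated.
TRACKED = {"fruit": "fruit_id", "vegetable": "vegetable_id"}


def set_is_tasty(conn, table, row_id, is_tasty, user):
    """Update the flag and record the change in the same transaction."""
    key = TRACKED[table]  # raises KeyError for a table we do not track
    with conn:  # commits on success, rolls back both statements on error
        conn.execute(
            f"UPDATE {table} SET is_tasty = ? WHERE {key} = ?",
            (int(is_tasty), row_id),
        )
        conn.execute(
            "INSERT INTO column_audit (table_name, column_name, row_id, updated_by) "
            "VALUES (?, 'is_tasty', ?, ?)",
            (table, row_id, user),
        )


def last_change(conn, table, row_id):
    """Return (updated_by, updated_on) for the latest is_tasty change, or None."""
    return conn.execute(
        "SELECT updated_by, updated_on FROM column_audit "
        "WHERE table_name = ? AND column_name = 'is_tasty' AND row_id = ? "
        "ORDER BY audit_id DESC LIMIT 1",
        (table, row_id),
    ).fetchone()


set_is_tasty(conn, "fruit", 1, False, "alice")
set_is_tasty(conn, "fruit", 1, True, "bob")
set_is_tasty(conn, "vegetable", 1, True, "carol")

fruit_last = last_change(conn, "fruit", 1)
vegetable_last = last_change(conn, "vegetable", 1)
audit_total = conn.execute("SELECT COUNT(*) FROM column_audit").fetchone()[0]
```

A trigger could do the insert instead of the helper, but a trigger has no built-in way to know which application user made the change, so that value would have to be supplied some other way. Nothing in the database stops a stray row_id or a misspelled table_name here, which is the price of dropping the foreign key.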

Other options? Pros / cons? Thoughts?

About the author

John Sonmez

John Sonmez is the founder of Simple Programmer and a life coach for software developers. He is the best-selling author of the book "Soft Skills: The Software Developer's Life Manual."