Friday, November 20, 2009

Representing Correlation, Classification and Hierarchy in a Database

One fine day, while I was driving to the office, I saw a modified car in front of me and started thinking about how to represent it in a relational model (ERD). That set off a train of thought. I was thinking about a few things, such as:
  • How to represent a car as a vehicle
  • How to represent the make of a car
  • How to represent the model of a car
  • How to represent a car's association to a person using the number plate
  • How the modified car fits into the model I had just derived, and so on
While I was thinking through all of this, three things came to mind: correlation between entities, classification of entities, and representation of entities in a hierarchy. I then tried to answer the following questions about each of the three:
1. Should we represent them in the ERD or not?
2. If yes, how should they be represented?

My thoughts on each of them are as follows:
Correlation: It must be represented in the ERD by one or more dedicated columns in a table that act as a correlation ID.
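One possible reading of this, sketched with Python's built-in sqlite3 module: the association between a car and a person gets its own table whose columns carry the correlating IDs, with the number plate as the key. The table names, column names, the sample plate and the helper owner_of are all made up for illustration; they are assumptions, not a schema I have settled on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Hypothetical schema: registration holds the correlating columns.
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE vehicle (
    vehicle_id INTEGER PRIMARY KEY,
    make       TEXT NOT NULL,
    model      TEXT NOT NULL
);
CREATE TABLE registration (
    number_plate TEXT PRIMARY KEY,
    person_id    INTEGER NOT NULL REFERENCES person(person_id),
    vehicle_id   INTEGER NOT NULL REFERENCES vehicle(vehicle_id)
);
""")

conn.execute("INSERT INTO person VALUES (1, 'Asha')")
conn.execute("INSERT INTO vehicle VALUES (10, 'ExampleMake', 'ExampleModel')")
conn.execute("INSERT INTO registration VALUES ('XX-01-AB-1234', 1, 10)")


def owner_of(plate):
    """Return the name of the person a plate is registered to, or None."""
    row = conn.execute(
        """
        SELECT p.name
        FROM registration r
        JOIN person p ON p.person_id = r.person_id
        WHERE r.number_plate = ?
        """,
        (plate,),
    ).fetchone()
    return row[0] if row else None
```

Calling owner_of with the sample plate walks from the plate to the person through the correlating columns; an unknown plate gives None.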
Classification: Classification should not be represented in the ERD. The reason is that a classification depends on the criteria used to classify things at a given point in time. The moving parts here are: the fixed set of criteria to be used, whether those criteria are still valid at that point in time, and whether the person looking at the result accepts it for those input parameters. So rather than representing the classification itself, it is always better to capture in the ERD the values of the criteria that drive it.
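A small sketch of that idea in Python: only the criteria values are stored on the car, and "modified or not" is computed by whichever rule is in force at the time. The Car fields, both rules and the 20% threshold are invented for illustration and are assumptions on my part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Car:
    # Stored facts (criteria values); no "is_modified" flag is stored.
    make: str
    model: str
    factory_power_kw: int
    current_power_kw: int
    body_kit_fitted: bool


def is_modified_strict(car):
    """Hypothetical rule A: any deviation from factory state counts."""
    return car.current_power_kw != car.factory_power_kw or car.body_kit_fitted


def is_modified_lenient(car):
    """Hypothetical rule B: only a power gain above 20% counts."""
    return car.current_power_kw > 1.2 * car.factory_power_kw


car = Car("ExampleMake", "ExampleModel",
          factory_power_kw=100, current_power_kw=110, body_kit_fitted=True)

# Same stored values, two different classifications.
strict_result = is_modified_strict(car)
lenient_result = is_modified_lenient(car)
```

The same row is "modified" under one rule and "not modified" under the other, which is why I would store the inputs and not the verdict.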
Hierarchy: A hierarchy should be represented in the system, but how to represent it is a difficult decision. One can simply say that it depends on the business. I would say that is 100% correct, but do you know your current business 100%, and can you envision even 70% of the future shifts in its hierarchy? Most of the time, when the business changes, the existing entities get reorganized. That reorganization is based on a better understanding of the current model as well as on the changes demanded at that point in time. Another point that adds complexity to representing a hierarchy comes from graph theory: how do you want it to be represented? Directed or undirected, and if it is directed, which direction is correct?
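To make the direction question concrete, here is a minimal sketch using Python's sqlite3 module and a plain adjacency list, where each row stores a pointer to its parent (the edge is directed from child to parent). This is only one of several ways to store a hierarchy; the node table, its sample rows and the two helper functions are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical adjacency list: parent_id is the directed edge child -> parent.
conn.executescript("""
CREATE TABLE node (
    node_id   INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    parent_id INTEGER REFERENCES node(node_id)
);
INSERT INTO node VALUES
    (1, 'Vehicle',      NULL),
    (2, 'Car',          1),
    (3, 'Modified car', 2),
    (4, 'Truck',        1);
""")


def path_to_root(node_id):
    """Follow the stored direction (child -> parent) with a recursive CTE."""
    rows = conn.execute(
        """
        WITH RECURSIVE up(node_id, name, parent_id, depth) AS (
            SELECT node_id, name, parent_id, 0 FROM node WHERE node_id = ?
            UNION ALL
            SELECT n.node_id, n.name, n.parent_id, up.depth + 1
            FROM node n JOIN up ON n.node_id = up.parent_id
        )
        SELECT name FROM up ORDER BY depth
        """,
        (node_id,),
    ).fetchall()
    return [r[0] for r in rows]


def children_of(node_id):
    """Go against the stored direction (parent -> children) with a lookup."""
    rows = conn.execute(
        "SELECT name FROM node WHERE parent_id = ? ORDER BY name",
        (node_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

Storing the edge as child -> parent keeps a reorganization cheap (moving a subtree is one update of parent_id), but it also bakes in the assumption that every node has at most one parent. If the business later needs a node under two parents, the edges have to move into a separate table.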