Sunday, April 3, 2016

Intro to database week 4 - Keep in Normal.

Week 4 - Keep it Normal

This week was about Database Normalization. This sounds like a tricky subject but it is really just the process of figuring out how to set up a database that isn't a trap. That process involves figuring out what should be in each table and how the tables connect so that you don't end up with redundant data or paint yourself into a corner as far as flexibility goes. For example: would you mix customer data and product data in the same table? What happens if you do? Well there's fancy database terminology for 1NF, 2NF, 3NF, and 4NF but what really matters is that you will make a giant mess. That's what I got out of the end of this week. 

Sherif Ramadan has a good video on 3NF which is really the form we want things in for most cases. 



Figuring out how messed up your database is consists of drawing out dependency diagrams. These let you see problems such as partial dependencies and transient dependencies. Look at this example from our homework relating to prescriptions.
The key for this table is MedNAme, PatientID, and Date. But what does a patients name have to do with the date? Also what does the patient name have to do with the medication? Well the patient is taking some medications, but if you just have the medication name and the date then that doesn't determine a patients name. A big hospital will have lots of patients taking the same medication on the ame date. The patients name is determined by their patient ID though, so this is a partial dependency for this table. That makes you start to think, maybe I should make a table with just patient information? That would make things nice right? The same goes for medication names and their shelf life. Why would you have the shelf life of a medication in a table that is discussing patients names? Of course that's odd, without even knowing about 1NF, 2NF, 3NF etc you probably are starting to think it makes sense to have a table with just medication names. Those are both examples of partial dependencies. 

Transient dependencies would be when an attribute that isn't part of the key determines another attribute that isn't in the key. Look at this example table below regarding invoices which hasn't been normalized. Vendor code determines vendor name. So why is all that information in one row the table invoice? Obviously you should get the vendors their own table and do the same for the products as well. Then you can have the product table use the vendor codes as a foreign key so you know who to buy your products from. 


Overall it seems like the goal of normalization is really to drill into your head how to organize things so that you don't get into a trap. If you start writing lots of code to use your database and don't take the time to really think things through you will spend time back tracking or just end up trapped when you try to scale. So this is a great section to get us going on the basics which will eventually just become a kind of gut feeling for bad design.

No comments:

Post a Comment