Matts Learning Journal: April 2016

Thursday, April 14, 2016

Intro to database week 6 - querious questions

Week 6 - querious questions

This week we went back to SQL with full steam. There were your run of the mill MIN() and MAX() type queries, but also a lot of sub-queries that took me a bit to figure out. The syntax that SQL wants for sub queries threw me off a bit but it seems like you can really do a lot stringing things together. You can also write things that may seem nice to you but turn out to look absolutely terrible. So the big take away for me was to not make a mess, like with any software project. Nobody cares how smart you think and clever you think you are if your query, script, program, whatever looks like junk. It's just not useful.

Here is an example from the homework that I like, for figuring out who the best student was in a given course code.

You could write that query a lot of ways, including one giant line that is a big hassle to read. But as written above it is clear what is going on. We are interested in the last name of the student we will call the best student. Their name is in the student table but the grades are in the enrollment table so we will join those tables together. The unique identifier of interest is the student number so use that for the "on" condition. To round it all out we are looking for the best grade, that's where the sub-query formatting comes in. It is broken into "where a > b" but we can expand the b part out by putting parenthesis around it. That lets us chill out and start over from the beginning with a query that figures out what the max grade was for class # 10014. When it all runs then we end up with the student with the highest grade from class 10014.

I still don't know what I'm doing as far as SQL goes but at least I know what to look for and the structure of it. To really be able to get much done I think you need a lot of weeks of repetition on real world problems where you fiddle around for a while. It is clear that sub-queries really let you start doing useful work instead of just finding simple pieces of information.

Monday, April 11, 2016

Intro to database week 5 - What the graph?

week 5 - What the graph?

This week was a bit jacked for me. I was supposed to mail a box to Mexico but then got a call "we'll let you know in a few hours if you need to mail that stuff". Instead they called and told me to go to Mexico the next morning. What was interesting about that is that my battery in my laptop sucks so I wrote on paper this week. I don't normally write on paper at all during the week or month. It was nice to try that out for a change. Printing homework, and writing on paper, pretty odd. Then I spent a while learning some online diagram software along with diagram software for my phone so I could make things look nice because my hand writing is really bad. It looks a bit like I'm having a seizure when I write but my hands are actually pretty steady because I can do microscopic soldering for circuit boards.

That's water under the bridge though because my homework is in and this week was really great for me overall even though I was so slow. After getting home from Chihuahua I watched a good documentary on Sunday about the guy that made the Segway and he said he's a really slow learner but he just keeps re-reading things until it soaks in. That was my plan for this week and it seemed to work out. So overall this week the subject was cool graphs that are your pre-planning for database development, but also an iterative process during database design and implementation. Drawing graphs that show how items are connected lets you figure out where you are going to run into trouble or see new connections you didn't think of before. Two kinds of graphs were presented, the Chen model which seems like it isn't very cool, and the Crow's Foot Model which seems very nice indeed.

What makes the Crow's Foot Model So nice? Well for one it seems that good graphing software supports it's symbols and makes creating these graphs fun. These diagrams also are very descriptive in expressing the quantity of each item and their relationship. Are there ten sailors per ship? Does each sailor get their own ship? Is there a law that says people can only have two kids or one car? These are all important information for your database that look like garbage when written out in long sentences. They of course need to be compressed into nice connected graphs that are concise and show you what is going on at a glance. Nobody has time to read your epic novel about rules and by requiring interpretation you leave room for mistakes during implementation. So draw cool graphs instead.

Well, that's the critical information from this week summed up. Draw nice graphs like this one homework 5.3!

week 5 pt 2 - Multi-frequency & multi-sample smart colorimetry for water analysis

The big distraction for me lately has been fish tank water. What's a nice way to press 1 button and have information logged to your database about the quality of your fish water? Colorimetry is popular of course, like this example unit from IORodeo .

There are also a variety of commercial handheld units with a specific frequency of light that they use to measure things like chlorine and ammonia. The basic principle being that they measure the absorption of light of a given frequency going through a sample to determine the concentration of a known solution. For example with fish water there are sample kits from companies like API which let you test a variety of chemical levels. PH, Ammonia, Nitrate, Nitrites, these are all critical information for keeping fish healthy and happy. Low cost colorimeters generally hold one sample though and use one frequency of light. They also don't seem to automatically log the information to a centralized database. It's almost as if they think you're going to want to connect the device to a computer over USB and log the data yourself, or worse yet write it on paper. As we discussed earlier that just isn't an option.

So it seems to me the solution is clear, the world needs a low cost colorimeter that hold many vials at once for testing difference chemicals simultaneously, which also logs to a database on it's own. It would probably need some internal storage in case you are lazy and don't connect it to the internet, and it should be able to connect to your smart phone of course because everybody loves phones. I think this could be good project because although my interest is fish water, my dislike for repetitive actions could benefit others by providing an inexpensive multi-sample solution. By allowing the device to be re-configured with a phone application each sample could be tested for concentrations of a different chemical, or multiple samples could be tested for the same chemical. This would require a few light sources for each sample so that the wave length projected could be customized. The good news there is that LEDs are just about one of the least expensive things in the world right now.

On the receiving end light sensors, even advanced ones that give you lots of information, are also very inexpensive these days. Wifi enabled microcontrollers like the ESP2866 are also very cheap. So it seems there is room in the budget to make the housing look really cool and have a unit that holds 4-5 samples at once for testing with the press of a button.

Sunday, April 3, 2016

Intro to database week 4 - Keep in Normal.

Week 4 - Keep it Normal

This week was about Database Normalization. This sounds like a tricky subject but it is really just the process of figuring out how to set up a database that isn't a trap. That process involves figuring out what should be in each table and how the tables connect so that you don't end up with redundant data or paint yourself into a corner as far as flexibility goes. For example: would you mix customer data and product data in the same table? What happens if you do? Well there's fancy database terminology for 1NF, 2NF, 3NF, and 4NF but what really matters is that you will make a giant mess. That's what I got out of the end of this week.

Sherif Ramadan has a good video on 3NF which is really the form we want things in for most cases.

Figuring out how messed up your database is consists of drawing out dependency diagrams. These let you see problems such as partial dependencies and transient dependencies. Look at this example from our homework relating to prescriptions.

The key for this table is MedNAme, PatientID, and Date. But what does a patients name have to do with the date? Also what does the patient name have to do with the medication? Well the patient is taking some medications, but if you just have the medication name and the date then that doesn't determine a patients name. A big hospital will have lots of patients taking the same medication on the ame date. The patients name is determined by their patient ID though, so this is a partial dependency for this table. That makes you start to think, maybe I should make a table with just patient information? That would make things nice right? The same goes for medication names and their shelf life. Why would you have the shelf life of a medication in a table that is discussing patients names? Of course that's odd, without even knowing about 1NF, 2NF, 3NF etc you probably are starting to think it makes sense to have a table with just medication names. Those are both examples of partial dependencies.

Transient dependencies would be when an attribute that isn't part of the key determines another attribute that isn't in the key. Look at this example table below regarding invoices which hasn't been normalized. Vendor code determines vendor name. So why is all that information in one row the table invoice? Obviously you should get the vendors their own table and do the same for the products as well. Then you can have the product table use the vendor codes as a foreign key so you know who to buy your products from.

Overall it seems like the goal of normalization is really to drill into your head how to organize things so that you don't get into a trap. If you start writing lots of code to use your database and don't take the time to really think things through you will spend time back tracking or just end up trapped when you try to scale. So this is a great section to get us going on the basics which will eventually just become a kind of gut feeling for bad design.