Sunday, November 23, 2014

Love-hate relationship 2nd Part

No doubt this class helped me finish my MS thesis. At that time I needed to process images to obtain measurements, which required cross-correlation analysis and the inverse Fourier transform (IFT), and I did that using Matlab. Doing statistics in Matlab has reduced the amount of time it would otherwise take to solve problems like this using spreadsheets. I cannot even tell you how I would implement an IFT in a spreadsheet; my brain starts boiling just from the thought. Maybe I should start blogging about Matlab as well. One way or another, I have used Statistics in my profession and in my daily life.

Statistics in our everyday life

But who doesn’t use Statistics in everyday life? We are immersed in a series of processes that average things out and produce a spread around a hypothetical value we have named the mean. When you are driving your car down the road, you look at your speedometer and see a constant speed, and if you use the cruise control, the needle seems to be static. But that is not what is happening. The fact is that your speedometer is not able to register the small changes that do happen to your speed, and it only presents the average, or mean, to you on your dashboard. (For one, when you are taking a curve on the road you are accelerating, because your velocity, not your speed, is changing direction, but that is the subject of another Physics blog.)
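To make the idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the 65 mph cruise setting, the size of the random wobble, and the idea that the dashboard simply rounds the average. It is not a model of how any real car works; it only shows a set of slightly different speeds collapsing into one steady number.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

CRUISE_SPEED_MPH = 65.0  # hypothetical cruise-control setting

# Hypothetical instantaneous speeds: small random wobbles around the set point.
readings = [CRUISE_SPEED_MPH + random.gauss(0, 0.3) for _ in range(1000)]

mean_speed = statistics.mean(readings)   # the value the readings cluster around
spread = statistics.stdev(readings)      # sample standard deviation of the wobble

# What the dashboard would show if it rounded the average to a whole number.
displayed = round(mean_speed)

print(f"mean = {mean_speed:.2f} mph, spread = {spread:.2f} mph, dashboard shows {displayed}")
```

The individual readings are never exactly the same, yet the number on the dashboard does not move. The spread is still there; it is just hidden behind the average.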
And not many of us are aware of things like this. For some reason, when people say they hate mathematics, they usually include Statistics. But the basic concepts of Statistics are based on common sense more than on mathematics; the problem is that not many people can explain these concepts in a simple way, let alone in a correct way. If you open any book on basic Statistics, you will first encounter things such as the average (or mean), error, standard deviation, variance, histograms, probability distributions, samples, populations, and so on. Some of these subjects might sound very intimidating (like kurtosis or skewness), and if you actually want to become a statistician (those people who live and breathe Statistics), then you do need to dig deeper into the subject so that you can pick up the required vocabulary and concepts.
But many of us don’t need that depth of knowledge; we only need those concepts to make sense of what is happening around us. Fortunately, there are many books that have tried to explain Statistics from a very non-technical point of view (I cannot reference one, since I have only read textbooks and specialized books on the subject; I should find some to post here). I will not attempt that same goal. Instead, I would like to offer a few explanations of how I see and understand some of these basic concepts. My next post will cover the basic concept of the mean and what it “means” to me. Until next time.

Wednesday, November 19, 2014

Love-hate relationship

So why is it that a subject I loathed so much when I was in college has become the centerpiece of this blog? I don’t have the answer. The truth is that in order to understand many things in this world, we need to understand statistics. And not just your plain “I know what a bell curve is,” but the knowledge required to make decisions when faced with data. These days the amount of data generated every single day surpasses what we could read in one year, even reading on a full-time basis. We need tools to process all this data so that we can mine knowledge from it and make decisions based on that knowledge.
Maybe there is your answer. I need to understand statistics in order to understand data mining. But not only that, I need to master Probability as well. At the end of the day, we need to produce models that can predict future events based on an analysis of the data. These models are sometimes statistical models themselves, and when they are not, we still need statistical concepts to calculate parameters that mean something when presented. I consider data mining to be a subject that every college graduate these days has to know, at least on a basic level.
And why is it that I hated the subject so much? Well, as usually happens, everything started a long time ago, when I was in college. The teacher we had was good; I don’t want to imply she didn’t teach us anything, and I was able to grasp some things from that class. But I hated that we didn’t have time to cover a lot, and she surely made the subject hard to learn. In those days we didn’t have no stinking computers or TAs, her office hours were impossible to attend (they usually fell while I was taking other classes), and my classmates hated the subject more than I did, so there was not much help from my peers either.
When I graduated from college I went to work for a Metrology Laboratory (yes, metrology, the science of measurement; maybe I should start a blog about that too), and surprise! I had to use statistics and probability to calculate uncertainties and errors in measurements. At this point, I was able to learn a little more about the subject. But most of the details in this area were already worked out for us (we usually based our calibration and calculation procedures on standards and other publications), so it is not like we were talking about statistics that much anyway.
Then I came to the University of Florida to get an MS in Engineering Mechanics. There I took a class on data analysis. It was very interesting, and I guess it was one of the reasons I started to use Matlab for programming, mathematical calculations, and plotting of data. The use of probability and techniques such as Fourier transforms made the class more interesting, although the focus was more on the analysis of periodic data. In this class I learned cross-correlation analysis, linear regression, probability density functions, and so on.
To be continued...