# Engineering the Statistics

Do you remember the probability course you took as an undergrad? If you were like me, you would consider it one of those courses that you walk out of confused. But a time may come when you regret skipping class because of the lecturer's persistent attempts to scare you with mathematically involved nomenclature.
As you might have guessed, I had this moment a few months back when I had to go deep into statistical analysis. I learned things the hard way, or maybe it is the right way: I had to do things myself. What happened was like this: I would derive the probability distribution function (pdf), or simply assume it, then ask my ever-loyal friend MATLAB to simulate the variable and compare the simulation with the theory. Since I'm an unfortunate PhD student, the simulation and the theory did not match. What other, more experienced PhD students advise is to find the nearest wall and bang your head against it until you find what's wrong. You hope it's the MATLAB code, but it's not the code, since you have gone through it all night. You face reality, check your theory, modify it and go through the whole iteration again.
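To make the simulate-and-compare loop concrete, here is a minimal sketch in Python rather than MATLAB. The example variable is my own assumption, not one from my research: X is the sum of two independent Uniform(0, 1) variables, whose pdf is the triangle f(x) = x on [0, 1] and 2 - x on (1, 2]. The helper names `theoretical_pdf` and `empirical_pdf` are made up for illustration.

```python
import random

def theoretical_pdf(x):
    """Triangular pdf of X = U1 + U2, with U1, U2 independent Uniform(0, 1)."""
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0

def empirical_pdf(samples, lo, hi, bins):
    """Histogram scaled to integrate to one; returns (bin centers, densities).

    Assumes every sample lies in [lo, hi].
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in samples:
        idx = min(int((s - lo) / width), bins - 1)  # clamp the right edge
        counts[idx] += 1
    n = len(samples)
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    densities = [c / (n * width) for c in counts]
    return centers, densities

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.random() + rng.random() for _ in range(200_000)]

centers, densities = empirical_pdf(samples, 0.0, 2.0, 20)
max_gap = max(abs(d - theoretical_pdf(c)) for c, d in zip(centers, densities))
print(f"largest gap between simulation and theory: {max_gap:.4f}")
```

If the largest gap stays small as you add samples, theory and simulation agree. If it settles at some value well away from zero, one of the two is wrong, and the wall is waiting.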

This is what I went through last month. I learned a few lessons the hard way about probability and statistics. Firstly, the pdf contains all the information there is about your variable, so if you have the correct pdf you can go home early and tell your family about your great achievement. Of course they will try to show some interest and hide the weird looks they often give you.

If you don't have the pdf, try to figure out the probability of some event, i.e., $P(X \le x)$. Differentiate it with respect to $x$ and you get your precious pdf, since that probability is nothing but the cumulative distribution function (cdf).
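As a small worked example (my own choice of variable, picked only because the algebra is short), let $X = \max(U_1, U_2)$ with $U_1, U_2$ independent and uniform on $[0, 1]$. The maximum is below $x$ exactly when both variables are, so for $0 \le x \le 1$:

$$
F_X(x) = P(X \le x) = P(U_1 \le x)\,P(U_2 \le x) = x^2,
\qquad
f_X(x) = \frac{d}{dx} F_X(x) = 2x .
$$

Outside $[0, 1]$ the cdf is constant (0 to the left, 1 to the right), so the pdf is zero there.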

If you can't find the cdf, try finding the first moment and the second central moment, which are nothing but fancy names for the mean and variance of your random variable or process. Having done that, approximate the pdf with a Gaussian distribution with the same mean and variance. You would not believe how many papers in the IEEE Transactions on Signal Processing use the Gaussian distribution as an approximation. Just invoke the central limit theorem and you get Mr. Gaussian distribution, which simplifies the maths, and you get yourself a journal paper! If the Gaussian pdf is not a good fit, I suggest using the Gamma distribution, which is very flexible. Use moment matching to find the required parameters of the Gamma pdf.
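Here is a rough sketch of moment matching in Python. For a Gamma distribution with shape k and scale θ, the mean is kθ and the variance is kθ², so matching gives k = mean²/variance and θ = variance/mean. The data set is an assumption made up for the demonstration: sums of three Exp(1) variables, which are exactly Gamma(3, 1), so we know what the fit should return. The function names are illustrative only.

```python
import math
import random

def gamma_from_moments(mean, variance):
    """Shape k and scale theta of the Gamma pdf with the given mean and variance."""
    shape = mean ** 2 / variance
    scale = variance / mean
    return shape, scale

def gamma_pdf(x, shape, scale):
    """Gamma density at x > 0, computed in the log domain for stability."""
    log_pdf = ((shape - 1.0) * math.log(x) - x / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)

def gaussian_pdf(x, mean, variance):
    """Gaussian density with the same two moments, for comparison."""
    return math.exp(-(x - mean) ** 2 / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)

# Hypothetical data: each sample is a sum of three Exp(1) draws, i.e. Gamma(3, 1).
rng = random.Random(1)
data = [sum(rng.expovariate(1.0) for _ in range(3)) for _ in range(100_000)]

mean = sum(data) / len(data)
variance = sum((x - mean) ** 2 for x in data) / len(data)
shape, scale = gamma_from_moments(mean, variance)

print(f"matched shape = {shape:.3f}, scale = {scale:.3f}")
print(f"pdf at x = 1: Gamma fit {gamma_pdf(1.0, shape, scale):.4f}, "
      f"Gaussian fit {gaussian_pdf(1.0, mean, variance):.4f}")
```

Both fits share the same mean and variance by construction. The difference is in the shape: the Gamma fit is skewed and lives on positive values only, which is often what a power or energy variable needs.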

If Gamma doesn't work, go back to basics and use the Chernoff bound to find an upper bound on the tail probability.
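For reference, the Chernoff bound says that P(X ≥ a) ≤ exp(-sa) E[exp(sX)] for every s > 0, so you minimize the right-hand side over s. Below is a minimal sketch that does the minimization on a grid. I picked a standard Gaussian as the example (my assumption) because its moment generating function exp(s²/2) and its exact tail are both known, so you can see how loose the bound is. The function names and the grid are illustrative choices.

```python
import math

def chernoff_bound(log_mgf, a, s_grid):
    """Upper bound on P(X >= a): min over the grid of exp(log_mgf(s) - s * a)."""
    return min(math.exp(log_mgf(s) - s * a) for s in s_grid)

def gaussian_log_mgf(s):
    """log E[exp(s X)] for a standard Gaussian X."""
    return 0.5 * s * s

def gaussian_tail(a):
    """Exact P(X >= a) for a standard Gaussian X."""
    return 0.5 * math.erfc(a / math.sqrt(2.0))

s_grid = [0.01 * i for i in range(1, 1001)]  # s in (0, 10]
for a in (1.0, 2.0, 3.0):
    bound = chernoff_bound(gaussian_log_mgf, a, s_grid)
    print(f"a = {a}: exact tail = {gaussian_tail(a):.5f}, Chernoff bound = {bound:.5f}")
```

For the Gaussian the minimum sits at s = a, which gives the closed form exp(-a²/2). The bound is never tight, but it decays at the right exponential rate, and all it needs is the moment generating function.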
If that doesn't work, well... try painting the wall with your favourite color and bang harder!!
And tell me if you find other ways :)

Comment by October 9, 2012
I assume that there are some standard methods for fitting various probability distributions to your variable or process. As you mentioned, the "Method of Moments" is one of them, but there is also "Maximum Likelihood" and so on. Furthermore, there is a wide spectrum of probability distributions available, so you can fit any sample you may have.
BTW: I always thought that PDF stands for "probability density function".
Comment by January 14, 2013
Interesting?
