Data Analytics Methods

If odds are not odd, what about odds ratios?

What are the odds of developing a brain tumor from long-term use of cell phones? This is an evolving area of research. Some studies have found an association and others have not. But two recent meta-analyses suggest that the odds are about 33% to 44% greater with long-term cell phone usage. Got your attention?…
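To make "33% to 44% greater odds" concrete, here is a minimal sketch of how an odds ratio acts on a baseline risk. The baseline probability used below is a made-up placeholder, not a figure from the meta-analyses, and the function name is hypothetical.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability implied by scaling the baseline odds by odds_ratio."""
    if not 0 < baseline_prob < 1:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    baseline_odds = baseline_prob / (1 - baseline_prob)  # probability -> odds
    new_odds = baseline_odds * odds_ratio                # odds ratio multiplies odds
    return new_odds / (1 + new_odds)                     # odds -> probability


if __name__ == "__main__":
    hypothetical_baseline = 0.001  # assumption for illustration only
    for ratio in (1.33, 1.44):     # "33% to 44% greater odds"
        print(ratio, apply_odds_ratio(hypothetical_baseline, ratio))
```

When the baseline probability is small, odds and probability are nearly equal, so the odds ratio reads approximately like a relative risk. For common outcomes the two can differ noticeably.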

Odds and probability…two sides of the same coin

What are the lifetime odds of dying from being hit by a meteorite? 1 in 1,600,000. Yep, not very likely. You are much more likely to die from a dog attack (1 in 86,781) or from a lightning strike (1 in 138,849). But why odds? Why not express these likelihoods in terms of probabilities? Seems…
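The conversion between the two scales is short enough to sketch. This assumes a "1 in N" figure is read as one event per N people (a probability of 1/N), which is one common reading and may not be the one the article adopts. The function names are hypothetical.

```python
def prob_from_one_in(n: float) -> float:
    """Read '1 in n' as a probability of 1/n (an assumption, see the lead-in)."""
    if n <= 1:
        raise ValueError("n must be greater than 1")
    return 1.0 / n


def odds_from_prob(p: float) -> float:
    """Odds in favor: p / (1 - p)."""
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return p / (1 - p)


def prob_from_odds(odds: float) -> float:
    """Inverse conversion: odds / (1 + odds)."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


if __name__ == "__main__":
    # Figures quoted in the teaser above.
    for label, n in [("meteorite", 1_600_000),
                     ("dog attack", 86_781),
                     ("lightning strike", 138_849)]:
        p = prob_from_one_in(n)
        print(f"{label}: probability={p:.3e}, odds={odds_from_prob(p):.3e}")
```

For events this rare the two numbers are almost identical, which is part of why the terms get used interchangeably in casual writing.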

Curse of Big Data

“Big data.” We checked in with Google search trends recently. It appears that “Big Data” has lost its luster search-wise; it started trending down about 4 years ago. Nowadays, everything is big data?

Implications of big data

However, this does not mean we should lose sight of certain statistical implications associated with being “big”. Yes, large amounts of…

Practical Time Series Forecasting – Bounding Uncertainty

“A good forecaster is not smarter than everyone else, he merely has his ignorance better organized.” ― Anonymous

Predicting the future is an exercise in probability rather than certainty. As we have mentioned several times over the course of these articles, your forecast model will be wrong. It is just a matter of how useful…

Practical Time Series Forecasting – Meta Models

“There are two kinds of forecasters: those who don’t know, and those who don’t know they don’t know.” ― John Kenneth Galbraith

After an extensive model building and vetting process, along the lines we previously discussed here and here, the practical forecaster may still be left with several strong-performing models. These models perform similarly…

Practical Time Series Forecasting – Know When to Roll ‘em

“Prediction is very difficult, especially if it’s about the future.” ― Niels Bohr, physicist

Holdout samples are a key component of estimating a “useful” forecasting model. Set aside data at least equal in length to your forecast horizon (“holdout sample”). Build your models on the remaining data (“modeling sample”). And compare the candidate models’ forecast…
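The split described above can be sketched in a few lines. This is a minimal illustration that holds out exactly one forecast horizon from the end of the series. The function name and the sample data are hypothetical, and the article may go on to describe a more elaborate scheme.

```python
from typing import List, Sequence, Tuple


def split_holdout(series: Sequence[float], horizon: int) -> Tuple[List[float], List[float]]:
    """Split a time-ordered series into (modeling sample, holdout sample).

    The holdout is the last `horizon` observations, so it is equal in length
    to the forecast horizon. Order is preserved; nothing is shuffled.
    """
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    if horizon >= len(series):
        raise ValueError("series must be longer than the holdout")
    data = list(series)
    return data[:-horizon], data[-horizon:]


if __name__ == "__main__":
    monthly_values = [float(v) for v in range(1, 25)]  # made-up 24-point series
    modeling, holdout = split_holdout(monthly_values, horizon=6)
    print(len(modeling), len(holdout))
```

Because the observations are time-ordered, the holdout is taken from the end of the series and not drawn at random, so each candidate model is scored on data that comes after everything it was fit to.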

Practical Time Series Forecasting – To Difference or Not to Difference

“It is sometimes very difficult to decide whether trend is best modeled as deterministic or stochastic, and the decision is an important part of the science – and art – of building forecasting models.” ― Diebold, Elements of Forecasting, 1998

A time series can have a very strong trend. Visually, we often can see it. Gross…

Practical Time Series Forecasting – What Makes a Model Useful?

“In God we trust. All others must bring data.” ― W. Edwards Deming, statistician

So, you have estimated a bunch of forecasting models and realize (kudos to you!) that they are “all wrong” (à la George Box). But your forecasting deadline is looming, and you need to find some useful models on which to base a…

Practical Time Series Forecasting – Know When to Hold ‘em

“The only relevant test of the validity of a hypothesis is comparison of prediction with experience.” ― Milton Friedman, economist

Holdout samples are a mainstay of predictive analytics. Set aside a portion of your data (say, 30%). Build your candidate models. Then “internally validate” your models using the holdout sample. More sophisticated methods like cross…
