By Eugene Chen, Ph.D.
Forecasting as a human activity has a long history. In ancient Egypt, priests monitored the level and clarity of the Nile River using Nilometers. From these measurements, they predicted how good the crops would be in the coming year. It is fair to say that these priests were among the early forecasters. Fast-forward to 2015, and forecasts have become ubiquitous in our daily lives. An average Joe uses a weather forecast to decide whether to take an umbrella to work. An executive at a mining company may use an iron-ore price forecast to make future business plans. The staff of the Fed make and use various types of economic forecasts for policy-making.
Advances in technology have greatly improved the accuracy and stability of forecasts. Equipped with modern computers and large amounts of high-quality digital data, forecasters of our time are able to construct and validate accurate forecast models with unprecedented efficiency. In this article, I would like to briefly introduce our forecast models at DecisionNext. It is not meant to be a detailed account of our repertoire. The goal is just to give you, the reader, a general idea of what we do.
A good deal of the forecasts that we make at DecisionNext fall into the category of ‘time-series forecasts’. That is, a quantity in the future is predicted from a series of previously observed values. Concrete examples include forecasts of copper price, gasoline production, and the average beef marbling in a slaughterhouse. When forecasting a time-series, we usually ask ourselves three questions:
1. Does the series follow a deterministic trend or a cyclic pattern? A nice and quick forecast can be made if one can show that the time-series actually repeats itself. This may sound too good to be true. However, many time-series do repeat themselves to a good degree. For example, agricultural data such as regional grain production tend to be highly seasonal.
2. Do the series and its associated noise affect the series' own future? Sometimes, the future value of a time-series is influenced by its immediate past. This is often true for the prices of goods whose intrinsic values are hard to evaluate. Examples include short-term prices of stocks and luxury goods (such as fine wine and premium tea leaves). Because the intrinsic value is hard to evaluate, a price tends to float higher if it has had an upward trend in the immediate past (this is the “momentum” of your daily financial news feed) and to sink in the opposite case.
3. Do external factors affect the time-series? It is common for a time-series to be sensitive to external factors. A concrete example is the price of perishable goods. Vendors of perishable goods are under pressure to sell whatever they have in time to avoid waste and storage costs. Insight into future supply thus has strong predictive power over the movement of price (as vendors use the price to adjust the rate of sale).
More often than not, the answer is ‘yes’ to all three questions. A real-life time-series often simultaneously follows a deterministic trend or periodic pattern, influences its own future, and is affected by external factors, though only to a certain degree in each case. A linear forecasting model that encompasses all these aspects is called an ARMAX model (autoregressive moving average with exogenous inputs).
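To make the three ingredients concrete, here is a minimal Python sketch of a linear model that combines a seasonal pattern, the series' own previous value, and one external factor. It is an illustration only, not the model we run at DecisionNext. It deliberately leaves out the moving-average (noise) terms of a full ARMAX model, so strictly speaking it is an ARX-style regression fitted by ordinary least squares. The function names, the period of 12, and the synthetic data are all assumptions made up for this example.

```python
import numpy as np

def build_design(y, x, period):
    """Design matrix for t = 1..n-1 (hypothetical helper).

    Columns: intercept, seasonal sine, seasonal cosine,
    previous value y[t-1], external factor x[t].
    """
    t = np.arange(1, len(y))
    return np.column_stack([
        np.ones(len(t)),
        np.sin(2 * np.pi * t / period),
        np.cos(2 * np.pi * t / period),
        y[:-1],   # the series' own immediate past
        x[1:],    # the external factor at the same time step
    ])

def fit_arx(y, x, period):
    """Least-squares fit; no moving-average terms (simplifying assumption)."""
    X = build_design(y, x, period)
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return coef

# Synthetic data with known coefficients, invented for illustration.
rng = np.random.default_rng(0)
n, period = 400, 12
x = rng.normal(size=n)          # stand-in for e.g. a supply indicator
y = np.zeros(n)
for t in range(1, n):
    y[t] = (1.0
            + 2.0 * np.sin(2 * np.pi * t / period)  # seasonal pattern
            + 0.5 * y[t - 1]                        # influence of own past
            - 0.8 * x[t]                            # external factor
            + rng.normal(scale=0.1))                # noise

coef = fit_arx(y, x, period)
print("lag coefficient:", coef[3])
print("external-factor coefficient:", coef[4])
```

On this synthetic series the fitted lag and external-factor coefficients should land close to the values used to generate the data (0.5 and -0.8). A full ARMAX fit would also model how past noise carries forward, which requires an iterative estimator rather than a single least-squares solve.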
Forecasts are never perfect, and error is an unavoidable component of any forecast. No scientific forecast is complete without a proper estimate of its errors. Validation serves this goal. The idea of model validation is to purposely hold out a subset of the available data, forecast that subset (as if it were unknown) using only the remaining data, and compare the held-out data with the forecasted values. The difference between the two can be used as an estimate of the model’s accuracy.
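The holdout idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the last observations are held out, the forecaster is a "seasonal naive" rule that simply repeats the last observed cycle (chosen here only because it is short, not because it is our method), and the series is synthetic. All names are hypothetical.

```python
import numpy as np

def seasonal_naive(train, horizon, period=12):
    """Forecast by repeating the last full cycle seen in the training data."""
    last_cycle = train[-period:]
    repeats = -(-horizon // period)  # ceiling division
    return np.tile(last_cycle, repeats)[:horizon]

def holdout_errors(series, n_holdout, forecast_fn):
    """Hide the last n_holdout points, forecast them, return actual minus forecast."""
    train, held_out = series[:-n_holdout], series[-n_holdout:]
    predicted = forecast_fn(train, n_holdout)  # sees only the training part
    return held_out - predicted

# Synthetic seasonal series: a repeating 12-step pattern plus noise.
rng = np.random.default_rng(1)
pattern = np.tile(np.arange(12.0), 5)
series = pattern + rng.normal(scale=0.5, size=pattern.size)

errors = holdout_errors(series, n_holdout=12, forecast_fn=seasonal_naive)
mae = float(np.mean(np.abs(errors)))
print("mean absolute error on held-out data:", mae)
```

The key design point is that `forecast_fn` never sees the held-out values, so the resulting errors give an honest picture of how the model behaves on data it has not been fitted to.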
A detailed analysis of the errors can further enrich the forecast. By characterizing the errors as a statistical distribution, the forecast can be cast into the form of a probability distribution rather than just a number. Thus, the user of the forecast is provided not only with a ‘best guess’ for, say, a future commodity price, but also with the probability that the price will be 10%, 20%, or 30% below or above that guess. This opens up the possibility of scenario-based planning, where the user can develop business plans that are both profitable and risk-averse. The plan is risk-averse because all possible price movements are considered. The plan can still be profitable because all risks are accurately quantified. At his or her discretion, the user does not need to sacrifice profit for events that are unlikely to happen.
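One simple way to turn validation errors into such probabilities is to use their empirical distribution directly. The Python sketch below assumes that past relative errors (actual price divided by forecast, minus one) are representative of future ones, which is a strong assumption. The error values are invented purely for illustration and the function names are hypothetical.

```python
import numpy as np

def prob_below(relative_errors, drop):
    """Share of past relative errors at or below -drop (e.g. drop=0.10 for 10%)."""
    return float(np.mean(np.asarray(relative_errors) <= -drop))

def prob_above(relative_errors, rise):
    """Share of past relative errors at or above +rise."""
    return float(np.mean(np.asarray(relative_errors) >= rise))

# Hypothetical relative errors from past validation runs (made-up numbers).
rel_errors = np.array([-0.25, -0.12, -0.05, -0.02, 0.00,
                        0.01,  0.03,  0.08,  0.15, 0.22])

for pct in (0.10, 0.20, 0.30):
    print(f"P(price at least {pct:.0%} below forecast) =", prob_below(rel_errors, pct))
    print(f"P(price at least {pct:.0%} above forecast) =", prob_above(rel_errors, pct))
```

With only ten made-up errors the estimates are coarse. In practice one would want many more validation errors, or a fitted parametric distribution, before relying on the tail probabilities.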
What is in a forecast? At DecisionNext, a forecast consists of prudent variable selection, judicious model building, and careful validation. We have built a scalable platform that can help you make the most logical decision based on the latest data. Are you ready to gain the analytic edge?