Paul Park: Forecasting Retail Sales

Data

The US Census Bureau website has a lot of interesting data to play around with. I wanted to try some forecasting and decided to look at retail and food services sales. The sales data runs from January 1992 to June 2015; I loaded it into a pandas DataFrame and plotted it below:

[Figure: retail and food services sales, January 1992 to June 2015]
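As a rough sketch of the loading step, assuming the Census figures have already been exported to a two-column CSV: the column names `month` and `sales` and the helper `load_monthly_sales` are my own placeholders, not the layout of the actual Census download, and the rows in the demo are dummy values rather than real sales figures.

```python
import io

import pandas as pd


def load_monthly_sales(source):
    """Read a CSV with hypothetical 'month' and 'sales' columns into a
    DataFrame indexed by month, sorted chronologically."""
    df = pd.read_csv(source, parse_dates=["month"], index_col="month")
    return df.sort_index()


# Dummy rows purely to show the expected shape (NOT real Census numbers).
demo_csv = io.StringIO(
    "month,sales\n"
    "1992-03-01,3.0\n"
    "1992-01-01,1.0\n"
    "1992-02-01,2.0\n"
)
sales = load_monthly_sales(demo_csv)
print(sales)
```

With matplotlib installed, `sales["sales"].plot()` would then draw a line chart of the series against the date index.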

The time series shows a roughly linear upward trend along with seasonal patterns, both of which need to be accounted for when forecasting sales. Another feature of interest is the sharp dip between 2008 and 2009, most likely due to the subprime mortgage crisis.
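One quick way to put numbers on the trend and the seasonality, sketched in plain Python: fit a least-squares line to the series, and average each calendar position's ratio to its own year's mean. The function names are mine, the series is assumed to be monthly and to start in January, and the demo data is synthetic rather than the Census series.

```python
def linear_trend(values):
    """Least-squares slope and intercept of values against t = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return slope, y_mean - slope * t_mean


def seasonal_indices(values, period=12):
    """Average ratio of each within-year position to that year's mean.

    Only complete years are used; an index above 1 marks a position that
    tends to run above its yearly average.
    """
    n_years = len(values) // period
    if n_years < 1:
        raise ValueError("need at least one complete year")
    totals = [0.0] * period
    for year in range(n_years):
        chunk = values[year * period:(year + 1) * period]
        year_mean = sum(chunk) / period
        for pos, y in enumerate(chunk):
            totals[pos] += y / year_mean
    return [total / n_years for total in totals]


# Synthetic monthly series: rising level with a boost in the last month of each year.
demo = [(100 + t) * (1.2 if t % 12 == 11 else 1.0) for t in range(48)]
slope, intercept = linear_trend(demo)
print("slope per month:", slope)
print("seasonal indices:", seasonal_indices(demo))
```

This is only a diagnostic; it does not separate the 2008 to 2009 dip from the underlying trend.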

After struggling to forecast with Python's existing packages, I decided it might be easier to call R from Python using rpy2, because R seems to have a more extensive and better-documented set of forecasting libraries. It didn't take me long to decide that it would be even easier to clean the data in Python, save it as a CSV file, and then import it into R to do the forecasting. The plot below uses an ETS model (an exponential smoothing state space model; the letters stand for error, trend, and seasonality) with multiplicative error, additive trend, and multiplicative seasonality. The forecast is in green, with the different shades indicating the 80% and 95% prediction intervals.

[Figure: ETS forecast of retail and food services sales, with 80% and 95% intervals shaded]
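To make the "multiplicative error, additive trend, multiplicative seasonality" description concrete, here is a minimal pure-Python sketch of the ETS(M,A,M) state updates. It is not the fitting procedure R uses: the smoothing parameters are fixed by hand instead of being estimated, the initial states come from a simple heuristic of my own, and no prediction intervals are computed. The function name and the demo series are illustrative only.

```python
def ets_mam_forecast(y, period, alpha, beta, gamma, horizon):
    """Run the ETS(M,A,M) recursions over y with fixed smoothing
    parameters and return point forecasts for the next `horizon` steps."""
    if len(y) < 2 * period:
        raise ValueError("need at least two full seasons of data")
    if min(y) <= 0:
        raise ValueError("multiplicative models need strictly positive data")

    # Heuristic initial states (an assumption, not an estimate):
    # level = mean of the first season, trend = per-step change between the
    # first two seasonal means, seasonal = first season relative to its mean.
    mean1 = sum(y[:period]) / period
    mean2 = sum(y[period:2 * period]) / period
    level = mean1
    trend = (mean2 - mean1) / period
    seasonal = [v / mean1 for v in y[:period]]

    for t, obs in enumerate(y):
        s = seasonal[t % period]
        base = level + trend                 # additive trend
        fitted = base * s                    # multiplicative seasonality
        err = (obs - fitted) / fitted        # relative (multiplicative) error
        level = base * (1 + alpha * err)
        trend = trend + beta * base * err
        seasonal[t % period] = s * (1 + gamma * err)

    n = len(y)
    # Point forecasts: set future errors to zero and extend the trend.
    return [
        (level + h * trend) * seasonal[(n + h - 1) % period]
        for h in range(1, horizon + 1)
    ]


# Synthetic monthly series (not Census data): linear growth times a seasonal pattern.
pattern = [0.90, 0.88, 1.00, 0.98, 1.04, 1.02, 1.02, 1.04, 0.97, 1.00, 1.03, 1.12]
series = [(100 + 0.5 * t) * pattern[t % 12] for t in range(48)]
print(ets_mam_forecast(series, period=12, alpha=0.3, beta=0.1, gamma=0.1, horizon=6))
```

In practice the parameters and initial states would be estimated from the data, which is what a fitted model such as `ets()` with `model="MAM"` in R's forecast package does; whether that exact call produced the plot above is my assumption.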



Next Steps?


