The Time Series Forecasting System Charles Hallahan, Economic Research Service/USDA, Washington, DC

INTRODUCTION

The Time Series Forecasting System (TSFS) is a component of SAS/ETS that provides a menu-based front end for forecasting activities. Creating a date variable, graphing a data series, quickly seeing the results of differencing and/or applying a log transformation, testing for unit roots, examining autocorrelation and partial autocorrelation plots, performing seasonality tests, and, finally, estimating models and producing forecasts are all just a mouse click away.

This tutorial is an introduction to the TSFS. Part 1 shows how to access the system and generate forecasting models for several variables through automatic selection from a default list of models provided with the TSFS. Simulated data will be used for this exercise. Part 2 shows how to override the defaults of the TSFS and take control of specifying a model. This paper is written using the SAS System Version 8.02.

PART 1: USING THE TSFS IN AUTOMATIC MODE

The first step is to make sure a libname statement has been executed to point to the directory containing the SAS data set to be analyzed. We'll start with two simulated quarterly data series of length 48 each. Y1 was generated to follow an AR(1) process with an AR parameter of 0.8 and a mean of 2. Y2 was generated as an ARMA(1,1) with parameter values of 0.8 and 0.2, a mean of 2, a linear trend of 0.1 and additive seasonal factors of 0.1, -0.4, 0.3 and 0.1. The two simulated series are in the SAS dataset mylib.simdata.

The TSFS can be accessed through the menu choices Solutions > Analysis > Forecasting System (see Figure 1), by entering the command forecast on the command line, or by assigning the command forecast to a function key.

Figure 1 Accessing the TSFS

A new project is started (see Figure 2), and since the SAS dataset does not contain a date variable (see Figure 2 again), one can be created within TSFS (see Figure 3) by selecting a starting date and frequency. We use 1990Q1 and QTR respectively and the default variable name of DATE.

Figure 2 Starting a Project

Figure 3 Creating a Date Variable
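As a rough illustration of the data setup described above, the following Python sketch generates two series of length 48 with the stated parameters and a quarterly date label starting at 1990Q1. It is not the code used to create mylib.simdata. The unit-variance Gaussian innovations, the seed, the reading of 0.8 as the AR parameter and 0.2 as the MA parameter for Y2, the sign convention of the MA term, and all function names are assumptions made for illustration.

```python
import random

def quarter_labels(start_year, start_quarter, n):
    """Return n labels such as '1990Q1', advancing one quarter per observation."""
    labels = []
    year, quarter = start_year, start_quarter
    for _ in range(n):
        labels.append(f"{year}Q{quarter}")
        quarter += 1
        if quarter > 4:
            year, quarter = year + 1, 1
    return labels

def simulate_ar1(n, phi, mean, rng):
    """AR(1) around a constant mean: (y_t - mean) = phi * (y_{t-1} - mean) + e_t."""
    dev = 0.0
    series = []
    for _ in range(n):
        dev = phi * dev + rng.gauss(0.0, 1.0)  # unit innovation variance is assumed
        series.append(mean + dev)
    return series

def simulate_trend_seasonal_arma(n, phi, theta, mean, slope, seasonal, rng):
    """Mean + linear trend + additive quarterly factors + ARMA(1,1) errors."""
    u_prev, e_prev = 0.0, 0.0
    series = []
    for t in range(n):
        e = rng.gauss(0.0, 1.0)
        u = phi * u_prev + e + theta * e_prev  # MA sign convention is assumed
        series.append(mean + slope * (t + 1) + seasonal[t % 4] + u)
        u_prev, e_prev = u, e
    return series

rng = random.Random(12345)  # arbitrary seed, for reproducibility only
date = quarter_labels(1990, 1, 48)
y1 = simulate_ar1(48, 0.8, 2.0, rng)
y2 = simulate_trend_seasonal_arma(48, 0.8, 0.2, 2.0, 0.1, [0.1, -0.4, 0.3, 0.1], rng)
print(date[0], date[-1], len(y1), len(y2))
```

With 48 quarterly observations starting in 1990Q1, the date label runs through 2001Q4.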

To view a series, click on the View Series button and select Y1 (see Figure 4).

Figure 4 Viewing a Series

Clicking on a point on the graph displays the actual data value in a box on the upper right. The buttons along the upper left are for zooming and transformations. For example, clicking on the Δ key creates the first difference and immediately graphs it. The buttons down the upper right produce the sample autocorrelation and partial autocorrelation functions and also carry out white noise and unit root tests. These buttons will be used later when building specific models. Clicking on the return arrow along the top brings back the main menu.

Figure 5 Time Series Forecasting Menu

TSFS can find the best fitting model for the two variables by selecting from a preset list of models, using RMSE, the root mean square error, as the default goodness-of-fit measure. From the Time Series Forecasting Menu (see Figure 5), click on the Fit Models Automatically button. From the Automatic Model Fitting screen (see Figure 6), click on the Select button on the line Series to Process: and select the two variables Y1 and Y2.

Figure 6 The Automatic Model Fitting Screen

Before clicking on the Run button to have the best model selected on the basis of the default Model Selection Criterion, the Root Mean Square Error, look at some of the other default options. The default is to keep just the Best Model. From the Options pull-down menu along the top, choose Automatic Fit and change the default to keep the Best 5 Models based on RMSE. Note that the option to have the TSFS automatically perform diagnostics on the variables is checked. These diagnostics will be discussed later. See Figure 7 for this step.

Figure 7 The Automatic Model Selection Options

Another option of interest is which models are in the selection list as candidates for best fitting model. Click again on the Options pull-down menu and select Model Selection List. The result is Figure 8.

Figure 8 Model Selection List

The models in this list are characterized by three factors:
- trend: none, deterministic, or stochastic (that is, first or second difference)
- log or not
- seasonality or not

For example, the Winters method is a smoothing model for data exhibiting a trend and seasonality; no log is taken. The TSFS documentation clearly explains all the available models. Models not included in this list should be added before selecting the best models. If the option to automatically have TSFS perform the preliminary diagnostics is selected, then the list of models actually fit to a data series is much reduced.

To add models to the list, click on the Actions button. For this tutorial we'll add 8 models to the list: AR(1), MA(1), ARMA(1,1), linear trend + seasonal dummies + AR(1), and their log equivalents. For example, to add the model linear trend + seasonal dummies + AR(1) to the list, after clicking on the Actions button, click on Add and then Custom Model. After making the appropriate selections, the result is Figure 9.

Figure 9 Custom Model Selection Screen

After adding these 8 models, return to the Automatic Model Fitting screen (Figure 6) and click on Run. A box appears stating that models will be fit to the 2 series Y1 and Y2 and that the models will be selected after the series diagnostics are run. The alternative is to fit all the models to each series. While it takes longer to fit all the models in the list, we'll see that using the series diagnostics is not foolproof. In this case, Y1 is found to not need a Log transformation (true), to have a trend (not true), and not to be seasonal (true). Y2 gets a Maybe on needing a Log transformation, and is found to have a trend (true) and not to be seasonal (not true). As a result, the true models are not used as candidate models in either case. The Best 5 Models as determined by RMSE are summarized in the Automatic Model Fitting Results screen, Figure 10.

Figure 10 Automatic Model Fitting Results Screen

The best fitting model for Y1 is Damped Trend Exponential Smoothing with an RMSE of 1.09531, and for Y2 it is Linear (Holt) Exponential Smoothing with an RMSE of 1.11705. We'll now fit the true models for both Y1 and Y2 by clicking on the Close button twice to return to the Main Menu and clicking on the Develop Models button (see Figure 5). Right-click in a blank area, select Fit Models from List, and make sure the box Show All Models is checked (Figure 11). Scroll to the bottom of the list and select the true model, AR(1).

Figure 11 Models to Fit Screen
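The selection rule used in Part 1, ranking candidate models by RMSE and keeping the best few, can be sketched in Python as follows. This is only an illustration of the idea, not the TSFS implementation: the function names, the model labels and the numbers in the usage lines are made up, and dividing the squared errors by the number of observations is an assumption about how the statistic is normalized.

```python
import math

def rmse(actual, predicted):
    """Root mean square error over paired observations (divides by n, an assumption)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

def best_models(fits, keep=5):
    """fits maps a model name to (actual, predicted); return the `keep` lowest-RMSE models."""
    scored = sorted((rmse(a, p), name) for name, (a, p) in fits.items())
    return [(name, score) for score, name in scored[:keep]]

# Hypothetical one-step predictions from three candidate models (made-up numbers).
actual = [2.0, 2.5, 3.0, 2.8]
fits = {
    "AR(1)": (actual, [2.1, 2.4, 3.1, 2.7]),
    "MA(1)": (actual, [2.5, 2.0, 3.5, 2.3]),
    "Mean": (actual, [2.575, 2.575, 2.575, 2.575]),
}
ranking = best_models(fits, keep=2)
for name, score in ranking:
    print(name, round(score, 4))
```

Restricting the candidate list before ranking, as the series diagnostics do, can only remove models from `fits`; it cannot improve the best RMSE found.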

The RMSE for the AR(1) model is 1.01417, about an 8% improvement over the best model selected using the series diagnostics. Repeating the same procedure for Y2 and fitting its true model, linear trend + seasonal dummies + AR(1) errors, leads to an RMSE of 0.94159, a 16% improvement. The moral is: don't rely too strongly on the series diagnostics. I would recommend fitting all available models.

PART 2: USING THE TSFS TO DEVELOP MODELS

In this example we'll use the variable GCD, personal consumption expenditures on durable goods, found in the SAS dataset SASHELP.CITIQTR. The data are quarterly from 1980/1 to 1991/4. The objective is to find a good forecasting model for GCD. A new project is started for this variable. One decision that needs to be made is whether or not to use a Holdout Sample, that is, to put aside some observations when estimating the models and use the held-out observations to evaluate the performance of each model. In this example, we'll find the best models both with and without a holdout sample of 8 periods and see if it makes a difference.

The first step is to graph the data (Figure 12).

Figure 12 Graph of GCD

The data obviously have a trend. Seasonality is harder to determine visually. To decide between using a deterministic or stochastic trend, a unit root test can be performed by clicking on the 3rd button from the top along the right-hand side of the graph (Figure 13).

Figure 13 Unit Root Test for GCD

The Dickey-Fuller test (the middle graph) clearly indicates a unit root, or stochastic trend, and that taking a first difference is appropriate. A first difference is easily obtained by clicking on the Δ key along the top of the graph (Figure 14).

Figure 14 Unit Root Test for ΔGCD

The first difference of GCD appears to be stationary; in fact, the left-most graph in Figure 14 implies that ΔGCD is white noise, i.e., GCD is a random walk. The next step is to examine the autocorrelation (ACF) and partial autocorrelation (PACF) functions of ΔGCD (Figure 15).
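The two operations used in this step, taking a first difference and inspecting the sample autocorrelations of the differenced series, can be sketched in Python as follows. This is a minimal illustration, not what the TSFS computes internally: the function names are hypothetical, the short series is made up rather than taken from GCD, and the band of plus or minus 2 divided by the square root of n is the usual rough white-noise guide, not the test the TSFS displays.

```python
def difference(series):
    """First difference: y_t - y_{t-1}; the result is one observation shorter."""
    return [b - a for a, b in zip(series, series[1:])]

def sample_acf(series, max_lag):
    """Sample autocorrelations r_0..r_max_lag, using the lag-0 sum of squares as the divisor."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def white_noise_band(n):
    """Approximate two-standard-error band for the autocorrelations of white noise."""
    return 2.0 / n ** 0.5

# Made-up trending series, used only to show the calls.
levels = [100.0, 103.0, 104.0, 109.0, 111.0, 116.0, 117.0, 123.0, 124.0, 130.0]
changes = difference(levels)
acf = sample_acf(changes, max_lag=3)
band = white_noise_band(len(changes))
for lag, r in enumerate(acf):
    flag = "outside band" if lag > 0 and abs(r) > band else ""
    print(lag, round(r, 3), flag)
```

Autocorrelations of the differenced series that stay inside the band at every positive lag are consistent with the differenced series being white noise.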
The ACF and PACF suggest that either an AR(1) or MA(1) model for ΔGCD is reasonable. We next fit ARIMA(1,1,0) and ARIMA(0,1,1) models, both with and without intercepts. Intercepts seem to be needed, since Figure 12 looks more like a random walk with drift than without drift. The random walk with drift model was also fit. Based on the Akaike and Schwarz Information Criteria (AIC and SIC) and also RMSE, the ARIMA(1,1,0) model with intercept is the best model when no holdout sample is used. The parameter estimates are shown in Figure 16.
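For readers who want to see how the two information criteria trade fit against model size, here is a small Python sketch. It uses a common sum-of-squared-errors form of the criteria, which is assumed here and may differ from the exact definitions the TSFS reports. The function names and the SSE values in the usage lines are made up for illustration.

```python
import math

def aic(sse, n, k):
    """Akaike criterion in SSE form: n*ln(SSE/n) + 2k (assumed form)."""
    return n * math.log(sse / n) + 2 * k

def sic(sse, n, k):
    """Schwarz criterion in SSE form: n*ln(SSE/n) + k*ln(n) (assumed form)."""
    return n * math.log(sse / n) + k * math.log(n)

# Hypothetical fits on n = 47 differenced observations (made-up SSE values).
n = 47
candidates = {
    "one parameter": (7400.0, 1),
    "two parameters": (7100.0, 2),
}
for name, (sse, k) in candidates.items():
    print(name, round(aic(sse, n, k), 2), round(sic(sse, n, k), 2))
```

Smaller values are better for both criteria. Because ln(n) exceeds 2 once n is 8 or more, the Schwarz criterion penalizes each extra parameter more heavily than the Akaike criterion on a series of this length.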

Figure 15 ACF and PACF for ΔGCD

Since a holdout sample has not been used at this point, the Fit Range and Evaluation Range are the same, namely 1980/1 to 1991/4. The RMSE over this range for the ARIMA(1,1,0) with intercept model is 12.41. The ARIMA(0,1,1) with intercept model is very close, with an RMSE of 12.43.

Figure 15 Parameter Estimates for ARIMA(1,1,0) Model with Intercept for GCD

With t-values in parentheses, the estimates for the intercept and autoregressive coefficient are 5.01 (3.4) and 0.25 (-1.8), respectively. The forecasts for this model can be viewed by clicking on the 2nd icon from the bottom along the right (Figure 16).

Figure 16 Forecasts for ARIMA(1,1,0) Model with Intercept for GCD

We can see how this model compares with our candidate list of models by using the Automatic Fit option and not subsetting the list of models by the series diagnostics. The automatic series diagnostics for GCD are undecided on a Log transformation, detect a trend, and reject seasonality. However, fitting all models in the list makes Seasonal Dummies + Linear Trend + AR(1) the best model according to the RMSE criterion, with RMSE = 11.48. According to the MAPE criterion, the two models are essentially equivalent, with a MAPE of 2.85%. Figure 17 shows the parameter estimates for the Seasonal Dummies + Linear Trend + AR(1) model. Note that none of the seasonal dummies is significant at the .05 level.

Figure 17 Parameter Estimates for Seasonal Dummies + Linear Trend + AR(1) Model

We will now define a holdout sample of the last eight observations, 1990/1 to 1991/4, and evaluate each of the above two models based on the out-of-sample RMSE. To do this, go back to the Develop Models screen and click on the Set Ranges button. This brings up the Time Ranges Specification screen (Figure 18), where a holdout sample of 8 periods is set. The models are now refit from the Edit menu along the top of the screen: Edit > Refit Models > All Models.
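The holdout idea can be sketched in Python as follows: the last 8 observations are set aside, a model is fit to the rest, and the RMSE is computed only over the held-out periods. The random walk with drift forecast is used because it is simple to write down. The function names and the series are made up for illustration, and the use of multi-step forecasts from the end of the fit range is an assumption, not a description of how the TSFS produces predictions over its evaluation range.

```python
import math

def split_holdout(series, holdout):
    """Split into (fit range, evaluation range), holding out the last `holdout` points."""
    if not 0 < holdout < len(series):
        raise ValueError("holdout must be between 1 and len(series) - 1")
    return series[:-holdout], series[-holdout:]

def drift_forecast(fit, horizon):
    """Random walk with drift: last value plus h times the mean first difference."""
    drift = (fit[-1] - fit[0]) / (len(fit) - 1)
    return [fit[-1] + drift * h for h in range(1, horizon + 1)]

def rmse(actual, predicted):
    """Root mean square error over paired observations."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

# Made-up quarterly series of length 48 with an exact linear trend.
series = [float(10 + 2 * t) for t in range(48)]
fit, held = split_holdout(series, 8)
forecast = drift_forecast(fit, len(held))
print(len(fit), len(held), rmse(held, forecast))
```

Because the evaluation range is never seen during fitting, a model that tracks the fit range closely can still score poorly here, which is the comparison made in the next step.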

Figure 18 Time Ranges Specification Screen

The previous best model, Seasonal Dummies + Linear Trend + AR(1), now has an RMSE of 26.41, while the ARIMA(1,1,0) with intercept model has an RMSE of 15.34. However, the overall best models using this particular holdout sample and the RMSE criterion are the ARIMA(1,1,0) NOINT and ARIMA(0,1,1) NOINT models, with RMSEs of 12.76. So the answer to the question "What is the best forecasting model?" is "It depends!"

Topics not covered in this introduction to the TSFS include intervention models, adding explanatory variables, and combining forecasts.

CONCLUSION

The Time Series Forecasting System provides an intuitive point-and-click interface to the forecasting tools of SAS/ETS software. You can extend the list of candidate models provided by the SAS System, and, through automatic diagnostic tests for trend, seasonality, and log transformation, a large number of models can be fit automatically to a set of data series. Graphs of data series and error diagnostics are immediately available. Explanatory variables, time trends, seasonal dummies, and interventions are easily included in models. In summary, the TSFS converts the task of forecasting with the SAS System from a batch-oriented process to a straightforward point-and-click operation.

SAS and SAS/ETS are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.

CONTACT INFORMATION

Author Name: Charles Hallahan
Company: Economic Research Service/USDA
Address: Room 3117, 1800 M St, NW
City, State ZIP: Washington, DC 20036-5831
Work Phone: 202-694-5051
Fax: 202-694-5718
e-mail: hallahan@ers.usda.gov