ContentsGo back |
ARIMA Time Series predictions using Box-Jenkins approach with R
This document shows the way to use R to make predictions with ARIMA models.
The data used in this document, can be downloaded in the following link:
http://www.robhyndman.info/TSDL/data/abraham4.dat
Box-Jenkins Methodology has the following steps to make ARIMA models:
Note: The piece of code shown in this document appears in a red box while the
simulation of a R session is shown in a blue box. I. Model identification:If you want to know if your data is stationary, you will look up your data. Load this piece of code to see the form of your data. ################################### # DATA ANALYSIS SCRIPT # # Autor: Juan Antonio Breña Moral # # Email: bren@juanantonio.info # ################################### ################################### # R ENGINE SETTINGS # ################################### #Sentence written to set working directory WD <- "C://Documents and Settings/JAB/Escritorio"; setwd(WD); # make sure working directory is set getwd(); # set digits options("digits"=3) ################################### # ARIMA MODEL SCRIPT # ################################### INSTALL_LIBRARIES <- function(){ install.packages("forecast"); install.packages("nortest"); install.packages("quadprog"); install.packages("zoo"); install.packages("tseries"); } #Function used to load libraries #FORECAST: www-personal.buseco.monash.edu.au/~hyndman/Rlibrary/forecast/ #NORMAL TEST: http://cran.r-project.org/src/contrib/Descriptions/nortest.html #TIME SERIES: //cran.r-project.org/src/contrib/Descriptions/tseries.html LOAD_LIBRARIES <- function(){ library("forecast"); library("nortest"); library("tseries"); } # Funcion que genera el mensaje de cabecera. HEADER <-function(){ cat("\n::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::"); cat("\n:: SALES PREDICTION | AUTHOR: Juan Antonio Breña Moral ::"); cat("\n:: VERSION: 2006-06-17 | DOWNLOADS: http://www.juanantonio.info ::"); cat("\n:: EMAIL: bren@juanantonio.info ::"); cat("\n::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::"); cat("\n:: DESCRIPTION: ::"): cat("\n:: Sales Prediction, is a Script to make prediccions. It was ::"); cat("\n:: designed to control a sales budget in any company using ::"); cat("\n:: ARIMA Time Series statistical techniques. The script needs ::"); cat("\n:: a sales data to start to build a ARIMA model to make a ::"); cat("\n:: predictions. This script is very useful to design a sales ::"); cat("\n:: budget in the following exercise and control it. The ::"); cat("\n:: methodology used to build ARIMA model is Box-Jenkins. ::"); cat("\n::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::\n"); } TIME_SERIES <- function(DATA,F,S){ DATOS.TS <- ts(DATA,frequency = F, start = S); return(DATOS.TS); } JAB.PREDICTION.EDA <- function(DATOS){ FILE_NAME <- format(Sys.time(), "%X"); FILE_EXTENSION <- "jpeg"; #Graphical Output par(mfrow= c(2,2)); plot(DATOS,type="l") hist(DATOS, probability =T); lines(density(DATOS), col="red"); qqnorm(DATOS); abline(0,1,col="red"); boxplot(DATOS, horizontal=T); par(mfrow= c(1,1)); } #FROM: http://tolstoy.newcastle.edu.au/R/help/99b/0171.html mykurtosis <- function(x) { m4 <- mean((x-mean(x))^4) kurt <- m4/(sd(x)^4)-3 kurt } myskewness <- function(x) { m3 <- mean((x-mean(x))^3) skew <- m3/(sd(x)^3) skew } JAB.PREDICTION.EDA_TABLE <- function(DATAS){ COLUMN_NAMES <- c("MEAN","MEDIAN","SD","KURTOSIS","SKEWNESS"); DATA <- c(); DATA[1] <- mean(DATAS); DATA[2] <- median(DATAS); DATA[3] <- sd(DATAS); DATA[4] <- mykurtosis(DATAS); DATA[5] <- myskewness(DATAS); names(DATA) <- COLUMN_NAMES; print("JAB: EDA VIEW 1.0") print(DATA); stem(DATAS); } JAB.PREDICTION.NORMAL_TESTS <- function(DATAS){ COLUMN_NAMES <- c("SHAPIRO","ANDERSON-DARLING", "CRAMER-VON MISES","KS","JARQUE-BERA"); DATA <- c(); DATA[1] <- shapiro.test(DATAS)$p.value; DATA[2] <- ad.test(DATAS)$p.value; DATA[3] <- cvm.test(DATAS)$p.value; DATA[4] <- lillie.test(DATAS)$p.value; DATA[5] <- jarque.bera.test(DATAS)$p.value; names(DATA) <- COLUMN_NAMES; print("JAB: NORMAL TEST TABLE 1.0") print(DATA); } Enter the data in R system using JGR as R GUI or using RClass. # To predict anything, you have to get the data SALES <- scan(); 6550 8728 12026 14395 14587 13791 9498 8251 7049 9545 9364 8456 7237 9374 11837 13784 15926 13821 11143 7975 7610 10015 12759 8816 10677 10947 15200 17010 20900 16205 12143 8997 5568 11474 12256 10583 10862 10965 14405 20379 20128 17816 12268 8642 7962 13932 15936 12628 12267 12470 18944 21259 22015 18581 15175 10306 10792 14752 13754 11738 12181 12965 19990 23125 23541 21247 15189 14767 10895 17130 17697 16611 12674 12760 20249 22135 20677 19933 15388 15113 13401 16135 17562 14720 12225 11608 20985 19692 24081 22114 14220 13434 13598 17187 16119 13713 13210 14251 20139 21725 26099 21084 18024 16722 14385 21342 17180 14577 LOAD_LIBRARIES(); JAB.PREDICTION.EDA(SALES); JAB.PREDICTION.EDA_TABLE(SALES); JAB.PREDICTION.NORMAL_TESTS(SALES); [1] "JAB: EDA VIEW 1.0" MEAN MEDIAN SD KURTOSIS SKEWNESS 14595.1111111 14076.0000000 4525.2139131 -0.6600995 0.3106385 5 | 6 6 | 6 7 | 026 8 | 0035678 9 | 04455 10 | 03678999 11 | 015678 12 | 012233356788 13 | 02446788889 14 | 2344466788 15 | 1222499 16 | 11267 17 | 0122678 18 | 069 19 | 79 20 | 0112479 21 | 012337 22 | 011 23 | 15 24 | 1 25 | 26 | 1 [1] "JAB: NORMAL TEST TABLE 1.0" SHAPIRO ANDERSON-DARLING CRAMER-VON MISES KS JARQUE-BERA 0.06335382 0.03812055 0.04695266 0.11516182 0.17425073
If you notice, the sales are asymetric. Data fails different tests for normality. NORMAL_SAMPLE <- rnorm(5000); JAB.PREDICTION.EDA(NORMAL_SAMPLE); JAB.PREDICTION.EDA_TABLE(NORMAL_SAMPLE); JAB.PREDICTION.NORMAL_TESTS(NORMAL_SAMPLE); [1] "JAB: EDA VIEW 1.0" MEAN MEDIAN SD KURTOSIS SKEWNESS -0.0001047225 0.0006451141 0.9922556060 0.0825848178 -0.0086063633 The decimal point is at the | -4 | 5 -4 | -3 | 65 -3 | 3000 -2 | 999998877777666666666665 -2 | 44444444444444333333333322222222222222222221111111111111111111111111+8 -1 | 99999999999999999999999999999999999888888888888888888888888888888888+173 -1 | 44444444444444444444444444444444444444444444444444444444444444444444+384 -0 | 99999999999999999999999999999999999999999999999999999999999999999999+715 -0 | 44444444444444444444444444444444444444444444444444444444444444444444+785 0 | 00000000000000000000000000000000000000000000000000000000000000000000+819 0 | 55555555555555555555555555555555555555555555555555555555555555555555+650 1 | 00000000000000000000000000000000000000000000000000000000000000000000+439 1 | 55555555555555555555555555555555555555555555555555555555555555555555+166 2 | 00000000000000000000011111111111111111222222222222222223333333333344 2 | 555555555566666666677889999 3 | 00011114 3 | 55 4 | 1 [1] "JAB: NORMAL TEST TABLE 1.0" SHAPIRO ANDERSON-DARLING CRAMER-VON MISES KS JARQUE-BERA 0.4513131 0.5120592 0.6798911 0.6469248 0.4664040
II. Model estimation and verification:
TIME_SERIES <- function(DATA,F,S){ DATOS.TS <- ts(DATA,frequency = F, start = S); return(DATOS.TS); } JAB.PREDICTION.BEST_ARIMA <- function(TIME_SERIES){ DATOS.BEST_ARIMA <- best.arima(DATOS.TS) return(DATOS.BEST_ARIMA) } SALES.TS <- TIME_SERIES(SALES,12,1968); SALES.ARIMA <- JAB.PREDICTION.BEST_ARIMA(SALES.TS); III. Forecasting:Forecasting the future is always difficult...
"Computers in the future may weigh no more than 1.5 tons."
"I think there is a world market for maybe five computers."
"I have traveled the length and breadth of this country and talked with the best people, and I can assure you that data processing is a fad that won't last out the year."
"But what ... is it good for?"
"There is no reason anyone would want a computer in their home."
"This 'telephone' has too many shortcomings to be seriously considered as a means of communication. The device is inherently of no value to us."
"The wireless music box has no imaginable commercial value. Who would pay for a message sent to nobody in particular?"
"The concept is interesting and well-formed, but in order to earn better than a 'C,' the idea must be feasible."
"Who the hell wants to hear actors talk?"
"I'm just glad it'll be Clark Gable who's falling on his face and not Gary Cooper."
"A cookie store is a bad idea. Besides, the market research reports say America likes crispy cookies, not soft and chewy cookies like you make."
"We don't like their sound, and guitar music is on the way out."
"Heavier-than-air flying machines are impossible."
"If I had thought about it, I wouldn't have done the experiment. The literature was full of examples that said you can't do this."
"So we went to Atari and said, 'Hey, we've got this amazing thing, even built with some of your parts, and what do you think about funding us? Or we' ll give it to you. We just want to do it. Pay our salary, we'll come work for you.' And they said, 'No.' So then we went to Hewlett-Packard, and they said, 'Hey, we don't need you. You haven't got through college yet.'"
"Professor Goddard does not know the relation between action and reaction and the need to have something better than a vacuum against which to react. He seems to lack the basic knowledge ladled out daily in high schools."
"You want to have consistent and uniform muscle development across all of your muscles? It can't be done. It's just a fact of life. You just have to accept inconsistent muscle development as an unalterable condition of weight training."
"Drill for oil? You mean drill into the ground to try and find oil? You're crazy."
"Stocks have reached what looks like a permanently high plateau."
"Airplanes are interesting toys but of no military value."
"Everything that can be invented has been invented."
"Louis Pasteur's theory of germs is ridiculous fiction".
"The abdomen, the chest, and the brain will forever be shut from the
intrusion of the wise and humane surgeon."
"640K ought to be enough for anybody." |