A stationary stochastic process generates data in a way that makes it possible to forecast future values. The classic example of a non-stationary stochastic process is the random walk, and by definition, if a time series follows a random walk, it is virtually impossible to forecast its future values from past values alone. A stationary process tends to revert to its mean over time, a property called mean reversion. How can one tell the difference between a stationary and a non-stationary stochastic process? Why do only stationary stochastic processes lend themselves to forecasting with any degree of accuracy? The objective of this post is to define these terms and conduct an empirical test for stationarity on a well-known macroeconomic time series. I find that the second difference of GDP is a stationary process and that this is the correct transformation of the data for forecasting.

**Necessary and Sufficient Conditions for A Stationary Process**

A stationary time series has an average value that does not change over time. The average inflation rate in the U.S. has remained fairly steady since the mid-80s, so the inflation rate might satisfy this condition. A second condition of a stationary time series is that the variance of the series is constant, or close to constant. If a series' values have become more volatile over time, it likely violates the constant-variance condition. The final condition is that the covariance between values at time t and time t-k is constant or close to constant. This means that the correlation between current and past values has remained constant over the period under investigation. If any of these conditions is violated, the time series is said to be non-stationary (a random walk is one common example). If the error term of a forecast has a mean of zero, constant variance, and zero serial correlation, then it is said to be a "white noise" process. A white noise process is a special case of a stationary time series, and if the errors of a forecast follow a white noise process, then we can say that the forecast is unbiased. In mathematical notation a white noise process is defined as:
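The formula itself appears to have been lost from the original post; reconstructing it from the three conditions stated above, a white noise process $\varepsilon_t$ satisfies:

$$
E[\varepsilon_t] = 0, \qquad \operatorname{Var}(\varepsilon_t) = \sigma^2, \qquad \operatorname{Cov}(\varepsilon_t, \varepsilon_{t-k}) = 0 \quad \text{for all } k \neq 0.
$$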

**Integrated Processes**

Several important time series are not stationary processes. Many variables exhibit trends, changing variances, and correlation between past and future values. What can be done with these variables if one needs to forecast their future values? The answer is first-differencing: subtracting each value from the one that follows it to create a new time series. If the first difference of a non-stationary time series is stationary, then the series is said to be integrated of order 1, or I(1). If a variable is stationary in levels, it is said to be integrated of order zero, or I(0). If second-differencing is needed to make a non-stationary variable stationary, then it is said to be integrated of order 2, or I(2).
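To make the mechanics concrete, here is a short Python sketch (not part of the original analysis, which used STATA and Eviews) showing that the first difference of a simulated random walk is exactly the stationary white-noise shock series that generated it:

```python
import random

def first_difference(series):
    """Return the first-differenced series: y[t] - y[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]

# Build a random walk y[t] = y[t-1] + e[t] from i.i.d. shocks.
random.seed(42)
shocks = [random.gauss(0, 1) for _ in range(200)]
walk = [0.0]
for e in shocks:
    walk.append(walk[-1] + e)

# The random walk itself is non-stationary, i.e. I(1), but its
# first difference recovers the white-noise shocks, which are I(0).
diffs = first_difference(walk)
print(all(abs(d - e) < 1e-9 for d, e in zip(diffs, shocks)))  # True
```

The same `first_difference` function applied twice gives the second difference used later in the post.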

**Properties of Integrated Processes**

**Empirical Test Whether or Not Gross Domestic Product is a Stationary Process**

Using quarterly data on U.S. Gross Domestic Product starting in 1947, we can test whether GDP is generated by a stationary process. If it is not, the next step is to difference the series until it becomes stationary.

First we input the data and run the commands that tell STATA to recognize the date variable as a time variable and the data set as a time series:

The date is stored as a string variable; the following commands convert it to a date variable and format it accordingly…

Now that STATA recognizes time2 as a date, we can run the command that declares the data set a time series…

As one can see from the graph above, GDP has a definite time trend. This suggests that GDP is non-stationary, but to check formally we can use a unit root test. The graphics and data manipulation above were done in STATA, but just to mix it up a bit, the remainder of the analysis is conducted in Eviews.

Selecting values for trend and intercept, an Augmented Dickey-Fuller test was run to test whether the GDP data are generated by a stationary process. A non-stationary process is also called a unit root process, hence the description of the test above and below.

The test shows that one cannot reject the hypothesis that GDP has a unit root. But could first-differencing the data lead to a stationary time series? A second test (not shown) rejects the hypothesis that the first difference of GDP is a unit root process. However, as shown below, the time series of the first difference does not have time-independent variance.
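The intuition behind the unit root test can be sketched in a few lines of Python. This is a deliberately simplified Dickey-Fuller statistic (intercept only, with none of the lagged-difference or trend terms the Augmented version in Eviews includes), applied to simulated data rather than GDP:

```python
import math
import random

def dickey_fuller_t(series):
    """t-statistic on rho in the regression
         dy[t] = alpha + rho * y[t-1] + e[t].
    Under the unit-root null, rho = 0; a t-statistic well below the
    ~5% critical value of -2.86 rejects non-stationarity.  This is the
    simplest Dickey-Fuller form: no lag augmentation, no trend term.
    """
    x = series[:-1]                                    # y[t-1]
    dy = [b - a for a, b in zip(series, series[1:])]   # dy[t]
    n = len(x)
    mx, my = sum(x) / n, sum(dy) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, dy))
    rho = sxy / sxx
    alpha = my - rho * mx
    resid = [yi - alpha - rho * xi for xi, yi in zip(x, dy)]
    s2 = sum(r * r for r in resid) / (n - 2)           # residual variance
    return rho / math.sqrt(s2 / sxx)

random.seed(1)
shocks = [random.gauss(0, 1) for _ in range(500)]

# Random walk: y[t] = y[t-1] + e[t]  (unit root present)
walk = [0.0]
for e in shocks:
    walk.append(walk[-1] + e)

# Mean-reverting AR(1): y[t] = 0.2 * y[t-1] + e[t]  (stationary)
ar1 = [0.0]
for e in shocks:
    ar1.append(0.2 * ar1[-1] + e)

# The stationary series produces a strongly negative t-statistic
# (rejecting the unit root); the random walk does not.
print(dickey_fuller_t(ar1), dickey_fuller_t(walk))
```

For real work one would use a full ADF implementation, as in the Eviews output above, rather than this bare-bones version.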

**Testing the Second Difference of GDP**

The second difference of GDP is a stationary process according to the Augmented Dickey-Fuller test.

The test above rejects the null hypothesis that the second difference of GDP is non-stationary; thus, if one is going to use past values of GDP for forecasting, the second difference is the appropriate way to transform the data. The final graph below shows the second difference of GDP, the series that the ADF test above showed to be stationary.
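Why might a second difference be needed at all? A toy Python example (not the GDP data) makes the point: a series growing along a quadratic path still trends after one differencing, but its second difference is constant:

```python
def difference(series, order=1):
    """Apply first-differencing `order` times."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series

# A series with a quadratic trend, y[t] = t**2:
# the level trends, the first difference still trends (2t - 1),
# but the second difference is a constant.
y = [t ** 2 for t in range(10)]
print(difference(y, 1))  # [1, 3, 5, 7, 9, 11, 13, 15, 17]
print(difference(y, 2))  # [2, 2, 2, 2, 2, 2, 2, 2]
```

GDP's growth itself has grown over the postwar period, which is why one round of differencing was not enough to stabilize the series.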

Can you please tell me why you say the forecast of a white noise process is unbiased?

If the error term in a regression is found to have a mean of zero, constant variance, and no serial correlation, it is defined as white noise. Typically, if your error term is not a white noise process, it means that on average (in the mean) your errors are positive/negative. Having consistently positive/negative error terms means that you are consistently underestimating/overestimating the value you are trying to forecast. By the same token, if the variance of your forecast errors is higher at certain time intervals (say, during peak sales season), then you do not have a white noise process, and thus you cannot be confident about the precision of your forecast. There are several ways to work toward a white noise error term, including looking at percentage changes instead of levels and/or modelling the variance of the error term. GARCH and ARCH models are one way of modelling the variance of the error term, and taking the first difference of the data is one way of helping ensure the mean of the errors is zero.
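A rough diagnostic for two of the white-noise conditions can be sketched in Python (this is an illustrative check on simulated errors, not a formal test; constant variance would need something like an ARCH-LM test):

```python
import random
import statistics

def looks_like_white_noise(errors, tol=0.1):
    """Flag whether forecast errors have a mean near zero and
    lag-1 autocorrelation near zero (two of the three white-noise
    conditions; the constant-variance condition is not checked here).
    """
    mean = statistics.fmean(errors)
    var = statistics.pvariance(errors)
    lag1 = [(a - mean) * (b - mean) for a, b in zip(errors, errors[1:])]
    autocorr = (sum(lag1) / len(lag1)) / var
    return abs(mean) < tol and abs(autocorr) < tol

random.seed(7)
iid_errors = [random.gauss(0, 1) for _ in range(2000)]
biased_errors = [e + 0.5 for e in iid_errors]  # mean 0.5: a biased forecast

print(looks_like_white_noise(iid_errors))     # passes both checks
print(looks_like_white_noise(biased_errors))  # fails: nonzero mean
```

Errors that fail the mean check correspond to the consistent under- or over-estimation described above.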