Data Preparation - Stationarity
In this issue, the second tutorial in our data preparation series, we touch on the second most important assumption in time series analysis: stationarity, or the assumption that a time series sample is drawn from a stationary process.
We’ll start by defining the stationary process and stating the minimum stationarity requirements for our time series analysis. Then we demonstrate how to examine sample data, draw a few observations, and highlight the intuitions behind them.
In a mathematical sense, a stationary process is a stochastic process whose joint probability distribution does not change when shifted in time or space. Consequently, parameters such as the mean and variance, if they exist, also do not change as a result of a shift in time or position. This is often referred to as the strict form of stationarity.
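Let $\{X_t\}$ be a stochastic process, where $F_X(x_{t_1+\tau}, \ldots, x_{t_k+\tau})$ is the density (mass) distribution function of the joint distribution of $\{X_t\}$ at times $t_1+\tau, \ldots, t_k+\tau$. Then $\{X_t\}$ is said to be stationary if, for all values of shift ($\tau$) and all values of $(t_1, \ldots, t_k)$,
$$F_X(x_{t_1+\tau}, \ldots, x_{t_k+\tau}) = F_X(x_{t_1}, \ldots, x_{t_k})$$
The function $F_X$ is not affected by a shift across time.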
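A simplified example would be a Gaussian white-noise process, where each observation is identically distributed and independent of all other observations in a given sample. Consequently, the joint probability distribution of the sample data $(x_1, \ldots, x_T)$, with zero mean and variance $\sigma^2$, is expressed as follows:
$$F_X(x_1, \ldots, x_T) = \prod_{t=1}^{T} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{x_t^2}{2\sigma^2}\right)$$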
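As a rough illustration of the white-noise example, the sketch below draws a Gaussian white-noise sample using only Python's standard library and compares the mean and standard deviation of the two halves of the sample. The seed, the sample size, and the helper names `gaussian_white_noise` and `half_sample_moments` are assumptions made for this illustration.

```python
import random
import statistics


def gaussian_white_noise(n, sigma=1.0, seed=0):
    """Draw n independent N(0, sigma^2) observations (illustrative helper)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def half_sample_moments(x):
    """Return (mean, stdev) of the first half and of the second half of x."""
    mid = len(x) // 2
    first, second = x[:mid], x[mid:]
    return (
        (statistics.fmean(first), statistics.stdev(first)),
        (statistics.fmean(second), statistics.stdev(second)),
    )


# Assumed settings: 2000 observations, unit variance, fixed seed for repeatability.
sample = gaussian_white_noise(2000, sigma=1.0, seed=42)
(m1, s1), (m2, s2) = half_sample_moments(sample)
print(f"first half : mean={m1:+.3f}, stdev={s1:.3f}")
print(f"second half: mean={m2:+.3f}, stdev={s2:.3f}")
```

For a stationary process such as white noise, the two halves should report similar moments, up to sampling error.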
The stationary process assumption is very strict, and is very difficult to check for outside of a few trivial cases (e.g. white noise). For practical time series analysis, a “weak-sense” stationary process (WSS) is adequate.
Weak-sense stationary (WSS)
A weaker form of stationary process is called weak-sense stationary (WSS) or covariance stationary. The WSS requires only that the first (mean) and second (covariance) moments do not vary with respect to time.
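Written out, and using notation that is an assumption of this sketch ($\mu$ for the mean and $\gamma$ for the auto-covariance function), the WSS conditions for a process $\{X_t\}$ can be stated as:

$$
E[X_t] = \mu, \qquad \mathrm{Var}(X_t) = \gamma(0) < \infty, \qquad \mathrm{Cov}(X_t, X_{t+\tau}) = \gamma(\tau) \quad \text{for all } t \text{ and } \tau
$$

The mean and variance are constants, and the covariance between two observations depends only on how far apart they are in time.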
The WSS is also referred to as a second-order stationary process. Furthermore, the WSS definition leads to the following conclusions:
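- The auto-covariance function $\gamma$ and the auto-correlation function $\rho$ depend only on the shift over time $\tau$, not on the time $t$ itself.
- The auto-covariance and auto-correlation functions depend only on the absolute value of the shift $\tau$; that is, they are symmetric: $\gamma(\tau) = \gamma(-\tau)$ and $\rho(\tau) = \rho(-\tau)$.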
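The two properties above can be checked numerically. The sketch below is a minimal implementation of the usual (biased) sample auto-covariance estimator, written to accept negative shifts so the symmetry is visible. The function names and the simulated AR(1) test series are assumptions made for illustration only.

```python
import random


def sample_autocovariance(x, lag):
    """Biased sample auto-covariance at an integer lag (negative lags allowed)."""
    n = len(x)
    mean = sum(x) / n
    # Keep only the indices t for which both t and t + lag fall inside the sample.
    start, stop = max(0, -lag), min(n, n - lag)
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(start, stop)) / n


def sample_autocorrelation(x, lag):
    """Sample auto-correlation: auto-covariance scaled by its lag-0 value."""
    return sample_autocovariance(x, lag) / sample_autocovariance(x, 0)


# Illustrative data (an assumption): a stationary AR(1) series x_t = 0.6 * x_{t-1} + e_t.
rng = random.Random(7)
series = [0.0]
for _ in range(4999):
    series.append(0.6 * series[-1] + rng.gauss(0.0, 1.0))

for lag in (-2, -1, 0, 1, 2):
    print(f"lag {lag:+d}: autocorrelation = {sample_autocorrelation(series, lag):.3f}")
```

The values printed for a shift and for its negative coincide, which is the symmetry stated in the second bullet.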
Note: For time series analysis, we shall only concern ourselves with the WSS form of stationary process.
Checking the stationarity assumption
Let’s assume we have a time series data sample; how do we examine it for stationarity?
1. Visual Method
Before we delve into statistical tests for stationarity, let’s demonstrate in plain words how to examine for stationarity using a time series plot. Keep in mind that we are looking for a relatively stable mean and variance over time. My preferred method is to plot the sample data, moving average, and exponentially weighted volatility on the same graph.
- The (weighted) moving average (WMA) is a proxy for the process’s marginal mean.
- The exponentially weighted volatility (EWMA) is a proxy for the process’s marginal standard deviation.
Examine the stability of the mean and variance over time.
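For readers working outside a spreadsheet, the two proxies can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the WMA or EWMA functions named above: it uses an equal-weight trailing moving average, a smoothing factor of 0.94, the recursion variance = lam * variance + (1 - lam) * x^2, and synthetic returns rather than real data.

```python
import math
import random


def moving_average(x, window):
    """Trailing equal-weight moving average; None until a full window is available."""
    out = []
    for t in range(len(x)):
        if t + 1 < window:
            out.append(None)
        else:
            out.append(sum(x[t + 1 - window:t + 1]) / window)
    return out


def ew_volatility(x, lam=0.94):
    """Exponentially weighted volatility of a series assumed to have zero mean."""
    var = x[0] ** 2  # crude starting value; the first few outputs are unreliable
    out = []
    for value in x:
        var = lam * var + (1.0 - lam) * value ** 2
        out.append(math.sqrt(var))
    return out


# Synthetic daily returns (an assumption, not market data).
rng = random.Random(3)
returns = [rng.gauss(0.0005, 0.008) for _ in range(250)]

# Remove the sample mean first, because ew_volatility assumes a zero-mean input.
mean = sum(returns) / len(returns)
demeaned = [r - mean for r in returns]

ma = moving_average(returns, window=20)
vol = ew_volatility(demeaned, lam=0.94)
print(f"last 20-day moving average: {ma[-1]:+.5f}")
print(f"last EW volatility:         {vol[-1]:.5f}")
```

Plotting `returns`, `ma`, and `vol` together gives the kind of chart described above; a drifting `ma` or a widening `vol` would point away from stationarity.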
Let’s look at the IBM stock daily closing prices process between January 2, 2012 and today (April 3rd, 2012):
The graph above shows a trending sample mean but rather stable volatility. As a result, the stationarity assumption does not hold for the closing prices process.
Note: The EWMA function assumes that the process mean is zero (0); however, this is not the case for the closing prices process, so we need to de-mean the series with TSSUB before passing it to EWMA.
Let’s look at the daily log returns of IBM stock:
The daily log returns exhibit a stable mean over time, and the volatility is roughly bounded between 0.6% and 1.2% per trading day.
Note: We typically ignore the first few EWMA values because the number of observations used to calculate those values is very limited, leading to inaccurate results.
The sample mean is not significantly different from zero, and the volatility (standard deviation) is around 0.8%, which is the center line for EWMA in our sample (excluding values at the beginning of the sample).
In sum, the IBM stock daily log returns data sample looks stationary.
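The log-return calculation and the "not significantly different from zero" check can be sketched as follows. The prices here are synthetic (a simulated geometric random walk, not IBM data), and the helper names are assumptions; the check itself is an ordinary one-sample t-statistic on the mean.

```python
import math
import random
import statistics


def log_returns(prices):
    """Daily log returns: r_t = ln(P_t / P_{t-1})."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def mean_t_statistic(x):
    """t-statistic for the null hypothesis that the mean of x is zero."""
    n = len(x)
    return statistics.fmean(x) / (statistics.stdev(x) / math.sqrt(n))


# Synthetic prices (NOT IBM data): 64 closes with roughly 0.8% daily volatility.
rng = random.Random(2012)
prices = [100.0]
for _ in range(63):
    prices.append(prices[-1] * math.exp(rng.gauss(0.0, 0.008)))

returns = log_returns(prices)
t_stat = mean_t_statistic(returns)
print(f"mean={statistics.fmean(returns):+.5f}  "
      f"stdev={statistics.stdev(returns):.5f}  t={t_stat:+.2f}")
# As a rule of thumb, |t| below about 2 means the mean is not
# significantly different from zero at the 5% level.
```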
2. Statistical Test
In practice, the most common causes of non-stationarity in sample data are the presence of a trend and of integration (i.e. a unit root) in the series.
A number of statistical tests can be utilized to examine the stationarity assumption by decomposing the process into three elements: a deterministic trend, a random walk (unit root), and a stationary error. The following tests are commonly used:
- Trend stationarity: Kwiatkowski–Phillips–Schmidt–Shin (KPSS) test, whose null hypothesis is that the series is stationary around a deterministic trend
- Unit root (random walk): Augmented Dickey-Fuller (ADF) test, whose null hypothesis is that the series has a unit root
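To give a feel for what a unit-root test computes, the sketch below runs a simplified Dickey-Fuller regression by hand. It is deliberately not the full ADF test (no lagged difference terms, no trend term) and not a KPSS test; in practice a library implementation would normally be used. The function name, the simulated series, and the use of the asymptotic 5% critical value of -2.86 for the constant-only case are the assumptions here.

```python
import math
import random


def dickey_fuller_t(y):
    """t-statistic on b in the regression  dy_t = a + b * y_{t-1} + e_t."""
    lagged = y[:-1]
    diffs = [curr - prev for prev, curr in zip(y, y[1:])]
    n = len(diffs)
    mean_x = sum(lagged) / n
    mean_d = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in zip(lagged, diffs))
    b = sxy / sxx                    # OLS slope
    a = mean_d - b * mean_x          # OLS intercept
    ssr = sum((d - a - b * x) ** 2 for x, d in zip(lagged, diffs))
    se_b = math.sqrt(ssr / (n - 2) / sxx)
    return b / se_b


CRITICAL_5PCT = -2.86  # asymptotic 5% Dickey-Fuller value, constant-only case

# Simulated data (an assumption): white noise and the random walk built from it.
rng = random.Random(1)
shocks = [rng.gauss(0.0, 1.0) for _ in range(500)]
random_walk = [0.0]
for e in shocks:
    random_walk.append(random_walk[-1] + e)

for name, series in (("white noise", shocks), ("random walk", random_walk)):
    t = dickey_fuller_t(series)
    verdict = "reject unit root" if t < CRITICAL_5PCT else "cannot reject unit root"
    print(f"{name}: t = {t:.2f} -> {verdict}")
```

A strongly negative statistic is evidence against a unit root. The statistic is compared with Dickey-Fuller critical values rather than the usual Student-t table, because its distribution under the null is non-standard.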
The stationarity assumption does not hold; what can I do?
If the stationarity assumption fails to hold, the solution is quite simple: transform the data into a stationary process.
How exactly do we go about making that kind of transformation? Earlier, we mentioned that the presence of a trend and/or a unit root (integration) in the time series commonly leads to non-stationarity. Using the statistical tests, we can check for the presence of a trend and/or a unit root. Next, we apply techniques such as de-trending, seasonal adjustment, and differencing in order to yield a stationary process.
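The transformations named above can be sketched in plain Python. These are minimal versions under stated assumptions: de-trending removes a least-squares straight line, seasonal adjustment is approximated by seasonal differencing with a known period, and the function names and the synthetic trending series are illustrative only.

```python
import random


def first_difference(x):
    """x_t - x_{t-1}: removes a unit root (random walk component)."""
    return [curr - prev for prev, curr in zip(x, x[1:])]


def seasonal_difference(x, period):
    """x_t - x_{t-period}: removes a stable seasonal pattern of known period."""
    return [x[t] - x[t - period] for t in range(period, len(x))]


def linear_detrend(x):
    """Subtract a least-squares straight line fitted against the time index."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    slope = (sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
             / sum((t - t_mean) ** 2 for t in range(n)))
    intercept = x_mean - slope * t_mean
    return [v - (intercept + slope * t) for t, v in enumerate(x)]


# Synthetic trending series (an assumption): a line plus Gaussian noise.
rng = random.Random(5)
trending = [10.0 + 0.3 * t + rng.gauss(0.0, 1.0) for t in range(200)]
detrended = linear_detrend(trending)
differenced = first_difference(trending)

half = len(trending) // 2
print(f"raw mean, 1st vs 2nd half:       "
      f"{sum(trending[:half]) / half:.2f} vs {sum(trending[half:]) / half:.2f}")
print(f"detrended mean, 1st vs 2nd half: "
      f"{sum(detrended[:half]) / half:.2f} vs {sum(detrended[half:]) / half:.2f}")
```

Which transformation is appropriate depends on what the tests indicate: de-trending for a deterministic trend, differencing for a unit root.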
In financial time series, a unit root (random walk) is often found in the raw time series, while a trend may be found in macroeconomic data. An analyst’s experience and familiarity with the type of time series is critical in selecting and applying the appropriate transformation techniques.
In the IBM stock closing prices time series, the data showed random-walk behavior. This is also visible in the auto-correlation function (ACF), whose value at lag one is close to 100%. To remove the random walk, we took the first difference (of the log prices, which gives the daily log returns) and ended up with a stationary process.
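The effect described here can be reproduced on simulated data. The sketch below builds a synthetic random walk in log prices (not the IBM series), then compares the lag-one sample auto-correlation before and after taking the first difference. The function name, seed, and volatility are assumptions.

```python
import random


def lag_one_autocorrelation(x):
    """Sample auto-correlation at lag one."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + 1] - mean) for t in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


# Synthetic log prices (an assumption): a pure random walk with 1% daily steps.
rng = random.Random(11)
log_prices = [0.0]
for _ in range(2000):
    log_prices.append(log_prices[-1] + rng.gauss(0.0, 0.01))

# The first difference of log prices is the series of log returns.
first_diff = [curr - prev for prev, curr in zip(log_prices, log_prices[1:])]

print(f"lag-1 ACF, levels:           {lag_one_autocorrelation(log_prices):.3f}")
print(f"lag-1 ACF, first difference: {lag_one_autocorrelation(first_diff):.3f}")
```

The levels show a lag-one auto-correlation near one, while the differenced series shows a value near zero, in line with the behavior described for the closing prices.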
IMPORTANT: We assume that the underlying process has not undergone any structural changes (e.g. due to exogenous events) within our sample data.