# Mann Kendall Test in Python for Trend Detection

The **Mann-Kendall Trend Test** is used to determine whether a monotonic trend exists in time series data. It is a non-parametric test, meaning it makes no assumption about the normality of the data.

The hypotheses for the test are as follows:

**H0 (null hypothesis):** There is no trend present in the data.

**HA (alternative hypothesis):** A trend is present in the data. (This could be a positive or negative trend.)

If the p-value of the test is lower than some significance level (common choices are 0.10, 0.05, and 0.01), then there is statistically significant evidence that a trend is present in the time series data.
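To make the decision rule concrete, here is a minimal sketch of how the test statistic and p-value are usually computed. The function name `mann_kendall` is hypothetical and the code is only an illustration under the standard textbook formulas (tie-corrected variance, continuity-corrected z score); it is not the implementation used by any particular library.

```python
import math
from collections import Counter


def mann_kendall(values, alpha=0.05):
    """Return (s, var_s, z, p, reject_h0) for a sequence of numbers."""
    n = len(values)

    # S sums the sign of every later-minus-earlier difference.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S under H0, with the usual correction for tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18

    # Continuity-corrected z score; z is 0 when S is 0.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    # Two-sided p-value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return s, var_s, z, p, p < alpha


s, var_s, z, p, reject_h0 = mann_kendall([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
print(s, var_s, reject_h0)
```

A positive S with a small p-value points to an increasing trend, a negative S with a small p-value to a decreasing one.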

The test has a seasonal variant that can handle seasonal patterns within the data. The slope of the trend is often estimated with Sen’s slope, which can likewise be computed in a seasonal form.
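As a rough illustration of Sen’s slope, the sketch below takes the median of the slopes between all pairs of observations. The function name `sens_slope` is hypothetical, and the code assumes equally spaced observations with no missing values.

```python
from statistics import median


def sens_slope(values):
    """Median of (x_j - x_i) / (j - i) over all pairs i < j."""
    n = len(values)
    slopes = [
        (values[j] - values[i]) / (j - i)
        for i in range(n - 1)
        for j in range(i + 1, n)
    ]
    return median(slopes)


# A single outlier at the end barely affects the estimate.
print(sens_slope([0, 1, 2, 3, 4, 5, 60]))
```

Because it is a median, the estimate is far less sensitive to outliers than an ordinary least-squares slope.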

It is my understanding that the tests do not take into consideration any autocorrelation in the data. Autocorrelation would be, for example, if an observation with a high value tended to be followed by another observation with a high value. There are two conditions under which the tests are usually used.

1) When the observations are far enough apart in time that autocorrelation is unlikely to be important. For example, in the stage data in this chapter, monthly averages are used.

2) When the values are likely to be influenced mostly by non-autocorrelated factors.

For data where autocorrelation is likely to be important, other models, such as autoregressive integrated moving average (ARIMA), could be used.
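One quick way to judge whether autocorrelation matters is to compute the lag-1 autocorrelation coefficient and compare it with the approximate 95% band for white noise, ±1.96/√n. The helper names below are hypothetical, and this is only a rough screening sketch, not a formal test. Note that a strong trend by itself also inflates this coefficient.

```python
import math


def lag1_autocorrelation(values):
    """Sample autocorrelation between each observation and the next one."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    numerator = sum(dev[t] * dev[t + 1] for t in range(n - 1))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


def looks_autocorrelated(values):
    """Rough check: is |r1| outside the approximate 95% white-noise band?"""
    return abs(lag1_autocorrelation(values)) > 1.96 / math.sqrt(len(values))


# A strictly alternating series has strong negative lag-1 autocorrelation.
print(looks_autocorrelated([1, -1] * 10))
```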

Three datasets are used here to demonstrate the use of the pymannkendall package. The datasets are:

1) **Daily Female Births Dataset:** This dataset describes the number of daily female births in California in 1959. This dataset is available here.

2) **Shampoo Sales Dataset:** This dataset describes the monthly number of sales of shampoo over 3 years. This dataset is available here.

3) **Air Passengers Dataset:** This famous dataset describes monthly international airline passengers (in thousands) from January 1949 to December 1960. It is widely used as a nonstationary seasonal time series. This dataset is available here.

Let’s get started. I will begin by loading the dataset.

**Mann Kendall Original Test**

    import numpy as np
    import pandas as pd
    import pymannkendall as mk
    import matplotlib.pyplot as plt
    import statsmodels.api as sm

    %matplotlib inline

    # Read the dataset
    df = pd.read_csv("daily-total-female-births-CA.csv")
    print(df.head())

    df.plot(figsize=(12, 8))
    plt.show()

    from datetime import datetime

    # Parse the date column and use it as the index
    df['dated'] = pd.to_datetime(df['date'])
    df = df.set_index('dated')
    df.drop(['date'], axis=1, inplace=True)

    # Autocorrelation plot for the first 20 lags
    fig, ax = plt.subplots(figsize=(12, 8))
    sm.graphics.tsa.plot_acf(df, lags=20, ax=ax)
    plt.show()

This ACF plot shows slight autocorrelation at the first lag, which we can ignore. So, in our demonstration, we *assume* that there is no autocorrelation in the **Daily Female Births Dataset**, and to check the trend in this dataset we can use the *Original Mann Kendall test*.

    import pymannkendall as mk
    import matplotlib.pyplot as plt
    import statsmodels.api as sm

    print(mk.original_test(df, alpha=0.05))


**From this result, we can say that there is a significant trend in this dataset: the p-value is smaller than alpha=0.05, so h=True. The trend is "increasing" and the value of the trend/slope is 0.019230769230769232.**

Moving on to the next dataset, **the Shampoo dataset.**

**Hamed and Rao Modified MK Test**

    shampoo_df = pd.read_csv('sales-of-shampoo-over-a-three-ye.csv', header=0)
    shampoo_df.dropna(inplace=True)  # Dropping the last row, which contains NaN

    def preprocess_data(df):
        # Give the sales column a shorter name
        processed_df = df.rename(columns={
            'Sales of shampoo over a three year period': 'shampoo_sales'
        }).copy()
        # Prefix '190' to the Month values so they parse as dates, then index by month
        processed_df['Month'] = pd.to_datetime(processed_df.Month.apply(lambda val: '190' + val))
        processed_df = processed_df.set_index('Month')
        return processed_df

    processed_df = preprocess_data(shampoo_df)

    # Line plot
    processed_df.plot(figsize=(16, 7))
    plt.show()

    # Autocorrelation plot for the first 20 lags
    fig, ax = plt.subplots(figsize=(12, 8))
    sm.graphics.tsa.plot_acf(processed_df, lags=20, ax=ax)
    plt.show()

From this ACF plot, we see autocorrelation at the first lag, so a modified Mann Kendall test should be applied here.

We can use the Hamed and Rao Modified MK Test, the Yue and Wang Modified MK Test, the Modified MK test using the Pre-Whitening method, or the Modified MK test using the Trend-Free Pre-Whitening method for this Shampoo dataset.

    print(mk.hamed_rao_modification_test(processed_df))

**Output for Mann Kendall Hamed Rao Test**

    Modified_Mann_Kendall_Test_Hamed_Rao_Approach(trend='increasing', h=True, p=2.8916160532688195e-07, z=5.130377554630905, Tau=0.6984126984126984, s=440.0, var_s=7322.011164105283, slope=11.509375, intercept=78.73593749999998)

Hamed and Rao Modified MK Test shows that there is a significant trend in this dataset. We can check this using other modified tests.

    print(mk.yue_wang_modification_test(processed_df))
    print(mk.trend_free_pre_whitening_modification_test(processed_df))
    print(mk.pre_whitening_modification_test(processed_df))

**Output for Yue Wang, Trend-Free Pre-Whitening and Pre-Whitening Tests**

    Modified_Mann_Kendall_Test_Yue_Wang_Approach(trend='increasing', h=True, p=4.89164264649844e-13, z=7.228299591844885, Tau=0.6984126984126984, s=440.0, var_s=3688.559143751511, slope=11.509375, intercept=78.73593749999998)

    Modified_Mann_Kendall_Test_Trend_Free_PreWhitening_Approach(trend='increasing', h=True, p=1.2152931994080518e-09, z=6.078212917371978, Tau=0.7210084033613445, s=429.0, var_s=4958.333333333333, slope=11.509375, intercept=78.73593749999998)

    Modified_Mann_Kendall_Test_PreWhitening_Approach(trend='increasing', h=True, p=0.002860908222931835, z=2.982300730486251, Tau=0.3546218487394958, s=211.0, var_s=4958.333333333333, slope=11.509375, intercept=78.73593749999998)

All these modified tests indicate that there is a significant increasing trend.
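To give a feel for what pre-whitening does, here is a minimal sketch of its common lag-1 form: estimate the lag-1 autocorrelation r1 and replace the series with x[t+1] - r1 * x[t] before running the Mann-Kendall test on the result. The function names are hypothetical, and the library's own implementation may differ in its details.

```python
def lag1_autocorrelation(values):
    """Sample autocorrelation between each observation and the next one."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    numerator = sum(dev[t] * dev[t + 1] for t in range(n - 1))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


def pre_whiten(values):
    """Remove the lag-1 component; the result has one observation fewer."""
    r1 = lag1_autocorrelation(values)
    return [values[t + 1] - r1 * values[t] for t in range(len(values) - 1)]


whitened = pre_whiten([1, 2, 3, 4, 5])
print(len(whitened))
```

The pre-whitened series is one observation shorter than the original, which fits the outputs above: the two pre-whitening tests report the same var_s as each other, and it differs from the values reported by the other two tests.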

**Seasonal MK Test**

On to the last dataset which is the **Air Passengers Dataset**.

    Passenger_data = pd.read_csv("AirPassengers.csv", parse_dates=['Month'], index_col='Month')
    print(Passenger_data.head())

    Passenger_data.plot(figsize=(12, 8))
    plt.show()

From this graph, we can easily see that there is seasonality, so we are going to use the **Seasonal MK Test** on this dataset. Since we have monthly data, we will use **period=12**.

    print(mk.seasonal_test(Passenger_data, period=12))

**Output**

    Seasonal_Mann_Kendall_Test(trend='increasing', h=True, p=0.0, z=15.50571052301472, Tau=0.98989898989899, s=784.0, var_s=2550.0, slope=30.23611111111111, intercept=85.3431712962963)

According to the result, this dataset also has a **significant increasing trend**.
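To see what the period argument does, the sketch below shows the usual idea behind the seasonal test: the S statistic and its variance are computed separately for each season (for example, all Januaries, then all Februaries), and the per-season values are summed before the z score is formed. The function names are hypothetical, and this is an illustration under the standard formulas rather than the library's code.

```python
import math
from collections import Counter


def s_and_variance(values):
    """Mann-Kendall S and its tie-corrected variance for one season."""
    n = len(values)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(values).values())
    return s, (n * (n - 1) * (2 * n + 5) - tie_term) / 18


def seasonal_mann_kendall(values, period=12):
    """Return (s, var_s, z, p) with S and its variance summed across seasons."""
    total_s, total_var = 0, 0.0
    for season in range(period):
        # Every period-th observation belongs to the same season.
        s, var_s = s_and_variance(values[season::period])
        total_s += s
        total_var += var_s

    # Continuity-corrected z score; z is 0 when the total S is 0.
    if total_s > 0:
        z = (total_s - 1) / math.sqrt(total_var)
    elif total_s < 0:
        z = (total_s + 1) / math.sqrt(total_var)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return total_s, total_var, z, p


# Toy series: a rising trend plus a repeating pattern of length 4.
series = [i + 10 * (i % 4) for i in range(16)]
print(seasonal_mann_kendall(series, period=4))
```

Because values are only compared within the same season, the regular seasonal swings do not mask or mimic a trend.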