Wang Haihua
The manager notes from previous surveys that many American customers who previously stayed at their hotel traveled to Portugal via London with British Airways.
You are tasked with building a Prophet model that can forecast passenger numbers traveling from San Francisco to London with British Airways as accurately as possible.
Specifically, your managers want to maximize the accuracy of the Prophet model by defining "changepoints", points that mark a significant change in the trend of passenger numbers.
Additionally, they would also like the Prophet model to account for uncertainty in the trend and seasonality components.
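To make the idea of a changepoint concrete, here is a minimal NumPy sketch of a piecewise-linear trend in the form described in the Prophet paper: a base slope k, an offset m, and a slope adjustment delta_j that switches on at each changepoint s_j, with an offset correction of -s_j * delta_j so the line stays continuous. The function name `piecewise_linear_trend` and the toy numbers are illustrative assumptions; this is not Prophet's internal code.

```python
import numpy as np

def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Evaluate a continuous piecewise-linear trend at times t.

    k, m         : base slope and offset
    changepoints : times s_j at which the slope changes
    deltas       : slope adjustments delta_j applied from s_j onward
    """
    t = np.asarray(t, dtype=float)
    s = np.asarray(changepoints, dtype=float)
    d = np.asarray(deltas, dtype=float)
    # A[i, j] = 1 if changepoint j is active at time t[i], else 0
    A = (t[:, None] >= s[None, :]).astype(float)
    # Offset corrections keep the trend continuous at each changepoint
    gamma = -s * d
    return (k + A @ d) * t + (m + A @ gamma)

# Toy example: slope 1 until t = 0.5, then flat (slope 1 + (-1) = 0)
t = np.linspace(0.0, 1.0, 5)
trend = piecewise_linear_trend(t, k=1.0, m=0.0, changepoints=[0.5], deltas=[-1.0])
print(trend)  # [0.   0.25 0.5  0.5  0.5 ]
```

The larger the allowed magnitude of each delta_j, the more freely the trend can bend at the candidate changepoints.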
import pandas as pd
import numpy as np
from fbprophet import Prophet
import statsmodels.api as sm
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from sklearn.metrics import mean_squared_error
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
np.random.seed(42)
idx = pd.IndexSlice
import fbprophet
fbprophet.__version__
'0.7.1'
df = (pd.read_csv('data/british_airways.csv', parse_dates=['Date'])
      .rename(columns={'Date': 'ds', 'Adjusted Passenger Count': 'y'}))
df.info()
df.head(2)
df.tail(2)
_ = df.plot(x='ds', y='y', figsize=(12, 4), title='Adjusted Passenger Count')
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 129 entries, 0 to 128
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   ds      129 non-null    datetime64[ns]
 1   y       129 non-null    int64
dtypes: datetime64[ns](1), int64(1)
memory usage: 2.1 KB
| | ds | y |
|---|---|---|
| 0 | 2005-07-01 | 21686 |
| 1 | 2005-08-01 | 20084 |

| | ds | y |
|---|---|---|
| 127 | 2016-02-01 | 16230 |
| 128 | 2016-03-01 | 18392 |
train, test = df.iloc[:115], df.iloc[115:]
len(train), len(test)
(115, 14)
m = Prophet()
m.fit(train)
INFO:fbprophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:fbprophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
<fbprophet.forecaster.Prophet at 0x7fe132cc6ca0>
Predictions are then made on a dataframe with a column ds containing the dates for which a prediction is to be made.
You can get a suitable dataframe that extends into the future by a specified number of periods (days by default, months here via the freq argument) using the helper method Prophet.make_future_dataframe.
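# Note: freq='M' stamps the future rows at month end (e.g. 2016-02-29), while the data
# are stamped at month start (e.g. 2016-03-01); freq='MS' would match the dates in `test`.
future = m.make_future_dataframe(periods=len(test), freq='M')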
future.tail(2)
| | ds |
|---|---|
| 127 | 2016-01-31 |
| 128 | 2016-02-29 |
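The future frame above ends on 2016-02-29, whereas the data end on 2016-03-01, so the forecast dates sit one day before the observations they are later compared with. The pandas-only sketch below shows where the difference comes from. `future_dates` is a hypothetical helper that approximates how a future date range can be built from the last training date; it is not Prophet's own code. The last training date of 2015-01-01 is inferred from the monthly index (row 115 of the data is 2015-02-01).

```python
import pandas as pd

def future_dates(last_date, periods, offset):
    """Hypothetical helper: `periods` dates strictly after `last_date`, stepping by `offset`."""
    dates = pd.date_range(start=last_date, periods=periods + 1, freq=offset)
    return dates[dates > last_date][:periods]

# Assumption: last `ds` in the training split, inferred from the monthly index
last_train_date = pd.Timestamp("2015-01-01")

# Month-end steps (the behaviour of freq='M') versus month-start steps (freq='MS')
month_end = future_dates(last_train_date, 14, pd.offsets.MonthEnd())
month_start = future_dates(last_train_date, 14, pd.offsets.MonthBegin())

print(month_end[-1].date(), month_start[-1].date())  # 2016-02-29 2016-03-01
```

Offset objects are used here instead of the string aliases because the month-end alias has changed across pandas versions.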
forecast = m.predict(future)
forecast.tail(2)
| | ds | trend | yhat_lower | yhat_upper | trend_lower | trend_upper | additive_terms | additive_terms_lower | additive_terms_upper | yearly | yearly_lower | yearly_upper | multiplicative_terms | multiplicative_terms_lower | multiplicative_terms_upper | yhat |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 127 | 2016-01-31 | 18089.307220 | 12843.423767 | 15198.180296 | 18050.655292 | 18121.616087 | -4111.614690 | -4111.614690 | -4111.614690 | -4111.614690 | -4111.614690 | -4111.614690 | 0.0 | 0.0 | 0.0 | 13977.69253 |
| 128 | 2016-02-29 | 18105.011939 | 14568.818318 | 17044.917821 | 18062.220859 | 18141.574588 | -2325.227709 | -2325.227709 | -2325.227709 | -2325.227709 | -2325.227709 | -2325.227709 | 0.0 | 0.0 | 0.0 | 15779.78423 |
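In the forecast above, yearly_lower and yearly_upper are identical to yearly, so this default fit reports no uncertainty for the seasonal component. According to the Prophet documentation, seasonal uncertainty requires full Bayesian sampling, which is switched on with mcmc_samples. The sketch below collects constructor arguments that bear on the managers' two requests. The argument names are real Prophet parameters; the values are illustrative assumptions, not tuned results.

```python
# Illustrative settings only; the values are assumptions, not tuned results.
prophet_kwargs = {
    # Trend flexibility at changepoints (default 0.05); larger lets the trend bend more
    "changepoint_prior_scale": 0.5,
    # Number of candidate changepoints (default 25)
    "n_changepoints": 25,
    # Share of the history in which changepoints may be placed (default 0.8)
    "changepoint_range": 0.9,
    # Default 0 gives a MAP fit; a positive value enables MCMC sampling,
    # which is what produces uncertainty intervals for seasonality
    "mcmc_samples": 300,
    # Width of the reported uncertainty intervals (default 0.8)
    "interval_width": 0.8,
}

# With fbprophet installed and `train` defined as in this notebook:
# m_tuned = Prophet(**prophet_kwargs).fit(train)
```

Known dates can also be supplied directly through the `changepoints` argument instead of letting Prophet place candidates automatically. MCMC sampling is considerably slower than the default fit.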
fig1 = m.plot(forecast)
ax = fig1.gca()
ax = test.plot(x='ds', y='y', ls='--', color='C3', ax=ax, label='True TS')
fig2 = m.plot_components(forecast)
test.head(2)
| | ds | y |
|---|---|---|
| 115 | 2015-02-01 | 14372 |
| 116 | 2015-03-01 | 17304 |
round(np.sqrt(mean_squared_error(test['y'], forecast['yhat'].iloc[-len(test):])), 4)
2346.5605
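The RMSE above pairs actual and predicted values by position, taking the last len(test) forecast rows. A more defensive variant, sketched below, joins the two frames on ds first so that only matching dates are scored. The helper name `rmse_on_matching_dates` and the toy numbers are invented for illustration and are not taken from the airline data.

```python
import numpy as np
import pandas as pd

def rmse_on_matching_dates(actual, predicted):
    """RMSE over the rows whose `ds` appears in both frames."""
    merged = actual.merge(predicted[["ds", "yhat"]], on="ds", how="inner")
    if merged.empty:
        raise ValueError("no overlapping dates between actual and predicted")
    return float(np.sqrt(np.mean((merged["y"] - merged["yhat"]) ** 2)))

# Toy values, invented for illustration only
dates = pd.to_datetime(["2015-02-01", "2015-03-01", "2015-04-01"])
actual = pd.DataFrame({"ds": dates, "y": [10.0, 20.0, 30.0]})
predicted = pd.DataFrame({"ds": dates, "yhat": [12.0, 18.0, 33.0]})

toy_rmse = rmse_on_matching_dates(actual, predicted)
print(round(toy_rmse, 4))  # 2.3805
```

With month-end forecast dates and month-start observations, as in the tables above, the join would find no common dates and the helper would raise, which makes the mismatch visible instead of scoring neighbouring dates against each other.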