An RNN is a special kind of neural network architecture: the network itself contains loops, allowing information to be passed between neurons, as shown in the figure below:
The figure is a schematic of an RNN. Here $A$ denotes the neural network model, $X_t$ is the model's input signal, and $h_t$ is its output signal. Without the arrow that feeds $A$'s output back into $A$, this model would be no different from an ordinary neural network. So what does that arrow do? It allows $A$ to pass information to itself: the network takes its own output as an input!
The input signal is a time series, indexed by time $t$. That is, at time $t$ the input signal $X_t$ is fed into the network $A$, and $A$'s output splits into two parts: one part becomes the output $h_t$, and the other is fed back into $A$ as a hidden signal. When the next input $X_{t+1}$ arrives, that hidden signal enters $A$ alongside it, so $A$ receives the signal from time $t$ together with the input at time $t+1$, and its output is in turn passed on to the $A$ of the next time step. If we unroll the figure above along the time axis $t$, we get the following picture.

See it? The information at $t=0$ is passed to the model $A$ at $t=1$, the information at $t=1$ is passed to the model $A$ at $t=2$, and so on. In effect, the RNN replicates itself along the time series: each copy handles the input of one time step, and the current step's output also serves as an input signal to the next step's model.

This chain-like structure reveals that RNNs are inherently tied to sequences: they are the most natural neural network architecture for time-series data. In theory, an RNN can retain information from arbitrarily early time steps. RNNs have already achieved solid results in speech recognition, natural language processing, image captioning, video processing, and other fields, and will shine even brighter. In practice, the most widely used RNN variant is the LSTM. Why the LSTM? Let's start from the limitations of the plain RNN.
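In symbols, every copy of $A$ computes the same function, $h_t = f(x_t, h_{t-1})$: the output at each step is determined by the current input together with the output handed over from the previous step.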
An RNN uses this "internal loop" to retain contextual information from the time series, so past signals can inform the understanding of the current one. This is an important advance, and in theory an RNN can retain information from any past moment. In practice, however, RNNs often run into trouble. Consider the following example.
Suppose we have built a language model that predicts the next word from the meaning of the sentence so far, and the text is: "我是一个中国人，出生在普通家庭，我最常说汉语，也喜欢写汉字。我喜欢妈妈做的菜" ("I am Chinese, born into an ordinary family; the language I most often speak is Chinese, and I also like writing Chinese characters. I love the dishes my mom makes"). When the model predicts the word "汉语" (Chinese) at the end of "我最常说汉语" ("the language I most often speak is Chinese"), it must guess which language follows "我最常说": it could be English, or it could be Chinese. To infer that the answer is Chinese rather than English or French, it needs the meaning of the opening clause "我是中国人" ("I am Chinese"). By contrast, when predicting "菜" (dishes), the final word of "我喜欢妈妈做的菜" ("I love the dishes my mom makes"), the model needs neither "我是中国人" nor any of that other information; that word has no necessary connection to whether I am Chinese.
This example tells us that to process a time series accurately, we sometimes only need information from the most recent time steps. When predicting the final word "菜" of "我喜欢妈妈做的菜", the information flows like this: "菜" is strongly related to the nearby words "我", "喜欢", "妈妈", "做", "的", and the distance to them is short, so these few words alone are enough to predict the final one.
At other times we need information from much earlier time steps, for example when predicting the final word "汉语" of "我最常说汉语". In that case the information flows like this:
Here, to predict the word "汉语", the words "我", "最", "常", "说" alone are not enough to conclude that the language is Chinese; we must trace back to the much earlier sentence "我是一个中国人" and use the word "中国人" (Chinese person) to infer that the language I most often speak is Chinese. So in this case, predicting "汉语" relies on information from a much earlier time step than predicting "菜" did.
Although an RNN can in theory retain information from every past time step, in practice the signal decays as it is passed along: once the time gap grows large enough, the information's effect is greatly diminished. Plain RNNs therefore have no good way to handle long-term dependencies.
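Roughly speaking, the cause is that during training the influence of time step $k$ on time step $t$ involves a product of $t-k$ factors,

$$\frac{\partial h_t}{\partial h_k} = \prod_{j=k+1}^{t} \frac{\partial h_j}{\partial h_{j-1}},$$

and when these factors are typically smaller than 1 in magnitude, the product shrinks exponentially as the gap $t-k$ grows. This is the well-known vanishing gradient problem.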
To overcome this problem, Hochreiter and Schmidhuber improved the RNN in 1997 and proposed a special RNN model, the LSTM network, which can learn long-term dependencies. Over the following two decades and more it has been refined and widely applied, with great success.
The Long Short-Term Memory (LSTM) network is a special RNN model whose architecture is designed to avoid the long-term dependency problem: remembering information from long ago is the LSTM's default behavior, not something it must pay a high price to achieve.
In a plain RNN, the repeating neural network module forms a chain like the one in the figure below. The repeated module has a very simple structure, just a single neural network layer (for example a tanh layer), which limits how much information it can process.
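For concreteness, here is that single-tanh-layer module as a minimal NumPy sketch; the dimensions and random weights are made up purely for illustration:

import numpy as np

def rnn_step(x_t, h_prev, W, b):
    # the entire repeating module is one tanh layer over [h_{t-1}, x_t]
    return np.tanh(W @ np.concatenate([h_prev, x_t]) + b)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4 + 3))        # 4-dim hidden state, 3-dim input
b = np.zeros(4)

h = np.zeros(4)                        # initial hidden state
for x_t in rng.normal(size=(5, 3)):    # a toy 5-step input sequence
    h = rnn_step(x_t, h, W, b)         # h carries the context forward
print(h)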
The LSTM improves on this structure: instead of a single neural network layer there are four, and they interact in a special way. At first glance the structure looks somewhat complicated, but don't worry, we will explain it step by step. Before walking through the layers, let's first fix some notation for the basic building blocks. The modules in the figure fall into the following kinds:
An LSTM has three main gate structures: the forget gate, the memory gate (also known as the input gate), and the output gate. These three gates control how the LSTM retains and passes on information, which is ultimately reflected in the cell state $C_t$ and the output signal $h_t$, as shown in the figure below. The figure marks how each gate is constructed and how the gates relate to one another, where:
As the name suggests, the forget gate is there to "forget" information. Some of the information flowing through an LSTM is unnecessary, so the forget gate's job is to select that information and "forget" it: it decides which parts of the cell state $C_{t-1}$ will be discarded. How does it work? Look at the figure below.

The highlighted structure on the left is the forget gate. It contains a sigmoid neural network layer (the yellow box, with parameters $W_f, b_f$) that receives the input signal $x_t$ at time $t$ and the LSTM's previous output signal $h_{t-1}$ from time $t-1$. The two signals are concatenated and fed together into the sigmoid layer, which outputs a signal $f_t$. $f_t$ takes values between 0 and 1, and it is multiplied with $C_{t-1}$ to decide which parts of $C_{t-1}$ are kept and which are discarded. If that still sounds abstract, here is a simple example.

Suppose $C_{t-1}=[0.5,0.6,0.4]$, $h_{t-1}=[0.3,0.8,0.69]$, $x_t=[0.2,1.3,0.7]$. The forget gate's input is the combination of $h_{t-1}$ and $x_t$, i.e. $\left[h_{t-1}, x_t\right]=[0.3,0.8,0.69,0.2,1.3,0.7]$, and the sigmoid neural network layer outputs a vector whose entries all lie between 0 and 1, say $f_t=[0.5,0.1,0.8]$. Note that $f_t$ is a vector of the same dimension as $C_{t-1}$, here 3-dimensional. A beginner may well ask: the input is clearly a 6-dimensional vector, so why is $f_t$ 3-dimensional? The confusion comes from mistaking the sigmoid neural network layer for the sigmoid activation function; they are not the same thing, and beginners easily mix them up. The same goes for the sigmoid and tanh neural network layers mentioned below: they are not simply the sigmoid and $\tanh$ activation functions, so take care to distinguish the two.
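In equation form, the forget gate computes $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$, and the dimension question above is answered by the shape of $W_f$: it maps the 6-dimensional concatenated vector down to 3 dimensions. A minimal NumPy sketch, with the weights randomly initialized purely for illustration:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

h_prev = np.array([0.3, 0.8, 0.69])       # h_{t-1}
x_t    = np.array([0.2, 1.3, 0.7])        # x_t
concat = np.concatenate([h_prev, x_t])    # [h_{t-1}, x_t], 6-dim

rng = np.random.default_rng(0)
W_f = rng.normal(size=(3, 6))   # maps the 6-dim input to 3 dims, matching C_{t-1}
b_f = np.zeros(3)

f_t = sigmoid(W_f @ concat + b_f)
print(f_t.shape)   # (3,): same dimension as C_{t-1}, not 6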
The memory gate does the opposite of the forget gate: it decides which parts of the newly arrived information in $x_t$ and $h_{t-1}$ will be retained. As the figure shows, the memory gate consists of two parts: a sigmoid neural network layer (the input gate, with parameters $W_i, b_i$) and a $\tanh$ neural network layer (with parameters $W_c, b_c$).
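Written out (these are the standard LSTM equations, consistent with the parameters named above), the two parts compute

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i), \qquad \widetilde{C}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c),$$

where $i_t$ decides how much of each component to write and $\widetilde{C}_t$ is the candidate information to be added to the cell state.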
With the forget gate and the memory gate in hand, we can now update the cell state $C_t$.
The forget gate's output $f_t$ is multiplied with the previous cell state $C_{t-1}$ to decide what to forget and what to keep, and the memory gate's output is added to the result, giving the new cell state $C_t$. In other words, the cell state at time $t$ combines what survives the discarding of the information carried over from time $t-1$ with the newly admitted information $i_t \cdot \widetilde{C}_t$ extracted from the input at time $t$. $C_t$ is then passed on to the LSTM at time $t+1$ as the new cell state.
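In equation form, the update just described is

$$C_t = f_t \cdot C_{t-1} + i_t \cdot \widetilde{C}_t,$$

with both products taken element-wise.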
We have now seen how the LSTM updates the cell state $C_t$. So once the input signal $x_t$ arrives at time $t$, how is the corresponding output signal computed?
As the left figure above shows, the output gate combines the cell state $C_t$ (carried over from time $t-1$ and already filtered by the forget and memory gates) with the previous output signal $h_{t-1}$ and the current input signal $x_t$ to form the output at the current time step. The process is shown in the figure above: $x_t$ and $h_{t-1}$ pass through a sigmoid neural network layer (with parameters $W_o, b_o$), which outputs a value $o_t$ between 0 and 1. $C_t$ passes through a tanh function (note: the $\tanh$ activation function, not a tanh neural network layer), which maps it to a value between $-1$ and $1$; multiplying the two gives the output signal $h_t$, which is also passed along as an input signal to the next time step. The $\tanh$ here is one kind of activation function; its graph is shown below:
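In equation form (standard LSTM, consistent with the parameters above):

$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o), \qquad h_t = o_t \cdot \tanh(C_t),$$

where $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$ squashes its input into the range $(-1, 1)$.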
At this point, all the pieces of the basic LSTM network model are in place.
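To tie the three gates together, here is a minimal NumPy sketch of a single LSTM cell step following the equations above; the weights are random and the dimensions are toy values, purely for illustration (in practice you would use an optimized implementation such as Keras's LSTM layer, as in the examples below):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, params):
    # one LSTM time step, following the gate equations above
    W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o = params
    z = np.concatenate([h_prev, x_t])     # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ z + b_f)          # forget gate
    i_t = sigmoid(W_i @ z + b_i)          # memory (input) gate
    C_tilde = np.tanh(W_c @ z + b_c)      # candidate cell state
    C_t = f_t * C_prev + i_t * C_tilde    # updated cell state
    o_t = sigmoid(W_o @ z + b_o)          # output gate
    h_t = o_t * np.tanh(C_t)              # output signal
    return h_t, C_t

# toy dimensions: 3-dim input, 3-dim hidden state and cell state
rng = np.random.default_rng(0)
params = tuple(m for _ in range(4)
                 for m in (rng.normal(size=(3, 6)), np.zeros(3)))
h, C = np.zeros(3), np.zeros(3)
for x_t in np.eye(3):                     # a toy 3-step input sequence
    h, C = lstm_step(x_t, h, C, params)
print(h, C)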
LSTM for Regression

The task in the examples below: given the number of international airline passengers in the current month, predict the number of passengers in the following month. First, load and plot the raw series:
import pandas as pd
import matplotlib.pyplot as plt
dataset = pd.read_csv('data/airline-passengers.csv', usecols=[1], engine='python')
plt.plot(dataset)
plt.show()
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
# fix random seed for reproducibility
tf.random.set_seed(7)
# load the dataset
dataframe = pd.read_csv('data/airline-passengers.csv', usecols=[1], engine='python')
dataset = dataframe.values
dataset = dataset.astype('float32')
dataframe
|     | Passengers |
|-----|------------|
| 0   | 112 |
| 1   | 118 |
| 2   | 132 |
| 3   | 129 |
| 4   | 121 |
| ... | ... |
| 139 | 606 |
| 140 | 508 |
| 141 | 461 |
| 142 | 390 |
| 143 | 432 |

144 rows × 1 columns
# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]
print(len(train), len(test))
96 48
# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]          # window of look_back values starting at i
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])  # the value that follows the window
    return np.array(dataX), np.array(dataY)
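To make the windowing concrete, here is what create_dataset produces on a toy series (values invented for illustration). Note that because of the extra -1 in the loop bound, the last possible window (40 to 50) is not used:

toy = np.array([[10.], [20.], [30.], [40.], [50.]])
X, y = create_dataset(toy, look_back=1)
print(X)   # [[10.] [20.] [30.]]  inputs at time t
print(y)   # [20. 30. 40.]        targets at time t+1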
# reshape into X=t and Y=t+1
look_back = 1
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
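As a quick sanity check on the resulting shapes (with train/test lengths 96 and 48 and look_back=1, each split yields len - look_back - 1 samples):

print(trainX.shape, trainY.shape)   # (94, 1, 1) (94,)
print(testX.shape, testY.shape)     # (46, 1, 1) (46,)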
# create and fit the LSTM network
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)
Epoch 1/100
94/94 - 1s - loss: 0.0499 - 1s/epoch - 13ms/step
Epoch 2/100
94/94 - 0s - loss: 0.0266 - 99ms/epoch - 1ms/step
Epoch 3/100
94/94 - 0s - loss: 0.0194 - 99ms/epoch - 1ms/step
...
Epoch 99/100
94/94 - 0s - loss: 0.0020 - 114ms/epoch - 1ms/step
Epoch 100/100
94/94 - 0s - loss: 0.0020 - 97ms/epoch - 1ms/step
<keras.callbacks.History at 0x212cad442b0>
# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])
# calculate root mean squared error
trainScore = np.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = np.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
3/3 [==============================] - 0s 2ms/step
2/2 [==============================] - 0s 2ms/step
Train Score: 22.71 RMSE
Test Score: 48.70 RMSE
# shift train predictions for plotting
trainPredictPlot = np.empty_like(dataset)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict
# shift test predictions for plotting; the (look_back*2)+1 offset skips the
# train predictions plus the samples lost to windowing in each split
testPredictPlot = np.empty_like(dataset)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1, :] = testPredict
# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()
Complete code
# LSTM for international airline passengers problem with regression framing
import numpy as np
import matplotlib.pyplot as plt
from pandas import read_csv
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return np.array(dataX), np.array(dataY)
# fix random seed for reproducibility
tf.random.set_seed(7)
# load the dataset
dataframe = read_csv('data/airline-passengers.csv', usecols=[1], engine='python')
dataset = dataframe.values
dataset = dataset.astype('float32')
# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]
# reshape into X=t and Y=t+1
look_back = 1
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
# create and fit the LSTM network
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)
# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])
# calculate root mean squared error
trainScore = np.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = np.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
# shift train predictions for plotting
trainPredictPlot = np.empty_like(dataset)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict
# shift test predictions for plotting
testPredictPlot = np.empty_like(dataset)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1, :] = testPredict
# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()
Epoch 1/100
94/94 - 1s - loss: 0.0357 - 1s/epoch - 15ms/step
Epoch 2/100
94/94 - 0s - loss: 0.0171 - 112ms/epoch - 1ms/step
Epoch 3/100
94/94 - 0s - loss: 0.0128 - 127ms/epoch - 1ms/step
...
Epoch 100/100
94/94 - 0s - loss: 0.0020 - 106ms/epoch - 1ms/step
3/3 [==============================] - 0s 2ms/step
2/2 [==============================] - 0s 1ms/step
Train Score: 22.73 RMSE
Test Score: 48.48 RMSE
LSTM for Regression Using the Window Method
# LSTM for international airline passengers problem with window regression framing
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from pandas import read_csv
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import LSTM
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset)-look_back-1):
        a = dataset[i:(i+look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])
    return np.array(dataX), np.array(dataY)
# fix random seed for reproducibility
tf.random.set_seed(7)
# load the dataset
dataframe = read_csv('data/airline-passengers.csv', usecols=[1], engine='python')
dataset = dataframe.values
dataset = dataset.astype('float32')
# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size,:], dataset[train_size:len(dataset),:]
# reshape into X=t and Y=t+1
look_back = 3
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
# reshape input to be [samples, time steps, features]
trainX = np.reshape(trainX, (trainX.shape[0], 1, trainX.shape[1]))
testX = np.reshape(testX, (testX.shape[0], 1, testX.shape[1]))
# create and fit the LSTM network
model = Sequential()
model.add(LSTM(4, input_shape=(1, look_back)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=100, batch_size=1, verbose=2)
# make predictions
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
# invert predictions
trainPredict = scaler.inverse_transform(trainPredict)
trainY = scaler.inverse_transform([trainY])
testPredict = scaler.inverse_transform(testPredict)
testY = scaler.inverse_transform([testY])
# calculate root mean squared error
trainScore = np.sqrt(mean_squared_error(trainY[0], trainPredict[:,0]))
print('Train Score: %.2f RMSE' % (trainScore))
testScore = np.sqrt(mean_squared_error(testY[0], testPredict[:,0]))
print('Test Score: %.2f RMSE' % (testScore))
# shift train predictions for plotting
trainPredictPlot = np.empty_like(dataset)
trainPredictPlot[:, :] = np.nan
trainPredictPlot[look_back:len(trainPredict)+look_back, :] = trainPredict
# shift test predictions for plotting
testPredictPlot = np.empty_like(dataset)
testPredictPlot[:, :] = np.nan
testPredictPlot[len(trainPredict)+(look_back*2)+1:len(dataset)-1, :] = testPredict
# plot baseline and predictions
plt.plot(scaler.inverse_transform(dataset))
plt.plot(trainPredictPlot)
plt.plot(testPredictPlot)
plt.show()
Epoch 1/100
92/92 - 1s - loss: 0.0481 - 1s/epoch - 13ms/step
Epoch 2/100
92/92 - 0s - loss: 0.0181 - 105ms/epoch - 1ms/step
Epoch 3/100
92/92 - 0s - loss: 0.0122 - 106ms/epoch - 1ms/step
...
Epoch 100/100
92/92 - 0s - loss: 0.0017 - 100ms/epoch - 1ms/step
3/3 [==============================] - 0s 1ms/step
2/2 [==============================] - 0s 2ms/step
Train Score: 22.12 RMSE
Test Score: 70.08 RMSE