
1. Introduction to Keras

It is recommended that you have background in one or more of the following:

  • TensorFlow
  • NumPy
  • SciPy
  • Theano
  • CNTK

See also: Keras 中文文档 (the Chinese Keras documentation)

2. Backend

The backend is configured in the file $HOME/.keras/keras.json:

```json
{
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}
```

  • Valid values for backend are theano, tensorflow, and cntk.
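
You can check at runtime which backend was picked up. A minimal sketch, assuming the `keras.backend.backend()` helper of Keras 2.x; the `KERAS_BACKEND` environment variable, when set before importing Keras, overrides the value in keras.json:

```py
import os
# Optional override; must be set before `import keras`
os.environ.setdefault('KERAS_BACKEND', 'tensorflow')

from keras import backend as K
print(K.backend())  # e.g. 'tensorflow'
```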

3. Regressor

We have some data and want to approximate it with a straight line; this is a regression problem.

Let's build and train the model:

```py
import matplotlib.pyplot as plt
import numpy as np
from keras.layers import Dense
from keras.models import Sequential

np.random.seed(1337)

# 200 points on the line y = 0.5x + 2, plus Gaussian noise
X = np.linspace(-1, 1, 200)
np.random.shuffle(X)
Y = 0.5 * X + 2 + np.random.normal(0, 0.05, (200,))

plt.scatter(X, Y)
plt.show()

# First 160 points for training, the remaining 40 for testing
X_train, Y_train = X[:160], Y[:160]
X_test, Y_test = X[160:], Y[160:]

# Build the model: a single dense layer, i.e. y = wx + b

model = Sequential()
model.add(Dense(1, input_dim=1))
model.compile(loss='mse', optimizer='sgd')

# Train batch by batch, printing the loss every 100 steps

for step in range(301):
    cost = model.train_on_batch(X_train, Y_train)
    if step % 100 == 0:
        print('cost:', cost)

# Evaluate on the test set and inspect the learned weights

cost = model.evaluate(X_test, Y_test, batch_size=40)
W, b = model.layers[0].get_weights()

print('W:', W, 'b:', b)

Y_pred = model.predict(X_test)
plt.scatter(X_test, Y_test)
plt.plot(X_test, Y_pred)
plt.show()
```

Output:

```

cost: 4.219132423400879
cost: 0.11019308865070343
cost: 0.01302764005959034
cost: 0.0049691214226186275
1/1 [==============================] - 0s 138ms/step - loss: 0.0058
W: [[0.5734286]] b: [2.0012124]
```
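
The learned parameters, W ≈ 0.57 and b ≈ 2.00, are close to the generating line Y = 0.5X + 2. As a usage note, the manual `train_on_batch` loop can be replaced by a single `fit` call; a minimal sketch (the epoch count is an illustrative choice, not from the original):

```py
# Equivalent training driven by fit() instead of a manual loop
model.fit(X_train, Y_train, epochs=100, batch_size=40, verbose=0)
```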

4. Classifier

Train on the MNIST dataset, classifying the handwritten digits into ten classes.

```py
import numpy as np
from keras.datasets import mnist
from keras.layers import Activation, Dense
from keras.models import Sequential
from keras.optimizers import RMSprop
from keras.utils import to_categorical

np.random.seed(1337)


(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Flatten each 28x28 image to a 784-vector and scale pixels to [0, 1]
X_train = X_train.reshape(X_train.shape[0], -1) / 255
X_test = X_test.reshape(X_test.shape[0], -1) / 255

# One-hot encode the labels
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)

model = Sequential([
    Dense(32, input_dim=784),
    Activation('relu'),
    Dense(10),
    Activation('softmax')
])

rmsprop = RMSprop(learning_rate=0.001, rho=0.9, epsilon=1e-8)
model.compile(
    optimizer=rmsprop,
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

model.fit(X_train, y_train, epochs=2, batch_size=32)


loss, accuracy = model.evaluate(X_test, y_test)

print('test loss:', loss)
print('test accuracy:', accuracy)
```

Output:

```

Epoch 1/2
1875/1875 [==============================] - 5s 2ms/step - loss: 0.3486 - accuracy: 0.9030   
Epoch 2/2
1875/1875 [==============================] - 5s 3ms/step - loss: 0.1977 - accuracy: 0.9437
313/313 [==============================] - 1s 2ms/step - loss: 0.1690 - accuracy: 0.9487
test loss: 0.16901984810829163   
test accuracy: 0.9487000107765198
```
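
To map the classifier's softmax outputs back to digit labels, take the argmax over the ten class probabilities; a minimal sketch:

```py
# Predict class probabilities for a few test images and recover
# the digit label as the index of the largest softmax output
probs = model.predict(X_test[:5])
print(np.argmax(probs, axis=1))       # predicted digits
print(np.argmax(y_test[:5], axis=1))  # true digits
```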

5. CNN (Convolutional Neural Network)

Train on the MNIST data with a convolutional neural network.

```py
import numpy as np
from keras.datasets import mnist
from keras.layers import Activation, Conv2D, Dense, Flatten, MaxPooling2D
from keras.models import Sequential
from keras.optimizers import Adam
from keras.utils import to_categorical

np.random.seed(1337)

(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Add a trailing channel axis (channels_last, matching the keras.json
# above) and scale pixels to [0, 1]
X_train = X_train.reshape(-1, 28, 28, 1) / 255
X_test = X_test.reshape(-1, 28, 28, 1) / 255

# One-hot encode the labels
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)

model = Sequential()
# Conv layer 1: 32 filters of 5x5, output shape (28, 28, 32)
model.add(Conv2D(
    32,
    kernel_size=(5, 5),
    input_shape=(28, 28, 1),
    padding='same'
))
model.add(Activation('relu'))
# Max pooling: downsample to (14, 14, 32)
model.add(MaxPooling2D(
    pool_size=(2, 2),
    strides=(2, 2),
    padding='same'
))
# Conv layer 2: 64 filters of 5x5, output shape (14, 14, 64)
model.add(Conv2D(
    64,
    kernel_size=(5, 5),
    padding='same'
))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same'))  # -> (7, 7, 64)
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation('relu'))
model.add(Dense(10))
model.add(Activation('softmax'))

adam = Adam(learning_rate=1e-4)

model.compile(optimizer=adam,
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=1, batch_size=32)

loss, accuracy = model.evaluate(X_test, y_test)
print('test loss:', loss)
print('test accuracy:', accuracy)
```

Output:

```

1875/1875 [==============================] - 54s 28ms/step - loss: 0.2956 - accuracy: 0.9194
313/313 [==============================] - 3s 7ms/step - loss: 0.1317 - accuracy: 0.9583
test loss: 0.13174782693386078
test accuracy: 0.958299994468689
```
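
To verify the layer shapes noted in the comments above, `model.summary()` prints each layer's output shape and parameter count:

```py
# Tabulate output shapes and parameter counts, layer by layer
model.summary()
```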

6. RNN (Recurrent Neural Network)

When the data points are sequentially related, the network needs to remember earlier inputs.

Let the input at time $t$ be $X(t)$. The RNN consumes this input, produces the output $Y(t)$, and enters the state $S(t)$. At the next time step it maps $X(t+1) \rightarrow Y(t+1)$, and its internal state now incorporates both $S(t)$ and $S(t+1)$.
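
The standard recurrence makes this explicit. A sketch of the conventional formulation (the weight matrices $W_x$, $W_s$, $W_y$ and activations $f$, $g$ are the usual textbook names, not from the original):

$$
S(t) = f\big(W_x\,X(t) + W_s\,S(t-1)\big), \qquad Y(t) = g\big(W_y\,S(t)\big)
$$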

Long Short-Term Memory networks (LSTM RNNs) solve some gradient problems that plain RNNs cannot.

During error backpropagation, the gradient may grow larger and larger as it propagates through the network (exploding gradients) or shrink smaller and smaller (vanishing gradients).
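
As an illustration (this section has no code in the original), here is a minimal sketch of an LSTM classifier on MNIST, reading each image as a sequence of 28 rows of 28 pixels; the layer size and hyperparameters are illustrative choices:

```py
import numpy as np
from keras.datasets import mnist
from keras.layers import Dense, LSTM
from keras.models import Sequential
from keras.utils import to_categorical

np.random.seed(1337)

(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Each sample becomes a sequence: 28 time steps of 28 features
X_train = X_train.reshape(-1, 28, 28) / 255
X_test = X_test.reshape(-1, 28, 28) / 255
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)

model = Sequential([
    LSTM(50, input_shape=(28, 28)),   # 50 units: an arbitrary choice
    Dense(10, activation='softmax'),  # one output per digit class
])
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs=1, batch_size=32)
loss, accuracy = model.evaluate(X_test, y_test)
print('test loss:', loss)
print('test accuracy:', accuracy)
```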