
1. The Merit Student Score Problem

1.1 Building the Neural Network

The formula for a merit student's total score is

$$\text{total} = w_1 \cdot \text{moral} + w_2 \cdot \text{intellectual} + w_3 \cdot \text{physical}$$

At the moment we have only two data points:

$$\begin{aligned} 90w_1 + 80w_2 + 70w_3 &= 85 \\ 98w_1 + 95w_2 + 87w_3 &= 96 \end{aligned}$$

The task is to estimate the values of $w_1, w_2, w_3$.
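Two equations alone cannot pin down three unknowns. As a side note, if we additionally assume the weights sum to 1 (a common convention for grade weights — this constraint is an assumption, not part of the problem statement), the system becomes a uniquely solvable 3×3 linear system:

```python
import numpy as np

# The two known score equations plus the assumed constraint w1 + w2 + w3 = 1
A = np.array([[90.0, 80.0, 70.0],
              [98.0, 95.0, 87.0],
              [ 1.0,  1.0,  1.0]])
b = np.array([85.0, 96.0, 1.0])

# Solve the 3x3 system directly
w = np.linalg.solve(A, b)
print(w)  # close to [0.6, 0.3, 0.1]
```

The neural network below, by contrast, estimates the weights iteratively rather than solving for them in closed form.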

[Network design]

Input layer: 3 nodes, representing the three course scores $x_1, x_2, x_3$

Hidden layer: 3 nodes, whose operations are multiplication by $w_1, w_2, w_3$ respectively

Output layer: one node representing the result $y$; its operation is the sum $\Sigma$

The implementation:

```python
# For now, use only TensorFlow 1.x behavior
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Define the three input nodes as placeholders
x1 = tf.placeholder(dtype=tf.float32)
x2 = tf.placeholder(dtype=tf.float32)
x3 = tf.placeholder(dtype=tf.float32)

# Define the three trainable parameters, initialized to 0.1
w1 = tf.Variable(0.1, dtype=tf.float32)
w2 = tf.Variable(0.1, dtype=tf.float32)
w3 = tf.Variable(0.1, dtype=tf.float32)

# Define the three hidden nodes; each multiplies an input by its weight
n1 = x1 * w1
n2 = x2 * w2
n3 = x3 * w3

# Define the output layer, which sums the hidden-node values
y = n1 + n2 + n3

# Create a session object; the session runs the whole network's computation
sess = tf.Session()

# Initialize the trainable parameters
init = tf.global_variables_initializer()
sess.run(init)

# Run the computation and print the result
result = sess.run([x1, x2, x3, w1, w2, w3, y],
                  feed_dict={x1: 90, x2: 80, x3: 70})
print(result)
```

Running it prints to the terminal:

```
# Warning: the CPU's full instruction set is not being used
I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
```

```python
# Printed result
[array(90., dtype=float32), array(80., dtype=float32), array(70., dtype=float32), 0.1, 0.1, 0.1, 24.0]
```
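The final value 24.0 is just the weighted sum with every weight still at its initial value 0.1, which a quick plain-Python check of the same arithmetic confirms:

```python
# Reproduce the network's forward pass without TensorFlow
w1 = w2 = w3 = 0.1
x1, x2, x3 = 90, 80, 70

y = x1 * w1 + x2 * w2 + x3 * w3
print(y)  # essentially 24.0 (up to floating-point rounding)
```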

1.2 Training the Neural Network

Next we let the network adjust its parameters so that its output approaches the target:

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x1 = tf.placeholder(dtype=tf.float32)
x2 = tf.placeholder(dtype=tf.float32)
x3 = tf.placeholder(dtype=tf.float32)

# Target value, i.e. the training label
yTrain = tf.placeholder(dtype=tf.float32)

w1 = tf.Variable(0.1, dtype=tf.float32)
w2 = tf.Variable(0.1, dtype=tf.float32)
w3 = tf.Variable(0.1, dtype=tf.float32)

n1 = x1 * w1
n2 = x2 * w2
n3 = x3 * w3

y = n1 + n2 + n3

# Define the loss: the absolute distance from the target value
loss = tf.abs(y - yTrain)

# Create the optimizer with a learning rate of 0.001
optimizer = tf.train.RMSPropOptimizer(0.001)

# The training op is defined as minimizing the loss
train = optimizer.minimize(loss)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

result = sess.run([train, x1, x2, x3, w1, w2, w3, y, yTrain, loss],
                  feed_dict={x1: 90, x2: 80, x3: 70, yTrain: 85})
print(result)

result = sess.run([train, x1, x2, x3, w1, w2, w3, y, yTrain, loss],
                  feed_dict={x1: 98, x2: 95, x3: 87, yTrain: 96})
print(result)
```

Output:

```python
[None, array(90., dtype=float32), array(80., dtype=float32), array(70., dtype=float32), 0.10316052, 0.10316006, 0.103159375, 24.0, array(85., dtype=float32), 61.0]
[None, array(98., dtype=float32), array(95., dtype=float32), array(87., dtype=float32), 0.10554425, 0.10563005, 0.1056722, 28.884804, array(96., dtype=float32), 67.1152]
```

The loss on the second sample (67.1) is even larger than on the first (61.0), so the network needs to keep looping and let the optimizer continue refining the weights.

tf.train.RMSPropOptimizer is the same family of optimizer reportedly used by AlphaGo. Note that its first positional argument is the learning rate (0.001 in the code above), not epsilon; epsilon is a separate small constant added inside the update for numerical stability.
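As a sanity check on what one optimizer step does, here is a minimal NumPy sketch of a single RMSProp update for w1 on the first sample, using TF1's default decay of 0.9 and no momentum (the tiny epsilon term is dropped for simplicity — a simplification of mine). Since the loss is |y − yTrain| and the initial y = 24 is below the target 85, the gradient of the loss with respect to w1 is −x1 = −90:

```python
import numpy as np

lr, decay = 0.001, 0.9   # learning rate and RMSProp decay (TF1 defaults)
w1, s = 0.1, 0.0         # initial weight and accumulated squared gradient

g = -90.0                            # d|y - yTrain|/dw1 = -x1 when y < yTrain
s = decay * s + (1 - decay) * g**2   # running average of the squared gradient
w1 -= lr * g / np.sqrt(s)            # gradient step scaled by RMS magnitude

print(w1)  # approx 0.10316, matching TF's printed 0.10316052 above
```

Because the step is normalized by the gradient's RMS magnitude, all three weights move by nearly the same amount on the first step, which is exactly what the printed values show.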

1.3 Running Multiple Training Rounds

Wrap the two training steps above in a loop:

```python
for _ in range(5000):
    result = sess.run([train, x1, x2, x3, w1, w2, w3, y,
                       yTrain, loss],
                      feed_dict={x1: 90, x2: 80, x3: 70, yTrain: 85})
    result = sess.run([train, x1, x2, x3, w1, w2, w3, y,
                       yTrain, loss],
                      feed_dict={x1: 98, x2: 95, x3: 87, yTrain: 96})
print(result)
```

You can compare how different loop counts affect the result. After 5000 rounds, the final step prints:

```python
[None, array(98., dtype=float32), array(95., dtype=float32), array(87., dtype=float32), 0.5828438, 0.2860972, 0.13144642, 96.03325, array(96., dtype=float32), 0.0332489]
```
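We can sanity-check the learned weights by plugging the printed values back into the scoring formula for both students. A small residual is expected, since the loop alternates between the two samples and each step nudges the weights toward whichever target it saw last:

```python
# Weights taken from the printed training output above
w = (0.5828438, 0.2860972, 0.13144642)

# Each entry: the three course scores and the target total score
students = [((90, 80, 70), 85),
            ((98, 95, 87), 96)]

for (x1, x2, x3), target in students:
    pred = x1 * w[0] + x2 * w[1] + x3 * w[2]
    print(pred, target)   # predictions land within ~0.5 of each target
```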