
1. A Simplified Neural Network

1.1 Inspecting Variables at Runtime

python
name = 'adam'
print('Name: {}'.format(name))

1.2 Tensors

Run the following code:

python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x1 = tf.placeholder(dtype=tf.float32)
x2 = tf.placeholder(dtype=tf.float32)
x3 = tf.placeholder(dtype=tf.float32)

yTrain = tf.placeholder(dtype=tf.float32)

print('x1:', x1)

w1 = tf.Variable(0.1, dtype=tf.float32)
w2 = tf.Variable(0.1, dtype=tf.float32)
w3 = tf.Variable(0.1, dtype=tf.float32)
print('w1:', w1)

n1 = x1 * w1
n2 = x2 * w2
n3 = x3 * w3
print('n1:', n1)

y = n1 + n2 + n3

print('y:', y)

Output:

x1: Tensor("Placeholder:0", dtype=float32)
w1: <tf.Variable 'Variable:0' shape=() dtype=float32_ref>
n1: Tensor("mul:0", dtype=float32) 
y: Tensor("add_1:0", dtype=float32)

We can see that:

  • x1 is a Tensor object
  • w1 is a tf.Variable object
  • n1 and y are also Tensor objects, but their underlying operations (mul and add) differ from x1's (Placeholder)

A tensor is an object that takes input data and produces output through a computation; the process of tensors flowing through the neural network is the computation itself.
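The graph above only describes the computation; nothing runs until a session executes it. A plain-Python sketch of the same weighted sum (with hypothetical input values, no TensorFlow) makes the data flow concrete:

```python
# Plain-Python sketch of the computation the graph above encodes.
# The input values are hypothetical; in the graph they would be fed
# through placeholders at session run time.
x1, x2, x3 = 90.0, 80.0, 70.0   # inputs (e.g. three exam scores)
w1 = w2 = w3 = 0.1              # initial weights, as in tf.Variable(0.1)

n1 = x1 * w1                    # the "mul" ops
n2 = x2 * w2
n3 = x3 * w3
y = n1 + n2 + n3                # the "add" ops

print(y)  # 24.0
```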

1.3 Reorganizing the Input Data

Let's rework the code:

python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x = tf.placeholder(shape=(3,), dtype=tf.float32)
yTrain = tf.placeholder(shape=(), dtype=tf.float32)
w = tf.Variable(tf.zeros((3,)), dtype=tf.float32)
n = x * w
y = tf.reduce_sum(n)

loss = tf.abs(y - yTrain)
optimizer = tf.train.RMSPropOptimizer(0.001)
train = optimizer.minimize(loss)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
result = sess.run([train, x, w, y, yTrain, loss],
                  feed_dict = {
                      x: [90, 80, 70],
                      yTrain: 85
                  })
print(result)
result = sess.run([train, x, w, y, yTrain, loss],
                  feed_dict = {
                      x: [98, 95, 87],
                      yTrain: 96
                  })
print(result)

Output:

python
[None, array([90., 80., 70.], dtype=float32), array([0.00316052, 0.00316006, 0.00315938], dtype=float32), 0.0, array(85., dtype=float32), 85.0]
[None, array([98., 95., 87.], dtype=float32), array([0.00554424, 0.00563004, 0.0056722 ], dtype=float32), 0.88480234, array(96., dtype=float32), 95.1152]

If we wrap the two training steps above in a loop that runs 5000 times:

python
for _ in range(5000):
    result = sess.run([train, x, w, y, yTrain, loss],
                    feed_dict={
                        x: [90, 80, 70],
                        yTrain: 85
                    })
    result = sess.run([train, x, w, y, yTrain, loss],
                    feed_dict={
                        x: [98, 95, 87],
                        yTrain: 96
                    })
print(result)

Output:

python
[None, array([98., 95., 87.], dtype=float32), array([0.58403337, 0.28828683, 0.1340753 ], dtype=float32), 95.987595, array(96., dtype=float32), 0.0124053955]
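The training loop above can be mimicked without TensorFlow. The sketch below uses plain sign-gradient descent on the same loss, loss = |y - yTrain| with y = sum(x * w); the fixed learning rate is an assumption, and this is not RMSProp, so the numbers differ from the output above, but y still converges toward the targets:

```python
# Plain-Python sketch of the loop above: gradient descent on
# loss = |y - yTrain|, whose gradient w.r.t. w_i is sign(y - yTrain) * x_i.
# lr is a hypothetical learning rate (the original uses RMSPropOptimizer).
def step(w, x, y_train, lr=1e-5):
    y = sum(xi * wi for xi, wi in zip(x, w))
    sign = 1.0 if y > y_train else -1.0        # d|y - yTrain| / dy
    return [wi - lr * sign * xi for xi, wi in zip(x, w)]

w = [0.0, 0.0, 0.0]
for _ in range(5000):
    w = step(w, [90, 80, 70], 85)
    w = step(w, [98, 95, 87], 96)

y = sum(xi * wi for xi, wi in zip([98, 95, 87], w))
print(w, y)  # y ends up close to the target yTrain = 96
```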

1.4 Scalars, Vectors, and Tensors

Definitions:

  • A scalar is a single number
  • A vector is a one-dimensional array
  • A tensor is a multi-dimensional array

A one-dimensional matrix is a vector:

$$\begin{bmatrix} 10 & 20 & 30 \end{bmatrix}$$

A two-dimensional array is a matrix:

$$\begin{bmatrix} 90 & 80 & 70 \\ 98 & 95 & 87 \end{bmatrix}$$

Its shape is $(2,\;3)$.

A three-dimensional array can be written as

$$\begin{bmatrix} \begin{bmatrix} 90 & 80 & 70 \\ 98 & 95 & 87 \end{bmatrix} & \begin{bmatrix} 99 & 88 & 77 \\ 88 & 90 & 63 \end{bmatrix} \end{bmatrix}$$

or

$$\begin{bmatrix} \begin{bmatrix} 90 & 80 & 70 \\ 98 & 95 & 87 \end{bmatrix} \\ \\ \begin{bmatrix} 99 & 88 & 77 \\ 88 & 90 & 63 \end{bmatrix} \end{bmatrix}$$

Its shape is $(2,\;2,\;3)$.
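Shapes like the ones above can be read off a nested Python list directly. The helper below is hypothetical (not part of TensorFlow), but for regular nested lists it reports the same shapes that tf.shape does:

```python
# A small helper (hypothetical) that computes the shape of a regular
# nested Python list, one dimension per level of nesting.
def shape_of(data):
    dims = []
    while isinstance(data, list):
        dims.append(len(data))
        data = data[0]
    return tuple(dims)

print(shape_of(10))                                # ()      -> a scalar
print(shape_of([10, 20, 30]))                      # (3,)    -> a vector
print(shape_of([[90, 80, 70], [98, 95, 87]]))      # (2, 3)  -> a matrix
print(shape_of([[[90, 80, 70], [98, 95, 87]],
                [[99, 88, 77], [88, 90, 63]]]))    # (2, 2, 3)
```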

1.5 Inspecting and Setting Tensor Shapes

python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x = tf.placeholder(dtype=tf.float32)
xShape = tf.shape(x)

sess = tf.Session()

result = sess.run(xShape, feed_dict={x: 8})
print(result)

result = sess.run(xShape, feed_dict={x: [1, 2, 3]})
print(result)

result = sess.run(xShape,
                  feed_dict={x: [[1, 2, 3], [3, 6, 9]]})
print(result)

Output:

python
[]
[3]
[2 3]

The shape can also be set when the tensor is defined:

python
x = tf.placeholder(shape=(2, 3), dtype=tf.float32)

1.6 Normalizing Trainable Parameters with the softmax Function

In the previous section, we assumed that the weights should sum to one, i.e. $w_1 + w_2 + w_3 = 1$.

Modify the previous code to add the softmax() function:

python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x = tf.placeholder(shape=(3,), dtype=tf.float32)
yTrain = tf.placeholder(shape=(), dtype=tf.float32)
w = tf.Variable(tf.zeros((3,)), dtype=tf.float32)

# After this runs, wn is a vector whose components sum to 1
wn = tf.nn.softmax(w)
n = x * wn
y = tf.reduce_sum(n)

loss = tf.abs(y - yTrain)
optimizer = tf.train.RMSPropOptimizer(0.1)
train = optimizer.minimize(loss)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

for _ in range(2):
    result = sess.run([train, x, w, wn, y, yTrain, loss],
                    feed_dict={
                        x: [90, 80, 70],
                        yTrain: 85
                    })
    print(result[3])
    result = sess.run([train, x, w, wn, y, yTrain, loss],
                    feed_dict={
                        x: [98, 95, 87],
                        yTrain: 96
                    })
    print(result[3])

Output:

python
[0.33333334 0.33333334 0.33333334]
[0.413998   0.32727832 0.2587237 ]
[0.44992    0.32819405 0.22188595]
[0.5284719  0.2905868  0.18094125]

The argument to softmax() can also be a list of variables, as below:

python
wn = tf.nn.softmax([w1, w2, w3])
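The sum-to-one property can be checked with a plain-Python softmax (no TensorFlow needed). With all-zero weights it reproduces the equal thirds seen in the first output row above:

```python
# Plain-Python softmax: exponentiate, then normalize so entries sum to 1.
import math

def softmax(values):
    m = max(values)                            # subtract max for stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

wn = softmax([0.0, 0.0, 0.0])   # all-zero weights, as after initialization
print(wn)                       # [0.333..., 0.333..., 0.333...]
print(sum(wn))                  # 1.0
```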

1.7 Linear Problems

The "three-good-student" problem is a linear problem, satisfying

$y = wx$

More general linear problems are handled with the more adaptable formula

$y = wx + b$

where the parameter $b$ is called the bias.

To solve nonlinear problems, we first use a linear model as each neuron's node and then apply a nonlinear transformation to its output.
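This "linear model plus nonlinearity" structure can be sketched in plain Python. The sigmoid activation and all numeric values below are illustrative assumptions, not from the text:

```python
# Sketch of one neuron: a linear model y = w.x + b followed by a
# nonlinear activation (sigmoid chosen here as an example).
import math

def neuron(x, w, b):
    linear = sum(xi * wi for xi, wi in zip(x, w)) + b   # y = wx + b
    return 1.0 / (1.0 + math.exp(-linear))              # nonlinear step

# Hypothetical inputs, weights, and bias:
out = neuron([1.0, 2.0], [0.5, -0.25], 0.1)
print(out)  # sigmoid(0.5*1 - 0.25*2 + 0.1) = sigmoid(0.1), about 0.525
```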