【译】Effective TensorFlow Chapter1——TensorFlow 基础

Tensorflow

本文翻译自：《TensorFlow Basics》，如有侵权请联系删除，仅限于学术交流，请勿商用。如有谬误，请联系指出。

TensorFlow 和其他数值计算库（如 NumPy）之间最显著的区别在于 TensorFlow 中的操作是基于符号运算的。这是一个强大的概念，它允许 TensorFlow 执行命令式库（如 NumPy）所不能做的所有事情（例如，自动区分）。但这也要付出更大的代价。在我我试图揭秘 TensorFlow，并提供一些指导方针和最佳实践，以便更有效地使用 TensorFlow。让我们从一个简单的例子开始，我们要乘以两个随机矩阵。首先，我们看一个在 NumPy 完成的实施：

import numpy as np

x = np.random.normal(size=[10, 10])
y = np.random.normal(size=[10, 10])
z = np.dot(x, y)

print(z)

现在我们在 TensorFlow 中执行完全相同的计算：

import tensorflow as tf

x = tf.random_normal([10, 10])
y = tf.random_normal([10, 10])
z = tf.matmul(x, y)

sess = tf.Session()
z_val = sess.run(z)

print(z_val)

与立即执行计算并生成结果的 NumPy 不同， tensorflow 只向图中表示结果的节点提供一个 handle（类型为 Tensor）。如果我们尝试直接打印 Z 值，我们会得到这样的结果：

Tensor("MatMul:0", shape = (10, 10), dtype = float32)

由于两个输入都具有完全定义的形状，所以 tensorflow 能够推断 张量 的形状及其类型。为了计算张量的值，我们需要创建一个 会话 并使用 Session.run() 方法评估它。

提示：当使用 Jupyter 笔记本时，请确保在定义新节点之前在开始调用 tf.reset_default_graph() 来清除符号图。

为了理解符号计算有多么强大，让我们看看另一个例子。假设我们有来自曲线（例如 f(x)=5x^2+3 ）的样本，并且我们希望基于这些样本预测 f(x) 。我们定义一个参数函数 g(x, w) = w0 x^2 + w1 x + w2 ，它是输入 x 和潜在参数 w 的函数，我们的目标是找到潜在的参数，使得 g(x，w)≈f(x) 。这可以通过最小化损失函数来完成： L(w) = ∑ (f(x) - g(x, w))^2. 。虽然这个简单问题有一个闭合解(a closed form solution)，但我们选择使用更通用的方法，可以应用于任意的可微函数，并且使用随机梯度下降。我们简单地计算 L(w)相对于一组采样点上的 w 的平均梯度，并沿相反方向移动。

以下是利用 TensorFlow 完成的代码：

import numpy as np
import tensorflow as tf

# Placeholders are used to feed values from python to TensorFlow ops. We define
# two placeholders, one for input feature x, and one for output y.
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

# Assuming we know that the desired function is a polynomial of 2nd degree, we
# allocate a vector of size 3 to hold the coefficients. The variable will be
# automatically initialized with random noise.
w = tf.get_variable("w", shape=[3, 1])

# We define yhat to be our estimate of y.
f = tf.stack([tf.square(x), x, tf.ones_like(x)], 1)
yhat = tf.squeeze(tf.matmul(f, w), 1)

# The loss is defined to be the l2 distance between our estimate of y and its
# true value. We also added a shrinkage term, to ensure the resulting weights
# would be small.
loss = tf.nn.l2_loss(yhat - y) + 0.1 * tf.nn.l2_loss(w)

# We use the Adam optimizer with learning rate set to 0.1 to minimize the loss.
train_op = tf.train.AdamOptimizer(0.1).minimize(loss)

def generate_data():
    x_val = np.random.uniform(-10.0, 10.0, size=100)
    y_val = 5 * np.square(x_val) + 3
    return x_val, y_val

sess = tf.Session()
# Since we are using variables we first need to initialize them.
sess.run(tf.global_variables_initializer())
for _ in range(1000):
    x_val, y_val = generate_data()
    _, loss_val = sess.run([train_op, loss], {x: x_val, y: y_val})
    print(loss_val)
print(sess.run([w]))

通过运行这段代码，你应该看到接近这个的结果：

[4.9924135, 0.00040895029, 3.4504161]

这是我们参数的一个相对接近的近似值。

对于 TensorFlow 来说，这只是冰山一角。许多问题，例如优化具有数百万参数的大型神经网络，只需几行代码就可以在 TensorFlow 中高效实现。 TensorFlow 考虑到了多个设备和线程的扩展，并支持各种平台。