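TensorFlow – restore if present
Is it possible to restore variables only if they exist in the checkpoint? What is the most idiomatic way of doing so?
For example, consider the following minimal example: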
import tensorflow as tf
import glob
import sys
import os
with tf.variable_scope('volatile'):
    x = tf.get_variable('x', initializer=0)
with tf.variable_scope('persistent'):
    y = tf.get_variable('y', initializer=0)
    add1 = tf.assign_add(y, 1)
saver = tf.train.Saver(tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, 'persistent'))
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
tf.get_default_graph().finalize()
print('save file', sys.argv[1])
if glob.glob(sys.argv[1] + '*'):
    saver.restore(sess, sys.argv[1])
print(sess.run(y))
sess.run(add1)
print(sess.run(y))
saver.save(sess, sys.argv[1])
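When run twice with the same argument, the program first prints 0\n1 and then 1\n2, as expected. Now suppose you update your code to add new functionality by adding z = tf.get_variable('z', initializer=0) after add1 in the persistent scope. Running this again while the old save file exists breaks with the following error: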
NotFoundError (see above for traceback): Key persistent/z not found in checkpoint
	 [[Node: save/RestoreV2_1 = RestoreV2[dtypes=[DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_1/tensor_names, save/RestoreV2_1/shape_and_slices)]]
	 [[Node: save/Assign_1/_18 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_12_save/Assign_1", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]()]]
=========== The solution is as follows:
You can use the following function to restore (taken from here):
<code class="prettyprint-override">def optimistic_restore(session, save_file, graph=tf.get_default_graph()):
    # Read the names and shapes of the variables stored in the checkpoint.
    reader = tf.train.NewCheckpointReader(save_file)
    saved_shapes = reader.get_variable_to_shape_map()
    # Pair each graph variable's tensor name ('scope/v:0') with its
    # checkpoint key ('scope/v'), keeping only those found in the checkpoint.
    var_names = sorted([(var.name, var.name.split(':')[0])
                        for var in tf.global_variables()
                        if var.name.split(':')[0] in saved_shapes])
    restore_vars = []
    for var_name, saved_var_name in var_names:
        curr_var = graph.get_tensor_by_name(var_name)
        var_shape = curr_var.get_shape().as_list()
        # Skip variables whose shape differs from the saved one.
        if var_shape == saved_shapes[saved_var_name]:
            restore_vars.append(curr_var)
    # Restore only the variables that passed both checks.
    opt_saver = tf.train.Saver(restore_vars)
    opt_saver.restore(session, save_file)
</code>
I usually run sess.run(tf.global_variables_initializer()) to make sure all variables are initialized, and then I run optimistic_restore(sess, ...) to restore the variables that can be restored.
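To make the selection rule in optimistic_restore easier to see, here is a minimal framework-free sketch of just the filtering step. It is not part of the answer's code: the helper name select_restorable and the two dictionaries are hypothetical stand-ins for the graph's variables and for what reader.get_variable_to_shape_map() would return. Scalars are represented by an empty shape list.

```python
def select_restorable(current_shapes, saved_shapes):
    """Return the sorted names found in both maps with identical shapes.

    Hypothetical helper: both arguments map a variable name (without the
    ':0' suffix) to its shape as a list of ints.
    """
    restorable = []
    for name in sorted(current_shapes):
        # Keep a variable only if the checkpoint has it with the same shape.
        if name in saved_shapes and list(current_shapes[name]) == list(saved_shapes[name]):
            restorable.append(name)
    return restorable


# Illustrative data mirroring the question: the old checkpoint holds only
# persistent/y, while the updated graph also defines persistent/z.
saved = {'persistent/y': []}
current = {'volatile/x': [], 'persistent/y': [], 'persistent/z': []}

print(select_restorable(current, saved))  # prints ['persistent/y']
```

Under these assumptions, persistent/z is simply left out of the restore list, so it would keep the value given by the initializer instead of triggering NotFoundError.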