Tensorflow: Custom Loss Function Not Providing Gradients
【Posted】: 2021-10-26 21:59:20
【Question】: I am trying to define a custom loss function in tensorflow that penalizes false positives and false negatives, based on the answer from this post. I had to modify the code that calculates specificity and recall because I have a multiclass classification problem, whereas the problem in that post was binary classification only. In case it matters, I am training on images stored in an ImageDataGenerator object.
The loss function does the following:

- Converts the logits in y_pred and the one-hot encoded classes in y_true into sparse numeric class vectors for each batch (e.g. [0, 2, 1, 1])
- Instantiates counters for true positives, true negatives, false positives, and false negatives (TPx, TNx, FPx, FNx, where x is 0, 1, or 2 depending on the class). The gigantic if and elif statements essentially count every cell of the confusion matrix, since a 3x3 confusion matrix is considerably more involved than a 2x2 one. It then sums the per-class metrics (TP_g, TN_g, FP_g, FN_g) to get the totals.
- Converts the summed metrics into tensorflow tensors (I stole that part from the aforementioned post)
- Computes specificity and recall, then subtracts a weighted sum from 1.0 to return the total loss for the batch
Here is the loss function I defined:
def myLossFcn(y_true, y_pred, recall_weight, spec_weight):
    #benign == 0
    #hyperplastic == 1
    #neoplastic == 2
    y_true = np.argmax(y_true, axis=1)
    y_pred = np.argmax(y_pred, axis=1)

    y_true = tensorflow.cast(y_true, tensorflow.float32)
    y_pred = tensorflow.cast(y_pred, tensorflow.float32)

    print('y_true:', y_true)
    print('y_pred:', y_pred)

    #true positives for all classes
    TP0 = 0
    TP1 = 0
    TP2 = 0
    for i in range(len(y_true)):
        if y_true[i] == 0 and y_pred[i] == 0:
            TP0 += 1  #benign true positive
        elif y_true[i] == 1 and y_pred[i] == 1:
            TP1 += 1  #hyperplastic true positive
        elif y_true[i] == 2 and y_pred[i] == 2:  #neoplastic true positive
            TP2 += 1
    TP_g = TP0 + TP1 + TP2  #num true positives total (per batch)

    #true negatives for all classes
    TN0 = 0
    TN1 = 0
    TN2 = 0
    for i in range(len(y_true)):
        if (y_true[i] == 1 and y_pred[i] == 1) or (y_true[i] == 1 and y_pred[i] == 2) or (y_true[i] == 2 and y_pred[i] == 1) or (y_true[i] == 2 and y_pred[i] == 2):
            TN0 += 1
        elif (y_true[i] == 0 and y_pred[i] == 0) or (y_true[i] == 0 and y_pred[i] == 2) or (y_true[i] == 2 and y_pred[i] == 0) or (y_true[i] == 2 and y_pred[i] == 2):
            TN1 += 1
        elif (y_true[i] == 0 and y_pred[i] == 0) or (y_true[i] == 0 and y_pred[i] == 1) or (y_true[i] == 1 and y_pred[i] == 0) or (y_true[i] == 1 and y_pred[i] == 1):
            TN2 += 1
    TN_g = TN0 + TN1 + TN2

    #false positives for all classes
    FP0 = 0
    FP1 = 0
    FP2 = 0
    for i in range(len(y_true)):
        if (y_true[i] == 0 and y_pred[i] == 1) or (y_true[i] == 0 and y_pred[i] == 2):
            FP0 += 1
        elif (y_true[i] == 1 and y_pred[i] == 0) or (y_true[i] == 1 and y_pred[i] == 2):
            FP1 += 1
        elif (y_true[i] == 0 and y_pred[i] == 2) or (y_true[i] == 1 and y_pred[i] == 2):
            FP2 += 1
    FP_g = FP0 + FP1 + FP2

    #false negatives for all classes
    FN0 = 0
    FN1 = 0
    FN2 = 0
    for i in range(len(y_true)):
        if (y_true[i] == 0 and y_pred[i] == 1) or (y_true[i] == 0 and y_pred[i] == 2):
            FN0 += 1
        elif (y_true[i] == 1 and y_pred[i] == 0) or (y_true[i] == 1 and y_pred[i] == 2):
            FN1 += 1
        elif (y_true[i] == 0 and y_pred[i] == 1) or (y_true[i] == 1 and y_pred[i] == 2):
            FN2 += 1
    FN_g = FN0 + FN1 + FN2

    #Converted as Keras Tensors
    TP_g = K.sum(K.variable(TP_g))
    TN_g = K.sum(K.variable(TN_g))
    FP_g = K.sum(K.variable(FP_g))
    FN_g = K.sum(K.variable(FN_g))

    print(TP_g)
    print(TN_g)
    print(FP_g)
    print(FN_g)

    specificity = TN_g / (TN_g + FP_g + K.epsilon())
    recall = TP_g / (TP_g + FN_g + K.epsilon())
    print('spec:', specificity)
    print('recall:', recall)

    loss = 1.0 - (recall_weight*recall + spec_weight*specificity)
    print('loss:', loss)

    return tensorflow.constant(loss)
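As an aside, the per-class counting above can be written without the long if/elif chains using `tf.math.confusion_matrix`. This is a sketch over a small hypothetical batch, not the original code; note that it is still built on hard integer class labels, so it is just as non-differentiable as the loop version:

```python
import tensorflow as tf

# Hypothetical batch of 4 samples, 3 classes (0=benign, 1=hyperplastic, 2=neoplastic)
y_true = tf.constant([0, 2, 1, 1])
y_pred = tf.constant([0, 0, 1, 2])

cm = tf.math.confusion_matrix(y_true, y_pred, num_classes=3)  # shape (3, 3), rows=true, cols=pred
cm = tf.cast(cm, tf.float32)

tp = tf.linalg.diag_part(cm)            # per-class true positives (diagonal)
fp = tf.reduce_sum(cm, axis=0) - tp     # column sum minus diagonal
fn = tf.reduce_sum(cm, axis=1) - tp     # row sum minus diagonal
tn = tf.reduce_sum(cm) - tp - fp - fn   # everything not in the class's row or column
```

Summing each of these vectors gives the same TP_g / FP_g / FN_g / TN_g totals that the four loops produce.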
Following the earlier post, I instantiated a function wrapper to pass in the weights for specificity and recall, and then started training:
def custom_loss(recall_weight, spec_weight):
    def recall_spec_loss(y_true, y_pred):
        return myLossFcn(y_true, y_pred, recall_weight, spec_weight)

    return recall_spec_loss

model = tensorflow.keras.applications.resnet50.ResNet50(weights=None,
                                                        input_shape=(100, 100, 1),
                                                        pooling='max',  #note: pooling=max (the builtin) is likely a typo for the string 'max'
                                                        classes=3)
loss = custom_loss(recall_weight=0.9, spec_weight=0.1)
model.compile(optimizer=hyperparameters['optimizer'],
              loss=loss,
              metrics=['accuracy', tensorflow.keras.metrics.FalseNegatives()],
              run_eagerly=True)
history = model.fit(train_set,
                    epochs=50,
                    callbacks=[model_checkpoint],
                    validation_data=val_set,
                    verbose=2)
When I run my code, I get an error saying:

ValueError: No gradients provided for any variable: [for brevity I will not copy+paste all the gradient names it listed]

I will also post the output I received along with the traceback for that error message:
Found 625 images belonging to 3 classes.
Found 376 images belonging to 3 classes.
Found 252 images belonging to 3 classes.
Epoch 1/50
y_true: tf.Tensor([0. 2. 1. 0.], shape=(4,), dtype=float32)
y_pred: tf.Tensor([0. 0. 0. 0.], shape=(4,), dtype=float32)
tf.Tensor(2.0, shape=(), dtype=float32)
tf.Tensor(4.0, shape=(), dtype=float32)
tf.Tensor(1.0, shape=(), dtype=float32)
tf.Tensor(1.0, shape=(), dtype=float32)
spec: tf.Tensor(0.8, shape=(), dtype=float32)
recall: tf.Tensor(0.6666667, shape=(), dtype=float32)
loss: tf.Tensor(0.32, shape=(), dtype=float32)
Traceback (most recent call last):
File "/home/d/dsussman/dsherman/endo_git_v2/justin_method.py", line 253, in <module>
verbose=2)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1178, in fit
tmp_logs = self.train_function(iterator)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 850, in train_function
return step_function(self, iterator)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 840, in step_function
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py", line 1285, in run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py", line 2833, in call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py", line 3608, in _call_for_each_replica
return fn(*args, **kwargs)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/autograph/impl/api.py", line 597, in wrapper
return func(*args, **kwargs)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 833, in run_step
outputs = model.train_step(data)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 794, in train_step
self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 530, in minimize
return self.apply_gradients(grads_and_vars, name=name)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 630, in apply_gradients
grads_and_vars = optimizer_utils.filter_empty_gradients(grads_and_vars)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/utils.py", line 76, in filter_empty_gradients
([v.name for _, v in grads_and_vars],))
ValueError: No gradients provided for any variable:
I searched online for a long time with no luck. As suggested in this post, I have made sure all my variables are tensors, and I looked at this post, but I do not really understand what its solution means:

Keep in mind that the python function you write (custom_loss) is called to generate and compile a C function. The compiled function is what is called during training. When your python custom_loss function is called, the arguments are tensor objects that don't have data attached to them. The K.eval call will fail, as will the K.shape call.

I am not even sure that second post is relevant, but it is all I could find on the internet. I am hoping the solution is as simple as having forgotten something really obvious, or changing something easy, but for the life of me I cannot figure out what is going wrong.
Any help is greatly appreciated!
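For what it's worth, the gradient problem can be reproduced in isolation: any path through argmax (whether np.argmax or tf.argmax) yields integer class indices, and TensorFlow cannot backpropagate through integer-valued ops, so every downstream computation is disconnected from the model weights. A minimal sketch:

```python
import tensorflow as tf

x = tf.Variable([0.1, 0.7, 0.2])  # stand-in for a model's softmax output

with tf.GradientTape() as tape:
    cls = tf.argmax(x)                    # integer class index: not differentiable
    loss = tf.cast(cls, tf.float32) * 2.0 # anything built on it is cut off from x

grad = tape.gradient(loss, x)
print(grad)  # None: the argmax severed the gradient chain
```

Using np.argmax inside the loss is even worse, since it leaves the TensorFlow graph entirely, but the underlying issue is the same.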
EDIT

I have updated my loss function so that all intermediate values are tensorflow tensors of dtype float32, and I receive the same error:
def myLossFcn(y_true, y_pred, recall_weight, spec_weight):
    #benign == 0
    #hyperplastic == 1
    #neoplastic == 2

    print('y_true:', y_true)
    print('y_pred:', y_pred)

    tp = tensorflow.keras.metrics.TruePositives()
    tp.update_state(y_pred, y_true)  #note: update_state expects (y_true, y_pred); the arguments here are reversed
    TP_g = tp.result()

    tn = tensorflow.metrics.TrueNegatives()
    tn.update_state(y_pred, y_true)
    TN_g = tn.result()

    fp = tensorflow.keras.metrics.FalsePositives()
    fp.update_state(y_pred, y_true)
    FP_g = fp.result()

    fn = tensorflow.keras.metrics.FalseNegatives()
    fn.update_state(y_pred, y_true)
    FN_g = fn.result()

    print(TP_g)
    print(TN_g)
    print(FP_g)
    print(FN_g)

    #Converted as Keras Tensors
    TP_g = K.sum(K.variable(TP_g))
    TN_g = K.sum(K.variable(TN_g))
    FP_g = K.sum(K.variable(FP_g))
    FN_g = K.sum(K.variable(FN_g))

    print(TP_g)
    print(TN_g)
    print(FP_g)
    print(FN_g)

    specificity = TN_g / (TN_g + FP_g + K.epsilon())
    recall = TP_g / (TP_g + FN_g + K.epsilon())
    print('spec:', specificity)
    print('recall:', recall)

    loss = 1.0 - (recall_weight*recall + spec_weight*specificity)
    print('loss:', loss)

    return tensorflow.constant(loss)  #probably not a tensorflow scalar atm
I print the metrics twice to see whether the K.sum(K.variable(**METRIC**)) calls have any effect.

Here is the output:
tf.Tensor(8.0, shape=(), dtype=float32)
tf.Tensor(4.0, shape=(), dtype=float32)
tf.Tensor(0.0, shape=(), dtype=float32)
tf.Tensor(0.0, shape=(), dtype=float32)
tf.Tensor(8.0, shape=(), dtype=float32)
spec: tf.Tensor(0.0, shape=(), dtype=float32)
recall: tf.Tensor(0.33333334, shape=(), dtype=float32)
loss: tf.Tensor(0.7, shape=(), dtype=float32)
Traceback (most recent call last):
File "/home/d/dsussman/dsherman/endo_git_v2/justin_method.py", line 282, in <module>
verbose=2)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1178, in fit
tmp_logs = self.train_function(iterator)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 850, in train_function
return step_function(self, iterator)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 840, in step_function
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py", line 1285, in run
return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py", line 2833, in call_for_each_replica
return self._call_for_each_replica(fn, args, kwargs)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/distribute/distribute_lib.py", line 3608, in _call_for_each_replica
return fn(*args, **kwargs)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/autograph/impl/api.py", line 597, in wrapper
return func(*args, **kwargs)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 833, in run_step
outputs = model.train_step(data)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 794, in train_step
self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 530, in minimize
return self.apply_gradients(grads_and_vars, name=name)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py", line 630, in apply_gradients
grads_and_vars = optimizer_utils.filter_empty_gradients(grads_and_vars)
File "/home/d/dsussman/dsherman/.conda/envs/myNewEnv/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/utils.py", line 76, in filter_empty_gradients
([v.name for _, v in grads_and_vars],))
ValueError: No gradients provided for any variable:
【Comments】:
***.com/questions/61894755/…
The problem is in the if and for statements.
There are multiple problems here. First, the loss must be implemented with tensorflow, not numpy. Moreover, computing TPs, FPs, TNs, and so on is not differentiable; that is a mathematical issue.
Thanks for the explanation, I will try to update the post.
【Answer 1】: This happens because the loss function you have defined does not support convergence toward an answer, and its derivative has discontinuities, namely where you convert errors into zeros and ones. Gradient descent needs to converge gradually toward an answer, and a discrete loss function does not allow that.
【Discussion】:
How can I fix this? Should I only use keras.backend functions, or something else?
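A sketch of the kind of fix the answer points toward: keep the loss in terms of the predicted probabilities rather than hard argmax labels, so the confusion-matrix "counts" become soft, continuous quantities that carry gradients. This is an illustration under the assumption of one-hot labels and softmax outputs, not the original poster's code or a definitive implementation:

```python
import tensorflow as tf

def soft_recall_spec_loss(recall_weight=0.9, spec_weight=0.1):
    """Differentiable surrogate for the recall/specificity loss.

    TP/TN/FP/FN are computed as soft counts from probabilities,
    so gradients flow back to the model weights.
    """
    def loss_fn(y_true, y_pred):
        # y_true: one-hot (batch, classes); y_pred: probabilities (batch, classes)
        y_true = tf.cast(y_true, tf.float32)
        tp = tf.reduce_sum(y_true * y_pred)                  # soft true positives
        fn = tf.reduce_sum(y_true * (1.0 - y_pred))          # soft false negatives
        fp = tf.reduce_sum((1.0 - y_true) * y_pred)          # soft false positives
        tn = tf.reduce_sum((1.0 - y_true) * (1.0 - y_pred))  # soft true negatives
        eps = tf.keras.backend.epsilon()
        recall = tp / (tp + fn + eps)
        specificity = tn / (tn + fp + eps)
        return 1.0 - (recall_weight * recall + spec_weight * specificity)
    return loss_fn
```

Passing the returned loss_fn to model.compile(loss=...) should no longer raise "No gradients provided", since every operation in the loss is differentiable with respect to y_pred.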