Error using shap with SimpleRNN sequential model

Posted: 2021-01-31 17:00:15

Question:

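In the code below, I import a saved sparse numpy matrix (created with Python), densify it, and add masking, batch normalization, and a dense output layer to a many-to-one SimpleRNN. The Keras sequential model works fine, but I cannot use shap with it. This runs in Jupyter Lab under WinPython 3830 on a Windows 10 desktop. The X matrix has shape (4754, 500, 64): 4754 examples with 500 time steps and 64 variables. I wrote a function that simulates the data so the code can be tested. The simulated data produces the same error.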

from sklearn.model_selection import train_test_split
import tensorflow as tf
from tensorflow.keras.models import Sequential
import tensorflow.keras.backend as Kb
from tensorflow.keras import layers
from tensorflow.keras.layers import BatchNormalization
from tensorflow import keras as K
import numpy as np
import shap
import random

def create_x():
    dims = [10,500,64]
    data = []
    y = []
    for i in range(dims[0]):
        data.append([])

        for j in range(dims[1]):
            data[i].append([])
            for k in range(dims[2]):
                isnp = random.random()
                if isnp > .2:
                    data[i][j].append(np.nan)
                else:
                    data[i][j].append(random.random())
        # NOTE: isnp holds the last value drawn in the inner loops, so the label is effectively random
        if isnp > .5:
            y.append(0)
        else:
            y.append(1)
    return np.asarray(data), np.asarray(y)

def first_valid(arr, axis, invalid_val=0):
    # For each column, return the index along `axis` of the first non-NaN value
    # (or invalid_val when every value along that axis is NaN)
    mask = np.invert(np.isnan(arr))
    return np.where(mask.any(axis=axis), mask.argmax(axis=axis), invalid_val)

def densify_np(X):
    X_copy = np.empty_like (X)
    X_copy[:] = X
    #loop over the first index
    for i in range(len(X_copy)):

        old_row = []
        #get the 2nd index of the first valid value for each 3rd index
        indices = first_valid(X_copy[i,:,:],axis=0, invalid_val=0)
        for j in range(len(indices)):
            if np.isnan(X_copy[i,indices[j],j]):
                old_row.append(0)
            else:
                old_row.append(X_copy[i,indices[j],j])
        X_copy[i,0,:]= old_row
        for k in range(1,len(X_copy[i,:])):
            for l in range(len(X_copy[i,k,:])):
                if np.isnan(X_copy[i,k,l]):
                    X_copy[i,k,l] = X_copy[i,k-1,l]
           
    return(X_copy)
#this is what I do in the actual code
#X = np.load('C:/WinPython/WPy64-3830/data/X.npy')
#Y = np.load('C:/WinPython/WPy64-3830/scripts/Y.npy')

#simulated junk data
X, Y = create_x()

#create a dense matrix from the sparse one.
X = densify_np(X)

seed = 7
np.random.seed(seed)
array_size = 64
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.33, random_state=seed)
batch = 64
model = Sequential()


model.add(layers.Input(shape=(500,array_size)))
# The Input layer above already fixes the shape, so input_shape is not repeated here
model.add(layers.Masking(mask_value=0.))
model.add(BatchNormalization())
model.add(layers.SimpleRNN(1, activation=None, dropout = 0, recurrent_dropout=.2))
model.add(layers.Dense(1, activation = 'sigmoid'))
opt = K.optimizers.Adam(learning_rate=.001)

model.compile(loss='binary_crossentropy', optimizer=opt)
model.fit(X_train, y_train.astype(int), validation_data=(X_test,y_test.astype(int)), epochs=25, batch_size=batch)

explainer = shap.DeepExplainer(model, X_test)
shap_values = explainer.shap_values(X_train)

Running the last line to create shap_values produces the following error.

StagingError                              Traceback (most recent call last)
<ipython-input-6-f789203da9c8> in <module>
      1 import shap
      2 explainer = shap.DeepExplainer(model, X_test)
----> 3 shap_values = explainer.shap_values(X_train)
      4 print('done')

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\shap\explainers\deep\__init__.py in shap_values(self, X, ranked_outputs, output_rank_order, check_additivity)
    117         were chosen as "top".
    118         """
--> 119         return self.explainer.shap_values(X, ranked_outputs, output_rank_order, check_additivity=check_additivity)

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\shap\explainers\deep\deep_tf.py in shap_values(self, X, ranked_outputs, output_rank_order, check_additivity)
    302                 # run attribution computation graph
    303                 feature_ind = model_output_ranks[j,i]
--> 304                 sample_phis = self.run(self.phi_symbolic(feature_ind), self.model_inputs, joint_input)
    305 
    306                 # assign the attributions to the right part of the output arrays

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\shap\explainers\deep\deep_tf.py in run(self, out, model_inputs, X)
    359 
    360                 return final_out
--> 361             return self.execute_with_overridden_gradients(anon)
    362 
    363     def custom_grad(self, op, *grads):

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\shap\explainers\deep\deep_tf.py in execute_with_overridden_gradients(self, f)
    395         # define the computation graph for the attribution values using a custom gradient-like computation
    396         try:
--> 397             out = f()
    398         finally:
    399             # reinstate the backpropagatable check

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\shap\explainers\deep\deep_tf.py in anon()
    355                     v = tf.constant(data, dtype=self.model_inputs[i].dtype)
    356                     inputs.append(v)
--> 357                 final_out = out(inputs)
    358                 tf_execute.record_gradient = tf_backprop._record_gradient
    359 

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\eager\def_function.py in __call__(self, *args, **kwds)
    778       else:
    779         compiler = "nonXla"
--> 780         result = self._call(*args, **kwds)
    781 
    782       new_tracing_count = self._get_tracing_count()

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\eager\def_function.py in _call(self, *args, **kwds)
    821       # This is the first call of __call__, so we have to initialize.
    822       initializers = []
--> 823       self._initialize(args, kwds, add_initializers_to=initializers)
    824     finally:
    825       # At this point we know that the initialization is complete (or less

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\eager\def_function.py in _initialize(self, args, kwds, add_initializers_to)
    694     self._graph_deleter = FunctionDeleter(self._lifted_initializer_graph)
    695     self._concrete_stateful_fn = (
--> 696         self._stateful_fn._get_concrete_function_internal_garbage_collected(  # pylint: disable=protected-access
    697             *args, **kwds))
    698 

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\eager\function.py in _get_concrete_function_internal_garbage_collected(self, *args, **kwargs)
   2853       args, kwargs = None, None
   2854     with self._lock:
-> 2855       graph_function, _, _ = self._maybe_define_function(args, kwargs)
   2856     return graph_function
   2857 

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\eager\function.py in _maybe_define_function(self, args, kwargs)
   3211 
   3212       self._function_cache.missed.add(call_context_key)
-> 3213       graph_function = self._create_graph_function(args, kwargs)
   3214       self._function_cache.primary[cache_key] = graph_function
   3215       return graph_function, args, kwargs

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\eager\function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
   3063     arg_names = base_arg_names + missing_arg_names
   3064     graph_function = ConcreteFunction(
-> 3065         func_graph_module.func_graph_from_py_func(
   3066             self._name,
   3067             self._python_function,

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\framework\func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
    984         _, original_func = tf_decorator.unwrap(python_func)
    985 
--> 986       func_outputs = python_func(*func_args, **func_kwargs)
    987 
    988       # invariant: `func_outputs` contains only Tensors, CompositeTensors,

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\eager\def_function.py in wrapped_fn(*args, **kwds)
    598         # __wrapped__ allows AutoGraph to swap in a converted function. We give
    599         # the function a weak reference to itself to avoid a reference cycle.
--> 600         return weak_wrapped_fn().__wrapped__(*args, **kwds)
    601     weak_wrapped_fn = weakref.ref(wrapped_fn)
    602 

C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\framework\func_graph.py in wrapper(*args, **kwargs)
    971           except Exception as e:  # pylint:disable=broad-except
    972             if hasattr(e, "ag_error_metadata"):
--> 973               raise e.ag_error_metadata.to_exception(e)
    974             else:
    975               raise

StagingError: in user code:

    C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\shap\explainers\deep\deep_tf.py:244 grad_graph  *
        x_grad = tape.gradient(out, shap_rAnD)
    C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\eager\backprop.py:1067 gradient  **
        flat_grad = imperative_grad.imperative_grad(
    C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\eager\imperative_grad.py:71 imperative_grad
        return pywrap_tfe.TFE_Py_TapeGradient(
    C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\eager\backprop.py:151 _gradient_function
        grad_fn = ops._gradient_registry.lookup(op_name)  # pylint: disable=protected-access
    C:\WinPython\WPy64-3830\python-3.8.3.amd64\lib\site-packages\tensorflow\python\framework\registry.py:96 lookup
        raise LookupError(

    LookupError: gradient registry has no entry for: shap_TensorListStack
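
As a side note on the preprocessing, the fill performed by densify_np in the question can also be written without Python loops. The following is only a sketch under stated assumptions: the function name densify_vectorized is hypothetical, the input is assumed to be a float array laid out as (samples, timesteps, features), and the behavior is meant to mirror the original (leading NaNs take the first valid value of their feature, later NaNs take the most recent valid value, and a feature that is entirely NaN becomes 0). It does not address the shap error itself.

```python
import numpy as np

def densify_vectorized(X):
    """Loop-free variant of densify_np (hypothetical name, same intended result)."""
    X = np.asarray(X, dtype=float)
    n_steps = X.shape[1]
    valid = ~np.isnan(X)

    # For every position, index of the most recent valid timestep (0 if none yet).
    idx = np.where(valid, np.arange(n_steps)[None, :, None], 0)
    idx = np.maximum.accumulate(idx, axis=1)
    filled = np.take_along_axis(X, idx, axis=1)

    # Leading NaNs are still NaN here: use the first valid value of the feature,
    # or 0 when the whole feature is NaN for that sample.
    first = valid.argmax(axis=1)                          # shape (samples, features)
    first_val = np.take_along_axis(X, first[:, None, :], axis=1)
    first_val = np.where(np.isnan(first_val), 0.0, first_val)
    return np.where(np.isnan(filled), first_val, filled)
```

On the simulated (10, 500, 64) array this should give the same result as the nested loops, and it is likely to matter more on the full (4754, 500, 64) matrix, where the per-element loops are slow.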

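Comments:

This seems to be a common problem; see e.g. #1110 or #1490.
I have seen those issues. My code produces the same error with masking and batch normalization removed.
What are your shap, tensorflow, and keras versions?
shap 0.35.0, tensorflow 2.3.0, Keras 2.3.1.

Answer 1: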

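The owner of the shap repo said:

The fundamental issue here is that DeepExplainer does not yet support TF 2.0.

That was on December 11, 2019. Is that still the case? Try it with TensorFlow 1.15 and see whether it works.

Another issue on the shap repo says this about it (June 2, 2020):

OK, thanks. I hadn't seen Lundberg's post. I'll stick with the TF 1.15 workaround until a new version of SHAP is released.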
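
Since the answer hinges on which versions are installed, it can help to report them before deciding whether to try the TF 1.15 workaround. The snippet below is a minimal sketch using only the standard library; the helper names installed_versions and major_version are hypothetical, and the package names are assumed to be the usual PyPI distribution names.

```python
from importlib import metadata

def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

def major_version(version):
    """Leading integer of a version string such as '2.3.0'."""
    return int(version.split(".")[0])

found = installed_versions(["shap", "tensorflow", "keras"])
for name, version in found.items():
    print(name, version if version is not None else "not installed")

tf_version = found["tensorflow"]
if tf_version is not None and major_version(tf_version) >= 2:
    # Per the quoted comment, DeepExplainer did not support TF 2.0 at that time.
    print("TensorFlow 2.x detected; the TF 1.15 workaround may be worth trying.")
```

Whether a given shap release handles SimpleRNN under TF 2.x is not something this check can tell you; it only makes the environment explicit when reporting the problem.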
