Save and load a Keras model with a custom layer that has additional attributes
Posted: 2020-07-25 01:45:06
[Question] I created a custom layer, DenseWithMask, which is a subclass of Dense. It also has some extra attributes, including one I call edge_mask. So this code works fine:
new_layer = DenseWithMask(10)
print(new_layer.edge_mask)
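However, if I build a model with DenseWithMask, then save it and load it again, the layers no longer have the edge_mask attribute. (All the other attributes I added are lost as well.)
Here is an example (the code for DenseWithMask is at the end of this post):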
import tensorflow as tf
from DenseWithMask import DenseWithMask
# make a model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    DenseWithMask(128, activation='relu'),
    DenseWithMask(10)])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
# try accessing edge_mask
print('edge_mask:', model.layers[1].edge_mask)
# save model
model.save('model_with_custom_layers')
# load model
model2 = tf.keras.models.load_model('model_with_custom_layers',
                                    custom_objects={"DenseWithMask": DenseWithMask})
# try accessing attributes of custom layer
model2.layers[1].edge_mask
This returns:
edge_mask: tf.Tensor(
[[ True True True ... True True True]
[ True True True ... True True True]
[ True True True ... True True True]
...
[ True True True ... True True True]
[ True True True ... True True True]
[ True True True ... True True True]], shape=(784, 128), dtype=bool)
INFO:tensorflow:Assets written to: model_with_custom_layers\assets
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-6-32e4901279b3> in <module>
23
24 # try accessing attributes of custom layer
---> 25 model2.layers[1].edge_mask
AttributeError: 'DenseWithMask' object has no attribute 'edge_mask'
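How can I get this to work? The line that fixed the error in this thread is already in my code.
Below is the code for DenseWithMask. I have included the whole class, but I expect that only __init__ and get_config are relevant to my problem.
(In get_config, I convert the attributes that are tensorflow constant arrays to numpy arrays, because save cannot write tensorflow constant arrays to a JSON file. I expect this may cause trouble, but my current problem seems to be independent of it: I also tried not converting the tensorflow constant arrays in get_config and building the model via model2=from_config(model.get_config()), and that did not produce a model with edge_mask as an attribute either.)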
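To make the JSON concern above concrete, here is a minimal sketch using only the standard json module and NumPy. The mask shape and values are made up for illustration, and nothing here touches Keras itself: it only shows that a raw array is rejected by the JSON encoder while a nested list survives a round trip.

```python
import json
import numpy as np

# Stand-in for a layer's boolean edge mask (shape chosen only for illustration).
edge_mask = np.ones((3, 2), dtype=bool)
edge_mask[0, 1] = False

# A raw NumPy array is rejected by the standard JSON encoder.
try:
    json.dumps({"edge_mask": edge_mask})
    raw_array_ok = True
except TypeError:
    raw_array_ok = False

# Converting to nested Python lists makes the value JSON-friendly.
encoded = json.dumps({"edge_mask": edge_mask.tolist()})

# Decoding and wrapping in np.array recovers the original mask.
restored = np.array(json.loads(encoded)["edge_mask"], dtype=bool)
```

Under this assumption, storing masks as nested lists in the config and converting them back to arrays in __init__ keeps the config serializable regardless of how the saver encodes arrays.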
################################################################################
# Define a keras layer class that allows for permanent pruning
################################################################################
# imports copied from keras.layers.core
from tensorflow.python.eager import context
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import tensor_shape
from tensorflow.python.keras import activations
from tensorflow.python.keras import backend as K
from tensorflow.python.keras import constraints
from tensorflow.python.keras import initializers
from tensorflow.python.keras import regularizers
from tensorflow.python.keras.engine.base_layer import Layer
from tensorflow.python.keras.engine.input_spec import InputSpec
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import gen_math_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import nn
from tensorflow.python.ops import sparse_ops
from tensorflow.python.ops import standard_ops
# other imports
import numpy as np
import tensorflow as tf
#from tensorflow.keras.layers import *
################################################################################
# The following class is a copy of Dense in keras. I marked all the lines that
# I added or changed
class DenseWithMask(Layer):
    """Dense layer but with optional permanent masking of units or edges."""
    def __init__(self,
                 units,
                 unit_mask=None,  # NEW
                 edge_mask=None,  # NEW
                 activation=None,
                 use_bias=True,
                 kernel_initializer='glorot_uniform',
                 bias_initializer='zeros',
                 kernel_regularizer=None,
                 bias_regularizer=None,
                 activity_regularizer=None,
                 kernel_constraint=None,
                 bias_constraint=None,
                 **kwargs):
        if 'input_shape' not in kwargs and 'input_dim' in kwargs:
            kwargs['input_shape'] = (kwargs.pop('input_dim'),)
        super(DenseWithMask, self).__init__(  # changed 'Dense' to 'DenseWithMask'
            activity_regularizer=regularizers.get(activity_regularizer))  #, **kwargs)
        self.units = int(units) if not isinstance(units, int) else units
        # NEW: add unit_mask to class attributes
        self.unit_mask = unit_mask
        # NEW: add edge_mask to class attributes
        self.edge_mask = edge_mask
        # NEW: add unit_mask_indices to class attributes
        self.unit_mask_indices = None
        # NEW: add edge_mask_indices to class attributes
        self.edge_mask_indices = None
        self.activation = activations.get(activation)
        self.use_bias = use_bias
        self.kernel_initializer = initializers.get(kernel_initializer)
        self.bias_initializer = initializers.get(bias_initializer)
        self.kernel_regularizer = regularizers.get(kernel_regularizer)
        self.bias_regularizer = regularizers.get(bias_regularizer)
        self.kernel_constraint = constraints.get(kernel_constraint)
        self.bias_constraint = constraints.get(bias_constraint)
        self.supports_masking = True
        self.input_spec = InputSpec(min_ndim=2)
        super(DenseWithMask, self).__init__(**kwargs)
    def get_config(self):
        config = {
            'class_name': 'DenseWithMask',
            'units': self.units,
            'unit_mask': self.unit_mask.numpy(),  # NEW: added unit_mask to config
            'unit_mask_indices': self.unit_mask_indices.numpy(),  # NEW
            'edge_mask': self.edge_mask.numpy(),  # NEW: added edge_mask to config
            'edge_mask_indices': self.edge_mask_indices.numpy(),  # NEW
            'activation': activations.serialize(self.activation),
            'use_bias': self.use_bias,
            'kernel_initializer': initializers.serialize(self.kernel_initializer),
            'bias_initializer': initializers.serialize(self.bias_initializer),
            'kernel_regularizer': regularizers.serialize(self.kernel_regularizer),
            'bias_regularizer': regularizers.serialize(self.bias_regularizer),
            'activity_regularizer':
                regularizers.serialize(self.activity_regularizer),
            'kernel_constraint': constraints.serialize(self.kernel_constraint),
            'bias_constraint': constraints.serialize(self.bias_constraint)
        }
        base_config = super(DenseWithMask, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
    def build(self, input_shape):
        dtype = dtypes.as_dtype(self.dtype or K.floatx())
        if not (dtype.is_floating or dtype.is_complex):
            raise TypeError('Unable to build `Dense` layer with non-floating point '
                            'dtype %s' % (dtype,))
        input_shape = tensor_shape.TensorShape(input_shape)
        if tensor_shape.dimension_value(input_shape[-1]) is None:
            raise ValueError('The last dimension of the inputs to `Dense` '
                             'should be defined. Found `None`.')
        last_dim = tensor_shape.dimension_value(input_shape[-1])
        self.input_spec = InputSpec(min_ndim=2, axes={-1: last_dim})
        # NEW: update masks in build
        if self.unit_mask is None:
            self.unit_mask = np.ones(self.units, dtype=bool)
        else:
            # check if previously set mask matches number of units
            if not len(self.unit_mask) == self.units:
                raise ValueError('Length of unit_mask must be equal to number of units.')
        if self.edge_mask is None:
            self.edge_mask = np.ones((last_dim, self.units), dtype=bool)
        else:
            # check if previously set mask matches input dimensions
            if not self.edge_mask.shape == (last_dim, self.units):
                raise ValueError('Dimensions of edge_mask must be equal to (last_input_dim, units).')
        # NEW: incorporate unit_mask info into edge_mask
        self.edge_mask = self.edge_mask * self.unit_mask
        # NEW: incorporate edge_mask info into unit_mask
        self.unit_mask = self.unit_mask * np.any(self.edge_mask, axis=0)
        # NEW: update mask indices
        self.edge_mask_indices = np.stack(self.edge_mask.nonzero()).T
        self.edge_mask_indices = np.array(self.edge_mask_indices, dtype='int64')
        self.unit_mask_indices = self.unit_mask.nonzero()[0]
        # need to convert indices for bias to 2d array for sparse tensor
        self.unit_mask_indices = np.stack([np.zeros(len(self.unit_mask_indices)),
                                           self.unit_mask_indices]).T
        self.unit_mask_indices = np.array(self.unit_mask_indices, dtype='int64')
        # NEW: turn all new attributes into tensorflow constants
        self.unit_mask = tf.constant(self.unit_mask)
        self.edge_mask = tf.constant(self.edge_mask)
        self.unit_mask_indices = tf.constant(self.unit_mask_indices)
        self.edge_mask_indices = tf.constant(self.edge_mask_indices)
        self.kernel = self.add_weight(
            'kernel',
            # NEW: trainable kernel weights may be fewer than before;
            # they are now stored as 1D array,
            shape=[int(np.sum(self.edge_mask))],
            initializer=self.kernel_initializer,
            regularizer=self.kernel_regularizer,
            constraint=self.kernel_constraint,
            dtype=self.dtype,
            trainable=True)
        if self.use_bias:
            self.bias = self.add_weight(
                'bias',
                # NEW: trainable biases may be fewer than number of units
                shape=[int(np.sum(self.unit_mask))],
                initializer=self.bias_initializer,
                regularizer=self.bias_regularizer,
                constraint=self.bias_constraint,
                dtype=self.dtype,
                trainable=True)
        else:
            self.bias = None
        self.built = True
    def rebuild(self, edge_mask=None, unit_mask=None):
        # NEW: This is a new function for rebuilding a DenseWithMask layer with a
        # new edge mask and/or unit mask
        # if none are given, get default values for masks from layer
        if edge_mask is None:
            edge_mask = self.edge_mask.numpy()
        if unit_mask is None:
            unit_mask = self.unit_mask.numpy()
        # incorporate unit_mask info into edge_mask
        edge_mask = edge_mask * unit_mask
        # incorporate edge_mask info into unit_mask
        unit_mask = unit_mask * np.any(edge_mask, axis=0)
        # NOW: get new arrays of trainable weights for layer
        # The stuff below contains slow, redundant lines but
        # those might become handy when adding default values for
        # new edges and nodes(?)
        # create old kernel
        kernel_old = np.zeros_like(self.edge_mask, dtype=float)
        kernel_old[self.edge_mask.numpy()] = self.kernel.numpy()
        # create new kernel
        kernel_new = np.zeros_like(edge_mask, dtype=float)
        for i in range(len(kernel_new)):
            for j in range(len(kernel_new[0])):
                if edge_mask[i, j]:
                    kernel_new[i, j] = kernel_old[i, j]
        # create initializer for new kernel
        vals = list(kernel_new[edge_mask])
        initK = tf.compat.v1.keras.initializers.Constant(value=vals,
                                                         verify_shape=False)
        # save new edge mask
        self.edge_mask = edge_mask
        # build new kernel
        self.kernel = self.add_weight(
            'kernel',
            shape=[int(np.sum(self.edge_mask))],
            initializer=initK,
            regularizer=self.kernel_regularizer,
            constraint=self.kernel_constraint,
            dtype=self.dtype,
            trainable=True)
        # create old bias list
        bias_old = np.zeros_like(self.unit_mask, dtype=float)
        bias_old[self.unit_mask.numpy()] = self.bias.numpy()
        # create new bias list
        bias_new = np.zeros_like(unit_mask, dtype=float)
        for i in range(len(bias_new)):
            if unit_mask[i]:
                bias_new[i] = bias_old[i]
        # create initializer for new biases
        vals = list(bias_new[unit_mask])
        initB = tf.compat.v1.keras.initializers.Constant(value=vals,
                                                         verify_shape=False)
        # save new unit mask
        self.unit_mask = unit_mask
        # build new biases
        self.bias = self.add_weight(
            'bias',
            shape=[int(np.sum(self.unit_mask))],
            initializer=initB,
            regularizer=self.bias_regularizer,
            constraint=self.bias_constraint,
            dtype=self.dtype,
            trainable=True)
        # update mask indices
        self.edge_mask_indices = np.stack(self.edge_mask.nonzero()).T
        self.edge_mask_indices = np.array(self.edge_mask_indices, dtype='int64')
        self.unit_mask_indices = self.unit_mask.nonzero()[0]
        # need to convert indices for bias to 2d array for sparse tensor
        self.unit_mask_indices = np.stack([np.zeros(len(self.unit_mask_indices)),
                                           self.unit_mask_indices]).T
        self.unit_mask_indices = np.array(self.unit_mask_indices, dtype='int64')
        # turn all new attributes into tensorflow constants
        self.unit_mask = tf.constant(self.unit_mask)
        self.edge_mask = tf.constant(self.edge_mask)
        self.unit_mask_indices = tf.constant(self.unit_mask_indices)
        self.edge_mask_indices = tf.constant(self.edge_mask_indices)
    def call(self, inputs):
        # NEW: create a kernel (2D numpy array) from 1D list of trainable kernel weights
        kernel = tf.SparseTensor(indices=self.edge_mask_indices,
                                 values=self.kernel,
                                 dense_shape=self.edge_mask.shape)
        kernel = tf.sparse.to_dense(kernel)
        # NEW: create a bias vector (1D numpy array) from 1D list of trainable biases
        bias = tf.SparseTensor(indices=self.unit_mask_indices, values=self.bias,
                               dense_shape=[1, self.unit_mask.shape[0]])
        bias = tf.squeeze(tf.sparse.to_dense(bias))
        rank = inputs.shape.rank
        if rank is not None and rank > 2:
            # Broadcasting is required for the inputs.
            outputs = standard_ops.tensordot(inputs, kernel, [[rank - 1], [0]])  # NEW: self.kernel -> kernel
            # Reshape the output back to the original ndim of the input.
            if not context.executing_eagerly():
                shape = inputs.shape.as_list()
                output_shape = shape[:-1] + [self.units]
                outputs.set_shape(output_shape)
        else:
            inputs = math_ops.cast(inputs, self._compute_dtype)
            if K.is_sparse(inputs):
                outputs = sparse_ops.sparse_tensor_dense_matmul(inputs, kernel)  # NEW: self.kernel -> kernel
            else:
                outputs = gen_math_ops.mat_mul(inputs, kernel)  # NEW: self.kernel -> kernel
        if self.use_bias:
            outputs = nn.bias_add(outputs, bias)  # NEW: self.bias -> bias
        if self.activation is not None:
            return self.activation(outputs)
        return outputs
    def compute_output_shape(self, input_shape):
        input_shape = tensor_shape.TensorShape(input_shape)
        input_shape = input_shape.with_rank_at_least(2)
        if tensor_shape.dimension_value(input_shape[-1]) is None:
            raise ValueError(
                'The innermost dimension of input_shape must be defined, but saw: %s'
                % input_shape)
        return input_shape[:-1].concatenate(self.units)
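The masking scheme used by build and call above can be illustrated with plain NumPy. This is only a sketch with made-up masks and weights, not the layer itself: it reconciles the two masks, derives the (row, column) indices of the surviving edges, and scatters a 1D weight vector back into a dense kernel, which is roughly what the tf.SparseTensor plus tf.sparse.to_dense pair does inside call.

```python
import numpy as np

last_dim, units = 3, 2

# Illustrative masks: unit 1 is pruned, and one edge into unit 0 is pruned.
unit_mask = np.array([True, False])
edge_mask = np.array([[True, True],
                      [False, True],
                      [True, True]])

# Make the two masks consistent, as build() does.
edge_mask = edge_mask * unit_mask                   # drop edges into pruned units
unit_mask = unit_mask * np.any(edge_mask, axis=0)   # drop units with no edges left

# Coordinates of surviving edges, one (row, col) pair per row, in row-major order.
edge_mask_indices = np.stack(edge_mask.nonzero()).T.astype("int64")

# The trainable kernel holds one value per surviving edge.
kernel_1d = np.arange(1.0, 1.0 + edge_mask.sum())

# Scatter the 1D values back into a dense (last_dim, units) matrix.
dense_kernel = np.zeros((last_dim, units))
dense_kernel[edge_mask_indices[:, 0], edge_mask_indices[:, 1]] = kernel_1d
```

Because nonzero() returns indices in row-major order, the same ordering applies to boolean indexing such as kernel_old[edge_mask], so the 1D weights and the index list stay aligned.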
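[Comments]
Hi, this is not really a solution to your problem, so I am leaving it in the comments: people recently found that the new tensorflow-keras bindings after tf v2 do not update the weights of any custom layer such as yours, so you may want to switch to pytorch. Just a heads-up.
Thanks @aak! Do you have a link to where people are discussing this? My weights seem to be updating just fine.
[Answer 1] I could not reproduce your error with the code you provided. I did get several other errors, which can be fixed by doing the following:
- class_name / unit_mask_indices / edge_mask_indices should not be in the dictionary returned by get_config, because __init__ has no parameters with those names
- unit_mask / edge_mask should be converted to numpy arrays in __init__ if they are not None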
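The first fix above follows from how a config is replayed: by default a Keras layer is recreated by passing the config dictionary to the constructor as keyword arguments. The sketch below mimics that contract with a hypothetical TinyLayer class that is not a Keras class; it only illustrates why a key without a matching __init__ parameter breaks reconstruction.

```python
class TinyLayer:
    """Hypothetical stand-in for a Keras layer; not a real Keras class."""

    def __init__(self, units, edge_mask=None):
        self.units = units
        self.edge_mask = edge_mask
        # Derived later (in a real layer's build()); not a constructor argument.
        self.edge_mask_indices = None

    def get_config(self):
        # Only constructor arguments belong here.
        return {"units": self.units, "edge_mask": self.edge_mask}

    @classmethod
    def from_config(cls, config):
        # Same idea as the default Layer.from_config: cls(**config).
        return cls(**config)


layer = TinyLayer(4, edge_mask=[[True, False]])
clone = TinyLayer.from_config(layer.get_config())

# A config carrying a key that __init__ does not accept cannot be replayed.
bad_config = dict(layer.get_config(), edge_mask_indices=[[0, 0]])
try:
    TinyLayer.from_config(bad_config)
    bad_config_accepted = True
except TypeError:
    bad_config_accepted = False
```

Derived values such as the index arrays are recomputed in build, so leaving them out of the config loses nothing.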
Here is working code (based on the code you provided):
import tensorflow as tf
################################################################################
# Define a keras layer class that allows for permanent pruning
################################################################################
# imports copied from keras.layers.core
from tensorflow.python.eager import context
from tensorflow.python.framework import dtypes
from tensorflow.python.framework import tensor_shape
from tensorflow.python.keras import activations
from tensorflow.python.keras import backend as K
from tensorflow.python.keras import constraints
from tensorflow.python.keras import initializers
from tensorflow.python.keras import regularizers
from tensorflow.python.keras.engine.base_layer import Layer
from tensorflow.python.keras.engine.input_spec import InputSpec
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import gen_math_ops
from tensorflow.python.ops import math_ops
from tensorflow.python.ops import nn
from tensorflow.python.ops import sparse_ops
from tensorflow.python.ops import standard_ops
# other imports
import numpy as np
import tensorflow as tf
#from tensorflow.keras.layers import *
################################################################################
# The following class is a copy of Dense in keras. I marked all the lines that
# I added or changed
class DenseWithMask(Layer):
    """Dense layer but with optional permanent masking of units or edges."""
    def __init__(
        self,
        units,
        unit_mask=None,  # NEW
        edge_mask=None,  # NEW
        activation=None,
        use_bias=True,
        kernel_initializer='glorot_uniform',
        bias_initializer='zeros',
        kernel_regularizer=None,
        bias_regularizer=None,
        activity_regularizer=None,
        kernel_constraint=None,
        bias_constraint=None,
        **kwargs
    ):
        if 'input_shape' not in kwargs and 'input_dim' in kwargs:
            kwargs['input_shape'] = (kwargs.pop('input_dim'),)
        super(DenseWithMask, self).__init__(  # changed 'Dense' to 'DenseWithMask'
            activity_regularizer=regularizers.get(activity_regularizer))  #, **kwargs)
        self.units = int(units) if not isinstance(units, int) else units
        # NEW: add unit_mask to class attributes
        self.unit_mask = np.array(unit_mask) if unit_mask is not None else None
        # NEW: add edge_mask to class attributes
        self.edge_mask = np.array(edge_mask) if edge_mask is not None else None
        # NEW: add unit_mask_indices to class attributes
        self.unit_mask_indices = None
        # NEW: add edge_mask_indices to class attributes
        self.edge_mask_indices = None
        self.activation = activations.get(activation)
        self.use_bias = use_bias
        self.kernel_initializer = initializers.get(kernel_initializer)
        self.bias_initializer = initializers.get(bias_initializer)
        self.kernel_regularizer = regularizers.get(kernel_regularizer)
        self.bias_regularizer = regularizers.get(bias_regularizer)
        self.kernel_constraint = constraints.get(kernel_constraint)
        self.bias_constraint = constraints.get(bias_constraint)
        self.supports_masking = True
        self.input_spec = InputSpec(min_ndim=2)
        super(DenseWithMask, self).__init__(**kwargs)
    def get_config(self):
        config = {
            'units': self.units,
            'unit_mask': self.unit_mask.numpy(),  # NEW: added unit_mask to config
            'edge_mask': self.edge_mask.numpy(),  # NEW: added edge_mask to config
            'activation': activations.serialize(self.activation),
            'use_bias': self.use_bias,
            'kernel_initializer': initializers.serialize(self.kernel_initializer),
            'bias_initializer': initializers.serialize(self.bias_initializer),
            'kernel_regularizer': regularizers.serialize(self.kernel_regularizer),
            'bias_regularizer': regularizers.serialize(self.bias_regularizer),
            'activity_regularizer':
                regularizers.serialize(self.activity_regularizer),
            'kernel_constraint': constraints.serialize(self.kernel_constraint),
            'bias_constraint': constraints.serialize(self.bias_constraint)
        }
        base_config = super(DenseWithMask, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))
    def build(self, input_shape):
        dtype = dtypes.as_dtype(self.dtype or K.floatx())
        if not (dtype.is_floating or dtype.is_complex):
            raise TypeError('Unable to build `Dense` layer with non-floating point '
                            'dtype %s' % (dtype,))
        input_shape = tensor_shape.TensorShape(input_shape)
        if tensor_shape.dimension_value(input_shape[-1]) is None:
            raise ValueError('The last dimension of the inputs to `Dense` '
                             'should be defined. Found `None`.')
        last_dim = tensor_shape.dimension_value(input_shape[-1])
        self.input_spec = InputSpec(min_ndim=2, axes={-1: last_dim})
        # NEW: update masks in build
        if self.unit_mask is None:
            self.unit_mask = np.ones(self.units, dtype=bool)
        else:
            # check if previously set mask matches number of units
            if not len(self.unit_mask) == self.units:
                raise ValueError('Length of unit_mask must be equal to number of units.')
        if self.edge_mask is None:
            self.edge_mask = np.ones((last_dim, self.units), dtype=bool)
        else:
            # check if previously set mask matches input dimensions
            if not self.edge_mask.shape == (last_dim, self.units):
                raise ValueError('Dimensions of edge_mask must be equal to (last_input_dim, units).')
        # NEW: incorporate unit_mask info into edge_mask
        self.edge_mask = self.edge_mask * self.unit_mask
        # NEW: incorporate edge_mask info into unit_mask
        self.unit_mask = self.unit_mask * np.any(self.edge_mask, axis=0)
        # NEW: update mask indices
        self.edge_mask_indices = np.stack(self.edge_mask.nonzero()).T
        self.edge_mask_indices = np.array(self.edge_mask_indices, dtype='int64')
        self.unit_mask_indices = self.unit_mask.nonzero()[0]
        # need to convert indices for bias to 2d array for sparse tensor
        self.unit_mask_indices = np.stack([np.zeros(len(self.unit_mask_indices)),
                                           self.unit_mask_indices]).T
        self.unit_mask_indices = np.array(self.unit_mask_indices, dtype='int64')
        # NEW: turn all new attributes into tensorflow constants
        self.unit_mask = tf.constant(self.unit_mask)
        self.edge_mask = tf.constant(self.edge_mask)
        self.unit_mask_indices = tf.constant(self.unit_mask_indices)
        self.edge_mask_indices = tf.constant(self.edge_mask_indices)
        self.kernel = self.add_weight(
            'kernel',
            # NEW: trainable kernel weights may be fewer than before;
            # they are now stored as 1D array,
            shape=[int(np.sum(self.edge_mask))],
            initializer=self.kernel_initializer,
            regularizer=self.kernel_regularizer,
            constraint=self.kernel_constraint,
            dtype=self.dtype,
            trainable=True)
        if self.use_bias:
            self.bias = self.add_weight(
                'bias',
                # NEW: trainable biases may be fewer than number of units
                shape=[int(np.sum(self.unit_mask))],
                initializer=self.bias_initializer,
                regularizer=self.bias_regularizer,
                constraint=self.bias_constraint,
                dtype=self.dtype,
                trainable=True)
        else:
            self.bias = None
        self.built = True
    def rebuild(self, edge_mask=None, unit_mask=None):
        # NEW: This is a new function for rebuilding a DenseWithMask layer with a
        # new edge mask and/or unit mask
        # if none are given, get default values for masks from layer
        if edge_mask is None:
            edge_mask = self.edge_mask.numpy()
        if unit_mask is None:
            unit_mask = self.unit_mask.numpy()
        # incorporate unit_mask info into edge_mask
        edge_mask = edge_mask * unit_mask
        # incorporate edge_mask info into unit_mask
        unit_mask = unit_mask * np.any(edge_mask, axis=0)
        # NOW: get new arrays of trainable weights for layer
        # The stuff below contains slow, redundant lines but
        # those might become handy when adding default values for
        # new edges and nodes(?)
        # create old kernel
        kernel_old = np.zeros_like(self.edge_mask, dtype=float)
        kernel_old[self.edge_mask.numpy()] = self.kernel.numpy()
        # create new kernel
        kernel_new = np.zeros_like(edge_mask, dtype=float)
        for i in range(len(kernel_new)):
            for j in range(len(kernel_new[0])):
                if edge_mask[i, j]:
                    kernel_new[i, j] = kernel_old[i, j]
        # create initializer for new kernel
        vals = list(kernel_new[edge_mask])
        initK = tf.compat.v1.keras.initializers.Constant(value=vals,
                                                         verify_shape=False)
        # save new edge mask
        self.edge_mask = edge_mask
        # build new kernel
        self.kernel = self.add_weight(
            'kernel',
            shape=[int(np.sum(self.edge_mask))],
            initializer=initK,
            regularizer=self.kernel_regularizer,
            constraint=self.kernel_constraint,
            dtype=self.dtype,
            trainable=True)
        # create old bias list
        bias_old = np.zeros_like(self.unit_mask, dtype=float)
        bias_old[self.unit_mask.numpy()] = self.bias.numpy()
        # create new bias list
        bias_new = np.zeros_like(unit_mask, dtype=float)
        for i in range(len(bias_new)):
            if unit_mask[i]:
                bias_new[i] = bias_old[i]
        # create initializer for new biases
        vals = list(bias_new[unit_mask])
        initB = tf.compat.v1.keras.initializers.Constant(value=vals,
                                                         verify_shape=False)
        # save new unit mask
        self.unit_mask = unit_mask
        # build new biases
        self.bias = self.add_weight(
            'bias',
            shape=[int(np.sum(self.unit_mask))],
            initializer=initB,
            regularizer=self.bias_regularizer,
            constraint=self.bias_constraint,
            dtype=self.dtype,
            trainable=True)
        # update mask indices
        self.edge_mask_indices = np.stack(self.edge_mask.nonzero()).T
        self.edge_mask_indices = np.array(self.edge_mask_indices, dtype='int64')
        self.unit_mask_indices = self.unit_mask.nonzero()[0]
        # need to convert indices for bias to 2d array for sparse tensor
        self.unit_mask_indices = np.stack([np.zeros(len(self.unit_mask_indices)),
                                           self.unit_mask_indices]).T
        self.unit_mask_indices = np.array(self.unit_mask_indices, dtype='int64')
        # turn all new attributes into tensorflow constants
        self.unit_mask = tf.constant(self.unit_mask)
        self.edge_mask = tf.constant(self.edge_mask)
        self.unit_mask_indices = tf.constant(self.unit_mask_indices)
        self.edge_mask_indices = tf.constant(self.edge_mask_indices)
    def call(self, inputs):
        # NEW: create a kernel (2D numpy array) from 1D list of trainable kernel weights
        kernel = tf.SparseTensor(indices=self.edge_mask_indices,
                                 values=self.kernel,
                                 dense_shape=self.edge_mask.shape)
        kernel = tf.sparse.to_dense(kernel)
        # NEW: create a bias vector (1D numpy array) from 1D list of trainable biases
        bias = tf.SparseTensor(indices=self.unit_mask_indices, values=self.bias,
                               dense_shape=[1, self.unit_mask.shape[0]])
        bias = tf.squeeze(tf.sparse.to_dense(bias))
        rank = inputs.shape.rank
        if rank is not None and rank > 2:
            # Broadcasting is required for the inputs.
            outputs = standard_ops.tensordot(inputs, kernel, [[rank - 1], [0]])  # NEW: self.kernel -> kernel
            # Reshape the output back to the original ndim of the input.
            if not context.executing_eagerly():
                shape = inputs.shape.as_list()
                output_shape = shape[:-1] + [self.units]
                outputs.set_shape(output_shape)
        else:
            inputs = math_ops.cast(inputs, self._compute_dtype)
            if K.is_sparse(inputs):
                outputs = sparse_ops.sparse_tensor_dense_matmul(inputs, kernel)  # NEW: self.kernel -> kernel
            else:
                outputs = gen_math_ops.mat_mul(inputs, kernel)  # NEW: self.kernel -> kernel
        if self.use_bias:
            outputs = nn.bias_add(outputs, bias)  # NEW: self.bias -> bias
        if self.activation is not None:
            return self.activation(outputs)
        return outputs
    def compute_output_shape(self, input_shape):
        input_shape = tensor_shape.TensorShape(input_shape)
        input_shape = input_shape.with_rank_at_least(2)
        if tensor_shape.dimension_value(input_shape[-1]) is None:
            raise ValueError(
                'The innermost dimension of input_shape must be defined, but saw: %s'
                % input_shape)
        return input_shape[:-1].concatenate(self.units)
def main():
    # make a model
    model = tf.keras.models.Sequential(
        [
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            DenseWithMask(128, activation='relu'),
            DenseWithMask(10)
        ]
    )
    model.compile(
        optimizer='adam',
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=['accuracy']
    )
    # try accessing edge_mask
    print('edge_mask:', model.layers[1].edge_mask)
    # save model
    model.save('model_with_custom_layers')
    # load model
    model2 = tf.keras.models.load_model('model_with_custom_layers', custom_objects={"DenseWithMask": DenseWithMask})
    # try accessing attributes of custom layer
    print("edge_mask of loaded model:", model2.layers[1].edge_mask)
if __name__ == "__main__":
    main()
And the corresponding output:
edge_mask: tf.Tensor(
[[ True True True ... True True True]
[ True True True ... True True True]
[ True True True ... True True True]
...
[ True True True ... True True True]
[ True True True ... True True True]
[ True True True ... True True True]], shape=(784, 128), dtype=bool)
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/resource_variable_ops.py:1817: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
edge_mask of loaded model: tf.Tensor(
[[ True True True ... True True True]
[ True True True ... True True True]
[ True True True ... True True True]
...
[ True True True ... True True True]
[ True True True ... True True True]
[ True True True ... True True True]], shape=(784, 128), dtype=bool)
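As for your code, you can simplify it a little:
- As you mentioned, your custom class DenseWithMask is an extended version of the Dense class in tensorflow, so you can use inheritance (at least in __init__ and get_config; I did not check all of your methods)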
import numpy as np
import tensorflow as tf
class DenseWithMask(tf.keras.layers.Dense):
    """
    Dense layer but with optional permanent masking of units or edges.
    """
    def __init__(
        self,
        units,
        unit_mask=None,  # NEW
        edge_mask=None,  # NEW
        **kwargs
    ):
        if 'input_shape' not in kwargs and 'input_dim' in kwargs:
            kwargs['input_shape'] = (kwargs.pop('input_dim'),)
        self.unit_mask = np.array(unit_mask) if unit_mask is not None else None
        self.edge_mask = np.array(edge_mask) if edge_mask is not None else None
        self.unit_mask_indices = None
        self.edge_mask_indices = None
        super().__init__(units=units, **kwargs)
    def get_config(self):
        config = super().get_config()
        config.update({
            "edge_mask": self.edge_mask.numpy(),
            "unit_mask": self.unit_mask.numpy()
        })
        return config
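One caveat with the get_config above: calling .numpy() assumes the masks have already been turned into tensors by build. Before the layer is built they are either None or plain NumPy arrays, neither of which has a .numpy() method. A small conversion helper could cover all three cases. The name mask_to_config and the FakeTensor class below are hypothetical, introduced only for this sketch, which runs without TensorFlow.

```python
import numpy as np


def mask_to_config(mask):
    """Hypothetical helper: turn a mask into a plain nested list (or None)."""
    if mask is None:
        return None              # layer not built yet and no mask was supplied
    if hasattr(mask, "numpy"):
        mask = mask.numpy()      # e.g. an eager tensor exposing .numpy()
    return np.asarray(mask).tolist()


class FakeTensor:
    """Minimal stand-in for an eager tensor, used only for this demo."""

    def __init__(self, value):
        self._value = np.asarray(value)

    def numpy(self):
        return self._value


from_none = mask_to_config(None)
from_array = mask_to_config(np.array([True, False]))
from_tensor = mask_to_config(FakeTensor([[True], [False]]))
```

In a layer, get_config could then store mask_to_config(self.edge_mask), and __init__ already converts a non-None list back into an array via np.array.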
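- When you define custom callable objects (e.g. layers, metrics, optimizers), instead of defining the custom_objects mapping in the load_model method, you can use the utility function provided by tensorflow to do this automatically: tf.keras.utils.register_keras_serializable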
import tensorflow as tf
@tf.keras.utils.register_keras_serializable()
class DenseWithMask(tf.keras.layers.Dense):
    ....
Then, when you save/load the model, you can skip the custom_objects mapping:
# make a model
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        DenseWithMask(128, activation='relu'),
        DenseWithMask(10)
    ]
)
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy']
)
# try accessing edge_mask
print('edge_mask:', model.layers[1].edge_mask)
# save model
model.save('model_with_custom_layers.h5')
# load model
model2 = tf.keras.models.load_model('model_with_custom_layers.h5')
# try accessing attributes of custom layer
print("edge_mask of loaded model:", model2.layers[1].edge_mask)
Note: for the second point, I only managed to get it working with the h5 file format, but it should also work with the pb format.