Convert NumPy Arrays to a Tensor

【Title】Convert NumPy Arrays to a Tensor 【Posted】2021-10-17 11:19:46 【Question】:

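I used pandas to load a file into a DataFrame, and now I want to train a deep learning model with TensorFlow. I cannot get the model to train: after splitting the data into training and test sets, when I compile and fit the model I get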

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

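I thought the problem was that the NumPy arrays have different lengths, but even after padding them (so that all arrays within the column have the same length) the problem remains. Below is an example of one column of my dataset. How should I convert it to a tensor?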

df = pd.read_parquet('example.parquet')
df['column']

0                            [0, 1, 1, 1, 0, 1, 0, 1, 0]
1          [0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0]
2          [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1]
3                      [0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1]
4                   [0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0]
                         ...                        
115                          [0, 1, 0, 0, 1, 1, 1, 1, 1]
116    [0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, ...
117     [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1]
118    [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, ...
119                    [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1]

Note that the column shown above is the original one, not the padded one that still fails.

These are the steps I use to train the model, in case they are useful:

from sklearn.preprocessing import LabelEncoder
label_encoder = LabelEncoder()
Y = label_encoder.fit_transform(Y)
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.3, random_state = 42)
#create model
model = Sequential()

#add model layers
model.add(BatchNormalization())
model.add(Dense(20, activation='softmax', input_shape=(X_train.shape)))

# compile model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50)

Update: full traceback

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_16380/3421148994.py in <module>
      1 from livelossplot import PlotLossesKeras
      2 
----> 3 model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50, callbacks=[PlotLossesKeras()])

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1132          training_utils.RespectCompiledTrainableState(self):
   1133       # Creates a `tf.data.Dataset` and handles batch and epoch iteration.
-> 1134       data_handler = data_adapter.get_data_handler(
   1135           x=x,
   1136           y=y,

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in get_data_handler(*args, **kwargs)
   1381   if getattr(kwargs["model"], "_cluster_coordinator", None):
   1382     return _ClusterCoordinatorDataHandler(*args, **kwargs)
-> 1383   return DataHandler(*args, **kwargs)
   1384 
   1385 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in __init__(self, x, y, sample_weight, batch_size, steps_per_epoch, initial_epoch, epochs, shuffle, class_weight, max_queue_size, workers, use_multiprocessing, model, steps_per_execution, distribute)
   1136 
   1137     adapter_cls = select_data_adapter(x, y)
-> 1138     self._adapter = adapter_cls(
   1139         x,
   1140         y,

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in __init__(self, x, y, sample_weights, sample_weight_modes, batch_size, epochs, steps, shuffle, **kwargs)
    228                **kwargs):
    229     super(TensorLikeDataAdapter, self).__init__(x, y, **kwargs)
--> 230     x, y, sample_weights = _process_tensorlike((x, y, sample_weights))
    231     sample_weight_modes = broadcast_sample_weight_modes(
    232         sample_weights, sample_weight_modes)

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in _process_tensorlike(inputs)
   1029     return x
   1030 
-> 1031   inputs = tf.nest.map_structure(_convert_numpy_and_scipy, inputs)
   1032   return tf.__internal__.nest.list_to_tuple(inputs)
   1033
~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\util\nest.py in map_structure(func, *structure, **kwargs)
    867 
    868   return pack_sequence_as(
--> 869       structure[0], [func(*x) for x in entries],
    870       expand_composites=expand_composites)
    871 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\util\nest.py in <listcomp>(.0)
    867 
    868   return pack_sequence_as(
--> 869       structure[0], [func(*x) for x in entries],
    870       expand_composites=expand_composites)
    871 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in _convert_numpy_and_scipy(x)
   1024       if issubclass(x.dtype.type, np.floating):
   1025         dtype = backend.floatx()
-> 1026       return tf.convert_to_tensor(x, dtype=dtype)
   1027     elif _is_scipy_sparse(x):
   1028       return _scipy_sparse_to_sparse_tensor(x)

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\util\dispatch.py in wrapper(*args, **kwargs)
    204     """Call target, and fall back on dispatchers if there is a TypeError."""
    205     try:
--> 206       return target(*args, **kwargs)
    207     except (TypeError, ValueError):
    208       # Note: convert_to_eager_tensor currently raises a ValueError, not a
~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\ops.py in convert_to_tensor_v2_with_dispatch(value, dtype, dtype_hint, name)
   1428     ValueError: If the `value` is a tensor not of given `dtype` in graph mode.
   1429   """
-> 1430   return convert_to_tensor_v2(
   1431       value, dtype=dtype, dtype_hint=dtype_hint, name=name)
   1432 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\ops.py in convert_to_tensor_v2(value, dtype, dtype_hint, name)
   1434 def convert_to_tensor_v2(value, dtype=None, dtype_hint=None, name=None):
   1435   """Converts the given `value` to a `Tensor`."""
-> 1436   return convert_to_tensor(
   1437       value=value,
   1438       dtype=dtype,

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\profiler\trace.py in wrapped(*args, **kwargs)
    161         with Trace(trace_name, **trace_kwargs):
    162           return func(*args, **kwargs)
--> 163       return func(*args, **kwargs)
    164 
    165     return wrapped

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\ops.py in convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, dtype_hint, ctx, accepted_result_types)
   1564 
   1565     if ret is None:
-> 1566       ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
   1567 
   1568     if ret is NotImplemented:
~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\tensor_conversion_registry.py in _default_conversion_function(***failed resolving arguments***)
     50 def _default_conversion_function(value, dtype, name, as_ref):
     51   del as_ref  # Unused.
---> 52   return constant_op.constant(value, dtype, name=name)
     53 
     54 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in constant(value, dtype, shape, name)
    269     ValueError: if called on a symbolic tensor.
    270   """
--> 271   return _constant_impl(value, dtype, shape, name, verify_shape=False,
    272                         allow_broadcast=True)
    273 
    ~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in _constant_impl(value, dtype, shape, name, verify_shape, allow_broadcast)
    281       with trace.Trace("tf.constant"):
    282         return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
--> 283     return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
    284 
    285   g = ops.get_default_graph()

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
    306 def _constant_eager_impl(ctx, value, dtype, shape, verify_shape):
    307   """Creates a constant on the current device."""
--> 308   t = convert_to_eager_tensor(value, ctx, dtype)
    309   if shape is None:
    310     return t

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in convert_to_eager_tensor(value, ctx, dtype)
    104       dtype = dtypes.as_dtype(dtype).as_datatype_enum
    105   ctx.ensure_initialized()
--> 106   return ops.EagerTensor(value, ctx.device_name, dtype)
    107 
    108 

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

【Comments】:

Please update the question with the full traceback.

I have updated the first post with the full traceback.

Could you show the shape of x_train? It looks like the dtype is object and does not match what the model expects; try X.astype(np.float32).

I tried that and got this error: "ValueError: setting an array element with a sequence.", preceded in the traceback by "TypeError: only size-1 arrays can be converted to Python scalars".

The shape of my X is (120,3) and the shape of my X_train is (84,3).

【Answer 1】:

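First of all, thank you very much for your reply. Unfortunately, I tried what you suggested, but the problem persists. Below I give the script I am using, together with all the information about the types and shapes of the features.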

import pandas as pd
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
from sklearn.preprocessing import LabelEncoder
#Import Dataset
df = pd.read_parquet('toydataset.parquet')
#Selected only one column of the dataset
X = df['column1']
Y= df['label']

Here is the information about the selected column:

X.shape
(120,)
X
0                            [0, 1, 1, 1, 0, 1, 0, 1, 0]
1          [0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0]
2          [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1]
3                      [0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1]
4                   [0, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0]
                             ...                        
115                          [0, 1, 0, 0, 1, 1, 1, 1, 1]
116    [0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, ...
117     [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1]
118    [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, ...
119                    [0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1]
Name: column1, Length: 120, dtype: object

Now I pad the column. The maximum array length is 13040.

#First encode the label
label_encoder = LabelEncoder()
Y = label_encoder.fit_transform(Y)
#Padding the column
for i in range (0, len(df['column1'])):
    pad_size = 13040-len(df['column1'][i])
    df['column1'][i] = np.pad(df['column1'][i], (pad_size, 0))
    #print(df['column1'][i])

This is the result:

X=df['column1']
X
0      [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
1      [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
2      [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
3      [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
4      [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
                             ...                        
115    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
116    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
117    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
118    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
119    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
Name: column1, Length: 120, dtype: object
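
The output above still reports dtype: object, which suggests the column is a 1-D container of separate arrays rather than a single 2-D numeric array. The following is a minimal sketch of that situation using NumPy only; the names `rows`, `column`, and `matrix` and the toy sizes are assumptions made for illustration, not part of the original script.

```python
import numpy as np

# Hypothetical stand-in for the padded column: a 1-D object array whose
# elements are equal-length NumPy arrays (similar to an object-dtype Series).
rows = [np.zeros(5, dtype=np.int64) for _ in range(3)]
column = np.empty(len(rows), dtype=object)
for i, row in enumerate(rows):
    column[i] = row

print(column.shape, column.dtype)  # a 1-D container of objects

# Casting the container directly fails, because each element is itself an array.
try:
    column.astype(np.float32)
    cast_failed = False
except (ValueError, TypeError):
    cast_failed = True

# Stacking the rows builds a real 2-D numeric array instead.
matrix = np.stack(list(column)).astype(np.float32)
print(matrix.shape, matrix.dtype)
```

A 2-D float array like `matrix` is the kind of input that can be converted to a tensor, whereas the object container is not.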

Now I split the dataset:

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.3, random_state = 42)
X_train.shape
(84,)
X_test.shape
(36,)
y_train.shape
(84,)
y_test.shape
(36,)

Here is X_train as well:

X_train
30     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
53     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
118    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
9      [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
33     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
                             ...                        
106    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
14     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
92     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
51     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
102    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
Name: column1, Length: 84, dtype: object

Then I followed the suggested steps:

model = Sequential()
model.add(Dense(20, activation='softmax', input_shape=(84,)))
#I tried with input shape (84,) but also with other numbers but the result is always the same
# compile model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50, batch_size=32)

This is the full traceback:

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\data\util\structure.py in normalize_element(element, element_signature)
    105         if spec is None:
--> 106           spec = type_spec_from_value(t, use_fallback=False)
    107       except TypeError:

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\data\util\structure.py in type_spec_from_value(element, use_fallback)
    485 
--> 486   raise TypeError("Could not build a TypeSpec for %r with type %s" %
    487                   (element, type(element).__name__))

TypeError: Could not build a TypeSpec for 30     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
53     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
118    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
9      [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
33     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
                             ...                        
106    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
14     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
92     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
51     [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
102    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
Name: column1, Length: 84, dtype: object with type Series
During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_16012/2272230314.py in <module>
     24 # compile model
     25 model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
---> 26 model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50, batch_size=1)
     27 
     28 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1132          training_utils.RespectCompiledTrainableState(self):
   1133       # Creates a `tf.data.Dataset` and handles batch and epoch iteration.
-> 1134       data_handler = data_adapter.get_data_handler(
   1135           x=x,
   1136           y=y,

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in get_data_handler(*args, **kwargs)
   1381   if getattr(kwargs["model"], "_cluster_coordinator", None):
   1382     return _ClusterCoordinatorDataHandler(*args, **kwargs)
-> 1383   return DataHandler(*args, **kwargs)
   1384 
   1385 
   ~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in __init__(self, x, y, sample_weight, batch_size, steps_per_epoch, initial_epoch, epochs, shuffle, class_weight, max_queue_size, workers, use_multiprocessing, model, steps_per_execution, distribute)
   1136 
   1137     adapter_cls = select_data_adapter(x, y)
-> 1138     self._adapter = adapter_cls(
   1139         x,
   1140         y,

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in __init__(self, x, y, sample_weights, sample_weight_modes, batch_size, epochs, steps, shuffle, **kwargs)
    320     indices_dataset = indices_dataset.flat_map(slice_batch_indices)
    321 
--> 322     dataset = self.slice_inputs(indices_dataset, inputs)
    323 
    324     if shuffle == "batch":

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\data_adapter.py in slice_inputs(self, indices_dataset, inputs)
    346     dataset = tf.data.Dataset.zip((
    347         indices_dataset,
--> 348         tf.data.Dataset.from_tensors(inputs).repeat()
    349     ))
    350 
    ~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py in from_tensors(tensors)
    604       Dataset: A `Dataset`.
    605     """
--> 606     return TensorDataset(tensors)
    607 
    608   @staticmethod

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py in __init__(self, element)
   3823   def __init__(self, element):
   3824     """See `Dataset.from_tensors()` for details."""
-> 3825     element = structure.normalize_element(element)
   3826     self._structure = structure.type_spec_from_value(element)
   3827     self._tensors = structure.to_tensor_list(self._structure, element)

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\data\util\structure.py in normalize_element(element, element_signature)
    109         # the value. As a fallback try converting the value to a tensor.
    110         normalized_components.append(
--> 111             ops.convert_to_tensor(t, name="component_%d" % i))
    112       else:
    113         if isinstance(spec, sparse_tensor.SparseTensorSpec):

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\profiler\trace.py in wrapped(*args, **kwargs)
    161         with Trace(trace_name, **trace_kwargs):
    162           return func(*args, **kwargs)
--> 163       return func(*args, **kwargs)
    164 
    165     return wrapped
~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\ops.py in convert_to_tensor(value, dtype, name, as_ref, preferred_dtype, dtype_hint, ctx, accepted_result_types)
   1564 
   1565     if ret is None:
-> 1566       ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
   1567 
   1568     if ret is NotImplemented:

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in _constant_tensor_conversion_function(v, dtype, name, as_ref)
    344                                          as_ref=False):
    345   _ = as_ref
--> 346   return constant(v, dtype=dtype, name=name)
    347 
    348 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in constant(value, dtype, shape, name)
    269     ValueError: if called on a symbolic tensor.
    270   """
--> 271   return _constant_impl(value, dtype, shape, name, verify_shape=False,
    272                         allow_broadcast=True)
    273 
    ~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in _constant_impl(value, dtype, shape, name, verify_shape, allow_broadcast)
    281       with trace.Trace("tf.constant"):
    282         return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
--> 283     return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
    284 
    285   g = ops.get_default_graph()

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
    306 def _constant_eager_impl(ctx, value, dtype, shape, verify_shape):
    307   """Creates a constant on the current device."""
--> 308   t = convert_to_eager_tensor(value, ctx, dtype)
    309   if shape is None:
    310     return t

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\constant_op.py in convert_to_eager_tensor(value, ctx, dtype)
    104       dtype = dtypes.as_dtype(dtype).as_datatype_enum
    105   ctx.ensure_initialized()
--> 106   return ops.EagerTensor(value, ctx.device_name, dtype)
    107 
    108 

ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

Update: with the padding fix I have solved the conversion problem. This is my code now:

import pandas as pd
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Conv1D, Flatten
import numpy as np
from sklearn.preprocessing import LabelEncoder
df = pd.read_parquet('toydataset.parquet')
Y= df['label']
label_encoder = LabelEncoder()
Y = label_encoder.fit_transform(Y)
#New Padding
for i in range (0, len(df['column1'])):
    pad_size = 13040 - len(df['column1'][i])
    df['column1'][i] = np.pad(df['column1'][i], (pad_size, 0))

final_array = np.array([np.array(i) for i in df['column1']])
X=final_array
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.3, random_state = 42)
#create model
model = Sequential()

#add model layers
#model.add(BatchNormalization())
model.add(Dense(20, activation='softmax', input_shape=(13040,)))

# compile model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50)

Now I get this error:

ValueError: Shapes (None, 1) and (None, 20) are incompatible
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_6140/3165204354.py in <module>
      8 # compile model
      9 model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
---> 10 model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50)

~\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
   1182                 _r=1):
   1183               callbacks.on_train_batch_begin(step)
-> 1184               tmp_logs = self.train_function(iterator)
   1185               if data_handler.should_sync:
   1186                 context.async_wait()

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\eager\def_function.py in __call__(self, *args, **kwds)
    883 
    884       with OptionalXlaContext(self._jit_compile):
--> 885         result = self._call(*args, **kwds)
    886 
    887       new_tracing_count = self.experimental_get_tracing_count()

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\eager\def_function.py in _call(self, *args, **kwds)
    931       # This is the first call of __call__, so we have to initialize.
    932       initializers = []
--> 933       self._initialize(args, kwds, add_initializers_to=initializers)
    934     finally:
    935       # At this point we know that the initialization is complete (or less

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\eager\def_function.py in _initialize(self, args, kwds, add_initializers_to)
    757     self._graph_deleter = FunctionDeleter(self._lifted_initializer_graph)
    758     self._concrete_stateful_fn = (
--> 759         self._stateful_fn._get_concrete_function_internal_garbage_collected(  # pylint: disable=protected-access
    760             *args, **kwds))
    761 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\eager\function.py in _get_concrete_function_internal_garbage_collected(self, *args, **kwargs)
   3064       args, kwargs = None, None
   3065     with self._lock:
-> 3066       graph_function, _ = self._maybe_define_function(args, kwargs)
   3067     return graph_function
   3068 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\eager\function.py in _maybe_define_function(self, args, kwargs)
   3461 
   3462           self._function_cache.missed.add(call_context_key)
-> 3463           graph_function = self._create_graph_function(args, kwargs)
   3464           self._function_cache.primary[cache_key] = graph_function
   3465 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\eager\function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
3296     arg_names = base_arg_names + missing_arg_names
   3297     graph_function = ConcreteFunction(
-> 3298         func_graph_module.func_graph_from_py_func(
   3299             self._name,
   3300             self._python_function,

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes, acd_record_initial_resource_uses)
   1005         _, original_func = tf_decorator.unwrap(python_func)
   1006 
-> 1007       func_outputs = python_func(*func_args, **func_kwargs)
   1008 
   1009       # invariant: `func_outputs` contains only Tensors, CompositeTensors,

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\eager\def_function.py in wrapped_fn(*args, **kwds)
    666         # the function a weak reference to itself to avoid a reference cycle.
    667         with OptionalXlaContext(compile_with_xla):
--> 668           out = weak_wrapped_fn().__wrapped__(*args, **kwds)
    669         return out
    670 

~\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\func_graph.py in wrapper(*args, **kwargs)
    992           except Exception as e:  # pylint:disable=broad-except
    993             if hasattr(e, "ag_error_metadata"):
--> 994               raise e.ag_error_metadata.to_exception(e)
    995             else:
    996               raise

ValueError: in user code:
     C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\training.py:853 train_function  *
        return step_function(self, iterator)
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\training.py:842 step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1286 run
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2849 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:3632 _call_for_each_replica
        return fn(*args, **kwargs)
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\training.py:835 run_step  **
        outputs = model.train_step(data)
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\training.py:788 train_step
        loss = self.compiled_loss(
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\engine\compile_utils.py:201 __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\losses.py:141 __call__
        losses = call_fn(y_true, y_pred)
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\losses.py:245 call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\util\dispatch.py:206 wrapper
        return target(*args, **kwargs)
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\losses.py:1665 categorical_crossentropy
        return backend.categorical_crossentropy(
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\util\dispatch.py:206 wrapper
        return target(*args, **kwargs)
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\keras\backend.py:4839 categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)
    C:\Users\Luigi\AppData\Local\Programs\Python\Python39\lib\site-packages\tensorflow\python\framework\tensor_shape.py:1161 assert_is_compatible_with
        raise ValueError("Shapes %s and %s are incompatible" % (self, other))

    ValueError: Shapes (None, 1) and (None, 20) are incompatible
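
The mismatch reported above, (None, 1) against (None, 20), is consistent with integer class labels (as produced by LabelEncoder) being compared against a 20-unit softmax output under categorical_crossentropy, which expects one-hot targets. The sketch below shows one way to one-hot encode integer labels with NumPy; `y_encoded`, `num_classes`, and the toy values are hypothetical. It assumes the labels are integers in the range 0 to num_classes - 1.

```python
import numpy as np

# Hypothetical integer labels, in the form LabelEncoder.fit_transform returns.
y_encoded = np.array([0, 2, 1, 2, 0])
num_classes = int(y_encoded.max()) + 1

# One-hot encode: row i has a single 1 in column y_encoded[i].
y_one_hot = np.eye(num_classes, dtype=np.float32)[y_encoded]
print(y_one_hot.shape)  # (number of samples, num_classes)
```

An alternative that avoids the conversion is to keep the integer labels and compile with loss='sparse_categorical_crossentropy'. In either case the number of units in the final Dense layer would need to equal the number of classes.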

【Comments】:

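Thank you very much. I have solved the tensor conversion problem. When I create the model, what should I use as input_shape? X_train.shape is (84, 13040), X_test.shape is (36, 13040), y_train.shape is (84, _ ), and y_test.shape is (36, _ ). I tried input_shape = (84, _ ) and (84, 13040), but both give an error in the first epoch.

Use model.add(Dense(20, activation='softmax', input_shape=(13040,))). Also, you will not get good results, because your input is very large and you do not have that many samples.

Thank you very much for your help. Of course this is just an example to understand how training works; I know I need more samples. There is still a problem in the first epoch; I have updated my earlier answer.

【Answer 2】: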

I assume you are currently working with the padded data, and that after padding you apply scaling. Once that is done, X has shape (120, 3) and X_train has shape (84, 3).

The first obvious error is in the line below:

model.add(Dense(20, activation='softmax', input_shape=(X_train.shape)))

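The batch dimension is not specified in input_shape. To put it more simply, suppose you were feeding images to the model: for single-channel images, what would you write in input_shape? Something like the following.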

height = 224
width = 224
model.add(Dense(20, activation='softmax', input_shape=(height, width)))

# In your case you have written
model.add(Dense(20, activation='softmax', input_shape=(120, 3)))

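This tells the model that each individual input has shape (120, 3) and has its own label, which is not the case. You should therefore pass only the feature dimension, as follows: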

model.add(Dense(20, activation='softmax', input_shape=(3,)))
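
To avoid hard-coding the feature dimension, it can be derived from the data by dropping the first (sample) axis. This is a small sketch with a hypothetical `X_train` of zeros standing in for the real training matrix:

```python
import numpy as np

# Hypothetical training matrix: 84 samples with 3 features each.
X_train = np.zeros((84, 3), dtype=np.float32)

# Drop the leading sample axis; Keras adds the batch axis itself.
input_shape = X_train.shape[1:]
print(input_shape)
```

The resulting tuple could then be passed as input_shape=input_shape to the first layer.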

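After this, the error should go away. Also, I do not see you using the batch_size parameter in model.fit; consider setting it explicitly (Keras uses 32 by default).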

The second thing I see is not a syntax error but a methodological error in the code below.

#create model
model = Sequential()
#add model layers
model.add(BatchNormalization()) # RED FLAG
model.add(Dense(20, activation='softmax', input_shape=(X_train.shape)))

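You should not apply BatchNormalization directly to the input. The main reason to use BatchNormalization is to speed up training, and even then it is not applied to the input. Another important point is that BatchNormalization normalizes over each training batch, not over the whole dataset, so it is of little use unless the batch size is large enough to be representative of the whole population.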
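
To illustrate the point about batch statistics, the sketch below standardizes batches with their own mean and variance, which is the core of what batch normalization computes during training (the learned scale and shift and the moving averages are left out). The data, the `normalize` helper, and the batch sizes are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical feature column: 1000 samples drawn from a normal distribution.
data = rng.normal(loc=5.0, scale=2.0, size=1000)

def normalize(batch, eps=1e-3):
    """Standardize a batch using its own mean and variance."""
    return (batch - batch.mean()) / np.sqrt(batch.var() + eps)

small_batch = data[:4]
large_batch = data[:512]

# Gap between each batch mean and the mean over all the data.
small_gap = abs(small_batch.mean() - data.mean())
large_gap = abs(large_batch.mean() - data.mean())
print(small_gap, large_gap)
```

With small batches the batch mean tends to drift further from the dataset mean, so each batch is shifted and scaled differently.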

Update: your padded data is still not a 2-D array. After padding, X.shape should be ( _ , _ ) rather than ( _ , ). So do the following:

import numpy as np
import pandas as pd

# Create some sample data with rows of different lengths
random_array = []
for i in range(20):
    random_array.append([i for i in range(i+1)])

df = pd.DataFrame()
df['values'] = random_array

for i in range (0, len(df['values'])):
    pad_size = 21 - len(df['values'][i])
    df['values'][i] = np.pad(df['values'][i], (pad_size, 0))

final_array = np.array([np.array(i) for i in df['values']])
print(final_array.shape) # This will give (20, 21) and not (20,)
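
As an alternative to assigning padded rows back into the DataFrame one cell at a time (chained assignment of this kind can trigger pandas warnings), the padded matrix can be built directly in NumPy. This is a sketch under the assumption that the sequences are available as a list of lists; `sequences`, `max_len`, and `padded` are hypothetical names.

```python
import numpy as np

# Hypothetical ragged data: sequences of lengths 1 to 5.
sequences = [list(range(n + 1)) for n in range(5)]
max_len = max(len(seq) for seq in sequences)

# Preallocate a zero matrix and right-align each sequence (left padding),
# matching np.pad(seq, (pad_size, 0)) in the loop above.
padded = np.zeros((len(sequences), max_len), dtype=np.float32)
for row, seq in zip(padded, sequences):
    row[max_len - len(seq):] = seq

print(padded.shape, padded.dtype)
```

Because `padded` is allocated as a 2-D float32 array from the start, it never passes through an object-dtype container.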

【Comments】:

Thank you very much for your answer. I have provided more information in the next answer. Unfortunately, the error is still there.
