Keras: ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type list)
Posted: 2021-12-01 19:12:22
Question: I am trying to classify audio signals. To do this, I first compute the MFCCs of every .wav file used to train the CNN and split them by label (in some files one sound occupies the first few seconds and another sound the rest). I then cut them into 2.5-second sequences and store each MFCC block in its own JSON file, as shown below. (Because librosa.feature.mfcc returns an ndarray, I have to convert it to a list before storing it in JSON.)
for path in wav_paths:
    # split stereo .wav-file into channels
    filename = os.path.basename(path)
    print(filename)
    audiosegment = AudioSegment.from_file(path)
    arr_mono = audiosegment.get_array_of_samples()
    # audio_data is array.array (int16), ndarray (float32) needed for librosa
    audio_data = np.asarray(arr_mono).astype(np.float32)
    sample_rate = audiosegment.frame_rate
    # calculate MFCCs for whole audio
    mfcc = librosa.feature.mfcc(audio_data, sr=sample_rate, n_mfcc=n_mfcc, n_fft=framesize, hop_length=int(hop_size))
    duration = audiosegment.duration_seconds
    begin, end, event = create_dataframe.read_json(path_to_json)
    # one sound goes from 0 secs to begin, the other from begin to end, then the first again from end to duration
    list1 = [0, begin, end, duration]
    list2 = list(zip(list1, list1[1:]))  # list2 = [(0, begin), (begin, end), (end, duration)]
    lst_mfcc_split_by_label = []
    for from_sec, to_sec in list2:
        # get label of sequence
        label_str = create_dataframe.get_label(begin, end, event, from_sec, to_sec)
        label = create_dataframe.label_key(label_str)  # label as number between 0 and 3
        # split MFCC by label
        index_first_frame = librosa.time_to_frames(from_sec, sr=sample_rate, hop_length=hop_size)
        index_last_frame = librosa.time_to_frames(to_sec, sr=sample_rate, hop_length=hop_size)
        # returns list of 3 arrays (mfcc array split at index_first_frame and index_last_frame + 1)
        lst_mfcc_split_by_label = np.hsplit(mfcc, [index_first_frame, index_last_frame + 1])
        # part between index_first_frame and index_last_frame + 1
        mfcc_split_by_label = lst_mfcc_split_by_label[1]
        # set size of blocks
        secs_per_split = 2.5
        # only consider blocks that are exactly secs_per_split long
        n_blocks_in_sequence = int((to_sec - from_sec) / secs_per_split)  # round down
        to_sec_block = n_blocks_in_sequence * secs_per_split  # end of last block of sequence
        for time in np.arange(0, to_sec_block, secs_per_split):
            # get index of frame corresponding to begin and end of block
            index_first_frame_block = librosa.time_to_frames(time, sr=sample_rate, hop_length=hop_size)
            index_last_frame_block = librosa.time_to_frames(time + 2.5, sr=sample_rate, hop_length=hop_size)
            # split: returns list of 3 arrays (mfcc array split at index_first_frame_block and index_last_frame_block + 1)
            lst_mfcc_split_in_blocks = np.hsplit(mfcc, [index_first_frame_block, index_last_frame_block + 1])
            # part between index_first_frame_block and index_last_frame_block + 1
            mfcc_split_in_blocks = lst_mfcc_split_in_blocks[1]
            # store label and mfcc in dict
            data["label"] = label
            data["mfcc"] = mfcc_split_in_blocks.tolist()
            # save MFCCs to json file
            json_filename_data = str(time) + "-" + str(time + secs_per_split) + filename + ".json"
            path_to_json_data = os.path.join(dirPath_data, json_filename_data)
            with open(path_to_json_data, "w") as fp:
                json.dump(data, fp, indent=4)
Then, when I try to fit my model (see below), I always get the following error:
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type list).
I also get this warning:
C:\Users\emmah\OneDrive - rwth-aachen.de\Dokumente\Uni\RWTH\13_Bachelorarbeit\BA Emma Heyen\06 - Repo\ba-emma-heyen-0\src\train_CNN.py:12: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
X = np.array(data["mfcc"])
But specifying dtype=object does not change anything.
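To illustrate why `dtype=object` only silences the warning, here is a minimal sketch with made-up numbers (the tiny 2-row "MFCC" matrices are assumptions, not data from the question). When the nested lists differ in length, NumPy cannot build a 3-D float array and instead keeps Python lists as elements, which is what the Tensor conversion later rejects.

```python
import numpy as np

# Two made-up "MFCC" matrices as nested lists, the way json.load returns them.
# Both have 2 coefficient rows, but the first has 3 frames and the second has 4.
mfcc_a = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
mfcc_b = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]

# dtype=object hides the warning, but the result is a 2-D array of lists.
ragged = np.array([mfcc_a, mfcc_b], dtype=object)
print(ragged.dtype, ragged.shape)
print(type(ragged[0, 0]))

# With equal frame counts the same data becomes a regular 3-D float array.
regular = np.array([mfcc_a, mfcc_a], dtype=np.float32)
print(regular.dtype, regular.shape)
```

Casting such an object array to np.float32 fails for the same reason, which would be consistent with the float32 suggestion from other posts not helping here.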
This is how I load the data and perform the train/test split:
def load_dataset(data_path):
    list_data_X = []
    list_data_y = []
    files = [f for f in os.listdir(data_path) if os.path.isfile(os.path.join(data_path, f))]
    for f in files:
        path_to_json = os.path.join(data_path, f)
        with open(path_to_json, "r") as fp:
            data = json.load(fp)
        # extract inputs and targets
        X = data["mfcc"]
        y = data["label"]
        list_data_X.append(X)
        list_data_y.append(y)
    X_arr = np.array(list_data_X, dtype=object)
    y_arr = np.array(list_data_y, dtype=object)
    return X_arr, y_arr

def get_data_splits(data_path, test_size=0.1, test_validation=0.1):  # train_size = 0.9, validation = 0.9 * 0.1 = 0.09 of all data
    # load dataset
    X, y = load_dataset_2(data_path)
    # create train/validation/test splits
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size)
    X_train, X_validation, y_train, y_validation = train_test_split(X_train, y_train, test_size=test_validation)
    # convert inputs from 2d to 3d arrays because I'm using a CNN
    X_train = X_train[..., np.newaxis]
    X_validation = X_validation[..., np.newaxis]
    X_test = X_test[..., np.newaxis]
    return X_train, X_validation, X_test, y_train, y_validation, y_test
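One way to surface the underlying problem at load time is to check the shape of every sample before building the array. The following is a sketch only; `summarize_shapes` and `stack_or_fail` are hypothetical helper names, and it assumes each individual sample is itself rectangular.

```python
from collections import Counter

import numpy as np


def summarize_shapes(samples):
    """Count how often each (n_rows, n_frames) shape occurs among the samples."""
    return Counter(np.asarray(s, dtype=np.float32).shape for s in samples)


def stack_or_fail(samples):
    """Stack samples into one float32 array, or raise if their shapes differ."""
    shapes = summarize_shapes(samples)
    if len(shapes) != 1:
        raise ValueError(f"MFCC shapes differ between samples: {dict(shapes)}")
    return np.stack([np.asarray(s, dtype=np.float32) for s in samples])
```

Calling `stack_or_fail(list_data_X)` in place of `np.array(list_data_X, dtype=object)` would either return a numeric array of shape (n_samples, n_rows, n_frames) or report which shapes occur and how often.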
Then I build my model like this:
X_train, X_validation, X_test, y_train, y_validation, y_test = train_CNN.get_data_splits(DATA_PATH)
# build CNN model
input_shape = (X_train.shape[0], X_train.shape[1], X_train.shape[2])
model = train_CNN.build_model(input_shape, learning_rate=LEARNING_RATE, num_keywords=NUM_KEYWORDS)
# train model
model.fit(X_train, y_train, epochs=EPOCHS, batch_size=BATCH_SIZE, validation_data=(X_validation, y_validation))
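A side note on `input_shape`, offered as an assumption because `build_model` is not shown: Keras layer input shapes normally describe a single sample and leave out the batch axis, so index 0 of `X_train.shape` (the number of samples) would not belong in it. A small NumPy-only sketch with made-up sizes:

```python
import numpy as np

# Stand-in for the training data: 8 samples, 13 MFCC rows, 215 frames (made-up sizes),
# with the trailing channel axis added as in get_data_splits.
X_train = np.zeros((8, 13, 215), dtype=np.float32)[..., np.newaxis]

# Per-sample shape: everything except the leading batch axis.
input_shape = X_train.shape[1:]
print(input_shape)
```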
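I also tried storing all MFCCs in a single JSON file by appending the MFCCs of every segment to one list, but I get the same error when I try to train the CNN.
I found many posts about the same or a similar error that were solved by converting the array to np.float32, but that does not help here.
Does anyone know what might help? Thanks in advance!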
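Answer 1: It turned out that in the first JSON file of every .wav file (the block from 0 to 2.5 s), the MFCC vectors are one element shorter than in all the other JSON files.
I still don't know why this happens, but I believe it is the reason for the error above.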
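A plausible explanation, stated as an assumption because the sample rate and hop length are not given: converting block boundaries from seconds to frame indices rounds down, so the number of frames between two consecutive boundaries can differ by one. A minimal sketch that avoids this by computing the block width in frames once and slicing with it (`split_into_blocks` and its parameters are hypothetical names, not part of the original code):

```python
import numpy as np


def split_into_blocks(mfcc, sample_rate, hop_length, secs_per_block=2.5):
    """Cut an (n_mfcc, n_frames) array into blocks of identical width."""
    # Frames per block, computed once so every block has the same width.
    frames_per_block = int(secs_per_block * sample_rate) // hop_length
    # Drop the incomplete remainder at the end.
    n_blocks = mfcc.shape[1] // frames_per_block
    return [
        mfcc[:, i * frames_per_block:(i + 1) * frames_per_block]
        for i in range(n_blocks)
    ]
```

The trade-off is that block starts no longer fall exactly on multiples of 2.5 s; each boundary can drift by a fraction of a frame per block, which may or may not matter for the labels.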
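Comments:
As currently written, your answer is unclear. Please edit it to add details that help others understand how it addresses the question. You can find more information on how to write good answers in the help center.
This kind of TensorFlow error usually means the input is not a proper numeric array but contains lists or arrays of differing sizes. Ragged arrays like that are not valid TensorFlow input.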
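If the blocks already on disk cannot easily be regenerated, one common workaround for raggedness of this kind is to pad or truncate every sample to a fixed number of frames before stacking. This is a sketch under that assumption; `pad_or_truncate` is a hypothetical helper, and zero-padding is just one possible choice.

```python
import numpy as np


def pad_or_truncate(mfcc, n_frames):
    """Return a float32 (n_rows, n_frames) array, zero-padded or cut on the frame axis."""
    mfcc = np.asarray(mfcc, dtype=np.float32)
    if mfcc.shape[1] >= n_frames:
        return mfcc[:, :n_frames]
    missing = n_frames - mfcc.shape[1]
    # Pad only at the end of the frame axis; rows stay untouched.
    return np.pad(mfcc, ((0, 0), (0, missing)), mode="constant")
```

Applying it to every loaded sample with the same `n_frames` and then calling `np.stack` gives a regular numeric array that can be converted to a Tensor.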