ValueError: object of too small depth for desired array
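【Title】ValueError: object of too small depth for desired array
【Posted】2020-05-03 07:53:02
【Question】The code below worked when I ran it yesterday, but today the same code raises this error. I thought the problem came from changes I had made to my data, but the error persists even when I use the old data. (I'm not sure whether it is related to the shape of the data, but I'm showing the shapes anyway.) Can anyone help?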
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.2, random_state = 0)
print("Shape of x_train :", x_train.shape)
print("Shape of x_test :", x_test.shape)
print("Shape of y_train :", y_train.shape)
print("Shape of y_test :", y_test.shape)
Shape of x_train : (257763, 96)
Shape of x_test : (64441, 96)
Shape of y_train : (257763,)
Shape of y_test : (64441,)
from imblearn.ensemble import BalancedRandomForestClassifier
model = BalancedRandomForestClassifier(n_estimators = 200, random_state = 0, max_depth=6)
model.fit(x_train, y_train)
Here is the full error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-9-7698c432c37d> in <module>
7
8 model = BalancedRandomForestClassifier(n_estimators = 200, random_state =
0, max_depth=6)
----> 9 model.fit(x_train, y_train)
10 y_pred_rf = model.predict(x_test)
11
/opt/anaconda/envs/env_python/lib/python3.6/site-
packages/imblearn/ensemble/_forest.py in fit(self, X, y, sample_weight)
433 s, t, self, X, y, sample_weight, i,
len(trees),
434 verbose=self.verbose,
class_weight=self.class_weight)
--> 435 for i, (s, t) in enumerate(zip(samplers,
trees)))
436 samplers, trees = zip(*samplers_trees)
437
/opt/anaconda/envs/env_python/lib/python3.6/site-
packages/joblib/parallel.py
in __call__(self, iterable)
919 # remaining jobs.
920 self._iterating = False
--> 921 if self.dispatch_one_batch(iterator):
922 self._iterating = self._original_iterator is not None
923
/opt/anaconda/envs/env_python/lib/python3.6/site-
packages/joblib/parallel.py in dispatch_one_batch(self, iterator)
757 return False
758 else:
--> 759 self._dispatch(tasks)
760 return True
761
/opt/anaconda/envs/env_python/lib/python3.6/site-
packages/joblib/parallel.py in _dispatch(self, batch)
714 with self._lock:
715 job_idx = len(self._jobs)
--> 716 job = self._backend.apply_async(batch, callback=cb)
717 # A job can complete so quickly than its callback is
718 # called before we get here, causing self._jobs to
/opt/anaconda/envs/env_python/lib/python3.6/site-
packages/joblib/_parallel_backends.py in apply_async(self, func,
callback)
180 def apply_async(self, func, callback=None):
181 """Schedule a func to be run"""
--> 182 result = ImmediateResult(func)
183 if callback:
184 callback(result)
/opt/anaconda/envs/env_python/lib/python3.6/site-
packages/joblib/_parallel_backends.py in __init__(self, batch)
547 # Don't delay the application, to avoid keeping the input
548 # arguments in memory
--> 549 self.results = batch()
550
551 def get(self):
/opt/anaconda/envs/env_python/lib/python3.6/site-
packages/joblib/parallel.py in __call__(self)
223 with parallel_backend(self._backend, n_jobs=self._n_jobs):
224 return [func(*args, **kwargs)
--> 225 for func, args, kwargs in self.items]
226
227 def __len__(self):
/opt/anaconda/envs/env_python/lib/python3.6/site-
packages/joblib/parallel.py in <listcomp>(.0)
223 with parallel_backend(self._backend, n_jobs=self._n_jobs):
224 return [func(*args, **kwargs)
--> 225 for func, args, kwargs in self.items]
226
227 def __len__(self):
/opt/anaconda/envs/env_python/lib/python3.6/site-
packages/imblearn/ensemble/_forest.py in
_local_parallel_build_trees(sampler, tree, forest, X, y, sample_weight,
tree_idx, n_trees, verbose, class_weight)
43 tree = _parallel_build_trees(tree, forest, X_resampled,
y_resampled,
44 sample_weight, tree_idx, n_trees,
---> 45 verbose=verbose,
class_weight=class_weight)
46 return sampler, tree
47
/opt/anaconda/envs/env_python/lib/python3.6/site-
packages/sklearn/ensemble/_forest.py in _parallel_build_trees(tree,
forest, X, y, sample_weight, tree_idx, n_trees, verbose, class_weight,
n_samples_bootstrap)
153 indices = _generate_sample_indices(tree.random_state,
n_samples,
154 n_samples_bootstrap)
--> 155 sample_counts = np.bincount(indices, minlength=n_samples)
156 curr_sample_weight *= sample_counts
157
<__array_function__ internals> in bincount(*args, **kwargs)
ValueError: object of too small depth for desired array
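【Comments】:
What error are you getting?
I've added the full error text.
The code looks fine to me. A ValueError means a function received a value it cannot work with. Your x or y seems to be corrupted, so you should check your input data. Check whether the same error also occurs when you try a different algorithm (for example sklearn's RandomForest). Make sure the dimensions and data types match what the function's documentation expects.
@MertTürkyılmaz Were you able to find a solution? In my case other algorithms don't raise an error. I only get the error with BalancedRandomForestClassifier.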
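The input check suggested in the comments can be sketched as follows. This is a minimal illustration, not part of the original thread: `describe_array`, `x_demo` and `y_demo` are hypothetical names, and the small arrays stand in for the real `x_train` and `y_train`.

```python
import numpy as np

def describe_array(name, arr):
    """Return a one-line summary of dimensions, dtype and missing values."""
    a = np.asarray(arr)
    # np.isnan only works on floating-point data, so guard on the dtype.
    if np.issubdtype(a.dtype, np.floating):
        n_missing = int(np.isnan(a).sum())
    else:
        n_missing = 0
    return f"{name}: ndim={a.ndim}, shape={a.shape}, dtype={a.dtype}, NaNs={n_missing}"

# Stand-in data (hypothetical); replace with x_train / y_train.
x_demo = np.zeros((5, 3))
y_demo = np.array([0, 1, 0, 1, 1])

print(describe_array("x", x_demo))
print(describe_array("y", y_demo))
```

For a classifier's `fit(X, y)` one would normally expect `X` to report `ndim=2` and `y` to report `ndim=1`, which matches the shapes printed in the question.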
【Answer 1】:
According to the traceback, the error is raised by bincount. This reproduces it:
In [13]: np.bincount(0)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-13-65825aeaf27a> in <module>
----> 1 np.bincount(0)
<__array_function__ internals> in bincount(*args, **kwargs)
ValueError: object of too small depth for desired array
In [14]: np.bincount(np.arange(5))
Out[14]: array([1, 1, 1, 1, 1])
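bincount works on 1-D arrays; given a scalar, it raises this error.
Now work back through the traceback to find which variable in the code is a scalar when it should be an array.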
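One candidate can be read off the traceback: the imblearn frame calls `_parallel_build_trees` without `n_samples_bootstrap`, and the sklearn frame forwards that value to `_generate_sample_indices`. The sketch below is only an illustration under two assumptions that the thread does not confirm: that the missing argument falls back to `None`, and that the index helper passes it to `randint` as the size. It uses plain NumPy calls to show how a size of `None` would produce a scalar that `bincount` rejects.

```python
import numpy as np

rng = np.random.RandomState(0)
n_samples = 10

# With an integer size, randint returns a 1-D array that bincount accepts.
indices_ok = rng.randint(0, n_samples, 10)
counts = np.bincount(indices_ok, minlength=n_samples)

# With size=None, randint returns a single integer (a scalar, ndim 0).
indices_bad = rng.randint(0, n_samples, None)

try:
    np.bincount(indices_bad, minlength=n_samples)
    scalar_rejected = False
except ValueError as exc:
    scalar_rejected = True
    print("bincount rejected the scalar:", exc)
```

If that is what happens here, the input data would not be at fault, which would fit the observation that the old data fails as well.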
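【Comments】:
【Answer 2】: One workaround is to install a newer version of Python for your Jupyter notebook (installing 3.7.4 worked for me). With older Python versions the error persists.
I had the same problem. The Jupyter notebook installed on my computer runs Python 3.7.4, and BalancedRandomForestClassifier works fine there. But when I ran the same code on an older version, Python 3.6, I hit the same failure described above.
The features I created (BoW) are also a 2-D array: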
array([[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
...,
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0]])
[Screenshot: Jupyter notebook on my machine]
[Screenshot: Jupyter notebook on my Google Colab]
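When comparing a working and a failing environment like this, it may help to print the package versions alongside the Python version, since the two environments may differ in more than the interpreter. The following is a minimal sketch; `installed_version` and `environment_report` are hypothetical helper names, and the module list is only a guess at what is relevant here.

```python
import importlib
import platform

def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        # A missing or broken install can fail in several ways at import time.
        return None
    return getattr(module, "__version__", "unknown")

def environment_report(modules=("numpy", "sklearn", "imblearn", "joblib")):
    """Collect the Python version and the versions of the given modules."""
    report = {"python": platform.python_version()}
    for name in modules:
        report[name] = installed_version(name)
    return report

for name, version in environment_report().items():
    print(f"{name}: {version}")
```

Running this in both notebooks and comparing the output line by line shows which components actually differ between them.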
【Comments】: