ValueError: Found array with 0 feature(s) (shape=(2698, 0)) while a minimum of 1 is required by MinMaxScaler

Posted: 2022-01-16 03:50:47

Question:

I am trying to preprocess my data with sklearn:

import math
import datetime
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas_datareader import data
import pandas_datareader.data as web

from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import Dense, LSTM


start = datetime.datetime(2011,1,1)
end = datetime.date.today()
df = web.DataReader("1211.HK", "yahoo", start, end)

plt.figure(figsize=(16,8))
plt.title('BYD close price',fontsize=18)
plt.plot(df['Close'])
plt.xlabel('Date',fontsize=18)
plt.ylabel('Close price HK($)',fontsize=18)
plt.show()

data = df.filter(['close'])
dataset = data.values
trainning_data_len =math.ceil(len (dataset)*.8)

scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(dataset)

When I try to inspect scaled_data, I get this error:
ValueError: Found array with 0 feature(s) (shape=(2698, 0)) while a minimum of 1 is required by MinMaxScaler.

I don't know how to solve this. Thanks in advance.

Update: I am running this in JupyterLab 1.2.6. The full error log is:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-146c8eeabe3c> in <module>
      1 scaler = MinMaxScaler()
----> 2 scaled_data = scaler.fit_transform(dataset)

/opt/anaconda3/lib/python3.7/site-packages/sklearn/base.py in fit_transform(self, X, y, **fit_params)
    569         if y is None:
    570             # fit method of arity 1 (unsupervised transformation)
--> 571             return self.fit(X, **fit_params).transform(X)
    572         else:
    573             # fit method of arity 2 (supervised transformation)

/opt/anaconda3/lib/python3.7/site-packages/sklearn/preprocessing/_data.py in fit(self, X, y)
    337         # Reset internal state before fitting
    338         self._reset()
--> 339         return self.partial_fit(X, y)
    340 
    341     def partial_fit(self, X, y=None):

/opt/anaconda3/lib/python3.7/site-packages/sklearn/preprocessing/_data.py in partial_fit(self, X, y)
    371         X = check_array(X,
    372                         estimator=self, dtype=FLOAT_DTYPES,
--> 373                         force_all_finite="allow-nan")
    374 
    375         data_min = np.nanmin(X, axis=0)

/opt/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    592                              " a minimum of %d is required%s."
    593                              % (n_features, array.shape, ensure_min_features,
--> 594                                 context))
    595 
    596     if warn_on_dtype and dtype_orig is not None and array.dtype != dtype_orig:

ValueError: Found array with 0 feature(s) (shape=(2698, 0)) while a minimum of 1 is required by MinMaxScaler.

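Comments:

Please add the full error log.

@HIMANSHUKAWALE Yes, I have updated the question with the error log, please take a look.

Answer 1:

Your DataFrame has these columns: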

Index(['High', 'Low', 'Open', 'Close', 'Volume', 'Adj Close'], dtype='object')

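So it should be df.filter(['Close']) instead of df.filter(['close']). Column labels are case-sensitive, and filter silently ignores labels that do not exist, so the lowercase name returned a DataFrame with all 2698 rows but zero columns. That is the (2698, 0) array MinMaxScaler rejected.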
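To see why the lowercase label fails without any warning, here is a minimal sketch on a small made-up frame. The prices and the two-column layout are assumptions for illustration only, not the actual Yahoo data.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded data: made-up prices and volumes.
df = pd.DataFrame(
    {"Close": [10.0, 11.0, 12.0], "Volume": [100, 200, 300]}
)

wrong = df.filter(["close"])  # label does not exist: silently dropped
right = df.filter(["Close"])  # exact, case-sensitive match

print(wrong.shape)  # rows are kept, but there are no columns
print(right.shape)  # one column, as intended
```

Checking .shape (or df.columns) right after the selection is a quick way to catch this kind of typo before the data reaches the scaler.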

# Select the column by its exact, case-sensitive label
data = df.filter(['Close'])
dataset = data.values
trainning_data_len = math.ceil(len(dataset) * .8)

scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(dataset)

scaled_data[:5]
array([[0.09673202],
       [0.10424837],
       [0.10441177],
       [0.10571895],
       [0.10571895]])
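As an alternative to filter, double-bracket selection raises a KeyError for a misspelled label instead of returning an empty frame, so the mistake surfaces at the selection step rather than inside MinMaxScaler. The sketch below assumes a small made-up frame, and the flag name missing_detected is hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Made-up prices standing in for the real 'Close' column.
df = pd.DataFrame({"Close": [10.0, 12.5, 15.0, 20.0]})

# Bracket selection fails loudly when the label is misspelled.
try:
    df[["close"]]
    missing_detected = False
except KeyError:
    missing_detected = True

# Double brackets keep the result 2-D, which is what MinMaxScaler expects.
dataset = df[["Close"]].values  # shape (4, 1)

scaler = MinMaxScaler()
scaled_data = scaler.fit_transform(dataset)  # (x - min) / (max - min) per column

print(missing_detected)
print(scaled_data.ravel())
```

Whether to prefer filter or bracket selection is a design choice: filter is convenient when some columns may legitimately be absent, while brackets are safer when every listed column is required.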
