MinMaxScaler is showing weird output on any NumPy array

【Posted】: 2021-04-07 04:23:26 【Question】:

I have a NumPy array, which you can get with the following lines of code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import requests
from sklearn.preprocessing import MinMaxScaler
url = "https://query1.finance.yahoo.com/v7/finance/download/RELIANCE.BO?period1=1577110559&period2=1608732959&interval=1d&events=history&includeAdjustedClose=true"
r = requests.get(url)
with open('RELIANCE.csv', 'wb') as f:  # download the stock data and save it as a CSV
    f.write(r.content)
r = pd.read_csv('RELIANCE.csv', parse_dates=['Date'])  # read the dataset
r.head(1)  # view a sample of the dataset

I want to use MinMaxScaler to convert it to numbers between 0 and 1, so I wrote the following code:

rc = r['Close']  # select only the Close column
rc = np.array(rc)  # convert it into a np.array
def remove_nan(ac):  # the dataset has a NaN value, so I wrote this function to remove it
    array1 = np.array(ac)
    nan_array = np.isnan(array1)
    not_nan_array = ~nan_array
    ac = array1[not_nan_array]
    return ac
rc = remove_nan(rc)
rc = rc.reshape(1, -1)  # reshape the data for conversion, as the function asks
scale = MinMaxScaler()  # initialize the scaler
rc = scale.fit_transform(rc)  # transform the data
print(rc)  # see the result

But when I run the code, I get an np.array that contains only zeros:

[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
  0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

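【Comments】:

Please include the expected output data and the input data if you can. Here is how to create a minimal reproducible example (MRE): ***.com/questions/20109391/…

The dataset is a CSV file; how can I include it here? You can download it with the code, though, as I did.

【Answer 1】: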

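MinMaxScaler expects the first dimension to index individual samples. That is, in your case the shape must be (NUM_SAMPLES, 1), but you reshaped it to (1, NUM_SAMPLES). With a single sample, each feature's minimum equals its maximum, so every value is scaled to 0. Try the following code: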

rc = rc.reshape(-1, 1)
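
To make the difference between the two shapes concrete, here is a minimal sketch. The `prices` array is made up for illustration and stands in for the cleaned 'Close' column; the names `as_row` and `as_column` are likewise hypothetical.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical closing prices standing in for the cleaned 'Close' column.
prices = np.array([10.0, 12.0, 15.0, 20.0])

# Shape (1, 4): one sample with four features.
# Each column holds a single value, so its min equals its max
# and the scaler maps every entry to 0.
as_row = MinMaxScaler().fit_transform(prices.reshape(1, -1))

# Shape (4, 1): four samples of one feature.
# The column's min and max now span the whole series.
as_column = MinMaxScaler().fit_transform(prices.reshape(-1, 1))

print(as_row.shape, as_row)         # a single row of zeros
print(as_column.shape, as_column.ravel())  # values between 0 and 1
```

If a flat array is needed afterwards, `as_column.ravel()` turns the (NUM_SAMPLES, 1) result back into one dimension.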
