When attempting to merge multiple dataframes, how to resolve "ValueError: If using all scalar values, you must pass an index"
Posted: 2019-07-06 02:33:12
Question: I am trying to fetch and store all historical 1-minute candle data from the Bitfinex exchange. When I try to append a new dataframe to an existing one, I get the error "ValueError: If using all scalar values, you must pass an index", even though I pass an index in the constructor.
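I tried the solution from here, which is to pass an index to the DataFrame constructor: Constructing pandas DataFrame from values in variables gives "ValueError: If using all scalar values, you must pass an index". It is probably something simple, but I have had no luck so far.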
# Example: https://api-pub.bitfinex.com/v2/candles/trade:1m:tBTCUSD/hist?limit=100&start=1549086300000&end=1549174500000
# Params: timeframe, ticker, number of candles, MS start, MS end
# Note: parameter "end" seems to be unnecessary.
# JSON: [[MTS, OPEN, CLOSE, HIGH, LOW, VOLUME],]
import json
import time
import datetime
import requests
import pandas as pd
url = 'https://api-pub.bitfinex.com/v2/'
# Return dataframe of all historical 1m candles
def get_candles_all(symbol):
    limit = 5000
    tf = '1m'
    targettime = (time.time() - 120) * 1000
    start = get_genesis_timestamp(symbol)
    df = get_candles_period(tf, symbol, limit, start)
    while df.index[-1] <= targettime:
        start = df.index[-1]  # reset start to last timestamp
        newdata = pd.DataFrame(get_candles_period(tf, symbol, limit, start), index=[0])
        result = df.append(newdata)
        df = result
    return df
# Return timestamp-indexed dataframe of requested timeframe candles
def get_candles_period(tf, symbol, limit, start):
    response = requests.get(url + "candles/trade:" + tf + ':t' + symbol + '/hist?limit=' + str(limit) + '&start=' + str(start) + '&sort=1').json()
    df = pd.DataFrame(response)
    # Column order follows the JSON layout noted at the top: MTS, OPEN, CLOSE, HIGH, LOW, VOLUME
    df.columns = ["MS", "Open", "Close", "High", "Low", "Vol"]
    df.set_index("MS", inplace=True)
    return df
# Return timestamp of first available 1 min candle of given asset
def get_genesis_timestamp(symbol):
    response = requests.get(url + "candles/trade:1m:t" + symbol + '/hist?limit=1&sort=1').json()
    df = pd.DataFrame(response)
    # Column order follows the JSON layout noted at the top: MTS, OPEN, CLOSE, HIGH, LOW, VOLUME
    df.columns = ["MS", "Open", "Close", "High", "Low", "Vol"]
    df.set_index("MS", inplace=True)
    timestamp = df.index[0]
    return timestamp
symbol = "ETHUSD"
get_candles_all(symbol)
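I expect get_candles_all() to iteratively append "newdata" to "df" until the final index (timestamp) of df is within 2 minutes of the target time.
I keep getting the "ValueError: If using all scalar values, you must pass an index" error, despite various attempts to use non-scalar values or to pass an index.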
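For reference, the error can be reproduced without the network. pandas raises it when the DataFrame constructor receives a dict whose values are all scalars, because it then has no way to tell how many rows to create. The payload below is hypothetical (the question does not show what the API actually returned when the error occurred); it only illustrates the shape of input that triggers the message.

```python
import pandas as pd

# Hypothetical payload: a JSON object holding only scalar values,
# for example an error object instead of a list of candle rows.
payload = {"error": "ERR_RATE_LIMIT"}

try:
    pd.DataFrame(payload)
except ValueError as exc:
    message = str(exc)
    print(message)

# Either of these gives pandas a row axis to work with.
one_row = pd.DataFrame(payload, index=[0])
also_one_row = pd.DataFrame([payload])
```

If this is what happens in the question, the exception would come from pd.DataFrame(response) inside get_candles_period, not from the append step, which would explain why adding index=[0] in get_candles_all has no effect.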
Answer 1:
df.set_index(["MS"], inplace=True)
or
df = pd.DataFrame(response, index=[value])
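A different approach, offered here as a sketch and not as the original answerer's method, is to check the response before building a frame and to collect pages with pd.concat. The names candles_to_frame, collect_candles and fetch are hypothetical helpers introduced for illustration. The sketch assumes a successful response is a list of six-element rows in the order given in the question's JSON comment, and that anything else should be reported as an error.

```python
import pandas as pd

COLUMNS = ["MS", "Open", "Close", "High", "Low", "Vol"]


def candles_to_frame(payload):
    """Build a timestamp-indexed frame, rejecting anything that is not a list of rows."""
    if not isinstance(payload, list) or not all(isinstance(row, list) for row in payload):
        raise RuntimeError(f"unexpected API response: {payload!r}")
    return pd.DataFrame(payload, columns=COLUMNS).set_index("MS")


def collect_candles(fetch, start, target_ms):
    """Page through candles with fetch(start) until the last timestamp passes target_ms."""
    frames = [candles_to_frame(fetch(start))]
    while not frames[-1].empty and frames[-1].index[-1] <= target_ms:
        start = frames[-1].index[-1]
        page = candles_to_frame(fetch(start))
        # Drop the row that repeats the previous page's last timestamp.
        page = page[page.index > start]
        if page.empty:  # no newer candles yet, so stop instead of looping forever
            break
        frames.append(page)
    return pd.concat(frames)
```

With the question's variables, fetch could be a small function that calls requests.get on the same candles URL and returns .json(). Two design notes: DataFrame.append was removed in pandas 2.0, so pd.concat is used instead, and the pages are passed to pd.concat directly because wrapping an existing DataFrame in pd.DataFrame(..., index=[0]) reindexes it to the single label 0 instead of keeping its rows.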