InfluxDB and pandas errors in Python

[Posted]: 2018-12-24 06:36:19

[Question]:

I am following the instructions for reading data from InfluxDB into pandas, and I get the following error:

ValueError                                Traceback (most recent call last)
<ipython-input-13-1e63a2e6d3db> in <module>()
----> 1 df = pd.DataFrame(AandCStation)
      2 
      3 #AandCStation['time'] # gets the name
      4 
      5 #AandCStation.values

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    328                                  dtype=dtype, copy=copy)
    329         elif isinstance(data, dict):
--> 330             mgr = self._init_dict(data, index, columns, dtype=dtype)
    331         elif isinstance(data, ma.MaskedArray):
    332             import numpy.ma.mrecords as mrecords

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _init_dict(self, data, index, columns, dtype)
    459             arrays = [data[k] for k in keys]
    460 
--> 461         return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    462 
    463     def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
   6161     # figure out the index, if necessary
   6162     if index is None:
-> 6163         index = extract_index(arrays)
   6164     else:
   6165         index = _ensure_index(index)

C:\Users\user\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in extract_index(data)
   6200 
   6201         if not indexes and not raw_lengths:
-> 6202             raise ValueError('If using all scalar values, you must pass'
   6203                              ' an index')
   6204 

ValueError: If using all scalar values, you must pass an index

Read DataFrame defaultdict(<class 'list'>, 'NoT/machinename':         MachineName  MachineType SensorWorking  \

This is the code I am running:

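import pandas as pd
from influxdb import DataFrameClient

# host, port, user, password and dbname are assumed to be defined earlier
client = DataFrameClient(host, port, user, password, dbname)

print("Read DataFrame")
AandCStation = client.query("""SELECT * FROM "NoT/machinename" WHERE time >= now() - 12h""")
print(AandCStation)

print(type(AandCStation))

# This is the line that raises the ValueError
df = pd.DataFrame(AandCStation)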

This is the data:

Read DataFrame
defaultdict(<class 'list'>, 'NoT/sensor':                                       MachineName  MachineType SensorWorking  \
2018-07-16 04:11:19.912895848+00:00  Quench tank          Yes   
2018-07-16 04:11:22.961838564+00:00  Quench tank          Yes   
2018-07-16 04:11:25.872680626+00:00  Quench tank          Yes   
2018-07-16 04:11:28.850205591+00:00  Quench tank          Yes   
...                                           ...          ...           ...   
2018-07-16 16:08:05.188868516+00:00  Quench tank          Yes   
2018-07-16 16:08:08.169862344+00:00  Quench tank          Yes   
2018-07-16 16:08:11.144413930+00:00  Quench tank          Yes   
2018-07-16 16:08:14.126290232+00:00  Quench tank          Yes   
2018-07-16 16:08:17.107127232+00:00  Quench tank          Yes   
2018-07-16 16:08:20.079248843+00:00  Quench tank          Yes   

                                     TempValue  
2018-07-16 04:09:50.467145647+00:00      32.69  
2018-07-16 04:09:53.888973858+00:00      32.69  
2018-07-16 04:09:55.879811649+00:00      32.69  
2018-07-16 04:09:58.818001127+00:00      32.69  
...                                        ...  
2018-07-16 16:08:05.188868516+00:00      34.19  
2018-07-16 16:08:08.169862344+00:00      34.19  
2018-07-16 16:09:43.209347998+00:00      34.19  
2018-07-16 16:09:46.187872612+00:00      34.19  

[12233 rows x 4 columns])
<class 'collections.defaultdict'>

Any idea why I am getting this error?

[Comments]:

[Answer 1]:

Alternatively, you can simply index the returned dict by the measurement name to get the query result as a DataFrame:

client.query("""SELECT * FROM "NoT/machinename" WHERE time >= now() - 12h""")["NoT/machinename"]
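
To illustrate the idea without a running InfluxDB server, here is a minimal sketch that uses a hand-built stand-in for the query result. The `result` object and its sample values are assumptions made up for this example; only the measurement name and column names come from the question.

```python
from collections import defaultdict

import pandas as pd

# Stand-in for what the question's print() shows: a defaultdict that maps a
# measurement name to a DataFrame. The rows below are invented sample data.
index = pd.to_datetime(
    ["2018-07-16 04:11:19", "2018-07-16 04:11:22"], utc=True
)
frame = pd.DataFrame(
    {
        "MachineName": ["Quench tank", "Quench tank"],
        "TempValue": [32.69, 32.69],
    },
    index=index,
)
result = defaultdict(list)
result["NoT/machinename"] = frame

# Indexing by the measurement name returns the DataFrame itself,
# so there is no need to wrap the whole dict in pd.DataFrame().
df = result["NoT/machinename"]
print(type(df))
print(df)
```

One thing to watch for: because the container is a `defaultdict(list)`, indexing with a misspelled measurement name returns an empty list instead of raising `KeyError`, so checking `name in result` first may be worthwhile.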

[Comments]:

[Answer 2]:

I ran into the same problem today.

It turns out that you get back a dictionary of DataFrames; you can obtain the columns you want with concat and droplevel.

client = DataFrameClient(host, port, user, password, dbname)

print("Read DataFrame")
AandCStation = client.query("""SELECT * FROM "NoT/machinename" WHERE time >= now() - 12h""")
AandCStation = pd.concat(AandCStation, axis=1)
AandCStation.columns = AandCStation.columns.droplevel()

print(AandCStation.head())

print(type(AandCStation))
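
The concat and droplevel steps can be tried in isolation. The sketch below uses a plain dict with invented sample rows as a stand-in for the query result (an assumption, not real InfluxDB output), and shows that `pd.concat` on a dict puts the dict keys in an outer column level, which `droplevel()` then removes.

```python
import pandas as pd

# Invented sample data standing in for the query result.
index = pd.to_datetime(
    ["2018-07-16 04:11:19", "2018-07-16 04:11:22"], utc=True
)
frames = {
    "NoT/machinename": pd.DataFrame(
        {
            "MachineName": ["Quench tank", "Quench tank"],
            "TempValue": [32.69, 32.69],
        },
        index=index,
    )
}

# Concatenating a dict along axis=1 uses the keys as the outer column level.
combined = pd.concat(frames, axis=1)
levels_before = combined.columns.nlevels
print(combined.columns.tolist())

# droplevel() with no argument drops the outermost level (the measurement name).
combined.columns = combined.columns.droplevel()
print(combined.columns.tolist())
```

If the dict holds several measurements that share column names, dropping the outer level leaves duplicate column names, so in that case keeping the two-level columns may be the safer choice.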

Hope this helps!

Sources:

https://github.com/influxdata/influxdb-python/issues/278
Python: How to turn a dictionary of Dataframes into one big dataframe with column names being the key of the previous dict?

[Comments]:

Thanks, that helped! Much appreciated! If you have any other tips on analyzing InfluxDB time series data with pandas, I would be grateful.
