


【Title】: Iterate Through Nested Dictionary to Create Dataframe and Add New Column Value 【Posted】: 2021-05-04 23:25:21 【Question】:

Python newbie here, so please bear with me.


    json = [
        {'Meta Data': {'1. Information': 'Monthly Prices (open, high, low, close) and Volumes',
                       '2. Symbol': 'AAPL', '3. Last Refreshed': '2021-01-29',
                       '4. Time Zone': 'US/Eastern'},
         'Monthly Time Series': {
             '2021-01-29': {'1. open': '133.5200', '2. high': '145.0900', '3. low': '126.3820',
                            '4. close': '131.9600', '5. volume': '2239366098'},
             '2020-12-31': {'1. open': '121.0100', '2. high': '138.7890', '3. low': '120.0100',
                            '4. close': '132.6900', '5. volume': '2319687808'}}},
        {'Meta Data': {'1. Information': 'Monthly Prices (open, high, low, close) and Volumes',
                       '2. Symbol': 'ZM', '3. Last Refreshed': '2021-01-29',
                       '4. Time Zone': 'US/Eastern'},
         'Monthly Time Series': {
             '2021-01-29': {'1. open': '340.4000', '2. high': '404.4400', '3. low': '331.1000',
                            '4. close': '372.0700', '5. volume': '121350349'},
             '2020-12-31': {'1. open': '434.7200', '2. high': '434.9900', '3. low': '336.1000',
                            '4. close': '337.3200', '5. volume': '150168985'}}},
    ]


    import pandas as pd
    df = [pd.DataFrame.from_dict(i['Monthly Time Series'], orient='index').sort_index(axis=1) for i in json]


    [             1. open   2. high    3. low  4. close   5. volume
    2021-01-29  133.5200  145.0900  126.3820  131.9600  2239366098
    2020-12-31  121.0100  138.7890  120.0100  132.6900  2319687808
                  1. open   2. high    3. low  4. close  5. volume
    2021-01-29  340.4000  404.4400  331.1000  372.0700  121350349
    2020-12-31  434.7200  434.9900  336.1000  337.3200  150168985]

What I want is to extract the value of '2. Symbol' from json and append the corresponding ticker to the matching dataframe, like this:

    [             1. open   2. high    3. low  4. close   5. volume  ticker
    2021-01-29  133.5200  145.0900  126.3820  131.9600  2239366098  AAPL
    2020-12-31  121.0100  138.7890  120.0100  132.6900  2319687808  AAPL
                  1. open   2. high    3. low  4. close  5. volume  ticker
    2021-01-29  340.4000  404.4400  331.1000  372.0700  121350349  ZM
    2020-12-31  434.7200  434.9900  336.1000  337.3200  150168985  ZM]



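First of all, json is not a dictionary. Please confirm its type before proceeding.

Thanks. What should I do, then?

It would help if you could confirm the type.

【Answer 1】: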



    df = [pd.DataFrame.from_dict(i['Monthly Time Series'], orient='index').sort_index(axis=1).assign(ticker=i['Meta Data']['2.Symbol']) for i in json]

The json data:

    json = [
        {'Meta Data': {'1. Information': 'Monthly Prices (open, high, low, close) and Volumes',
                       '2.Symbol': 'AAPL', '3. Last Refreshed': '2021-01-29',
                       '4. Time Zone': 'US/Eastern'},
         'Monthly Time Series': {
             '2020-01-29': {'1. open': '133.5200', '2. high': '145.0900', '3. low': '126.3820',
                            '4. close': '131.9600', '5. volume': '2239366098'},
             '2020-01-30': {'1. open': '121.0100', '2. high': '138.7890', '3. low': '120.0100',
                            '4. close': '132.6900', '5. volume': '2319687808'}}},
        {'Meta Data': {'1. Information': 'Monthly Prices (open, high, low, close) and Volumes',
                       '2.Symbol': 'ZM', '3. Last Refreshed': '2021-01-01',
                       '4. Time Zone': 'US/Eastern'},
         'Monthly Time Series': {
             '2020-02-02': {'1. open': '133.5200', '2. high': '145.0900', '3. low': '126.3820',
                            '4. close': '131.9600', '5. volume': '2239366098'},
             '2020-02-31': {'1. open': '121.0100', '2. high': '138.7890', '3. low': '120.0100',
                            '4. close': '132.6900', '5. volume': '2319687808'}}},
    ]

Use assign to add the new column:

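    # Build one dataframe per entry, indexed by date, with columns sorted by name
    addTimeSeries = lambda i: pd.DataFrame.from_dict(i['Monthly Time Series'], orient='index').sort_index(axis=1)

    # Attach the entry's ticker symbol as a constant column
    addVal = lambda i: addTimeSeries(i).assign(ticker=i['Meta Data']['2.Symbol'])
    df = [addVal(i) for i in json]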


[             1. open   2. high    3. low  4. close   5. volume ticker
 2020-01-29  133.5200  145.0900  126.3820  131.9600  2239366098   AAPL
 2020-01-30  121.0100  138.7890  120.0100  132.6900  2319687808   AAPL,
              1. open   2. high    3. low  4. close   5. volume ticker
 2020-02-02  133.5200  145.0900  126.3820  131.9600  2239366098     ZM
 2020-02-31  121.0100  138.7890  120.0100  132.6900  2319687808     ZM]
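
One detail worth noting: the sample data in this answer spells the key '2.Symbol' (no space), while the data in the question spells it '2. Symbol' (with a space). Looking up the wrong spelling raises a KeyError. The following is a minimal sketch of the same idea written as an explicit loop and using the question's spelling. The names `records` and `frames` and the shortened sample values are assumptions made for illustration, not part of the original post.

```python
import pandas as pd

# Hypothetical sample shaped like the question's data (columns shortened).
records = [
    {
        'Meta Data': {'2. Symbol': 'AAPL'},
        'Monthly Time Series': {
            '2021-01-29': {'1. open': '133.5200', '4. close': '131.9600'},
            '2020-12-31': {'1. open': '121.0100', '4. close': '132.6900'},
        },
    },
    {
        'Meta Data': {'2. Symbol': 'ZM'},
        'Monthly Time Series': {
            '2021-01-29': {'1. open': '340.4000', '4. close': '372.0700'},
            '2020-12-31': {'1. open': '434.7200', '4. close': '337.3200'},
        },
    },
]

frames = []
for entry in records:
    # One row per date, one column per price field.
    frame = pd.DataFrame.from_dict(entry['Monthly Time Series'], orient='index')
    frame = frame.sort_index(axis=1)
    # Note the space in '2. Symbol', matching the question's data.
    frame['ticker'] = entry['Meta Data']['2. Symbol']
    frames.append(frame)
```

Assigning a scalar to `frame['ticker']` broadcasts it to every row, which has the same effect as the `assign` call in the answer.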


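Thanks. Unfortunately it still isn't working. I now get the response: "TypeError: string indices must be integers"

I'm not sure how you got that error, because I don't get any such error when I run the code above.

Updated the answer to do it in one line without using lambda.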
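
For context on the error in the comments: Python raises "TypeError: string indices must be integers" when a string is indexed with a string key, as in `'abc'['Meta Data']`. One possible cause, which the thread does not confirm, is that the elements being iterated are still JSON text rather than dicts (iterating over a single dict, which yields its string keys, would produce the same error). The sketch below assumes the first case and parses each element before building the dataframes. The names `raw`, `to_dict`, `parsed`, and `frames` are hypothetical, and the standard-library module is imported under an alias because the thread uses `json` as a variable name.

```python
import json as json_lib  # aliased: the thread already uses `json` as a variable
import pandas as pd

# Hypothetical input: each element is a JSON string, not a dict.
raw = [
    '{"Meta Data": {"2. Symbol": "AAPL"}, '
    '"Monthly Time Series": {"2021-01-29": {"1. open": "133.5200"}}}',
    '{"Meta Data": {"2. Symbol": "ZM"}, '
    '"Monthly Time Series": {"2021-01-29": {"1. open": "340.4000"}}}',
]

def to_dict(item):
    """Return item as a dict, parsing it first if it is a JSON string."""
    return json_lib.loads(item) if isinstance(item, str) else item

# Checking type(item) here is the quickest way to confirm what is being iterated.
parsed = [to_dict(item) for item in raw]

frames = [
    pd.DataFrame.from_dict(entry['Monthly Time Series'], orient='index')
      .sort_index(axis=1)
      .assign(ticker=entry['Meta Data']['2. Symbol'])
    for entry in parsed
]
```

If the elements are already dicts, `to_dict` returns them unchanged, so the same comprehension works either way.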


