Plotting multiple lines with a Nested Dictionary, and unknown variables to Line Graph

Posted 2023-02-15

[Title]: Plotting multiple lines with a Nested Dictionary, and unknown variables to Line Graph [Asked]: 2022-01-19 03:09:41 [Question]:

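I was able to find some answers to my question, but the dictionaries in them were not nested the way mine is, so I'm not sure how to proceed, as I'm still very new to Python. I currently have a nested dictionary such as: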

{'140.10': {'46': {'1': '-49.50918', '2': '-50.223637', '3': '49.824406'}, '28': {'1': '-49.50918', '2': '-50.223637', '3': '49.824406'}}}

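I would like to plot it so that '140.10' becomes the title of the graph and '46' and '28' become the individual lines, with the keys (for example '1') on the y-axis and the final numbers (in this case '-49.50918') on the x-axis. Essentially a graph like this: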

I generated this graph in Excel from a CSV file that is written in another part of the code:

[![enter image description here][2]][2]

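The problem I'm running into is that these keys are generated automatically from a larger CSV file in an earlier part of the script, so I won't know their exact values until the code runs. I will be running it on various files, each with different values for the graph name and the keys below it: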

{key1: {key2_1: {key3_1: value1, key3_2: value2, key3_3: value3}, key2_2: {...}}}
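A structure of this shape can be walked with `.items()` at each level, so none of the keys need to be known in advance. The following is only a minimal sketch: the helper name `iter_points` is hypothetical, and it assumes the innermost keys are frame numbers stored as strings and the values are numeric strings, as in the sample above.

```python
def iter_points(nested):
    """Yield (title, line, frame, position) for every leaf value.

    Keys are discovered while iterating, so nothing has to be known
    before the script runs.
    """
    for title, lines in nested.items():          # e.g. '140.10'
        for line, points in lines.items():       # e.g. '46', '28'
            for frame, position in points.items():
                # Keys and values are strings in the sample data
                # (an assumption about the real data), so convert them.
                yield title, line, int(frame), float(position)


# Sample data copied from the question.
data = {'140.10': {'46': {'1': '-49.50918', '2': '-50.223637', '3': '49.824406'},
                   '28': {'1': '-49.50918', '2': '-50.223637', '3': '49.824406'}}}

points = list(iter_points(data))
print(points[0])   # ('140.10', '46', 1, -49.50918)
```

Each tuple carries everything a plot needs: the first element names the graph, the second names the line, and the last two are one point on that line.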

I have tried doing this:

for filename in os.listdir(Directory):
    if filename.endswith('.csv'):
        q = filename.split('.csv')[0]
        s = q.split('_')[0]
        if s in time_an_dict:
            atom = list(time_an_dict[s])
            ion = time_an_dict[s]
            for f in time_an_dict[s]:
                x_val = []
                y_val = []
                fz = ion[f]
                for i in time_an_dict[s][f]:
                    pos = (fz[i])
                    frame = i
                    y_val.append(frame)
                    x_val.append(pos)

            '''ions = atom
            frame = frames
            position = pos
            plt.plot(frame, position, label = frames)
            plt.xlabel("Frame")
            plt.ylabel("Position")
            plt.show()
            #plt.savefig('{}_Pos.png'.format(s))'''

But it did not run as expected. I also tried:

for filename in os.listdir(Directory):
    if filename.endswith('_Atom.csv'):
        q = filename.split('.csv')[0]
        s = q.split('_')[0]
        if s in window_dict:
            name = s + '_Atom.csv'
            time_an_dict[s] = analyze_time(name,window_dict[s])
            new = '{}_A_pos.csv'.format(s)
            ions = list(time_an_dict.values())[0].keys()
            for i in ions:
                x_axis_values = []
                y_axis_values = []
                frame = list(time_an_dict[s][i])
                x_axis_values.append(frame)
                empty = []
                print(x_axis_values)
                for x in frame:
                    values = time_an_dict[s][i][x]
                    empty.append(values)
                    y_axis_values.append(empty)
                plt.plot(x_axis_values, y_axis_values, label = x )
    plt.show()

But I keep getting this error:

Traceback (most recent call last):
  File "Atoms_pos.py", line 175, in <module>
    plt.plot(x_axis_values, y_axis_values, label = x )
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/pyplot.py", line 2840, in plot
    return gca().plot(
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/axes/_axes.py", line 1743, in plot
    lines = [*self._get_lines(*args, data=data, **kwargs)]
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/axes/_base.py", line 273, in __call__
    yield from self._plot_args(this, kwargs)
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/axes/_base.py", line 394, in _plot_args
    self.axes.xaxis.update_units(x)
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/axis.py", line 1466, in update_units
    default = self.converter.default_units(data, self)
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/category.py", line 107, in default_units
    axis.set_units(UnitData(data))
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/category.py", line 176, in __init__
    self.update(data)
  File "/Users/hxb51/opt/anaconda3/lib/python3.8/site-packages/matplotlib/category.py", line 209, in update
    for val in OrderedDict.fromkeys(data):
TypeError: unhashable type: 'numpy.ndarray'
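One plausible reading of this traceback: `x_axis_values.append(frame)` puts the whole list of frames inside another list, so `plt.plot` appears to receive two-dimensional string data, treats it as categorical, and fails when it tries to hash a row. The sketch below uses plain Python only and a hypothetical helper `flat_xy`. It contrasts the nested list with flat numeric lists, assuming frame keys and positions are numeric strings.

```python
def flat_xy(frames_to_pos):
    """Return flat, numeric x and y lists for one ion."""
    x_values, y_values = [], []
    for frame, pos in frames_to_pos.items():
        x_values.append(int(frame))    # one number per point, not a list
        y_values.append(float(pos))
    return x_values, y_values


ion = {'1': '-49.50918', '2': '-50.223637', '3': '49.824406'}

# What the failing attempt builds: a list wrapped in another list.
nested_x = []
nested_x.append(list(ion))
print(nested_x)          # [['1', '2', '3']]

# Flat lists of numbers, one entry per point.
x, y = flat_xy(ion)
print(x, y)              # [1, 2, 3] [-49.50918, -50.223637, 49.824406]
```

Lists shaped like `x` and `y` are one-dimensional and numeric, which is the form a single line in `plt.plot(x, y)` expects.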

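Here is the rest of the code, which generates the files and dictionaries I am working with. I was told in another question that including it might help.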

# importing dependencies
import math
import sys
import pandas as pd
import MDAnalysis as mda
import os
import numpy as np
import csv
import matplotlib.pyplot as plt
################################################################################

###############################################################################
Directory = '/Users/hxb51/Desktop/Q_prof/Displacement_Charge/Blah'
os.chdir(Directory)

################################################################################
''' We are only looking at the positions of the CLAs and SODs and not the DRUDE counterparts. We are assuming the DRUDE
are very close and it is not something that needs to be concerned with'''

def Positions(dcd, topo):
    fields = ['Window', 'ION', 'ResID', 'Location', 'Position', 'Frame', 'Final']
    with open('{}_Atoms.csv'.format(s), 'a') as d:
        writer = csv.writer(d)
        writer.writerow(fields)
    d.close()
    CLAs = u.select_atoms('segid IONS and name CLA')
    SODs = u.select_atoms('segid IONS and name SOD')
    CLA_res = len(CLAs)
    SOD_res = len(SODs)
    frame = 0
    for ts in u.trajectory[-10:]:
        frame +=1
        CLA_pos = CLAs.positions[:,2]
        SOD_pos = SODs.positions[:,2]
        for i in range(CLA_res):
            ids = i + 46
            if CLA_pos[i] < 0:
                with open('{}_Atoms.csv'.format(s), 'a') as q:
                    new_line = [s,'CLA', ids, 'Bottom', CLA_pos[i], frame,10]
                    writes = csv.writer(q)
                    writes.writerow(new_line)
                    q.close()
            else:
                with open('{}_Atoms.csv'.format(s), 'a') as q:
                    new_line = [s,'CLA', ids, 'Top', CLA_pos[i], frame, 10]
                    writes = csv.writer(q)
                    writes.writerow(new_line)
                    q.close()
        for i in range(SOD_res):
            ids = i
            if SOD_pos[i] < 0:
                with open('{}_Atoms.csv'.format(s), 'a') as q:
                    new_line = [s,'SOD', ids, 'Bottom', SOD_pos[i], frame,10]
                    writes = csv.writer(q)
                    writes.writerow(new_line)
                    q.close()
            else:
                with open('{}_Atoms.csv'.format(s), 'a') as q:
                    new_line = [s,'SOD', ids, 'Top', SOD_pos[i], frame, 10]
                    writes = csv.writer(q)
                    writes.writerow(new_line)
                    q.close()
    csv_Data = pd.read_csv('{}_Atoms.csv'.format(s))
    filename = s + '_Atom.csv'
    sorted_df = csv_Data.sort_values(["ION", "ResID", "Frame"],
                         ascending=[True, True, True])
    sorted_df.to_csv(filename, index = False)
    os.remove('{}_Atoms.csv'.format(s))

''' this function underneath looks at the ResIds, compares them to make sure they are the same and then counts how many
 times the ion flip flops around the boundaries'''
def turn_dict(f):
    read = open(f)
    reader = csv.reader(read, delimiter=",", quotechar = '"')
    my_dict = {}
    new_list = []
    for row in reader:
        new_list.append(row)
    for i in range(len(new_list[:])):
        prev = i - 1
        if new_list[i][2] == new_list[prev][2]:
            if new_list[i][3] != new_list[prev][3]:
                if new_list[i][2] in my_dict:
                    my_dict[new_list[i][2]] += 1
                else:
                    my_dict[new_list[i][2]] = 1
    return my_dict

def plot_flips(f):
    dict = turn_dict(f)
    ions = list(dict.keys())
    occ = list(dict.values())
    plt.bar(range(len(dict)), occ, tick_label = ions)
    plt.title("{}".format(s))
    plt.xlabel("Residue ID")
    plt.ylabel("Boundary Crosses")
    plt.savefig('{}_Flip.png'.format(s))

def analyze_time(f, dicts):
    read = open(f)
    reader = csv.reader(read, delimiter=",", quotechar='"')
    new_list = []
    keys = list(dicts.keys())
    time_dict = {}
    pos_matrix = {}
    for row in reader:
        new_list.append(row)
    fields = ['ResID', 'Position', 'Frame']
    with open('{}_A_pos.csv'.format(s), 'a') as k:
        writer = csv.writer(k)
        writer.writerow(fields)
    k.close()
    for i in range(len(new_list[:])):
        if new_list[i][2] in keys:
            with open('{}_A_pos.csv'.format(s), 'a') as k:
                new_line = [new_list[i][2], new_list[i][4], new_list[i][5]]
                writes = csv.writer(k)
                writes.writerow(new_line)
                k.close()
    read = open('{}_A_pos.csv'.format(s))
    reader = csv.reader(read, delimiter=",", quotechar='"')
    time_list = []
    for row in reader:
        time_list.append(row)
    for j in range(len(keys)):
        for i in range(len(time_list[1:])):
            if time_list[i][0] == keys[j]:
                pos_matrix[time_list[i][2]] = time_list[i][1]
        time_dict[keys[j]] = pos_matrix
    return time_dict


window_dict = {}
for filename in os.listdir(Directory):
    s = filename.split('.dcd')[0]
    fors = s + '.txt'
    topos = '/Users/hxb51/Desktop/Q_prof/Displacement_Charge/topo.psf'
    if filename.endswith('.dcd'):
        print('We are starting with {} \n '.format(s))
        u = mda.Universe(topos, filename)
        Positions(filename, topos)
        name = s + '_Atom.csv'
        plot_flips(name)
        window_dict[s] = turn_dict(name)
        continue
time_an_dict = {}
for filename in os.listdir(Directory):
    if filename.endswith('.csv'):
        q = filename.split('.csv')[0]
        s = q.split('_')[0]
        if s in window_dict:
            name = s + '_Atom.csv'
            time_an_dict[s] = analyze_time(name,window_dict[s])
for filename in os.listdir(Directory):
    if filename.endswith('.csv'):
        q = filename.split('.csv')[0]
        s = q.split('_')[0]
        if s in time_an_dict:
            atom = list(time_an_dict[s])
            ion = time_an_dict[s]
            for f in time_an_dict[s]:
                x_val = []
                y_val = []
                fz = ion[f]
                for i in time_an_dict[s][f]:
                    pos = (fz[i])
                    frame = i
                    y_val.append(frame)
                    x_val.append(pos)

            '''ions = atom
            frame = frames
            position = pos
            plt.plot(frame, position, label = frames)
            plt.xlabel("Frame")
            plt.ylabel("Position")
            plt.show()
            #plt.savefig('{}_Pos.png'.format(s))'''

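Everything here runs fine except the final block of code at the bottom, which is where I try to make a graph from the nested dictionary. Any help would be appreciated!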

Thanks!
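One thing that may be worth checking in `analyze_time`: `pos_matrix` is created once and the same object is assigned to every key of `time_dict`, so all ions would end up sharing a single inner dictionary. That might explain why '46' and '28' show identical values in the sample data. A minimal sketch of building a separate inner dictionary per ion follows. The helper name `positions_by_ion` and the in-memory `rows` are hypothetical stand-ins for the (ResID, Position, Frame) columns written to the `_A_pos.csv` file.

```python
def positions_by_ion(rows, wanted_ids):
    """Build {res_id: {frame: position}} with a separate inner dict per ion."""
    time_dict = {}
    for res_id, position, frame in rows:
        if res_id in wanted_ids:
            # setdefault creates a new inner dict the first time an ion is seen
            time_dict.setdefault(res_id, {})[frame] = position
    return time_dict


# Made-up rows for illustration only: (ResID, Position, Frame)
rows = [('46', '-49.5', '1'), ('46', '-50.2', '2'),
        ('28', '12.0', '1'), ('28', '13.5', '2')]

result = positions_by_ion(rows, {'46', '28'})
print(result)
# {'46': {'1': '-49.5', '2': '-50.2'}, '28': {'1': '12.0', '2': '13.5'}}
```

Because each ion gets its own dictionary, writing positions for one ion cannot overwrite those of another.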

[Answer 1]:

I figured out the answer:

for filename in os.listdir(Directory):
    if filename.endswith('_Atom.csv'):
        q = filename.split('.csv')[0]
        s = q.split('_')[0]
        if s in window_dict:
            name = s + '_Atom.csv'
            time_an_dict[s] = analyze_time(name,window_dict[s])
            new = '{}_A_pos.csv'.format(s)
            # one line per ion (second-level key) of this window
            ions = list(time_an_dict[s])
            plt.yticks(np.arange(-50, 50, 5))
            plt.xlabel('Frame')
            plt.ylabel('Z axis position(Ang)')
            plt.title([s])
            for i in ions:
                x_value = []
                y_value = []
                time_frame =len(time_an_dict[s][i]) +1
                # frames are looked up by the string keys '1' .. 'N'
                for frame in range(1,time_frame):
                    frame = str(frame)
                    x_value.append(int(frame))
                    y_value.append(float(time_an_dict[s][i][frame]))
                plt.plot(x_value, y_value, label=[i])
                plt.xticks(np.arange(1, 11, 1))
            plt.legend()
            plt.savefig('{}_Positions.png'.format(s))
            # clear the figure so the next file starts from an empty plot
            plt.clf()
        os.remove("{}_A_pos.csv".format(s))

From there, combined with the other parts of the code, it generates the following graph:

It also works with more than one file, as long as there are more '.dcd' files.
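The answer's loop looks frames up as '1' through 'N', which assumes the frame keys are consecutive. A slightly more general variant, sketched here with the hypothetical helpers `sorted_series` and `plot_window`, sorts whatever frame keys exist by their numeric value and draws one line per ion. It assumes the same string keys and values as above and is not the author's method.

```python
def sorted_series(frames_to_pos):
    """Return (x, y) sorted by numeric frame, whatever the frame keys are."""
    pairs = sorted((int(f), float(p)) for f, p in frames_to_pos.items())
    x = [frame for frame, _ in pairs]
    y = [pos for _, pos in pairs]
    return x, y


def plot_window(title, ions, out_path):
    """Draw one line per ion and save the figure (hypothetical helper)."""
    # Imported here so sorted_series stays usable without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for ion_id, frames_to_pos in ions.items():
        x, y = sorted_series(frames_to_pos)
        ax.plot(x, y, label=ion_id)
    ax.set_title(title)
    ax.set_xlabel('Frame')
    ax.set_ylabel('Z axis position (Ang)')
    ax.legend()
    fig.savefig(out_path)
    plt.close(fig)


# String keys sort as '1', '10', '2'; converting to int restores frame order.
x, y = sorted_series({'10': '3.5', '2': '-1.0', '1': '0.25'})
print(x, y)   # [1, 2, 10] [0.25, -1.0, 3.5]
```

Inside the answer's loop this could be called as `plot_window(s, time_an_dict[s], '{}_Positions.png'.format(s))`, producing one image per window.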
