Python pandas error
【Title】: Python pandas error 【Posted】: 2016-06-11 23:09:20 【Question】:
import pandas as pd
df1=pd.read_csv('inputfile.txt',names=['chr','start','stop','gene','strand'], delimiter=r'\s+')
print(df1)
count =0
c = 0
for i in df1:
    for y in df1:
        if abs(df1.loc[i,"start"]- df1.loc[y,"stop"]) < 201:
            if i != y:
                index
                c +=1
print(c)
I have a sample input file:
chr15 74436458 74466677 pi-1700016M24Rik.1 -
chr17 79734018 79754230 pi-Cdc42ep3.1 -
chr3 124103907 124128909 pi-1700006A11Rik.1 -
chr5 102261978 102280532 pi-Wdfy3.1 -
chr6 85061409 85076088 pi-Gm5878.1 -
chr9 51573456 51661164 pi-Arhgap20.1 +
chr10 127114107 127132221 pi-Tmem194.1 +
chr11 103286577 103315010 11-qE1-9443.1 +
chr11 107855325 107859037 11-qE1-3997.1 +
chr11 108278889 108286739 11-qE1-252.1 -
chr12 99620581 99658258 12-qE-23911.1 -
chr12 99658453 99692927 12-qE-7089.1 +
chr13 21595489 21598393 13-qA3.1-213.1 -
chr13 24997468 25026901 13-qA3.1-355.1 +
chr1 94888921 94893644 1-qD-4525.1 -
chr13 50363393 50412729 13-qA5-208.1 +
chr13 50607591 50690856 13-qA5-464.1 -
chr13 51001008 51029517 13-qA5-703.1 -
chr13 52192103 52219527 13-qA5-967.1 +
chr13 53489036 53549907 13-qB1-1517.1 +
chr14 20445381 20472632 14-qA3-3095.1 -
chr14 24901215 24939690 14-qA3-19970.1 +
chr14 25184829 25189036 14-qA3-2286.1 -
chr14 25244385 25249047 14-qA3-284.1 -
chr14 45377787 45409614 14-qC1-1261.1 -
chr14 45546497 45569941 14-qC1-1010.1 +
chr15 59081442 59106777 15-qD1-17920.1 -
chr15 59106921 59123501 15-qD1-4001.1 +
chr15 74466817 74478882 15-qD3-14639.1 +
chr15 78483658 78500962 15-qE1-8387.1 -
chr15 79758435 79764840 15-qE1-1119.1 +
chr1 127071468 127074556 1-qE3-706.1 +
chr17 22634368 22656090 17-qA3.3-352.1 +
chr17 27425220 27461973 17-qA3.3-27363.1 -
chr17 27462141 27504428 17-qA3.3-26735.1 +
chr17 49251595 49252836 17-qC-935.1 -
chr17 50378485 50382342 17-qC-59.1 +
chr17 66556151 66581098 17-qE1.1-7037.1 +
chr18 67189100 67226114 18-qE1-36451.1 -
chr18 67226241 67241315 18-qE1-1295.1 +
chr19 37333596 37338356 19-qC2-1361.1 -
chr2 92381298 92439234 2-qE1-35981.1 +
chr2 127517589 127529447 2-qF1-2536.1 +
chr2 150953183 150984330 2-qG3-1029.1 +
chr3 20301593 20405121 3-qA2-617.1 -
chr3 34725552 34777871 3-qA3-2052.1 +
chr4 57373062 57377138 4-qB3-3994.1 -
chr4 61881631 61891970 4-qB3-639.1 -
chr4 61892039 61900375 4-qB3-277.1 +
chr4 93946842 93998314 4-qC5-17839.1 -
chr4 123510867 123519209 4-qD2.2-2182.1 -
chr4 123571373 123573843 4-qD2.2-349.1 -
chr4 135182710 135186113 4-qD3-2082.1 +
chr5 113752221 113769115 5-qF-14508.1 -
chr5 113769157 113794752 5-qF-14224.1 +
chr5 115284179 115303596 5-qF-4633.1 -
chr5 137395015 137412982 5-qG2-950.1 +
chr5 144519247 144527999 5-qG2-2301.1 +
chr5 150592651 150627915 5-qG3-23659.1 -
chr6 81843811 81860488 6-qC3-6258.1 -
chr6 83525934 83538118 6-qC3-100.1 +
chr6 85937105 85953600 6-qC3-2394.1 -
chr6 87932334 87944161 6-qD1-2831.1 -
chr10 18516611 18551736 10-qA3-2592.1 -
chr6 127726093 127746390 6-qF3-8009.1 -
chr6 127746448 127791908 6-qF3-28913.1 +
chr7 60142976 60169237 7-qB5-6255.1 +
chr7 77019095 77054469 7-qD1-9417.1 -
chr7 77054649 77111245 7-qD1-16444.1 +
chr7 80242711 80250159 7-qD1-654.1 -
chr7 80250197 80271441 7-qD1-19431.1 +
chr7 80926316 80961355 7-qD2-24830.1 -
chr1 57405819 57434364 1-qC1.3-637.1 -
chr7 80961480 80977906 7-qD2-11976.1 +
chr7 132476266 132493286 7-qF3-3125.1 -
chr7 132493384 132508334 7-qF3-246.1 +
chr10 20030311 20032118 10-qA3-143.1 -
chr8 28403548 28406760 8-qA2-343.1 -
chr8 38155119 38158009 8-qA4-332.1 -
chr8 38166951 38168562 8-qA4-155.1 -
chr8 94713358 94718315 8-qC5-8200.1 +
chr8 95933840 95951276 8-qC5-2209.1 -
chr8 112641565 112656356 8-qE1-3748.1 +
chr9 3184709 3199792 9-qA1-178.1 -
chr9 54054980 54097630 9-qA5.3-24188.1 -
chr9 54097752 54117106 9-qA5.3-1495.1 +
chr9 67539058 67581593 9-qC-31469.1 -
chr9 67581751 67608736 9-qC-10667.1 +
chr9 122711578 122714587 9-qF4-150.1 -
chr10 62114440 62164257 10-qB4-6488.1 +
chr10 66154778 66160884 10-qB5.1-5404.1 -
chr10 66161040 66171440 10-qB5.1-221.1 +
chr10 75300268 75324443 10-qC1-12816.1 +
chr10 83951038 83967582 10-qC1-117.1 +
chr10 85211306 85238346 10-qC1-2617.1 +
chr10 86011423 86054254 10-qC1-1527.1 -
chr10 86079756 86088620 10-qC1-875.1 +
chr10 94136457 94151187 10-qC2-545.1 -
chr11 50755203 50757227 11-qB1.3-590.1 -
column1=chr
column2=start
column3=end
column4=gene
column5=orientation
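I am trying to find loci that are on the same chromosome but differ by 200. This is what I have so far, and I keep getting an error.
If anyone could please help. KeyError: 'the label [chr] is not in the [index]'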
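【Answer 1】: The line
for i in df1:
actually iterates over the DataFrame's columns, not its rows. That is why df1.loc receives the label 'chr' and raises the KeyError. You want
for i in df1.index: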
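To make the difference concrete, here is a minimal sketch. The small frame is only a stand-in for df1, built from three rows of the sample input, and the names column_labels and row_labels are illustrative.

```python
import pandas as pd

# Tiny stand-in for df1, built from three rows of the sample input.
df1 = pd.DataFrame({
    "chr": ["chr15", "chr15", "chr17"],
    "start": [74436458, 74466817, 79734018],
    "stop": [74466677, 74478882, 79754230],
})

# Iterating over a DataFrame yields its column labels ...
column_labels = [i for i in df1]
# ... while iterating over df1.index yields the row labels that .loc expects.
row_labels = [i for i in df1.index]

print(column_labels)  # ['chr', 'start', 'stop']
print(row_labels)     # [0, 1, 2]
```

With the original loop, the first lookup is therefore df1.loc['chr', 'start'], and 'chr' is a column label rather than a row label.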
By the way, it is better to use vectorized operations on the columns than to iterate like this, for example:
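import numpy as np
# Vectorized over whole columns: counts rows whose own start and stop are within 200 of each other
c = np.sum(np.abs(df1['start'] - df1['stop']) < 201)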
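The one-liner above compares start and stop within the same row. If the goal is the pairwise comparison from the question (the start of one locus within 200 of the stop of another locus on the same chromosome), one possible vectorized sketch is a self-join on chr. This is an assumption about the intended result, not part of the original answer. The small frame again stands in for df1, and the names rows, pairs, and close are illustrative.

```python
import pandas as pd

# Small stand-in for df1; rows copied from the sample input.
df1 = pd.DataFrame({
    "chr": ["chr15", "chr15", "chr17", "chr12", "chr12"],
    "start": [74436458, 74466817, 79734018, 99620581, 99658453],
    "stop": [74466677, 74478882, 79754230, 99658258, 99692927],
    "gene": ["pi-1700016M24Rik.1", "15-qD3-14639.1", "pi-Cdc42ep3.1",
             "12-qE-23911.1", "12-qE-7089.1"],
})

# Keep the original row label in an "index" column so self-pairs can be dropped.
rows = df1.reset_index()

# Self-join on chromosome: every row is paired with every row on the same chr.
pairs = rows.merge(rows, on="chr", suffixes=("_a", "_b"))

# Keep pairs of different rows where start of one is within 200 of stop of the other.
close = pairs[
    (pairs["index_a"] != pairs["index_b"])
    & ((pairs["start_a"] - pairs["stop_b"]).abs() < 201)
]

c = len(close)
print(close[["chr", "gene_a", "gene_b"]])
print(c)  # 2
```

The self-join grows quadratically with the number of rows per chromosome, which should be fine for a file of this size but may need a different approach for much larger inputs.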