Reading a CSV File in Segments and Visualizing It

Posted by 空中旋转篮球

1. Data

The data is a CSV file containing about 890,000 records. The field names, with a few sample records, are as follows:

| Time (sec) | Delta Time (sec) | Segment ID | Latitude (deg) | Longitude (deg) | Easting (m) | Northing (m) | Cross-Track (m) | Along-Track (m) | Height (m HAE) | Height (m MSL) | Classification | Signal Confidence | Dem_height | luccID |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 32.56000081 | 78976682.22 | 161266 | 29.08135819 | 113.4059348 | 734190.7155 | 3219390.276 | -22.7432411 | 20201.67926 | 77.74754333 | 93.76404762 | 0 | 0 | 0 | 0 |
| 32.56000081 | 78976682.22 | 161266 | 29.08135858 | 113.4059346 | 734190.7011 | 3219390.318 | -22.7526257 | 20201.72316 | 91.41132355 | 107.4278278 | 0 | 0 | 0 | 0 |
| 32.56000081 | 78976682.22 | 161266 | 29.08135802 | 113.4059348 | 734190.722 | 3219390.257 | -22.7389931 | 20201.65946 | 71.5628891 | 87.57939339 | 0 | 0 | 0 | 0 |
| 32.56000081 | 78976682.22 | 161266 | 29.08135773 | 113.4059349 | 734190.733 | 3219390.224 | -22.7318915 | 20201.62621 | 61.22312164 | 77.23962593 | 0 | 0 | 0 | 0 |
| 32.56000081 | 78976682.22 | 161266 | 29.08135811 | 113.4059348 | 734190.7185 | 3219390.267 | -22.7412967 | 20201.67021 | 74.91683197 | 90.93333626 | 0 | 0 | 0 | 0 |
| 32.56000081 | 78976682.22 | 161266 | 29.08135616 | 113.4059355 | 734190.7915 | 3219390.052 | -22.6938061 | 20201.44813 | 5.772859573 | 21.78936386 | 0 | 0 | 0 | 0 |
| 32.56000081 | 78976682.22 | 161266 | 29.08135614 | 113.4059355 | 734190.7922 | 3219390.05 | -22.6933262 | 20201.44587 | 5.074115753 | 21.09062004 | 0 | 0 | 0 | 0 |
| 32.56000081 | 78976682.22 | 161266 | 29.08135651 | 113.4059354 | 734190.7784 | 3219390.091 | -22.7023598 | 20201.48811 | 18.22643089 | 34.24293518 | 0 | 0 | 0 | 0 |
| 32.56000081 | 78976682.22 | 161266 | 29.08135656 | 113.4059354 | 734190.7765 | 3219390.096 | -22.7035772 | 20201.49374 | 19.99853134 | 36.01503563 | 0 | 0 | 0 | 0 |
| 32.56000081 | 78976682.22 | 161266 | 29.08135584 | 113.4059356 | 734190.8036 | 3219390.016 | -22.6859499 | 20201.41132 | -5.66560459 | 10.3508997 | 0 | 0 | 0 | 0 |
| 32.56010081 | 78976682.22 | 161266 | 29.08136468 | 113.405934 | 734190.6294 | 3219390.994 | -22.7452324 | 20202.4024 | 80.65776825 | 96.67429392 | 0 | 0 | 0 | 0 |
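
Before loading all 890,000 records, it can help to look at just the header and the first few rows. The sketch below uses the `nrows` parameter of `pd.read_csv` for that. The in-memory `sample_csv` string is only a stand-in for the real file (four of the fields and three of the sample records listed in this section), so the code runs without the original dataset.

```python
import io
import pandas as pd

# Stand-in for the real file: a few columns and rows copied from the sample records.
sample_csv = io.StringIO(
    "Time (sec),Along-Track (m),Height (m HAE),Height (m MSL)\n"
    "32.56000081,20201.67926,77.74754333,93.76404762\n"
    "32.56000081,20201.72316,91.41132355,107.4278278\n"
    "32.56000081,20201.65946,71.5628891,87.57939339\n"
)

# nrows limits how many data rows are parsed, so only the top of the file is read.
preview = pd.read_csv(sample_csv, nrows=2)

print(list(preview.columns))  # exact column names to use when indexing
print(preview.dtypes)         # inferred type of each column
```

Checking the exact column names this way is worthwhile, because names such as `Along-Track (m)` contain spaces and punctuation that are easy to mistype when indexing the DataFrame.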

2. Reading the data

2.1 Reading the full dataset and displaying it

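import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Path to the CSV file (a raw string, so single backslashes are enough)
inputpath = r"E:\csv_dbscan\ATL03_20200703015729_01180802_005_01_gt1r.csv"
df = pd.read_csv(inputpath)  # load the whole file into memory
X = df['Time (sec)']
Y = df['Height (m HAE)']

# Plot height against time; the marker is tiny because there are about 890,000 points
plt.figure()
plt.scatter(X, Y, marker='o', s=0.000003, label='Point Cloud')
plt.legend()
plt.show()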
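
Since the plot only needs two fields, one possible refinement is to load just those columns with the `usecols` parameter of `pd.read_csv`, which reduces memory use for a full read. This is a sketch, not part of the original workflow; `sample_csv` is a small in-memory stand-in for the real file.

```python
import io
import pandas as pd

# Stand-in for the real file, with two extra columns that the plot does not need.
sample_csv = io.StringIO(
    "Time (sec),Segment ID,Height (m HAE),Classification\n"
    "32.56000081,161266,77.74754333,0\n"
    "32.56000081,161266,91.41132355,0\n"
    "32.56010081,161266,80.65776825,0\n"
)

# usecols keeps only the named columns; the others are never stored in the DataFrame.
df_small = pd.read_csv(sample_csv, usecols=["Time (sec)", "Height (m HAE)"])
X = df_small["Time (sec)"]
Y = df_small["Height (m HAE)"]

print(df_small.shape)
```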

Zoomed-in view: [figure]

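2.2 Reading the data in segments and displaying it

(1) Compute how many records each segment holds from the total record count and the number of segments, then read only one segment at a time.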
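
The calculation in (1) can be sketched as follows. One way to read a single segment, assumed here for illustration, is to combine the `skiprows` and `nrows` parameters of `pd.read_csv`. The helper `read_segment` and the ten-record `csv_text` are hypothetical names and data used only for this example.

```python
import io
import math
import pandas as pd

def read_segment(source, index, segment_count):
    """Read only segment `index` (0-based), at most `segment_count` rows long."""
    # Line 0 is the header, so data rows start at line 1.
    rows_to_skip = range(1, 1 + index * segment_count)
    return pd.read_csv(source, skiprows=rows_to_skip, nrows=segment_count)

# Hypothetical stand-in data: 10 records to be split into 3 segments.
csv_text = "Time (sec),Height (m HAE)\n" + "".join(f"{i},{i * 1.5}\n" for i in range(10))
total_records = 10
segment_n = 3

# Records per segment, rounded up so that every record belongs to some segment.
segment_count = math.ceil(total_records / segment_n)

second = read_segment(io.StringIO(csv_text), 1, segment_count)
last = read_segment(io.StringIO(csv_text), 2, segment_count)
print(second)
print(len(last))  # the final segment may be shorter than segment_count
```

Skipped rows are not loaded into the DataFrame, but the parser still has to scan past them, so reading late segments of a large file this way is not free.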

Method 1: use the chunksize parameter to read and display the data in segments

chunksize splits the file by number of rows (records).

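segment_count = 10000  # number of records per segment
# With chunksize set, read_csv returns an iterator that yields one DataFrame per segment
df_chunker = pd.read_csv(inputpath, chunksize=segment_count)
for df_item in df_chunker:
    X_seg = df_item['Time (sec)']
    Y_seg = df_item['Height (m HAE)']
    # One figure per segment
    plt.figure()
    plt.scatter(X_seg, Y_seg, marker='o', s=1, label='Point Cloud')
    plt.legend()
    plt.show()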

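For example, if each segment is set to segment_count = 10000 records, there are segment_n = N / segment_count segments, rounded up because the last segment may be shorter (segment_count is the number of records per segment, N is the total number of records, and segment_n is the number of segments).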
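
The relationship between N, segment_count and segment_n can be checked on a small example. The sketch below uses a made-up 25-record CSV held in memory (not the real data) and counts the chunks that `pd.read_csv` yields with `chunksize`.

```python
import io
import math
import pandas as pd

# Hypothetical stand-in data: N = 25 records.
N = 25
csv_text = "Time (sec),Height (m HAE)\n" + "".join(f"{i},{i * 0.5}\n" for i in range(N))

segment_count = 10
# Round up: a final, partly filled chunk still counts as a segment.
expected_segments = math.ceil(N / segment_count)

# Iterate over the chunks and record how many rows each one holds.
chunk_sizes = [len(chunk) for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=segment_count)]

print(expected_segments, chunk_sizes)
```

Here the 25 records come back as chunks of 10, 10 and 5 rows, so the segment count is 3 rather than 2.5.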

Each segment is read and displayed in order.

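Method 2: segment the display by along-track distance

For example, using the Along-Track (m) field with 1000 m segments: the first segment covers 20201.67926 to 21201.67926, and each following segment starts a further 1000 m along.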

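len_data = len(df['Time (sec)'])  # number of records
segment_count = 1000  # segment length in metres
col = 'Along-Track (m)'  # column name as it appears in the CSV header
# Uses the first and last records as the range, which assumes the file is ordered
# by along-track distance; otherwise use df[col].min() and df[col].max()
start = df[col].iloc[0]
end = df[col].iloc[len_data - 1]
Along_track_n = int((end - start) / segment_count) + 1  # number of segments
for len_seg in range(Along_track_n):
    seg_start = start + len_seg * segment_count
    seg_end = start + (len_seg + 1) * segment_count
    # Select the records whose along-track distance falls inside this segment
    df_seg = df.loc[(df[col] >= seg_start) & (df[col] <= seg_end), :]
    X_seg = df_seg[col]
    Y_seg = df_seg['Height (m HAE)']
    if len(X_seg) == 0:
        print("No data in this segment!")
        continue  # nothing to plot for an empty segment
    plt.figure()
    plt.scatter(X_seg, Y_seg, marker='o', s=1, label='Point Cloud')
    plt.legend()
    plt.show()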
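
As an alternative to filtering the DataFrame once per segment, the same distance-based split can be sketched with a single `groupby` on an integer bin index. This is a different technique from the loop above, shown only as an illustration. The values in `df_demo` are made up, and `segment_length` and `bin_index` are names introduced for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data, not the real records.
df_demo = pd.DataFrame({
    "Along-Track (m)": [20201.7, 20650.0, 21300.5, 23500.2, 23999.9],
    "Height (m HAE)": [77.7, 91.4, 71.6, 61.2, 74.9],
})

segment_length = 1000.0
start = df_demo["Along-Track (m)"].min()  # min() does not depend on row order

# Integer bin index: 0 for [start, start + 1000), 1 for the next 1000 m, and so on.
bin_index = (
    np.floor((df_demo["Along-Track (m)"] - start) / segment_length)
    .astype(int)
    .rename("segment")
)

# groupby yields only the non-empty segments, so no empty-segment check is needed.
segments = {int(i): seg for i, seg in df_demo.groupby(bin_index)}
for i, seg in segments.items():
    print(i, len(seg))
```

Each value in `segments` is a DataFrame that can be passed to `plt.scatter` in the same way as `df_seg`. Because the bins are half-open intervals, a point lying exactly on a boundary is assigned to one segment only.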
