python: concat the files in a directory into a DataFrame

Posted 2021-05-08

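Preface: this article was compiled by the editors of 小常识网 (cha138.com). It covers concatenating the files in a directory into a pandas DataFrame with Python, and we hope you find it a useful reference.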

import glob
import os
import pandas as pd

#READ IN FILES (FIRST ROW IS READ AS THE HEADER), ADD NEW COLUMN WITH FILE NAME AS VALUE, CONCAT ALL
path = "pathname\\"
files = glob.glob(os.path.join(path, '*.csv'))
df = pd.concat([pd.read_csv(fp).assign(new_col_name=os.path.basename(fp).split('.')[0]) for fp in files], sort=False)
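The same idea can be sketched as a small self-contained function built on pathlib. The function name concat_with_source and the column name source_file are hypothetical, chosen only for this illustration. Note one difference from the snippet above: Path.stem strips only the last extension, so a file named a.b.csv is tagged "a.b", whereas split('.')[0] would give "a". The sketch also returns an empty DataFrame when no files match, because pd.concat raises a ValueError on an empty list.

```python
from pathlib import Path
import tempfile

import pandas as pd


def concat_with_source(directory, column="source_file"):
    """Read every *.csv in `directory` and tag each row with its file's stem."""
    files = sorted(Path(directory).glob("*.csv"))  # sorted for a stable row order
    if not files:
        # pd.concat raises ValueError on an empty list, so return an empty frame
        return pd.DataFrame()
    frames = [pd.read_csv(fp).assign(**{column: fp.stem}) for fp in files]
    return pd.concat(frames, ignore_index=True, sort=False)


# Demo on two throwaway files (made-up data)
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "jan.csv").write_text("id,value\n1,10\n2,20\n")
    Path(tmp, "feb.csv").write_text("id,value\n3,30\n")
    demo = concat_with_source(tmp)

print(demo)
```

Passing ignore_index=True gives the combined frame a fresh 0..n-1 index instead of repeating each file's own row numbers.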

#CONCAT FILES WITH HEADER DEFINED, DROP DUPLICATES
files = glob.glob("*.csv")
df = pd.concat((pd.read_csv(f, header=0) for f in files), sort=False)
df_deduplicated = df.drop_duplicates(keep='first')
df_deduplicated.to_csv("texas_imp_merged.csv", index=False)
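One thing to watch in the snippet above: the merged file is written into the same directory that glob.glob("*.csv") scans, so a later run reads it back in as one more input. drop_duplicates removes the exact repeats, but rows that have since been deleted from the source files would survive through the old merged file. A minimal sketch of one way around this, assuming it is acceptable to skip the output file by name (merge_unique and merged.csv are hypothetical names):

```python
from pathlib import Path
import tempfile

import pandas as pd


def merge_unique(directory, output_name="merged.csv"):
    """Concatenate the CSVs in `directory`, drop duplicate rows, write the result."""
    directory = Path(directory)
    # Skip the output file itself so a previous run's result is not read back in
    files = [fp for fp in sorted(directory.glob("*.csv")) if fp.name != output_name]
    df = pd.concat((pd.read_csv(fp, header=0) for fp in files), sort=False)
    deduplicated = df.drop_duplicates(keep="first")
    deduplicated.to_csv(directory / output_name, index=False)
    return deduplicated


# Demo on two throwaway files that share one row (made-up data)
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.csv").write_text("id,value\n1,10\n2,20\n")
    Path(tmp, "b.csv").write_text("id,value\n2,20\n3,30\n")
    first = merge_unique(tmp)
    second = merge_unique(tmp)  # the second run ignores merged.csv

print(second)
```

By default drop_duplicates compares every column; its subset parameter restricts the comparison to chosen columns.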

#CONCAT ALL FILES (HEADER LEFT TO THE read_csv DEFAULT)
print(os.getcwd())  # show the current working directory
path = r'C:\Users\...\...\...\...\...\...'
# get all the csv files in directory
files = glob.glob(os.path.join(path, '*.csv'))
# loop through files and read them in with pandas
frames = []  # a list to hold all the individual DataFrames
for file in files:
    df = pd.read_csv(file)
    frames.append(df)
# concatenate them all together
df = pd.concat(frames, ignore_index=True)
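Because the loop above calls pd.read_csv with its defaults, the first line of every file is taken as the column names. If the files have no header row, that first data line is lost as data and the column labels no longer line up across files. A small sketch of the difference, assuming header-less files and using made-up column names (COLUMNS is hypothetical):

```python
from pathlib import Path
import tempfile

import pandas as pd

COLUMNS = ["id", "value"]  # hypothetical column names for header-less files

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "part1.csv").write_text("1,10\n2,20\n")
    Path(tmp, "part2.csv").write_text("3,30\n4,40\n")
    files = sorted(Path(tmp).glob("*.csv"))

    # Default behaviour: the first data row of each file is consumed as column names
    naive = pd.concat([pd.read_csv(fp) for fp in files], ignore_index=True)

    # header=None keeps every row as data; names= supplies the column labels
    frames = [pd.read_csv(fp, header=None, names=COLUMNS) for fp in files]
    combined = pd.concat(frames, ignore_index=True)

print(naive.shape)     # fewer rows, mismatched columns
print(combined.shape)  # every row kept under the shared labels
```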
