Data Cleaning 1

Posted by 阿难的机器学习计划


1. Read multiple data files into a dictionary:

  import pandas as pd

  data_files = [
      "ap_2010.csv",
      "class_size.csv",
      "demographics.csv",
      "graduation.csv",
      "hs_directory.csv",
      "sat_results.csv",
  ]

  data = {}

  for f in data_files:
      # Build the path with str.format, e.g. "schools/ap_2010.csv"
      df = pd.read_csv("schools/{0}".format(f))
      # Drop the ".csv" extension so the dict key is just the dataset name
      key = f.replace(".csv", "")
      data[key] = df
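
The same pattern can be sketched as a small reusable function. This is only an illustration: `load_csv_folder` is a hypothetical helper name, the `DBN` and `score` columns are made-up sample data, and the files are written to a temporary folder so the snippet runs without the real `schools/` directory. It uses `Path.stem` for the key, which strips only the final extension, whereas `str.replace` removes every occurrence of ".csv" in the name.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_csv_folder(folder, file_names):
    """Read each CSV in file_names from folder into a dict keyed by file stem."""
    frames = {}
    for name in file_names:
        path = Path(folder) / name
        # path.stem is the file name without its final extension
        frames[path.stem] = pd.read_csv(path)
    return frames


# Throwaway sample files (hypothetical content, not the real datasets)
sample_names = ["ap_2010.csv", "sat_results.csv"]
with tempfile.TemporaryDirectory() as tmp:
    for name in sample_names:
        (Path(tmp) / name).write_text("DBN,score\n01M292,400\n")
    loaded = load_csv_folder(tmp, sample_names)

print(sorted(loaded))  # the dict keys are the file names without ".csv"
```

With the real data, the call would presumably be `load_csv_folder("schools", data_files)`.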

2. Read the tab-delimited .txt survey files and combine them with pd.concat:

  # delimiter="\t": fields are separated by tabs rather than commas
  # encoding="windows-1252": the character encoding used to decode the file's bytes
  all_survey = pd.read_csv("schools/survey_all.txt", delimiter="\t", encoding="windows-1252")
  d75_survey = pd.read_csv("schools/survey_d75.txt", delimiter="\t", encoding="windows-1252")
  # Stack the two DataFrames vertically (axis=0 appends rows)
  survey = pd.concat([all_survey, d75_survey], axis=0)
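
To see what `delimiter`, `encoding`, and `axis=0` do, here is a small sketch on in-memory data. The column names and values are invented for illustration and are not taken from the survey files. The bytes are encoded as windows-1252 on purpose, so the accented character only decodes correctly when the matching `encoding` is passed.

```python
import io

import pandas as pd

# Hypothetical tab-separated content standing in for the two survey files
raw_all = "dbn\tname\trate\nA001\tCafé Prep\t85\nA002\tNorth High\t90\n".encode("windows-1252")
raw_d75 = "dbn\tname\trate\nB001\tEast School\t70\n".encode("windows-1252")

# delimiter="\t" splits each line on tabs; encoding tells pandas how to decode the bytes
all_part = pd.read_csv(io.BytesIO(raw_all), delimiter="\t", encoding="windows-1252")
d75_part = pd.read_csv(io.BytesIO(raw_d75), delimiter="\t", encoding="windows-1252")

# axis=0 stacks rows; each frame keeps its own index labels
combined = pd.concat([all_part, d75_part], axis=0)

print(combined.shape)        # (3, 3)
print(list(combined.index))  # [0, 1, 0]
```

The repeated index label `0` shows that `pd.concat` keeps the original row labels. Passing `ignore_index=True` would renumber the rows 0 to 2 if a clean index is wanted.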
