Data Cleaning 3
Posted by 阿难的机器学习计划
1. Find the correlation between each pair of columns by using corr().
correlations = combined.corr(method = "pearson")
print(correlations["sat_score"])
Note: a correlation coefficient ranges from -1 to 1. Values close to 1 mean the two columns are positively correlated, values close to -1 mean they are negatively correlated, and values close to 0 mean there is little or no linear correlation.
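To make the -1 to 1 range concrete, here is a minimal sketch that computes the Pearson coefficient by hand in plain Python. The helper name pearson and the small sample lists are made up for illustration and are not part of the original dataset; corr(method = "pearson") computes the same quantity for every pair of columns.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample data, chosen only to show the three cases
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]      # grows with hours
falling = [10, 8, 6, 4, 2]     # shrinks as hours grow
unrelated = [3, 1, 4, 1, 3]    # no linear trend against hours

r_up = pearson(hours, rising)       # close to 1
r_down = pearson(hours, falling)    # close to -1
r_none = pearson(hours, unrelated)  # close to 0
print(r_up, r_down, r_none)
```

A coefficient near 0 only rules out a linear relationship; two columns can still be related in a non-linear way.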
2. Then we can plot the data by using the plot() function.
%matplotlib inline
import matplotlib.pyplot as plt
combined.plot('total_enrollment', 'sat_score', kind = "scatter") # plot(x, y, kind)
3. Then we can filter the data to dig out the information we need.
4. We can map out the schools we need in a certain area.
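The original does not show the filter it used, so the following is only a sketch of one common approach: boolean indexing in pandas. The tiny DataFrame, the school_name column, and both thresholds are hypothetical; only total_enrollment and sat_score come from the plot above.

```python
import pandas as pd

# Hypothetical stand-in for the real `combined` DataFrame
combined = pd.DataFrame({
    "school_name": ["A", "B", "C", "D"],
    "total_enrollment": [250, 900, 400, 3200],
    "sat_score": [950, 1400, 1230, 1100],
})

# Keep rows where both conditions hold; each condition needs its own
# parentheses because & binds more tightly than the comparisons.
low_enrollment = combined[
    (combined["total_enrollment"] < 1000) & (combined["sat_score"] < 1000)
]

names = low_enrollment["school_name"].tolist()
print(names)
```

The result is itself a DataFrame, so it can be passed to plot() or mapped in the same way as combined.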
from mpl_toolkits.basemap import Basemap
# llcrnrlat / llcrnrlon = latitude / longitude of the lower-left corner of the map.
# urcrnrlat / urcrnrlon = latitude / longitude of the upper-right corner of the map.
m = Basemap(projection = "merc", llcrnrlat = 40.496044, urcrnrlat = 40.915256, llcrnrlon = -74.255735, urcrnrlon = -73.700272, resolution = "i")
m.drawmapboundary(fill_color='#85A6D9')
m.drawcoastlines(color='#6D5F47', linewidth=.4)
m.drawrivers(color='#6D5F47', linewidth=.4)
latitudes = combined["lat"].tolist()
longitudes = combined["lon"].tolist()
m.scatter(longitudes, latitudes, s = 20, zorder = 2, latlon = True) # pass plain lists (hence tolist() above); latlon = True treats the values as degrees and converts them to map coordinates
5. We can change the parameter of the scatter() to change the
plt.show()