BK: Data mining, Chapter 2 - getting to know your data

Posted by dulun



Why: real-world data are typically noisy, enormous in volume, and may originate from a hodgepodge of heterogeneous sources. 

Basic statistics to look at for each attribute: mean; median; mode (most common value); distribution.

Knowing such basic statistics regarding each attribute makes it easier to fill in missing values, smooth noisy values, and spot outliers during data preprocessing.
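As a rough illustration of the point above, here is a minimal Python sketch using only the standard library. The helper names (`summarize`, `fill_missing`, `iqr_outliers`) and the sample `ages` list are hypothetical, made up for this note. Filling with the median and flagging values beyond 1.5 × IQR are common conventions assumed here, not something these notes prescribe.

```python
import statistics


def summarize(values):
    """Basic statistics for one numeric attribute, skipping missing (None) entries."""
    present = [v for v in values if v is not None]
    return {
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "mode": statistics.mode(present),  # most common value
    }


def fill_missing(values, fill):
    """Replace missing (None) entries with a chosen fill value."""
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed rule of thumb)."""
    present = sorted(v for v in values if v is not None)
    q1, _, q3 = statistics.quantiles(present, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in present if v < low or v > high]


# Hypothetical sample attribute: one missing value and one suspicious value.
ages = [23, 25, 25, None, 27, 29, 31, 95]

stats = summarize(ages)
filled = fill_missing(ages, stats["median"])
outliers = iqr_outliers(ages)

print(stats)
print(filled)
print(outliers)
```

The median is used as the fill value here because, unlike the mean, it is not pulled upward by the single extreme value in the sample.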
