Handling missing data (NaN) in Python

Posted 2021-05-09

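import numpy as np
from sklearn.impute import SimpleImputer

# mean imputation, column by column (other strategies: median, most_frequent, constant)
# SimpleImputer replaces the Imputer class, which was removed in scikit-learn 0.22;
# it always works per column, so there is no axis argument
imr = SimpleImputer(missing_values=np.nan, strategy='mean')
imr = imr.fit(df.values)
# transform returns a NumPy array, not a DataFrame
imputed_data = imr.transform(df.values)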
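
For comparison, the same column-mean fill can be done with pandas alone, which keeps the column labels instead of returning a bare array. This is a minimal sketch, not part of the original snippet: the sample `df` and the name `filled` are assumptions made up for illustration, since the post never shows how `df` is built.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data (an assumption; the original post does not define df).
df = pd.DataFrame(
    {
        "A": [1.0, 5.0, 10.0],
        "B": [2.0, 6.0, 11.0],
        "C": [3.0, np.nan, 12.0],
        "D": [4.0, 8.0, np.nan],
    }
)

# df.mean() skips NaN by default and returns one mean per column;
# fillna then replaces each NaN with the mean of its own column.
filled = df.fillna(df.mean())

print(filled)
```

Here the NaN in column `C` becomes 7.5 (the mean of 3 and 12) and the NaN in column `D` becomes 6.0 (the mean of 4 and 8). `fillna` returns a new DataFrame, so `df` still contains its original NaN values.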

# return the number of missing values per column
df.isnull().sum()

# drop rows with missing values (returns a new DataFrame; df itself is unchanged)
df.dropna()

# drop columns that contain at least one NaN
df.dropna(axis=1)

# only drop rows where all columns are NaN
df.dropna(how='all')

# drop rows that have fewer than 4 non-NaN values
df.dropna(thresh=4)

# only drop rows where NaN appear in specific columns (here: 'C')
df.dropna(subset=['C'])
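
To see what each call actually keeps, the calls above can be run against a small DataFrame. This is a sketch under assumptions: the sample `df` and all the result variable names are invented for illustration and do not come from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 0 is complete, rows 1 and 2 each miss one value,
# and row 3 is entirely NaN.
df = pd.DataFrame(
    {
        "A": [1.0, 5.0, 10.0, np.nan],
        "B": [2.0, 6.0, 11.0, np.nan],
        "C": [3.0, np.nan, 12.0, np.nan],
        "D": [4.0, 8.0, np.nan, np.nan],
    }
)

missing_per_column = df.isnull().sum()  # A: 1, B: 1, C: 2, D: 2

complete_rows = df.dropna()             # keeps row 0 only
complete_cols = df.dropna(axis=1)       # every column has a NaN, so none remain
not_all_nan = df.dropna(how="all")      # drops only row 3
at_least_4 = df.dropna(thresh=4)        # keeps rows with >= 4 non-NaN values: row 0
c_present = df.dropna(subset=["C"])     # keeps rows where C is not NaN: rows 0 and 2

print(missing_per_column)
print(c_present)
```

None of these calls modifies `df`; each returns a new DataFrame, so the result has to be assigned if it is needed later.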
