
python - Filling empty columns with mode value in pandas

Posted on 2020-11-28 07:46:05

I want to fill the NaN values in the gender column with the mode using pandas, but my approach does not work:

# change gender to string datatype
df['gender'] = df['gender'].map(str)

# Replace empty gender (73) with the most common gender
mode = df['gender'].mode()
df['gender'].fillna(mode, inplace=True)

df['gender'].value_counts()

Output:

M      4417
F      1504
nan      73
Name: gender, dtype: int64

Questioner: Shadow Walker
jezrael 2020-11-28 18:52:41

Testing with the data:

import pandas as pd

df = pd.read_pickle('gender.pkl')
print (df)
       gender
0           M
1           M
2           M
3           M
4           M
      ...
114746      M
114747      M
114748      M
114749      M
114750      F

print (df['gender'].isna().sum())
785

print (df['gender'].value_counts())
M    85893
F    28073
Name: gender, dtype: int64
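
Why passing the result of mode() straight to fillna leaves values unfilled can be reproduced on a small example. This is only a sketch on hypothetical toy data (the Series `s` below is an assumption, not the dataset from the question): `Series.mode()` returns a Series, and `fillna` with a Series aligns on index labels instead of broadcasting one value.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the dataset from the question.
s = pd.Series(["M", np.nan, "F", "M", np.nan], name="gender", dtype=object)

# mode() returns a Series because several values can tie for most frequent.
mode = s.mode()
print(type(mode).__name__, list(mode.index), list(mode))

# fillna with a Series matches on index labels. The mode Series only has
# label 0, so the NaN values at labels 1 and 4 are left as they are.
filled = s.fillna(mode)
print(filled.isna().sum())
```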

Series.mode() returns a Series (there can be more than one mode), so you need to select its first value with Series.iat:

mode = df['gender'].mode().iat[0]
df['gender'].fillna(mode, inplace=True)


print (df['gender'].isna().sum())
0

print (df['gender'].value_counts())

M    86678
F    28073
Name: gender, dtype: int64
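
Two further points may matter when applying this to the code in the question; the sketch below uses hypothetical toy data and is not part of the original answer. First, the "nan 73" row in the question's output suggests that map(str) had already turned the missing values into the literal string 'nan', in which case fillna has nothing left to fill. Second, in recent pandas versions, calling fillna(..., inplace=True) on a column selection such as df['gender'] may emit a warning or leave df unchanged, so assigning the result back is the safer form.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; object dtype is an assumption about the real column.
df = pd.DataFrame(
    {"gender": pd.Series(["M", np.nan, "F", "M", np.nan], dtype=object)}
)

# Pitfall: map(str) converts NaN into the string 'nan',
# so isna() no longer detects any missing values.
as_text = df["gender"].map(str)
print(as_text.isna().sum())
print((as_text == "nan").sum())

# Fill the original column instead, and assign the result back
# rather than relying on inplace=True on a column selection.
most_common = df["gender"].mode().iat[0]
df["gender"] = df["gender"].fillna(most_common)
print(df["gender"].value_counts())
```

If the column really does need to be text, converting after the fill (rather than before it) keeps the missing values visible to fillna.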