What is the simplest way to add an empty column to a pandas DataFrame object? The best I've come across is
df['foo'] = df.apply(lambda _: '', axis=1)
Is there a more sensible way?
Current answer
If I understand correctly, plain assignment should do it:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"A": [1,2,3], "B": [2,3,4]})
>>> df
   A  B
0  1  2
1  2  3
2  3  4
>>> df["C"] = ""
>>> df["D"] = np.nan
>>> df
   A  B C   D
0  1  2   NaN
1  2  3   NaN
2  3  4   NaN
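The two assignments are not equivalent: an empty string is an ordinary value, while NaN is treated as missing. A minimal sketch to check this, reusing the frame from the example above (the names c_missing and d_missing are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})
df["C"] = ""      # scalar is broadcast to every row; column holds empty strings
df["D"] = np.nan  # scalar is broadcast to every row; column holds float NaN

# Count how many entries pandas considers missing in each new column.
c_missing = int(df["C"].isna().sum())
d_missing = int(df["D"].isna().sum())
print(c_missing, d_missing)
print(df["D"].dtype)
```

So np.nan is probably the better choice if the column will later hold numbers, and "" if it will hold text.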
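Other answers
If you have a list of columns that you want to be empty, you can use assign together with a dict comprehension and dictionary unpacking.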
>>> df = pd.DataFrame({"A": [1,2,3], "B": [2,3,4]})
>>> nan_cols_name = ["C","D","whatever"]
>>> df.assign(**{col:np.nan for col in nan_cols_name})
   A  B   C   D  whatever
0  1  2 NaN NaN       NaN
1  2  3 NaN NaN       NaN
2  3  4 NaN NaN       NaN
If you want different values for different columns, you can also unpack several dictionaries inside the unpacked dictionary.
df = pd.DataFrame({"A": [1,2,3], "B": [2,3,4]})
nan_cols_name = ["C","D","whatever"]
empty_string_cols_name = ["E","F","bad column with space"]
df.assign(**{
    **{col:np.nan for col in nan_cols_name},
    **{col:"" for col in empty_string_cols_name}
    }
)
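Keep in mind that assign returns a new DataFrame and leaves the original untouched, so the result has to be captured. A short sketch of that, reusing the names from this answer (result is an illustrative name that is not in the original):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [2, 3, 4]})
nan_cols_name = ["C", "D", "whatever"]
empty_string_cols_name = ["E", "F", "bad column with space"]

# assign() does not modify df in place, so keep the returned frame.
result = df.assign(**{
    **{col: np.nan for col in nan_cols_name},
    **{col: "" for col in empty_string_cols_name},
})

print(list(df.columns))      # the original frame is unchanged
print(list(result.columns))  # the original columns plus the new ones
```

Because the names are passed through dictionary unpacking, a column name containing spaces works here even though it could not be written as a literal keyword argument.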
You can do this:
df['column'] = None  # Works: creates a new column filled with None
df.column = None  # Works only when the column already exists in the DataFrame; otherwise no column is created
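The difference between the two forms can be checked directly. A minimal sketch, assuming a small example frame (the column name other and the flags has_column and has_other are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3]})

# Bracket assignment creates the column.
df["column"] = None
has_column = "column" in df.columns

# Attribute assignment to a name that is not yet a column only sets a
# plain Python attribute on the object; no column is added.
df.other = None
has_other = "other" in df.columns

print(has_column, has_other)
```

For that reason bracket assignment is the safer habit when adding new columns.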
If you want to add columns from a list of names:
df = pd.DataFrame()
a = ['col1', 'col2', 'col3', 'col4']
for i in a:
    df[i] = np.nan
I like:
df['new'] = pd.Series(dtype='int')
# or use other dtypes like 'float', 'object', ...
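If you have an empty DataFrame, this solution ensures that no new row containing only NaN is added.
Specifying the dtype is not strictly required, but newer pandas versions produce a DeprecationWarning if it is left out.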
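A rough sketch of how this behaves on an empty frame versus a frame that already has rows (the variable names empty and df are only for illustration):

```python
import pandas as pd

# Empty frame: the column is added and the frame still has no rows.
empty = pd.DataFrame()
empty["new"] = pd.Series(dtype="int")
print(len(empty), list(empty.columns))

# Frame with rows: the empty Series is aligned on the index,
# so every existing row gets a missing value in the new column.
df = pd.DataFrame({"A": [1, 2, 3]})
df["new"] = pd.Series(dtype="int")
print(df["new"].isna().all())
```

Note that in the second case the requested integer dtype cannot be kept, since the column has to hold missing values.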