I have a pandas DataFrame that looks like this (it is very large):
                date      exer exp     ifor         mat
    1092  2014-03-17  American   M  528.205  2014-04-19
    1093  2014-03-17  American   M  528.205  2014-04-19
    1094  2014-03-17  American   M  528.205  2014-04-19
    1095  2014-03-17  American   M  528.205  2014-04-19
    1096  2014-03-17  American   M  528.205  2014-05-17
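Now I want to iterate over it row by row. As I go through each row, the value of ifor in that row may change depending on some conditions, and to check them I need to look up another DataFrame.
How do I update the value while iterating?
I have tried a few things, and none of them worked: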
    for i, row in df.iterrows():
        if <something>:
            row['ifor'] = x
        else:
            row['ifor'] = y
        df.ix[i]['ifor'] = x
None of these approaches seems to work. I do not see the updated values in the DataFrame.
Well, if you are going to iterate anyway, why not use the simplest method of all, df['Column'].values[i]?
    df['Column'] = ''
    for i in range(len(df)):
        df['Column'].values[i] = something/update/new_value
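Or, if you want to compare the new values with the old ones or something similar, why not collect them in a list and assign the list as a column at the end?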
    mylist, df['Column'] = [], ''
    for <condition>:
        mylist.append(something/update/new_value)
    df['Column'] = mylist
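As a rough, runnable sketch of the list approach above: the small frame, the `ifor_new` column name and the capping rule are all made up for illustration and are not part of the question.

```python
import pandas as pd

# Toy data standing in for the real (much larger) frame.
df = pd.DataFrame({"ifor": [528.205, 530.0, 525.5]})

new_values = []
for old in df["ifor"]:
    # Hypothetical rule: cap every value at 528.205.
    new_values.append(min(old, 528.205))

# Assign the whole list at once; writing to a new column keeps
# the old values around for comparison.
df["ifor_new"] = new_values
print(df)
```

The list must end up with exactly one entry per row, otherwise the final assignment raises a ValueError.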
A pandas DataFrame object should be thought of as a Series of Series. In other words, you should think of it in terms of columns. This matters because when you use pd.DataFrame.iterrows you are iterating over the rows as Series. These are not the Series that the data frame stores; they are new Series created for you as you iterate. So when you assign to them, those edits generally do not end up reflected in the original data frame.
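A small sketch of that behaviour, using a made-up three-row frame with mixed column types (the pandas documentation notes that whether iterrows hands back a copy depends on the dtypes; with mixed dtypes, as in the question, it is a copy):

```python
import pandas as pd

# Toy frame with a string column and a float column, like the question's data.
df = pd.DataFrame({"exp": ["M", "M", "M"], "ifor": [528.205, 528.205, 528.205]})

for i, row in df.iterrows():
    # `row` is a freshly built Series, so this write never reaches `df`.
    row["ifor"] = 0.0

unchanged = df["ifor"].tolist()
print(unchanged)
```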
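OK, now that that is out of the way: how do we do this?
Suggestions prior to this post include:
pd.DataFrame.set_value, which is deprecated as of pandas version 0.21 (and removed in 1.0)
pd.DataFrame.ix, which is deprecated (and also removed in 1.0)
pd.DataFrame.loc, which is fine but also has to handle array indexers, and you can do better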
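To make the "array indexers" remark concrete, here is a minimal sketch on a made-up frame: loc accepts both a single label and a list of labels, while at is restricted to one scalar row label and one scalar column label.

```python
import pandas as pd

# Toy frame; the index labels mimic the ones in the question.
df = pd.DataFrame({"ifor": [1.0, 2.0, 3.0]}, index=[1092, 1093, 1094])

df.loc[1092, "ifor"] = 10.0          # loc with a scalar label
df.loc[[1093, 1094], "ifor"] = 20.0  # loc with a list of labels

df.at[1092, "ifor"] = 11.0           # at: scalar labels only
print(df)
```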
My recommendation
Use pd.DataFrame.at
    for i in df.index:
        if <something>:
            df.at[i, 'ifor'] = x
        else:
            df.at[i, 'ifor'] = y
You can even change this to:
    for i in df.index:
        df.at[i, 'ifor'] = x if <something> else y
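Here is one way the df.at loop might look with the question's "look up another DataFrame" condition filled in. The `lookup` frame, its contents and the matching rule on `mat` are assumptions made purely for illustration.

```python
import pandas as pd

# A few rows shaped like the data in the question.
df = pd.DataFrame(
    {
        "date": ["2014-03-17"] * 3,
        "exer": ["American"] * 3,
        "exp": ["M"] * 3,
        "ifor": [528.205] * 3,
        "mat": ["2014-04-19", "2014-04-19", "2014-05-17"],
    },
    index=[1092, 1093, 1096],
)

# Hypothetical second frame: a replacement ifor value keyed by maturity date.
lookup = pd.DataFrame({"ifor": [530.0]}, index=["2014-05-17"])

for i in df.index:
    mat = df.at[i, "mat"]
    if mat in lookup.index:
        # Write straight into the original frame by row label and column name.
        df.at[i, "ifor"] = lookup.at[mat, "ifor"]

print(df)
```

Only the row whose maturity appears in `lookup` changes; the other rows keep their original value.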
Response to comment
"And what if I need to use the value of the previous row for the if condition?"
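    # Integer position of the column; it does not change, so compute it once.
    j = df.columns.get_loc('ifor')
    for i in range(1, len(df) + 1):
        # df.iat takes integer positions, so neighbouring rows can be read by position.
        if <something>:
            df.iat[i - 1, j] = x
        else:
            df.iat[i - 1, j] = y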
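A runnable sketch of a condition that depends on the previous row, assuming a made-up rule (a value may never drop below the already updated value of the row before it) and toy data:

```python
import pandas as pd

df = pd.DataFrame({"ifor": [1.0, 5.0, 2.0, 7.0]})
j = df.columns.get_loc("ifor")  # integer position of the column

# Start at position 1 so that position i - 1 always exists.
for i in range(1, len(df)):
    prev = df.iat[i - 1, j]  # previous row, including any update made to it
    # Hypothetical rule: never fall below the previous row's value.
    if df.iat[i, j] < prev:
        df.iat[i, j] = prev

print(df)
```

Because each write lands in the frame immediately, the next iteration sees the updated previous row, which is the main reason to loop here at all.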