This is my code:

import pandas as pd

data = pd.DataFrame({'Odd':[1,3,5,6,7,9], 'Even':[0,2,4,6,8,10]})

for i in reversed(data):
    print(data['Odd'], data['Even'])

When I run this code, I get the following error:

Traceback (most recent call last):
  File "C:\Python33\lib\site-packages\pandas\core\generic.py", line 665, in _get_item_cache
    return cache[item]
KeyError: 5

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\*****\Documents\******\********\****.py", line 5, in <module>
    for i in reversed(data):
  File "C:\Python33\lib\site-packages\pandas\core\frame.py", line 2003, in __getitem__
    return self._get_item_cache(key)
  File "C:\Python33\lib\site-packages\pandas\core\generic.py", line 667, in _get_item_cache
    values = self._data.get(item)
  File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 1656, in get
    _, block = self._find_block(item)
  File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 1936, in _find_block
    self._check_have(item)
  File "C:\Python33\lib\site-packages\pandas\core\internals.py", line 1943, in _check_have
    raise KeyError('no item named %s' % com.pprint_thing(item))
KeyError: 'no item named 5'

Why am I getting this error? How can I fix it? What is the right way to reverse a pandas.DataFrame?


data.reindex(index=data.index[::-1])

or simply:

data.iloc[::-1]

will reverse your data frame. If you want a for loop that goes from the bottom up, you can do:

for idx in reversed(data.index):
    print(idx, data.loc[idx, 'Even'], data.loc[idx, 'Odd'])

or

for idx in reversed(data.index):
    print(idx, data.Even[idx], data.Odd[idx])

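You get an error because reversed first calls data.__len__(), which returns 6. It then tries to call data[j - 1] for j in range(6, 0, -1), so the first call is data[5]. In a pandas DataFrame, however, data[5] means the column labelled 5, and there is no such column, so it throws an exception. (See the docs.)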
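
As a rough illustration of the explanation above, the sketch below uses a hypothetical Recorder class (not part of pandas) to log the keys that reversed() requests from an object that only provides __len__ and __getitem__. It then repeats the first of those lookups on the DataFrame from the question.

```python
import pandas as pd

data = pd.DataFrame({'Odd': [1, 3, 5, 6, 7, 9], 'Even': [0, 2, 4, 6, 8, 10]})


class Recorder:
    """Hypothetical helper (not part of pandas): logs the keys reversed() asks for."""

    def __init__(self, n):
        self.n = n
        self.keys = []

    def __len__(self):
        return self.n

    def __getitem__(self, key):
        self.keys.append(key)
        return key


rec = Recorder(len(data))
list(reversed(rec))   # reversed() falls back to __len__ plus __getitem__
print(rec.keys)       # keys are requested from len - 1 down to 0

# The first of those lookups on the DataFrame is a *column* lookup, which fails
# because the only column labels are 'Odd' and 'Even'.
try:
    data[len(data) - 1]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)
```

The point of the sketch is only that the integer keys are interpreted as column labels, not row positions.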


You can reverse the rows in an even simpler way:

df[::-1]

For example:

    for i,r in data[::-1].iterrows():
        print(r['Odd'], r['Even'])

None of the existing answers resets the index after reversing the dataframe.

To do that, use the following:

 data[::-1].reset_index()

Here is a utility function that also drops the old index column, as suggested in @Tim's comment:

def reset_my_index(df):
    # Reverse the row order, then replace the old index with a fresh 0..n-1 range
    res = df[::-1].reset_index(drop=True)
    return res

Simply pass your dataframe into the function.
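
To make the difference between the two reset_index calls concrete, here is a small sketch using the DataFrame from the question. The variable names kept and dropped are only illustrative.

```python
import pandas as pd

data = pd.DataFrame({'Odd': [1, 3, 5, 6, 7, 9], 'Even': [0, 2, 4, 6, 8, 10]})

# Without drop=True the old labels survive as an extra column named 'index'.
kept = data[::-1].reset_index()

# With drop=True the old labels are discarded entirely.
dropped = data[::-1].reset_index(drop=True)

print(kept)
print(dropped)
```

In both cases the result has a fresh index running from 0 upward; the only difference is whether the old labels are preserved as a column.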


If you are dealing with a sorted range index, one way is:

data = data.sort_index(ascending=False)

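The advantages of this approach are: (1) it is a single line, (2) it does not require a utility function, and most importantly (3) it does not actually change any of the data in the dataframe.

Note: this works by sorting the index in descending order, so it may not be appropriate for, or generalize to, every DataFrame.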
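
The caveat in the note can be seen with a small made-up frame whose index is not sorted (the column name val and the labels are assumptions chosen for illustration):

```python
import pandas as pd

# Index deliberately not in ascending order.
df = pd.DataFrame({'val': ['a', 'b', 'c']}, index=[2, 0, 1])

by_sort = df.sort_index(ascending=False)   # orders rows by label: 2, 1, 0
by_slice = df[::-1]                        # reverses row positions: 1, 0, 2

print(by_sort)
print(by_slice)
```

Here sorting the index does not reproduce the reversed row order, which is why this approach is best kept for frames with a sorted range index.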


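What is the right way to reverse a pandas DataFrame?

TL;DR: df[::-1]

This is the best method for reversing a DataFrame because 1) it runs in constant time, i.e. O(1), 2) it is a single operation, and 3) it is concise and readable (assuming familiarity with slice notation).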


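Long Version

I've found the ol' slicing trick df[::-1] (or the equivalent df.loc[::-1] [1]) to be the most concise and idiomatic way of reversing a DataFrame. It mirrors the Python list reversal syntax lst[::-1] and is clear in its intent. With the loc syntax you can also slice columns if required, so it is a bit more flexible.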

Some points to consider when handling the index:

"What if I want to reverse the index as well?" You're already done. df[::-1] reverses both the index and the values.

"What if I want to drop the index from the result?" You can call .reset_index(drop=True) at the end.

"What if I want to keep the index untouched (in other words, only reverse the data, not the index)?" This is somewhat unconventional because it implies the index isn't really relevant to the data. Perhaps consider removing it entirely? That said, what you're asking for can technically be achieved using either df[:] = df[::-1], which creates an in-place update to df, or df.loc[::-1].set_index(df.index), which returns a copy.

1: df.loc[::-1] and df.iloc[::-1] are equivalent because the slicing syntax stays the same whether you are reversing by position (iloc) or by label (loc).
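
The index-handling options above can be sketched on a small frame with non-default labels. The frame and the variable names both, fresh and values_only are assumptions made up for this example, and the in-place df[:] = df[::-1] variant is left out.

```python
import pandas as pd

df = pd.DataFrame({'val': [1, 2, 3]}, index=['a', 'b', 'c'])

both = df[::-1]                                  # index and values reversed together
fresh = df[::-1].reset_index(drop=True)          # old labels dropped, new 0..n-1 index
values_only = df.loc[::-1].set_index(df.index)   # values reversed, original labels kept

# For a full reverse slice, label-based and position-based slicing agree.
same = df.loc[::-1].equals(df.iloc[::-1])

print(both, fresh, values_only, same, sep='\n\n')
```

None of these three expressions modifies df itself; each returns a new object.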


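The proof is in the pudding

The x-axis represents the dataset size. The y-axis represents the time taken to reverse. No method scales as well as the slicing trick, which stays at the bottom of the graph throughout. See the benchmarking code for reference; the plot was generated using perfplot.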
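
The original benchmark code is not reproduced here. As a rough stand-in, the sketch below times the three approaches with the standard-library timeit module instead of perfplot. The frame size and the number of repetitions are arbitrary assumptions, and the absolute timings will vary with machine and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame({'x': range(1000)})   # size chosen arbitrarily for illustration

candidates = {
    'slice': lambda: df[::-1],
    'reindex': lambda: df.reindex(index=df.index[::-1]),
    'sort_index': lambda: df.sort_index(ascending=False),
}

# On a default range index all three should produce the same row order.
orders = {name: list(fn()['x']) for name, fn in candidates.items()}

# Time each candidate and print them from fastest to slowest.
timings = {name: timeit.timeit(fn, number=100) for name, fn in candidates.items()}
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f'{name}: {seconds:.4f}s for 100 calls')
```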


Comments on other solutions

df.reindex(index=df.index[::-1]) is clearly a popular solution, but at first glance, how obvious is it to an unfamiliar reader that this code is "reversing a DataFrame"? Additionally, this reverses the index and then uses that intermediate result to reindex, so it is essentially a TWO step operation (when it could have been just one).

df.sort_index(ascending=False) may work in most cases where you have a simple range index, but it assumes your index was sorted in ascending order and so doesn't generalize well.

PLEASE do not use iterrows. I see some options suggesting iterating in reverse. Whatever your use case, there is likely a vectorized method available, but if there isn't, you can use something a little more reasonable such as a list comprehension. See "How to iterate over rows in a DataFrame in Pandas" for more detail on why iterrows is an antipattern.
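
As one possible reading of the list-comprehension suggestion, the sketch below walks the rows of the question's DataFrame from bottom to top without iterrows. The name pairs is only illustrative, and this is one of several reasonable ways to write it.

```python
import pandas as pd

data = pd.DataFrame({'Odd': [1, 3, 5, 6, 7, 9], 'Even': [0, 2, 4, 6, 8, 10]})

# Reverse each column with a slice, then pair the values up row by row.
pairs = [(odd, even) for odd, even in zip(data['Odd'][::-1], data['Even'][::-1])]

for odd, even in pairs:
    print(odd, even)
```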