I have the following DataFrame (df):

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))

I add more columns by assignment:

df['mean'] = df.mean(1)

How can I move the column mean to the front, i.e. make it the first column while leaving the order of the other columns unchanged?


Current answer

Pass the name of the column you want to move and the index of its new position.

def change_column_order(df, col_name, index):
    cols = df.columns.tolist()
    cols.remove(col_name)
    cols.insert(index, col_name)
    return df[cols]

For your case, this would be:

df = change_column_order(df, 'mean', 0)
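
As a quick check of the function above, here is a minimal sketch on a small deterministic frame. The name demo and its values are made up for illustration; the function body is copied unchanged from this answer.

```python
import numpy as np
import pandas as pd

def change_column_order(df, col_name, index):
    cols = df.columns.tolist()
    cols.remove(col_name)
    cols.insert(index, col_name)
    return df[cols]

# Hypothetical demo frame: 2 rows, columns labelled 0..4
demo = pd.DataFrame(np.arange(10).reshape(2, 5))
demo['mean'] = demo.mean(axis=1)  # new column is appended at the end

moved = change_column_order(demo, 'mean', 0)
moved_cols = moved.columns.tolist()
print(moved_cols)  # ['mean', 0, 1, 2, 3, 4]
```

Note that df[cols] returns a new DataFrame, so the original frame keeps its column order unless you reassign the result.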

Other answers

To place an existing column to the right or left of another column, identified by name, do the following:

def df_move_column(df, col_to_move, col_left_of_destiny="", right_of_col_bool=True):
    cols = list(df.columns.values)
    index_max = len(cols) - 1

    if not right_of_col_bool:
        # set left of a column "c", is like putting right of column previous to "c"
        # ... except if left of 1st column, then recursive call to set rest right to it
        aux = cols.index(col_left_of_destiny)
        if not aux:
            for g in [x for x in cols[::-1] if x != col_to_move]:
                df = df_move_column(
                        df, 
                        col_to_move=g, 
                        col_left_of_destiny=col_to_move
                        )
            return df
        col_left_of_destiny = cols[aux - 1]

    index_old = cols.index(col_to_move)
    index_new = 0
    if len(col_left_of_destiny):
        index_new = cols.index(col_left_of_destiny) + 1

    if index_old == index_new:
        return df

    if index_new < index_old:
        index_new = np.min([index_new, index_max])
        cols = (
            cols[:index_new]
            + [cols[index_old]]
            + cols[index_new:index_old]
            + cols[index_old + 1 :]
        )
    else:
        cols = (
            cols[:index_old]
            + cols[index_old + 1 : index_new]
            + [cols[index_old]]
            + cols[index_new:]
        )

    df = df[cols]
    return df

E.g.

cols = list("ABCD")
df2 = pd.DataFrame(np.arange(4)[np.newaxis, :], columns=cols)
for k in cols:
    print(30 * "-")
    for g in [x for x in cols if x != k]:
        df_new = df_move_column(df2, k, g)
        print(f"{k} after {g}:  {df_new.columns.values}")
for k in cols:
    print(30 * "-")
    for g in [x for x in cols if x != k]:
        df_new = df_move_column(df2, k, g, right_of_col_bool=False)
        print(f"{k} before {g}:  {df_new.columns.values}")

Output:

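I liked Shoresh's answer, which uses set functionality to remove columns when you don't know their position, but it didn't work for my purpose because I need to keep the original column order (with arbitrary column labels).

I got this to work, though, by using IndexedSet from the boltons package.

I also needed to re-add multiple column labels, so for a more general case I used the following code: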

from boltons.setutils import IndexedSet
cols = list(IndexedSet(df.columns.tolist()) - set(['mean', 'std']))
cols[0:0] = ['mean', 'std']
df = df[cols]

I hope this is useful to anyone searching this thread for a general solution.
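
If installing boltons is not an option, the same order-preserving removal can be sketched with a plain list comprehension. This is a different technique from IndexedSet, not part of the answer above; df_demo, front and rest are hypothetical names used only for illustration.

```python
import pandas as pd

# Hypothetical demo frame with two derived columns appended at the end
df_demo = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
df_demo['mean'] = df_demo[['a', 'b']].mean(axis=1)
df_demo['std'] = df_demo[['a', 'b']].std(axis=1)

front = ['mean', 'std']
# Keep every other column in its original relative order
rest = [c for c in df_demo.columns if c not in front]
df_demo = df_demo[front + rest]

print(df_demo.columns.tolist())  # ['mean', 'std', 'a', 'b']
```

A list comprehension iterates the columns in order, so unlike a plain set difference it never scrambles the remaining labels.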

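I have a very specific use case for reordering column names in pandas. Sometimes I create a new column in a dataframe that is based on an existing column. By default, pandas inserts the new column at the end, but I want it inserted next to the existing column it is derived from.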

def rearrange_list(input_list, input_item_to_move, input_item_insert_here):
    '''
    Helper function to re-arrange the order of items in a list.
    Useful for moving column in pandas dataframe.

    Inputs:
        input_list - list
        input_item_to_move - item in list to move
        input_item_insert_here - item in list, insert before 

    returns:
        output_list
    '''
    # make copy for output, make sure it's a list
    output_list = list(input_list)

    # index of item to move
    idx_move = output_list.index(input_item_to_move)

    # pop off the item to move
    itm_move = output_list.pop(idx_move)

    # index of item to insert here
    idx_insert = output_list.index(input_item_insert_here)

    # insert item to move into here
    output_list.insert(idx_insert, itm_move)

    return output_list


import pandas as pd
from IPython.display import display  # built in to Jupyter notebooks; needed when run as a script

# step 1: create sample dataframe
df = pd.DataFrame({
    'motorcycle': ['motorcycle1', 'motorcycle2', 'motorcycle3'],
    'initial_odometer': [101, 500, 322],
    'final_odometer': [201, 515, 463],
    'other_col_1': ['blah', 'blah', 'blah'],
    'other_col_2': ['blah', 'blah', 'blah']
})
print('Step 1: create sample dataframe')
display(df)
print()

# step 2: add new column that is difference between final and initial
df['change_odometer'] = df['final_odometer']-df['initial_odometer']
print('Step 2: add new column')
display(df)
print()

# step 3: rearrange columns
ls_cols = df.columns
ls_cols = rearrange_list(ls_cols, 'change_odometer', 'final_odometer')
df = df[ls_cols]
print('Step 3: rearrange columns')
display(df)
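
When the derived column has not been added yet, a possible shortcut is to place it directly with DataFrame.insert instead of reordering afterwards. The sketch below is an alternative to the helper above, not a replacement for it; df_demo, change and loc are hypothetical names, and the data is a trimmed copy of the sample frame.

```python
import pandas as pd

# Trimmed copy of the sample data from this answer
df_demo = pd.DataFrame({
    'initial_odometer': [101, 500, 322],
    'final_odometer': [201, 515, 463],
    'other_col_1': ['blah', 'blah', 'blah'],
})

change = df_demo['final_odometer'] - df_demo['initial_odometer']

# Integer position of the reference column (labels are assumed unique)
loc = df_demo.columns.get_loc('final_odometer')

# insert() modifies the frame in place and puts the new column before 'final_odometer'
df_demo.insert(loc, 'change_odometer', change)

print(df_demo.columns.tolist())
# ['initial_odometer', 'change_odometer', 'final_odometer', 'other_col_1']
```

Use loc + 1 to place the new column after the reference column instead. Keep in mind that insert raises a ValueError if the column name already exists, so it suits new columns rather than moving existing ones.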

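Similar to the answer above, there is an alternative using deque() and its rotate() method. Called without arguments, rotate() shifts the elements one step to the right, so the last element moves to the front. Since the new column was appended last, this makes it the first column: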

from collections import deque

columns = deque(df.columns.tolist())
columns.rotate()

df = df[columns]
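
To make the effect of rotate() concrete, here is a small sketch on a made-up frame (df_demo is a hypothetical name). It converts the deque back to a list before indexing, which is an extra precaution and not part of the answer above.

```python
from collections import deque

import pandas as pd

# Hypothetical demo frame; 'mean' is the last column
df_demo = pd.DataFrame({'a': [1], 'b': [2], 'mean': [1.5]})

columns = deque(df_demo.columns.tolist())
columns.rotate()  # same as rotate(1): the last label moves to the front

rotated = df_demo[list(columns)]
print(rotated.columns.tolist())  # ['mean', 'a', 'b']
```

This only works when the column to move is already last; rotate(n) shifts every label by n positions, so it cannot pull a column out of the middle without disturbing the others.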