我有以下DataFrame(df):

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))

我通过分配添加更多列:

df['mean'] = df.mean(1)

如何将列的意思移到前面,即将其设置为第一列,而其他列的顺序保持不变?


当前回答

下面是一个超级简单的方法示例。如果您要从excel复制标题,请使用.split('\t')

df = df['FILE_NAME DISPLAY_PATH SHAREPOINT_PATH RETAILER LAST_UPDATE'.split()]

其他回答

要根据其他列的名称将现有列设置为右侧/左侧,请执行以下操作:

def df_move_column(df, col_to_move, col_left_of_destiny="", right_of_col_bool=True):
    cols = list(df.columns.values)
    index_max = len(cols) - 1

    if not right_of_col_bool:
        # set left of a column "c", is like putting right of column previous to "c"
        # ... except if left of 1st column, then recursive call to set rest right to it
        aux = cols.index(col_left_of_destiny)
        if not aux:
            for g in [x for x in cols[::-1] if x != col_to_move]:
                df = df_move_column(
                        df, 
                        col_to_move=g, 
                        col_left_of_destiny=col_to_move
                        )
            return df
        col_left_of_destiny = cols[aux - 1]

    index_old = cols.index(col_to_move)
    index_new = 0
    if len(col_left_of_destiny):
        index_new = cols.index(col_left_of_destiny) + 1

    if index_old == index_new:
        return df

    if index_new < index_old:
        index_new = np.min([index_new, index_max])
        cols = (
            cols[:index_new]
            + [cols[index_old]]
            + cols[index_new:index_old]
            + cols[index_old + 1 :]
        )
    else:
        cols = (
            cols[:index_old]
            + cols[index_old + 1 : index_new]
            + [cols[index_old]]
            + cols[index_new:]
        )

    df = df[cols]
    return df

E.g.

cols = list("ABCD")
df2 = pd.DataFrame(np.arange(4)[np.newaxis, :], columns=cols)
for k in cols:
    print(30 * "-")
    for g in [x for x in cols if x != k]:
        df_new = df_move_column(df2, k, g)
        print(f"{k} after {g}:  {df_new.columns.values}")
for k in cols:
    print(30 * "-")
    for g in [x for x in cols if x != k]:
        df_new = df_move_column(df2, k, g, right_of_col_bool=False)
        print(f"{k} before {g}:  {df_new.columns.values}")

输出:

我认为这个函数更简单。您只需在开始或结束处或同时指定列的子集:

def reorder_df_columns(df, start=None, end=None):
    """
        This function reorder columns of a DataFrame.
        It takes columns given in the list `start` and move them to the left.
        Its also takes columns in `end` and move them to the right.
    """
    if start is None:
        start = []
    if end is None:
        end = []
    assert isinstance(start, list) and isinstance(end, list)
    cols = list(df.columns)
    for c in start:
        if c not in cols:
            start.remove(c)
    for c in end:
        if c not in cols or c in start:
            end.remove(c)
    for c in start + end:
        cols.remove(c)
    cols = start + cols + end
    return df[cols]

与上面的答案类似,还有一种方法可以使用deque()及其rotate()方法。rotate方法获取列表中的最后一个元素并将其插入开头:

from collections import deque

columns = deque(df.columns.tolist())
columns.rotate()

df = df[columns]

我自己也遇到了一个类似的问题,只是想补充一下我已经解决的问题。我喜欢用于更改列顺序的reindex_axis()方法。这是有效的:

df = df.reindex_axis(['mean'] + list(df.columns[:-1]), axis=1)

另一种基于@Jorge评论的方法:

df = df.reindex(columns=['mean'] + list(df.columns[:-1]))

虽然reindex_axis在微基准测试中似乎比reindex稍快,但我认为我更喜欢后者,因为它的直接性。

这里有一个非常简单的答案(只有一行)。

在将“n”列添加到df中之后,可以执行以下操作。

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(1)
df
           0           1           2           3           4        mean
0   0.929616    0.316376    0.183919    0.204560    0.567725    0.440439
1   0.595545    0.964515    0.653177    0.748907    0.653570    0.723143
2   0.747715    0.961307    0.008388    0.106444    0.298704    0.424512
3   0.656411    0.809813    0.872176    0.964648    0.723685    0.805347
4   0.642475    0.717454    0.467599    0.325585    0.439645    0.518551
5   0.729689    0.994015    0.676874    0.790823    0.170914    0.672463
6   0.026849    0.800370    0.903723    0.024676    0.491747    0.449473
7   0.526255    0.596366    0.051958    0.895090    0.728266    0.559587
8   0.818350    0.500223    0.810189    0.095969    0.218950    0.488736
9   0.258719    0.468106    0.459373    0.709510    0.178053    0.414752


### here you can add below line and it should work 
# Don't forget the two (()) 'brackets' around columns names.Otherwise, it'll give you an error.

df = df[list(('mean',0, 1, 2,3,4))]
df

        mean           0           1           2           3           4
0   0.440439    0.929616    0.316376    0.183919    0.204560    0.567725
1   0.723143    0.595545    0.964515    0.653177    0.748907    0.653570
2   0.424512    0.747715    0.961307    0.008388    0.106444    0.298704
3   0.805347    0.656411    0.809813    0.872176    0.964648    0.723685
4   0.518551    0.642475    0.717454    0.467599    0.325585    0.439645
5   0.672463    0.729689    0.994015    0.676874    0.790823    0.170914
6   0.449473    0.026849    0.800370    0.903723    0.024676    0.491747
7   0.559587    0.526255    0.596366    0.051958    0.895090    0.728266
8   0.488736    0.818350    0.500223    0.810189    0.095969    0.218950
9   0.414752    0.258719    0.468106    0.459373    0.709510    0.178053