我有以下DataFrame(df):

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))

我通过分配添加更多列:

df['mean'] = df.mean(1)

如何将列的意思移到前面,即将其设置为第一列,而其他列的顺序保持不变?


当前回答

我想到了和Dmitriy Work一样的答案,显然是最简单的:

df["mean"] = df.mean(1)
l =  list(np.arange(0,len(df.columns) -1 ))
l.insert(0,-1)
df.iloc[:,l]

其他回答

只需按所需顺序分配列名:

In [39]: df
Out[39]: 
          0         1         2         3         4  mean
0  0.172742  0.915661  0.043387  0.712833  0.190717     1
1  0.128186  0.424771  0.590779  0.771080  0.617472     1
2  0.125709  0.085894  0.989798  0.829491  0.155563     1
3  0.742578  0.104061  0.299708  0.616751  0.951802     1
4  0.721118  0.528156  0.421360  0.105886  0.322311     1
5  0.900878  0.082047  0.224656  0.195162  0.736652     1
6  0.897832  0.558108  0.318016  0.586563  0.507564     1
7  0.027178  0.375183  0.930248  0.921786  0.337060     1
8  0.763028  0.182905  0.931756  0.110675  0.423398     1
9  0.848996  0.310562  0.140873  0.304561  0.417808     1

In [40]: df = df[['mean', 4,3,2,1]]

现在,“mean”列出现在前面:

In [41]: df
Out[41]: 
   mean         4         3         2         1
0     1  0.190717  0.712833  0.043387  0.915661
1     1  0.617472  0.771080  0.590779  0.424771
2     1  0.155563  0.829491  0.989798  0.085894
3     1  0.951802  0.616751  0.299708  0.104061
4     1  0.322311  0.105886  0.421360  0.528156
5     1  0.736652  0.195162  0.224656  0.082047
6     1  0.507564  0.586563  0.318016  0.558108
7     1  0.337060  0.921786  0.930248  0.375183
8     1  0.423398  0.110675  0.931756  0.182905
9     1  0.417808  0.304561  0.140873  0.310562
import numpy as np
import pandas as pd
df = pd.DataFrame()
column_names = ['x','y','z','mean']
for col in column_names: 
    df[col] = np.random.randint(0,100, size=10000)

您可以尝试以下解决方案:

解决方案1:

df = df[ ['mean'] + [ col for col in df.columns if col != 'mean' ] ]

解决方案2:


df = df[['mean', 'x', 'y', 'z']]

解决方案3:

col = df.pop("mean")
df = df.insert(0, col.name, col)

解决方案4:

df.set_index(df.columns[-1], inplace=True)
df.reset_index(inplace=True)

解决方案5:

cols = list(df)
cols = [cols[-1]] + cols[:-1]
df = df[cols]

解决方案6:

order = [1,2,3,0] # setting column's order
df = df[[df.columns[i] for i in order]]

时间比较:

解决方案1:

CPU时间:用户1.05 ms,sys:35µs,总计:1.08 ms壁时间:995µs

解决方案2:

CPU时间:用户933µs,系统:0 ns,总计:933µ壁时间:800µs

解决方案3:

CPU时间:用户0 ns,sys:1.35 ms,总计:1.35 ms壁时间:1.08 ms

解决方案4:

CPU时间:用户1.23毫秒,系统:45µs,总计:1.27毫秒壁时间:986µs

解决方案5:

CPU时间:用户1.09 ms,系统:19µs,总计:1.11 ms壁时间:949µs

解决方案6:

CPU时间:用户955µs,系统:34µs,总计:989µs壁时间:859µs

这里有一种移动一个现有列的方法,它将修改现有的数据帧。

my_column = df.pop('column name')
df.insert(3, my_column.name, my_column)  # Is in-place

要根据其他列的名称将现有列设置为右侧/左侧,请执行以下操作:

def df_move_column(df, col_to_move, col_left_of_destiny="", right_of_col_bool=True):
    cols = list(df.columns.values)
    index_max = len(cols) - 1

    if not right_of_col_bool:
        # set left of a column "c", is like putting right of column previous to "c"
        # ... except if left of 1st column, then recursive call to set rest right to it
        aux = cols.index(col_left_of_destiny)
        if not aux:
            for g in [x for x in cols[::-1] if x != col_to_move]:
                df = df_move_column(
                        df, 
                        col_to_move=g, 
                        col_left_of_destiny=col_to_move
                        )
            return df
        col_left_of_destiny = cols[aux - 1]

    index_old = cols.index(col_to_move)
    index_new = 0
    if len(col_left_of_destiny):
        index_new = cols.index(col_left_of_destiny) + 1

    if index_old == index_new:
        return df

    if index_new < index_old:
        index_new = np.min([index_new, index_max])
        cols = (
            cols[:index_new]
            + [cols[index_old]]
            + cols[index_new:index_old]
            + cols[index_old + 1 :]
        )
    else:
        cols = (
            cols[:index_old]
            + cols[index_old + 1 : index_new]
            + [cols[index_old]]
            + cols[index_new:]
        )

    df = df[cols]
    return df

E.g.

cols = list("ABCD")
df2 = pd.DataFrame(np.arange(4)[np.newaxis, :], columns=cols)
for k in cols:
    print(30 * "-")
    for g in [x for x in cols if x != k]:
        df_new = df_move_column(df2, k, g)
        print(f"{k} after {g}:  {df_new.columns.values}")
for k in cols:
    print(30 * "-")
    for g in [x for x in cols if x != k]:
        df_new = df_move_column(df2, k, g, right_of_col_bool=False)
        print(f"{k} before {g}:  {df_new.columns.values}")

输出:

书中最黑客的方法

df.insert(0, "test", df["mean"])
df = df.drop(columns=["mean"]).rename(columns={"test": "mean"})