我想从

['$a', '$b', '$c', '$d', '$e']

to

['a', 'b', 'c', 'd', 'e']

当前回答

这里有一个我喜欢用来减少打字的漂亮小函数:

def rename(data, oldnames, newname):
    if type(oldnames) == str: # Input can be a string or list of strings
        oldnames = [oldnames] # When renaming multiple columns
        newname = [newname] # Make sure you pass the corresponding list of new names
    i = 0
    for name in oldnames:
        oldvar = [c for c in data.columns if name in c]
        if len(oldvar) == 0:
            raise ValueError("Sorry, couldn't find that column in the dataset")
        if len(oldvar) > 1: # Doesn't have to be an exact match
            print("Found multiple columns that matched " + str(name) + ": ")
            for c in oldvar:
                print(str(oldvar.index(c)) + ": " + str(c))
            ind = input('Please enter the index of the column you would like to rename: ')
            oldvar = oldvar[int(ind)]
        if len(oldvar) == 1:
            oldvar = oldvar[0]
        data = data.rename(columns = {oldvar : newname[i]})
        i += 1
    return data

下面是一个如何工作的示例:

In [2]: df = pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns = ['col1', 'col2', 'omg', 'idk'])
# First list = existing variables
# Second list = new names for those variables
In [3]: df = rename(df, ['col', 'omg'],['first', 'ohmy'])
Found multiple columns that matched col:
0: col1
1: col2

Please enter the index of the column you would like to rename: 0

In [4]: df.columns
Out[5]: Index(['first', 'col2', 'ohmy', 'idk'], dtype='object')

其他回答

如果您已经获得了数据帧,df.columns将所有内容转储到您可以操作的列表中,然后作为列的名称重新分配到数据帧中。。。

columns = df.columns
columns = [row.replace("$", "") for row in columns]
df.rename(columns=dict(zip(columns, things)), inplace=True)
df.head() # To validate the output

最佳方式?我不知道。一种方式——是的。

评估问题答案中提出的所有主要技术的更好方法如下:使用cProfile测量内存和执行时间@kadee、@kaitlyn和@eumiro拥有执行时间最快的函数-尽管这些函数非常快,但我们比较了所有答案的0.000和0.001秒舍入。寓意:我上面的答案可能不是“最好”的方式。

import pandas as pd
import cProfile, pstats, re

old_names = ['$a', '$b', '$c', '$d', '$e']
new_names = ['a', 'b', 'c', 'd', 'e']
col_dict = {'$a': 'a', '$b': 'b', '$c': 'c', '$d': 'd', '$e': 'e'}

df = pd.DataFrame({'$a':[1, 2], '$b': [10, 20], '$c': ['bleep', 'blorp'], '$d': [1, 2], '$e': ['texa$', '']})

df.head()

def eumiro(df, nn):
    df.columns = nn
    # This direct renaming approach is duplicated in methodology in several other answers:
    return df

def lexual1(df):
    return df.rename(columns=col_dict)

def lexual2(df, col_dict):
    return df.rename(columns=col_dict, inplace=True)

def Panda_Master_Hayden(df):
    return df.rename(columns=lambda x: x[1:], inplace=True)

def paulo1(df):
    return df.rename(columns=lambda x: x.replace('$', ''))

def paulo2(df):
    return df.rename(columns=lambda x: x.replace('$', ''), inplace=True)

def migloo(df, on, nn):
    return df.rename(columns=dict(zip(on, nn)), inplace=True)

def kadee(df):
    return df.columns.str.replace('$', '')

def awo(df):
    columns = df.columns
    columns = [row.replace("$", "") for row in columns]
    return df.rename(columns=dict(zip(columns, '')), inplace=True)

def kaitlyn(df):
    df.columns = [col.strip('$') for col in df.columns]
    return df

print 'eumiro'
cProfile.run('eumiro(df, new_names)')
print 'lexual1'
cProfile.run('lexual1(df)')
print 'lexual2'
cProfile.run('lexual2(df, col_dict)')
print 'andy hayden'
cProfile.run('Panda_Master_Hayden(df)')
print 'paulo1'
cProfile.run('paulo1(df)')
print 'paulo2'
cProfile.run('paulo2(df)')
print 'migloo'
cProfile.run('migloo(df, old_names, new_names)')
print 'kadee'
cProfile.run('kadee(df)')
print 'awo'
cProfile.run('awo(df)')
print 'kaitlyn'
cProfile.run('kaitlyn(df)')

另一种替换原始列标签的方法是从原始列标签中删除不需要的字符(此处为“$”)。

这可以通过在df.columns上运行for循环并将剥离的列附加到df.column来完成。

相反,我们可以通过使用下面的列表理解在一个语句中巧妙地做到这一点:

df.columns = [col.strip('$') for col in df.columns]

(Python中的strip方法会从字符串的开头和结尾剥离给定的字符。)

让我们通过一个小例子来理解重命名。。。

使用映射重命名列:df=pd.DataFrame({“A”:[1,2,3],“B”:[4,5,6]})#创建列名为A和B的dfdf.reame({“A”:“new_A”,“B”:“new_B”},axis='columns',inplace=True)#用'new_A'重命名列A,用'new_B'重命名列B输出:新a新b0 1 41 2 52 3 6使用映射重命名索引/Row_Name:df.reame({0:“x”,1:“y”,2:“z”},axis='index',inplace=True)#行名称被'x'、'y'和'z'替换。输出:新a新bx 142015年z 3 6

如果您只想删除“$”符号,请使用以下代码

df.columns = pd.Series(df.columns.str.replace("$", ""))

这里有一个我喜欢用来减少打字的漂亮小函数:

def rename(data, oldnames, newname):
    if type(oldnames) == str: # Input can be a string or list of strings
        oldnames = [oldnames] # When renaming multiple columns
        newname = [newname] # Make sure you pass the corresponding list of new names
    i = 0
    for name in oldnames:
        oldvar = [c for c in data.columns if name in c]
        if len(oldvar) == 0:
            raise ValueError("Sorry, couldn't find that column in the dataset")
        if len(oldvar) > 1: # Doesn't have to be an exact match
            print("Found multiple columns that matched " + str(name) + ": ")
            for c in oldvar:
                print(str(oldvar.index(c)) + ": " + str(c))
            ind = input('Please enter the index of the column you would like to rename: ')
            oldvar = oldvar[int(ind)]
        if len(oldvar) == 1:
            oldvar = oldvar[0]
        data = data.rename(columns = {oldvar : newname[i]})
        i += 1
    return data

下面是一个如何工作的示例:

In [2]: df = pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns = ['col1', 'col2', 'omg', 'idk'])
# First list = existing variables
# Second list = new names for those variables
In [3]: df = rename(df, ['col', 'omg'],['first', 'ohmy'])
Found multiple columns that matched col:
0: col1
1: col2

Please enter the index of the column you would like to rename: 0

In [4]: df.columns
Out[5]: Index(['first', 'col2', 'ohmy', 'idk'], dtype='object')