I want to go from

['$a', '$b', '$c', '$d', '$e']

to

['a', 'b', 'c', 'd', 'e']

Current answer

Here is a nifty little function I like to use to cut down on typing:

def rename(data, oldnames, newname):
    if isinstance(oldnames, str):  # Input can be a string or a list of strings
        oldnames = [oldnames]      # Wrap single names so the loop below works
        newname = [newname]        # When renaming several columns, pass the matching list of new names
    for i, name in enumerate(oldnames):
        # Substring match: the old name does not have to be an exact match
        oldvar = [c for c in data.columns if name in c]
        if len(oldvar) == 0:
            raise ValueError("Sorry, couldn't find that column in the dataset")
        if len(oldvar) > 1:
            # Several candidates: let the user pick one by index
            print("Found multiple columns that matched " + str(name) + ": ")
            for idx, c in enumerate(oldvar):
                print(str(idx) + ": " + str(c))
            ind = input('Please enter the index of the column you would like to rename: ')
            oldvar = oldvar[int(ind)]
        else:
            oldvar = oldvar[0]
        data = data.rename(columns={oldvar: newname[i]})
    return data

Here is an example of how it works:

In [2]: df = pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns = ['col1', 'col2', 'omg', 'idk'])
# First list = existing variables
# Second list = new names for those variables
In [3]: df = rename(df, ['col', 'omg'],['first', 'ohmy'])
Found multiple columns that matched col:
0: col1
1: col2

Please enter the index of the column you would like to rename: 0

In [4]: df.columns
Out[4]: Index(['first', 'col2', 'ohmy', 'idk'], dtype='object')

Other answers

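Another way to replace the original column labels is to strip the unwanted characters (here '$') from them.

This could be done by running a for loop over df.columns and appending the stripped names to a new list that is then assigned back to df.columns.

Instead, we can do this neatly in a single statement with a list comprehension: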

df.columns = [col.strip('$') for col in df.columns]

(The strip method in Python removes the given character from both the beginning and the end of the string.)
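One caveat worth spelling out: strip removes every leading and trailing occurrence of the character, not just a single prefix. The sketch below uses plain Python strings (the sample names are made up for illustration) to compare strip, lstrip, and an explicit prefix check.

```python
# Hypothetical column names chosen to show the edge cases.
names = ['$a', '$$b', 'c$', '$d$']

# strip('$') removes '$' from both ends, however many times it occurs.
stripped = [n.strip('$') for n in names]

# lstrip('$') only touches the left side, but still removes repeats.
left_only = [n.lstrip('$') for n in names]

# Dropping exactly one leading '$' needs an explicit check.
one_prefix = [n[1:] if n.startswith('$') else n for n in names]

print(stripped)    # ['a', 'b', 'c', 'd']
print(left_only)   # ['a', 'b', 'c$', 'd$']
print(one_prefix)  # ['a', '$b', 'c$', 'd$']
```

For the labels in the question all three give the same result; they only differ when '$' also appears at the end of a name or is repeated.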

Use:

old_names = ['$a', '$b', '$c', '$d', '$e'] 
new_names = ['a', 'b', 'c', 'd', 'e']
df.rename(columns=dict(zip(old_names, new_names)), inplace=True)

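This way you can manually edit new_names as you wish. It works well when you only need to rename a few columns to correct misspellings, fix accents, remove special characters, and so on.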
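As a minimal sketch of the "only a few columns" case, assuming a toy one-row frame with the labels from the question: the mapping can list just the columns to change, and with the default errors='ignore' a key that is not present (the made-up '$x' here) is skipped.

```python
import pandas as pd

# Toy frame with the column names from the question.
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=['$a', '$b', '$c', '$d', '$e'])

# Only two real columns are listed; '$x' is a deliberately missing name.
old_names = ['$a', '$c', '$x']
new_names = ['a', 'c', 'x']

# By default rename() ignores mapping keys that are not in the columns.
df = df.rename(columns=dict(zip(old_names, new_names)))

print(list(df.columns))  # ['a', '$b', 'c', '$d', '$e']
```

Passing errors='raise' to rename would turn the missing '$x' into a KeyError, which can help catch typos in old_names.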

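Suppose this is your DataFrame.

You can rename the columns using two methods.

Using dataframe.columns = [#list]:

df.columns = ['a', 'b', 'c', 'd', 'e']

The limitation of this method is that even if only one column has to be changed, the full list of column names must be passed. Also, this method does not apply to index labels. For example, if you pass the following:

df.columns = ['a', 'b', 'c', 'd']

this raises an error: Length mismatch: Expected axis has 5 elements, new values have 4 elements.

The other method is the pandas rename() method, which can be used to rename any index, column, or row label:

df = df.rename(columns={'$a': 'a'})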

Similarly, you can change any row or column labels.
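Both behaviours can be checked on a small frame. This is a sketch under the assumption of a one-row DataFrame with the labels from the question; the row label 'first_row' is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=['$a', '$b', '$c', '$d', '$e'])

# Assigning a list of the wrong length fails with a ValueError.
try:
    df.columns = ['a', 'b', 'c', 'd']
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)
print(mismatch_error)

# rename() changes only the labels named in the mapping,
# and accepts mappings for columns and for the index.
df = df.rename(columns={'$a': 'a'}, index={0: 'first_row'})
print(list(df.columns))  # ['a', '$b', '$c', '$d', '$e']
print(list(df.index))    # ['first_row']
```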

Assuming you can use a regular expression, this solution removes the need to write out the new names by hand:

import pandas as pd
import re

srch = re.compile(r"\w+")

data = pd.read_csv("CSV_FILE.csv")
cols = data.columns
# Keep the first run of word characters found in each column label
new_cols = [srch.search(col).group() for col in cols]
data.columns = new_cols
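To see what the pattern actually extracts without needing a CSV file, here is a small sketch on hypothetical header values. It also adds a fallback that is not part of the answer above: search() returns None for a label with no word characters, and calling .group() on None would raise an AttributeError.

```python
import re

srch = re.compile(r"\w+")

# Hypothetical header values standing in for data.columns.
cols = ['$a', ' $b ', '$total sales', '$$$']

new_cols = []
for col in cols:
    match = srch.search(col)
    # Fall back to the original label when nothing matches,
    # instead of calling .group() on None.
    new_cols.append(match.group() if match else col)

print(new_cols)  # ['a', 'b', 'total', '$$$']
```

Note that only the first run of word characters is kept, so a label such as '$total sales' is cut down to 'total'.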
