

        0          1     2
0   354.7      April   4.0
1    55.4     August   8.0
2   176.5   December  12.0
3    95.5   February   2.0
4    85.6    January   1.0
5     152       July   7.0
6   238.7       June   6.0
7   104.8      March   3.0
8   283.5        May   5.0
9   278.8   November  11.0
10  249.6    October  10.0
11  212.7  September   9.0







Performing the operation in-place, and keeping the same variable name. This requires one to pass inplace=True as follows:

df.sort_values(by=['2'], inplace=True)

# or

df.sort_values(by='2', inplace=True)

# or

df.sort_values('2', inplace=True)

If doing the operation in-place is not a requirement, one can assign the change (sort) to a variable:

With the same name as the original dataframe, df, as

df = df.sort_values(by=['2'])

With a different name, such as df_new, as

df_new = df.sort_values(by=['2'])


        0          1     2
4    85.6    January   1.0
3    95.5   February   2.0
7   104.8      March   3.0
0   354.7      April   4.0
8   283.5        May   5.0
6   238.7       June   6.0
5     152       July   7.0
1    55.4     August   8.0
11  212.7  September   9.0
10  249.6    October  10.0
9   278.8   November  11.0
2   176.5   December  12.0
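
To make the difference between the in-place and the assigned variants concrete, here is a minimal sketch. The frame is a hypothetical reconstruction of the first five rows of the example above, and the names unchanged_order, new_order, result and inplace_order are introduced only for illustration.

```python
import pandas as pd

# Hypothetical reconstruction of the first five rows of the example frame.
df = pd.DataFrame({
    '0': [354.7, 55.4, 176.5, 95.5, 85.6],
    '1': ['April', 'August', 'December', 'February', 'January'],
    '2': [4.0, 8.0, 12.0, 2.0, 1.0],
})

# Calling sort_values without assignment or inplace=True leaves df untouched.
df.sort_values(by='2')
unchanged_order = df['1'].tolist()

# Assigning to a different name keeps both the original and the sorted frame.
df_new = df.sort_values(by='2')
new_order = df_new['1'].tolist()

# With inplace=True the call mutates df and returns None.
result = df.sort_values(by='2', inplace=True)
inplace_order = df['1'].tolist()

print(unchanged_order)
print(new_order)
print(result)
```

Because the in-place call returns None, writing df = df.sort_values(..., inplace=True) would replace the dataframe with None, which is a common source of confusion.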


df.reset_index(drop = True, inplace = True)

# or

df = df.reset_index(drop = True)


        0          1     2
0    85.6    January   1.0
1    95.5   February   2.0
2   104.8      March   3.0
3   354.7      April   4.0
4   283.5        May   5.0
5   238.7       June   6.0
6     152       July   7.0
7    55.4     August   8.0
8   212.7  September   9.0
9   249.6    October  10.0
10  278.8   November  11.0
11  176.5   December  12.0


df = df.sort_values(by=['2']).reset_index(drop = True)


        0          1     2
0    85.6    January   1.0
1    95.5   February   2.0
2   104.8      March   3.0
3   354.7      April   4.0
4   283.5        May   5.0
5   238.7       June   6.0
6     152       July   7.0
7    55.4     August   8.0
8   212.7  September   9.0
9   249.6    October  10.0
10  278.8   November  11.0
11  176.5   December  12.0
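
As a possible alternative to chaining reset_index, sort_values accepts an ignore_index parameter. This sketch assumes pandas 1.0 or later, where that parameter is available; the small frame and the names chained and one_call are hypothetical.

```python
import pandas as pd

# Hypothetical subset of the example frame.
df = pd.DataFrame({
    '1': ['April', 'August', 'December', 'February', 'January'],
    '2': [4.0, 8.0, 12.0, 2.0, 1.0],
})

# The chained form shown above: sort, then discard the old index.
chained = df.sort_values(by=['2']).reset_index(drop=True)

# Assumes pandas >= 1.0: ignore_index=True relabels the result 0..n-1.
one_call = df.sort_values(by=['2'], ignore_index=True)

print(chained)
print(one_call)
```

Both forms should give the same rows with a fresh 0..n-1 index; the chained form also works on older pandas versions.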


If one is not doing the operation in-place, forgetting to assign the result as described above may leave one (as it did this user) without the expected result, because sort_values returns a new dataframe and leaves the original unchanged.

There are strong opinions on using inplace. For that, one might want to read this.

This assumes that column 2 is not a string. If it is, one will have to convert it first:

Using pandas.to_numeric

df['2'] = pd.to_numeric(df['2'])

Using pandas.Series.astype

df['2'] = df['2'].astype(float)

If one wants descending order, one needs to pass ascending=False as

df = df.sort_values(by=['2'], ascending=False)

# or

df.sort_values(by='2', ascending=False, inplace=True)

[Out]:

        0          1     2
2   176.5   December  12.0
9   278.8   November  11.0
10  249.6    October  10.0
11  212.7  September   9.0
1    55.4     August   8.0
5     152       July   7.0
6   238.7       June   6.0
8   283.5        May   5.0
0   354.7      April   4.0
7   104.8      March   3.0
3    95.5   February   2.0
4    85.6    January   1.0
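
The point about string columns is easy to miss, so here is a small sketch of what can go wrong. The three-row frame is hypothetical, and errors='coerce' is an optional choice that is not part of the answer above.

```python
import pandas as pd

# Hypothetical frame whose column '2' was read in as strings.
df = pd.DataFrame({
    '1': ['October', 'February', 'September'],
    '2': ['10.0', '2.0', '9.0'],
})

# String comparison is lexicographic, so '10.0' sorts before '2.0'.
as_text = df.sort_values(by='2')['1'].tolist()

# errors='coerce' turns unparseable entries into NaN instead of raising.
df['2'] = pd.to_numeric(df['2'], errors='coerce')
as_number = df.sort_values(by='2')['1'].tolist()

print(as_text)
print(as_number)
```

Checking df.dtypes before sorting is a quick way to see whether the conversion is needed.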







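If you want to sort a column dynamically rather than alphabetically, and do not want to use sort_values() on the column directly, you can try the solution below.

Problem: sort column "col1" in the sequence ['A', 'C', 'D', 'B']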

import pandas as pd
import numpy as np

## Sample DataFrame ##
df = pd.DataFrame({'col1': ['A', 'B', 'D', 'C', 'A']})

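>>> df
  col1
0    A
1    B
2    D
3    C
4    A

## Solution ##

# Build one boolean mask per label and pair it with that label's position
# in the desired order, so that A -> 0, C -> 1, D -> 2, B -> 3.
conditions = []
values = []

for i, j in enumerate(['A', 'C', 'D', 'B']):
    conditions.append(df['col1'] == j)
    values.append(i)

df['col1_Num'] = np.select(conditions, values)

# Sort by the numeric helper column instead of the labels themselves.
df.sort_values(by='col1_Num', inplace=True)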

>>> df

    col1  col1_Num
0    A         0
4    A         0
3    C         1
2    D         2
1    B         3
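
A different technique that may achieve the same custom ordering without a helper column is an ordered categorical. This is a swapped-in approach, not the method from the answer above; the names custom_order and df_sorted are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['A', 'B', 'D', 'C', 'A']})

# The desired order, taken from the problem statement above.
custom_order = ['A', 'C', 'D', 'B']

# An ordered categorical sorts by category position, not alphabetically.
df['col1'] = pd.Categorical(df['col1'], categories=custom_order, ordered=True)

# kind='mergesort' is a stable sort, so tied rows keep their original order.
df_sorted = df.sort_values(by='col1', kind='mergesort')

print(df_sorted)
```

One thing to keep in mind with this sketch: values that are not listed in custom_order become NaN when the categorical is built, so the list should cover every label in the column.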

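I tried the solutions above but did not get the desired result, so I found a different solution that works for me. ascending=False sorts the dataframe in descending order; by default it is True. I am using Python 3.6.6 and pandas 0.23.4.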

final_df = df.sort_values(by=['2'], ascending=False)



In [18]:

        0          1     2
4    85.6    January   1.0
3    95.5   February   2.0
7   104.8      March   3.0
0   354.7      April   4.0
8   283.5        May   5.0
6   238.7       June   6.0
5   152.0       July   7.0
1    55.4     August   8.0
11  212.7  September   9.0
10  249.6    October  10.0
9   278.8   November  11.0
2   176.5   December  12.0

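If you want to sort by two columns, pass a list of column labels to sort_values, with the labels ordered according to sort priority. If you use df.sort_values(['2', '0']), the result will be sorted by column 2 and then by column 0. Granted, this does not make sense for this example, because each value in df['2'] is unique.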
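
Since the example frame has no ties in column 2, a sketch with repeated values may show the effect of the second sort key more clearly. The data and the name by_both are hypothetical; passing a list to ascending sets the direction per column.

```python
import pandas as pd

# Hypothetical data with repeated values in the first sort key.
df = pd.DataFrame({
    'label': ['test', 'experiment', 'test', 'experiment'],
    'cost': [523213, 250, 216590, 900],
})

# Sort by label ascending first, then by cost descending within each label.
by_both = df.sort_values(['label', 'cost'], ascending=[True, False])

print(by_both)
```

The second column only matters for rows that tie on the first one, which is why it has no visible effect when the first key is unique.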


ID         cost      tax    label
1       216590      1600    test      
2       523213      1800    test 
3          250      1500    experiment

(df['label'].value_counts().to_frame().reset_index()).sort_values('label', ascending=False)


    index   label
0   test        2
1   experiment  1
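
The column names produced by value_counts().to_frame().reset_index() depend on the pandas version (from pandas 2.0 the count column is named 'count' rather than after the original column), so the one-liner above may sort by a different column than intended. A minimal sketch that names the columns explicitly, with counts as a hypothetical variable name:

```python
import pandas as pd

# The sample table shown above.
df = pd.DataFrame({
    'ID': [1, 2, 3],
    'cost': [216590, 523213, 250],
    'tax': [1600, 1800, 1500],
    'label': ['test', 'test', 'experiment'],
})

counts = (
    df['label']
    .value_counts()
    .rename_axis('label')        # name the index holding the distinct labels
    .reset_index(name='count')   # name the column holding the frequencies
    .sort_values('count', ascending=False)
)

print(counts)
```

Naming both columns explicitly keeps the sort key unambiguous regardless of which names the installed version would pick by default.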