Move a column by name to the front of the table in pandas

Question 1

Here is my df:

                             Net   Upper   Lower  Mid  Zsore
Answer option                                                
More than once a day          0%   0.22%  -0.12%   2    65 
Once a day                    0%   0.32%  -0.19%   3    45
Several times a week          2%   2.45%   1.10%   4    78
Once a week                   1%   1.63%  -0.40%   6    65

How can I move a column by name ("Mid") to the front of the table, at index 0? This is what the result should look like:

                             Mid   Upper   Lower  Net  Zsore
Answer option                                                
More than once a day          2   0.22%  -0.12%   0%    65 
Once a day                    3   0.32%  -0.19%   0%    45
Several times a week          4   2.45%   1.10%   2%    78
Once a week                   6   1.63%  -0.40%   1%    65

My current code moves the column by index using df.columns.tolist(), but I'd like to move it by name.

Question 2

We can use ix to reorder by passing a list:

In [27]:
# get a list of columns
cols = list(df)
# move the column to head of list using index, pop and insert
cols.insert(0, cols.pop(cols.index('Mid')))
cols
Out[27]:
['Mid', 'Net', 'Upper', 'Lower', 'Zsore']
In [28]:
# use ix to reorder
df = df.ix[:, cols]
df
Out[28]:
                      Mid Net  Upper   Lower  Zsore
Answer_option                                      
More_than_once_a_day    2  0%  0.22%  -0.12%     65
Once_a_day              3  0%  0.32%  -0.19%     45
Several_times_a_week    4  2%  2.45%   1.10%     78
Once_a_week             6  1%  1.63%  -0.40%     65

Another method is to take a reference to the column and reinsert it at the front:

In [39]:
mid = df['Mid']
df.drop(labels=['Mid'], axis=1, inplace=True)
df.insert(0, 'Mid', mid)
df
Out[39]:
                      Mid Net  Upper   Lower  Zsore
Answer_option                                      
More_than_once_a_day    2  0%  0.22%  -0.12%     65
Once_a_day              3  0%  0.32%  -0.19%     45
Several_times_a_week    4  2%  2.45%   1.10%     78
Once_a_week             6  1%  1.63%  -0.40%     65

You can also use loc to achieve the same result. Prefer it on any recent pandas, since ix was deprecated in 0.20.0 and removed in 1.0:

df = df.loc[:, cols]
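
For anyone who wants to run the loc variant end to end, here is a minimal sketch. The way df is constructed is my assumption, rebuilt from the table in the question; only the last three statements are the technique itself.

```python
import pandas as pd

# Sample frame rebuilt from the question's table (construction is an assumption).
df = pd.DataFrame(
    {
        "Net": ["0%", "0%", "2%", "1%"],
        "Upper": ["0.22%", "0.32%", "2.45%", "1.63%"],
        "Lower": ["-0.12%", "-0.19%", "1.10%", "-0.40%"],
        "Mid": [2, 3, 4, 6],
        "Zsore": [65, 45, 78, 65],
    },
    index=pd.Index(
        ["More than once a day", "Once a day", "Several times a week", "Once a week"],
        name="Answer option",
    ),
)

cols = list(df)                              # column labels as a plain list
cols.insert(0, cols.pop(cols.index("Mid")))  # move 'Mid' to position 0 by name
reordered = df.loc[:, cols]                  # all rows, columns in the new order
print(list(reordered.columns))               # ['Mid', 'Net', 'Upper', 'Lower', 'Zsore']
```

Note that this returns a new frame; the original df keeps its column order unless you assign the result back.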

Question 3

Maybe I'm missing something, but a lot of these answers seem overly complicated. You should be able to just set the columns within a single list:

Column to the front:

df = df[ ['Mid'] + [ col for col in df.columns if col != 'Mid' ] ]

Or, if you want to move it to the back instead:

df = df[ [ col for col in df.columns if col != 'Mid' ] + ['Mid'] ]

Or, if you want to move more than one column:

cols_to_move = ['Mid', 'Zsore']
df           = df[ cols_to_move + [ col for col in df.columns if col not in cols_to_move ] ]
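
If you do this often, the one-liners above can be wrapped in a small helper. This is only a sketch: the name move_columns_to_front and the explicit missing-column check are my own additions, not part of pandas.

```python
import pandas as pd

def move_columns_to_front(df, names):
    """Return a new frame with `names` first and the rest in original order.

    Hypothetical helper; accepts a single label (str) or a list of labels.
    """
    names = [names] if isinstance(names, str) else list(names)
    missing = [n for n in names if n not in df.columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    rest = [c for c in df.columns if c not in names]
    return df[names + rest]

df = pd.DataFrame([[1, 2, 3, 4, 5]],
                  columns=["Net", "Upper", "Lower", "Mid", "Zsore"])
front = move_columns_to_front(df, ["Mid", "Zsore"])
print(list(front.columns))  # ['Mid', 'Zsore', 'Net', 'Upper', 'Lower']
```

The moved columns appear in the order you list them, so ['Zsore', 'Mid'] would put Zsore first.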

Question 4

You can use the df.reindex() function in pandas. df is

                      Net  Upper   Lower  Mid  Zsore
Answer option                                      
More than once a day  0%  0.22%  -0.12%    2     65
Once a day            0%  0.32%  -0.19%    3     45
Several times a week  2%  2.45%   1.10%    4     78
Once a week           1%  1.63%  -0.40%    6     65

define a list of column names

cols = df.columns.tolist()
cols
Out[13]: ['Net', 'Upper', 'Lower', 'Mid', 'Zsore']

move the column name to wherever you want

cols.insert(0, cols.pop(cols.index('Mid')))
cols
Out[16]: ['Mid', 'Net', 'Upper', 'Lower', 'Zsore']

then use the df.reindex() function to reorder

df = df.reindex(columns= cols)

The output is: df

                      Mid Net  Upper   Lower  Zsore
Answer option                                      
More than once a day    2  0%  0.22%  -0.12%     65
Once a day              3  0%  0.32%  -0.19%     45
Several times a week    4  2%  2.45%   1.10%     78
Once a week             6  1%  1.63%  -0.40%     65
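
One thing to watch with reindex: a label that does not exist in the frame does not raise. It comes back as a new column filled with NaN, whereas plain df[cols] selection raises a KeyError. A small sketch, using a deliberately misspelled label of my own ('Zscore' instead of 'Zsore'):

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3]], columns=["Net", "Mid", "Zsore"])

# Deliberate typo for illustration: 'Zscore' is not a column of df.
cols = ["Mid", "Net", "Zscore"]

reindexed = df.reindex(columns=cols)  # no error; 'Zscore' is created and filled with NaN
print(reindexed.isna().any().to_dict())

try:
    df[cols]                          # plain selection rejects the unknown label
except KeyError as exc:
    print("KeyError:", exc)
```

So if the column list is typed by hand, df[cols] or df.loc[:, cols] fails loudly on a typo, while reindex silently drops the real column and adds an empty one.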

Question 5

I prefer this solution:

col = df.pop("Mid")
df.insert(0, col.name, col)

It is simpler to read and faster than the other suggested answers.

def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)
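
A quick usage sketch of the helper, on a one-row frame I made up for illustration. Since pop removes the column before insert runs, the largest valid pos is the number of columns minus one.

```python
import pandas as pd

def move_column_inplace(df, col, pos):
    col = df.pop(col)              # removes the column and returns it as a Series
    df.insert(pos, col.name, col)  # re-inserts it at integer position `pos`

df = pd.DataFrame([[1, 2, 3, 4, 5]],
                  columns=["Net", "Upper", "Lower", "Mid", "Zsore"])

move_column_inplace(df, "Mid", 0)  # move to the front
move_column_inplace(df, "Net", 4)  # move to the back (last valid position)
print(list(df.columns))            # ['Mid', 'Upper', 'Lower', 'Zsore', 'Net']
```

Both calls modify df itself and return None, so there is nothing to assign back.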

Performance assessment:

For this test, the currently last column is moved to the front in each repetition. In-place methods generally perform better. While citynorman's solution can be made in-place, Ed Chum's method based on .loc and sachinnm's method based on reindex cannot.

While the other methods are generic, citynorman's solution is limited to pos=0. I didn't observe any performance difference between df.loc[:, cols] and df[cols], which is why I didn't include some of the other suggestions.

I tested with python 3.6.8 and pandas 0.24.2 on a MacBook Pro (Mid 2015).

import numpy as np
import pandas as pd

n_cols = 11
df = pd.DataFrame(np.random.randn(200000, n_cols),
                  columns=range(n_cols))

def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)

def move_to_front_normanius_inplace(df, col):
    move_column_inplace(df, col, 0)
    return df

def move_to_front_chum(df, col):
    cols = list(df)
    cols.insert(0, cols.pop(cols.index(col)))
    return df.loc[:, cols]

def move_to_front_chum_inplace(df, col):
    col = df[col]
    df.drop(col.name, axis=1, inplace=True)
    df.insert(0, col.name, col)
    return df

def move_to_front_elpastor(df, col):
    cols = [col] + [ c for c in df.columns if c!=col ]
    return df[cols]  # or df.loc[:, cols]

def move_to_front_sachinmm(df, col):
    cols = df.columns.tolist()
    cols.insert(0, cols.pop(cols.index(col)))
    df = df.reindex(columns=cols, copy=False)
    return df

def move_to_front_citynorman_inplace(df, col):
    # This approach exploits that reset_index() moves the index
    # at the first position of the data frame.
    df.set_index(col, inplace=True)
    df.reset_index(inplace=True)
    return df

def test(method, df):
    col = np.random.randint(0, n_cols)
    method(df, col)

col = np.random.randint(0, n_cols)
ret_mine = move_to_front_normanius_inplace(df.copy(), col)
ret_chum1 = move_to_front_chum(df.copy(), col)
ret_chum2 = move_to_front_chum_inplace(df.copy(), col)
ret_elpas = move_to_front_elpastor(df.copy(), col)
ret_sach = move_to_front_sachinmm(df.copy(), col)
ret_city = move_to_front_citynorman_inplace(df.copy(), col)

# Assert equivalence of solutions.
assert(ret_mine.equals(ret_chum1))
assert(ret_mine.equals(ret_chum2))
assert(ret_mine.equals(ret_elpas))
assert(ret_mine.equals(ret_sach))
assert(ret_mine.equals(ret_city))

Results:

# For n_cols = 11:
%timeit test(move_to_front_normanius_inplace, df)
# 1.05 ms ± 42.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 1.68 ms ± 46.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit test(move_to_front_sachinmm, df)
# 3.24 ms ± 96.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
# 3.84 ms ± 114 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_elpastor, df)
# 3.85 ms ± 58.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 9.67 ms ± 101 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


# For n_cols = 31:
%timeit test(move_to_front_normanius_inplace, df)
# 1.26 ms ± 31.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 1.95 ms ± 260 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_sachinmm, df)
# 10.7 ms ± 348 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
# 11.5 ms ± 869 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_elpastor, df)
# 11.4 ms ± 598 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 31.4 ms ± 1.89 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Question 6

I don't like how the other solutions make me explicitly specify all the other columns, so this worked best for me. Though it might be slow for large dataframes...?

df = df.set_index('Mid').reset_index()
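
One caveat, shown as a small sketch on a frame I made up to resemble the question's: set_index replaces the existing index, so a meaningful index like "Answer option" is lost after the round trip. The workaround at the end is just one possible way around it, not something from the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {"Net": ["0%", "2%"], "Mid": [2, 4]},
    index=pd.Index(["Once a day", "Several times a week"], name="Answer option"),
)

moved = df.set_index("Mid").reset_index()
print(list(moved.columns))  # ['Mid', 'Net']
print(moved.index)          # a fresh RangeIndex; the 'Answer option' labels are gone

# Possible workaround (assumption): park the old index in a column, then restore it.
kept = (
    df.reset_index()
      .set_index("Mid")
      .reset_index()
      .set_index("Answer option")
)
print(list(kept.columns))   # ['Mid', 'Net']
```

If the frame only has a default integer index, the plain one-liner is fine as it is.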

Question 7

Here is a generic piece of code that I frequently use to rearrange the position of columns. You may find it useful.

cols = df.columns.tolist()
n = int(cols.index('Mid'))
cols = [cols[n]] + cols[:n] + cols[n+1:]
df = df[cols]

Question 8

To reorder the columns of a DataFrame, just use a list as follows.

df = df[['Mid', 'Net', 'Upper', 'Lower', 'Zsore']]

This makes it very obvious what was done when reading the code later. Also use:

df.columns
Out[1]: Index(['Net', 'Upper', 'Lower', 'Mid', 'Zsore'], dtype='object')

Then cut and paste to reorder.

For a DataFrame with many columns, store the list of columns in a variable and pop the desired column to the front of the list. Here is an example:

import numpy as np
import pandas as pd

cols = [str(col_name) for col_name in range(1001)]
data = np.random.rand(10,1001)
df = pd.DataFrame(data=data, columns=cols)

mv_col = cols.pop(cols.index('77'))
df = df[[mv_col] + cols]

Now df.columns is:

Index(['77', '0', '1', '2', '3', '4', '5', '6', '7', '8',
       ...
       '991', '992', '993', '994', '995', '996', '997', '998', '999', '1000'],
      dtype='object', length=1001)

Question 9

Here is a very simple answer to this.

Don't forget the two (()) "brackets" around the column names, otherwise it will give you an error.


# here you can add below line and it should work 
df = df[list(('Mid','Upper', 'Lower', 'Net','Zsore'))]
df

                             Mid   Upper   Lower  Net  Zsore
Answer option                                                
More than once a day          2   0.22%  -0.12%   0%    65 
Once a day                    3   0.32%  -0.19%   0%    45
Several times a week          4   2.45%   1.10%   2%    78
Once a week                   6   1.63%  -0.40%   1%    65

Question 10

The simplest thing you can try is:

df = df[['Mid', 'Upper', 'Lower', 'Net', 'Zsore']]