This article is reproduced from serverfault.com.

Selecting columns based on row value (Python, pandas)

Posted on 2020-11-27 23:37:34

I need to clean a DataFrame and would like to select only the columns that have a specific value in one of the rows. For instance, extracting only those columns where the value in row number 3 is NaN.

Questioner
Glassmanet
ChillerObscuro 2020-11-28 08:14:35

Joe's answer shows how to get rows based on column values, but it seems like you want to get columns based on row values. Here's a simple way to achieve this using a list comprehension.

In [44]: import pandas as pd
In [45]: df = pd.DataFrame({'one': [2, 3, 4], 'two': [5, 6, 7], 'three': [8, 6, 1]})
In [46]: df
Out[46]:
   one  two  three
0    2    5      8
1    3    6      6
2    4    7      1

Now we'll assign variables for the row we're looking at and the value that row must hold for a column to be kept. Then we build the list of matching column names with a list comprehension and give the filtered df a new name.

In [50]: row = 1
In [51]: value = 6
In [53]: list_comp = [c for c in df.columns if df.loc[row, c] == value]
In [54]: filtered_df = df[list_comp]
In [55]: filtered_df
Out[55]:
   two  three
0    5      8
1    6      6
2    7      1
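
The same selection can probably also be written without the loop, by building a boolean mask from the row and passing it to df.loc. The sketch below is only one way to do it. It assumes row is a position (hence iloc) rather than an index label. The df_nan frame and its column names 'a', 'b', 'c' are made up for illustration and are not from the question. They cover the NaN case the question asks about, where == cannot be used because NaN never compares equal to anything, including itself.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'one': [2, 3, 4], 'two': [5, 6, 7], 'three': [8, 6, 1]})

row = 1
value = 6

# df.iloc[row] is the row as a Series indexed by column name,
# so comparing it to value gives one boolean per column.
mask = df.iloc[row] == value
filtered_df = df.loc[:, mask]  # keeps columns 'two' and 'three'

# Hypothetical frame for the NaN case: use isna() instead of ==.
df_nan = pd.DataFrame({'a': [1.0, np.nan], 'b': [2.0, 3.0], 'c': [np.nan, np.nan]})
nan_cols = df_nan.loc[:, df_nan.iloc[1].isna()]  # keeps columns 'a' and 'c'
```

If the row should be looked up by index label instead of position, df.loc[label] can take the place of df.iloc[row] in the mask.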