plotting count of zeros and ones in a dataframe

Posted on 2020-11-27 23:16:25

I have a dataframe like this:

    Name        Detection
0   name1           0
1   name2           0
2   name3           0
3   name1           1
4   name3           1
5   name2           1
6   name1           1
7   name1           0

I want a bar plot with the names on the x-axis and, for each name, separate bars on the y-axis showing the count of 0s and the count of 1s, something like this:

[Image: example of the desired bar plot]

How can I do that?

Questioner
meee
swatchai 2020-12-01 09:19:36

Try this runnable code:

import pandas as pd
from io import StringIO
from matplotlib import pyplot

data2 ="""Index     Name        Detection
0   name1           0
1   name2           0
2   name3           0
3   name1           1
4   name3           1
5   name2           1
6   name1           1
7   name1           0"""

df2 = pd.read_csv(StringIO(data2), sep=r'\s+', index_col='Index', engine='python')
print(df2)
result = df2.groupby(['Name']).count()
print()
print(result)

result.plot(kind='bar')
pyplot.show()

Output text:

        Name  Detection
Index                  
0      name1          0
1      name2          0
2      name3          0
3      name1          1
4      name3          1
5      name2          1
6      name1          1
7      name1          0

       Detection
Name            
name1          4
name2          2
name3          2

Output plot:

[Image: bar plot of the row count per name]

Edit

Updated code:

import pandas as pd
from io import StringIO
from matplotlib import pyplot

data2 ="""Index     Name        Detection
0   name1           0
1   name2           0
2   name3           0
3   name1           1
4   name3           1
5   name2           1
6   name1           1
7   name1           0"""

df2 = pd.read_csv(StringIO(data2), sep=r'\s+', index_col='Index', engine='python')
# count the rows for every (Name, Detection) pair
result = df2.groupby(by=['Name','Detection']).size().reset_index()
result.rename(columns={0:"count"}, inplace=True)
# plot bar graph: one group of bars per name, one bar per detection value
result.set_index(["Name","Detection"])['count'].unstack().plot.bar()
pyplot.show()

Output plot:

[Image: grouped bar plot of the 0 and 1 counts per name]

Note that

result.set_index(["Name","Detection"])

is the dataframe in this form:

                 count
Name  Detection       
name1 0              2
      1              2
name2 0              1
      1              1
name3 0              1
      1              1

and this table, rather than the bar plot itself, is the essence of your question. The dataframe now has a two-level index, so the count for each (name, detection) pair can be read off directly, which is more intuitive than the original layout. The required bar plot is then easy to create from this two-level dataframe.
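As a rough sketch of the reshaping described above, the snippet below builds the same counts without any plotting, so each intermediate table can be inspected. It is not part of the original answer: the names `df`, `counts`, `wide` and `alt` are illustrative, and `pd.crosstab` is shown only as one alternative way to get the same table.

```python
import pandas as pd

# Same rows as in the question.
df = pd.DataFrame({
    "Name": ["name1", "name2", "name3", "name1", "name3", "name2", "name1", "name1"],
    "Detection": [0, 0, 0, 1, 1, 1, 1, 0],
})

# Series indexed by (Name, Detection); each value is the number of matching rows.
counts = df.groupby(["Name", "Detection"]).size()

# unstack() moves the inner index level (Detection) into columns,
# giving one row per name and one column per detection value.
wide = counts.unstack(fill_value=0)

# pd.crosstab builds an equivalent Name x Detection count table in one call.
alt = pd.crosstab(df["Name"], df["Detection"])

print(wide)
```

Calling `wide.plot.bar()` on this table (with matplotlib installed) should draw one group of bars per name, with one bar for each detection value.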

Your question does not focus on the processing of the dataframe, which makes people think you are asking for something that already has a good answer.