table() function in R: is there a better way with, e.g., dplyr?

Len Greski 2020-12-02 19:59:10

Another approach is to use tables::tabular() as follows.

textData <- "id Country                        Relationship_type
1 Algeria                                      2
2 Bulgaria                                     1
3 USA                                          2
4 Algeria                                      3
5 Germany                                      2
6 USA                                          1
7 Algeria                                      1
8 Bulgaria                                     3
9 USA                                          2
10 Algeria                                     2
11 Germany                                     1
12 USA                                         3"

df <- read.table(text=textData,header=TRUE)
library(tables)
tabular(Factor(Country) ~ Factor(Relationship_type),data=df)

...and the output:

          Relationship_type    
 Country  1                 2 3
 Algeria  1                 2 1
 Bulgaria 1                 0 1
 Germany  1                 1 0
 USA      1                 2 1

Still another approach is to recast the output from table() as a data frame, and pivot it wider with tidyr::pivot_wider().

# another approach: recast table output as data.frame
tableData <- data.frame(table(df$Country,df$Relationship_type))
library(dplyr)
library(tidyr)
tableData %>% 
     pivot_wider(id_cols = Var1,
                 names_from = Var2,
                 values_from = Freq)

...and the output:

> tableData %>% 
+      pivot_wider(id_cols = Var1,
+                  names_from = Var2,
+                  values_from = Freq)
# A tibble: 4 x 4
  Var1       `1`   `2`   `3`
  <fct>    <int> <int> <int>
1 Algeria      1     2     1
2 Bulgaria     1     0     1
3 Germany      1     1     0
4 USA          1     2     1

If we add a dplyr::rename() to the pipeline, we can rename the Var1 column to Country.

tableData %>% 
     pivot_wider(id_cols = Var1,
                 names_from = Var2,
                 values_from = Freq) %>%
     rename(Country = Var1)
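
Since the question asks about dplyr specifically, the same table can probably also be produced without table() at all. The following is a minimal sketch, not part of the original answer: it assumes tidyr 1.1.0 or later (for a scalar values_fill) and rebuilds the 12-row test data inline so that it runs on its own. The object name counts is only illustrative.

```r
library(dplyr)
library(tidyr)

# Same 12 observations as the test data above, rebuilt so this block is self-contained
df <- data.frame(
  id = 1:12,
  Country = c("Algeria", "Bulgaria", "USA", "Algeria", "Germany", "USA",
              "Algeria", "Bulgaria", "USA", "Algeria", "Germany", "USA"),
  Relationship_type = c(2, 1, 2, 3, 2, 1, 1, 3, 2, 2, 1, 3)
)

counts <- df %>%
  # one row per observed Country / Relationship_type pair, with its count in column n
  count(Country, Relationship_type) %>%
  # spread the relationship types into columns; pairs that never occur become 0
  pivot_wider(names_from = Relationship_type,
              values_from = n,
              values_fill = 0L)

counts
```

Unlike table(), count() only returns combinations that occur in the data, so values_fill is what supplies the zero cells (for example Bulgaria with relationship type 2). The key column is already named Country, so no rename() step is needed.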

As usual, there are many ways to accomplish this task in R. Depending on why the desired output is a CSV file, a variety of approaches could fit the requirements. If the ultimate goal is to create presentation-quality tables, it's worth a look at this summary of packages that create them: How gt fits with other packages that create display tables.
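
If the CSV file itself is the goal, a base R route may be enough. This is a sketch under assumptions rather than part of the original answer: the data is rebuilt inline, the names wide and out_file are hypothetical, and a temporary file stands in for whatever path is actually needed.

```r
# Same Country / Relationship_type values as the test data above
df <- data.frame(
  Country = c("Algeria", "Bulgaria", "USA", "Algeria", "Germany", "USA",
              "Algeria", "Bulgaria", "USA", "Algeria", "Germany", "USA"),
  Relationship_type = c(2, 1, 2, 3, 2, 1, 1, 3, 2, 2, 1, 3)
)

# Convert the two-way table to a wide data frame: one row per country,
# one column per relationship type
wide <- as.data.frame.matrix(table(df$Country, df$Relationship_type))

# Move the row names into a regular column so they survive the export
wide <- cbind(Country = rownames(wide), wide)

# Hypothetical output location; replace with a real path as needed
out_file <- tempfile(fileext = ".csv")
write.csv(wide, out_file, row.names = FALSE)
```

Reading the file back with read.csv(out_file, check.names = FALSE) keeps the numeric column names 1, 2 and 3 instead of converting them to X1, X2 and X3.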

chillos 2020-12-02 11:49:21

Thank you for the answer! I will definitely check out gt :) tabular() gave me only ones in each of the Relationship_type columns, but I have numerous participants from each Country, as shown in the table() output I added in the first post (it is not a binary yes-or-no situation), so I don't think it improves the current situation. Ooh, I hadn't noticed your pivot suggestion. That's what I have been looking for, THANK YOU!

Len Greski 2020-12-02 12:01:10

@chillos - thanks for the feedback. I've updated my answer. If you remove the *(N=1), tabular() will automatically count the number of observations. I also added test data to demonstrate that the updated solution works: in the output, both solutions now have a total of 12 counts across the table cells, and some of the counts are >1.
