How to count how many times each value is repeated in a data frame in R

0

I am new to R and I have the following question.

I have the following data in R

      Estado
1      1
2      1
3      2
4      3
5      3
6      3
7      4
8      5
9      5

and I want to get the number of times each value is repeated.

Expected output

Estado  Incidencias

    1   2
    2   1
    3   3
    4   1
    5   2

I hope you can help me. Thank you very much.

    
asked by Hermes Fce 16.04.2018 at 06:57

2 answers

2

Welcome.

In R there are several ways to do what you are looking for. The most direct one, using only the base functions that ship with R, is the function table().

Let's say your data.frame is called df and the column you want counts for is called Estados. In that case you can get the counts by asking R to build a table from the Estados column of df. I'll create some data with that structure so you can see an example:

df <- data.frame(Estados = c(1, 2, 3, 3, 3, 4, 5, 5))

table(df$Estados)    # The $ sign selects the data.frame column I'm interested in.

1 2 3 4 5
1 1 3 1 2

The console output has two lines: the top one shows the distinct values and the bottom one shows how many times each value appears.

It may be clearer if we use state names directly (in this case, states of Mexico):

nuevo_df <- data.frame(Estados = c("Tlaxcala", "Querétaro", "Nayarit", "Nayarit", "Querétaro", "Tlaxcala", "Chiapas", "Nayarit"))

table(nuevo_df$Estados)

 Chiapas   Nayarit Querétaro  Tlaxcala
       1         3         2         2

If you need the output to have the structure shown in your question, you can get it with the functions group_by() and tally(). They come from the dplyr package, which is loaded as part of tidyverse, so make sure tidyverse is installed first.

#install.packages("tidyverse")  # Only needed if the package is not installed yet. The leading # keeps the line from running.

library(tidyverse)           # Makes the package's functions available in your working environment.

nuevo_df %>%                 # First I call the data. The %>% symbol chains the operations together.
  group_by(Estados) %>%      # Then I ask for one group per unique value in Estados; I could group by more variables if I had them.
  tally()                    # tally() counts how many rows there are in each group.

This prints the following in the console:

# A tibble: 4 x 2
  Estados       n
  <fct>     <int>
1 Chiapas       1
2 Nayarit       3
3 Querétaro     2
4 Tlaxcala      2
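
If you would rather not install any extra package, the same two-column layout can probably be obtained in base R by converting the result of table() into a data.frame. This is only a sketch: the names conteo and resultado are made up for the example, and the data are the values from the question. Because the conversion returns the values as text, they are turned back into numbers at the end.

```r
# Example data with the same values as in the question
df <- data.frame(Estado = c(1, 1, 2, 3, 3, 3, 4, 5, 5))

# Count each distinct value (conteo is a hypothetical name)
conteo <- table(df$Estado)

# Turn the table into a data.frame; keep the values as text instead of factors
resultado <- as.data.frame(conteo, stringsAsFactors = FALSE)

# Rename the columns to match the layout asked for in the question
names(resultado) <- c("Estado", "Incidencias")

# The values came back as text, so convert them to numbers again
resultado$Estado <- as.numeric(resultado$Estado)

resultado
```

The result is an ordinary data.frame, so it can be sorted, filtered or merged like any other.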
    
answered by 16.04.2018 at 07:34
2

Another base R option is aggregate(), which works somewhat like a GROUP BY in SQL:

df <- data.frame(Estado = c(1, 1, 2, 3, 3, 3, 4, 5, 5))
# Group the values by the Estado column and count the rows in each group
new_df <- aggregate(df$Estado, by = df, FUN = length)

This groups the rows by the Estado column and applies length() to each group, which gives the number of occurrences of each value. The count column is named x by default, so all that remains is to rename it so the output looks the way you expect:

colnames(new_df)[2] <- "Incidencias"
new_df

  Estado Incidencias
1      1           2
2      2           1
3      3           3
4      4           1
5      5           2
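
As a variation, the rename step can likely be skipped by giving aggregate() named inputs, since it reuses those names for the output columns. This is a sketch under that assumption; conteo_df is a made-up name used only here.

```r
# Example data with the same values as in the question
df <- data.frame(Estado = c(1, 1, 2, 3, 3, 3, 4, 5, 5))

# x: the column to summarise, already carrying the name wanted in the output
# by: a named list with the grouping variable
# FUN: length() counts the rows in each group
conteo_df <- aggregate(
  x = data.frame(Incidencias = df$Estado),
  by = list(Estado = df$Estado),
  FUN = length
)

conteo_df
```

Passing the grouping variable as a named list also makes it easy to group by more than one column by adding further entries to the list.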
    
answered by 16.04.2018 at 15:48