RStudio removeWords question

3

I have just one question: is there a way to see how many words were removed by removeWords?

Thank you!

    
asked by Sergio Camilo 10.04.2017 at 02:03

1 answer

4

This is one way:

z = "R es un lenguaje de programación interpretado y un ambiente de desarrollo
especializado en estadística. Es utilizado por especialistas en estadística y
en minería de datos para el diseño de herramientas de software para el análisis
estadístico de datos. Es una implementación del lenguaje S, desarrollado por
Bell Labs en 1976. Aunque R funciona principalmente a través de una herramienta
de línea de comandos, existen varias interfaces gráficas disponibles (como
RCMDR y RStudio)."

# removeWords() comes from the tm package
library(tm)

# list of words to be removed
quitar = c("es","de","el")

zSin = removeWords(z,quitar)

# zSin no longer contains the words given in quitar

To count how many times each of these words was removed, first split the original text into individual words, cutting at spaces or line breaks ( \n ):

zSeparado = strsplit(z,split = "( |\n)")[[1]]

L = sapply(X = quitar,FUN = function(x){sum(zSeparado %in% x)})

Result:

> L
es de el 
 1  9  2 

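Nine occurrences of "de", two of "el" and one of "es" were removed from the text. Note that both removeWords and %in% are case sensitive, so the capitalized "Es" at the start of a sentence is neither removed nor counted.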
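One caveat: splitting on spaces misses a word that is attached to punctuation, such as "(de" or "el.", while removeWords, as far as I know, matches whole words with a regular expression and would still remove them. A minimal sketch of a count that uses word boundaries instead, in base R only, could look like this. The helper name contarPalabras and the sample sentence are made up for illustration, and the words are assumed to contain no regex metacharacters.

```r
# Hypothetical helper (not part of tm): counts whole-word, case-sensitive matches
contarPalabras = function(texto, palabras) {
  sapply(palabras, function(p) {
    # (*UCP) makes \b Unicode-aware; p is assumed to be a plain word
    m = gregexpr(sprintf("(*UCP)\\b%s\\b", p), texto, perl = TRUE)[[1]]
    sum(m > 0)  # gregexpr returns -1 when there is no match
  })
}

# Made-up sample sentence with words next to punctuation
texto = "Es el libro de Ana; el (de Luis) es nuevo, creo que es de el."
quitar = c("es", "de", "el")

# Count by word boundaries
porRegex = contarPalabras(texto, quitar)

# Count by splitting on spaces or line breaks, as in the answer above
porEspacios = sapply(quitar, function(x) sum(strsplit(texto, "( |\n)")[[1]] %in% x))

print(porRegex)       # es = 2, de = 3, el = 3
print(porEspacios)    # es = 2, de = 2, el = 2 ("(de" and "el." are missed)
print(sum(porRegex))  # total number of matching words: 8
```

If only the overall total is needed, sum() over the per-word counts gives it directly, as in the last line.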

    
answered 05.05.2017 at 06:57