I only have one question: is there any way to see how many words were removed using removeWords?
Thank you!
This is one way:
z = "R es un lenguaje de programación interpretado y un ambiente de desarrollo
especializado en estadística. Es utilizado por especialistas en estadística y
en minería de datos para el diseño de herramientas de software para el análisis
estadístico de datos. Es una implementación del lenguaje S, desarrollado por
Bell Labs en 1976. Aunque R funciona principalmente a través de una herramienta
de línea de comandos, existen varias interfaces gráficas disponibles (como
RCMDR y RStudio)."
library(tm)  # provides removeWords
# list of words to be removed
quitar = c("es", "de", "el")
zSin = removeWords(z, quitar)
zSin no longer contains the words given in quitar.
To count how many times each of these words was removed, first split the text into individual words, cutting at spaces or line breaks (\n):
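# split the text into individual words
zSeparado = strsplit(z, split = "( |\n)")[[1]]
# for each word in quitar, count how many times it occurs in the text
L = sapply(X = quitar, FUN = function(x){sum(zSeparado %in% x)})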
Result:
> L
es de el
 1  9  2
So "de" was removed 9 times, "el" twice and "es" once. Note that both removeWords and %in% are case sensitive, so the capitalised "Es" that starts two of the sentences is neither removed nor counted.
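If capitalised forms or words followed by punctuation (such as "de,") should be counted too, the tokens can be normalised before counting. The following is only a minimal base R sketch: count_words is a hypothetical helper, not part of tm, and texto is a made-up sample string. For the counts to match what removeWords actually removes, the text would presumably have to be lower-cased first as well, for example removeWords(tolower(z), quitar).

```r
# Hypothetical helper (not part of tm): count how often each word in `words`
# occurs in `text`, ignoring case and leading/trailing punctuation.
count_words <- function(text, words) {
  # split on any run of whitespace (spaces, tabs, line breaks)
  tokens <- strsplit(text, split = "[[:space:]]+")[[1]]
  # drop punctuation stuck to the start or end of a token, e.g. "de," -> "de"
  tokens <- gsub("^[[:punct:]]+|[[:punct:]]+$", "", tokens)
  tokens <- tolower(tokens)
  # one count per word; the result is named after the words
  sapply(words, function(w) sum(tokens == w))
}

# made-up sample text, chosen only to exercise capitals and punctuation
texto <- "Es el perro de el vecino; de verdad, es de, creo, El"
quitar <- c("es", "de", "el")

conteo <- count_words(texto, quitar)
conteo       # named vector with one count per word
sum(conteo)  # total number of matching words
```

For this sample the counts are 2 for "es", 3 for "de" and 3 for "el", so sum(conteo) gives 8 in total. The same sum() call on L from the answer above gives the total number of removed words there.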