The error occurs because at least one of the columns is a factor
rather than numeric data. Here is an example:
normalize <- function(x) {return ((x - min(x)) / (max(x) - min(x)))}
v <- factor(rnorm(10))
normalize(v)
Error in Summary.factor(c(9L, 3L, 4L, 5L, 6L, 7L, 8L, 2L, 1L, 10L), na.rm = FALSE) :
‘min’ not meaningful for factors
Even the first function called, min(), is not meaningful for a factor. Here are some possibilities:
1. The data really is a factor and should not be normalized
You can apply normalize conditionally, or modify normalize so that it decides what to do when it receives a factor, but clearly this function should not be applied to columns that are factors. My preference is therefore to apply normalize only to the relevant columns and not to the whole data.frame. With names(datos)[!sapply(datos, is.factor)] we get the names of the columns that are not factors, so we can subset the original data.frame and apply normalize to the result:
lapply(datos[, names(datos)[!sapply(datos, is.factor)], drop = FALSE], normalize)
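As a minimal sketch of this approach, assuming a small made-up data.frame (the columns edad, peso and grupo are hypothetical and not taken from the question), the numeric columns can be normalized in place while the factor column is left untouched:

```r
# Hypothetical example data: two numeric columns and one factor column
datos <- data.frame(
  edad  = c(20, 30, 40),
  peso  = c(50, 75, 100),
  grupo = factor(c("a", "b", "a"))
)

normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Logical vector: TRUE for the columns that are NOT factors
no_factor <- !sapply(datos, is.factor)

# Normalize only those columns and write them back; grupo stays as it is
datos[no_factor] <- lapply(datos[no_factor], normalize)

datos$edad   # 0.0 0.5 1.0
```

Indexing with a logical vector avoids the pitfall of negating it with a minus sign, which would turn TRUE/FALSE into -1/0 and drop the wrong columns.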
2. The data is a factor, but we still want to normalize it
Converting a factor to numeric (as long as it makes sense to do so) is done with as.numeric(as.character(x)), so your lapply could be written like this:
lapply(datos, function(x) {if (is.factor(x)) {x <- as.numeric(as.character(x))}; normalize(x)})
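The intermediate as.character() matters. A short sketch, using a made-up factor f, of what happens with and without it:

```r
# Hypothetical factor whose labels look like numbers
f <- factor(c("10", "20", "10", "30"))

# Without as.character() we get the internal level codes, not the values
codigos <- as.numeric(f)                 # 1 2 1 3

# Going through character first recovers the values the labels represent
valores <- as.numeric(as.character(f))   # 10 20 10 30

normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

normalize(valores)                       # 0.0 0.5 0.0 1.0
```

If any label cannot be parsed as a number, as.numeric() returns NA for it with a warning, which is a sign that the conversion may not make sense for that column.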
Note: where possible, rather than converting the factor into a numeric, it is better to find out why reading the csv produces a factor and correct that.
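One common cause, shown here only as an assumption about what might be in the file, is a non-numeric marker for missing values inside an otherwise numeric column. The csv text below is invented for illustration:

```r
# Hypothetical csv content in which a missing value was typed as "n/a"
texto <- "valor,grupo\n1.5,a\nn/a,b\n3.5,a"

# Read as is: "n/a" forces the whole column to be treated as text,
# which becomes a factor when stringsAsFactors = TRUE
# (that was the default before R 4.0.0)
crudo <- read.csv(text = texto, stringsAsFactors = TRUE)
class(crudo$valor)    # "factor"

# Declaring the missing-value marker lets read.csv parse the column as numeric
limpio <- read.csv(text = texto, na.strings = "n/a", stringsAsFactors = TRUE)
class(limpio$valor)   # "numeric"
```

With the column read as numeric it now contains an NA, so min() and max() inside normalize would need na.rm = TRUE to ignore it.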