Error with the apply function

I had the following problem:

I have a matrix with several columns, and for each column I want to count how many of its last 6 cells are not NA (so the result can range from 0 to 6).

To solve it, I wrote the following function and called it with apply, but the result is a vector with incorrect values. I would really appreciate some help. Here is my code:

comprobacion<-function(datos,x){
  G<-tail(datos[,x],6)!="NA"
  sum(G, na.rm = T)
}

sumas<-apply(ret.2,2,comprobacion,datos=ret.2)
    
asked by dami 03.10.2018 at 17:02

1 answer

dami,

Welcome. It is always a good idea to include at least part of your data to make answering easier. In this case, though, I think the description is enough. I created some data that (I believe) has the same structure as yours and used it to look for a solution.

The data

set.seed(2018)
matriz <- matrix(runif(100, 1, 20)[sample(x = c(NA, TRUE), 
                                          prob = c(0.3, 0.7), 
                                          size = 50, 
                                          replace = TRUE)
                                   ], 
                  ncol = 10)

This code creates a vector of 100 random numbers, replaces roughly 30% of them with NA, and converts it to a 10 x 10 matrix.

The function

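# Count the NA values among the last 6 elements of a vector
comprobacion <- function(x) {
  sum(is.na(tail(x, 6)))
}

# apply() hands each column of matriz to comprobacion as a vector
apply(matriz, 2, comprobacion)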

Explanation

The NA values in R are logical values (although different from TRUE and FALSE). Therefore it is not possible to test for them with == or !=, because any comparison involving NA returns NA. With == "NA" you would only get TRUE when the value is the character string "NA", and that is not what you are interested in.
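As a small illustration of the point above, this sketch compares a vector containing a missing value against the string "NA". The vector x and the variable names are made up for the example.

```r
# Hypothetical example vector with one missing value
x <- c(3.2, NA, 7.5)

# The numbers are coerced to character for the comparison;
# the missing value does not match "NA", it propagates as NA
cmp_missing <- x == "NA"

# Only the literal character string "NA" compares equal to "NA"
cmp_string <- c("a", "NA") == "NA"

print(cmp_missing)
print(cmp_string)
```

The missing position comes back as NA rather than TRUE or FALSE, which is why a comparison-based test cannot detect it.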

To test whether a value is NA, the function is.na() is used. If we pass a vector to is.na(), it returns a logical vector of the same length, with TRUE in the positions where it found NA and FALSE where it found anything else.

One property of R is that when a logical vector is coerced to a numeric one (which is what we implicitly do with sum()), TRUE is treated as 1 and FALSE as 0. Therefore the sum gives the number of TRUE values, that is, the NA values that were silently converted to 1.
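The two steps just described can be sketched on a short vector. The vector v and the variable names are invented for the illustration.

```r
# Hypothetical vector with two missing values
v <- c(4.1, NA, 9.8, NA, 2.0)

# is.na() marks each position: TRUE where the value is missing
missing_flags <- is.na(v)

# sum() coerces the logical vector to numbers (TRUE = 1, FALSE = 0),
# so the total is the number of missing values
n_missing <- sum(missing_flags)

print(missing_flags)
print(n_missing)
```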

tail(), which you were already using in your original function, returns the last elements of a data structure, in this case the last 6.

Then, if we sum the logical vector produced by the is.na() test on the last 6 elements of each column, we obtain the number of NA values in that range. Since you want the cells that are not NA, negate the test: sum(!is.na(tail(x, 6))).
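Because the question asks for the cells that are not NA, here is a minimal sketch of the negated version on a single vector. The function name contar_validos and the sample vector columna are hypothetical, chosen only for this example.

```r
# Hypothetical helper: count the non-NA values among the last 6 elements
contar_validos <- function(x) {
  sum(!is.na(tail(x, 6)))
}

# Made-up column with 9 values; its last 6 are 2.0, 3.1, NA, 4.4, NA, 5.0
columna <- c(NA, 1.5, NA, 2.0, 3.1, NA, 4.4, NA, 5.0)

ultimos   <- tail(columna, 6)
n_na      <- sum(is.na(ultimos))       # missing values in the last 6
n_validos <- contar_validos(columna)   # non-missing values in the last 6

print(n_na)
print(n_validos)
```

Note that tail() simply returns the whole vector when it has fewer than 6 elements, so for short columns the count is taken over whatever is there.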

apply() is responsible for calling a function on each column, or row, of a matrix. It hands the data, already separated, to the function comprobacion, so there is no need to tell comprobacion to work by columns, take subsets, or specify the data. The function we pass to apply() will only ever see vectors. This makes things easier: our function, in this case comprobacion, does not have to manage that part of the execution. We write and test it with a vector, and it will work well for matrices or data frames. Therefore, it is not necessary to specify datos either in the function or in the apply() call.
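To see that the function only ever receives vectors, this sketch runs the same comprobacion on a small hand-made matrix whose result can be checked by eye. The matrix m is invented for the example. It also hints at why the original call misbehaved: with datos = ret.2 supplied by name, apply() passes each column's values as x, so datos[, x] would use those values as column indices.

```r
# Small made-up matrix: 8 rows, 2 columns
m <- matrix(c(1, NA, 3, 4, NA, 6, 7, 8,
              NA, NA, NA, 4, 5, NA, 7, NA),
            ncol = 2)

# Same function as in the answer: counts NA in the last 6 elements
comprobacion <- function(x) {
  sum(is.na(tail(x, 6)))
}

# MARGIN = 2 means "by column"; each column arrives as a plain vector
na_por_columna <- apply(m, 2, comprobacion)

# Calling the function directly on one column gives the same number
directo <- comprobacion(m[, 1])

print(na_por_columna)
```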

  

If your data is different or the function is not doing what you need, please mention it in a comment or rephrase the question to make it more specific. There is surely a solution.

    
answered 03.10.2018 at 21:08