Avoiding the use of for

2

Once again I am trying to do things without using for. I have a problem that can be solved with for, but I would like to approach it without one.

I have a data set of days, and I want to add a variable to the same data set giving the month that each day belongs to.

inicio <- c("01012018","01022018","01032018","01042018","01052018","01062018",
            "01072018","01082018","01092018","01102018","01112018","01122018")

fin <- c("31012018","28022018","31032018","30042018","31052018","30062018",
         "31072018","31082018","30092018","31102018","30112018","31122018")

mes <- c("enero","febrero","marzo","abril","mayo","junio",
         "julio","agosto","septiembre","octubre","noviembre","diciembre")

fechas <- data.table(inicio,fin,mes)

dias <- c("01032018","02042018","14062018","13012018","20102018")

datos <- data.table(dias)

datos$mes <- data.table(ifelse(fechas$inicio <= datos$dias & datos$dias <= fechas$fin,fechas$mes,"error"))

The output I get is the following:

       dias     mes
1: 01032018   enero
2: 02042018 febrero
3: 14062018   marzo
4: 13012018   abril
5: 20102018    mayo

As you can see, the result is not correct. In addition, I get the following warnings:

Warning messages:
1: In fechas$inicio <= datos$dias :
  longer object length is not a multiple of shorter object length
2: In datos$dias <= fechas$fin :
  longer object length is not a multiple of shorter object length
3: In '[<-.data.table'(x, j = name, value = value) :
  Supplied 12 items to be assigned to 5 items of column 'mes' (7 unused)
    
asked by Uko 03.06.2018 at 15:53

1 answer

0

There are several issues with the way you have posed the problem. Let's see:

  • First of all, I assume this is some kind of exercise, because obtaining the month from a date is trivial with date functions.

  • One of the problems is that you are not working with data of date type but with character strings, so the comparisons are made in alphabetical order, and with the format you have used that order does not match the chronological one. For dates stored as character strings to sort correctly, the fields must go from the largest unit to the smallest, that is, in YYYYMMDD format.

  • ifelse() does not work in this case. It is normally applied to the data of a single object or of two objects with matching dimensions. Here fechas (12 rows) and datos (5 rows) do not correspond row by row, hence the warnings and the wrong results.

  • Solution

    First of all, let's "reformat" the dates:

    library(data.table)
    
    inicio <- c("01012018","01022018","01032018","01042018","01052018","01062018",
                "01072018","01082018","01092018","01102018","01112018","01122018")
    fin <- c("31012018","28022018","31032018","30042018","31052018","30062018",
             "31072018","31082018","30092018","31102018","30112018","31122018")
    mes <- c("enero","febrero","marzo","abril","mayo","junio",
             "julio","agosto","septiembre","octubre","noviembre","diciembre")
    
    # Reformat the dates from DDMMYYYY to YYYYMMDD
    inicio <- paste0(substring(inicio,5,8),substring(inicio,3,4),substring(inicio,1,2))
    fin <- paste0(substring(fin,5,8),substring(fin,3,4),substring(fin,1,2))
    
    fechas <- data.table(inicio,fin,mes)
    
    dias <- c("01032018","02042018","14062018","13012018","20102018")
    # Reformat the dates from DDMMYYYY to YYYYMMDD
    dias <- paste0(substring(dias,5,8),substring(dias,3,4),substring(dias,1,2))
    datos <- data.table(dias)
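
    To see why the reformatting matters, here is a minimal sketch in base R that compares the same two dates in both formats. The names ddmmyyyy, yyyymmdd, cmp_original and cmp_reformateado are only illustrative and do not appear in the question.

    ```r
    # Two dates as DDMMYYYY strings: 13 January 2018 and 1 March 2018.
    ddmmyyyy <- c("13012018", "01032018")

    # Strings are compared character by character from the left,
    # so here the day digits decide before the month is even looked at.
    cmp_original <- ddmmyyyy[1] < ddmmyyyy[2]

    # The same dates rearranged as YYYYMMDD (year, then month, then day).
    yyyymmdd <- paste0(substring(ddmmyyyy, 5, 8),
                       substring(ddmmyyyy, 3, 4),
                       substring(ddmmyyyy, 1, 2))
    cmp_reformateado <- yyyymmdd[1] < yyyymmdd[2]

    cmp_original      # FALSE, although January comes before March
    cmp_reformateado  # TRUE
    ```

    With the year first, alphabetical order and chronological order coincide, which is what the interval lookup relies on.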
    

    Since in this case the date ranges do not overlap, we can use just one of the limits, fechas$inicio, together with the function findInterval(), in the following way:

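    # findInterval() works on numbers, so convert the YYYYMMDD strings explicitly;
    # fechas$inicio holds the sorted lower limits of the 12 intervals
    datos$mes <- fechas$mes[findInterval(as.numeric(datos$dias), as.numeric(fechas$inicio))]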
    datos
    
           dias     mes
    1: 20180301   marzo
    2: 20180402   abril
    3: 20180614   junio
    4: 20180113   enero
    5: 20181020 octubre
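
    As mentioned in the first point, outside of an exercise the month can be obtained directly with date functions. The following is a minimal sketch in base R, assuming the original DDMMYYYY strings; fechas_dias and mes_dias are illustrative names that are not part of the question.

    ```r
    dias <- c("01032018", "02042018", "14062018", "13012018", "20102018")
    mes  <- c("enero", "febrero", "marzo", "abril", "mayo", "junio",
              "julio", "agosto", "septiembre", "octubre", "noviembre", "diciembre")

    # Parse the DDMMYYYY strings into Date objects.
    fechas_dias <- as.Date(dias, format = "%d%m%Y")

    # format(, "%m") returns the month number as text ("01" to "12");
    # converted to integer, it indexes the vector of month names.
    mes_dias <- mes[as.integer(format(fechas_dias, "%m"))]
    mes_dias
    ```

    Indexing mes by the month number avoids months(), whose output depends on the locale of the R session. The operation is vectorised, so no for is needed here either.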
    
        
    answered by 03.06.2018 at 17:43