How can I save the result of a calculation from every loop iteration in my data frame?


With the following code I want to split two data frames by the date field, apply a calculation to each date, and save the results for all dates in a single data frame. This is what I tried:

for (i in unique(temp_maxmin$date)) {
    Tmaxmin_subset<- temp_maxmin[temp_maxmin$date==i,]
    radiosond_subset<-radiosond[radiosond$date==i,]
    basedata<-as.data.frame(seq(2500,10000,50))
    colnames(basedata)<-c("height")
    basedata$date<-Tmaxmin_subset$date
    basedata$Tmax<-(Tmaxmin_subset$max_val[[1]]-(9.8/1000*(basedata$height-radiosond_subset$height[[1]])))
    basedata$Classic<-(Tmaxmin_subset$classic[[1]]-(9.8/1000*(basedata$height-radiosond_subset$height[[1]])))
    basedata$ICU<-(Tmaxmin_subset$icubogota[[1]]-(9.8/1000*(basedata$height-radiosond_subset$height[[1]])))
}

I am having a couple of problems. The first and most important is that the loop does not keep the data for every date, only for the last one. I also have not found a way to run the calculation only when both subsets have rows for the same date and to skip it otherwise.

Sample of the two data frames (temp_maxmin first, then radiosond):

        date  max_val  min_val  classic icubogota
      <date>    <dbl>    <dbl>    <dbl>     <dbl>
1 2006-04-17 290.4017 283.5183 288.5183  286.5183
2 2006-04-18 291.9247 283.2837 288.2837  286.2837
3 2006-04-19 292.3280 283.9537 288.9537  286.9537
4 2006-04-20 292.0320 284.1527 289.1527  287.1527
5 2006-04-21 290.8660 282.9913 287.9913  285.9913
6 2006-04-22 290.9757 282.6947 287.6947  285.6947



  press height   temp dwpt relh mixr drct sknt  thta  thte  thtv       date
1 753.0   2546 281.35  6.5   89 8.12    0    0 305.1 329.8 306.6 1999-03-11
2 740.0   2688 282.75  6.5   81 8.27   47    4 308.1 333.6 309.7 1999-03-11
3 735.1   2743 282.65  6.2   80 8.13   65    5 308.6 333.7 310.1 1999-03-11
4 708.2   3048 281.85  4.3   74 7.40  160   10 311.0 334.1 312.4 1999-03-11
5 700.0   3143 281.55  3.7   72 7.18  135    8 311.8 334.3 313.1 1999-03-11
6 682.4   3353 280.05  2.9   76 6.95  135    7 312.4 334.3 313.7 1999-03-11

Any help is welcome.

    
asked by kriouz 14.11.2017 at 16:34

2 answers


You have a conceptual problem. Inside the for loop, on every iteration you run:

basedata<-as.data.frame(seq(2500,10000,50))

This redefines the data.frame every time: on each pass basedata is created again from scratch and then filled in, so at the end of the loop basedata holds only the values from the last iteration.
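A minimal illustration of that effect, using a hypothetical toy loop rather than your data (the name `result` and the values are made up for the example):

```r
# Toy example: the object is reassigned inside the loop,
# so each pass throws away what the previous pass stored.
result <- NULL
for (i in 1:3) {
  result <- data.frame(iteration = i)  # replaces the previous contents
}

nrow(result)       # a single row survives
result$iteration   # and it comes from the last pass (i = 3)
```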

I should point out that there are much better ways to solve this than a for loop, but I'll leave that for you to investigate. Based on your example, what you can do is:

temp_maxmin <- read.table(text="date max_val min_val classic icubogota
2006-04-17 290.4017 283.5183 288.5183 286.5183
2006-04-18 291.9247 283.2837 288.2837 286.2837
2006-04-19 292.3280 283.9537 288.9537 286.9537
2006-04-20 292.0320 284.1527 289.1527 287.1527
2006-04-21 290.8660 282.9913 287.9913 285.9913
2006-04-22 290.9757 282.6947 287.6947 285.6947", sep=" ", header=TRUE)

radiosond <- read.table(text="press height temp dwpt relh mixr drct sknt thta thte thtv date
753.0 2546 281.35 6.5 89 8.12 0 0 305.1 329.8 306.6 1999-03-11
740.0 2688 282.75 6.5 81 8.27 47 4 308.1 333.6 309.7 2006-04-17
735.1 2743 282.65 6.2 80 8.13 65 5 308.6 333.7 310.1 2006-04-17
708.2 3048 281.85 4.3 74 7.40 160 10 311.0 334.1 312.4 1999-03-11
700.0 3143 281.55 3.7 72 7.18 135 8 311.8 334.3 313.1 1999-03-11
682.4 3353 280.05 2.9 76 6.95 135 7 312.4 334.3 313.7 2006-04-22", sep=" ", header=TRUE)

basedata<-data.frame()
for (i in unique(temp_maxmin$date)) {
    Tmaxmin_subset <- temp_maxmin[temp_maxmin$date==i,]
    radiosond_subset <- radiosond[radiosond$date==i,]

    # Only compute when both inputs have rows for this date
    if (nrow(Tmaxmin_subset)!= 0 & nrow(radiosond_subset)!=0) {
        tmp<-as.data.frame(seq(2500,10000,50))
        colnames(tmp)<-c("height")

        # Use the first matching row so the value is recycled over every height
        tmp$date <- Tmaxmin_subset$date[[1]]
        tmp$Tmax <- (Tmaxmin_subset$max_val[[1]]-(9.8/1000*(tmp$height-radiosond_subset$height[[1]])))
        tmp$Classic <- (Tmaxmin_subset$classic[[1]]-(9.8/1000*(tmp$height-radiosond_subset$height[[1]])))
        tmp$ICU <- (Tmaxmin_subset$icubogota[[1]]-(9.8/1000*(tmp$height-radiosond_subset$height[[1]])))

        if (nrow(basedata)==0) {
            basedata <- tmp
        } else {
            # rbind() on data.frames keeps the column types;
            # going through as.matrix() would turn every column into text
            basedata <- rbind(basedata, tmp)
        }
    }
}

Comments:

  • basedata is created outside the loop as an empty data.frame.
  • I added a check so that the calculation runs only if both input data.frames have rows for the current date: if (nrow(Tmaxmin_subset)!= 0 & nrow(radiosond_subset)!=0)
  • Then we create a temporary data.frame, tmp, and fill it in using your logic.
  • On the first pass ( nrow(basedata)==0 ) we assign tmp to basedata; otherwise we use rbind() to append tmp to basedata.
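
As one possible variation on the same idea, here is a sketch that collects one piece per date in a list and binds them once at the end, instead of growing basedata inside the loop. It assumes the dates are stored as text, and it uses small stand-in data frames built from a few of the sample values; `common_dates`, `lapse` and `pieces` are names invented for this example.

```r
# Stand-in data (a few values from the sample; dates kept as text)
temp_maxmin <- data.frame(
  date      = c("2006-04-17", "2006-04-18", "2006-04-22"),
  max_val   = c(290.4017, 291.9247, 290.9757),
  classic   = c(288.5183, 288.2837, 287.6947),
  icubogota = c(286.5183, 286.2837, 285.6947),
  stringsAsFactors = FALSE
)
radiosond <- data.frame(
  height = c(2546, 2688, 2743, 3353),
  date   = c("1999-03-11", "2006-04-17", "2006-04-17", "2006-04-22"),
  stringsAsFactors = FALSE
)

# Keep only the dates that appear in both inputs
common_dates <- intersect(temp_maxmin$date, radiosond$date)

lapse <- 9.8 / 1000  # same constant as in the question

# Build one data.frame per common date
pieces <- lapply(common_dates, function(d) {
  t_row  <- temp_maxmin[temp_maxmin$date == d, ][1, ]  # first matching row
  h_base <- radiosond$height[radiosond$date == d][1]   # first matching height
  height <- seq(2500, 10000, 50)
  data.frame(
    height  = height,
    date    = d,
    Tmax    = t_row$max_val   - lapse * (height - h_base),
    Classic = t_row$classic   - lapse * (height - h_base),
    ICU     = t_row$icubogota - lapse * (height - h_base),
    stringsAsFactors = FALSE
  )
})

# Bind all pieces in a single call
basedata <- do.call(rbind, pieces)
```

Because intersect() already restricts the work to dates present in both tables, no explicit nrow() check is needed in this version.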
answered 15.11.2017 at 18:32

You should take a look at the Tidyverse or the data.table package; they give you better tools for working with data frames. I also think you could take a look at purrr, whose "map" functions let you apply a function across lists and/or data frames. (I'm posting this as an answer rather than a comment because my reputation on the Spanish site does not let me comment.) Regards, and I hope this helps.
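
To give a rough idea of what that could look like, here is a small sketch with purrr's pmap(), which calls a function once per set of parallel values. The helper `lapse_profile()` and the three input vectors are hypothetical stand-ins, assumed to be already restricted to dates present in both tables; only Tmax is computed to keep it short.

```r
library(purrr)

# Hypothetical helper: builds the height profile for one date
lapse_profile <- function(date, surface_value, base_height) {
  height <- seq(2500, 10000, 50)
  data.frame(
    date   = date,
    height = height,
    Tmax   = surface_value - 9.8 / 1000 * (height - base_height),
    stringsAsFactors = FALSE
  )
}

# Stand-in inputs, one element per date
dates    <- c("2006-04-17", "2006-04-22")
max_vals <- c(290.4017, 290.9757)
heights  <- c(2688, 3353)

# pmap() passes the i-th element of each vector to the helper, in order
pieces   <- pmap(list(dates, max_vals, heights), lapse_profile)
basedata <- do.call(rbind, pieces)
```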

    
answered 18.11.2017 at 17:14