ifelse() would only work for you if both data.frames have the same number of records, the same IDs, and the same row order:
df1 <- data.frame(ID=c(1,2,3), edad=c(20, 30, 45))
df2 <- cbind(df1, sexo = c("F", "M", "F"))
df1$NuevaVariable <- ifelse(df1$ID == df2$ID & df1$edad == df2$edad, as.character(df2$sexo) ,"NO")
df1
  ID edad NuevaVariable
1  1   20             F
2  2   30             M
3  3   45             F
As you can see, df2 is identical to df1 except for one additional column. Importantly, the comparison uses the vectorized logical AND, which is &. Do not use && in these cases: it is not vectorized and historically checked only the first element of each vector (from R 4.3.0 onward it raises an error when given vectors longer than one element).
If the two data.frames had different numbers of rows, different IDs, or a different row order, this statement would either fail or silently give the wrong result. For example, simply changing the row order of df2 in the example:
df2 <- df2[c(2,3,1), ]
df1$NuevaVariable <- ifelse(df1$ID == df2$ID & df1$edad == df2$edad, as.character(df2$sexo) ,"NO")
df1
  ID edad NuevaVariable
1  1   20            NO
2  2   30            NO
3  3   45            NO
As you can see, it no longer works as expected. For what you are trying to do, merge() is without a doubt the appropriate tool:
df1 <- data.frame(ID=c(1,2,3), edad=c(20, 30, 45))
df2 <- cbind(df1, sexo = c("F", "M", "F"))
df1 <- merge(df1, df2, by = c("ID","edad"), all.x=TRUE)
df1
  ID edad sexo
1  1   20    F
2  2   30    M
3  3   45    F
merge() matches rows on the columns given in by. In this case the key columns have the same names in both tables, so by is enough; otherwise you would have to name them separately for each table with by.x and by.y. The other important parameter is all.x = TRUE, which tells merge() to keep every row of df1, whether or not it has a match in df2.
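To illustrate by.x, by.y and all.x, here is a minimal sketch. The tables personas and sexos and the column id_persona are made up for this example; they are not part of the question's data.

```r
# Hypothetical data: the key column has a different name in each table,
# and sexos has no row for ID 2
personas <- data.frame(ID = c(1, 2, 3), edad = c(20, 30, 45))
sexos <- data.frame(id_persona = c(3, 1), sexo = c("F", "F"))

# by.x / by.y name the key column in each table; all.x = TRUE keeps
# every row of personas and fills sexo with NA where there is no match
resultado <- merge(personas, sexos, by.x = "ID", by.y = "id_persona", all.x = TRUE)
resultado
```

The row for ID 2 is kept with NA in sexo. Without all.x = TRUE, that row would be dropped from the result.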
If you only want the new column, you could do:
df1$NuevaVariable <- merge(df1, df2, by = c("ID","edad"), all.x=TRUE)[, "sexo"]
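One caveat with the line above: by default merge() sorts its result by the key columns (sort = TRUE), so the extracted column lines up with df1 only if df1 is already in that order. A possible order-preserving alternative is a lookup with match(). This is only a sketch, assuming each ID/edad combination appears at most once in df2; the helper names clave1 and clave2 are made up for the example.

```r
# df1 is deliberately not sorted by ID
df1 <- data.frame(ID = c(3, 1, 2), edad = c(45, 20, 30))
df2 <- data.frame(ID = c(1, 2, 3), edad = c(20, 30, 45), sexo = c("F", "M", "F"))

# Build a composite key from both columns in each table
clave1 <- paste(df1$ID, df1$edad, sep = "_")
clave2 <- paste(df2$ID, df2$edad, sep = "_")

# match() returns, for each row of df1, the position of its key in df2
# (NA if there is none), so the original row order of df1 is preserved
df1$NuevaVariable <- df2$sexo[match(clave1, clave2)]
df1
```

Rows of df1 with no counterpart in df2 get NA, which mirrors the behaviour of all.x = TRUE.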