What you actually want is to substitute one value for another, so you do not need gsub. It is a little simpler to put together a data.frame of replacement values. I will use the data from your previous question as an example:
Specieslevel <- read.table(text="Site Date Habitat Season Year Taxa
Q1F 08_09_2015 Oak Autumn 2015-2016 Artemisia_herba_alta
Q2F 08_09_2015 Oak Autumn 2015-2016 Artemisia_herba_alta
Q4F 08_09_2015 Oak Autumn 2015-2016 Allium
Q1P 08_09_2015 Oak Autumn 2015-2016 Artemisia_herba_alta
Q2P 08_09_2015 Oak Autumn 2015-2016 Amaranthus
Q4P 08_09_2015 Oak Autumn 2015-2016 Anacyclus
Q4P 08_09_2015 Oak Autumn 2015-2016 Asparagus
Q4P 08_09_2015 Oak Autumn 2015-2016 Amaranthus_retroflex", sep=" ", header=TRUE, stringsAsFactors=FALSE)
species <- data.frame(value = c('Artemisia_herba_alta', 'Amaranthus_retroflex'),
                      replaceby = c('Artemisia', 'Amaranthus'),
                      stringsAsFactors=FALSE)
matches <- match(Specieslevel$Taxa, species$value, nomatch=0)
Specieslevel$Taxa[matches>0] <- species$replaceby[matches]
Specieslevel
The final result:
Site Date Habitat Season Year Taxa
1 Q1F 08_09_2015 Oak Autumn 2015-2016 Artemisia
2 Q2F 08_09_2015 Oak Autumn 2015-2016 Artemisia
3 Q4F 08_09_2015 Oak Autumn 2015-2016 Allium
4 Q1P 08_09_2015 Oak Autumn 2015-2016 Artemisia
5 Q2P 08_09_2015 Oak Autumn 2015-2016 Amaranthus
6 Q4P 08_09_2015 Oak Autumn 2015-2016 Anacyclus
7 Q4P 08_09_2015 Oak Autumn 2015-2016 Asparagus
8 Q4P 08_09_2015 Oak Autumn 2015-2016 Amaranthus
Explanation:
- We create a data.frame, species, that holds the search values and their replacements. It should also work if the variables are factors. I define it as a data.frame to make the code more readable, but it could just as well be a matrix, which is a little less to write.
- With match(Specieslevel$Taxa, species$value, nomatch=0) we get a vector with one element per row of Specieslevel. Each element is the row of species that holds the replacement, or 0 when there is no match.
- We apply these matches, replacing only the elements that have one: Specieslevel$Taxa[matches>0] <- species$replaceby[matches]. Indexing with 0 selects nothing, so the right-hand side has exactly one value per matched element.
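To make the intermediate values visible, here is a minimal sketch on toy vectors. The names taxa, value and replaceby are made up for this illustration and stand in for Specieslevel$Taxa and the two columns of species:

```r
# Toy stand-ins for Specieslevel$Taxa and the species lookup table
taxa      <- c("Artemisia_herba_alta", "Allium", "Amaranthus_retroflex", "Allium")
value     <- c("Artemisia_herba_alta", "Amaranthus_retroflex")
replaceby <- c("Artemisia", "Amaranthus")

# Position of each taxon in `value`, or 0 when it is not listed
matches <- match(taxa, value, nomatch = 0)
print(matches)              # 1 0 2 0

# Zeros are dropped when indexing, leaving one replacement per matched element
print(replaceby[matches])   # "Artemisia" "Amaranthus"

# Overwrite only the matched positions
taxa[matches > 0] <- replaceby[matches]
print(taxa)                 # "Artemisia" "Allium" "Amaranthus" "Allium"
```

The two sides of the assignment line up because matches > 0 selects exactly as many positions as there are non-zero entries in matches.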
What if the column of the data.frame that we want to modify is a factor? The previous code does not work, because it operates on the complete column, whereas with a factor we only need to work on its levels. The solution is even simpler and faster:
sp <- levels(Specieslevel$Taxa)
matches <- match(sp, species$value, nomatch=0)
sp[matches>0] <- species$replaceby[matches]
levels(Specieslevel$Taxa) <- sp
In this case we first store the levels of the column in a vector (sp <- levels(Specieslevel$Taxa)) and do the match and the replacement on that vector. Then we only need to redefine the levels of the column with levels(Specieslevel$Taxa) <- sp.
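As a small self-contained sketch of the factor case, again on a made-up toy vector (taxa, value and replaceby are illustrative names, not part of your data), note that assigning duplicated names to levels() merges the corresponding levels:

```r
# Toy factor standing in for a factor version of Specieslevel$Taxa
taxa      <- factor(c("Artemisia_herba_alta", "Allium",
                      "Amaranthus", "Amaranthus_retroflex"))
value     <- c("Artemisia_herba_alta", "Amaranthus_retroflex")
replaceby <- c("Artemisia", "Amaranthus")

# Work on the levels only, not on every element of the column
sp      <- levels(taxa)
matches <- match(sp, value, nomatch = 0)
sp[matches > 0] <- replaceby[matches]

# "Amaranthus" now appears twice in sp, so the two levels are merged
levels(taxa) <- sp

print(levels(taxa))         # "Allium" "Amaranthus" "Artemisia"
print(as.character(taxa))   # "Artemisia" "Allium" "Amaranthus" "Amaranthus"
```

Because the lookup runs once per level instead of once per row, this does less work on long columns with few distinct values.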