Fix a single level of a variable when running a linear regression in R

I have the following simple linear model in R:

xmdl <- lm(Voto ~ Edad + Educacion,
           data = datostotales)
summary(xmdl)

Imagine that Edad (age) is a variable with integer values ranging from 20 to 100, and that Educacion (education) is a factor with two levels, "High" and "Low".

I would like to know how to run the linear regression while fixing each variable at a single level. For example:

How do I run the regression with Edad fixed at 25? How do I run the regression with Educacion fixed at "High"?

    
asked by pyring 10.10.2017 at 17:49

1 answer

If possible, add a reproducible example of your data; you can do that with dput(datostotales). Since I do not have your data, I will generate an example as follows:

datostotales <- data.frame(Edad=sample(x=c(18:90), size=1000, replace = TRUE),
                           Educacion=sample(x=c("Alta", "Baja"), size=1000, replace = TRUE),
                           Voto=sample(x=c(1, 2, 3, 4), size=1000, replace = TRUE),
                           stringsAsFactors = TRUE)

Now, if we want to fit the regression only on the rows where Edad == 25, we can do this (note that Edad is constant within that subset, so lm reports its coefficient as NA):

xmdl <- lm(Voto ~ Edad + Educacion, data = datostotales[datostotales$Edad==25,])
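
To make the effect of that subset visible, here is a minimal sketch. It assumes a deterministic variant of the example data (so that both Educacion levels are guaranteed to appear at age 25); the names solo25, con_edad and sin_edad are only illustrative. It compares the model that keeps Edad with one that drops it:

```r
# Deterministic variant of the example data (an assumption for illustration).
set.seed(1)  # arbitrary seed, only affects Voto
datostotales <- data.frame(
  Edad = rep(18:90, length.out = 1000),
  Educacion = factor(rep(c("Alta", "Baja"), length.out = 1000)),
  Voto = sample(1:4, size = 1000, replace = TRUE)
)

# Keep only the rows where Edad is 25.
solo25 <- datostotales[datostotales$Edad == 25, ]

# Edad is constant in this subset, so its slope cannot be estimated (NA).
con_edad <- lm(Voto ~ Edad + Educacion, data = solo25)
print(coef(con_edad))

# Dropping the constant term leaves the remaining coefficients unchanged.
sin_edad <- lm(Voto ~ Educacion, data = solo25)
print(coef(sin_edad))
```

Since both fits give the same intercept and Educacion effect, leaving Edad out of the formula is probably the cleaner option when the data is restricted to a single age.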

Educacion has two levels, Alta and Baja. If you keep only one of them, including the variable in the formula makes little sense, and because it is a factor, lm would stop with an error (contrasts can be applied only to factors with 2 or more levels). You could still work around it as follows, although the converted column is constant, so its coefficient is reported as NA:

xmdl <- lm(Voto ~ Edad + as.numeric(Educacion), data = datostotales[datostotales$Educacion == "Alta",])
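
As a rough illustration of the three options for the Educacion subset, the sketch below again assumes a deterministic variant of the example data; solo_alta, resultado, mdl_num and mdl_alta are hypothetical names. It captures the error instead of stopping, and then fits the model without Educacion:

```r
# Deterministic variant of the example data (an assumption for illustration).
set.seed(1)  # arbitrary seed, only affects Voto
datostotales <- data.frame(
  Edad = rep(18:90, length.out = 1000),
  Educacion = factor(rep(c("Alta", "Baja"), length.out = 1000)),
  Voto = sample(1:4, size = 1000, replace = TRUE)
)

# Keep only the rows with Educacion == "Alta".
solo_alta <- datostotales[datostotales$Educacion == "Alta", ]

# Keeping the one-level factor in the formula makes lm fail;
# tryCatch returns the error message instead of stopping the script.
resultado <- tryCatch(
  lm(Voto ~ Edad + Educacion, data = solo_alta),
  error = function(e) conditionMessage(e)
)
print(resultado)

# The as.numeric() workaround runs, but the constant column gets an NA coefficient.
mdl_num <- lm(Voto ~ Edad + as.numeric(Educacion), data = solo_alta)
print(coef(mdl_num))

# Dropping Educacion gives a model with only estimable terms.
mdl_alta <- lm(Voto ~ Edad, data = solo_alta)
print(coef(mdl_alta))
```

The last model is the one I would probably use: within a single education level there is no Educacion effect left to estimate.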

Note: just in case, Voto seems to be a categorical variable, in which case linear regression is probably not the best model if what you are ultimately after is a prediction.
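
One possible alternative for a categorical Voto is a multinomial logistic regression. This is only a sketch, not the one right model: it assumes the nnet package is available (it ships with most R installations) and that the categories are unordered; mdl_cat, caso and probs are illustrative names. Instead of subsetting, it fits on all the data and then predicts for a case with Edad fixed at 25 and Educacion fixed at "Alta":

```r
library(nnet)  # assumed to be installed; provides multinom()

set.seed(1)  # arbitrary seed so the simulated data is reproducible
datostotales <- data.frame(
  Edad = sample(18:90, size = 1000, replace = TRUE),
  Educacion = sample(c("Alta", "Baja"), size = 1000, replace = TRUE),
  Voto = sample(c(1, 2, 3, 4), size = 1000, replace = TRUE),
  stringsAsFactors = TRUE
)

# Treat Voto as an unordered category instead of a number.
mdl_cat <- multinom(factor(Voto) ~ Edad + Educacion,
                    data = datostotales, trace = FALSE)

# Hypothetical case to predict: Edad fixed at 25, Educacion fixed at "Alta".
caso <- data.frame(
  Edad = 25,
  Educacion = factor("Alta", levels = levels(datostotales$Educacion))
)

# One estimated probability per Voto category; together they sum to 1.
probs <- predict(mdl_cat, newdata = caso, type = "probs")
print(probs)
```

If the Voto categories have a natural order, an ordinal model would likely be a better fit than this one.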

    
answered by 10.10.2017 / 18:31