Homework Six

To do this assignment we will need to know how to make predictions (plugging points into the regression line). Consider the data in the file silly.data. Fitting the obvious linear model mod.1 <- lm(y ~ x1 + x2), we get an intercept estimate of β0=-41.1370 and slope estimates of β1=3.8893 and β2=-2.9812. Suppose we had an interest in the following input points:

x1   x2
16   47
13   42
19   53

we could plug them into the line:

β0 + β1(16) + β2(47) = -119.0691
β0 + β1(13) + β2(42) = -115.8262
β0 + β1(19) + β2(53) = -125.2941

But we'd like to not have to do this by hand. Thus the R command:

predict(mod.1,newdata=data.frame(x1=16,x2=47))
predict(mod.1,newdata=data.frame(x1=13,x2=42))
predict(mod.1,newdata=data.frame(x1=19,x2=53))
or all in one fell swoop:
predict(mod.1,newdata=data.frame(x1=c(16,13,19),x2=c(47,42,53)))
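
To see that predict() really is just plugging points into the fitted line, here is a minimal sketch. Since silly.data is not reproduced here, it uses simulated stand-in data; the names sim, mod.sim, and new.pts and the coefficients used to generate y are made up for illustration, so the numbers will not match the ones quoted above.

```r
# Simulated stand-in data (silly.data itself is not used here)
set.seed(1)
n  <- 50
x1 <- runif(n, 10, 20)
x2 <- runif(n, 40, 55)
y  <- -40 + 4 * x1 - 3 * x2 + rnorm(n)   # made-up coefficients, illustration only
sim <- data.frame(y, x1, x2)

mod.sim <- lm(y ~ x1 + x2, data = sim)

# New input points, one row per point; column names must match the predictors
new.pts <- data.frame(x1 = c(16, 13, 19), x2 = c(47, 42, 53))

# Predictions from predict()
p.auto <- predict(mod.sim, newdata = new.pts)

# The same predictions by plugging into the fitted line by hand
b <- coef(mod.sim)                        # (Intercept), x1, x2
p.hand <- b[1] + b[2] * new.pts$x1 + b[3] * new.pts$x2

# Compare the two sets of predictions
all.equal(unname(p.auto), unname(p.hand))
```

The column names in newdata have to match the predictor names in the model formula; otherwise predict() cannot find the variables it needs.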

  1. The data set ustemp.data gives the January average temperature for various US cities (the y variable) along with the longitudes and latitudes of these cities. Build an appropriate model; prediction is the issue here. Then use your model to predict the January average temperature for Tyler and Kansas City.
  2. Consider the wages.data data. The response variable is hours, the average number of hours worked yearly (think of this as a crude measure of productivity -- if nobody worked very much we'd likely have a pretty crappy economy!). At issue is the effect of the wage.rate independent variable, so it should always be included in your models. (In the context of these ancient data the discussion was about a "negative income tax," but you can think of this relative to current debates about the minimum wage.) The independent variables should be fairly self-explanatory: other.earnings is the earnings of family members other than the primary wage earner and spouse, non-earned is non-earned income, and race is the percentage of white respondents in that particular group (the data are averages over 39 separate demographic groups). Build and interpret appropriate model(s), particularly with regard to the effect of the wage.rate variable on hours worked.
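
One possible way to keep wage.rate in every candidate model during automated selection is the lower component of the scope argument to step(). The sketch below is only an illustration of that mechanism, not necessarily the approach intended for this assignment. It runs on simulated data: the data frame grp, its column names (in particular non.earned, since a hyphen is not valid in an R name), and the generating coefficients are all hypothetical, and AIC-based backward selection is just one choice among many.

```r
# Simulated stand-in for 39 demographic groups (wages.data itself is not used)
set.seed(2)
n <- 39
grp <- data.frame(
  wage.rate      = runif(n, 2, 4),
  other.earnings = runif(n, 500, 2000),
  non.earned     = runif(n, 0, 500),
  race           = runif(n, 0, 100)
)
# Made-up relationship, illustration only
grp$hours <- 2000 + 50 * grp$wage.rate - 0.05 * grp$other.earnings +
  rnorm(n, sd = 30)

full <- lm(hours ~ wage.rate + other.earnings + non.earned + race, data = grp)

# Backward selection by AIC; the lower scope forces wage.rate to stay in
sel <- step(full, scope = list(lower = ~ wage.rate),
            direction = "backward", trace = 0)

# wage.rate is still a term in the selected model
"wage.rate" %in% attr(terms(sel), "term.labels")

# Estimate, standard error, t value, and p-value for wage.rate
summary(sel)$coefficients["wage.rate", ]
```

Whatever model is chosen, the wage.rate coefficient is interpreted as the estimated change in hours for a one-unit change in wage.rate with the other included variables held fixed.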