A simple example
The following example shows one use case. Using Anderson's iris data, we fit the regression of Petal.Length on Sepal.Length for each Species and then merge the slope coefficient back into the original data.
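# Fit one regression per species and keep the coefficients
coefs <- lapply(split(iris, iris$Species),
                function(dat) coef(lm(Petal.Length ~ Sepal.Length, data = dat)))
# Stack the coefficient vectors into a data frame, one row per species
coefs <- do.call("rbind", coefs)
coefs <- as.data.frame(coefs)
coefs$Species <- rownames(coefs)
coefs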
##            (Intercept) Sepal.Length    Species
## setosa       0.8030518    0.1316317     setosa
## versicolor   0.1851155    0.6864698 versicolor
## virginica    0.6104680    0.7500808  virginica
library(lookup)
iris <- transform(iris,
  slope1 = lookup(iris$Species, coefs$Species, coefs[, "Sepal.Length"]),
  slope2 = vlookup(iris$Species, coefs, "Species", "Sepal.Length"))
head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species    slope1    slope2
## 1          5.1         3.5          1.4         0.2  setosa 0.1316317 0.1316317
## 2          4.9         3.0          1.4         0.2  setosa 0.1316317 0.1316317
## 3          4.7         3.2          1.3         0.2  setosa 0.1316317 0.1316317
## 4          4.6         3.1          1.5         0.2  setosa 0.1316317 0.1316317
## 5          5.0         3.6          1.4         0.2  setosa 0.1316317 0.1316317
## 6          5.4         3.9          1.7         0.4  setosa 0.1316317 0.1316317
Admittedly, a better way to approach this problem would be with the dplyr package and its group_by and summarize functions. But this example does not depend on external packages.
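For comparison, the same merge can be sketched using only base R. This is an illustration, not the package's implementation: it assumes the lookup amounts to a match() on the key column, and the names iris0, idx, slopes, iris_match, and iris_merge are made up for this sketch. It works on a fresh copy of the data (datasets::iris) so that it does not depend on the columns added above.

```r
# Fresh copy of the data, independent of any columns added earlier
iris0 <- datasets::iris

# Per-species regression coefficients, as in the example above
coefs <- lapply(split(iris0, iris0$Species),
                function(dat) coef(lm(Petal.Length ~ Sepal.Length, data = dat)))
coefs <- as.data.frame(do.call("rbind", coefs))
coefs$Species <- rownames(coefs)

# Approach 1: match() returns, for each row of iris0, the index of the
# row of coefs that has the same Species; use it to pull out the slope
idx <- match(iris0$Species, coefs$Species)
iris_match <- iris0
iris_match$slope <- coefs[idx, "Sepal.Length"]

# Approach 2: merge() joins on the shared key column. Rename the slope
# column first so it does not clash with iris0$Sepal.Length. Note that
# merge() moves the key column to the front and may reorder the rows.
slopes <- coefs[, c("Species", "Sepal.Length")]
names(slopes)[2] <- "slope"
iris_merge <- merge(iris0, slopes, by = "Species")

head(iris_match)
```

The match() version keeps the original row order and column layout, which is the main convenience a lookup-style function offers over merge().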
History
I wrote the lookup() function for my own personal use in 2004 for the simple reason that I could never remember the syntax of the merge() function. When Jenny Bryan posted vlookup() on Twitter, I modified her version and decided there would be value in making both of these functions widely available in a package.