Apply a Function to Each Element of a Vector or List
TL;DR: map(.x, .f) applies the function .f to each element of .x (a vector or list) and returns the output as a list of the same length as .x. The variant functions map_lgl(), map_int(), map_dbl(), and map_chr() work in a similar way, but return an atomic vector of the indicated type (possible only if .f returns a vector of length one for each iteration).
The map() function applies a function to each element of a vector or list, and returns a list. For instance, map(c(1, 2, 3), ~ .x + 1) adds one to each vector element. The basic argument structure follows map(.x, .f) (as in other functions in the map_* family):
.x: a vector or list
.f: the function to be applied to each element of .x. It can be a named function (without quotes), e.g., mean; or an anonymous function, e.g., \(x) x + 1, ~ .x + 1, function(x) x + 1, or ~ mean(.x, na.rm = TRUE).
The output is a list of the same length as .x.
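As a minimal sketch of the argument structure above, the following block runs the same "add one" operation with each form of .f. The object names (x, vals, out_formula, and so on) are made up for illustration, and the \(x) shorthand assumes R 4.1 or later.

```r
library(purrr)

x <- c(1, 2, 3)

# Formula shorthand: .x stands for the current element
out_formula <- map(x, ~ .x + 1)

# Base R anonymous functions (the \(x) form needs R >= 4.1)
out_lambda <- map(x, \(x) x + 1)
out_function <- map(x, function(x) x + 1)

# A named function, passed without quotes; extra arguments follow .f
vals <- list(a = c(1, NA, 3), b = c(4, 5, 6))
out_named <- map(vals, mean, na.rm = TRUE)

# Each result is a list with one element per input element
str(out_formula)
str(out_named)
```

All three anonymous forms give the same list here; which one to use is mostly a matter of taste and of the R version available.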
If .f returns a vector of length one at each iteration, you can also use the map_*() variants to return an atomic vector of the indicated type instead of a list:
map_dbl() returns a double (i.e., numeric) vector
map_lgl() returns a logical vector
map_int() returns an integer vector
map_chr() returns a character vector
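The typed variants listed above can be sketched on a small made-up list (x and the result names are illustrative, not from the original tutorial). In each call, .f returns exactly one value per element, which is what lets the result collapse into an atomic vector.

```r
library(purrr)

x <- list(a = 1:3, b = 4:6)

# One double per element
means <- map_dbl(x, mean)

# One integer per element
sizes <- map_int(x, length)

# One logical per element
all_above_two <- map_lgl(x, ~ all(.x > 2))

# One string per element
labels <- map_chr(x, ~ paste(.x, collapse = "-"))

# The names of the input are carried over to each output vector
means
sizes
all_above_two
labels
```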
Repeat operation across elements of a vector
e.g.1. The following code passes 1, 5, and 10, respectively, into rnorm() as values of the mean argument.
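A minimal sketch of such a call might look like the following. The sample size n = 5, the fixed seed, and the name draws are assumptions added for illustration; only the mean values 1, 5, and 10 come from the text.

```r
library(purrr)

set.seed(1)  # assumption: fixed seed so the draws are reproducible

# Each element of .x is passed to rnorm() as `mean`;
# n = 5 is an assumed sample size
draws <- map(c(1, 5, 10), ~ rnorm(n = 5, mean = .x))

# A list of three numeric vectors, one per mean value
str(draws)
```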
Note that the map_*() variants work only if the operation specified by .f, applied to each element of the list (e.g., each student), returns a vector of length one (e.g., a single mean value).
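To illustrate the length-one requirement, here is a sketch with hypothetical data: the scores list and the student names are invented for this example. mean() returns one value per student, so map_dbl() works; range() returns two values, so map_dbl() fails and map() is the fallback.

```r
library(purrr)

# Hypothetical data: test scores for three students
scores <- list(
  ann = c(90, 85, 77),
  bob = c(60, 72, 68),
  cat = c(88, 95, 91)
)

# mean() returns one value per student, so map_dbl() works
avg <- map_dbl(scores, mean)

# range() returns two values per student, so map_dbl() throws an error;
# tryCatch() captures it so the script keeps running
res <- tryCatch(map_dbl(scores, range), error = function(e) e)
inherits(res, "error")

# map() accepts any output length because it returns a list
rng <- map(scores, range)
```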
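e.g.3. Here we measure how strongly Sepal.Length and Sepal.Width are related within each iris species by fitting a linear model per species and extracting its R² (the squared correlation for a single-predictor model).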
    library(purrr)  # provides map(), map_dbl(), and the %>% pipe

    iris %>%
      # split the `iris` dataset into a list of data frames,
      # with each element (data frame) being the subset for one species
      split(iris$Species) %>%
      # for each list element (one species), fit a linear model
      # of 'Sepal.Length' on 'Sepal.Width'; output as a list
      map(~ lm(Sepal.Length ~ Sepal.Width, data = .x)) %>%
      # for each linear model, create a summary and extract the R2
      map(summary) %>%
      map_dbl("r.squared")
An even more powerful approach is to group the dataset with nest() of the tidyr package, create a list-column of the model objects with mutate() and map(), and then extract and unfold the model parameters using the broom package. We’ll dive into more details of this powerful approach in a later tutorial.
    library(dplyr)  # mutate()
    library(purrr)  # map()

    iris %>%
      # create a 'data' column, which is a list of tibbles, one per species
      tidyr::nest(data = -Species) %>%
      # create a 'model' column, which is a list of model objects
      mutate(model = map(data, ~ lm(Sepal.Length ~ Sepal.Width, data = .x))) %>%
      # create a 'glance' column, which is a list of model parameters
      # extracted using the broom package
      mutate(glance = map(model, broom::glance)) %>%
      # display the model parameters
      tidyr::unnest(glance)
e.g.4. A data frame is a special kind of list, with each column being an “element”. Below we calculate the mean of each column, and output the result as a named vector.
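One way this could look is sketched below. The choice of dataset is an assumption: the four numeric columns of iris stand in for whatever data frame the original example used, and col_means is an illustrative name.

```r
library(purrr)

# Assumption: the four numeric columns of `iris` stand in for the
# data frame used in the original example
col_means <- map_dbl(iris[1:4], mean)

# A named double vector, with one entry per column
col_means
```

Because map_dbl() keeps the names of its input, the column names become the names of the output vector.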