Get Elements from Nested Data Structure (1/2): Basics of pluck()

pluck() is a generalized form of [[ ]] that extracts elements from a nested data structure.

We’ll demonstrate how to use pluck(), and how it resembles and differs from [[ ]], using the following example.

person1 <- list(
  name = "John",
  age = 30,
  address = list(street = "123 Main St", city = "Anytown", country = "USA")
)
person2 <- list(
  name = "Alice",
  address = list(street = "456 Oak St", city = "Sometown", country = "USA"),
  hobbies = c("painting", "biking", "gardening")
)
# Combine both persons into a single list
people <- list(person1, person2)

Extract elements with position index or names

Get the information of the 2nd person. Here we use index 2 to specify the element position. The code below is equivalent to people[[2]].

library(purrr)
pluck(people, 2)

Output:

$name
[1] "Alice"

$address
$address$street
[1] "456 Oak St"

$address$city
[1] "Sometown"

$address$country
[1] "USA"


$hobbies
[1] "painting"  "biking"    "gardening"

Get the 2nd person’s address. Here we combine a position index and an element name to specify the element to extract. The code below is equivalent to people[[2]][["address"]].

pluck(people, 2, "address")

Output:

$street
[1] "456 Oak St"

$city
[1] "Sometown"

$country
[1] "USA"

Get the 2nd person’s street name. The code below is equivalent to people[[2]][["address"]][["street"]].

pluck(people, 2, "address", "street")

Output:

[1] "456 Oak St"
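
The equivalences stated above can be checked directly. The following is a minimal sketch that rebuilds the people list so the block runs on its own; identical() is base R, and the variable names same_person, same_address, and same_street are chosen here only for illustration.

```r
library(purrr)

# Rebuild the example data so this block is self-contained
person1 <- list(name = "John", age = 30,
                address = list(street = "123 Main St", city = "Anytown", country = "USA"))
person2 <- list(name = "Alice",
                address = list(street = "456 Oak St", city = "Sometown", country = "USA"),
                hobbies = c("painting", "biking", "gardening"))
people <- list(person1, person2)

# Each pluck() call is compared with the corresponding chain of [[ ]]
same_person  <- identical(pluck(people, 2), people[[2]])
same_address <- identical(pluck(people, 2, "address"), people[[2]][["address"]])
same_street  <- identical(pluck(people, 2, "address", "street"),
                          people[[2]][["address"]][["street"]])
```

All three comparisons should evaluate to TRUE if pluck() and [[ ]] agree on existing elements.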

Deal with elements that don’t exist

If the element to be extracted does not exist, pluck() consistently returns NULL.

pluck(people, 3) # try to extract a 3rd element, which does not exist

Output:

NULL

Alternatively, use the .default argument to set the value returned for elements that do not exist.

pluck(people, 3, .default = "not available") 

Output:

[1] "not available"
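
The same behavior applies to missing names deeper in the structure, not only to missing positions. As a small sketch, assuming the people list from above (rebuilt here so the block is self-contained), the first person has no hobbies element while the second does; the result names are illustrative only.

```r
library(purrr)

# Rebuild the example data so this block is self-contained
person1 <- list(name = "John", age = 30,
                address = list(street = "123 Main St", city = "Anytown", country = "USA"))
person2 <- list(name = "Alice",
                address = list(street = "456 Oak St", city = "Sometown", country = "USA"),
                hobbies = c("painting", "biking", "gardening"))
people <- list(person1, person2)

# person1 has no "hobbies" element, so pluck() falls back to NULL or .default
hobbies1         <- pluck(people, 1, "hobbies")
hobbies1_default <- pluck(people, 1, "hobbies", .default = NA)

# person2 has "hobbies", so .default is ignored and the stored value is returned
hobbies2 <- pluck(people, 2, "hobbies", .default = NA)
```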

In comparison, [[ ]] throws an error when the position index is out of bounds (see more discussion here).

people[[3]]
# Error in people[[3]] : subscript out of bounds
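
To see the contrast without stopping a script, the error can be caught with base R's tryCatch(). This is only a sketch under the assumption that people is the two-element list from above; it also shows that, for lists, [[ ]] with a missing name returns NULL rather than raising an error, so the error case concerns out-of-bounds positions. The element name "phone" is a hypothetical missing name used for illustration.

```r
library(purrr)

# Rebuild a compact version of the example data so this block is self-contained
person1 <- list(name = "John", age = 30)
person2 <- list(name = "Alice", hobbies = c("painting", "biking", "gardening"))
people <- list(person1, person2)

# [[ ]] with an out-of-bounds position raises an error; capture its message
bracket_result <- tryCatch(people[[3]], error = function(e) conditionMessage(e))

# pluck() with the same position returns NULL instead
pluck_result <- pluck(people, 3)

# For a list, [[ ]] with a missing name ("phone" is hypothetical) returns NULL
bracket_by_name <- people[[2]][["phone"]]
```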

You can use pluck_exists() to check whether the specified element exists.

pluck_exists(people, 3)

Output:

[1] FALSE
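
pluck_exists() accepts the same mix of positions and names as pluck(), so it can test a nested path as well. A minimal sketch, assuming the people list from above (rebuilt here) and using has_hobbies as an illustrative variable name:

```r
library(purrr)

# Rebuild the example data so this block is self-contained
person1 <- list(name = "John", age = 30,
                address = list(street = "123 Main St", city = "Anytown", country = "USA"))
person2 <- list(name = "Alice",
                address = list(street = "456 Oak St", city = "Sometown", country = "USA"),
                hobbies = c("painting", "biking", "gardening"))
people <- list(person1, person2)

# Check the nested path "hobbies" for each person
has_hobbies <- c(
  person1 = pluck_exists(people, 1, "hobbies"),  # person1 has no hobbies element
  person2 = pluck_exists(people, 2, "hobbies")   # person2 has one
)
```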

Work with the pipe operator %>%

pluck() works smoothly with the pipe operator.

people %>% pluck(2, 2, 1)

Output:

[1] "456 Oak St"

You can also use [[ ]] with the pipe operator's dot placeholder to extract the second element of people, e.g., people %>% .[[2]]. However, this is inconvenient for elements nested more deeply in the structure.

# This line is not correct and gives the error message:
# Error in .[[.[[2]][[2]], 1]] : incorrect number of subscripts
people %>% .[[2]][[2]][[1]]

Instead, it must be written as:

(people %>% .[[2]])[[2]][[1]]

Output:

[1] "456 Oak St"
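
Because pluck() is an ordinary function that takes the data as its first argument, it should also work with the base R native pipe |>, which is not covered in the text above and requires R 4.1 or later. The sketch below makes that assumption and uses element names instead of positions for readability; street_native is an illustrative variable name.

```r
library(purrr)

# Rebuild the example data so this block is self-contained
person1 <- list(name = "John", age = 30,
                address = list(street = "123 Main St", city = "Anytown", country = "USA"))
person2 <- list(name = "Alice",
                address = list(street = "456 Oak St", city = "Sometown", country = "USA"),
                hobbies = c("painting", "biking", "gardening"))
people <- list(person1, person2)

# Native pipe (assumes R >= 4.1): people becomes the first argument of pluck()
street_native <- people |> pluck(2, "address", "street")
```

Mixing a position with names in this way avoids having to remember that the address is the second element and the street is the first element within it.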