Unite Multiple Columns into a Single Column

Opposite to separate(), unite() combines multiple columns into one column by pasting strings together.

library(tidyr)library(dplyr)
data <- tibble(  first_name = c("John", "Jane",     "Alice"),  last_name  = c("Doe",  "Smith",    "Johnson"),  DOB        = c( NA,    "3/2/1993", "8/4/1997"),  State      = c("MA",   "NJ",        NA),  employer   = c("MIT",   NA,         NA))
data

Output:
# A tibble: 3 × 5
  first_name last_name DOB      State employer
  <chr>      <chr>     <chr>    <chr> <chr>
1 John       Doe       <NA>     MA    MIT
2 Jane       Smith     3/2/1993 NJ    <NA>
3 Alice      Johnson   8/4/1997 <NA>  <NA>

The unite() function has four basic arguments, and can be conveniently summarized as unite(data, col, ..., sep = "_"):

data specifies the input data frame, and is the first argument. It can be conveniently passed into this function using the pipe operator %>%.
col specifies the name of the new column to be created.
... specifies the columns to unite. Selection helpers can be used for convenient pick of a range of columns.
sep specifies the separator used between united values. If not specified, the underscore _ is used as the separator by default.

# Create a new column combining the full and last name, connected by a hyphendata %>% unite(col = full_name, first_name, last_name, sep = "-")

Output:
# A tibble: 3 × 4
  full_name     DOB      State employer
  <chr>         <chr>    <chr> <chr>
1 John-Doe      <NA>     MA    MIT
2 Jane-Smith    3/2/1993 NJ    <NA>
3 Alice-Johnson 8/4/1997 <NA>  <NA>

Use remove = FALSE to keep the original columns from the output.

data %>%   unite(    col = full_name, first_name, last_name, sep = "-",    remove = F) # do not remove original columns "first_name" and "last_name"

Output:
# A tibble: 3 × 6
  full_name     first_name last_name DOB      State employer
  <chr>         <chr>      <chr>     <chr>    <chr> <chr>
1 John-Doe      John       Doe       <NA>     MA    MIT
2 Jane-Smith    Jane       Smith     3/2/1993 NJ    <NA>
3 Alice-Johnson Alice      Johnson   8/4/1997 <NA>  <NA>

Combine all the columns using everything(). Note that missing values are concatenated as “NA” strings.

data %>% unite(col = people, everything(), remove = F)

Output:
# A tibble: 3 × 6
  people                       first_name last_name DOB      State employer
  <chr>                        <chr>      <chr>     <chr>    <chr> <chr>
1 John_Doe_NA_MA_MIT           John       Doe       <NA>     MA    MIT
2 Jane_Smith_3/2/1993_NJ_NA    Jane       Smith     3/2/1993 NJ    <NA>
3 Alice_Johnson_8/4/1997_NA_NA Alice      Johnson   8/4/1997 <NA>  <NA>

Use na.rm = TRUE to remove missing values prior to uniting the columns.

data %>% unite(col = people, everything(), remove = F, na.rm = T)

Output:
# A tibble: 3 × 6
  people                 first_name last_name DOB      State employer
  <chr>                  <chr>      <chr>     <chr>    <chr> <chr>
1 John_Doe_MA_MIT        John       Doe       <NA>     MA    MIT
2 Jane_Smith_3/2/1993_NJ Jane       Smith     3/2/1993 NJ    <NA>
3 Alice_Johnson_8/4/1997 Alice      Johnson   8/4/1997 <NA>  <NA>

Amazing eBook to learn ggplot2 FAST & EASY