Replace Matched Pattern with Other Strings

  • str_replace() replaces the first match in each string element
  • str_replace_all() replaces all matches in each string element
  • str_replace_na() converts missing values (NA) into the character string "NA"

str_replace() is similar to str_extract(), but instead of extracting the matched component, it replaces it with a new string.

  • In each string element, only the first match is replaced.

  • The pattern argument takes either a literal string or a regular expression. Regex special characters must be escaped with a double backslash (\\) to be matched literally rather than interpreted as operators.

  • In the replacement argument, regular-expression operators such as * or . do not need escaping; they are inserted literally. The one exception is the backslash, because sequences such as \\1 refer back to groups captured by the pattern.
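The escaping rules above can be sketched with a few short calls. This is a minimal illustration rather than part of the original example; the string "a.b.c" and the variable names are made up for the purpose, and fixed() is shown as an alternative way to request literal matching.

```r
library(stringr)

x <- "a.b.c"  # hypothetical input, chosen because "." is a regex operator

# Unescaped "." is a wildcard, so it matches the first character "a".
dot_as_regex <- str_replace(x, pattern = ".", replacement = "-")
# "-.b.c"

# Escaped "\\." matches a literal period.
dot_escaped <- str_replace(x, pattern = "\\.", replacement = "-")
# "a-b.c"

# fixed() treats the whole pattern as a literal string.
dot_fixed <- str_replace(x, pattern = fixed("."), replacement = "-")
# "a-b.c"

# In the replacement, "." is literal, while "\\1" and "\\2" refer to captured groups.
swapped <- str_replace("728*971", pattern = "(\\d+)\\*(\\d+)", replacement = "\\2.\\1")
# "971.728"
```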

library(stringr)
phones <- c("Mia Smith, 728*971*9652",
            "Max Lee, 683*976*9876",
            "Ava Johnson, 912*254*3387")

# replace the first asterisk with a dot
str_replace(phones,
            pattern = "\\*",    # search for a single asterisk
            replacement = ".")

Output:

[1] "Mia Smith, 728.971*9652" "Max Lee, 683.976*9876" "Ava Johnson, 912.254*3387"

str_replace_all() replaces all the matches. Here we replace all asterisks with a dot.

str_replace_all(phones, pattern = "\\*", replacement = ".")

Output:

[1] "Mia Smith, 728.971.9652" "Max Lee, 683.976.9876" "Ava Johnson, 912.254.3387"
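To make the difference between the two functions concrete, the sketch below counts the asterisks that remain after each call. It also shows that str_replace_all() accepts a named vector of the form c(pattern = replacement) to apply several replacements in one call. The variable names are illustrative only.

```r
library(stringr)

phones <- c("Mia Smith, 728*971*9652", "Max Lee, 683*976*9876")

first_only  <- str_replace(phones, "\\*", ".")
all_matches <- str_replace_all(phones, "\\*", ".")

# Count the asterisks left in each string after each call.
left_after_first <- str_count(first_only, "\\*")   # 1 1
left_after_all   <- str_count(all_matches, "\\*")  # 0 0

# A named vector applies several pattern = replacement pairs in one call.
relabelled <- str_replace_all(phones, c("\\*" = "-", ", " = ": "))
# "Mia Smith: 728-971-9652" "Max Lee: 683-976-9876"
```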

Both pattern and replacement are vectorized: the i-th pattern and the i-th replacement are applied to the i-th string. In the code below, we replace each person’s name with their initials.

str_replace(phones,
            pattern = c("Mia Smith", "Max Lee", "Ava Johnson"),
            replacement = c("MS", "ML", "AJ"))

Output:

[1] "MS, 728*971*9652" "ML, 683*976*9876" "AJ, 912*254*3387"
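Two details of this vectorization are easy to miss, so here is a small sketch under the same phones data. A length-one pattern is recycled across all strings, and patterns are paired with strings by position rather than tried against every string. The "ab" inputs are hypothetical and exist only to show the pairing.

```r
library(stringr)

phones <- c("Mia Smith, 728*971*9652",
            "Max Lee, 683*976*9876",
            "Ava Johnson, 912*254*3387")

# One pattern, three replacements: the pattern is recycled across the strings.
seps <- str_replace(phones, pattern = "\\*", replacement = c("-", ".", "/"))
# "Mia Smith, 728-971*9652" "Max Lee, 683.976*9876" "Ava Johnson, 912/254*3387"

# Patterns are matched by position: "a" is applied to the first string only,
# "b" to the second string only.
by_position <- str_replace(c("ab", "ab"), pattern = c("a", "b"), replacement = "X")
# "Xb" "aX"
```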

The replacement can be a function to be applied to the matched components. Here, we turn all names to upper case using str_to_upper().

str_replace(phones,
            pattern = c("Mia Smith", "Max Lee", "Ava Johnson"),
            replacement = str_to_upper)

Output:

[1] "MIA SMITH, 728*971*9652" "MAX LEE, 683*976*9876" "AVA JOHNSON, 912*254*3387"
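The replacement function does not have to be a built-in. As a rough sketch, any function that takes a character vector of matches and returns a character vector of the same length should work; bracket() below is a hypothetical helper written for this illustration.

```r
library(stringr)

phones <- c("Mia Smith, 728*971*9652",
            "Max Lee, 683*976*9876",
            "Ava Johnson, 912*254*3387")

# Hypothetical helper: wrap each matched component in square brackets.
bracket <- function(match) str_c("[", match, "]")

# The first run of three digits in each string is passed to bracket().
tagged <- str_replace(phones, pattern = "\\d{3}", replacement = bracket)
# "Mia Smith, [728]*971*9652" "Max Lee, [683]*976*9876" "Ava Johnson, [912]*254*3387"
```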

The above code can be slightly simplified using the rebus package. WRD matches only a single word character (a letter, digit, or underscore), but str_replace_all() repeats the replacement for every match across the entire string.

library(rebus)
str_replace_all(phones,
                pattern = WRD,
                replacement = str_to_upper)

Output:

[1] "MIA SMITH, 728*971*9652" "MAX LEE, 683*976*9876" "AVA JOHNSON, 912*254*3387"
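For readers without rebus installed, the same result can be obtained with a plain regular expression. The sketch below assumes that WRD corresponds to the regex \w (a single word character), which is how rebus documents it; the "a_b c" input is made up to show that the underscore also counts as a word character.

```r
library(stringr)

# "\\w" is used here as the assumed plain-regex equivalent of rebus's WRD.
upper_all <- str_replace_all("Mia Smith, 728*971*9652",
                             pattern = "\\w",
                             replacement = str_to_upper)
# "MIA SMITH, 728*971*9652"

# The underscore is a word character too; the space is not.
underscore <- str_replace_all("a_b c", pattern = "\\w", replacement = "X")
# "XXX X"
```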

str_replace_na() turns missing values (NA) into the character string "NA".

str_replace_na(c("abc", NA, "xyz"))

Output:

[1] "abc" "NA" "xyz"
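One situation where this helps is string concatenation, since str_c() returns NA whenever any of its inputs is NA. The sketch below uses a made-up vector of names; the replacement argument of str_replace_na() can also supply a different placeholder.

```r
library(stringr)

middle <- c("Ann", NA, "Lee")  # hypothetical data with one missing value

# NA propagates through str_c(), so the second element stays missing.
with_na <- str_c("Name: ", middle)

# Converting NA to "NA" first keeps every element as a string.
filled <- str_c("Name: ", str_replace_na(middle))
# "Name: Ann" "Name: NA" "Name: Lee"

# A custom placeholder can be given through the replacement argument.
custom <- str_replace_na(middle, replacement = "(missing)")
# "Ann" "(missing)" "Lee"
```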