Escape Characters

An escape character signals that the character(s) following it should be interpreted differently from their usual meaning. The backslash \ is the most common escape character.

Escape a special character

To match a special character literally (i.e., as the character itself rather than as a regex operator), escape it by placing a backslash \ immediately before it. For example:

\( and \) match the literal left and right parentheses, respectively.
\[ and \] match the literal left and right square brackets, respectively.
\. is treated as a literal dot instead of a wildcard.
\^ and \$ are treated as a literal caret and dollar sign, respectively, instead of position anchors.

Since the backslash is itself a special character in R strings, it must be escaped with another backslash when the pattern is written as a string, e.g., \\., \\^, \\$, and \\(.
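The doubled backslash can be easier to see by inspecting the string itself. The following is a minimal sketch in base R; the variable names are only illustrative, and the raw-string syntax assumes R 4.0.0 or later.

```r
# The R string "\\." holds two characters: a backslash and a dot.
pattern <- "\\."
n_chars <- nchar(pattern)   # 2

# writeLines() prints the string as the regex engine receives it: \.
writeLines(pattern)

# A raw string (assumes R >= 4.0.0) expresses the same pattern
# without doubling the backslash.
raw_pattern <- r"(\.)"
same <- identical(pattern, raw_pattern)   # TRUE
```

In other words, the first backslash is consumed by R's string parser, and only the second one reaches the regex engine.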

eg.1. \\. matches a literal dot, and .* matches a string of any length (here the dot is a wildcard). Thus, \\..* matches a literal dot and its following characters, i.e., the file extension.

library(stringr)
x <- c("raw_data.xlsx", "data_analysis.RData")

str_view_all(x, "\\..*")

Output:
[1] │ raw_data<.xlsx>
[2] │ data_analysis<.RData>

str_extract(x, "\\..*")

Output:
[1] ".xlsx"  ".RData"

Special characters within a character class

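When special characters are used within a character class (inside a pair of square brackets), most of them are interpreted literally and do not need a backslash. Exceptions include a caret placed first in the class, which negates the class, and the backslash itself.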

eg.2. [$^*] matches “$”, “^”, and “*” as literal characters.

s <- c("an book $", "carot or carat ^", "stars ** in the sky")

str_view_all(s, "[$^*]")

Output:
[1] │ an book <$>
[2] │ carot or carat <^>
[3] │ stars <*><*> in the sky

str_extract(s, "[$^*]")

Output:
[1] "$" "^" "*"
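As a further illustration, the sketch below compares a bracketed dot with an escaped dot and shows how the position of a caret inside the brackets changes its meaning. The test strings and variable names are made up for this example.

```r
library(stringr)

x <- c("a.b", "axb", "a^b")

# Inside brackets the dot is literal, so "[.]" behaves like "\\.".
bracket_dot <- str_detect(x, "a[.]b")    # TRUE FALSE FALSE
escaped_dot <- str_detect(x, "a\\.b")    # TRUE FALSE FALSE

# A caret that is not the first character in the class is literal.
dot_or_caret <- str_detect(x, "a[.^]b")  # TRUE FALSE TRUE

# A caret placed first negates the class:
# "[^.]" means "any single character except a dot".
not_dot <- str_detect(x, "a[^.]b")       # FALSE TRUE TRUE
```

This is why the caret in eg.2 is written after the dollar sign rather than first.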

Escape a regular letter

As demonstrated above, when a special character is escaped (preceded) with a backslash \, it is interpreted literally as a character itself. On the other hand, an ordinary letter can be escaped to convey a different meaning:

\d matches a single digit
\D matches a single non-digit
\w matches a word character (alphanumeric + underscore)
\W matches a non-word character
\s matches any whitespace
\S matches a non-whitespace
\b matches a word boundary
\B matches a position that is not a word boundary
\t matches a tab character
\n matches a newline character

Again, when the pattern is written as an R string, each of these needs a second backslash to escape the first, e.g., \\S.
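A few of the escaped letters listed above can be tried out quickly. This is a small sketch using stringr; the input strings and variable names are invented for illustration.

```r
library(stringr)

# \\d+ : one or more digits
digits <- str_extract_all("Room 42, floor 7", "\\d+")[[1]]   # "42" "7"

# \\s : any whitespace (here one space and one tab)
n_space <- str_count("a b\tc", "\\s")                        # 2

# \\b : word boundary, so only the standalone word "cat" is replaced
swapped <- str_replace_all("cat concat", "\\bcat\\b", "dog") # "dog concat"

# \\W+ : one or more non-word characters, used here as a separator
words <- str_split("one, two;three", "\\W+")[[1]]            # "one" "two" "three"
```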

Consider the following examples.

eg.3. \\$ matches a literal dollar sign, and \\d+ matches one or more digits. As such, \\$\\d+ matches a dollar amount.

d <- c("book of $123", "price at 20% off")

str_view_all(d, "\\$\\d+")

Output:
[1] │ book of <$123>
[2] │ price at 20% off

str_extract(d, "\\$\\d+")

Output:
[1] "$123" NA

eg.4. \\d{3}\\. matches three consecutive digits followed by a literal dot. As such, \\d{3}\\.\\d{3}\\.\\d{4} matches a phone number in the form xxx.xxx.xxxx.

a <- c("Bob: 787.902.1068", "Mike: 910.087.1483")
p <- "\\d{3}\\.\\d{3}\\.\\d{4}"

str_view_all(a, p)

Output:
[1] │ Bob: <787.902.1068>
[2] │ Mike: <910.087.1483>

str_extract(a, p)

Output:
[1] "787.902.1068" "910.087.1483"
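To see why the dot needs escaping in this pattern, the sketch below runs the escaped pattern and an unescaped variant against a few made-up strings. The vector `phones` and both pattern names are hypothetical.

```r
library(stringr)

phones <- c("787.902.1068", "787-902-1068", "787x902y1068")

escaped   <- "\\d{3}\\.\\d{3}\\.\\d{4}"   # dots must be literal dots
unescaped <- "\\d{3}.\\d{3}.\\d{4}"       # dots are wildcards

strict <- str_detect(phones, escaped)     # TRUE FALSE FALSE
loose  <- str_detect(phones, unescaped)   # TRUE TRUE TRUE
```

Without the backslashes, the dot accepts any separator, so the pattern also matches strings that are not in the xxx.xxx.xxxx form.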
