Specify the Number of Matched Patterns

Quantifiers specify how many times the immediately preceding element (a character, character class, or group) must occur.

? (zero or one occurrence), e.g., colou?r matches “color” and “colour”.

library(stringr)
a <- c("Life's colorful", "vivid colours")
str_view_all(a, "colou?r")

Output:

[1] │ Life's <color>ful
[2] │ vivid <colour>s

str_extract_all(a, "colou?r", simplify = TRUE)

Output:

     [,1]
[1,] "color"
[2,] "colour"
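
As a quick check of the "at most one" part of this quantifier, the sketch below tests a few spellings with str_detect(). The vector spellings is made up for illustration and is not part of the example above.

```r
library(stringr)

# Hypothetical test strings containing zero, one, and two "u" characters
spellings <- c("color", "colour", "colouur")

# ? lets the "u" appear zero times or once, but not twice
str_detect(spellings, "colou?r")
# [1]  TRUE  TRUE FALSE
```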

+ (one or more), e.g., a+ matches one or more consecutive “a” characters.

library(stringr)
x <- c("bb", "ba++", "baaa-naaaa-nAAA")
str_view_all(x, "a+") 

Output:

[1] │ bb
[2] │ b<a>++
[3] │ b<aaa>-n<aaaa>-nAAA

str_extract_all(x, "a+", simplify = TRUE)

Output:

     [,1]  [,2]
[1,] ""    ""
[2,] "a"   ""
[3,] "aaa" "aaaa"
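
Two details of this output are worth spelling out: the uppercase run "AAA" is not matched because matching is case-sensitive by default, and the "++" in "ba++" is ordinary text in the string rather than a quantifier. The following is a minimal sketch of how both could be handled, using stringr's regex() modifier for case-insensitive matching and an escaped plus sign for the literal character.

```r
library(stringr)

x <- c("bb", "ba++", "baaa-naaaa-nAAA")

# Ignore case so that the run "AAA" is matched as well
str_extract_all(x[3], regex("a+", ignore_case = TRUE))[[1]]
# [1] "aaa"  "aaaa" "AAA"

# Escape the plus sign to match it literally: \\+ is a literal "+",
# and the trailing + is the quantifier applied to it
str_extract(x[2], "\\++")
# [1] "++"
```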

* (zero or more), e.g., ba* matches “b” followed by zero or more “a” characters.

str_view_all(x, "ba*") 

Output:

[1] │ <b><b>
[2] │ <ba>++
[3] │ <baaa>-naaaa-nAAA

str_extract_all(x, "ba*", simplify = TRUE)

Output:

     [,1]   [,2]
[1,] "b"    "b"
[2,] "ba"   ""
[3,] "baaa" ""
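
The difference between * and + shows up most clearly on "bb", where no "a" follows either "b". A small sketch using str_count() to compare the two quantifiers on the same vector:

```r
library(stringr)

x <- c("bb", "ba++", "baaa-naaaa-nAAA")

# ba* accepts a bare "b", so "bb" yields two matches
str_count(x, "ba*")
# [1] 2 1 1

# ba+ requires at least one "a" after the "b", so "bb" yields none
str_count(x, "ba+")
# [1] 0 1 1
```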

{n} (exactly n times), e.g., ba{3} matches “baaa”.

str_view_all(x, "ba{3}") 

Output:

[1] │ bb
[2] │ ba++
[3] │ <baaa>-naaaa-nAAA

str_extract_all(x, "ba{3}", simplify = TRUE)

Output:

     [,1]
[1,] ""
[2,] ""
[3,] "baaa"
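
Note that {n} fixes the length of the match itself and does not inspect what follows it, so a longer run still contains a match. The sketch below drops the leading "b" from the pattern to illustrate this on the same vector.

```r
library(stringr)

x <- c("bb", "ba++", "baaa-naaaa-nAAA")

# a{3} is shorthand for writing the character three times
identical(str_extract_all(x, "a{3}"), str_extract_all(x, "aaa"))
# [1] TRUE

# The run "aaaa" still contains a match: its first three characters
str_extract_all(x[3], "a{3}")[[1]]
# [1] "aaa" "aaa"
```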

{n,} (at least n times), e.g., ba{2,} matches “baa”, “baaa”, “baaaa”, and so on. Note that there should be no white space inside the curly braces.

str_view_all(x, "ba{2,}")

Output:

[1] │ bb
[2] │ ba++
[3] │ <baaa>-naaaa-nAAA

str_extract_all(x, "ba{2,}", simplify = TRUE)

Output:

     [,1]
[1,] ""
[2,] ""
[3,] "baaa"
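
Because {n,} has no upper bound, each match extends over the whole run. As an illustration, the sketch below removes the leading "b" from the pattern so that both lowercase runs in the third string qualify; the variable name hits is arbitrary.

```r
library(stringr)

x <- c("bb", "ba++", "baaa-naaaa-nAAA")

# The single "a" in "ba++" is too short; both lowercase runs in x[3] qualify
hits <- str_extract_all(x, "a{2,}")

# Number of matches per string
lengths(hits)
# [1] 0 0 2

# Each match covers the entire run, not just the first two characters
hits[[3]]
# [1] "aaa"  "aaaa"
```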

{n,m} (between n and m times), e.g., ba{1,2} matches “ba” or “baa”. Again, no white space should be present in the curly braces.

str_view_all(x, "ba{1,2}")

Output:

[1] │ bb
[2] │ <ba>++
[3] │ <baa>a-naaaa-nAAA

str_extract_all(x, "ba{1,2}", simplify = TRUE)

Output:

     [,1]
[1,] ""
[2,] "ba"
[3,] "baa"
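
The brace notation is general enough to express the three shorthand quantifiers as well. The following sketch checks each equivalence on the same vector; all three comparisons should return TRUE.

```r
library(stringr)

x <- c("bb", "ba++", "baaa-naaaa-nAAA")

# ? is the same as {0,1}
identical(str_extract_all(x, "ba?"), str_extract_all(x, "ba{0,1}"))

# + is the same as {1,}
identical(str_extract_all(x, "ba+"), str_extract_all(x, "ba{1,}"))

# * is the same as {0,}
identical(str_extract_all(x, "ba*"), str_extract_all(x, "ba{0,}"))
```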

Combination of Character Class and Quantifier

[0-9]{3,4} or [:digit:]{3,4} matches a run of three or four consecutive digits. The following examples extract the area code and the two parts of the subscriber number from telephone numbers.

s <- c("Alice: 137-807-6865", "Mike: 732-987-1986")
p <- "[:digit:]{3,4}"
str_view_all(s, p)

Output:

[1] │ Alice: <137>-<807>-<6865>
[2] │ Mike: <732>-<987>-<1986>

str_extract_all(s, p, simplify = TRUE)

Output:

     [,1]  [,2]  [,3]
[1,] "137" "807" "6865"
[2,] "732" "987" "1986"
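
If the goal is the whole telephone number rather than its pieces, exact quantifiers can be chained into a single pattern. This is only a sketch: it assumes every number follows the three-three-four layout with hyphens seen in s, and the names phone and parts are introduced here for illustration.

```r
library(stringr)

s <- c("Alice: 137-807-6865", "Mike: 732-987-1986")

# Assumed layout: three digits, three digits, four digits, hyphen-separated.
# The parentheses capture each digit group separately.
phone <- "([0-9]{3})-([0-9]{3})-([0-9]{4})"

# Whole number as a single match
str_extract(s, phone)
# [1] "137-807-6865" "732-987-1986"

# Column 1 of str_match() is the full match; columns 2 to 4 are the groups
parts <- str_match(s, phone)
parts[, 2]  # area codes
# [1] "137" "732"
```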

For comparison, [:digit:] alone, without the {3,4} quantifier, treats each individual digit as a separate match.

str_extract_all(s, "[:digit:]", simplify = T)

Output:

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] "1"  "3"  "7"  "8"  "0"  "7"  "6"  "8"  "6"  "5"
[2,] "7"  "3"  "2"  "9"  "8"  "7"  "1"  "9"  "8"  "6"
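
The same contrast can be summarized with str_count(). The sketch below also shows how {3,4} divides a run longer than four digits; the string "1234567" is a made-up input, not taken from the example data.

```r
library(stringr)

s <- c("Alice: 137-807-6865", "Mike: 732-987-1986")

# One match per digit versus one match per run of three or four digits
str_count(s, "[0-9]")
# [1] 10 10
str_count(s, "[0-9]{3,4}")
# [1] 3 3

# A longer run (hypothetical input) is split greedily: four digits first,
# then the remainder, which still satisfies the lower bound of three
str_extract_all("1234567", "[0-9]{3,4}")[[1]]
# [1] "1234" "567"
```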