Replace Missing Values (NA) with Specified Values

replace_na() replaces NA in a vector or columns of a data frame with specified values.

library(tidyr)
library(dplyr)

df <- tibble(A = c(1, 2, NA), B = c("a", NA, "b"), C = c(NA, NA, 10))
df

Output:

# A tibble: 3 × 3
      A B         C
  <dbl> <chr> <dbl>
1     1 a        NA
2     2 <NA>     NA
3    NA b        10

Replace NA with zero.

df$A %>% replace_na(replace = 0)

Output:

[1] 1 2 0

Replace NA with “unknown”.

df$B %>% replace_na(replace = "unknown") 

Output:

[1] "a" "unknown" "b"

Use mutate() to replace NA in columns A and B of the dataset.

df %>% mutate(A = replace_na(A, 0),
              B = replace_na(B, "unknown"))

Output:

# A tibble: 3 × 3
      A B           C
  <dbl> <chr>   <dbl>
1     1 a          NA
2     2 unknown    NA
3     0 b          10

You can use across() inside mutate() to apply the replacement to several columns at once, e.g., replacing NA with 0 in all numeric columns and with "unknown" in all character columns.

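# Wrap predicate functions in where(); passing them bare to across() is deprecated.
df %>% mutate(
  across(where(is.numeric), ~ replace_na(.x, 0)),
  across(where(is.character), ~ replace_na(.x, "unknown"))
)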

Output:

# A tibble: 3 × 3
      A B           C
  <dbl> <chr>   <dbl>
1     1 a           0
2     2 unknown     0
3     0 b          10
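
To confirm that a replacement like this covered every column, one option is to count the NA values left in each column. The sketch below is not part of the original example: it assumes the same df as above, and the names filled and na_counts are illustrative only.

```r
library(tidyr)
library(dplyr)

# Same example data as above
df <- tibble(A = c(1, 2, NA), B = c("a", NA, "b"), C = c(NA, NA, 10))

# Replace NA by column type, as in the across() example
filled <- df %>%
  mutate(
    across(where(is.numeric), ~ replace_na(.x, 0)),
    across(where(is.character), ~ replace_na(.x, "unknown"))
  )

# Count the NA values left in each column (illustrative check)
na_counts <- filled %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

na_counts
```

If every count is zero, no missing values remain in the result.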

The data frame can also be passed directly to replace_na() as the first argument. In this case, the replace argument takes a named list, with one value for each column whose NA values should be replaced.

# Replace `NA` in column `A` with 0, and replace `NA` in column `B` with "unknown".
df %>% replace_na(replace = list(A = 0, B = "unknown"))

Output:

# A tibble: 3 × 3
      A B           C
  <dbl> <chr>   <dbl>
1     1 a          NA
2     2 unknown    NA
3     0 b          10
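
Columns that are not named in the list are left unchanged, which is why C still contains NA in the output above. As a small sketch under the same assumptions (the df defined earlier; the name all_filled is illustrative), adding an entry for C fills all three columns in one call:

```r
library(tidyr)
library(dplyr)

# Same example data as above
df <- tibble(A = c(1, 2, NA), B = c("a", NA, "b"), C = c(NA, NA, 10))

# One list entry per column; each value should match its column's type
all_filled <- df %>%
  replace_na(replace = list(A = 0, B = "unknown", C = 0))

all_filled
```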