library(tidyr)
library(dplyr)
df <- tibble(A = c(1, 2, NA), B = c("a", NA, "b"), C = c(NA, NA, 10))
df
Output:
# A tibble: 3 × 3
      A B         C
  <dbl> <chr> <dbl>
1     1 a        NA
2     2 <NA>     NA
3    NA b        10
Replace NA with Specified Values
replace_na() replaces NA in a vector or in columns of a data frame with specified values.
Replace NA with zero.
df$A %>% replace_na(replace = 0)
Output:
[1] 1 2 0
Replace NA with "unknown".
df$B %>% replace_na(replace = "unknown")
Output:
[1] "a" "unknown" "b"
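The replacement value should match the type of the vector it fills. The sketch below assumes a recent tidyr release (to my knowledge, 1.2.0 or later), where the replacement is cast to the type of the data, so filling a numeric vector with a text label is expected to fail unless the vector is converted first. The vector `x` is a made-up example, not part of the dataset above.

```r
library(tidyr)

# Hypothetical numeric vector with one missing value.
x <- c(1, 2, NA)

# A replacement of the same type (numeric) works directly.
filled_num <- replace_na(x, 0)
filled_num

# To use a text label instead, convert the vector to character first,
# then replace the missing value.
filled_chr <- replace_na(as.character(x), "unknown")
filled_chr
```

Converting up front makes the type change explicit instead of relying on implicit coercion.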
Use mutate() to replace NA in columns A and B of the dataset.
df %>% mutate(A = replace_na(A, 0), B = replace_na(B, "unknown"))
Output:
# A tibble: 3 × 3
      A B           C
  <dbl> <chr>   <dbl>
1     1 a          NA
2     2 unknown    NA
3     0 b          10
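You can use across() together with the where() selection helper to apply the replacement to several columns at once, e.g., replacing NA with 0 in all numeric columns, and with "unknown" in all character columns. Wrapping the predicate in where() avoids the warning that recent versions raise for bare predicate functions such as across(is.numeric, ...).
df %>% mutate(
  across(where(is.numeric), ~ replace_na(.x, 0)),
  across(where(is.character), ~ replace_na(.x, "unknown"))
)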
Output:
# A tibble: 3 × 3
      A B           C
  <dbl> <chr>   <dbl>
1     1 a           0
2     2 unknown     0
3     0 b          10
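When only specific columns should be filled, across() also accepts column names instead of a type-based selection. The following is a minimal sketch using the same example data; the name `result` is introduced here only for illustration.

```r
library(tidyr)
library(dplyr)

df <- tibble(A = c(1, 2, NA), B = c("a", NA, "b"), C = c(NA, NA, 10))

# Replace NA with 0 only in the explicitly named columns A and C.
# Column B is not selected, so its NA is left as is.
result <- df %>% mutate(across(c(A, C), ~ replace_na(.x, 0)))
result
```

Naming the columns keeps the replacement from touching numeric columns where NA should stay missing.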
The dataset can also be passed directly to replace_na() as the first argument. In this case, the replace argument takes a named list of values, with one value for each column whose NA values are to be replaced. Columns not named in the list are left unchanged.
# Replace `NA` in column `A` with 0, and replace `NA` in column `B` with "unknown".
df %>% replace_na(replace = list(A = 0, B = "unknown"))
Output:
# A tibble: 3 × 3
      A B           C
  <dbl> <chr>   <dbl>
1     1 a          NA
2     2 unknown    NA
3     0 b          10
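One way to confirm which columns were affected is to count the missing values per column before and after the replacement. This is a small sketch using base R's is.na() and colSums(); the names `filled`, `na_before`, and `na_after` are introduced here only for illustration.

```r
library(tidyr)
library(dplyr)

df <- tibble(A = c(1, 2, NA), B = c("a", NA, "b"), C = c(NA, NA, 10))

# Fill columns A and B; column C is not in the list and is left alone.
filled <- df %>% replace_na(list(A = 0, B = "unknown"))

# Count NA values per column before and after the replacement.
na_before <- colSums(is.na(df))      # A = 1, B = 1, C = 2
na_after  <- colSums(is.na(filled))  # A = 0, B = 0, C = 2
na_before
na_after
```

Column C still reports two missing values because it was not named in the replacement list.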