Creating Tibbles and Their Important Features

tibble() creates a tibble from scratch, and as_tibble() converts other objects (e.g., a data.frame or a matrix) to a tibble. Below we’ll discuss how to create tibbles using these two functions; in the process, we’ll also discuss their important properties.


1. You can create a tibble much as you would a data.frame, by specifying each column’s name and its values. Unlike a data.frame, a printed tibble displays the dataset dimensions (number of rows × number of columns) and the type of each column, which is informative for downstream analysis.

library(tibble)
library(dplyr)
a <- tibble(Name = c("Alice", "Bob", "Charlie"), Age = c(25, 30, 35), City = c("New York", "Los Angeles", "Chicago"))
a

Output:

# A tibble: 3 × 3
Name Age City
<chr> <dbl> <chr>
1 Alice 25 New York
2 Bob 30 Los Angeles
3 Charlie 35 Chicago

A tibble has the class tbl_df, and it also inherits from tbl and data.frame.

class(a)

Output:

[1] "tbl_df" "tbl" "data.frame"
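Because data.frame is among the classes, code written for ordinary data frames generally accepts a tibble as well. The short check below is a minimal sketch of that idea; the object a is a shortened copy of the example above, and is_tibble() comes from the tibble package.

```r
library(tibble)

# A shortened copy of the example tibble
a <- tibble(Name = c("Alice", "Bob", "Charlie"), Age = c(25, 30, 35))

is_tibble(a)         # TRUE: the object carries the tbl_df class
is.data.frame(a)     # TRUE: it is also a data.frame
inherits(a, "tbl")   # TRUE

# Base R helpers for data frames still work on a tibble
nrow(a)
mean(a$Age)
```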

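2. You can include lists in a tibble column, which is helpful for functional programming. (data.frame() does not accept a list this way: it treats each list element as a separate column, so you have to wrap the list in I() to get a list column.)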

tibble(x = 1:3, y = list(1:5, 1:10, 1:20))

Output:

# A tibble: 3 × 2
x y
<int> <list>
1 1 <int [5]>
2 2 <int [10]>
3 3 <int [20]>
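To illustrate why a list column is handy, the sketch below applies a function to each element of y. The names df and bf are chosen here for illustration only; the last part shows the I() wrapper that base data.frame() needs for the same layout.

```r
library(tibble)

df <- tibble(x = 1:3, y = list(1:5, 1:10, 1:20))

# Each cell of y holds a whole vector
lengths(df$y)        # number of elements per cell
sapply(df$y, sum)    # one summary value per cell
df$y[[2]]            # extract the vector stored in row 2

# Base R: the list must be wrapped in I() to stay a single column
bf <- data.frame(x = 1:3, y = I(list(1:5, 1:10, 1:20)))
ncol(bf)
```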

3. tibble() builds columns sequentially. When defining a column, you can refer to columns created earlier; e.g., column C is created based on prior-defined A and B. (You cannot do this with data.frame.)

tibble(A = 1:3, B = c(1, 3, 6), C = A^2 + B)

Output:

# A tibble: 3 × 3
A B C
<int> <dbl> <dbl>
1 1 1 2
2 2 3 7
3 3 6 15
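For comparison, here is a minimal sketch of what happens with data.frame(), and one base R workaround using transform(). It assumes that no objects named A or B exist in the calling environment; if they did, data.frame() would silently use them instead of failing.

```r
library(tibble)

# data.frame() evaluates all arguments before the data frame exists,
# so the columns A and B are not visible when C is computed.
res <- tryCatch(
  data.frame(A = 1:3, B = c(1, 3, 6), C = A^2 + B),
  error = function(e) conditionMessage(e)
)
res

# Base R workaround: build A and B first, then derive C
base_df <- transform(data.frame(A = 1:3, B = c(1, 3, 6)), C = A^2 + B)

# tibble() does it in one call
tb <- tibble(A = 1:3, B = c(1, 3, 6), C = A^2 + B)

all.equal(base_df$C, tb$C)
```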

4. tibble() faithfully preserves complicated (non-syntactic) column names.

tibble(`:)` = 1:3, `A^2 + B` = 4:6, `3` = c("a", "b", "c")) 

Output:

# A tibble: 3 × 3
`:)` `A^2 + B` `3`
<int> <int> <chr>
1 1 4 a
2 2 5 b
3 3 6 c

In comparison, data.frame() alters such column names by default.

data.frame(`:)` = 1:3, `A^2 + B` = 4:6, `3` = c("a", "b", "c")) 

Output:

X.. A.2...B X3
1 1 4 a
2 2 5 b
3 3 6 c
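Non-syntactic names need backticks (or string indexing) when you refer to them later. The sketch below shows this, along with the check.names argument of data.frame(), which turns off the renaming that make.names() otherwise applies. The object names tb and bf are chosen here for illustration.

```r
library(tibble)

tb <- tibble(`:)` = 1:3, `A^2 + B` = 4:6, `3` = c("a", "b", "c"))
names(tb)

# Refer to non-syntactic columns with backticks or by string
tb$`A^2 + B`
tb[["3"]]

# data.frame() keeps the names only when check.names = FALSE
bf <- data.frame(`:)` = 1:3, `A^2 + B` = 4:6, `3` = c("a", "b", "c"),
                 check.names = FALSE)
names(bf)

# The default renaming comes from make.names()
make.names(c(":)", "A^2 + B", "3"))
```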

5. Only columns of length one are recycled. In the code below, for instance, tibble() refuses to recycle B = c(1, 2) three times to fill six rows. This restriction is deliberate, because recycling longer vectors is a frequent source of bugs. In comparison, data.frame() does recycle longer vectors (e.g., data.frame(A = 1:6, B = c(1, 2)) recycles B = c(1, 2) three times).

tibble(A = 1:6, B = c(1, 2))
# Error:
# ! Tibble columns must have compatible sizes.
# • Size 6: Existing data.
# • Size 2: Column `B`.
# ℹ Only values of size one are recycled.
data.frame(A = 1:6, B = c(1, 2))

Output:

A B
1 1 1
2 2 2
3 3 1
4 4 2
5 5 1
6 6 2
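If you do want the repeated pattern in a tibble, one option is to spell out the repetition with rep(). The following is a minimal sketch; the object names ok, explicit, and failed are chosen here for illustration.

```r
library(tibble)

# A length-one value is recycled to every row
ok <- tibble(A = 1:6, B = 1)

# Longer patterns must be repeated explicitly
explicit <- tibble(A = 1:6, B = rep(c(1, 2), times = 3))
explicit$B

# The implicit version raises an error, captured here instead of stopping
failed <- tryCatch(tibble(A = 1:6, B = c(1, 2)), error = function(e) e)
inherits(failed, "error")
```

Writing rep() out makes the intended repetition visible in the code, which is the kind of explicitness the recycling rule is meant to encourage.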

6. Use as_tibble() to convert a data.frame to a tibble.

as_tibble(mtcars) # convert 'data.frame' mtcars dataset to a tibble

Output:

# A tibble: 32 × 11
mpg cyl disp hp drat wt qsec vs am gear carb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 21 6 160 110 3.9 2.62 16.5 0 1 4 4
2 21 6 160 110 3.9 2.88 17.0 0 1 4 4
3 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
4 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
5 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
6 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
7 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
8 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
9 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
# ℹ 22 more rows
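as_tibble() also accepts other inputs, such as a matrix or a named list. The sketch below uses a small matrix with column names already set (an assumption made here to avoid name repair), and shows as.data.frame() for converting back.

```r
library(tibble)

# A matrix with column names converts column by column
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))
mt <- as_tibble(m)
mt

# A named list of equal-length vectors also converts
lt <- as_tibble(list(x = 1:3, y = c("a", "b", "c")))
lt

# Convert back to a plain data.frame when needed
back <- as.data.frame(mt)
class(back)
```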

7. Tibbles do not support row names. Note that the rownames of mtcars are dropped after the data frame is converted to a tibble.

rownames(mtcars)

Output:

[1] "Mazda RX4" "Mazda RX4 Wag" "Datsun 710" "Hornet 4 Drive" "Hornet Sportabout" "Valiant" "Duster 360" "Merc 240D" "Merc 230" "Merc 280" "Merc 280C" "Merc 450SE" "Merc 450SL"
[14] "Merc 450SLC" "Cadillac Fleetwood" "Lincoln Continental" "Chrysler Imperial" "Fiat 128" "Honda Civic" "Toyota Corolla" "Toyota Corona" "Dodge Challenger" "AMC Javelin" "Camaro Z28" "Pontiac Firebird" "Fiat X1-9"
[27] "Porsche 914-2" "Lotus Europa" "Ford Pantera L" "Ferrari Dino" "Maserati Bora" "Volvo 142E"

as_tibble(mtcars) %>% rownames()

Output:

[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" "32"

You can preserve row names as a separate column using rownames_to_column() from the tibble package.

rownames_to_column(mtcars) %>% as_tibble()

Output:

# A tibble: 32 × 12
rowname mpg cyl disp hp drat wt qsec vs am gear carb
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Mazda RX4 21 6 160 110 3.9 2.62 16.5 0 1 4 4
2 Mazda RX4 Wag 21 6 160 110 3.9 2.88 17.0 0 1 4 4
3 Datsun 710 22.8 4 108 93 3.85 2.32 18.6 1 1 4 1
4 Hornet 4 Drive 21.4 6 258 110 3.08 3.22 19.4 1 0 3 1
5 Hornet Sportabout 18.7 8 360 175 3.15 3.44 17.0 0 0 3 2
6 Valiant 18.1 6 225 105 2.76 3.46 20.2 1 0 3 1
7 Duster 360 14.3 8 360 245 3.21 3.57 15.8 0 0 3 4
8 Merc 240D 24.4 4 147. 62 3.69 3.19 20 1 0 4 2
9 Merc 230 22.8 4 141. 95 3.92 3.15 22.9 1 0 4 2
10 Merc 280 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 4
# ℹ 22 more rows
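Two related options may be useful here: the var argument of rownames_to_column() sets the name of the new column, and as_tibble() has a rownames argument that does the conversion in one step. The sketch below compares them; the column name car is chosen here for illustration.

```r
library(tibble)

# Option 1: move row names into a column, then convert
cars1 <- as_tibble(rownames_to_column(mtcars, var = "car"))

# Option 2: let as_tibble() create the column directly
cars2 <- as_tibble(mtcars, rownames = "car")

identical(cars1$car, cars2$car)

has_rownames(mtcars)   # the original data.frame has row names
has_rownames(cars2)    # the tibble does not

# Going back: turn the column into row names of a data.frame
restored <- column_to_rownames(as.data.frame(cars2), var = "car")
head(rownames(restored), 3)
```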

The rownames_to_column() call can also be rewritten using mutate() from the dplyr package; note that the new column is named carnames here rather than rowname.

mtcars %>%
  mutate(carnames = rownames(mtcars), .before = 1) %>%
  as_tibble()


The next tutorial will discuss tribble(), which creates tibbles row by row in an intuitive layout.