[R]-Data structures

Cite: http://adv-r.had.co.nz/Data-structures.html

R's base data structures can be organised by their dimensionality (1d, 2d, or nd) and whether they're homogeneous (all contents must be of the same type) or heterogeneous (the contents can be of different types). This gives rise to the five data types most often used in data analysis:

              Homogeneous      Heterogeneous
1d (vector)   Atomic vector    List
2d            Matrix           Data frame
nd            Array            -

Note that R has no 0-dimensional, or scalar types. Individual numbers or strings, which you might think would be scalars, are actually vectors of length one.
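A quick check at the console illustrates this. The variable names x and s are only for illustration:

```r
# A single number is a double vector of length one
x <- 42
length(x)
#> [1] 1
is.vector(x)
#> [1] TRUE

# The same holds for a single string
s <- "hello"
length(s)
#> [1] 1
typeof(s)
#> [1] "character"
```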

Given an object, the best way to understand what data structures it's composed of is to use str(). str() is short for structure, and it prints a compact, human-readable description of any R object.
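As a small sketch of what str() reports, here it is applied to an atomic vector and to a list. The values and the element names id and label are made up for illustration:

```r
# str() on an atomic vector shows the type, the index range and the values
str(c(1.5, 2, 3))
#>  num [1:3] 1.5 2 3

# str() on a list describes each element in turn
str(list(id = 1:3, label = "a"))
#> List of 2
#>  $ id   : int [1:3] 1 2 3
#>  $ label: chr "a"
```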

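A matrix is simply a two-dimensional array. An atomic vector is not itself an array, but it behaves like a one-dimensional structure and becomes a matrix or array once you give it a dim attribute.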

Vector

The basic data structure in R is the vector. Vectors come in two flavours: atomic vectors and lists. They have three common properties:

  • Type, typeof(), what it is.
  • Length, length(), how many elements it contains.
  • Attributes, attributes(), additional arbitrary metadata.
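The three properties can be inspected on any vector. A minimal sketch, using a hypothetical named integer vector v:

```r
# names are stored as an attribute of the vector
v <- c(a = 1L, b = 2L, c = 3L)

typeof(v)       # what it is
#> [1] "integer"
length(v)       # how many elements it contains
#> [1] 3
attributes(v)   # additional metadata, here only the names
#> $names
#> [1] "a" "b" "c"
```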

Atomic vector

There are four common types of atomic vectors: logical, integer, double (often called numeric), and character. There are two rare types that I will not discuss further: complex and raw. Atomic vectors are usually created with c(), short for combine.
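For illustration, one vector of each common type can be built with c(). The variable names are arbitrary:

```r
x_lgl <- c(TRUE, FALSE, TRUE)
# The L suffix gives an integer rather than a double
x_int <- c(1L, 6L, 10L)
x_dbl <- c(1, 2.5, 4.5)
x_chr <- c("these are", "some strings")

typeof(x_lgl)
#> [1] "logical"
typeof(x_int)
#> [1] "integer"
typeof(x_dbl)
#> [1] "double"
typeof(x_chr)
#> [1] "character"
```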

Atomic vectors are always flat, even if you nest c()’s:

c(1, c(2, c(3, 4)))
#> [1] 1 2 3 4
# the same as
c(1, 2, 3, 4)
#> [1] 1 2 3 4

Given a vector, you can determine its type with typeof(), or check if it's a specific type with an "is" function:

# type predicates, each takes a vector x:
# is.character(x), is.double(x), is.integer(x), is.logical(x)
# or, more generally:
# is.atomic(x)

# examples
int_var <- c(1L, 6L, 10L)
typeof(int_var)
#> [1] "integer"
is.integer(int_var)
#> [1] TRUE
is.atomic(int_var)
#> [1] TRUE

dbl_var <- c(1, 2.5, 4.5)
typeof(dbl_var)
#> [1] "double"
is.double(dbl_var)
#> [1] TRUE
is.atomic(dbl_var)
#> [1] TRUE

For these vectors, is.numeric() behaves like is.integer() | is.double(): it returns TRUE for both integer and double vectors:

is.numeric(int_var)
#> [1] TRUE
is.numeric(dbl_var)
#> [1] TRUE

List

You construct lists by using list() instead of c():

x <- list(1:3, "a", c(TRUE, FALSE, TRUE), c(2.3, 5.9))
str(x)
#> List of 4
#>  $ : int [1:3] 1 2 3
#>  $ : chr "a"
#>  $ : logi [1:3] TRUE FALSE TRUE
#>  $ : num [1:2] 2.3 5.9

Lists are sometimes called recursive vectors, because a list can contain other lists:

x <- list(list(list(list())))
str(x)
#> List of 1
#>  $ :List of 1
#>   ..$ :List of 1
#>   .. ..$ : list()
is.recursive(x)
#> [1] TRUE

c() will combine several lists into one. If given a combination of atomic vectors and lists, c() will coerce the vectors to lists before combining them. Compare the results of list() and c():

x <- list(list(1, 2), c(3, 4))
y <- c(list(1, 2), c(3, 4))
str(x)
#> List of 2
#>  $ :List of 2
#>   ..$ : num 1
#>   ..$ : num 2
#>  $ : num [1:2] 3 4
str(y)
#> List of 4
#>  $ : num 1
#>  $ : num 2
#>  $ : num 3
#>  $ : num 4

You can turn a list into an atomic vector with unlist(). If the elements of a list have different types, unlist() uses the same coercion rules as c().
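A short sketch of both cases, with made-up values:

```r
# All elements are doubles, so the result is a double vector
unlist(list(1, 2, c(3, 4)))
#> [1] 1 2 3 4

# Mixed types are coerced to the most flexible type, here character
unlist(list(1, "a", TRUE))
#> [1] "1"    "a"    "TRUE"

# c() applied to the same atomic values coerces the same way
c(1, "a", TRUE)
#> [1] "1"    "a"    "TRUE"
```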

Lists are used to build up many of the more complicated data structures in R. For example, both data frames and linear model objects (as produced by lm()) are lists:

is.list(mtcars)
#> [1] TRUE

mod <- lm(mpg ~ wt, data = mtcars)
is.list(mod)
#> [1] TRUE
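
Because these objects are lists underneath, the usual list tools apply to them. A brief sketch, refitting the same model so the block runs on its own:

```r
mod <- lm(mpg ~ wt, data = mtcars)

# The underlying type of both objects is "list"
typeof(mtcars)
#> [1] "list"
typeof(mod)
#> [1] "list"

# Each column of a data frame is one list element
length(mtcars)
#> [1] 11

# Components of the fitted model are accessed like any list element
names(mod$coefficients)
#> [1] "(Intercept)" "wt"
```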