01: Data Structures

htrieu93 edited this page Oct 8, 2017 · 10 revisions

What are the six types of atomic vector? How does a list differ from an atomic vector?

Susannah:

Integer, character, raw, double, logical, complex are the 6 types of atomic vectors. Lists are recursive, but atomic vectors are flat.
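A minimal sketch to check this with base R's typeof(); the variable names (examples, types, flat, nested) are illustrative and not from the original answers:

```r
# One example value of each atomic type; the names are illustrative only.
examples <- list(
  logical   = TRUE,
  integer   = 1L,
  double    = 1.5,
  complex   = 1 + 2i,
  character = "a",
  raw       = as.raw(1)
)
types <- vapply(examples, typeof, character(1))
types

# c() always flattens atomic vectors, while list() keeps the nesting.
flat   <- c(1, c(2, c(3, 4)))
nested <- list(1, list(2, list(3, 4)))
length(flat)    # 4
length(nested)  # 2
```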


What makes is.vector() and is.numeric() fundamentally different to is.list() and is.character()?

Susannah:

is.vector and is.numeric are general tests, not specific tests for types. is.list and is.character test for exact types.

Jenny:

Although is.vector()'s name suggests it would test whether an object is a vector, it does not. It seems to test more whether it is just a vector -- of the most basic kind. It tests if "the object is a vector with no attributes apart from names". is.atomic() is more suitable for testing whether an object is an atomic vector. is.list() tests whether an object is truly a list. is.numeric() tests for "numberliness" and will return TRUE for integer or double atomic vectors. This is different from is.character(), which tests for one of the basic types.

Andrew:

is.vector returns TRUE for any vector, atomic or list, that has no attributes besides names.

is.numeric, similarly, is TRUE for either integer or double vectors, but not for lists.
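The behaviour described in these answers can be sketched as follows; the "units" attribute is a made-up example, not something from the original discussion:

```r
x <- c(a = 1, b = 2)
is.vector(x)             # TRUE: names are the only attribute

attr(x, "units") <- "cm" # hypothetical extra attribute
is.vector(x)             # FALSE: an attribute other than names is present
is.atomic(x)             # TRUE: it is still an atomic vector

is.vector(list(1, "a"))  # TRUE: lists are vectors too
is.numeric(1L)           # TRUE
is.numeric(1.5)          # TRUE
is.numeric(list(1))      # FALSE
```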


Test your knowledge of vector coercion rules by predicting the output of the following uses of c():

c(1, FALSE)

numeric 1, 0

c("a", 1)

character "a", "1"

c(list(1), "a")

list 1, "a"

c(TRUE, 1L)

integer 1, 1
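One way to verify these predictions is to ask typeof() for the type of each result. This is only a checking sketch, and the variable names are illustrative:

```r
t1 <- typeof(c(1, FALSE))      # "double"
t2 <- typeof(c("a", 1))        # "character"
t3 <- typeof(c(list(1), "a"))  # "list"
t4 <- typeof(c(TRUE, 1L))      # "integer"
c(t1, t2, t3, t4)
```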


Why do you need to use unlist() to convert a list to an atomic vector? Why doesn’t as.vector() work?

Alathea:

Lists are also vectors, so as.vector() will not coerce to an atomic vector; it just leaves the input as a list.

Jenny:

as.vector() won't work because I think lists are already vectors, just not atomic vectors. I know I can create a list using vector(mode = "list"), which is why I think applying as.vector() to a list object won't actually convert it.

Susannah:

as.vector() strips attributes from atomic vectors but leaves a list as a list, while unlist() flattens the list structure.
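A small sketch of the difference, using a made-up three-element list:

```r
x <- list(a = 1, b = "two", c = TRUE)  # illustrative input

v <- as.vector(x)  # a list is already a vector, so it stays a list
u <- unlist(x)     # flattened; elements coerced to the most flexible type

typeof(v)  # "list"
typeof(u)  # "character"
```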


Why is 1 == "1" true? Why is -1 < FALSE true? Why is "one" < 2 false?

Andrew:

from the help:

If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw.

Hieu:

- 1 == "1": 1 will get coerced into char "1" since the more flexible type here is character. Hence "1" == "1" is true.

- -1 < FALSE: FALSE will get coerced into numeric 0 since the more flexible type here is numeric. Hence -1 < 0 is true.

- "one" < 2: 2 will get coerced into char "2" since the more flexible type here is character. Hence "one" < "2" is false, because digits sort before letters in string comparison.
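The three comparisons can be run directly. The comments show the coerced form each one is evaluated as; the result names (r1, r2, r3) are illustrative:

```r
r1 <- 1 == "1"    # evaluated as "1" == "1"
r2 <- -1 < FALSE  # evaluated as -1 < 0
r3 <- "one" < 2   # evaluated as "one" < "2", a string comparison

c(r1, r2, r3)     # TRUE TRUE FALSE
```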


Why is the default missing value, NA, a logical vector? What’s special about logical vectors? (Hint: think about c(FALSE, NA_character_).)

Jenny:

I think of logicals as the most finicky type and the most vulnerable to coercion. So if you make NAs logical, coercion will kick in when and if necessary and you'll eventually get to the appropriate type of NA.

Andrew:

As this example illustrates, using c() with a character NA coerces everything to character. Logical is at the "bottom" of the hierarchy, so it can't drag anything else up to its level.

Alathea:

Maybe because it is the least flexible data type... c(FALSE, NA_character_) would cause FALSE to change data type.
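A short sketch of how a logical NA takes on the type of whatever it is combined with:

```r
typeof(NA)                       # "logical"
typeof(c(NA, 1L))                # "integer"
typeof(c(NA, 2.5))               # "double"
typeof(c(NA, "a"))               # "character"

# The hint from the question: a typed NA drags FALSE up to character.
hint <- c(FALSE, NA_character_)
typeof(hint)                     # "character"
```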


An early draft used this code to illustrate structure():

structure(1:5, comment = "my attribute")
#> [1] 1 2 3 4 5

But when you print that object you don’t see the comment attribute. Why? Is the attribute missing, or is there something else special about it? (Hint: try using help)

Jenny:

Apparently a comment attribute is a "thing" that has been anticipated in R; I guess we are expected to use it to annotate data frames and model fits. Who knew? Anyway, according to the help,

Contrary to other attributes, the comment is not printed (by print or print.default)

Susannah:

Attributes with any name other than comment print fine. comment is a special case that is not printed.

Alathea:

from the help:

These functions set and query a comment attribute for any R objects. This is typically useful for data.frames or model fits. Contrary to other attributes, the comment is not printed (by print or print.default).
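To confirm that the attribute is present but simply not printed, it can be read back with comment() or attributes(). A minimal sketch:

```r
x <- structure(1:5, comment = "my attribute")

print(x)       # prints only the values, not the comment
comment(x)     # "my attribute"
attributes(x)  # the comment attribute is listed here
```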


What happens to a factor when you modify its levels?

f1 <- factor(letters)
levels(f1) <- rev(levels(f1))

Susannah:

Reversing the factor levels reversed both the vector and the levels.

Jenny:

When we first create f1, it will print as a b c ... and have levels a b c .... The above resetting of levels will not cause an error because all the original levels appear -- they're just in a new order. However even I, veteran of many factor battles, got caught out here somewhat. I knew that the levels of f1 would become z y x ... but I was surprised to see that f1 itself now printed as z y x ....
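One way to see why the printed values change is to look at the underlying integer codes, which the assignment leaves untouched; only the labels attached to the codes change. A sketch, with codes_before and codes_after as illustrative names:

```r
f1 <- factor(letters)
codes_before <- as.integer(f1)

levels(f1) <- rev(levels(f1))
codes_after <- as.integer(f1)

identical(codes_before, codes_after)  # TRUE: codes are unchanged
as.character(f1)[1:3]                 # "z" "y" "x": code 1 is now labelled "z"
levels(f1)[1:3]                       # "z" "y" "x"
```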


What does this code do? How do f2 and f3 differ from f1?

f2 <- rev(factor(letters))

Jenny:

f2 will be a factor. It will print z y x ... but have levels a b c ....

Susannah:

Reversing the factor reversed the vector, but left the levels intact

f3 <- factor(letters, levels = rev(letters))

Jenny:

f3 will be a factor that prints a b c ... with levels z y x ....

Susannah:

reversed the levels, but not the vector
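Comparing the values, levels, and integer codes side by side makes the difference concrete. A sketch:

```r
f2 <- rev(factor(letters))
f3 <- factor(letters, levels = rev(letters))

as.character(f2)[1:3]  # "z" "y" "x": values reversed
levels(f2)[1:3]        # "a" "b" "c": levels untouched

as.character(f3)[1:3]  # "a" "b" "c": values untouched
levels(f3)[1:3]        # "z" "y" "x": levels reversed

# Both start with code 26, but for different reasons.
as.integer(f2)[1:3]    # 26 25 24
as.integer(f3)[1:3]    # 26 25 24
```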


What does dim() return when applied to a vector?

NULL


If is.matrix(x) is TRUE, what will is.array(x) return?

TRUE because a matrix is an array
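Both this answer and the previous one about dim() can be checked with a small sketch (v and m are illustrative names):

```r
v <- 1:6
dim(v)                    # NULL: a plain vector has no dim attribute

m <- matrix(v, nrow = 2)
is.matrix(m)              # TRUE
is.array(m)               # TRUE: a matrix is a two-dimensional array

# Setting dim on the vector turns it into a matrix in place.
dim(v) <- c(2, 3)
is.matrix(v)              # TRUE
```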


How would you describe the following three objects? What makes them different to 1:5?

x1 <- array(1:5, c(1, 1, 5))
x2 <- array(1:5, c(1, 5, 1))
x3 <- array(1:5, c(5, 1, 1))

Andrew:

These are all 3D arrays, i.e. length(dim(x)) is 3 for each of them.

Jenny:

x1, x2, and x3 are all multi-dimensional (three-dimensional, actually) arrays, whereas 1:5 is an atomic vector. Granted, they each have only one dimension whose extent is greater than one, but that is still a totally different beast from an atomic vector. I think of them like so: x1 has 1 row, 1 column and 5 slices. x2 has 1 row, 5 columns and 1 slice. x3 has 5 rows, 1 column and 1 slice.

Susannah:

x1 is 1 row by 1 column by 5 deep.
x2 is 1 row by 5 columns by 1 deep.
x3 is 5 rows by 1 column by 1 deep.
They are different from 1:5 because of their dimensionality.
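A sketch that shows the dim attribute is the only difference: the data are the same, but the third element lives at a different index in each array.

```r
x1 <- array(1:5, c(1, 1, 5))
x2 <- array(1:5, c(1, 5, 1))
x3 <- array(1:5, c(5, 1, 1))

dim(1:5)  # NULL
dim(x1)   # 1 1 5
dim(x2)   # 1 5 1
dim(x3)   # 5 1 1

# Same underlying values once the dim attribute is dropped.
identical(as.vector(x1), 1:5)  # TRUE

# The element 3 sits along a different axis in each array.
x1[1, 1, 3]  # 3
x2[1, 3, 1]  # 3
x3[3, 1, 1]  # 3
```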


What attributes does a data frame possess?

names (same as colnames()), row.names, class
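This can be checked with attributes() on a small made-up data frame (df is an illustrative name):

```r
df <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Sorted so the result does not depend on the internal attribute order.
attr_names <- sort(names(attributes(df)))
attr_names                          # "class" "names" "row.names"

identical(names(df), colnames(df))  # TRUE
class(df)                           # "data.frame"
```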


What does as.matrix() do when applied to a data frame with columns of different types?

All columns are coerced to a single common type, following the coercion rules from this chapter.
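A sketch with two made-up data frames, one all-numeric and one with a character column:

```r
num_df <- data.frame(i = 1:2, d = c(1.5, 2.5))  # integer + double
mix_df <- data.frame(i = 1:2, s = c("a", "b"))  # integer + character

typeof(as.matrix(num_df))  # "double"
typeof(as.matrix(mix_df))  # "character"
```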


Can you have a data frame with 0 rows? What about 0 columns?

Yes, you can have a completely empty data frame.
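A sketch of the three cases: no rows and no columns, no rows only, and no columns only. The names are illustrative:

```r
empty   <- data.frame()                                # 0 rows, 0 columns
no_rows <- data.frame(x = integer(0))                  # 0 rows, 1 column
no_cols <- data.frame(x = 1:3)[, FALSE, drop = FALSE]  # 3 rows, 0 columns

dim(empty)    # 0 0
dim(no_rows)  # 0 1
dim(no_cols)  # 3 0
```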
