01: Data Structures
Susannah:
Integer, character, raw, double, logical, complex are the 6 types of atomic vectors. Lists are recursive, but atomic vectors are flat.
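A minimal base-R sketch of the six types and of the flat-versus-recursive distinction. The example values are arbitrary choices for illustration:

```r
# One example value for each of the six atomic vector types.
examples <- list(
  integer   = 1L,
  character = "a",
  raw       = as.raw(1),
  double    = 1.5,
  logical   = TRUE,
  complex   = 1 + 2i
)

# typeof() reports the underlying type of each value.
types <- vapply(examples, typeof, character(1))
print(types)

# Atomic vectors are flat: nested c() calls collapse into one vector.
flat <- c(1, c(2, c(3, 4)))
print(length(flat))

# Lists are recursive: a list can hold another list as an element.
nested <- list(1, list(2, list(3, 4)))
str(nested)
```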
Susannah:
is.vector and is.numeric are general tests, not tests for a specific type. is.list and is.character test for exact types.
Jenny:
Although the name of is.vector() suggests it tests whether an object is a vector, it does not. It seems to test whether the object is just a vector, of the most basic kind: it tests if "the object is a vector with no attributes apart from names". is.atomic() is more suitable for testing if an object is an atomic vector. is.list() tests whether an object is truly a list. is.numeric() tests for "numberliness" and will return TRUE for integer or double atomic vectors. This is different from is.character(), which tests for one of the basic types.
Andrew:
is.vector returns TRUE for any vector, atomic or list, of any type that has no attributes besides names. is.numeric, similarly, is TRUE for either integer or double vectors, but not for lists.
numeric 1, 0
character "a", "1"
list 1, "a"
numeric 1, 1
Why do you need to use unlist() to convert a list to an atomic vector? Why doesn't as.vector() work?
Alathea:
Lists are also vectors, so as.vector will not coerce a list to an atomic vector; it just leaves it as a list.
Jenny:
as.vector() won't work because I think lists are already vectors, just not atomic vectors. I know I can create a list using vector(mode = "list"), which is why I think applying as.vector() to a list object won't actually convert it.
Susannah:
as.vector simply removes attributes, while unlist simplifies the list structure.
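A small sketch of the difference, using an arbitrary example list:

```r
l <- list(a = 1, b = 2:3)

# as.vector() with the default mode leaves a list as a list.
v1 <- as.vector(l)
print(is.list(v1))

# unlist() flattens the elements into a single atomic vector.
v2 <- unlist(l)
print(is.atomic(v2))
print(length(v2))

# vector(mode = "list") creates a list, showing that lists are vectors.
empty_list <- vector(mode = "list", length = 2)
print(is.list(empty_list))
```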
Andrew:
from the help:
If the two arguments are atomic vectors of different types, one is coerced to the type of the other, the (decreasing) order of precedence being character, complex, numeric, integer, logical and raw.
Hieu:
- 1 == "1": 1 gets coerced to the character "1", since the more flexible type here is character. Hence "1" == "1" is TRUE.
- -1 < FALSE: FALSE gets coerced to the number 0, since the more flexible type here is numeric (-1 is a double). Hence -1 < 0 is TRUE.
- "one" < 2: 2 gets coerced to the character "2", since the more flexible type here is character. Hence "one" < "2" is FALSE.
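The three comparisons can be run directly. This sketch compares each mixed-type comparison with its explicitly coerced equivalent; note that string ordering depends on the locale, so the last result is shown only relative to the character comparison:

```r
# 1 == "1": the number is coerced to character before comparing.
r1 <- 1 == "1"
r1_explicit <- as.character(1) == "1"

# -1 < FALSE: the logical is coerced to the number 0.
r2 <- -1 < FALSE
r2_explicit <- -1 < as.numeric(FALSE)

# "one" < 2: the number is coerced to "2", then strings are compared.
r3 <- "one" < 2
r3_explicit <- "one" < as.character(2)

print(c(r1, r2, r3))
```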
Why is the default missing value, NA, a logical vector? What's special about logical vectors? (Hint: think about c(FALSE, NA_character_).)
Jenny:
I think of logicals as the most finicky type and the most vulnerable to coercion. So if you make NAs logical, coercion will kick in when and if necessary and you'll eventually get to the appropriate type of NA.
Andrew:
As this example illustrates, using c with a character NA coerces everything to character. Logical is at the "bottom" of the hierarchy, so it can't drag anything else up to its level.
Alathea:
Maybe because it is the least flexible data type... c(FALSE, NA_character_) would cause FALSE to change data type.
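A short sketch of the point made in these answers: a logical NA takes on the type of whatever it is combined with, while a typed NA pulls the other values up to its own type.

```r
# The default NA is logical.
t_na <- typeof(NA)

# A logical NA never changes the type of the vector it joins.
t_int <- typeof(c(NA, 1L))
t_dbl <- typeof(c(NA, 1.5))
t_chr <- typeof(c(NA, "a"))

# A character NA forces FALSE to become the string "FALSE".
mixed <- c(FALSE, NA_character_)
t_mixed <- typeof(mixed)

print(c(t_na, t_int, t_dbl, t_chr, t_mixed))
```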
structure(1:5, comment = "my attribute")
#> [1] 1 2 3 4 5
But when you print that object you don't see the comment attribute. Why? Is the attribute missing, or is there something else special about it? (Hint: try using help.)
Jenny:
Apparently a comment attribute is a "thing" that has been anticipated in R; I guess we are expected to use it to annotate data frames and model fits. Who knew? Anyway, according to the help,
Contrary to other attributes, the comment is not printed (by print.default)
Susannah:
Attributes with names other than comment print fine. comment is a special case that is not printed.
Alathea:
from the help:
These functions set and query a comment attribute for any R objects. This is typically useful for data.frames or model fits. Contrary to other attributes, the comment is not printed (by print.default).
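A sketch showing that the attribute is stored but hidden from printing. The attribute name note is a made-up contrast case:

```r
x <- structure(1:5, comment = "my attribute")

# Printing shows only the values, not the comment.
print(x)

# The attribute is still there and can be read back.
stored <- attr(x, "comment")
via_comment <- comment(x)

# An attribute with any other name is printed as usual.
y <- structure(1:5, note = "my attribute")
print(y)

# Count the printed lines for each object to make the difference visible.
lines_x <- length(capture.output(print(x)))
lines_y <- length(capture.output(print(y)))
```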
f1 <- factor(letters)
levels(f1) <- rev(levels(f1))
Susannah:
Reversing the factor levels reversed both the vector and the levels.
Jenny:
When we first create f1, it will print as a b c ... and have levels a b c ... The above resetting of levels will not cause an error because all the original levels appear -- they're just in a new order. However even I, veteran of many factor battles, got caught out here somewhat. I knew that the levels of f1 would become z y x ... but I was surprised to see that f1 itself now printed as z y x ...
Jenny:
f2 will be a factor. It will print z y x ... but have levels a b c ...
Susannah:
Reversing the factor reversed the vector, but left the levels intact.
Jenny:
f3 will be a factor that prints a b c ... with levels z y x ...
Susannah:
This reversed the levels, but not the vector.
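The three factors can be compared side by side. The definitions of f2 and f3 do not appear in these notes; the sketch assumes the ones from the Advanced R exercise, rev(factor(letters)) and factor(letters, levels = rev(letters)). Looking at the integer codes explains the surprise with f1: assigning new levels relabels the codes without reordering them.

```r
f1 <- factor(letters)
levels(f1) <- rev(levels(f1))                 # relabel the existing codes

f2 <- rev(factor(letters))                    # assumed definition
f3 <- factor(letters, levels = rev(letters))  # assumed definition

# First three values, levels, and underlying integer codes of a factor.
peek <- function(f) {
  list(
    values = as.character(f)[1:3],
    levels = levels(f)[1:3],
    codes  = as.integer(f)[1:3]
  )
}

p1 <- peek(f1)  # values z y x, levels z y x, codes 1 2 3
p2 <- peek(f2)  # values z y x, levels a b c, codes 26 25 24
p3 <- peek(f3)  # values a b c, levels z y x, codes 26 25 24
str(list(f1 = p1, f2 = p2, f3 = p3))
```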
NULL
TRUE
because a matrix is an array
x1 <- array(1:5, c(1, 1, 5))
x2 <- array(1:5, c(1, 5, 1))
x3 <- array(1:5, c(5, 1, 1))
Andrew:
These are all 3D arrays: length(dim(x)) is 3 for each of them.
Jenny:
x1, x2, and x3 are all multi-dimensional (three-dimensional, actually) arrays, whereas 1:5 is an atomic vector. Granted, they each have only one dimension whose extent is greater than one, but that is still a totally different beast from an atomic vector. I think of them like so: x1 has 1 row, 1 column and 5 slices. x2 has 1 row, 5 columns and 1 slice. x3 has 5 rows, 1 column and 1 slice.
Susannah:
x1 is 1 row by 1 column by 5 deep.
x2 is 1 row by 5 columns by 1 deep.
x3 is 5 rows by 1 column by 1 deep.
They are different from 1:5 because of their dimensionality.
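A quick check of the dimensions described above, as a sketch in base R:

```r
x1 <- array(1:5, c(1, 1, 5))
x2 <- array(1:5, c(1, 5, 1))
x3 <- array(1:5, c(5, 1, 1))

# Each array carries a dim attribute of length 3.
dims <- list(x1 = dim(x1), x2 = dim(x2), x3 = dim(x3))
str(dims)

# The plain vector 1:5 has no dim attribute at all.
plain_dim <- dim(1:5)
print(is.null(plain_dim))

# Dropping the dim attribute recovers the same underlying data.
same_data <- identical(as.vector(x1), 1:5)
print(same_data)
```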
names (same as colnames), row.names, class
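These attributes can be listed on a small data frame. The column names and values here are arbitrary:

```r
df <- data.frame(x = 1:2, y = c("a", "b"))

# The attributes a data frame carries.
attr_names <- names(attributes(df))
print(attr_names)

# names() and colnames() return the same thing for a data frame.
same_names <- identical(names(df), colnames(df))
print(same_names)

print(class(df))
print(row.names(df))
```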
Coercion into the same type following the coercion rules from this chapter.
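A sketch of that coercion, assuming the answer refers to calling as.matrix() on a data frame whose columns have different types (the exercise text is not shown in these notes):

```r
# Integer and double columns: the matrix becomes double.
df_num <- data.frame(a = 1:2, b = c(1.5, 2.5))
m_num <- as.matrix(df_num)
print(typeof(m_num))

# Integer and character columns: the matrix becomes character.
df_mix <- data.frame(a = 1:2, b = c("x", "y"))
m_mix <- as.matrix(df_mix)
print(typeof(m_mix))

# The shape is preserved in both cases.
print(dim(m_num))
print(dim(m_mix))
```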
Yes, you can have a completely empty data frame.
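A few ways to build one, as a sketch covering zero rows, zero columns, or both:

```r
# No rows and no columns.
empty <- data.frame()
print(dim(empty))

# Three rows but no columns.
no_cols <- data.frame(row.names = 1:3)
print(dim(no_cols))

# One column but no rows.
no_rows <- data.frame(x = integer(0))
print(dim(no_rows))
```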