Data Structures

CSULB Intro to R

April 13, 2018

Agenda

  1. Vectors and Matrices
  2. Lists and Data Frames
  3. Type Coercion
  4. Troubleshooting

Data Types in R

Data Structures in R

  1. One-dimensional:
    • Vectors
    • Lists
  2. Multi-dimensional:
    • Matrices
    • Data frames

Vectors in R

numVec <- c(2, 3, 4)    # <- is the assignment operator
numVec
## [1] 2 3 4
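
A minimal sketch of common operations on a vector like the one above. The vector is re-created here so the lines run on their own.

```r
numVec <- c(2, 3, 4)   # same values as in the slide

length(numVec)         # number of elements: 3
numVec[2]              # second element: 3 (R indexes from 1, not 0)
numVec[-1]             # everything except the first element: 3 4
numVec * 2             # arithmetic is applied element-wise: 4 6 8
```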

Examples of Vectors

Examples of integer, character, and logical vectors:

intVec <- c(2L, 3L, 4L)
intVec
## [1] 2 3 4
charVec <- c("red", "green", "blue")
charVec
## [1] "red"   "green" "blue"
logVec <- c(TRUE, FALSE, FALSE, T, F)
logVec
## [1]  TRUE FALSE FALSE  TRUE FALSE
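
One way to confirm what kind of vector you have is typeof(). The sketch below re-creates the three vectors and checks each one.

```r
intVec  <- c(2L, 3L, 4L)                 # the L suffix requests integers
charVec <- c("red", "green", "blue")
logVec  <- c(TRUE, FALSE, FALSE, T, F)

typeof(intVec)         # "integer"
typeof(charVec)        # "character"
typeof(logVec)         # "logical"
typeof(c(2, 3, 4))     # "double": without the L suffix, numbers are doubles
is.numeric(intVec)     # TRUE: integers and doubles both count as numeric
```

T and F are ordinary variables that default to TRUE and FALSE and can be overwritten, so spelling out TRUE and FALSE is the safer habit in scripts.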

Matrices

myMat <- matrix(nrow = 2, ncol = 4)
myMat
##      [,1] [,2] [,3] [,4]
## [1,]   NA   NA   NA   NA
## [2,]   NA   NA   NA   NA
attributes(myMat)
## $dim
## [1] 2 4

Matrices

myMat <- matrix(1:8, nrow = 2, ncol = 4)
myMat # matrices are filled column-wise by default
##      [,1] [,2] [,3] [,4]
## [1,]    1    3    5    7
## [2,]    2    4    6    8
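
To fill a matrix row by row instead, matrix() accepts byrow = TRUE. A short sketch (the name rowFilled is only for illustration):

```r
rowFilled <- matrix(1:8, nrow = 2, ncol = 4, byrow = TRUE)
rowFilled
##      [,1] [,2] [,3] [,4]
## [1,]    1    2    3    4
## [2,]    5    6    7    8

rowFilled[2, 3]   # element in row 2, column 3: 7
rowFilled[, 1]    # whole first column: 1 5
```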

A matrix is a special case of a vector

myVec <- 1:8
myVec
## [1] 1 2 3 4 5 6 7 8
dim(myVec) <- c(2,4)
myVec
##      [,1] [,2] [,3] [,4]
## [1,]    1    3    5    7
## [2,]    2    4    6    8
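
A small sketch of what "special case of a vector" means in practice: the data stay in one column-major sequence, and only the dim attribute decides whether R treats them as a matrix.

```r
myVec <- 1:8
dim(myVec) <- c(2, 4)   # adding a dim attribute makes it a matrix

is.matrix(myVec)        # TRUE
myVec[5]                # 5: a single index walks the underlying vector
myVec[1, 3]             # 5: the same element, addressed by row and column

dim(myVec) <- NULL      # removing the attribute gives the plain vector back
is.matrix(myVec)        # FALSE
```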

Other Ways to Create a Matrix

vec1 <- 1:4
vec2 <- sample(1:100, 4, replace = FALSE)
vec3 <- sample(1:20, 4, replace=TRUE)
colMat <- cbind(vec1, vec2, vec3)
colMat
##      vec1 vec2 vec3
## [1,]    1   50   15
## [2,]    2   72   18
## [3,]    3   15    8
## [4,]    4   93   19

Other Ways to Create a Matrix

vec1 <- 1:4
vec2 <- sample(1:100, 4, replace = TRUE)
vec3 <- sample(1:20, 4, replace=FALSE)
rowMat <- rbind(vec1, vec2, vec3)
rowMat
##      [,1] [,2] [,3] [,4]
## vec1    1    2    3    4
## vec2   86   94   82   60
## vec3   14   16    3    7
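
Because sample() draws random numbers, the values above change on every run. The sketch below fixes the seed (2018 is an arbitrary choice) and compares the shapes that cbind() and rbind() produce from the same vectors.

```r
set.seed(2018)                              # arbitrary seed, for reproducibility only
vec1 <- 1:4
vec2 <- sample(1:100, 4, replace = FALSE)

colMat <- cbind(vec1, vec2)   # each vector becomes a column
rowMat <- rbind(vec1, vec2)   # each vector becomes a row

dim(colMat)                   # 4 2
dim(rowMat)                   # 2 4
colnames(colMat)              # "vec1" "vec2"
rownames(rowMat)              # "vec1" "vec2"
all(t(colMat) == rowMat)      # TRUE: one is the transpose of the other
```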

Factors

Sex <- rep(c("Female", "Male"), times = 3)
Sex
## [1] "Female" "Male"   "Female" "Male"   "Female" "Male"
SexFac1 <- factor(Sex)
SexFac1
## [1] Female Male   Female Male   Female Male  
## Levels: Female Male

Factors

levels(SexFac1)
## [1] "Female" "Male"
table(SexFac1)
## SexFac1
## Female   Male 
##      3      3

Factors

SexFac1 # levels are sorted alphabetically by default; the first level is the baseline (reference) level
## [1] Female Male   Female Male   Female Male  
## Levels: Female Male
SexFac2 <- factor(Sex, levels = c("Male", "Female"))
SexFac1
## [1] Female Male   Female Male   Female Male  
## Levels: Female Male
SexFac2
## [1] Female Male   Female Male   Female Male  
## Levels: Male Female
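
Under the hood a factor stores integer codes that point into its levels, so changing the level order changes the codes but not the printed values. A short sketch, which also shows relevel() as another way to pick the baseline:

```r
Sex <- rep(c("Female", "Male"), times = 3)
SexFac1 <- factor(Sex)                                 # levels: Female, Male
SexFac2 <- factor(Sex, levels = c("Male", "Female"))   # levels: Male, Female

as.integer(SexFac1)   # 1 2 1 2 1 2
as.integer(SexFac2)   # 2 1 2 1 2 1

# relevel() moves one level to the front without retyping all levels
SexFac3 <- relevel(SexFac1, ref = "Male")
levels(SexFac3)       # "Male" "Female"
```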

Lists

myVec <- c(10, "R", 5L, T)   # c() coerces mixed types to one common type (here character)
myVec
## [1] "10"   "R"    "5"    "TRUE"

Lists

myList <- list(10, "R", 5L, T)   # a list keeps each element's own type
myList
## [[1]]
## [1] 10
## 
## [[2]]
## [1] "R"
## 
## [[3]]
## [1] 5
## 
## [[4]]
## [1] TRUE
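
A brief sketch of how to get elements out of a list. The named list at the end uses made-up names (num, lang) purely for illustration.

```r
myList <- list(10, "R", 5L, TRUE)

sapply(myList, typeof)   # "double" "character" "integer" "logical"
myList[[2]]              # "R": double brackets return the element itself
class(myList[2])         # "list": single brackets return a shorter list

named <- list(num = 10, lang = "R")   # hypothetical names
named$lang               # "R": named elements can be reached with $
```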

Data Frames

studentID <- paste("S#", sample(6473:7392, 10), sep = "")
score <- sample(0:100, 10)
sex <- sample(c("female", "male"), 10, replace = TRUE)
data <- data.frame(studentID = studentID, score = score, sex = sex)
head(data)
##   studentID score    sex
## 1    S#7019    40 female
## 2    S#6968     9   male
## 3    S#7025    14   male
## 4    S#6972    73 female
## 5    S#7320    78   male
## 6    S#7279    79 female
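
A sketch of common ways to inspect and subset a data frame built the same way. The seed and the name students are arbitrary choices for this example; stringsAsFactors = FALSE is passed explicitly because R versions before 4.0 convert character columns to factors by default.

```r
set.seed(42)   # arbitrary seed so the sketch is reproducible
studentID <- paste("S#", sample(6473:7392, 10), sep = "")
score <- sample(0:100, 10)
sex <- sample(c("female", "male"), 10, replace = TRUE)
students <- data.frame(studentID = studentID, score = score, sex = sex,
                       stringsAsFactors = FALSE)

dim(students)                        # 10 3 (rows, columns)
names(students)                      # "studentID" "score" "sex"
sapply(students, class)              # each column has its own type
mean(students$score)                 # $ extracts one column as a vector
students[students$score > 50, ]      # keep only rows with score above 50
```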

Special Values

R has a few special numeric values, including Inf (infinity) and NaN (not a number):

intVec <- c(1L, 2L, 3L, 4L) 
intVec
## [1] 1 2 3 4
typeof(intVec)
## [1] "integer"
intVec*Inf
## [1] Inf Inf Inf Inf
a <- Inf; b <- 0
rslt <- c(b/a, a/a)
rslt
## [1]   0 NaN
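
Base R provides test functions for these values. A minimal sketch:

```r
x <- c(1, Inf, -Inf, 0/0)

is.finite(x)     # TRUE FALSE FALSE FALSE
is.infinite(x)   # FALSE TRUE TRUE FALSE
is.nan(x)        # FALSE FALSE FALSE TRUE

1/0              # Inf
Inf - Inf        # NaN
```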

Missing Values

a <- c(1,2)
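a[3]   # indexing past the end of a vector returns NA (missing)
## [1] NA
b <- 0/0   # undefined arithmetic yields NaN (not a number)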
b
## [1] NaN
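
NA and NaN are related but not the same. The sketch below shows how is.na() and is.nan() tell them apart, and how na.rm = TRUE lets summary functions skip missing values.

```r
x <- c(1, NA, 0/0)

is.na(x)    # FALSE TRUE TRUE: is.na() is also TRUE for NaN
is.nan(x)   # FALSE FALSE TRUE: is.nan() is TRUE only for NaN

mean(c(1, 2, NA))                 # NA: one missing value makes the result missing
mean(c(1, 2, NA), na.rm = TRUE)   # 1.5: missing values are dropped first
```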

Data Type Coercion

numCharVec <- c(3.14, "a")
numCharVec                 # What do you expect to be printed? 

numLogVec <- c(pi, T)
numLogVec                   

charLogVec <- c("a", TRUE)
charLogVec 
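
For reference, a sketch of how the three vectors above turn out. When types are mixed, c() converts everything to the most flexible type present: logical gives way to numeric, and numeric gives way to character.

```r
numCharVec <- c(3.14, "a")
numCharVec            # "3.14" "a": the number becomes a string
typeof(numCharVec)    # "character"

numLogVec <- c(pi, T)
typeof(numLogVec)     # "double": TRUE becomes 1
numLogVec[2]          # 1

charLogVec <- c("a", TRUE)
charLogVec            # "a" "TRUE": the logical becomes a string
```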

Data Type Coercion

numVec <- seq(from = 1200, to = 1300, by = 15)
numVec
## [1] 1200 1215 1230 1245 1260 1275 1290
numToChar <- as(numVec, "character")
numToChar
## [1] "1200" "1215" "1230" "1245" "1260" "1275" "1290"
numToChar==as.character(numVec)
## [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE
logVec <- c(F, T, F, T, T)
as(logVec, "numeric")
## [1] 0 1 0 1 1
as.numeric(logVec)
## [1] 0 1 0 1 1
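
Because TRUE counts as 1 and FALSE as 0, arithmetic functions coerce logical vectors automatically. A short sketch, with one example of converting strings to logical values:

```r
logVec <- c(F, T, F, T, T)

sum(logVec)     # 3: the number of TRUE values
mean(logVec)    # 0.6: the proportion of TRUE values

# Strings that are not recognised as logical values become NA
as.logical(c("TRUE", "false", "yes"))   # TRUE FALSE NA
```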

Data Type Coercion

charVec <- c("2.5", "3", "2.8", "1.5", "zero")
as(charVec, "numeric")
## Warning in asMethod(object): NAs introduced by coercion
## [1] 2.5 3.0 2.8 1.5  NA
charVec <- c("2.5", "3", "2.8", "1.5", zero)   # zero is unquoted, so R looks for an object named zero
## Error in eval(expr, envir, enclos): object 'zero' not found
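
One possible way to deal with the coercion warning is to convert first and then look up which entries failed. This is a sketch, not the only approach; the name converted is made up for the example.

```r
charVec <- c("2.5", "3", "2.8", "1.5", "zero")

# suppressWarnings() hides the "NAs introduced by coercion" warning
converted <- suppressWarnings(as.numeric(charVec))

charVec[is.na(converted)]        # "zero": the entry that could not be converted
mean(converted, na.rm = TRUE)    # 2.45: average of the values that did convert
```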

Troubleshooting

  1. Try to replicate the error. Knowing which inputs cause the error and which don’t is a useful clue.
  2. Narrow down where the error is occurring. This typically involves running the code line by line or block by block.
  3. Try fixing the error.
  4. Search for the error message online (for example, with Google).
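
The first two steps can be sketched with tryCatch(), which runs code and lets you decide what happens on a warning or an error. The helper safeLog is hypothetical and exists only for this example.

```r
# Hypothetical helper: returns NA instead of stopping when log() complains
safeLog <- function(x) {
  tryCatch(
    log(x),
    warning = function(w) {
      message("Warning caught: ", conditionMessage(w))
      NA
    },
    error = function(e) {
      message("Error caught: ", conditionMessage(e))
      NA
    }
  )
}

safeLog(10)    # works: returns log(10)
safeLog(-1)    # log(-1) raises a warning, so NA is returned
safeLog("a")   # log("a") raises an error, so NA is returned
```

Trying several inputs like this shows which ones trigger the problem. In an interactive session, calling traceback() right after an error lists the function calls that led to it.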

Summary

Up Next