Introduction

In this exercise, you will work with basic data structures in R: atomic vectors, factors, and data frames.

Part A

  1. Create a vector using vec1 <- c(1L, 2L, 3L). What type of vector is this? Verify your answer using typeof(vec1).
    • This is an integer vector.
  2. Run the following code: vec2 <- as.numeric(vec1). Compare vec1 and vec2 using typeof() and object.size(). What are the differences?
    • vec2 is a numeric vector, stored as type double. A double is a double-precision floating-point value: it can hold fractional parts and uses 8 bytes per element, whereas an integer uses 4 bytes. The extra precision therefore costs memory, so object.size(vec2) is larger than object.size(vec1).
  3. Run the code: vec2 <- vec2 + c(1, 1.5, 1.99). Before running it, predict the result of this line of code.
    • You should get 2 3.5 4.99, which R prints as 2.00 3.50 4.99.
  4. Coerce vec2 to integer-type. What happens to non-integers?
    • 2 3 4. as.integer(vec2) drops the fractional part, truncating toward zero. For the positive values here, that is the same as rounding down to the nearest whole number.
  5. Create another vector vec3 and assign the values "1L", "2L", "3L". What type of vector is this? Verify using typeof(). Coerce this to a numeric vector.
    • This is a character vector. The L suffix marks an integer only in R code, not inside a string, so as.numeric() and as.integer() cannot parse "1L" and return a vector of NAs. You should also receive the warning message: NAs introduced by coercion.
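
The Part A steps can be sketched in one short script. This is a minimal sketch, not the only way to check the answers: the extra vector c(-1.5, 1.5) and the sub() call that strips the "L" suffix are illustrative additions that are not part of the exercise.

```r
# Steps 1-2: integer versus double storage
vec1 <- c(1L, 2L, 3L)
typeof(vec1)                      # "integer"
vec2 <- as.numeric(vec1)
typeof(vec2)                      # "double"
# Compare the sizes as plain numbers; the double vector is the larger one
size_int <- as.numeric(object.size(vec1))
size_dbl <- as.numeric(object.size(vec2))

# Steps 3-4: element-wise addition, then coercion back to integer
vec2 <- vec2 + c(1, 1.5, 1.99)
vec2                              # 2.00 3.50 4.99
vec2_int <- as.integer(vec2)      # 2 3 4
# Illustrative extra: truncation goes toward zero, so -1.5 becomes -1, not -2
trunc_demo <- as.integer(c(-1.5, 1.5))

# Step 5: character strings that merely look like integer literals
vec3 <- c("1L", "2L", "3L")
typeof(vec3)                      # "character"
# suppressWarnings() hides "NAs introduced by coercion" for this demo only
vec3_num <- suppressWarnings(as.numeric(vec3))
# Illustrative extra: strip the trailing "L" before coercing
vec3_clean <- as.numeric(sub("L$", "", vec3))
```

The negative value is included because "rounding down" and "truncating toward zero" only differ for negative numbers.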

Part B

Open a new R script to write and save your code. You will need to re-use these results for Exercise 3!

  1. Create vectors using c() with the following attributes.
    1. A vector, name, with 10 names of your choosing. What type of vector is this?
      • Character
    2. A vector, female, with TRUEs and FALSEs indicating the sex of the people in your name vector. What type of vector is this? What is another way we could represent this information?
      • Logical. We could also use 1s and 0s to indicate TRUE and FALSE.
    3. A vector, edu, indicating the highest degree of education completed for the people in your name vector. Use "HS" for high school, "BS" for a bachelor’s degree, "MS" for master’s, and "PhD" for doctorate.

    4. A vector, salary, indicating the salary in 1,000s of dollars for the people in your name vector.

  2. Combine these vectors into a data.frame named data.
name <- c("Sally", "Michelle", "Lupe", "Wendy", "Maritza",
          "Jorge", "Sam", "Joe", "Glenn", "Buck")
female <- c(TRUE, TRUE, TRUE, TRUE, TRUE, 
            FALSE, FALSE, FALSE, FALSE, FALSE)
edu <- c("HS", "BS", "BS", "MS", "PhD", 
         "HS", "HS", "BS", "MS", "PhD")
salary <- c(30, 38, 53, 65, 89, 29, 33, 42, 107, 246)

data <- data.frame(name = name,
                   female = female,
                   edu = edu,
                   salary = salary)

data
##        name female edu salary
## 1     Sally   TRUE  HS     30
## 2  Michelle   TRUE  BS     38
## 3      Lupe   TRUE  BS     53
## 4     Wendy   TRUE  MS     65
## 5   Maritza   TRUE PhD     89
## 6     Jorge  FALSE  HS     29
## 7       Sam  FALSE  HS     33
## 8       Joe  FALSE  BS     42
## 9     Glenn  FALSE  MS    107
## 10     Buck  FALSE PhD    246
str(data)
## 'data.frame':    10 obs. of  4 variables:
##  $ name  : Factor w/ 10 levels "Buck","Glenn",..: 8 7 5 10 6 4 9 3 2 1
##  $ female: logi  TRUE TRUE TRUE TRUE TRUE FALSE ...
##  $ edu   : Factor w/ 4 levels "BS","HS","MS",..: 2 1 1 3 4 2 2 1 3 4
##  $ salary: num  30 38 53 65 89 29 33 42 107 246
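# name became a factor (stringsAsFactors = TRUE was the default before R 4.0); restore character
data$name <- as.character(data$name)
# Represent female as 1/0 instead of TRUE/FALSE
data$female <- as.numeric(data$female)
# Store edu as an ordered factor so the levels follow HS < BS < MS < PhD
data$edu <- factor(data$edu, levels = c("HS", "BS", "MS", "PhD"), ordered = TRUE)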

data
##        name female edu salary
## 1     Sally      1  HS     30
## 2  Michelle      1  BS     38
## 3      Lupe      1  BS     53
## 4     Wendy      1  MS     65
## 5   Maritza      1 PhD     89
## 6     Jorge      0  HS     29
## 7       Sam      0  HS     33
## 8       Joe      0  BS     42
## 9     Glenn      0  MS    107
## 10     Buck      0 PhD    246
str(data)
## 'data.frame':    10 obs. of  4 variables:
##  $ name  : chr  "Sally" "Michelle" "Lupe" "Wendy" ...
##  $ female: num  1 1 1 1 1 0 0 0 0 0
##  $ edu   : Ord.factor w/ 4 levels "HS"<"BS"<"MS"<..: 1 2 2 3 4 1 1 2 3 4
##  $ salary: num  30 38 53 65 89 29 33 42 107 246
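
As a short illustration of why the ordered factor and the 1/0 coding are useful, the sketch below rebuilds the edu, salary, and female columns from the values above. The comparison against "MS", the tapply() summary, and the variable names has_grad_degree, mean_salary, and female_lgl are illustrative additions, not part of the exercise.

```r
# Rebuild the relevant columns so this block runs on its own
edu <- factor(c("HS", "BS", "BS", "MS", "PhD", "HS", "HS", "BS", "MS", "PhD"),
              levels = c("HS", "BS", "MS", "PhD"), ordered = TRUE)
salary <- c(30, 38, 53, 65, 89, 29, 33, 42, 107, 246)
female <- c(1, 1, 1, 1, 1, 0, 0, 0, 0, 0)

# An ordered factor can be compared against one of its levels
has_grad_degree <- edu >= "MS"
sum(has_grad_degree)              # 4 people hold an MS or PhD

# Group summaries follow the level order HS, BS, MS, PhD
mean_salary <- tapply(salary, edu, mean)
mean_salary

# The 1/0 coding converts back to logical without loss
female_lgl <- as.logical(female)
```

With an unordered factor, the >= comparison would return NA with a warning, which is why ordered = TRUE matters here.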