Week 1: Introduction to R
This week we cover R installation and basics.
In-Class Links:
Demo Code:
Making notes:
# This is a note. Adding the # stops R from reading the line as code so you can make notes
Quick math in R:
2 + 2
[1] 4
Making variables:
x <- 2 + 2
y = 4
x == y
[1] TRUE
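Assignment and comparison are easy to mix up, so here is a minimal sketch of the difference. The names a and b are made up for illustration.

```r
# <- and = both assign a value to a name; == asks whether two values are equal
a <- 10 / 2   # a now holds 5
b = 5         # same effect as b <- 5
a == b        # comparison: returns TRUE or FALSE without changing a or b
a != b        # the opposite test: "not equal"
```

Most R style guides prefer <- for assignment, which also makes it harder to confuse with ==.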
my_name <- "Cameron"
print(my_name)
[1] "Cameron"
Printing multiple variables:
print(c("My name is", my_name))
[1] "My name is" "Cameron"
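Note that c() keeps the two pieces as separate strings. If you would rather have a single sentence, one option is the base R function paste(); the variable name greeting below is just for illustration.

```r
my_name <- "Cameron"
# paste() joins its arguments into one string, separated by a space by default
greeting <- paste("My name is", my_name)
print(greeting)
# [1] "My name is Cameron"
```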
You might also need to store a series of values in one variable. For this you will need to use the function c(), where c stands for concatenate. From Merriam-Webster, concatenate: to link together in a series or a chain.
my_list <- c(1, 2, 3, 4)
my_list
[1] 1 2 3 4
# You can also have R fill in numbers in a series for you.
my_long_list <- c(1:500)
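As a small sketch of other ways to build a series: the colon operator works on its own, and the base R function seq() lets you choose the step size. The name by_fives is made up for this example.

```r
# 1:500 already produces the series, so wrapping it in c() is optional
my_long_list <- 1:500
length(my_long_list)   # 500 values

# seq() builds a series with a custom step (here: every 5th number up to 50)
by_fives <- seq(from = 5, to = 50, by = 5)
by_fives
```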
Once you have a list (strictly speaking, c() creates what R calls a vector), you will probably want to access certain parts of it. In R, we do this using variable[location] formatting, where locations are counted starting from 1.
my_list[2]
[1] 2
my_long_list[78:146]
 [1]  78  79  80  81  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96
[20]  97  98  99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115
[39] 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134
[58] 135 136 137 138 139 140 141 142 143 144 145 146
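The location inside the brackets does not have to be a single number or a range. A short sketch of a few other forms it can take, using the same my_list as above:

```r
my_list <- c(1, 2, 3, 4)
my_list[c(1, 4)]       # several positions at once: 1 4
my_list[-1]            # a negative index drops that position: 2 3 4
my_list[my_list > 2]   # a logical test keeps only matching values: 3 4
```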
But most likely, you are going to have your data in a data frame. This is how you manually create one.
my_df <- data.frame(v1 = c(1, 3, 7, 9),
                    v2 = c(2, 5, 9, 8),
                    v3 = c(5, 7, 9, 3))
print(my_df)
  v1 v2 v3
1  1  2  5
2  3  5  7
3  7  9  9
4  9  8  3
Accessing parts of a dataframe:
Like using lists, you will often find yourself wanting to access specific parts of your data frame. You can do this using one of the following notation styles.
df$column_name
df[row, column]
df[row,] (leave out the column id and it will return the whole row)
df[,column] (leave out the row id and it will return the whole column)
my_df$v1
[1] 1 3 7 9
my_df[2,3]
[1] 7
my_df[2,]
  v1 v2 v3
2  3  5  7
my_df[,3]
[1] 5 7 9 3
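Rows and columns can also be selected by name or by a condition instead of by position. A minimal sketch, rebuilding the same my_df so the example runs on its own:

```r
my_df <- data.frame(v1 = c(1, 3, 7, 9),
                    v2 = c(2, 5, 9, 8),
                    v3 = c(5, 7, 9, 3))

my_df[2, "v3"]           # row 2 of the column named v3: 7
my_df[, c("v1", "v3")]   # two columns by name, all rows
my_df[my_df$v1 > 2, ]    # only the rows where v1 is greater than 2
```

Using column names rather than numbers tends to be safer, because the code keeps working if the column order changes.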
Functions
A function is a set of R code that has been predefined to do a task for you. In this example, we take the mean of the variable v1 in my_df using the R function “mean.” Then we take the mean again, manually.
mean(my_df$v1)
[1] 5
sum(my_df$v1) / nrow(my_df)
[1] 5
#Maybe not that much more difficult to do it manually, but you would need to know how to use the sum() and nrow() functions. You can imagine that doing larger tasks with a single function can make things a lot easier.
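To make the idea of "predefined code" concrete, here is a sketch of wrapping the manual calculation in a function of your own. The name my_mean is hypothetical and chosen only for this example; it is not part of R.

```r
# my_mean is a made-up name for illustration; it is not a built-in R function
my_mean <- function(values) {
  # add everything up, then divide by how many values there are
  sum(values) / length(values)
}

v1 <- c(1, 3, 7, 9)
my_mean(v1)   # 5, the same answer as mean(v1)
```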
But, what if I have a function and I don’t know how to use it or what it does?
help(mean)
?mean
# or if you don't remember the name...
??sum
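Help pages also list the arguments a function accepts. For example, the help page for mean describes an na.rm argument, which the short sketch below uses; the vector values is made up for illustration.

```r
values <- c(1, NA, 3)
mean(values)                 # NA, because one of the values is missing
mean(values, na.rm = TRUE)   # 2, the missing value is dropped before averaging
```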
Packages
R has many additional functions that are organized in packages. These can be hosted on CRAN, Bioconductor, GitHub, etc. For this bootcamp, you only need to be familiar with CRAN packages. They can be installed using the following code or, in RStudio, by clicking Packages > Install.
Note: to update a package you just re-run the installation code.
#install.packages("ggplot2")
Activity:
Install the following packages into your local R environment:
- tidyverse
- ggplot2
- readr
- readxl
- stringr
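One possible way to approach the activity is sketched below. It checks which of the listed packages are missing and installs only those. The variable names wanted and missing_pkgs are made up for this example, and the interactive() check is an added assumption so that nothing is installed when the script is run non-interactively.

```r
# Package names are case-sensitive: "tidyverse" is all lowercase
wanted <- c("tidyverse", "ggplot2", "readr", "readxl", "stringr")

# Work out which of them are not installed yet
missing_pkgs <- setdiff(wanted, rownames(installed.packages()))
missing_pkgs

# Install only what is missing (skipped when R is not running interactively)
if (interactive() && length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}
```

After installing, each package is loaded into a session with library(), for example library(ggplot2).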