- Review Basic Operation
- Data Types & Data Structures
- Statistics
- Open RStudio Desktop
- Set the working directory (see 01-Intro-R)
- Create a script file named 02_63130500xxx.R and save it in your working directory
- Open the progress form for Class 02 and check Study Check-In and 0 prerequisite
# create variable
x <- 1
y <- 2
x+y
print(x + y)
z <- x+y
z
We will use 3 basic data types:
- Numeric: Any number with or without a decimal point. Example - 12.3, 5, 999
- Logical: Two possible values, either TRUE or FALSE
- Character: Any grouping of characters on your keyboard (letters, numbers, spaces, symbols, etc.), i.e. text. Example - "Hello World"
# Numeric ex 1, 1.0
varA <- 100
class(varA)
# Logical: TRUE, FALSE
varB <- TRUE
class(varB)
class(1==2)
# Character
varC <- "Hello, My name is Safe"
class(varC)
The main data structures that we will learn in this course are Vector, List, and Data Frame.
A vector is the simplest data structure: a single entity consisting of an ordered collection of values of the same type. We use the c() function, which concatenates (combines) the elements into a vector.
# Character Vectors
c("Ant","Bird","Cat")
# Logical Vectors
c(TRUE,FALSE,TRUE)
# Numeric Vectors
c(100,245,305,411,555)
- Use class() to see the class or type of an object
- Use length() to get the length of a vector
- You can get the element at position x with variable[x]; positions start at 1
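The three helpers above can be tried on a small vector. This is only a sketch; the vector fruits is a hypothetical example and is not part of the class material.

```r
# Hypothetical example vector (not from the class material)
fruits <- c("apple", "banana", "cherry")

class(fruits)    # "character"
length(fruits)   # 3
fruits[1]        # "apple"  (the first element is at position 1, not 0)
fruits[3]        # "cherry"
fruits[c(1, 3)]  # "apple" "cherry"  (several positions at once)
```

Passing a vector of positions, as in the last line, returns several elements in one call.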
# Create Variable name
v1 <- c(1,2,3,4,5)
v2 <- c(6,7,8,9,10)
# Replicate
v3 <- rep(c(1,2,3),5)
# Creating integer sequences
v4 <- c(1:100)
You can use +, -, *, and / with vectors. The operation is applied element by element.
v1+v2
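As a sketch of how element-wise arithmetic behaves, the block below repeats the definitions of v1 and v2 so it can be run on its own. The last line, multiplying by a single number, is an extra illustration that is not in the class notes.

```r
v1 <- c(1, 2, 3, 4, 5)
v2 <- c(6, 7, 8, 9, 10)

v1 + v2   # 7 9 11 13 15
v1 - v2   # -5 -5 -5 -5 -5
v1 * v2   # 6 14 24 36 50
v1 / v2   # each element of v1 divided by the matching element of v2
v1 * 2    # 2 4 6 8 10  (the single value 2 is applied to every element)
```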
A list is an ordered collection of objects (components). A list allows you to gather a variety of (possibly unrelated) objects under one name.
# Initial
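name <- c("Antony","Boky","Caty")
age <- c(10,25,30)
club <- c("Sec A","Sec B","Sec A")
# Use TRUE/FALSE rather than T/F: T and F are ordinary variables that can be overwritten
retired <- c(TRUE,FALSE,TRUE)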
# Create list
myList <- list(name,age,club,retired)
# Or assign name
myList <- list(stdName = name,
stdAge = age,
stdClub = club,
retired = retired)
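Once a list has names, its components can be pulled out again. The sketch below rebuilds a shorter version of myList (only two of the four components, to keep it brief) and shows a few common ways to access it.

```r
name <- c("Antony", "Boky", "Caty")
age <- c(10, 25, 30)
myList <- list(stdName = name, stdAge = age)

names(myList)        # "stdName" "stdAge"
length(myList)       # 2  (number of components, not number of students)
myList$stdName       # "Antony" "Boky" "Caty"
myList[["stdAge"]]   # 10 25 30
myList[[1]][2]       # "Boky"  (second element of the first component)
```

Note that single brackets, such as myList[1], return a smaller list, while double brackets and $ return the component itself.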
Data frames are tabular data objects. Each column can hold a different type: for example, the first column can be numeric, the second character, and the third logical. A data frame is a list of vectors of equal length.
Data Frames are created using the data.frame() function.
continent <- c("Africa","Asia","Europe","North America","Oceania","South America","Antarctica")
countries <- c(54,48,51,23,14,12,0)
world <- data.frame(continent,countries)
View(world)
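View() opens a viewer window in RStudio. As a sketch of how the same data frame can be inspected from the console instead, the block below uses a shortened version of world (first three continents only). stringsAsFactors = FALSE is set explicitly here as an assumption, so that the text column stays character on older R versions as well.

```r
continent <- c("Africa", "Asia", "Europe")
countries <- c(54, 48, 51)
world <- data.frame(continent, countries, stringsAsFactors = FALSE)

nrow(world)        # 3  (number of rows)
ncol(world)        # 2  (number of columns)
str(world)         # column names and types
world$countries    # 54 48 51  (one column as a vector)
world[2, ]         # the second row
world[world$countries > 50, "continent"]   # "Africa" "Europe"
```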
x <- c(1:10)
mean(x)
sum(x)
# Summaries
summary(x)
# Help function: put the function name after ? or inside help()
?mean
help("mean")
# Useful functions
length(x) # number of elements or components
str(x) # structure of an object
class(x) # class or type of an object
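Besides mean() and sum(), base R has functions for the other common summary statistics. The sketch below applies them to the same x as above. The approximate values in the comments are rounded.

```r
x <- c(1:10)

median(x)   # 5.5
min(x)      # 1
max(x)      # 10
var(x)      # sample variance, about 9.17
sd(x)       # sample standard deviation, about 3.03 (the square root of var(x))
```

Both var() and sd() use the sample formula, dividing by n - 1 rather than n.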
Study More... https://www.statmethods.net/input/datatypes.html
Create a script file named HW01_63130500xxx.R, do the exercises in this file, and write your answers as comments. Example:
# Example 0
x <- 1
y <- 2
print(x+y) #3
When you have finished, submit the file in LEB2.
Find the average, sum, median, standard deviation, and variance of 10.4, 5.6, 3.1, 6.4, 21.7
2.1. Create a data structure in a variable named marvel_movies and explain why you chose this data structure.
# Marvel movies (ordered by release across the Marvel phases)
names <- c("Iron Man","The Incredible Hulk","Iron Man 2","Thor","Captain America: The First Avenger",
"The Avengers","Iron Man 3","Thor: The Dark World","Captain America: The Winter Soldier",
"Guardians of the Galaxy","Avengers: Age of Ultron","Ant-Man","Captain America: Civil War",
"Doctor Strange","Guardians of the Galaxy 2","Spider-Man: Homecoming","Thor: Ragnarok","Black Panther",
"Avengers: Infinity War","Ant-Man and the Wasp","Captain Marvel","Avengers: Endgame",
"Spider-Man: Far From Home","WandaVision","Falcon and the Winter Soldier","Loki","Black Widow")
# Release year of each Marvel movie
years <- c(2008,2008,2010,2011,2011,2012,2013,2013,2014,2014,2015,2015,2016,2016,
2017,2017,2017,2017,2018,2018,2019,2019,2019,2021,2021,2021,2021)
# Or build the same vector with rep()
years <- c(2008,2008,2010,2011,2011,2012,rep(2013:2016,each=2),
rep(2017,4),rep(2018,2),rep(2019,3),rep(2021,4))
2.2. Find the following information:
- The number of movies
- The name of the 19th movie
- Which year had the most movie releases? (For this question you may answer by inspecting the data; a command is not required.)