Questions will be listed on this page and also in the midterm.R script.
Answers and R code should be filled out in the R script midterm.R.
Every sentence or word in the R script that is not R code should be preceded by the # symbol.
That way, when I run your script, the text that is not R code won’t produce an error.
Do not comment out R code / queries if that is part of your answer.
I am going to run all the lines of code to see if the output is correct and answers the question.
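As a rough illustration of the commenting rule (this is not one of the exam questions, and it uses R's built-in mtcars data set rather than any of the midterm data), a script laid out like this should run from top to bottom without errors:

```r
# Illustrative question: How many rows are in the built-in mtcars data frame?
# Written answer: mtcars has 32 rows.
# The lines below are R code, so they are NOT commented out.
row_count <- nrow(mtcars)
row_count
```

The written answer sits behind # symbols, while the code that produces the answer is left uncommented so it runs.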
There are two data sets on a similar topic: traffic stop data in San Diego and in Little Rock.
Make the case that one data set is better than the other.
Provide 3 concrete examples of stories that are possible to research and tell because of the better data set’s structure.
This is the only question in which you will be allowed to browse outside this page.
Assign it the name parkingtix
http://andrewbatran.com/ccsu-2017/assets/data/2013.csv
This is a data set from New York City of every parking ticket issued in 2013.
What is the structure of the data frame?
...(parkingtix)
parkingtix %>%
summarize(count=n())
Which are the 5 most-ticketed types of cars in New York City?
What am I doing wrong? Fix the query below.
parkingtix %>%
groupby(Vehicle_Make) %>%
summarize(total=n()) %>%
arrange(total) %>%
head(5)
Convert the Fine column so it’s numeric. Use the gsub() function and the as.numeric() function.
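As a hedged sketch of how these two functions fit together, here is an example on a made-up character vector. The values and the names fine_text and fine_number are hypothetical, and the actual column in the data set may be formatted differently:

```r
# Hypothetical sample values, NOT taken from the parking ticket data
fine_text <- c("$115", "$65", "$1,250")

# gsub() removes every dollar sign and comma from each string,
# then as.numeric() converts the cleaned strings to numbers
fine_number <- as.numeric(gsub("[$,]", "", fine_text))
fine_number
```

Inside the square brackets, the $ is treated as a literal character, so the pattern matches dollar signs and commas only.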
Create a data frame called avgboro that has the average and total amount of fines per Boro.