% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/get_gescrss.R
\name{get_gescrss}
\alias{get_gescrss}
\title{Get GES/CRSS data}
\usage{
get_gescrss(
  years = 2014:2023,
  regions = c("mw", "ne", "s", "w"),
  source = c("zenodo", "nhtsa"),
  proceed = FALSE,
  dir = NULL,
  cache = NULL
)
}
\arguments{
\item{years}{Years to be downloaded, in yyyy format (character or numeric).
Defaults to 2014:2023.}

\item{regions}{(Optional) Regions to keep: mw=midwest, ne=northeast, s=south, w=west.}

\item{source}{The source of the data. 'zenodo' (the default) pulls the prepared
dataset from \href{https://zenodo.org/records/17162674}{Zenodo}; 'nhtsa'
pulls the raw files from NHTSA's FTP site and prepares them on your machine.
'zenodo' is much faster and provides the same dataset as the one produced with source = 'nhtsa'.}

\item{proceed}{Logical. Whether to proceed with downloading files without
asking for user permission. Defaults to FALSE, in which case permission is requested.}

\item{dir}{Directory in which to search for or save a 'GESCRSS data' folder. If
NULL (the default), files are downloaded and unzipped to temporary
directories and prepared in memory. Ignored if source = 'zenodo'.}

\item{cache}{The name of an RDS file to save or use. If the specified file (e.g., 'myGESCRSS.rds')
exists in 'dir', it will be returned; if not, an RDS file of this name will be
saved in 'dir' for quick use in subsequent calls. Ignored if source = 'zenodo'.}
}
\value{
A GESCRSS data object (a list with six tibbles: flat, multi_acc,
    multi_veh, multi_per, events, and codebook).
}
\description{
Bring GES/CRSS data into the current environment, whether by downloading it anew
    or by using pre-existing files.
}
\details{
This function provides the GES/CRSS database for the specified years and regions.
   By default, it pulls from a Zenodo repository for speed and memory efficiency.
   It can also pull the raw files from \href{https://www.nhtsa.gov/file-downloads?p=nhtsa/downloads/}{NHTSA} and process them in memory, or
   use an RDS file saved on your machine.

   If source = 'nhtsa' and no directory (dir) is specified, SAS files are downloaded into
   tempdir(), where they are also prepared, combined, and then brought into
   the current environment. If you specify a directory (dir), the function will
   look there for a 'GESCRSS data' folder. If the folder is not found, it will be
   created and populated with raw and prepared SAS and RDS files. If it is found,
   the function makes sure all requested years are present and asks permission
   to download any missing years.

   The object returned is a list with class 'GESCRSS'. It contains six tibbles:
   flat, multi_acc, multi_veh, multi_per, events, and codebook.

   Flat files are wide-formatted and presented at the person level.
   All \emph{crashes} involve at least one motor \emph{vehicle}, each of
   which may contain one or multiple \emph{people}. These are the three
   entities of crash data. The flat files therefore repeat some data elements
   across multiple rows. Please conduct your analysis with your entity in mind.

   Some data elements can include multiple values for any data level
   (e.g., multiple weather conditions corresponding to the crash, or multiple
   crash factors related to vehicle or person). These elements have been
   collected in the yyyy_multi_[acc/veh/per].rds files in long format.
   These files contain crash, vehicle, and person identifiers, and two
   variables labelled \code{name} and \code{value}. These correspond to
   variable names from the raw data files and the corresponding values,
   respectively.

   The events tibble provides a sequence of events for all vehicles involved
   in the crash. See the Crash Sequences vignette for an example.

   The codebook tibble serves as a searchable codebook for all files of any given year.

   Please review the \href{https://crashstats.nhtsa.dot.gov/Api/Public/ViewPublication/813707}{CRSS Analytical User's Manual}.

   Regions are as follows:
   \itemize{
     \item mw = Midwest = OH, IN, IL, MI, WI, MN, ND, SD, NE, IA, MO, KS
     \item ne = Northeast = PA, NJ, NY, NH, VT, RI, MA, ME, CT
     \item s = South = MD, DE, DC, WV, VA, KY, TN, NC, SC, GA, FL, AL, MS, LA, AR, OK, TX
     \item w = West = MT, ID, WA, OR, CA, NV, NM, AZ, UT, CO, WY, AK, HI
   }
}
\examples{

  \dontrun{
    myGESCRSS <- get_gescrss(years = 2021, regions = "s")
  }
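
  \dontrun{
    # A minimal sketch of inspecting the returned object, assuming the call
    # above succeeded. Element names are those listed under Value; the
    # columns inside each tibble are not assumed here.
    names(myGESCRSS)

    # The flat tibble is presented at the person level
    nrow(myGESCRSS$flat)

    # The multi_* tibbles are in long format with name and value columns
    head(myGESCRSS$multi_acc)

    # Sketch of the caching path described under dir and cache, which applies
    # only when source = "nhtsa". The file name below is hypothetical.
    cached <- get_gescrss(
      years = 2021,
      regions = "s",
      source = "nhtsa",
      proceed = TRUE,
      dir = tempdir(),
      cache = "gescrss_2021_s.rds"
    )
  }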
}
