Setting Up Your Environment
You are reading Tidy Finance with R. You can find the equivalent chapter for the sibling Tidy Finance with Python here.
We aim to lower the bar for starting empirical research in financial economics, and we want using R to be easy for you. However, given that Tidy Finance is a platform that supports multiple programming languages, we also consider the possibility that you are not familiar with R at all. Hence, we provide a simple guide to get you started with R and RStudio. If you have not used R before, you will be able to use it after reading this chapter.
The R language
Some good news first: The software you need is free and easy to download. We will start with downloading and installing R and follow up with doing the same for RStudio.
R is provided via The Comprehensive R Archive Network (CRAN for short). CRAN provides not only the main software but also nearly all extensions that you need. We will cover these extensions, called packages, later, as we usually visit the CRAN website only to download the base version. Now, go ahead and visit CRAN. On the landing page, you can choose your operating system (i.e., Linux, macOS, or Windows). Click the link that fits your system:
- R comes as a part of many Linux distributions. If it does not, CRAN provides installation guides for individual Linux distributions.
- For macOS, the choice currently depends on some hardware specifications, but the right version for your system is clearly indicated.
- For Windows, you want to use the base version provided.
After downloading and installing the software on your system, you are nearly ready to go. In fact, you could just use R now. Unfortunately for many users, R is not a program but a programming language, and it comes with an interpreter that you would use like a command line. While using R like this might make you feel like a hacker (note that we do not endorse any criminal activity), it is in your best interest to combine R with RStudio.
R is constantly being updated, with new versions being released multiple times a year. This means that you might want to return to CRAN in the future to fetch yourself an update. You know it is time for an update if packages remind you that you are using an outdated version of R.
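To see which version of R you are currently running, you can ask R itself. The following is a minimal sketch; the comparison value 4.4.1 is the version named in the colophon of this chapter, and treating it as a threshold is our assumption for illustration only.

```r
# Print the full version string of the running R installation.
print(R.version.string)

# Assumption for illustration: compare against the version from the colophon.
reference_version <- "4.4.1"

# getRversion() returns a version object that can be compared with a string.
is_recent_enough <- getRversion() >= reference_version
print(is_recent_enough)
```

If the comparison returns FALSE, a visit to CRAN for a newer release might be worthwhile.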
RStudio
Assuming you are looking for a more comfortable way of using R, you will get RStudio next. You can download it for free from Posit (i.e., the company that created RStudio, which was previously called RStudio itself). When you follow the instructions, you will see that Posit asks you to install R. However, you should have done that already and can move straight to downloading and installing RStudio.
RStudio is a program similar to others you most likely use, such as a browser or a text editor. It comes with many advantages, including a project manager, GitHub integration, and much more. Unfortunately, elaborating on these possibilities or introducing the basics of programming is beyond the scope of Tidy Finance, but we point you to some excellent resources below. For the purposes of this book, you have completed your excursions to websites that provide you with the necessary software installers.
R Packages and Environments
Following your read of the preface to this book, you might now wonder why we did not download the tidyverse yet. To answer that, you must understand one more concept, namely packages in R. You can think of them as extensions that you use for specific purposes, whereas R itself is the core pillar upon which everything rests. Conveniently, you can install packages within R with the following code.
install.packages("tidyverse")
Simply specify the package you want where we placed tidyverse. You typically only need to install packages once, except for updates or project-specific R environments. Once installed, you can load a package with a call to library(tidyverse) to use it.
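Because installation is usually a one-time step, a script can check whether a package is already available before installing it. The sketch below is one possible way to do this; missing_packages() is a hypothetical helper that we introduce for illustration and is not part of R or the tidyverse.

```r
# Hypothetical helper, not part of the book: return the subset of the
# requested packages that cannot be found in the current library paths.
missing_packages <- function(packages) {
  installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
  packages[!installed]
}

to_install <- missing_packages(c("tidyverse"))

# Install only what is missing, then attach the package for this session.
if (length(to_install) > 0) {
  install.packages(to_install)
}
library(tidyverse)
```

The call to library() is still needed in every new session, whereas the installation step only runs when the package is absent.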
To keep track of the packages’ versions and make our results replicable, we rely on the package renv. It creates a project-specific installation of R packages, and you can find the full list of packages used here in the colophon below. The recorded package versions can also be shared with collaborators to ensure consistency. Our use of renv also makes it easier for you to install the exact package versions we were using (if you want that) by initializing renv with our renv.lock-file from GitHub.
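If you want to reproduce a recorded set of package versions, one possible workflow is sketched below. It assumes that you have saved a renv.lock-file in your project directory and that renv is installed; restore_from_lockfile() is a hypothetical wrapper that we introduce for illustration, while renv::restore() is the function that performs the installation.

```r
# Hypothetical helper, not part of the book: restore the package versions
# recorded in a lockfile, but only if the lockfile actually exists.
restore_from_lockfile <- function(project = ".") {
  lockfile <- file.path(project, "renv.lock")
  if (!file.exists(lockfile)) {
    stop("No renv.lock found in ", normalizePath(project), call. = FALSE)
  }
  # renv::restore() installs the versions listed in the lockfile into the
  # project library; prompt = FALSE skips the interactive confirmation.
  renv::restore(project = project, lockfile = lockfile, prompt = FALSE)
}

# Typical use, run from the project root after saving renv.lock there:
# install.packages("renv")
# restore_from_lockfile()
```

The explicit check for the lockfile is a design choice of this sketch: it produces an early and readable error message instead of relying on the behavior of renv when no lockfile is present.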
One more piece of advice is the use of RStudio projects. They are a powerful tool to save you some time and make working with R more fun. Without going into more detail here, we refer you to Wickham, Çetinkaya-Rundel, and Grolemund (2023)’s chapter on Workflow: scripts and projects.
Your First Steps with R
While we believe that downloading and installing R and RStudio is sufficiently easy, you might find help from Grolemund (2014) on R and RStudio, packages, as well as updating the software.
Giving you an introduction to R itself is beyond this book’s scope and not our comparative advantage. However, we can point you to a possible path that you could follow to familiarize yourself with R. We make the following suggestions:
- If you are new to R itself, a very gentle and good introduction to the workings of R can be found in Grolemund (2014). He provides a wonderful example in the form of the weighted dice project. Once you are done setting up R on your machine, try to follow the instructions in this project.
- The main book on the tidyverse, Wickham, Çetinkaya-Rundel, and Grolemund (2023), is available online and for free: R for Data Science explains the majority of the tools we use in our book. Working through this text is an eye-opening experience and really useful.
Additional resources we can encourage you to use are the following:
- If you are an instructor looking to teach R and data science methods effectively, we recommend taking a look at the excellent data science toolbox by Mine Çetinkaya-Rundel.
- RStudio provides a range of excellent cheat sheets with extensive information on how to use the tidyverse packages.
Creating Environment Variables
If you plan to share your own code with collaborators or the public, you may encounter the situation that your projects require sensitive information, such as login credentials, that you don’t want to publish. Environment variables are widely used in software development projects because they provide a flexible and secure way to configure applications and store secrets. In later chapters, we use such environment variables to store private login data for a remote database.
You can use .Renviron-files to store environment variables. Upon startup, R and RStudio look for .Renviron-files in your home and project directory, so .Renviron-files can be either at the user or project level. If there is a project-level .Renviron, the user-level file will not be sourced. A simple way to create your own .Renviron-file is the function usethis::edit_r_environ().
usethis::edit_r_environ(scope = "project")
This command will open your .Renviron-file and you can add variables. For the purpose of this book, we create and save the following variables (where user and password are our private login credentials):
WRDS_USER=user
WRDS_PASSWORD=password
After you have restarted your RStudio session, you can access these environment variables via Sys.getenv() in future sessions of the specific project or user.
Sys.getenv("WRDS_USER")
Sys.getenv("WRDS_PASSWORD")
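Sys.getenv() returns an empty string when a variable is not set, which can lead to confusing errors later on. The sketch below shows one way to fail early instead. It uses a temporary file with placeholder credentials and base R's readRenviron() to load it into the running session; get_required_env() is a hypothetical helper introduced for illustration.

```r
# Write a throwaway environment file with placeholder credentials.
env_file <- tempfile(fileext = ".Renviron")
writeLines(c("WRDS_USER=user", "WRDS_PASSWORD=password"), env_file)

# readRenviron() loads the variables into the current session without a restart.
readRenviron(env_file)

# Hypothetical helper: return the value of a variable or stop with a clear
# message if it is unset or empty.
get_required_env <- function(name) {
  value <- Sys.getenv(name, unset = NA)
  if (is.na(value) || !nzchar(value)) {
    stop("Environment variable ", name, " is not set.", call. = FALSE)
  }
  value
}

wrds_user <- get_required_env("WRDS_USER")
```

In a real project, you would point readRenviron() at your own .Renviron-file or simply restart the session as described above.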
Note that you can also store other login credentials, API keys, or file paths in the same environment file.
If you use version control, then you should make sure that the .Renviron-file is included in your .gitignore. The following code line opens the project's .gitignore so that you can add .Renviron to it.
usethis::edit_git_ignore(scope = "project")
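If you prefer to add the entry from code rather than by hand, the following base R sketch appends a line to a .gitignore only when it is missing. add_to_gitignore() is a hypothetical helper that we introduce for illustration, and the example writes to a temporary file so that no real project is touched.

```r
# Hypothetical helper, not part of usethis: append an entry to a .gitignore
# file unless that exact line is already present.
add_to_gitignore <- function(entry, path = ".gitignore") {
  existing <- if (file.exists(path)) readLines(path, warn = FALSE) else character(0)
  already_listed <- entry %in% trimws(existing)
  if (!already_listed) {
    writeLines(c(existing, entry), path)
  }
  invisible(!already_listed)  # TRUE if the file was changed
}

# Demonstrate on a temporary file.
demo_gitignore <- tempfile(fileext = ".gitignore")
add_to_gitignore(".Renviron", demo_gitignore)
add_to_gitignore(".Renviron", demo_gitignore)  # second call changes nothing
print(readLines(demo_gitignore))
```

To our knowledge, usethis::use_git_ignore(".Renviron") serves a similar purpose within an active project, so the helper above is mainly useful if you want to avoid the extra dependency.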
Colophon
This book was written in RStudio using bookdown (Xie 2016). The website was rendered using quarto (Allaire et al. 2022) and it is hosted via GitHub Pages. The complete source is available from GitHub. We generated all plots in this book using ggplot2 and its classic dark-on-light theme (theme_bw()).
This version of the book was built with R (R Core Team 2022) version 4.4.1 (2024-06-14, Race for Your Life) and the following packages:
Package | Version |
---|---|
RPostgres | 1.4.5 |
RSQLite | 2.3.1 |
broom | 1.0.5 |
brulee | 0.3.0 |
dbplyr | 2.5.0 |
dplyr | 1.1.4 |
fixest | 0.11.1 |
forcats | 1.0.0 |
frenchdata | 0.2.0 |
furrr | 0.3.1 |
ggplot2 | 3.4.3 |
glmnet | 4.1-8 |
hardhat | 1.3.0 |
hexSticker | 0.4.9 |
httr2 | 1.0.3 |
jsonlite | 1.8.8 |
kableExtra | 1.3.4 |
lmtest | 0.9-40 |
lubridate | 1.9.3 |
nloptr | 2.1.1 |
purrr | 1.0.2 |
ranger | 0.15.1 |
readr | 2.1.4 |
renv | 1.0.3 |
rlang | 1.1.3 |
rmarkdown | 2.21 |
sandwich | 3.0-2 |
scales | 1.2.1 |
slider | 0.3.1 |
stringr | 1.5.0 |
tibble | 3.2.1 |
tidyfinance | 0.4.1 |
tidymodels | 1.1.0 |
tidyr | 1.3.1 |
tidyverse | 2.0.0 |
timetk | 2.8.3 |
torch | 0.11.0 |
wesanderson | 0.3.6 |
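To check how your own installation compares with the table above, you can query the installed versions from within R. The sketch below does this for three packages copied from the table; installed_version() is a hypothetical helper introduced for illustration, and the selection of packages is arbitrary.

```r
# A few package versions copied from the table above.
expected <- c(dplyr = "1.1.4", ggplot2 = "3.4.3", tidyr = "1.3.1")

# Hypothetical helper: installed version as text, or NA if not installed.
installed_version <- function(pkg) {
  tryCatch(
    as.character(utils::packageVersion(pkg)),
    error = function(e) NA_character_
  )
}

comparison <- data.frame(
  package = names(expected),
  expected = unname(expected),
  installed = vapply(names(expected), installed_version, character(1)),
  row.names = NULL
)

# A package matches only if it is installed in exactly the listed version.
comparison$matches <- !is.na(comparison$installed) &
  comparison$installed == comparison$expected

print(comparison)
```

A mismatch is not necessarily a problem, but it can be a useful first hint if your results differ from ours.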