Chapter 6 Basic organization

6.1 Download the folder structure

The next parts of this example use a specific file and folder structure, which you can download using the following instructions:

  1. First, download the example_project.zip file located here.
  2. Second, unzip the file. It will expand into a series of nested folders with the following structure:
## -- R
##    |__analysis_report.Rmd
##    |__functions.R

For now there are two folders:

  • data: Where our dataset will be stored
  • R: Where we store scripts, e.g. one that defines our custom functions
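If you prefer to do the unzip step from R rather than your file browser, here is a minimal sketch. It assumes the zip file was saved to your current working directory under the name example_project.zip and that it expands into a folder called example_project; adjust both names if your download differs.

```r
# Assumed location of the download: the current working directory.
zip_path <- "example_project.zip"

if (file.exists(zip_path)) {
  # unzip() comes from the utils package, which R attaches by default.
  unzip(zip_path)
  # List what was extracted (assumes the archive expands into example_project/).
  print(list.files("example_project", recursive = TRUE))
} else {
  message("Download example_project.zip first; looked in: ", getwd())
}
```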

Earlier we looked at the scripts that targets needs in order to run, not all of which are included here yet. We’ll add more to this folder structure as we start putting the workflow together. You might also want to add other folders to your projects, such as figures/ or documents/, to store any products you export.
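As a small sketch of adding those optional folders from R: the folder names below are only the examples mentioned above, and the code assumes it is run from the main project folder.

```r
# Hypothetical extra output folders; rename or drop these to suit your project.
extra_dirs <- c("figures", "documents")

for (d in extra_dirs) {
  # showWarnings = FALSE keeps this quiet if the folder already exists.
  dir.create(d, showWarnings = FALSE)
}

# Confirm that each folder is now present.
dir.exists(extra_dirs)
```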

6.2 Get started with targets

Now that you have your files in order, make sure that you have installed the targets and tarchetypes packages. If you have not yet done so you will need to run this code:

install.packages("targets")
install.packages("tarchetypes")

More information on targets installation can be found on the package website.
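If you are not sure whether the packages are already installed, one option is to install only the ones that are missing. This is a sketch rather than a required step; missing_packages() is a hypothetical helper name, and the install is limited to interactive sessions so that the snippet never downloads anything when run as part of a script.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1),
                      quietly = TRUE, USE.NAMES = FALSE)
  pkgs[!installed]
}

to_install <- missing_packages(c("targets", "tarchetypes"))

# Only install when running interactively and something is actually missing.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```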

Now set your working directory to the main folder of this project, i.e., example_project/. A good practice is to use an RStudio project for new analyses such as this one, because opening an RStudio project automatically sets the working directory to the folder that contains it. There’s a good overview in Wickham & Grolemund’s R for Data Science. It’s not required here, though, so whichever option you choose, just make sure your working directory is set.

We now have everything in order to begin working with targets! Let’s build a pipeline.
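A quick way to confirm the working directory from the console is sketched below. is_project_root() is a hypothetical helper, and it assumes the project root can be recognized by the R/functions.R file that ships with the example project.

```r
# Assumption: the project root is the folder that contains R/functions.R.
is_project_root <- function(dir = getwd()) {
  file.exists(file.path(dir, "R", "functions.R"))
}

getwd()            # print the current working directory
is_project_root()  # TRUE if R/functions.R is visible from here

# If this returns FALSE, point R at the project folder, for example:
# setwd("path/to/example_project")  # hypothetical path; replace with your own
```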

6.2.1 Initiate the pipeline

As mentioned earlier, the ingredients of a functional targets pipeline include a _targets.R script. We could make one by hand, but the package comes with a built-in function to create one for us. Let’s use that.

First, load the targets package, then use the tar_script() function to create a new _targets.R script. It will create the file in your current working directory.

library(targets)

tar_script()

Now if you go to your project folder you should find that script and be able to open it in R. You can open it manually or from R using tar_edit(). Here’s what it probably looks like:

library(targets)
# This is an example _targets.R file. Every
# {targets} pipeline needs one.
# Use tar_script() to create _targets.R and tar_edit()
# to open it again for editing.
# Then, run tar_make() to run the pipeline
# and tar_read(summary) to view the results.

# Define custom functions and other global objects.
# This is where you write source("R/functions.R")
# if you keep your functions in external scripts.
summ <- function(dataset) {
  summarize(dataset, mean_x = mean(x))
}

# Set target-specific options such as packages.
tar_option_set(packages = "dplyr")

# End this file with a list of target objects.
list(
  tar_target(data, data.frame(x = sample.int(100), y = sample.int(100))),
  tar_target(summary, summ(data)) # Call your custom functions as needed.
)
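To see what this template does before we strip it down, you could run it in a throwaway location. The sketch below relies on tar_dir(), which runs code from a temporary directory, so it should not touch the _targets.R in your project folder. It assumes dplyr is installed, since the template's summ() function calls summarize().

```r
library(targets)

# tar_dir() evaluates the code from a temporary directory,
# keeping this experiment separate from your real project.
tar_dir({
  tar_script()              # write the default template _targets.R
  tar_make()                # build the `data` and `summary` targets
  print(tar_read(summary))  # read back the stored `summary` target
})
```

Because sample.int(100) is a shuffle of the integers 1 to 100, mean_x should come out as 50.5 on every run.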

The comments in this template are instructional and helpful, but for our purposes we can reduce the script to just the lines below:

library(targets)
library(tarchetypes)

source("R/functions.R")

# Set target-specific options such as packages.
tar_option_set(packages = "tidyverse")

# End this file with a list of target objects.
list(
  
)

Notice that I’ve made a couple of small changes in the above code block:

  1. I added a line to load the tarchetypes package
  2. I uncommented the code, source("R/functions.R"), which will read in the contents of a functions.R script in which we’ll store custom functions for our workflow
  3. I changed the package in tar_option_set() to "tidyverse" from "dplyr". This line just tells targets that by default we’ll load the tidyverse package for each of our targets in this workflow unless we specify otherwise
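To make the role of source() concrete, here is a self-contained sketch of how a functions script feeds custom functions into a session. It writes a stand-in script to a temporary folder rather than touching your real R/functions.R, and its contents are an assumption for illustration: a base R version of the template's summ() function.

```r
# Write a stand-in functions script to a temporary location
# (illustrative only; your real file lives at R/functions.R).
functions_path <- file.path(tempdir(), "functions.R")
writeLines(c(
  "summ <- function(dataset) {",
  "  data.frame(mean_x = mean(dataset$x))",
  "}"
), functions_path)

# source() evaluates the script, which defines summ() in this session.
source(functions_path)

summ(data.frame(x = 1:4))  # one-row data frame with mean_x = 2.5
```

Sourcing the script at the top of _targets.R works the same way: every function it defines becomes available to the targets listed below it.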

Now that we’ve made it this far we technically have a functional workflow! You can take a second here to test that it’s working properly by using tar_make(), which runs whatever workflow we’ve defined in _targets.R.

tar_make()
## * end pipeline

You’ll likely only receive the * end pipeline output shown above. This is because our list of targets in the workflow is currently empty, so the pipeline doesn’t have anything to build yet!

With this milestone complete we’ll move on to sketching out the pipeline in the next section.