Roberto Villegas-Diaz
Data Manager @ University of Liverpool

In reality:
“The goal of {furrr} is to combine {purrr}’s family of mapping functions with {future}’s parallel processing capabilities.”
Shout out to Tom Smith @ Nottingham University Hospitals NHS Trust:
Note: Replacing a map function with its equivalent future_map does not auto-magically parallelise your code! 🥲
# Set a "plan" for how the code should run.
future::plan(future::multisession, workers = 2)
# This does run in parallel!
furrr::future_map(c("hello", "{purrr}!"), ~.x)
[[1]]
[1] "hello"
[[2]]
[1] "{purrr}!"
Other functions:
future_imap(), future_imap_chr(), …,
future_map2(), future_map2_chr(), …,
future_walk(), future_map_chr(), …, and more.
Reference: https://furrr.futureverse.org/reference
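A minimal sketch of a few of these variants, assuming {furrr} is installed. The inputs and the two-worker plan are made up for illustration:

```r
library(furrr)
future::plan(future::multisession, workers = 2)

# future_map_chr(): like future_map(), but returns a character vector.
upper <- future_map_chr(c("a", "b"), toupper)

# future_map2_dbl(): walks over two inputs position by position.
sums <- future_map2_dbl(1:3, 4:6, ~ .x + .y)

# future_imap_chr(): .x is the element, .y is its name (or its index).
labelled <- future_imap_chr(c(first = "x", second = "y"), ~ paste(.y, .x))

# future_walk(): called for side effects; returns its input invisibly.
walked <- future_walk(1:2, ~ invisible(.x))

# Go back to sequential processing when done.
future::plan(future::sequential)
```

The suffix (_chr, _dbl, ...) fixes the output type, exactly as in {purrr}.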
future::planning
sequential: uses the current R process
multisession: uses separate R sessions
multicore: uses separate forked R processes
cluster: uses separate R sessions on one or more machines
Reference: https://future.futureverse.org/reference/plan.html
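As a rough sketch of how switching plans looks in practice, assuming {future} and {furrr} are installed (the input vector and the squaring function are illustrative only):

```r
library(future)
library(furrr)

# Default behaviour: everything runs in the current R process.
plan(sequential)
res_seq <- future_map_dbl(1:4, ~ .x^2)

# Two background R sessions; the mapping call itself does not change.
plan(multisession, workers = 2)
res_par <- future_map_dbl(1:4, ~ .x^2)

# Both plans give the same values; only where the work runs differs.
identical(res_seq, res_par)

# Return to sequential processing, which also shuts down the workers.
plan(sequential)
```

Only the plan() call changes between the two runs, which is the point of the design: the code that does the work stays untouched.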
For testing at home:
To find the available CPUs (i.e., max number of workers for the plan function):
future::availableCores()
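One way to turn that number into a plan, sketched under the assumption that leaving one core free for the rest of the system is desirable (n_workers is a name introduced here):

```r
# Use all available cores but one, and never fewer than one.
n_workers <- max(1, future::availableCores() - 1)

future::plan(future::multisession, workers = n_workers)

# Check how many workers the current plan provides.
future::nbrOfWorkers()

# Reset to the default plan when finished.
future::plan(future::sequential)
```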
To add a progress bar, include .progress = TRUE in the function call:
furrr::future_map(x, fx, .progress = TRUE)
⚠️ The {furrr} documentation recommends moving to the {progressr} framework instead.
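A minimal sketch of the {progressr} approach, assuming {furrr} and {progressr} are installed. The wrapper slow_identity() and its 0.1 s sleep are hypothetical stand-ins for real work:

```r
library(furrr)
library(progressr)

future::plan(future::multisession, workers = 2)

# Hypothetical wrapper: signals one progress update per processed element.
slow_identity <- function(xs) {
  p <- progressor(steps = length(xs))
  future_map(xs, function(x) {
    Sys.sleep(0.1)  # stand-in for real work
    p()             # report that one step has finished
    x
  })
}

# Progress is only displayed inside with_progress()
# (or after enabling global handlers with handlers(global = TRUE)).
result <- with_progress(slow_identity(1:10))

future::plan(future::sequential)
```

Unlike .progress = TRUE, the progressor is created by the code doing the work, and the caller decides whether and how updates are displayed.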

Imagine we want to compute some spatial indicator X at UPRN (Unique Property Reference Number) level. How long will that take?
Some UPRN stats:
UPRNs are available under the Open Government Licence (OGL) from the Ordnance Survey Data Hub.

access_to_green_spaces <- function(uprn, ...) {
  Sys.sleep(1E-3) # do your thing
  return(uprn) # result
}
# Load datasets derived with the R/uprn_example.R script
ons_uprn_nw_cm_icb <-
  readr::read_rds("../data/ons_uprn_nw_cm_icb.Rds")
sub_icb_boundaries_cm <-
  readr::read_rds("../data/sub_icb_boundaries_cm.Rds")
Code: R/uprn_example.R
2672.98 sec elapsed
292.05 sec elapsed
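The script behind these timings is not reproduced in this text. A toy version of a sequential versus multisession comparison could look like the sketch below; fake_uprns is a made-up stand-in for the real UPRN column, and the worker count is arbitrary:

```r
library(furrr)

# Same placeholder idea as above: pretend each lookup costs about 1 ms.
access_to_green_spaces <- function(uprn, ...) {
  Sys.sleep(1e-3)
  uprn
}

# Hypothetical stand-in for the UPRN column of the real dataset.
fake_uprns <- seq_len(2000)

# Baseline: a single R process.
future::plan(future::sequential)
t_seq <- system.time(
  res_seq <- future_map_int(fake_uprns, access_to_green_spaces)
)

# Same call, now spread over two background R sessions.
future::plan(future::multisession, workers = 2)
t_par <- system.time(
  res_par <- future_map_int(fake_uprns, access_to_green_spaces)
)
future::plan(future::sequential)

# The results should match; compare the elapsed times.
identical(res_seq, res_par)
t_seq[["elapsed"]]
t_par[["elapsed"]]
```

With this little work per element, the cost of starting workers and moving data can outweigh the gain, so the ratio seen here need not resemble the one on the full dataset.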
[1] 0.2080856 0.2080856 0.2080856
[1] 0.1552317 0.4877356 0.5330014
   user  system elapsed
  0.024   0.001   0.282
   user  system elapsed
  0.342   0.502   1.191
A possible solution: instead of defining an anonymous function inside the environment that holds the “large” object, define the function separately:
   user  system elapsed
  0.297   0.055   0.590
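The code behind these timings is not included in this text. A sketch of the pattern, modelled on the "function environments and large objects" gotcha in the {furrr} documentation, might look as follows; all function names and the size of big are illustrative:

```r
library(furrr)
future::plan(future::multisession, workers = 2)

# Slow variant: the anonymous function is created inside this function,
# so its enclosing environment holds `big`, and that environment is
# serialised along with the function when it is sent to the workers.
map_with_closure <- function() {
  big <- 1:1e7 + 0L  # large object that the mapped function never uses
  future_map(1:3, function(x) x)
}

# Separate definition: the enclosing environment of identity_fn is the
# global environment, so `big` does not travel with it.
identity_fn <- function(x) x

map_with_global_fn <- function() {
  big <- 1:1e7 + 0L
  future_map(1:3, identity_fn)
}

system.time(res_closure <- map_with_closure())
system.time(res_global <- map_with_global_fn())

future::plan(future::sequential)
```

Both variants return the same list; the difference is in how much data has to be shipped to each worker.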