Chapter 2 Setting up Spark with R and sparklyr
An exhaustive guide to setting up R with Spark and sparklyr is outside the scope of this publication, as the topic is extensively covered elsewhere. Below we provide a quick set of instructions to get a local Spark instance working with sparklyr in an interactive setting.
We have, however, prepared a dedicated Docker image with all the prerequisites readily available, and we recommend using this pre-built image for the best experience with the code in this book.
2.1 Interactive manual installation
If the Docker approach is not suitable for you, the following basic instructions install the sparklyr package with its dependencies, the nycflights13 package for example data, and Spark version 2.4.3.
install.packages("sparklyr")
install.packages("nycflights13")
sparklyr::spark_install(version = "2.4.3")
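Once the commands above have finished, one way to check that the setup works is to open a local connection and copy the example data into Spark. The following is a minimal sketch, not a definitive procedure: it assumes a Java version compatible with Spark 2.4.3 is available on the machine, and the table name "flights" and the variable names are arbitrary choices made for illustration.

```r
library(sparklyr)

# Connect to a local Spark instance, using the version installed above
sc <- spark_connect(master = "local", version = "2.4.3")

# Copy the example data into Spark; "flights" is an arbitrary table name
flights_tbl <- dplyr::copy_to(sc, nycflights13::flights, "flights", overwrite = TRUE)

# Count the rows on the Spark side to confirm the copy succeeded
n_rows <- sdf_nrow(flights_tbl)
print(n_rows)

# Close the connection when done
spark_disconnect(sc)
```

If the connection step fails, the most likely cause is a missing or incompatible Java installation, which the guides listed below cover in more detail.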
For troubleshooting and more detailed step-by-step guides, please refer to:
- The Getting Started chapter of the Mastering Spark with R book
- The Prerequisites appendix of the Mastering Spark with R book
- RStudio’s Spark website