Home
Leonardo is a service that provisions Spark clusters and stands up Jupyter notebooks on them.
Jupyter notebooks are an increasingly popular way to create reproducible bioinformatics analyses. They combine familiar, powerful programming languages such as R and Python with the ability to create and share documents containing code, results, and narrative text.
Jupyter can integrate with compute frameworks that run on horizontally scalable environments, such as Spark or TensorFlow, and provides an excellent environment for running leading genomic analysis software such as Hail.
Many systems would also like to offer a hosted version of this capability, with resource management and security, in a cloud-based environment. Leonardo aims to provide those capabilities and is used as part of FireCloud and the All of Us Researcher Tools platform.
- REST-based service
- Endpoint- and resource-based access control
- End users can pip install additional packages
- Automated provisioning of Google Cloud Platform Dataproc clusters with Jupyter notebooks
- Two-way SSL encryption between the Dataproc cluster and the proxy service
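As a rough illustration of what "REST-based" provisioning could look like from a client's side, the sketch below builds (but does not send) an HTTP request asking the service for a new cluster. Everything specific in it is an assumption made for illustration and is not taken from this page: the host `leonardo.example.org`, the `/api/cluster/{project}/{name}` path, the bearer-token header, the JSON body, and the helper name `build_create_cluster_request`. Consult the service's own API documentation for the real endpoints and payloads.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_create_cluster_request(base_url, google_project, cluster_name, token, settings):
    """Build (but do not send) a request asking the service for a new cluster.

    Hypothetical helper: the path, auth scheme, and body shape are assumptions.
    """
    # Hypothetical resource path; percent-encode each segment so unusual
    # project or cluster names cannot break the URL.
    path = "/api/cluster/{}/{}".format(
        quote(google_project, safe=""),
        quote(cluster_name, safe=""),
    )
    return Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(settings).encode("utf-8"),
        headers={
            # Assumed auth scheme: an OAuth-style bearer token.
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="PUT",
    )


request = build_create_cluster_request(
    base_url="https://leonardo.example.org",  # placeholder host
    google_project="my-project",
    cluster_name="my-cluster",
    token="PLACEHOLDER_TOKEN",
    settings={"labels": {"owner": "researcher"}},  # hypothetical body
)
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the sketch free of network access. A real client would pass the object to `urllib.request.urlopen` (or use an HTTP library of its choice) and handle the response.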
Immediate development plans focus on meeting the needs of the All of Us Researcher Tools platform, as well as researcher usage within Broad and Verily on Google Cloud Platform. Through pluggable authorization and credential providers, we would like to support many other security and infrastructure use cases.