ibmos2spark

The ibmos2spark library facilitates data read/write connections between Apache Spark clusters and the various IBM Object Storage services.

Object Storage Documentation

Requirements

Apache Spark with stocator library

The easiest way to install the stocator library with Apache Spark is to pass the Maven coordinates at launch. Other installation options are described in the stocator documentation.

Apache Spark at IBM

The stocator and ibmos2spark libraries are pre-installed and available on

Languages

The library is implemented for use in Python, R and Scala/Java.

Details

This library does only two things:

Uses the SparkContext.hadoopConfiguration object to set the appropriate keys to define a connection to an object storage service.
Provides the caller with a URL for each object in their object store; the URL is typically passed to a SparkContext method to retrieve the data.
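The two steps above can be sketched without a Spark cluster. The following is a minimal illustration, not the library's actual implementation: `FakeHadoopConf` and `SketchConnector` are hypothetical names, and the `fs.swift2d.service.` key prefix and the `swift2d://` URL layout are assumptions made for the example.

```python
class FakeHadoopConf:
    """Hypothetical stand-in for the Hadoop configuration object; stores keys in a dict."""

    def __init__(self):
        self.settings = {}

    def set(self, key, value):
        self.settings[key] = value


class SketchConnector:
    """Hypothetical connector illustrating the two responsibilities described above."""

    def __init__(self, hadoop_conf, credentials, name):
        self.name = name
        # Step 1: write connection settings under a per-connection prefix.
        # The prefix and key names are assumptions for this sketch.
        prefix = "fs.swift2d.service." + name
        hadoop_conf.set(prefix + ".auth.url", credentials["auth_url"])
        hadoop_conf.set(prefix + ".username", credentials["user_id"])
        hadoop_conf.set(prefix + ".password", credentials["password"])
        hadoop_conf.set(prefix + ".region", credentials["region"])

    def url(self, container_name, object_name):
        # Step 2: build a URL that names the container, the connection, and the object.
        # The URL layout is an assumption for this sketch.
        return "swift2d://{}.{}/{}".format(container_name, self.name, object_name)


conf = FakeHadoopConf()
credentials = {
    "auth_url": "https://identity.open.softlayer.com",
    "user_id": "example_user",      # placeholder value
    "password": "example_secret",   # placeholder value
    "region": "example_region",     # placeholder value
}
connector = SketchConnector(conf, credentials, "my_bluemix_objectstore")
data_url = connector.url("some_name", "file_name")
print(data_url)  # swift2d://some_name.my_bluemix_objectstore/file_name
```

In a real session the configuration object would come from the SparkContext, and the resulting URL would be handed to a method such as `sc.textFile`.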

Example Usage

The following code demonstrates how to use this library in Python and connect to the Cloud Object Storage service, described in the far left pane of the image above.

import ibmos2spark

credentials = {
  'auth_url': 'https://identity.open.softlayer.com',  # your URL might be different
  'project_id': '',
  'region': '',
  'user_id': '',
  'username': '',
  'password': '',
}

configuration_name = 'my_bluemix_objectstore'  # you can give any name you like

bmos = ibmos2spark.bluemix(sc, credentials, configuration_name)  # sc is the SparkContext instance

container_name = 'some_name'
object_name = 'file_name'

data_url = bmos.url(container_name, object_name)

data = sc.textFile(data_url)
