POC: Jobs recommendation on Apache Spark and BigDL

What is BigDL?

BigDL is a distributed deep learning library for Apache Spark; with BigDL, users can write their deep learning applications as standard Spark programs, which can directly run on top of existing Spark or Hadoop clusters.

Deploy BigDL application on Databricks cloud

Step 1 build analytics-zoo jar
- clone analytics zoo to local: git clone https://github.com/intel-analytics/analytics-zoo
- build job2career-with-dependencies.jar: mvn clean install -DskipTests
Step 2 log in to the Databricks web console with your credentials
Step 3 set up cluster
- Clusters -> Create cluster
- Give it a name, for example “intel”, set workers to 1, and uncheck autoscaling.
- Set the Spark configuration here, for example:
  - spark.executor.cores 4
  - spark.cores.max 4
  - spark.shuffle.reduceLocality.enabled false
  - spark.shuffle.blockTransferService nio
  - spark.scheduler.minRegisteredResourcesRatio 1.0
  - spark.speculation false
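
The settings above are typed into the cluster's Spark config box as one space-separated key/value pair per line. A minimal Python sketch, assuming the settings are kept in a dictionary, renders them in that form and also wraps them in a `spark_conf` mapping of the kind a cluster definition carries. The names `SPARK_CONF`, `render_spark_config`, and `cluster_spec` are illustrative and not part of BigDL or Databricks, and the cluster definition is deliberately incomplete.

```python
# Spark settings listed in Step 3, kept as strings because Spark
# configuration values are passed as text.
SPARK_CONF = {
    "spark.executor.cores": "4",
    "spark.cores.max": "4",
    "spark.shuffle.reduceLocality.enabled": "false",
    "spark.shuffle.blockTransferService": "nio",
    "spark.scheduler.minRegisteredResourcesRatio": "1.0",
    "spark.speculation": "false",
}


def render_spark_config(conf):
    """Return one 'key value' line per setting, as typed into the config box."""
    return "\n".join(f"{key} {value}" for key, value in conf.items())


def cluster_spec(name, workers, conf):
    """Assumed, partial shape of a cluster definition.

    Fields such as the Spark version and node type are left out because
    the steps above do not specify them.
    """
    return {"cluster_name": name, "num_workers": workers, "spark_conf": dict(conf)}


if __name__ == "__main__":
    print(render_spark_config(SPARK_CONF))
```

Keeping the settings in one place makes it easier to reuse the same configuration when the cluster is recreated.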
Step 4 upload data and dependency jar
- Data -> Create table -> upload data; give it a name, for example “NEG50”
- /FileStore/tables/Jobs2Career/indexed/indexed/
- /FileStore/tables/Jobs2Career/indexed/NEG50/
- /FileStore/tables/Jobs2Career/lib/job2career_1_0_SNAPSHOT_job-0ca74.jar
Step 5 run job
- Jobs -> Create job -> give it a name
- Set JAR -> upload the jar, give the main class “com.intel.analytics.bigdl.apps.job2Career.TrainWithD2VGlove”, and give the arguments “--inputDir /FileStore/tables/Jobs2Career/indexed/”
- Add the dependency library dbfs:/FileStore/tables/Jobs2Career/lib/job2career_1_0_SNAPSHOT_job-0ca74.jar
- Edit cluster -> Existing cluster, and choose the one you created
- Confirm -> Run now -> check the results in the log
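
The job configured in Step 5 can also be written down as data. The sketch below assembles a job definition in the general shape used by the Databricks Jobs REST API (`spark_jar_task`, `libraries`, `existing_cluster_id`). It only builds and prints JSON and sends nothing. The job name, the cluster id placeholder, and the helper name `build_job` are assumptions for illustration. The main application jar uploaded through the UI is not listed because no DBFS path is given for it here, and the field names should be checked against the API version of your workspace.

```python
import json

# Values taken from Step 4 and Step 5.
MAIN_CLASS = "com.intel.analytics.bigdl.apps.job2Career.TrainWithD2VGlove"
DEPENDENCY_JAR = (
    "dbfs:/FileStore/tables/Jobs2Career/lib/job2career_1_0_SNAPSHOT_job-0ca74.jar"
)
INPUT_DIR = "/FileStore/tables/Jobs2Career/indexed/"


def build_job(name, cluster_id):
    """Build a JAR job definition that runs on an existing cluster."""
    return {
        "name": name,
        "existing_cluster_id": cluster_id,
        # Only the dependency jar from Step 4; the application jar is omitted.
        "libraries": [{"jar": DEPENDENCY_JAR}],
        "spark_jar_task": {
            "main_class_name": MAIN_CLASS,
            # Arguments are passed as separate list items, not one string.
            "parameters": ["--inputDir", INPUT_DIR],
        },
    }


if __name__ == "__main__":
    # "job2career-train" and "<your-cluster-id>" are placeholders.
    print(json.dumps(build_job("job2career-train", "<your-cluster-id>"), indent=2))
```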
