Errors Running RSparkling on Databricks Azure Cluster

Description

I tried running RSparkling on Databricks on Azure, but ran into several errors while following the docs (step 4):

1. The RCurl dependency is missing from the docs. I wasn't able to install h2o without installing RCurl first. I noticed that the H2O for R documentation installs it explicitly.

2. Unable to start the H2O context:
Running h2o_context(sc) does not work. According to the error message, the function could not be found.
I also tried rsparkling::h2o_context(sc), but that didn't work either.

3. H2OConf() fails with: Error : java.lang.ClassNotFoundException: ai.h2o.sparkling.H2OConf

Following the RSparkling docs, I tried running h2oConf <- H2OConf() before hc <- H2OContext.getOrCreate(h2oConf), but the H2OConf() call already failed with the ClassNotFoundException.
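
For item 1, the install order that got past the failure can be sketched as below. This is a minimal sketch, not the documented procedure: the helper names missing_packages and install_in_order are hypothetical, and installing h2o from the default repository is an assumption (the docs may prescribe a specific H2O build that has to match the Sparkling Water version, in which case that source should be used instead).

```r
# Hypothetical helper: return the subset of pkgs that is not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install packages one by one, in the given order,
# so that a dependency such as RCurl is in place before h2o is attempted.
install_in_order <- function(pkgs) {
  for (pkg in missing_packages(pkgs)) {
    # Assumption: the default repository serves a suitable build of each package.
    utils::install.packages(pkg)
  }
  invisible(missing_packages(pkgs))  # anything still missing after the attempt
}

# Example call (not run here): RCurl first, then h2o.
# install_in_order(c("RCurl", "h2o"))
```

The return value lists anything that still failed to install, which makes a missing dependency visible before rsparkling is loaded.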

Activity

Marek Novotny
February 16, 2021, 2:42 PM

This ticket looks like a duplicate of SW-2516.

RE 3: As a temporary fix, you need to manually add the Sparkling Water jar (e.g. ai.h2o:sparkling-water-package_2.12:3.32.0.3-1-3.0) to the cluster libraries.
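
Once the jar is attached to the cluster, the startup sequence from the description would look roughly like the sketch below. It assumes that sparklyr and rsparkling are already installed, that the notebook is attached to the Databricks cluster, and that the installed rsparkling version matches the attached jar. The wrapper name start_h2o_context is hypothetical and only groups the calls.

```r
# Hypothetical wrapper around the calls quoted in this ticket.
start_h2o_context <- function() {
  # Connect sparklyr to the Spark session of the attached Databricks cluster.
  sc <- sparklyr::spark_connect(method = "databricks")

  # H2OConf() needs the Sparkling Water jar on the cluster classpath;
  # without it, this is where the ClassNotFoundException is raised.
  h2oConf <- rsparkling::H2OConf()

  # Start (or reuse) the H2O context with that configuration.
  hc <- rsparkling::H2OContext.getOrCreate(h2oConf)

  list(sc = sc, hc = hc)
}

# Example call (not run here):
# ctx <- start_h2o_context()
```

If H2OConf() still raises the ClassNotFoundException after this, the jar is most likely not visible to the cluster, for example because the cluster was not restarted after the library was added.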

Assignee

Unassigned

Reporter

pech

Labels

None

CustomerVisible

Yes

Priority

Major