Add support for Spark Dynamic Allocation

Description

For us, the lack of Spark dynamic allocation support is a major obstacle to deploying
Sparkling Water in a multi-tenant environment.

Would be awesome to have this fixed.

We looked at Sparkling Water last year, and we hope to start using SW once this is available.

Could Sparkling Water register for LiveListenerBus events?
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-LiveListenerBus.html

SparkListener has onExecutorAdded / onExecutorRemoved events that SW could use to scale its memory structures:
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-SparkListener.html
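
To make the suggestion concrete, here is a minimal, Spark-free sketch in Python of the bookkeeping such a listener might do. Everything in it is hypothetical: `ExecutorRegistry` and its fields are made up for illustration, and only the method names are modeled on the SparkListener callbacks onExecutorAdded / onExecutorRemoved. It does not call Spark or H2O, and it says nothing about how H2O would actually move or rebuild data.

```python
from dataclasses import dataclass, field


@dataclass
class ExecutorRegistry:
    """Hypothetical stand-in for the per-executor state SW would track."""

    # executor id -> host name (assumed to be available from the "added" event)
    executors: dict = field(default_factory=dict)
    # ids of removed executors whose data would have to be recovered or rebuilt
    pending_rebalance: list = field(default_factory=list)

    def on_executor_added(self, executor_id: str, host: str) -> None:
        # A new executor joined: record it so memory structures could grow.
        self.executors[executor_id] = host

    def on_executor_removed(self, executor_id: str, reason: str) -> None:
        # Ignore ids we never saw; otherwise flag the executor for rebalancing.
        if self.executors.pop(executor_id, None) is not None:
            self.pending_rebalance.append(executor_id)


registry = ExecutorRegistry()
registry.on_executor_added("1", "node-a")
registry.on_executor_added("2", "node-b")
registry.on_executor_removed("2", "idle timeout")

print(sorted(registry.executors))  # ['1']
print(registry.pending_rebalance)  # ['2']
```

One caveat with this approach: listener events are delivered after the fact, so by the time a removal callback runs the executor is already gone. Tracking membership is the easy part. Handling the data that lived on the removed executor is presumably where the real work would be.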

Hope to see this fixed soon. We're excited to become SW users/customers, but we can't adopt it until Dynamic Allocation works.

Activity

Jordan Bentley
February 28, 2020, 4:30 PM

Just noticed that this got unassigned. Is this still in the future of Sparkling Water? It may determine whether we can stay in the H2O ecosystem long-term.

Ruslan Dautkhanov
May 6, 2019, 10:10 PM

Created for the latter part (Arrow serialization)

Ruslan Dautkhanov
October 20, 2017, 7:52 PM

FYI - Spark 2.3 supports dataframes based on Apache Arrow serialization https://issues.apache.org/jira/browse/SPARK-13534
Would this be easier if H2O used Apache Arrow for its frames? Then it might become possible to support Spark Dynamic Allocation in the future.
Arrow was created to address this zero-copy need between different frameworks https://arrow.apache.org/
Arrow: All systems utilize the same memory format; No overhead for cross-system communication
I don't know the H2O architecture and am probably oversimplifying here.

Michal Malohlava
October 18, 2017, 3:45 PM

Thank you for the feedback. In the meantime, we designed the "external" cluster deployment of Sparkling Water, which separates the life cycle of the Spark driver from that of the H2O cluster. However, the H2O cluster still needs to be deployed in a non-elastic environment.

Assignee

Unassigned

Reporter

Ruslan Dautkhanov

Labels

None

CustomerVisible

No

Priority

Critical