Issues with HTTP libraries on Spark 2.3

Description

scala> new H2OFrame(URI.create("s3://h2o-public-test-data/smalldata/airlines/allyears2k.zip"))
java.lang.IllegalStateException: Socket not created by this factory
at org.apache.http.util.Asserts.check(Asserts.java:34)
at org.apache.http.conn.ssl.SSLSocketFactory.isSecure(SSLSocketFactory.java:435)

Activity

Jakub Hava
July 2, 2018, 7:27 PM

A cleaner workaround is to ship httpclient 4.5.2 with Sparkling Water 2.3.x and add it to the Spark driver classpath via --driver-class-path in our startup helper scripts, such as ./bin/sparkling-shell. That places this library on the classpath ahead of the original httpclient library.
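
A rough sketch of what such a launch could look like from the command line. It assumes that ./bin/sparkling-shell forwards extra arguments to spark-shell, and that the newer httpclient jar has already been placed somewhere on the driver machine. The function name launch_sparkling_shell and the jar path in the usage line are hypothetical, not part of Sparkling Water.

```shell
# Sketch only: start sparkling-shell with a newer httpclient placed on the
# driver classpath via --driver-class-path.
# launch_sparkling_shell is a hypothetical wrapper, not a Sparkling Water script.
launch_sparkling_shell() {
  httpclient_jar="$1"   # path to httpclient-4.5.2.jar on the driver machine
  shift                 # remaining arguments are passed through unchanged

  # Run from the Sparkling Water distribution directory.
  # Assumption: sparkling-shell forwards these arguments to spark-shell.
  ./bin/sparkling-shell --driver-class-path "$httpclient_jar" "$@"
}
```

Usage, with an illustrative path: launch_sparkling_shell /path/to/httpclient-4.5.2.jar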

Jakub Hava
July 2, 2018, 2:27 PM

CC:

Jakub Hava
July 2, 2018, 2:24 PM
Edited

The workaround is to put httpclient-4.5.2.jar and httpcore-4.4.4.jar in the /jars directory of your Spark 2.3.x installation, replacing the original versions.
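
A minimal sketch of that jar swap, assuming the two replacement jars have already been downloaded into a local directory. The function name swap_http_jars, its arguments, and the ".bak" backup directory are illustrative choices, not something provided by Spark or Sparkling Water.

```shell
# Sketch only: replace the httpclient/httpcore jars bundled with Spark 2.3.x.
# swap_http_jars is a hypothetical helper, not a Spark or Sparkling Water script.
swap_http_jars() {
  jars_dir="${1%/}"          # e.g. "$SPARK_HOME/jars"
  new_jars_dir="${2%/}"      # directory holding the downloaded replacement jars
  backup_dir="${jars_dir}.bak"

  # Keep the bundled versions in a sibling directory so the change can be undone.
  mkdir -p "$backup_dir"
  for f in "$jars_dir"/httpclient-*.jar "$jars_dir"/httpcore-*.jar; do
    if [ -e "$f" ]; then
      mv "$f" "$backup_dir/"
    fi
  done

  # Copy in the versions named in the workaround.
  cp "$new_jars_dir/httpclient-4.5.2.jar" \
     "$new_jars_dir/httpcore-4.4.4.jar" \
     "$jars_dir/"
}
```

Usage, with an illustrative download location: swap_http_jars "$SPARK_HOME/jars" /path/to/downloaded/jars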

Jakub Hava
July 2, 2018, 1:58 PM

This also raises priority for the hadoop smoke tests with Sparkling Water

Jakub Hava
July 2, 2018, 1:38 PM

Update: the Hadoop library versions used in H2O should not affect Sparkling Water, because we exclude them and use the ones provided by Hadoop.

This is a genuine Spark bug: upgrading httpclient causes incompatibility issues with Spark 2.3. The same problem is explained, for a different project, at https://github.com/druid-io/druid/issues/4456

Fixed

Assignee

Jakub Hava

Reporter

Jakub Hava

Labels

None

CustomerVisible

No

Fix versions

Priority

Major