Issues with HTTP libraries on Spark 2.3
Description
scala> new H2OFrame(URI.create("s3://h2o-public-test-data/smalldata/airlines/allyears2k.zip"))
java.lang.IllegalStateException: Socket not created by this factory
at org.apache.http.util.Asserts.check(Asserts.java:34)
at org.apache.http.conn.ssl.SSLSocketFactory.isSecure(SSLSocketFactory.java:435)
Activity
A cleaner workaround is to ship httpclient 4.5.2 with Sparkling Water 2.3.x and add it to the Spark driver classpath via --driver-class-path in our launcher scripts, such as ./bin/sparkling-shell. That way this library comes first on the classpath, ahead of the original httpclient library.
The workaround is to put httpclient-4.5.2.jar and httpcore-4.4.4.jar in the /jars directory of your Spark 2.3.x installation, replacing the original versions.
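To check which httpclient and httpcore jars the driver actually resolves after applying a workaround, something like the following can be pasted into the shell. This is only a diagnostic sketch: the helper name loadedFrom is made up for illustration and is not part of Sparkling Water or Spark, and the two class names are simply the ones that appear in the stack trace above.

```scala
import scala.util.Try

// Hypothetical helper (not a Sparkling Water or Spark API): returns the jar or
// directory a class was loaded from, or None if the class is not on the
// classpath or its code source is unknown (e.g. JDK bootstrap classes).
def loadedFrom(className: String): Option[String] =
  Try(Class.forName(className)).toOption
    .flatMap(cls => Option(cls.getProtectionDomain.getCodeSource))
    .flatMap(src => Option(src.getLocation))
    .map(_.toString)

// Classes taken from the stack trace in the description.
println(loadedFrom("org.apache.http.conn.ssl.SSLSocketFactory")) // httpclient
println(loadedFrom("org.apache.http.util.Asserts"))              // httpcore
```

If the printed locations still point at the original jars rather than httpclient-4.5.2.jar and httpcore-4.4.4.jar, the replacement versions are not winning on the classpath.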
This also raises the priority of the Hadoop smoke tests with Sparkling Water.
Update: the Hadoop library versions used in H2O should not affect SW, as we are excluding them. We are using the ones provided by Hadoop.
This is a genuine Spark bug: the upgrade of httpclient is causing incompatibility issues with Spark 2.3. The problem is explained here, albeit for a different project: https://github.com/druid-io/druid/issues/4456