Integrate XGBoost in Sparkling Water
We want to have H2O's XGBoost available in SW as well. XGBoost should be exposed just like the other algos.
We will have to document how to set memory requirements and configure Spark. XGBoost allocates off-heap memory, which is an issue on Hadoop: the containers need a proper memory configuration that accounts for it.
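A rough sketch of what the documented memory configuration might look like, assuming the off-heap allocation is covered by raising the executor memory overhead. The helper names and the 4096 MiB off-heap figure are hypothetical and only for illustration. `spark.executor.memoryOverhead` is the property name in Spark 2.3 and later; older releases on YARN use `spark.yarn.executor.memoryOverhead` instead.

```python
def executor_memory_conf(heap_mb, xgboost_offheap_mb):
    """Build Spark settings that size a container for JVM heap plus off-heap use.

    xgboost_offheap_mb is an assumed budget for XGBoost's native allocations;
    the right value has to be found by testing on a real cluster.
    """
    # Spark's default overhead is max(384 MiB, 10% of executor memory).
    default_overhead_mb = max(384, heap_mb // 10)
    # Add the assumed off-heap budget on top of the default overhead.
    overhead_mb = default_overhead_mb + xgboost_offheap_mb
    return {
        "spark.executor.memory": f"{heap_mb}m",
        # Plain number, interpreted by Spark as MiB.
        "spark.executor.memoryOverhead": str(overhead_mb),
    }


def to_submit_args(conf):
    """Turn a settings dict into a list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# Hypothetical sizing: 8 GiB heap, 4 GiB reserved for off-heap allocations.
conf = executor_memory_conf(heap_mb=8192, xgboost_offheap_mb=4096)
print(" ".join(to_submit_args(conf)))
```

The container requested from Hadoop is then roughly the heap plus the overhead, so the off-heap budget has to fit inside the overhead or the container risks being killed for exceeding its memory limit.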
It would be good to have (real-life deployment) tests of this functionality because integrating XGBoost can be tricky.
Sure! At this point, I think it just requires testing it out and documenting the memory config, as you mentioned. People can use it the same way they are used to in H2O.
We can also expose XGBoost as a Sparkling Water pipeline stage, but I would put that into another JIRA.
Can you please plan this into a release?