GBM early stopping is computed on training set logloss
Description
The early stopping metric (logloss) seems to be computed on the training frame and not on the validation frame.
Here is a snippet of the code
While looking at the logloss plot in the H2O Flow interface, I noticed that the decrease in validation-set logloss (orange line) is smaller than the threshold, yet tree generation does not stop after the chosen number of rounds; see the attached picture.
Please don't mind the clearly overfitting model.
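To make the expected behaviour concrete, here is a minimal sketch of a moving-average early stopping check. It is only loosely modelled on how the H2O documentation describes `stopping_rounds` and `stopping_tolerance`, and it is not H2O's actual implementation. The function name `should_stop` and the logloss values are hypothetical and chosen only for illustration.

```python
def should_stop(history, stopping_rounds, tolerance):
    """Return True when a lower-is-better metric has stopped improving.

    Simplified, assumed rule (not H2O's code): compare the best of the last
    `stopping_rounds` moving averages against the moving average that
    precedes them, and stop if the relative improvement is below `tolerance`.
    """
    k = stopping_rounds
    # Need k + 1 moving averages of window k, i.e. at least 2 * k scores.
    if k <= 0 or len(history) < 2 * k:
        return False
    # Simple moving averages over every window of k consecutive scores.
    averages = [sum(history[i - k:i]) / k for i in range(k, len(history) + 1)]
    reference = averages[-(k + 1)]    # moving average from k scoring events ago
    best_recent = min(averages[-k:])  # best of the k most recent moving averages
    if reference <= 0:
        # Metric is already at (or below) zero; nothing left to improve.
        return True
    return (reference - best_recent) / reference < tolerance


# Hypothetical logloss histories, one value per scoring event.
train_logloss = [0.70, 0.60, 0.50, 0.40, 0.30, 0.20]  # keeps improving (overfitting)
valid_logloss = [0.50, 0.50, 0.50, 0.50, 0.50, 0.50]  # flat

stop_on_train = should_stop(train_logloss, stopping_rounds=3, tolerance=0.001)
stop_on_valid = should_stop(valid_logloss, stopping_rounds=3, tolerance=0.001)
print("stop when scored on training frame:  ", stop_on_train)
print("stop when scored on validation frame:", stop_on_valid)
```

With these made-up values the check fires on the validation series but not on the training series. That matches the symptom reported here: if the metric were taken from the training frame, an overfitting model would keep improving and never trigger the stop.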
Activity
Veronika Maurerová
March 28, 2019, 2:42 PM
I think it is the same problem as here: https://0xdata.atlassian.net/browse/PUBDEV-6099.
However, this is only a guess; it is not so clear from the image.