DeepLearning MOJO support in Generic Model

Description

DeepLearning MOJOs are not supported, yet there are users who’d like to use them.
https://stackoverflow.com/questions/59360013/h2o-autoencoder-serialized-to-mojo-format-cannot-be-loaded-by-h2o-library

Activity

Pavel Pscheidl
February 9, 2021, 1:50 PM

Hello, we’ll try to assign a fix version soon so that the deadlines are more transparent.

TODO: FAQ entry or further clarification in the existing documentation

Urs
February 9, 2021, 12:24 PM

Hello Pavel,

In June 2020 we reported an error (7605) concerning MOJO reading/writing, i.e. serialization.

Call of h2o.anomaly fails for deeplearning autoencoder model saved and imported as MOJO: https://h2oai.atlassian.net/browse/PUBDEV-7605

In the meantime, some corrections (e.g. 7746 - Deserialization Values of MOJO ModelParameter Does Not Work If the Value Type Is int[]) have been made.

Are there any plans to also fix the bug we reported? It ultimately prevents version-independent storage of models and therefore has a big impact on model management.

We would be happy if you could give us some feedback on this.

Many greetings
Urs

Pavel Pscheidl
June 8, 2020, 11:00 AM

This has nothing to do with the original JIRA, but it definitely looks like a bug and I’ve opened a new Jira for you:

Pavel Pscheidl
June 8, 2020, 10:58 AM

Generic model is limited in functionality. MOJOs are not full representations of the original model, as the original model (also called native or binary) is heavily dependent on the H2O version. Algorithms are improved over time, parameters are added, etc. This can make models incompatible across H2O versions. The correct solution is simply to re-train the model using the new version.

MOJOs are not intended to be a full representation of the H2O model. That’s why the information contained inside a MOJO will always be limited (even though the full parameters and model output are still available in the modelDetails.json file inside the MOJO zip file).
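For readers who want to check what the MOJO archive actually carries, here is a minimal sketch in base R that looks for modelDetails.json inside the zip and returns its raw text. The function name `read_mojo_model_details` and the argument `mojo_path` are hypothetical, chosen for illustration. The folder in which the file sits inside the archive is not stated in this thread, so the sketch matches on the file name only.

```r
# Sketch only: read modelDetails.json from a MOJO zip using base R.
# `read_mojo_model_details` and `mojo_path` are hypothetical names.
read_mojo_model_details <- function(mojo_path) {
  # List the archive entries without extracting anything
  entries <- utils::unzip(mojo_path, list = TRUE)$Name

  # The folder inside the archive is an unknown here, so match the file name only
  hits <- grep("(^|/)modelDetails\\.json$", entries, value = TRUE)
  if (length(hits) == 0) {
    return(NULL)  # no such entry in this archive
  }

  # Read the first matching entry directly from the zip as text
  con <- unz(mojo_path, hits[1])
  on.exit(close(con))
  paste(readLines(con, warn = FALSE), collapse = "\n")
}
```

With the path returned by h2o.saveMojo in the report below, a call would look like `details <- read_mojo_model_details(ModelMojoPath)`. The result is an unparsed JSON string; turning it into an R list would need a JSON package, which is left out here to keep the sketch dependency-free.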

What you’re encountering is a bug. MOJO models are intended to support scoring or, in this case, anomaly detection. With a Generic MOJO the predict function works, but the anomaly function doesn’t, because an imported MOJO is in fact a Generic model inside H2O, not the original model. Fixing this is a matter of re-routing the requests.

Urs
June 8, 2020, 7:16 AM

Call of h2o.anomaly fails for deeplearning autoencoder model saved and imported as MOJO

Calling h2o.anomaly for a deeplearning autoencoder fails with the following error message if the model was previously saved with h2o.saveMojo and loaded with h2o.import_mojo:

"water.exceptions.H2OIllegalArgumentException: Requires a Deep Learning, GLRM, DRF or GBM model."

If the same model is saved and loaded as a binary file with h2o.saveModel and h2o.loadModel, the call to h2o.anomaly succeeds without an error message.

It is noticeable that h2o.saveMojo only creates an incomplete image of the model. Some properties of the original model (e.g. parameters/hidden) are missing in the zip-archive.

Why does h2o.import_mojo restore a generic model instead of a deeplearning model?

What is the correct way of saving and restoring H2O version-independent models without losing any properties or features?

Pseudocode:

library(h2o)

h2o.init()

# channels, Traindata.H2O.Train, Traindata.H2O.Eval, Recall.Values.H2O and WorkDir
# come from the reporter's environment and are not defined in this snippet
Model.grid <- h2o.grid("deeplearning",
                       x = channels,
                       training_frame = Traindata.H2O.Train,
                       validation_frame = Traindata.H2O.Eval,
                       autoencoder = TRUE
                       # (…) remaining arguments elided in the original report
)

# fetch a trained model from the grid (h2o.getModel expects a single model id)
Model <- h2o.getModel(Model.grid@model_ids[[1]])

# save and load as a binary model (works, but is strongly H2O version dependent)
ModelPath <- h2o.saveModel(Model, WorkDir, force = TRUE)
LoadedModel <- h2o.loadModel(ModelPath)

# works
MSE <- h2o.anomaly(LoadedModel, Recall.Values.H2O)
PredictH2O <- h2o.predict(LoadedModel, Recall.Values.H2O)

# save and load as a MOJO, used to avoid the H2O version dependency
ModelMojoPath <- h2o.saveMojo(Model, WorkDir, force = TRUE)
ModelMojo <- h2o.import_mojo(ModelMojoPath)

# does not work, because a generic model was imported instead of a deeplearning model
# error: water.exceptions.H2OIllegalArgumentException: Requires a Deep Learning, GLRM, DRF or GBM model
MSEMojo <- h2o.anomaly(ModelMojo, Recall.Values.H2O)

# but this still works?
PredictMojo <- h2o.predict(ModelMojo, Recall.Values.H2O)

Flagged
Fixed

Assignee

Pavel Pscheidl

Reporter

Pavel Pscheidl

Priority

Major