Change HGLM interface according to Erin's suggestion
Description
hey @wendy i noticed that we have some weird default values in R for rand_family and rand_link in GLM. I think you were trying to default to a list with one element, e.g. c("[gaussian]") but instead, in R, that just became a string that looks like a python list. i think this needs to be fixed… in the gen_R.py file.
currently:
rand_family = c("[gaussian]"),
rand_link = c("[identity]", "[family_default]"),
Is it always going to just be one value? if so we can simply use:
Option 1:
rand_family = c("gaussian"),
rand_link = c("identity", "family_default"),
somehow, this is already supported, so there’s nothing to do here except change the R bindings to specify the defaults differently:
gg <- h2o.glm(y = 1, training_frame = as.h2o(iris), rand_family = "[gaussian]")
========================================================================== | 100% |
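A rough sketch of what the fix in gen_R.py could look like, assuming the generator receives each default as a bracketed string such as "[gaussian]". The helper name `r_vector` is hypothetical and is not taken from the actual gen_R.py code; it only illustrates stripping the brackets before emitting the R `c(...)` literal.

```python
def r_vector(values):
    """Render strings as an R character vector literal, e.g. c("a", "b").

    Assumption: values may arrive wrapped in square brackets
    (e.g. "[gaussian]"); the brackets are stripped before rendering.
    """
    cleaned = [v.strip().strip("[]") for v in values]
    return "c(" + ", ".join('"%s"' % v for v in cleaned) + ")"


if __name__ == "__main__":
    print("rand_family = " + r_vector(["[gaussian]"]) + ",")
    print("rand_link = " + r_vector(["[identity]", "[family_default]"]) + ",")
```

With the two defaults quoted above, this would emit the Option 1 form rather than the bracketed strings.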
erin 10:27 PM
while it seems like Option 1 makes the most sense (assuming we only pass 1 value at a time), it doesn’t match up with python, which expects a list. this will fail if i try to pass a string, so i am curious why the list needs to be there:
H2OTypeError: Argument `rand_family` should be a ?list(Enum["gaussian"]), got string gaussian
erin 10:33 PM
currently must do this:
h2o_glm = H2OGeneralizedLinearEstimator(HGLM=True,
                                        family="gaussian",
                                        rand_family=["gaussian"],
                                        random_columns=z,
                                        rand_link=["identity"],
                                        calc_like=True)
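If the python side were to accept a bare string as well as a list, one possible approach is to normalize the argument before type checking. This is only a sketch: `as_list` is a hypothetical helper, not part of the h2o package, and whether the estimator should accept strings at all is the open question in this ticket.

```python
def as_list(value):
    """Wrap a bare string in a one-element list; copy other sequences to a list.

    None is passed through unchanged so that "not set" stays distinguishable.
    """
    if value is None:
        return None
    if isinstance(value, str):
        return [value]
    return list(value)


if __name__ == "__main__":
    print(as_list("gaussian"))    # a bare string becomes a one-element list
    print(as_list(["identity"]))  # a list is returned as a list
```

Under this sketch, rand_family="gaussian" and rand_family=["gaussian"] would be treated the same way.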
Activity
@erin: so is there only one value that can be passed at a time? or do we need the input to be an array/list?
@wendy
I looked at GLMV3.java and it seems that rand_family and rand_link are both arrays.