The 'max_categorical_levels' Parameter Is Not Propagated to MOJO

Description

This problem affects all algorithms. An example with GBM:

  • https://gist.github.com/mn-mikke/8177b9503bcffb124131c5b8e15b5b46#file-gbm-model-enumlimited-json-L1170

  • Parameters passed to H2O backend:
    nbins -> 20
    col_sample_rate_per_tree -> 1.0
    stopping_rounds -> 0
    ignored_columns -> null
    max_categorical_levels -> 10
    huber_alpha -> 0.9
    class_sampling_factors -> null
    seed -> 1
    categorical_encoding -> EnumLimited
    min_split_improvement -> 1.0E-5
    calibrate_model -> false
    keep_cross_validation_fold_assignment -> false
    response_column -> CAPSULE
    fold_assignment -> AUTO
    score_tree_interval -> 0
    distribution -> AUTO
    max_after_balance_size -> 5.0
    nbins_top_level -> 1024
    sample_rate -> 1.0
    min_rows -> 10.0
    ntrees -> 50
    max_depth -> 5
    fold_column -> null
    ignore_const_cols -> true
    balance_classes -> false
    build_tree_one_node -> false
    col_sample_rate_change_per_level -> 1.0
    custom_metric_func -> null
    monotone_constraints -> [Lhex.KeyValue;@2baf9cd4
    max_runtime_secs -> 0.0
    score_each_iteration -> false
    model_id -> null
    col_sample_rate -> 1.0
    offset_column -> null
    custom_distribution_func -> null
    histogram_type -> AUTO
    weights_column -> null
    parallelize_cross_validation -> true
    check_constant_response -> true
    export_checkpoints_dir -> null
    sample_rate_per_class -> null
    nfolds -> 0
    training_frame -> frame_rdd_25231464716
    max_abs_leafnode_pred -> 1.7976931348623157E308
    stopping_tolerance -> 0.001
    learn_rate -> 0.1
    learn_rate_annealing -> 1.0
    pred_noise_bandwidth -> 0.0
    keep_cross_validation_predictions -> false
    quantile_alpha -> 0.5
    gainslift_bins -> -1
    calibration_frame -> null
    tweedie_power -> 1.5
    stopping_metric -> AUTO
    keep_cross_validation_models -> true
    nbins_cats -> 1024
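
One way to spot a gap like this is to diff the parameter names sent to the backend against the names the exported model reports. The following is only a minimal sketch: the log excerpt is copied from the dump above, the helper names are illustrative, and `model_names` is a hypothetical stand-in for the model-side list (it is not taken from the gist).

```python
# Excerpt of the "name -> value" dump from this ticket (not the full list).
BACKEND_LOG = """\
nbins -> 20
max_categorical_levels -> 10
categorical_encoding -> EnumLimited
ntrees -> 50
max_depth -> 5
"""


def parse_params(log_text):
    """Parse 'name -> value' lines into a dict of strings."""
    params = {}
    for line in log_text.splitlines():
        if "->" not in line:
            continue  # skip blank or unrelated lines
        name, _, value = line.partition("->")
        params[name.strip()] = value.strip()
    return params


def missing_from_model(backend_params, model_param_names):
    """Return backend parameter names that the model side does not report."""
    return sorted(set(backend_params) - set(model_param_names))


backend = parse_params(BACKEND_LOG)

# HYPOTHETICAL: parameter names as the exported model might report them.
# This list is an assumption for illustration, not the content of the gist.
model_names = ["nbins", "categorical_encoding", "ntrees", "max_depth"]

print(missing_from_model(backend, model_names))
```

With the real model JSON, `model_names` would instead be filled from whatever parameter list the model exposes.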

Activity

Pavel Pscheidl
September 14, 2020, 2:10 PM

I can take this one and fix it this evening if it is needed quickly.

Marek Novotny
September 14, 2020, 2:44 PM

Thanks for looking into it, but no need to stress about it.

Marek Novotny
September 15, 2020, 1:54 PM

I think this ticket is irrelevant, since max_categorical_levels is not listed in the fields variable of the *ParametersV3 classes.

Pavel Pscheidl
September 15, 2020, 2:55 PM

Saw the PR. Let's close it then.

Marek Novotny
September 15, 2020, 4:14 PM

Closing… Thanks!

Assignee

Pavel Pscheidl

Fix versions

None

Reporter

Marek Novotny

Support ticket URL

None

Labels

None

Affected Spark version

None

Customer Request Type

None

Task progress

None

ReleaseNotesHidden

None

CustomerVisible

No

Priority

Major