Project OOM when creating an h2o frame poc

Description

We discussed this idea with Michal on the Epsilon customer onboarding call.

The idea is to project the final h2o frame size while the frame is being built,
and to fail the frame creation early enough
to save the whole H2O cluster from going down (or from failing other users' work).

For example, if 2% of the frame records have been created and the h2o frame occupies 10 GB so far,
we can project that 100% of the frame will take around 500 GB of memory.
H2O can then also check the JVM max heap size (minus some safety %) and
kill the h2o frame creation based on the projected frame size.
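
A rough sketch of the arithmetic above, in Python for illustration only. The function names, the 10% safety margin and the 256 GB heap are assumptions made for this example, not part of H2O:

```python
GB = 1024 ** 3


def project_frame_size(bytes_so_far: float, fraction_done: float) -> float:
    """Linearly extrapolate the final frame size from partial progress."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return bytes_so_far / fraction_done


def exceeds_heap(projected_bytes: float, max_heap_bytes: float,
                 safety_fraction: float = 0.10) -> bool:
    """Return True if the projection does not fit into the max heap
    after reserving safety_fraction of it (assumed default: 10%)."""
    usable = max_heap_bytes * (1.0 - safety_fraction)
    return projected_bytes > usable


# 2% of the records created, 10 GB used so far: projection is around 500 GB.
projected = project_frame_size(10 * GB, 0.02)
print(f"projected: {projected / GB:.0f} GB")

# Hypothetical 256 GB heap: the projection does not fit, so the job would be killed.
print("kill frame creation:", exceeds_heap(projected, 256 * GB))
```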

Thank you.

Activity

Jan Sterba
February 1, 2020, 9:04 AM

To be continued in PUBDEV-6614…

Ruslan Dautkhanov
December 12, 2019, 7:20 PM

Thank you, Michal and Jan.

 

Michal Kurka
December 12, 2019, 3:11 PM

PoC done. We will open a new Jira for turning it into production code.

Jan Sterba
November 23, 2019, 8:58 AM

The PoC was done on a single-node cluster to check the feasibility of such a feature.

How this was implemented:

  • based on the import progress reported by the parser, regularly check available and used memory

  • based on the memory used by the frame so far, use linear extrapolation to predict the final frame size

  • if the final frame size fits into the available memory with a sufficient margin, continue parsing; otherwise stop the job
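
The three steps above could look roughly like the sketch below. It is an illustration in Python, not the PoC code: `ParseAborted`, `check_projection`, `run_parse`, the sample values and the 20% margin are all hypothetical, and the periodic checks are simulated with a list of samples rather than real parser callbacks.

```python
class ParseAborted(Exception):
    """Raised when the projected frame size would not fit into memory."""


def check_projection(progress, frame_bytes, available_bytes, margin=0.20):
    """Extrapolate the final frame size linearly and compare it with the budget.

    progress        -- fraction of the import completed, in (0, 1]
    frame_bytes     -- memory currently used by the partially built frame
    available_bytes -- memory still free, not counting the frame itself
    margin          -- fraction of the budget kept as a reserve (assumed value)
    """
    projected = frame_bytes / progress
    # The frame may grow into what it already uses plus what is still free.
    budget = (frame_bytes + available_bytes) * (1.0 - margin)
    if projected > budget:
        raise ParseAborted(
            f"projected {projected:.0f} bytes exceeds budget {budget:.0f} bytes"
        )
    return projected


def run_parse(samples, margin=0.20):
    """Run one check per (progress, frame_bytes, available_bytes) sample.

    Returns the last projection if every check passes; otherwise the
    ParseAborted raised by the failing check propagates to the caller.
    """
    projected = None
    for progress, frame_bytes, available_bytes in samples:
        projected = check_projection(progress, frame_bytes, available_bytes, margin)
    return projected


# A frame that stays within budget: every check passes.
ok_samples = [(0.1, 1e9, 63e9), (0.2, 2e9, 62e9)]
print("final projection:", run_parse(ok_samples))

# A frame that is projected to outgrow memory: the job is stopped early.
try:
    run_parse([(0.02, 10e9, 54e9)])
except ParseAborted as err:
    print("stopped:", err)
```

The budget here counts memory already held by the frame plus memory still free; a real implementation would have to decide how to treat memory used by other jobs.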

Results:

  • this seems to work and could be valuable to users

  • we need to improve the extrapolation function, since uncompressed chunks may make it more aggressive than necessary

  • the multi-node scenario will require additional work

 

 

Fixed

Assignee: Jan Sterba

Fix versions:

Reporter: Ruslan Dautkhanov