Delta Lake fails to import (in Python)

Description

Delta Lake file import was added to H2O (https://h2oai.atlassian.net/browse/PUBDEV-7923), but it fails when used through the Python API.

The key is to disable the local-expansion workaround in the Python client, after which the import starts working:
```H2OFrame._LOCAL_EXPANSION_ON_SINGLE_IMPORT_ = False```
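For context, here is a minimal sketch of how the flag might be applied around a single import. The helper name `import_delta_table` is hypothetical and not part of H2O; the sketch assumes the `h2o` package is installed and a cluster has already been started with `h2o.init()`. It restores the previous flag value afterwards so other imports are unaffected.

```python
def import_delta_table(path):
    """Import a Delta Lake directory with local path expansion switched off.

    Hypothetical helper: only the flag itself comes from this ticket.
    """
    # Imported lazily so the module loads even where h2o is not installed.
    import h2o
    from h2o.frame import H2OFrame

    previous = H2OFrame._LOCAL_EXPANSION_ON_SINGLE_IMPORT_
    H2OFrame._LOCAL_EXPANSION_ON_SINGLE_IMPORT_ = False
    try:
        return h2o.import_file(path)
    finally:
        # Put the flag back so later imports keep their usual behaviour.
        H2OFrame._LOCAL_EXPANSION_ON_SINGLE_IMPORT_ = previous
```

A call would then look like `import_delta_table("dbfs:/mnt/delta/events3")`, using the table path from this ticket.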

Activity

Neema Mashayekhi
January 7, 2021, 1:03 AM
Edited

Good point.

The error shows the .crc and .json paths, but with every forward slash and the colon converted to an underscore:

"dbfs:/mnt/delta/events3/_delta_log/00000000000000000000.crc" ->

dbfs__mnt_delta_events3__delta_log_00000000000000000000.crc
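The mangling above can be reproduced with a one-line substitution. This is only an illustration of the observed transformation, not the code H2O actually runs; the function name `mangle` is made up for the example.

```python
import re


def mangle(path):
    """Replace every forward slash and colon with an underscore."""
    return re.sub(r"[/:]", "_", path)


original = "dbfs:/mnt/delta/events3/_delta_log/00000000000000000000.crc"
print(mangle(original))
# -> dbfs__mnt_delta_events3__delta_log_00000000000000000000.crc
```

The doubled underscores come from `:/` after `dbfs` and from the `/` in front of `_delta_log`.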

Michal Kurka
January 6, 2021, 11:34 PM

It doesn’t work when a slash is appended to the directory name (e.g. by the user or by the Python client, hence the workaround). The regular expression that filters out the log files needs to be revised so that it works both with and without a trailing slash.
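A rough sketch of a slash-tolerant filter, written in Python for illustration only. The actual fix lives in H2O's own code and may look quite different; `data_files`, the `.json` entry, and the Parquet file name are assumptions made for the example.

```python
import re


def data_files(directory, paths):
    """Return the paths under `directory` that are not Delta log files."""
    # Normalise first, so "events3" and "events3/" behave identically.
    base = directory.rstrip("/")
    # "/+" also tolerates a doubled slash left over from naive joining.
    log_pattern = re.compile(re.escape(base) + r"/+_delta_log/")
    return [p for p in paths if not log_pattern.match(p)]


listing = [
    "dbfs:/mnt/delta/events3/_delta_log/00000000000000000000.crc",
    "dbfs:/mnt/delta/events3/_delta_log/00000000000000000000.json",
    "dbfs:/mnt/delta/events3/part-00000.snappy.parquet",  # hypothetical data file
]

# Both spellings of the directory keep only the data file.
print(data_files("dbfs:/mnt/delta/events3", listing))
print(data_files("dbfs:/mnt/delta/events3/", listing))
```

Stripping the trailing slash before building the pattern keeps the expression itself simple, rather than encoding an optional slash inside it.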

Fixed

Assignee: Michal Kurka

Fix versions:

Reporter: Neema Mashayekhi

Support ticket URL:

Labels: None

Affected Spark version: None

Customer Request Type: None

Task progress: None

ReleaseNotesHidden: None

CustomerVisible: No

Priority: Major