
Parquet Writer should write NULL values for MAP types

RESOLVED FIXED

Status

Product: Data Platform and Tools
Component: Pipeline Ingestion
Priority: P1
Severity: normal
Status: RESOLVED FIXED
Reported: 2 months ago
Modified: 2 months ago

People

(Reporter: frank, Assigned: ashort, Mentored)

(Reporter)

Description

2 months ago
Currently, the Parquet writer omits a key/value pair from a MAP when the value is NULL. The Parquet schema allows optional values, so in that case the key should still be written, with its value set to NULL.
(Reporter)

Updated

2 months ago
Blocks: 1382336

Updated

2 months ago
Assignee: trink → ashort
Mentor: mtrinkala@mozilla.com
Status: NEW → ASSIGNED
Points: --- → 2
(Assignee)

Comment 1

2 months ago
This is a consequence of representing Parquet NULL as Lua's "nil". A Lua table cannot hold nil as the value for a key: assigning nil is equivalent to the key being absent.

Some other representation must be found for nulls; cjson uses a null-pointer lightuserdata object and we could do that here too.
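
To illustrate the point above, here is a minimal sketch in plain Lua. The names NULL, count_keys and describe are hypothetical and are not taken from the parquet module. Pure Lua cannot create a lightuserdata, so a unique empty table stands in for the null-pointer lightuserdata that cjson uses; the idea of a sentinel that is distinct from nil is the same.

```lua
-- Hypothetical sentinel standing in for a Parquet NULL.
-- (cjson uses a NULL-pointer lightuserdata; a unique table plays that role here.)
local NULL = {}

-- Count the keys that are actually present in a table.
local function count_keys(t)
  local n = 0
  for _ in pairs(t) do n = n + 1 end
  return n
end

-- Render a map as "k=v,k=v" with keys sorted, printing the sentinel as NULL.
local function describe(map)
  local keys = {}
  for k in pairs(map) do keys[#keys + 1] = k end
  table.sort(keys)
  local out = {}
  for _, k in ipairs(keys) do
    local v = map[k]
    out[#out + 1] = k .. "=" .. (v == NULL and "NULL" or tostring(v))
  end
  return table.concat(out, ",")
end

-- Assigning nil leaves the key absent, so a writer never sees "b".
local with_nil = {a = 1, b = nil}

-- With a sentinel the key survives and a writer can emit a null value for it.
local with_null = {a = 1, b = NULL}

print(count_keys(with_nil), describe(with_nil))
print(count_keys(with_null), describe(with_null))
```

With the sentinel, a writer iterating the table can tell "key present with a null value" apart from "key absent", which is the distinction an optional MAP value needs.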
This is nearly done. There has been a round of implementation and code-review back-and-forth, which you can see in the pull request (https://github.com/mozilla-services/lua_sandbox_extensions/pull/154) if you're interested in the gritty details. Currently we (where "we" is :ashort) *think* the code is complete and are waiting for :trink to review the latest changes.
Flags: needinfo?(mtrinkala)
Reviewed and commented; still r-.
Flags: needinfo?(mtrinkala)
Fix in https://github.com/mozilla-services/lua_sandbox_extensions/commit/ffadf3b401e01e55461d2a93363ff21b76baea04
Status: ASSIGNED → RESOLVED
Last Resolved: 2 months ago
Resolution: --- → FIXED