Closed Bug 1382673 (Opened 7 years ago, Closed 7 years ago)

Parquet Writer should write NULL values for MAP types

Categories

(Data Platform and Tools :: General, enhancement, P1)

Points: 2

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: frank, Assigned: ashort, Mentored)

Currently, the Parquet writer omits a key/value pair from a MAP if the value is NULL. The Parquet schema allows map values to be optional, in which case the key should still be written, with its value as NULL.
Blocks: 1382336
Assignee: trink → ashort
Mentor: mtrinkala
Status: NEW → ASSIGNED
Points: --- → 2
This is a consequence of representing Parquet NULL as Lua's "nil". A Lua table cannot hold nil as the value for a key; assigning nil is equivalent to the key being absent.

Some other representation must be found for nulls; cjson uses a null-pointer lightuserdata object and we could do that here too.
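
To illustrate the point above, here is a minimal sketch in plain Lua. The NULL sentinel and the count_keys helper are hypothetical names made up for this example; a unique table stands in for the null-pointer lightuserdata that cjson uses, since plain Lua cannot create lightuserdata. It is not the code from the actual fix.

```lua
-- Hypothetical sentinel standing in for a null-pointer lightuserdata.
-- Any unique, non-nil value works for the purpose of this illustration.
local NULL = setmetatable({}, { __tostring = function() return "NULL" end })

-- Count the keys that are actually present in a table.
local function count_keys(t)
  local n = 0
  for _ in pairs(t) do
    n = n + 1
  end
  return n
end

-- With nil, the key "b" is simply absent, so a writer iterating the
-- table never sees it and cannot emit a NULL map value for it.
local with_nil = { a = 1, b = nil }

-- With a sentinel, the key "b" is present and can be detected.
local with_null = { a = 1, b = NULL }

print(count_keys(with_nil))      -- 1
print(count_keys(with_null))     -- 2
print(with_nil.b == nil)         -- true
print(with_null.b == NULL)       -- true
```

A writer iterating with pairs() could then compare each value against the sentinel and emit the key with a NULL value instead of skipping it.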
This is nearly done. There has been a round of implementation and code-review back-and-forth, which you can see in the pull request (https://github.com/mozilla-services/lua_sandbox_extensions/pull/154) if you're interested in the gritty details. Currently we (where "we" is :ashort) *think* the code is complete and are waiting for :trink to review the latest changes.
Flags: needinfo?(mtrinkala)
Reviewed and commented; still r-.
Flags: needinfo?(mtrinkala)
Fix in https://github.com/mozilla-services/lua_sandbox_extensions/commit/ffadf3b401e01e55461d2a93363ff21b76baea04
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Component: Pipeline Ingestion → General