Bug 1968939 Comment 1 Edit History
This change will come in several components. New C++ APIs surfaced in JS:

- `splitOnnxFile`: a function in MLUtils that takes an ONNX model and splits it into a separate graph and data; bonus points if we can make this one streamable so we don't load the whole model in memory.
- `compileGraph`: a function in MLUtils that takes an ONNX graph and returns a compiled graph.

From there, ModelHub can decide in the main process at download time whether it wants to compile the model on the fly and split out the data. If it does, it will store, for each `model.onnx` file:

- `model.onnx`: the compiled graph, stripped of the weight data
- `model.onnx_data`: the weights

And we'll use those files when running inference (the WASM runtime already supports that).
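To make the intended flow concrete, here is a minimal sketch of how ModelHub might call these APIs at download time. Only the names `splitOnnxFile` and `compileGraph` come from the plan above; the `prepareModel` helper, the signatures and return shapes of the MLUtils calls, and the use of `IOUtils`/`PathUtils` to write the artifacts are assumptions for illustration, not the actual implementation.

```js
// Hypothetical sketch: ModelHub-side preparation of a downloaded ONNX model.
// `MLUtils.splitOnnxFile` and `MLUtils.compileGraph` are the proposed APIs;
// their exact signatures are assumed here (both returning Uint8Array data).
async function prepareModel(onnxPath, destDir) {
  // Split the downloaded ONNX file into the graph (without weights) and the
  // raw weight data. Ideally this would be streamable so the whole model
  // never has to be held in memory at once.
  const { graph, data } = await MLUtils.splitOnnxFile(onnxPath);

  // Compile the stripped graph ahead of time.
  const compiledGraph = await MLUtils.compileGraph(graph);

  // Store the two artifacts side by side; the WASM runtime can already load
  // a graph whose weights live in an external `model.onnx_data` file.
  await IOUtils.write(PathUtils.join(destDir, "model.onnx"), compiledGraph);
  await IOUtils.write(PathUtils.join(destDir, "model.onnx_data"), data);
}
```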