Open Bug 1968939 Opened 7 months ago Updated 5 days ago

Add an API to optimize ONNX models

Status:

NEW

People

(Reporter: padenot, Assigned: padenot)

References

(Blocks 1 open bug)

Details

Attachments

(3 files)

Bug 1968939 - Updater onnx api header to 1.22. r?tarek; 7 months ago; Paul Adenot (:padenot); 48 bytes, text/x-phabricator-request
Bug 1968939 - Move utility functions in InferenceSession to a dedicated file. r?tarek; 7 months ago; Paul Adenot (:padenot); 48 bytes, text/x-phabricator-request
Bug 1968939 - Expose a function to js to compile an ONNX model with optimization. r?tarek; 7 months ago; Paul Adenot (:padenot); 48 bytes, text/x-phabricator-request

Paul Adenot (:padenot)

Assignee

Description

7 months ago

The API is exposed to JS via MLUtils and allows optimizing models once they have been downloaded. At inference time the already-optimized model is used, which takes the optimization step off the critical path, saving a large amount of time and reducing end-to-end latency for the user.

Paul Adenot (:padenot)

Assignee

Updated

7 months ago

Assignee: nobody → padenot

Tarek Ziadé (:tarek)

Comment 1

7 months ago (edited)

This change will come in several components.

New C++ APIs surfaced in JS:

splitOnnxFile: a function in MLUtils that takes an ONNX model and splits it into a separate graph and weight data. Bonus points if we can make it streamable, so that we don't load the whole model in memory.
compileGraph: a function in MLUtils that takes an ONNX graph and returns a compiled graph.

From there, ModelHub can decide in the main process, at download time, whether it wants to compile the model on the fly and split out the data.

If it does, it will store for each model.onnx file:

model.onnx: the compiled graph, stripped of the weight data
model.onnx_data: the weights

We'll then use those files when running inference (the WASM runtime already supports that).
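
The download-time flow described above could look roughly like the sketch below. Only the names splitOnnxFile and compileGraph and the two file names come from this comment; the function signatures, the MLUtilsLike interface, the prepareModelFiles helper and the fakeUtils stand-in are assumptions made for illustration, not the actual MLUtils or ModelHub API.

```typescript
// Hypothetical shape of the two proposed MLUtils functions. Only the names
// come from the comment above; the signatures are assumptions.
interface MLUtilsLike {
  splitOnnxFile(model: Uint8Array): Promise<{ graph: Uint8Array; data: Uint8Array }>;
  compileGraph(graph: Uint8Array): Promise<Uint8Array>;
}

// Hypothetical download-time hook: returns the files to store for one model.
async function prepareModelFiles(
  utils: MLUtilsLike,
  downloaded: Uint8Array,
  compileOnDownload: boolean,
): Promise<Map<string, Uint8Array>> {
  const files = new Map<string, Uint8Array>();
  if (!compileOnDownload) {
    // Keep the model exactly as it was downloaded.
    files.set("model.onnx", downloaded);
    return files;
  }
  const { graph, data } = await utils.splitOnnxFile(downloaded);
  files.set("model.onnx", await utils.compileGraph(graph));
  files.set("model.onnx_data", data);
  return files;
}

// Stand-in implementation so the sketch runs without the real C++ bindings.
const fakeUtils: MLUtilsLike = {
  async splitOnnxFile(model) {
    // Pretend the first byte is the graph and the rest is weight data.
    return { graph: model.slice(0, 1), data: model.slice(1) };
  },
  async compileGraph(graph) {
    // A real implementation would return an optimized graph.
    return graph;
  },
};
```

Keeping the decision in a single helper means the caller only has to persist whatever files come back, whether or not compilation was requested.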

Tarek Ziadé (:tarek)

Comment 2

7 months ago (edited)

Note that we will also need to extend the onnx-native backend call so that it feeds the weight data in through session_options.externalData.

We should also deactivate the optimization step in the runtime and assume it has already been done beforehand.
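
As a sketch of how those two points might translate into session options: externalData and graphOptimizationLevel are, to my knowledge, existing onnxruntime-web session options. The local SessionOptionsSketch type only mirrors the two fields used here, and the buildSessionOptions helper, as well as the idea that the onnx-native backend would accept the same shape, are assumptions.

```typescript
// Local mirror of the two session options used here, so the sketch is
// self-contained. The real onnxruntime type has many more fields.
interface ExternalDataFile {
  path: string; // name under which the graph references the weights
  data: Uint8Array; // the weight bytes themselves
}

interface SessionOptionsSketch {
  externalData?: ExternalDataFile[];
  graphOptimizationLevel?: "disabled" | "basic" | "extended" | "all";
}

// Hypothetical helper: build options for a graph whose weights live in a
// separate model.onnx_data file.
function buildSessionOptions(
  weights: Uint8Array,
  precompiled: boolean,
): SessionOptionsSketch {
  const options: SessionOptionsSketch = {
    externalData: [{ path: "model.onnx_data", data: weights }],
  };
  if (precompiled) {
    // Optimization already ran at download time, so skip it at load time.
    options.graphOptimizationLevel = "disabled";
  }
  return options;
}
```

Leaving graphOptimizationLevel unset for models that were not precompiled keeps the runtime's default behaviour for them.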

Tarek Ziadé (:tarek)

Comment 3

7 months ago

For reference, our Python script that splits graph and weights: https://searchfox.org/mozilla-central/source/toolkit/components/ml/tools/convert_to_external_data.py

Paul Adenot (:padenot)

Assignee

Comment 4

7 months ago

Attached file Bug 1968939 - Updater onnx api header to 1.22. r?tarek

Paul Adenot (:padenot)

Assignee

Comment 5

7 months ago

Attached file Bug 1968939 - Move utility functions in InferenceSession to a dedicated file. r?tarek

Paul Adenot (:padenot)

Assignee

Comment 6

7 months ago

Attached file Bug 1968939 - Expose a function to js to compile an ONNX model with optimization. r?tarek

Tarek Ziadé (:tarek)

Updated

7 months ago

Severity: -- → S3

Type: defect → enhancement

Priority: -- → P3

Paul Adenot (:padenot)

Assignee

Comment 7

3 months ago

Clarification: for now this is blocked on https://github.com/huggingface/transformers.js/pull/1382, which we need because it pulls in a newer onnxruntime. Without that update we would risk ABI breakage.

Paul Adenot (:padenot)

Assignee

Updated

3 months ago

Blocks: 1993028

Greg Tatum [:gregtatum]

Updated

2 months ago

Component: Machine Learning: General → Machine Learning: On Device

Greg Tatum [:gregtatum]

Comment 8

5 days ago

I'm simplifying the dependency tree a bit, as I'm finding it confusing what work needs doing here.

No longer blocks: 1993028

Categories

(Core :: Machine Learning: On Device, enhancement, P3)
