Open
Bug 2001690
Opened 4 months ago
Updated 4 months ago
Consumes excessive RAM when running Qwen3-0.6B-ONNX on onnx-native
Categories
(Core :: Machine Learning: On Device, defect)
Tracking
NEW
People
(Reporter: thasan, Unassigned, NeedInfo)
References
Details
(Whiteboard: [genai])
Performance Profile: https://share.firefox.dev/4rjmyNa
Model: onnx-community/Qwen3-0.6B-ONNX
Configuration:
"reasoning-text-generation": {
  modelId: "onnx-community/Qwen3-0.6B-ONNX",
  task: "text-generation",
  modelRevision: "main",
  modelHub: "huggingface",
  dtype: "int8",
  device: "cpu",
  backend: "onnx-native",
  inputArgs: [
    [
      {
        role: "system",
        content: "You are a helpful AI assistant that thinks step-by-step before answering.",
      },
      {
        role: "user",
        content: "Explain why the sky is blue to a 5-year-old.",
      },
    ],
  ],
  runOptions: {
    max_new_tokens: 512,
    temperature: 0.6,
    do_sample: false,
    return_full_text: false,
  },
  timeout: 120,
}
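
For context when reading the profile, a rough baseline for what the weights alone might occupy can be sketched as parameter count times bytes per parameter. This is only a back-of-the-envelope estimate: the parameter count of roughly 0.6e9 is an assumption taken from the "0.6B" in the model name, the helper `estimateWeightBytes` is hypothetical and not part of any Firefox or ONNX API, and the figure ignores tensors left unquantized, the KV cache, activations, and runtime overhead.

```javascript
// Bytes per parameter for a few common dtypes.
const BYTES_PER_PARAM = { fp32: 4, fp16: 2, int8: 1 };

// Hypothetical helper: lower-bound estimate of weight memory in bytes.
function estimateWeightBytes(paramCount, dtype) {
  const bytes = BYTES_PER_PARAM[dtype];
  if (bytes === undefined) {
    throw new Error(`Unknown dtype: ${dtype}`);
  }
  return paramCount * bytes;
}

// ASSUMPTION: ~0.6e9 parameters, inferred from the model name only.
const weightBytes = estimateWeightBytes(0.6e9, "int8");
const weightMiB = weightBytes / (1024 * 1024);
console.log(`Estimated int8 weights: ~${weightMiB.toFixed(0)} MiB`);
```

Resident memory far above this baseline in the profile would suggest the excess comes from somewhere other than the quantized weights themselves, though the estimate cannot say where.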
Comment 1•4 months ago
The severity field is not set for this bug.
:tarek, could you have a look please?
For more information, please visit BugBot documentation.
Flags: needinfo?(tziade)
Updated•4 months ago
Severity: -- → S3