Closed
Bug 1995618
Opened 5 months ago
Closed 1 month ago
Pass the links present on the page as a list in the PageExtractor
Categories
(Core :: Machine Learning: On Device, enhancement, P2)
Tracking
RESOLVED
FIXED
149 Branch
| Tracking | Status | |
|---|---|---|
| firefox149 | --- | fixed |
People
(Reporter: gregtatum, Assigned: thasan)
References
(Blocks 1 open bug)
Details
(Whiteboard: [genai])
Attachments
(2 files)
The text content is extracted from the page, but the links are not captured explicitly. We should capture them, perhaps as an optional piece of behavior. The output format still needs to be specified, but Markdown link syntax would probably make sense.
Here is an example test:
Added to:
toolkit/components/pageextractor/tests/browser/browser_dom_extractor.js
add_task(async function test_dom_extractor_links() {
  const { actor, cleanup } = await html`
    <article>
      <h1>Example of Links</h1>
      <ul>
        <li>Here is the <a href="./example-1.html">First link</a></li>
        <li>
          Now this is an <a href="https://example.com/link">external link</a>
        </li>
      </ul>
    </article>
  `;
  const { text, links } = await actor.getText();
  is(
    text,
    "Example of Links\n" +
      "Here is the [First link](https://localhost:7372/example-1.html)\n" +
      "Now this is an [external link](https://example.com/link)",
  );
  Assert.deepEqual(
    links,
    ["./example-1.html", "https://example.com/link"]
  );
  return cleanup();
});
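To make the proposed output shape concrete, here is a minimal standalone sketch of how the markdown text and the link list could be produced together. It is not the PageExtractor implementation: the function name `renderBlocksWithLinks`, the segment shape `{ label, href }`, and the `baseURL` parameter are all hypothetical. It mirrors the example test above, which expects resolved URLs in the text and the raw `href` values in the list; that split is an assumption taken from the test, not a confirmed design.

```javascript
// Hypothetical helper for illustration; not the actual PageExtractor code.
// Each block is an array of segments: a plain string, or { label, href }.
function renderBlocksWithLinks(blocks, baseURL) {
  const links = [];
  const lines = blocks.map(segments =>
    segments
      .map(segment => {
        if (typeof segment === "string") {
          return segment;
        }
        // Record the href exactly as written in the markup (assumption
        // based on the example test).
        links.push(segment.href);
        // Resolve relative hrefs against the page URL for the markdown text.
        const resolved = new URL(segment.href, baseURL).href;
        return `[${segment.label}](${resolved})`;
      })
      .join("")
  );
  return { text: lines.join("\n"), links };
}

const { text, links } = renderBlocksWithLinks(
  [
    ["Example of Links"],
    ["Here is the ", { label: "First link", href: "./example-1.html" }],
    [
      "Now this is an ",
      { label: "external link", href: "https://example.com/link" },
    ],
  ],
  "https://localhost:7372/"
);
console.log(text);
console.log(links);
```

Collecting the list in the same pass as the text keeps the two outputs in document order, so a consumer can match each entry in the list to the corresponding markdown link.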
| Reporter | ||
Updated•5 months ago
Priority: -- → P3
Updated•4 months ago
Component: Machine Learning: General → Machine Learning: On Device
Comment 1•4 months ago
The component has been changed since the backlog priority was decided, so we're resetting it.
For more information, please visit BugBot documentation.
Priority: P3 → --
Updated•3 months ago
| Reporter | ||
Updated•3 months ago
Priority: -- → P2
Updated•3 months ago
Assignee: nobody → thasan
Attachment #9532488 -
Attachment description: WIP: Bug 1995618 - Pass page links as list to PageExtractor → Bug 1995618 - Pass page links as list to PageExtractor
Status: NEW → ASSIGNED
Pushed by gtatum@mozilla.com:
https://github.com/mozilla-firefox/firefox/commit/5433b8bb8f2d
https://hg.mozilla.org/integration/autoland/rev/f145e47d11fe
Pass page links as list to PageExtractor r=ai-ondevice-reviewers,gregtatum
Pushed by imoraru@mozilla.com:
https://github.com/mozilla-firefox/firefox/commit/efbcc321b8d1
https://hg.mozilla.org/integration/autoland/rev/891741892645
Revert "Bug 1995618 - Pass page links as list to PageExtractor r=ai-ondevice-reviewers,gregtatum" for causing bc failures on browser_dom_extractor.js.
Comment 5•2 months ago
Revert for causing bc failures on browser_dom_extractor.js and browser_get_page_content.js.
Flags: needinfo?(thasan)
Updated•1 month ago
Attachment #9537959 -
Attachment description: WIP: Bug 1995618 - Update GetPageContent to use new links format → Bug 1995618 - Update GetPageContent to use new links format r=gregtatum
Updated•1 month ago
Attachment #9532488 -
Attachment description: Bug 1995618 - Pass page links as list to PageExtractor → Bug 1995618 - Pass page links as list to PageExtractor r=gregtatum
Pushed by thasan@mozilla.com:
https://github.com/mozilla-firefox/firefox/commit/6686036b198b
https://hg.mozilla.org/integration/autoland/rev/0a795970d1b3
Pass page links as list to PageExtractor r=ai-ondevice-reviewers,gregtatum
https://github.com/mozilla-firefox/firefox/commit/98c32ffaf198
https://hg.mozilla.org/integration/autoland/rev/e4d57db08628
Update GetPageContent to use new links format r=gregtatum,ai-ondevice-reviewers
Comment 8•1 month ago
| bugherder | ||
https://hg.mozilla.org/mozilla-central/rev/0a795970d1b3
https://hg.mozilla.org/mozilla-central/rev/e4d57db08628
Status: ASSIGNED → RESOLVED
Closed: 1 month ago
status-firefox149: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → 149 Branch
Updated•24 days ago
QA Whiteboard: [qa-triage-done-c150/b149]