Closed Bug 1668414 Opened 4 years ago Closed 4 years ago

Saving a wikipedia page is missing images used from the page (because srcset attribute is not scanned/re-written)

Categories

(Core :: DOM: Core & HTML, defect, P2)

Firefox 81
defect

Tracking


RESOLVED FIXED
83 Branch
Tracking Status
firefox83 --- fixed

People

(Reporter: digitalio995, Assigned: emilio)


Attachments

(4 files)

Attached image Mozilla.png

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:81.0) Gecko/20100101 Firefox/81.0

Steps to reproduce:

Open a page, right click, "Save page as..."

Actual results:

The list of downloads shows an error. Some page files are saved, some are not.

Expected results:

Page is saved.

I can reproduce the issue on Nightly 83.0a1 on Windows 10. Chrome works without failure.

  1. Open https://developer.mypurecloud.com/forum/t/problem-with-the-transcription-of-an-interaction/8507
  2. Click the Accept button if the Cookie Consent dialog pops up.
  3. Save page as (Ctrl+S) > Web page complete > Save
Component: Untriaged → File Handling

It fails every time on Save Page As, but it works when retrying the download from the panel. That sounds like some principal/security failure...

(In reply to Alice0775 White from comment #1)

> I can reproduce the issue on Nightly 83.0a1 on Windows 10. Chrome works without failure.
>
>   1. Open https://developer.mypurecloud.com/forum/t/problem-with-the-transcription-of-an-interaction/8507
>   2. Click the Accept button if the Cookie Consent dialog pops up.
>   3. Save page as (Ctrl+S) > Web page complete > Save

This looks to me like it's due to mixed content blocking. Can you file a separate bug? Although the result (download fails) is the same, I think the cause is different.

Flags: needinfo?(alice0775)

Can you attach your about:support information? I'd be particularly interested in whether you have something like uBlock Origin or another ad or content blocker installed, or whether you're using tracking protection (did you save the page from a private browsing window?).

As it is, saving the Mozilla homepage (which seems to be what you tried to do in comment #0, based on the screenshot) works for me in 82 beta on a relatively clean profile, but there are known issues with the page saving code when ad/tracking-blocking blocks some of the requests for parts of the page (bug 1445211).

Flags: needinfo?(digitalio995)
See Also: → 1668530

(In reply to :Gijs (he/him) from comment #4)

> Can you attach your about:support information? I'd be particularly interested in whether you have something like uBlock Origin or another ad or content blocker installed, or whether you're using tracking protection (did you save the page from a private browsing window?).
>
> As it is, saving the Mozilla homepage (which seems to be what you tried to do in comment #0, based on the screenshot) works for me in 82 beta on a relatively clean profile, but there are known issues with the page saving code when ad/tracking-blocking blocks some of the requests for parts of the page (bug 1445211).

I have uBlock Origin installed, and disabling it does seem to solve the problem in most cases. However, some pages still fail in one way or another.
For instance, saving https://en.wikipedia.org/wiki/Belousov%E2%80%93Zhabotinsky_reaction results in GIF animations not being shown in the saved page.

Flags: needinfo?(digitalio995)

(In reply to Digi from comment #5)

> For instance, saving https://en.wikipedia.org/wiki/Belousov%E2%80%93Zhabotinsky_reaction results in GIF animations not being shown in the saved page.

Let's morph to address this issue, then, as the uBlock issue is covered elsewhere.

The WebBrowserPersist code, which walks the DOM and finds resources that need saving, seems to be responsible for this, so I'm moving this over to the relevant component and adjusting the summary.

Status: UNCONFIRMED → NEW
Component: File Handling → DOM: Core & HTML
Ever confirmed: true
Flags: needinfo?(alice0775)
Product: Firefox → Core
Summary: Page saving doesn't work properly → Saving a wikipedia page is missing images used from the page

The issue is that we're not rewriting srcset URLs. Should be easily fixable.
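For illustration only, rewriting a srcset value might look roughly like the Python sketch below. This is not the Gecko patch: `parse_srcset`, `rewrite_srcset`, and `to_local` are hypothetical names, the example.org URLs are made up, and the parser is a simplified reading of the HTML image-candidate syntax (it ignores parentheses inside descriptors).

```python
from typing import Callable, List, Tuple

_WS = " \t\n\f\r"


def parse_srcset(value: str) -> List[Tuple[str, str]]:
    """Split a srcset value into (url, descriptor) pairs (simplified)."""
    candidates = []
    pos, n = 0, len(value)
    while True:
        # Skip whitespace and commas between candidates.
        while pos < n and (value[pos] in _WS or value[pos] == ","):
            pos += 1
        if pos >= n:
            break
        # The URL is the next run of non-whitespace characters.
        start = pos
        while pos < n and value[pos] not in _WS:
            pos += 1
        url = value[start:pos]
        descriptor = ""
        if url.endswith(","):
            # A trailing comma ends the candidate; there is no descriptor.
            url = url.rstrip(",")
        else:
            # Everything up to the next comma is the descriptor ("2x", "480w").
            start = pos
            while pos < n and value[pos] != ",":
                pos += 1
            descriptor = value[start:pos].strip(_WS)
        candidates.append((url, descriptor))
    return candidates


def rewrite_srcset(value: str, rewrite: Callable[[str], str]) -> str:
    """Return the srcset value with every candidate URL passed through rewrite."""
    parts = []
    for url, descriptor in parse_srcset(value):
        new_url = rewrite(url)
        parts.append(f"{new_url} {descriptor}" if descriptor else new_url)
    return ", ".join(parts)


def to_local(url: str) -> str:
    """Hypothetical mapping from a remote URL to a file saved next to the page."""
    return "page_files/" + url.rsplit("/", 1)[-1]


original = (
    "https://example.org/img/a-330px.gif 1.5x, "
    "https://example.org/img/a-440px.gif 2x"
)
print(rewrite_srcset(original, to_local))
```

If only `src` is rewritten and `srcset` is left alone, the saved page keeps pointing at the remote candidates, which would match the missing images reported above.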

Assignee: nobody → emilio

(The fix was actually relatively straightforward, but as it turns out there are close to zero tests for this, so I'm adding some.)

Enum classes and a couple other simplifications.

Rewrite the srcset URIs appropriately...

This code was completely untested (modulo one test for forms...). I added a
generic reftest framework to compare the original vs. the persisted document, so that
testing changes to this code is easier next time.

Depends on D92132

Summary: Saving a wikipedia page is missing images used from the page → Saving a wikipedia page is missing images used from the page (because srcset attribute is not scanned/re-written)
Keywords: leave-open
Pushed by ealvarez@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c30ee90cac6a Miscellaneous ResponsiveImageSelector cleanups. r=edgar
Severity: -- → S3
Priority: -- → P2

Thanks for asking me to add a test ;)

Using the responsive image selector works for scanning the srcset images, but
since we wouldn't rewrite the <source> URIs, the invalid URLs there would still
take precedence.
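As a rough sketch of why <source> matters, the Python below re-serializes a small fragment and rewrites srcset on both <img> and <source>. It is an assumption-laden illustration, not the WebBrowserPersist code: `SRCSET_ELEMENTS`, `SrcsetRewriter`, `persist`, and `to_local` are hypothetical names, the example.org URLs are made up, the srcset splitting assumes URLs contain no commas or whitespace, and comments and doctypes are dropped.

```python
from html import escape
from html.parser import HTMLParser

# Elements whose srcset gets rewritten. Dropping "source" from this set
# leaves the remote URLs in <source>, which a browser would still prefer.
SRCSET_ELEMENTS = {"img", "source"}


def to_local(url: str) -> str:
    """Hypothetical mapping from a remote URL to a saved local file."""
    return "page_files/" + url.rsplit("/", 1)[-1]


def rewrite_srcset(value: str) -> str:
    # Simplified: candidates are assumed to be separated by commas only.
    out = []
    for candidate in value.split(","):
        fields = candidate.split()
        if fields:
            fields[0] = to_local(fields[0])
            out.append(" ".join(fields))
    return ", ".join(out)


class SrcsetRewriter(HTMLParser):
    """Re-serializes markup, rewriting src and srcset along the way."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        parts = [tag]
        for name, value in attrs:
            if value is None:
                # Boolean attribute such as "hidden".
                parts.append(name)
                continue
            if name == "srcset" and tag in SRCSET_ELEMENTS:
                value = rewrite_srcset(value)
            elif name == "src" and tag == "img":
                value = to_local(value)
            parts.append(f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def persist(markup: str) -> str:
    parser = SrcsetRewriter()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


markup = (
    '<picture>'
    '<source srcset="https://example.org/a.webp 1x, https://example.org/a@2x.webp 2x" type="image/webp">'
    '<img src="https://example.org/a.png" srcset="https://example.org/a@2x.png 2x">'
    '</picture>'
)
print(persist(markup))
```

With "source" removed from `SRCSET_ELEMENTS`, the <img> candidates are local but the <source> candidates still point at example.org, which is the situation the comment above describes.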

Depends on D92133

Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cef0f66ca3e7
Handle srcset in WebBrowserPersist. r=edgar
https://hg.mozilla.org/integration/autoland/rev/94a576dd5f89
Also handle <source srcset>. r=edgar
Flags: needinfo?(emilio)
Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b8a22668fbef
Handle srcset in WebBrowserPersist. r=edgar
https://hg.mozilla.org/integration/autoland/rev/effdabb1dfd7
Also handle <source srcset>. r=edgar

I think this might be fixed based on these last commits?

Flags: needinfo?(emilio)
Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(emilio)
Resolution: --- → FIXED
Keywords: leave-open
Target Milestone: --- → 83 Branch
Pushed by emilio@crisal.io: https://hg.mozilla.org/integration/autoland/rev/91464994c8ac Restore binary file that was empty when re-landing due to bug 1709608.
Blocks: 1804865