Closed Bug 1645943 Opened 9 months ago Closed 8 months ago

Extensions and IndexedDB stop working after visiting site if root zone is included in FQDN

Categories

(Core :: Storage: Quota Manager, defect, P2)

77 Branch
defect

Tracking

()

VERIFIED FIXED
mozilla79
Tracking Status
firefox-esr78 --- verified
firefox77 --- wontfix
firefox78 - wontfix
firefox79 --- verified

People

(Reporter: j.weerts4, Assigned: tt)

References

(Blocks 2 open bugs, Regression)

Details

Attachments

(5 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:77.0) Gecko/20100101 Firefox/77.0

Steps to reproduce:

I visited a website and added the root zone (.) after the TLD. In my case:
https://www.nytimes.com.
https://www.washingtonpost.com.
https://www.youtube.com.

Actual results:

After restarting Firefox, extensions were not working anymore. Ubolock Origin was not blocking ads, 1Password would not autofill usernames and passwords, Facbook Container would only show a white square when clicking on the extensions icon.

Upon investigation, folders where created under $USER\AppData\Roaming\Mozilla\Firefox\Profiles$Profile\storage\default that looked like this:
https+++www.washingtonpost.com.
A folder was created for each website visited that included the root zone in the FQDN.
Two of the folders were able to be deleted with File Explorer, but the Washington Post folder had to be deleted with the command line, as File Explorer was not able to do so.
Another user reported that they could not delete the NY Times folder.
https://www.reddit.com/r/firefox/comments/h9odv4/firefox_broke_firefox_and_the_i_fixed_firefox/
Upon restarting Firefox, extensions were working as expected again.

Expected results:

Visiting those websites should not have impacted the functionality of extensions.

Component: Untriaged → Storage: Quota Manager
Product: Firefox → Core

Thanks for reporting this! I am working on fixing this entirely.

A temporary way for this is probably to set the pref dom.quotaManager.useDOSDevicePathSyntax to false. Such that directories like https+++www.washingtonpost.com. would not be created.

Assignee: nobody → ttung
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
See Also: → 1536796

I just gave that setting a try, and it does not seem to work. I attached a screenshot of the profile folder.

Duplicate of this bug: 1646022

(In reply to Jean Weerts from comment #3)

I just gave that setting a try, and it does not seem to work. I attached a screenshot of the profile folder.

Thanks for verifying this! So, flipping the pref doesn't work.

I am working on a patch to make the broken directory not break the storage initialization and thus not break the extension.
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=661a7b61573854b5d8ae840553a67c7d5e36f380
I will verify it once the build is completed.

So, the information I get so far is:

  • The reason why extension fails is that temporary storage initialization fails to complete
    • DOM Cache for the origins which end with a dot cause temporary storage initialization to fail
    • DOM Cache fails because:
      • If there are no padding files, it checks the database for getting the padding size
      • If the database doesn't exist, it expects to get error codes for file not found.
      • For some reason, it gets an error code for access denied from here[1].

The patch that I pushed to try/treeherder is only a workaround for this, but it should help to not block the temporary storage initialization to fail.

I will keep working on find the root cause and fix it entirely.

A few other things:

  • I check the cache directory for the broken origin (https+++youtube.com.) on the machine, it's empty (no padding files, no database).
  • When [1], I dump dbFileUrl for OpenDatabaseWithFileURL. The file path for the file which is got from dbFileUrl is like: \\?\C:\XXXXX\https+++youtube.com.\cache\cache.sqlite and the spec from dbFileUrl is like file////%3F/C:/XXXXX/https+++youtube.com./cache/cache.sqlite?cache=private
  • An assertion in debug build captures an unexpected situation when getting quotaObject from SQLite APIs. The SQLite uses a file path like: \\?\C:\XXXX\https+++youtube.com\cache\cache.sqlite (note that it missed the ending dot). However, it should use a path like \\?\C:\XXXX\https+++youtube.com.\cache\cache.sqlite

(In reply to Tom Tung [:tt, :ttung] from comment #5)

I am working on a patch to make the broken directory not break the storage initialization and thus not break the extension.
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=661a7b61573854b5d8ae840553a67c7d5e36f380
I will verify it once the build is completed.

It seems to work for me. I went to https://firefox-storage-test.glitch.me/ and it showed that everything storage APIs work.
I will polish the patch and try to write a test for it

(In reply to Tom Tung [:tt, :ttung] from comment #5)

D79959 should be able to make origins like https+++foo.com. not break the extensions. It's verified by the test case in D80001.

I will keep working on find the root cause and fix it entirely.

I am looking into this now.

I think the root cause is related to \\?\ and SQLite.
This assertion is hit when using a debug build. It's called from here. More exactly, it's from here

I now lean to it's something related to spec (which is propagated from here).

When I reproduced this:
The path was like: \\?\C:\Users\xxxx\https+++youtube.com\cache\cache.sqlite which should be \\?\C:\Users\xxxx\https+++youtube.com.\cache\cache.sqlite

The spec was like file/////%3F/C:/Users/xxxx/https+++www.youtube.com./cache/caches.sqlite?cache-private. It doesn't look wrong to me, but it seems that sqlite parses this into file path.

\\?\ does not only enable long paths, but also disables some canonicalization of file paths. The canonicalization includes removing trailing dots and spaces from file paths.

So file paths end with dots (such as https+++youtube.com.) would cause trouble.

Yes, we have tests for that: https://searchfox.org/mozilla-central/rev/eef39502e08bcd3c40573c65a6827828dce4a032/xpcom/tests/gtest/TestFile.cpp#539

The problem lives in SQLite which calls GetFullPathName on the prefixed path so the trailing dot gets removed.

As this bug clearly illustrates, the \\?\ prefix is only supported by a subset of Windows APIs. In fact, if I had reviewed the original patch I would have r-'d the change that enables it because of all of the footguns. Be very careful which APIs you are calling when this prefix is enabled.

Hm, which subset of Windows APIs and what other footguns, is there a documentation for that, are there any other solutions for overcoming the MAX_PATH limit ?
The prefix is not enabled for all nsIFile objects, it's enabled only for nsIFile objects that are used by QuotaManager and its clients (like IndexedDB). We desperately needed it to solve problems with long file paths. Our storage telemetry improved from roughly 99.5% to 99.6% initialization success rate after enabling it which is quite significant.

We are working on a fix for the SQLite issue.

(In reply to Jan Varga [:janv] from comment #13)

Hm, which subset of Windows APIs and what other footguns, is there a documentation for that, are there any other solutions for overcoming the MAX_PATH limit ?

Here is the overarching guide about this. A few comments about its contents:

  • There is not a specific list of APIs that work with the \\?\ prefix, however that document does include a list of APIs that are affected by the longPathAware manifest tag. I expect that there would be substantial (if not complete) overlap between the two. I would use that list as the guide.
    • We won't be able to use the longPathAware manifest tag for a looooong time, if ever.
  • Any file or directory names that end in a period will not work with Windows shell APIs, however no shell APIs are on the aforementioned list.

(In reply to Aaron Klotz [:aklotz] from comment #16)

Here is the overarching guide about this.

This is exactly the documentation which we used for developing the new feature:
https://bugzilla.mozilla.org/show_bug.cgi?id=1536796#c119

Just to be clear, the issue in this bug is not about that QuotaManager or SQLite is using a windows API that doesn't support the prefix. It's about SQLite calling GetFullPathName on \?\ prefixed paths (which we pass to SQLite), so it does explicit normalization. It calls it for no reason for prefixed paths because such paths are already full paths.

The manifest thing is something different AFAIK, it's new in Windows 10.
Since we have to support older versions, like Windows 7, it's not very helpful, isn't it ?

(In reply to Jan Varga [:janv] from comment #17)

Since we have to support older versions, like Windows 7, it's not very helpful, isn't it ?

Not only is it not helpful because of the Windows 10 dependency, but we have so many MAX_PATH references in our code (not to mention any library dependencies) that it would be a huge project to make everything compatible.

But the list of APIs affected by that manifest change is probably the same as the list of APIs that support \\?\.

One other thing about the \\?\ prefix is that it only works on the UTF-16 variants of the APIs. If a dependency converts that path to the ANSI code page before doing something, then things won't work.

(In reply to Aaron Klotz [:aklotz] from comment #16)

  • Any file or directory names that end in a period will not work with Windows shell APIs, however no shell APIs are on the aforementioned list.

One other thing about the \\?\ prefix is that it only works on the UTF-16 variants of the APIs. If a dependency converts that path to the ANSI code page before doing something, then things won't work.

I don't think these potential "footguns" would cause problems for our use case.
Users usually don't need to access our internal files using Windows shell APIs. Most of directories and files will work normally, except the ones with trailing dot.
nsLocalFileWin.cpp uses only APIs with the W suffix. QuotaManager and quota clients access files only using nsIFile interface or SQLite. SQLite does pretty good job at using APIs with the W suffix on NT systems as well.

Severity: -- → S2
Priority: -- → P2

Updates:

  • Causes:

As Jan mentioned in comment 17, this issue is about SQLite calls GetFullPathName thus does explicit normalization for the file paths. \\?\ prefix paths should not be the cases here because they are already full paths.

Consequently, the file component that ends with a dot is normalized (the trailing dot is removed), so it causes DOM Cache to access the wrong database and cause issues in the QuotaManager usage tracking mechanism.

In this case, DOM Cache actually handles the cases when there is no database file, but it got NS_ERROR_FILE_ACCESS_DENINED rather than NS_ERROR_FILE_NOT_FOUND. So that it breaks the temporary storage initialization and therefore all add-ons that use storage are broken.

  • Actions:

We decide to override xFullPathname as a short-term solution in D80202. Meanwhile, we will check if we can have a fix in SQLite as a long-term solution.

D79959 can help that broken cache directory due to cannot find the database file breaks temporary storage initialization (and thus breaks add-ons). (Note that the fix in D80202 should help cache directories from broken and thus help the add-ons as well)

  • How to fix this before our fix gets landed:

Browse to the sites that don't end with a dot (e.g. https://youtube.com if https://youtube.com. caused the breakage), then you can remove the directory (${profile}/storage/default/${origin_directory}; e.g.${profile}/storage/default/https+++www.youtube.com.).

Blocks: 1647316
Attachment #9157223 - Attachment description: Bug 1645943 - A workaround to treat the padding size as zero when there are no padding files and database file; → Bug 1645943 - Check results from nsIFile's functions rather than mozStorage's/SQLite's in LockedGetPaddingSizeFromDB;
Duplicate of this bug: 1647323

(In reply to Tom Tung [:tt, :ttung] from comment #21)

try for D79959 and D80001: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0b7ea8714941d5ba604abe95b386cbb14ab02fb5

Try looks good so I am going to land these two first.

Keywords: leave-open
Pushed by ttung@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3ed2b9424225
Check results from nsIFile's functions rather than mozStorage's/SQLite's in LockedGetPaddingSizeFromDB; r=janv,dom-workers-and-storage-reviewers,sg
https://hg.mozilla.org/integration/autoland/rev/7a2022403814
A testcase to ensure temporary storage initialization is not blocked by a cache directory in an origin directory that ends with a period; r=janv,dom-workers-and-storage-reviewers

Sorry for the backout. I update the patch and it seems that the issue was gone: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d668311cb38ed77ee486dc6d22b206d9e6f9d968

Flags: needinfo?(ttung)
Pushed by ttung@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5e40accdea69
Check results from nsIFile's functions rather than mozStorage's/SQLite's in LockedGetPaddingSizeFromDB; r=janv,dom-workers-and-storage-reviewers,sg
https://hg.mozilla.org/integration/autoland/rev/9556117ca780
A testcase to ensure temporary storage initialization is not blocked by a cache directory in an origin directory that ends with a period; r=janv,dom-workers-and-storage-reviewers
Pushed by ttung@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fc9e65dc14ac
Override xFullPathname to avoid the use of GetFullPathNameW; r=janv,asuth

I will ask for uplifting three patches once the last patches go to m-c if there is no related failure or report shows up.

Keywords: leave-open
Status: ASSIGNED → RESOLVED
Closed: 8 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla79

[Tracking Requested - why for this release]: It's a serious issue on Windows that found in FF77 and causes users add-ons broken.

Discussions on Reddit:
https://www.reddit.com/r/firefox/comments/h9odv4/firefox_broke_firefox_and_the_i_fixed_firefox/
https://reddit.com/r/firefox/comments/h7smzt/all_my_extensions_stopped_working/
https://www.reddit.com/r/uBlockOrigin/comments/gyadhc/dashboard_broken_seeing_ads_troubleshooting_steps/

I will ask beta-approval for all three patches later today.

Hi Jean,

Would you mind helping me to verify if the latest Nightly fix the issue on your machine? It resolves the issue on my machine but it would be nice if you can verify it as well. Thanks in advance!

Flags: needinfo?(j.weerts4)

Comment on attachment 9157655 [details]
Bug 1645943 - Override xFullPathname to avoid the use of GetFullPathNameW;

Beta/Release Uplift Approval Request

  • User impact if declined: On windows, once they visit origins that are with a trailing dot, their storage would break. That means their add-ons would be broken.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: On Windows,
  1. Visiting https://youtube.com. (note that there is a trailing dot)
  2. Go to https://firefox-storage-test.glitch.me/ and check the result
  3. Restart Firefox
  4. Go to https://firefox-storage-test.glitch.me/ and check the result
  5. Visit https://youtube.com (without a trailing dot)
  6. Check the file layout of ${profile_path}/storage/default/htttps+++youtube.com./cache/

In 2 & 4, all results should be green
In 6 there should contain caches.sqlite file or some other files rather than being empty.
(Note you can find $profile_path in about:profiles)

  • List of other uplifts needed: None
  • Risk to taking this patch: Medium
  • Why is the change risky/not risky? (and alternatives if risky): I wrote medium mostly because these patches haven't stayed on Nightly long enough. (Two patches for a day and one patch for a few hours at this moment) and https://phabricator.services.mozilla.com/D80202 overrides a function in SQLite.
    The risk for the other two patches is low since the main change is to check if the file exists before accessing SQLite API and it's covered by a test.

Note that we have a pref to protect the implementation in (https://phabricator.services.mozilla.com/D80202) so it's possible to disable it by changing the default value of the pref if something bad comes out.
And note that we need all three patches to fix this issue entirely.

  • String changes made/needed:
Attachment #9157655 - Flags: approval-mozilla-beta?
Attachment #9157223 - Flags: approval-mozilla-beta?
Attachment #9157290 - Flags: approval-mozilla-beta?
Flags: qe-verify+

Given this shipped in 77 already, I think we're too close to 78 release to take this.

Comment on attachment 9157223 [details]
Bug 1645943 - Check results from nsIFile's functions rather than mozStorage's/SQLite's in LockedGetPaddingSizeFromDB;

Let's consider this for 78.1esr though.

Attachment #9157223 - Flags: approval-mozilla-beta? → approval-mozilla-esr78?
Attachment #9157290 - Flags: approval-mozilla-beta? → approval-mozilla-esr78?
Attachment #9157655 - Flags: approval-mozilla-beta? → approval-mozilla-esr78?

(In reply to Tom Tung [:tt, :ttung] from comment #34)

Hi Jean,

Would you mind helping me to verify if the latest Nightly fix the issue on your machine? It resolves the issue on my machine but it would be nice if you can verify it as well. Thanks in advance!

Hi Tom, thank y'all very much for working so hard on fixing this. This is the first bug report I have ever made, and y'all have been great.
Here's what I found:

  1. Using Nightly 2020-06-25 the extension issue seems to be fixed
  2. https://firefox-storage-test.glitch.me/ shows that everything is working both before and after restarting Firefox
  3. caches.sqlite is present in the cache folder

However I might have found a related issue. I can only access ${profile_path}/storage/default/htttps+++youtube.com. if ${profile_path}/storage/default/htttps+++youtube.com exists. (Note the lack of period at the end of the folder name in the second one.) If the non-period-appended folder is deleted, the period-appended folder for the same site cannot be read, and more importantly cannot be deleted. This is causing profiles to not be deleted entirely using the profile manager. Let me know if I need to make a separate bug report about his, or I you need anything else from me.

Flags: needinfo?(j.weerts4)

Tom, I think the profile manager needs to use the prefix for removing profiles or delete the storage folder (using the prefix) first.

(In reply to Jean Weerts from comment #38)

Hi Tom, thank y'all very much for working so hard on fixing this. This is the first bug report I have ever made, and y'all have been great.
Here's what I found:

  1. Using Nightly 2020-06-25 the extension issue seems to be fixed
  2. https://firefox-storage-test.glitch.me/ shows that everything is working both before and after restarting Firefox
  3. caches.sqlite is present in the cache folder

Thanks for verifying that the result is resolved on you end as well!

However I might have found a related issue. I can only access ${profile_path}/storage/default/htttps+++youtube.com. if ${profile_path}/storage/default/htttps+++youtube.com exists. (Note the lack of period at the end of the folder name in the second one.) If the non-period-appended folder is deleted, the period-appended folder for the same site cannot be read, and more importantly cannot be deleted. This is causing profiles to not be deleted entirely using the profile manager. Let me know if I need to make a separate bug report about his, or I you need anything else from me.

Ah, I miss this thing and that's a good point! I guess they should either use the prefix (\\?\) or QuotaManager's API to remove the data. But, that should be in other bugs. I will file a bug for that.

p.s. that you should be able to remove the storage for the site in the "Cookies and Site Data" section from"about:preferences"

Blocks: 1648490

(In reply to Jan Varga [:janv] from comment #40)

Tom, I think the profile manager needs to use the prefix for removing profiles or delete the storage folder (using the prefix) first.

Right, and I will fix that in 1648490

Hi, Based on the steps from Comment 35, this issue is Verified as Fixed in our latest Nightly build 79.0a1 (2020-06-29) on Windows 10.

Duplicate of this bug: 1645915

Comment on attachment 9157223 [details]
Bug 1645943 - Check results from nsIFile's functions rather than mozStorage's/SQLite's in LockedGetPaddingSizeFromDB;

Approved for 78.1esr.

Attachment #9157223 - Flags: approval-mozilla-esr78? → approval-mozilla-esr78+
Attachment #9157290 - Flags: approval-mozilla-esr78? → approval-mozilla-esr78+
Attachment #9157655 - Flags: approval-mozilla-esr78? → approval-mozilla-esr78+

This issue is Verified as fixed in our latest ESR build 78.1.0esr based on the steps from Comment 35.

Status: RESOLVED → VERIFIED
Flags: qe-verify+
Regressed by: 1536796
Summary: Extensions stop working after visiting site if root zone is included in FQDN → Extensions and IndexedDB stop working after visiting site if root zone is included in FQDN
You need to log in before you can comment on or make changes to this bug.