"XML Parsing Error: undefined entity" when opening Nightly
Categories
(Toolkit :: Startup and Profile System, defect)
Tracking
()
People
(Reporter: saschanaz, Unassigned)
References
(Blocks 2 open bugs)
Details
Attachments
(3 files)
This prevents opening Nightly.
Reporter
Comment 1•4 years ago
Not sure it's related, but the icon is also incorrectly shown as the installer one.
Comment 2•4 years ago
Here's one try. From conversation on Slack, you seem to be able to create a new profile, and that's working.
- Open about:support and check the Startup Cache table
- Open that path in File Explorer, but move to the folder for your old profile (the broken one)
- Make a copy of the startupCache folder, then remove all content, and try to restart with your old profile
Does that change anything?
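The backup-and-clear step above can be sketched as a small script. This is only an illustration, not an official tool: it assumes Firefox is closed, that `profile_local_dir` is the local-data folder of the affected profile (the one that contains `startupCache`), and the helper name and the `startupCache.bak` backup name are hypothetical.

```python
import shutil
from pathlib import Path


def backup_and_clear_startup_cache(profile_local_dir: Path) -> Path:
    """Copy <profile>/startupCache aside, then empty the original folder.

    Returns the path of the backup copy. The backup name is an arbitrary
    choice made for this sketch.
    """
    cache_dir = profile_local_dir / "startupCache"
    if not cache_dir.is_dir():
        raise FileNotFoundError(f"no startupCache folder in {profile_local_dir}")

    backup_dir = profile_local_dir / "startupCache.bak"
    # copytree raises FileExistsError if a backup is already there,
    # so an earlier backup is never silently overwritten.
    shutil.copytree(cache_dir, backup_dir)

    # Remove the contents but keep the (now empty) startupCache folder.
    for entry in cache_dir.iterdir():
        if entry.is_dir():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    return backup_dir
```

Keeping the backup means the old cache files can still be attached to the bug or compared later if the restart fixes the problem.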
Reporter
Comment 3•4 years ago
It's startupCache but the existing profile does not have that directory. Copy-pasting from the new one doesn't help.
Ah no, I went to the wrong directory: it is under %LOCALAPPDATA%\Mozilla\Firefox\Profiles\ but I went to %APPDATA%\Mozilla\Firefox\Profiles\
👀
Reporter
Comment 4•4 years ago
Okay, I can load my previous local build and it runs on my existing profile. So it is something in the latest build.
Comment 5•4 years ago
Let's see if someone has other ideas to debug this. Could you attach your raw data from about:support to the bug?
Reporter
Comment 6•4 years ago
Loading the profile from the previous build did something, now the latest build just works. Hmm.
Reporter
Comment 7•4 years ago
Comment 8•4 years ago
Did you make a copy of the profile folder before, by any chance? Just to see if there are differences.
Reporter
Comment 9•4 years ago
I had no backup so this is after it's somehow fixed, but anyway...
Reporter
Comment 10•4 years ago
While I had no backup, "IgnoreDiskCache": true looks interesting to me as this is false by default in the new one. (Edit: It's somehow now false and still works well)
Comment 11•4 years ago
Thank you Kagami! We have been chasing this issue for a while and it's very hard to get hold of a way to reproduce it.
My leading hypothesis at the moment (although not validated by any experiments) is that sometimes, for some reason, we load one of the files as empty (zero-byte-long), and the error you are seeing seems to fit that: we are loading browser.dtd as zero-byte-long, which means that your browser.xhtml cannot apply DTD strings from that file, and the result you see (Yellow Screen of Death) indicates that the very first string from that file could not be loaded.
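The zero-byte hypothesis can be illustrated outside Firefox. The sketch below is only a model: it uses Python's standard `xml.etree.ElementTree` rather than Firefox's own parser, it inlines the DTD text as an internal subset instead of loading browser.dtd, and the entity name `brandShortName` and the helper `parse_with_dtd` are illustrative assumptions. With an empty DTD, the first entity reference fails with an "undefined entity" error, much like the message in this bug's title.

```python
import xml.etree.ElementTree as ET


def parse_with_dtd(dtd_text: str) -> str:
    """Parse a tiny document whose entities come from dtd_text.

    dtd_text stands in for the contents of a DTD file such as browser.dtd;
    here it is inlined as the internal subset of the DOCTYPE.
    """
    document = (
        f"<!DOCTYPE window [{dtd_text}]>"
        "<window>&brandShortName;</window>"
    )
    return ET.fromstring(document).text


# A healthy DTD defines the entity, so the reference expands normally.
good_dtd = '<!ENTITY brandShortName "Nightly">'
title = parse_with_dtd(good_dtd)

# A zero-byte DTD defines nothing, so the very first reference fails.
try:
    parse_with_dtd("")
    error_message = None
except ET.ParseError as exc:
    error_message = str(exc)
```

In this model the parser itself behaves correctly in both cases; the failure comes entirely from the missing entity definitions.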
(In reply to Kagami :saschanaz from comment #4)
Okay, I can load my previous local build and it runs on my existing profile. So it is something in the latest build.
That's a very important point! Thank you for testing it! It fits the zero-byte-long hypothesis and indicates that the problem is not the profile (since you can load the previous build with the current profile), but the build.
So, what might be happening is that you have a working build with a working profile, and then we perform a partial update which for some reason leaves your new build with a zero-byte-long browser.dtd.
Your build does not have any personal data (data is in profile), and may be crucial for unfolding what's going on. Would you feel comfortable zipping your build (not profile) and sending it to me for investigation? My email is zbraniecki at mozilla dot com.
If you prefer not to do that, I can try to debug this remotely with your help:
There are two omni.ja files in your build - one is for "browser" and another is for "toolkit". Your browser.xhtml should live in the browser one, so I'd like you to zero in on the one that would be in a path like /Applications/Firefox/Resources/browser/omni.ja [1].
This is a zip file, can you unzip it (unzip omni.ja -d ./unpacked-omni) and then check the file ./unpacked-omni/chrome/en-US/locale/browser/browser.dtd? Does it look normal? Is it empty?
[1] There should be another omni.ja one level up, that's the toolkit one.
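As an alternative to unpacking the whole archive, the size of that one entry can be read directly. This is a hedged sketch: it assumes Python's standard `zipfile` module can read the archive (omni.ja uses an optimized zip layout, so this may not hold for every build, in which case the `unzip` command above is the fallback), and the helper name `entry_size` is hypothetical.

```python
import zipfile

# Entry path taken from the instructions above (en-US build assumed).
DTD_ENTRY = "chrome/en-US/locale/browser/browser.dtd"


def entry_size(archive_path, entry=DTD_ENTRY):
    """Return the uncompressed size of `entry` in bytes, or None if absent."""
    with zipfile.ZipFile(archive_path) as archive:
        try:
            return archive.getinfo(entry).file_size
        except KeyError:
            # getinfo raises KeyError when the archive has no such entry.
            return None
```

A result of 0 would match the zero-byte hypothesis, `None` would mean the entry is missing altogether, and a size in the kilobyte range would suggest the packaged file itself is fine.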
Comment 12•4 years ago
mhowell: what are the best logs to log out to determine if this is a failed (partial) update? I think that these are *.log files in the update directory itself.
Kagami: can you find the update directory (listed in about:support) and try to find as many *.log files as possible? Thanks!
Reporter
Comment 13•4 years ago
Oops, I already rebased+rebuilt so it's now not the build I mentioned earlier. Would it still be enough? BTW, do you mean the failing build or the succeeding build? The failing one was just from the public nightly channel.
I have no omni.ja in my local build (probably because it's a debug build), but obj-x86_64-pc-mingw32\dist\bin\browser\chrome\en-US\locale\browser\browser.dtd looks normal with its 12 KB size.
Comment 14•4 years ago
(In reply to Nick Alexander :nalexander [he/him] from comment #12)
mhowell: what are the best logs to log out to determine if this is a failed (partial) update? I think that these are *.log files in the update directory itself.
Yes, the update directory is the right place; from the directory that the button in about:support opens in the build that was broken, go into the subdirectory called updates and you should see last-update.log and backup-update.log; those files should tell us if anything went wrong with an application update.
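Collecting the logs described above could look like the following sketch. It assumes `update_root` is the directory that the about:support button opens; the helper name `collect_update_logs` is hypothetical, and only the two log names mentioned in this comment are treated as expected.

```python
from pathlib import Path

# The two file names mentioned in the comment above.
EXPECTED_LOGS = ("last-update.log", "backup-update.log")


def collect_update_logs(update_root: Path) -> dict:
    """Find every *.log under update_root and note which expected logs are absent."""
    found = sorted(update_root.rglob("*.log"))
    found_names = {path.name for path in found}
    missing = [name for name in EXPECTED_LOGS if name not in found_names]
    return {"found": found, "missing": missing}
```

Searching recursively also picks up any other `*.log` files below the update directory, which fits the earlier request to find as many as possible.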
Reporter
Comment 15•4 years ago
Kagami: can you find the update directory (listed in about:support) and try to find as many *.log files as possible? Thanks!
There is a big file with 61MB size, I'm sending it to your email address instead.
Comment 16•4 years ago
(In reply to Kagami :saschanaz from comment #15)
Kagami: can you find the update directory (listed in about:support) and try to find as many *.log files as possible? Thanks!
There is a big file with 61MB size, I'm sending it to your email address instead.
Thanks -- I see it. There are 28 update directories, which just means that you've had many versions of Firefox installed at various times. I will try to figure out which was the "failing one [...] from the public nightly channel"... and I think it's C:\ProgramData\Mozilla\updates\308046B0AF4A39CB, which corresponds to C:\Program Files\Mozilla Firefox.
backup-update.log looks healthy. The only thing interesting is:
...
PREPARE ADD defaultagent_localized.ini
...
so we have a new INI file, which nominally feels connected to l10n/YSOD. But it's hard to see how it could interact.
last-update.log looks healthy:
Performing a replace request
PATCH DIRECTORY C:\ProgramData\Mozilla\updates\308046B0AF4A39CB\updates\0
INSTALLATION DIRECTORY C:\Program Files\Mozilla Firefox
WORKING DIRECTORY C:\Program Files\Mozilla Firefox\updated
Begin moving destDir (C:\Program Files\Mozilla Firefox) to tmpDir (C:\Program Files\Mozilla Firefox.bak)
rename_file: proceeding to rename the directory
Begin moving newDir (C:\Program Files\Mozilla Firefox.bak/updated) to destDir (C:\Program Files\Mozilla Firefox)
rename_file: proceeding to rename the directory
Now, remove the tmpDir
ensure_remove: failed to remove file: C:\Program Files\Mozilla Firefox.bak/updater.exe, rv: -1, err: 13
ensure_remove_recursive: unable to remove directory: C:\Program Files\Mozilla Firefox.bak, rv: -1, err: 41
Removing tmpDir failed, err: -1
remove_recursive_on_reboot: file will be removed on OS reboot: C:\Program Files\Mozilla Firefox\tobedeleted\rep8f9391ff-8d10-4d74-936f-0c7737cbc85e
succeeded
calling QuitProgressUI
That just means the updater couldn't delete itself while running; I think it's perfectly normal.
There's nothing in the update logs to suggest a bad update, but the logs are not so rich that we can rule that situation out.
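The manual reading of the log above can be approximated with a small filter. This is a sketch under loud assumptions: the markers "failed" and "unable to" and the standalone "succeeded" line are taken only from the excerpt quoted in this comment, not from any documented log format, and `summarize_update_log` is a hypothetical helper.

```python
# Assumed markers, based solely on the log excerpt quoted above.
FAILURE_MARKERS = ("failed", "unable to")


def summarize_update_log(text: str) -> dict:
    """List lines containing a failure marker and report whether a
    standalone 'succeeded' line is present."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    failures = [
        line
        for line in lines
        if any(marker in line.lower() for marker in FAILURE_MARKERS)
    ]
    return {"failures": failures, "succeeded": "succeeded" in lines}
```

As the comment notes, a failure line is not necessarily a bad update (the cleanup of the temporary directory failed here while the update itself succeeded), so the output is a starting point for reading the log rather than a verdict.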
Reporter
Comment 17•4 years ago
Nightly channel installs to C:\Program Files\Firefox Nightly, and this one was failing. Sorry for confusing you 🙏
Comment 18•4 years ago
Bouncing needinfo for comment #17, and tentatively moving to the updater component - we can move it elsewhere if we narrow the problem down to a different component.
Comment 19•4 years ago
For what it's worth, I don't think this is an installer problem.
Kagami was able to create a new profile with the same (supposedly) broken build, and it was working. If the executable was damaged, that wouldn't have been possible, would it? Everything seems to point to a cache problem, or something else broken in the profile that was "fixed" by the other build.
Comment 20•4 years ago
(In reply to Francesco Lodolo [:flod] from comment #19)
For what it's worth, I don't think this is an installer problem.
I agree: per #c16, and due to the other information below, this looks like (yet more) startup cache/omnijar cache interaction.
Kagami was able to create a new profile with the same (supposedly) broken build, and it was working. If the executable was damaged, that wouldn't have been possible, would it? Everything seems to point to a cache problem, or something else broken in the profile that was "fixed" by the other build.
I wonder if we should start maintaining a log of the startup cache invalidations that we do in the wild, so that we have some record of how this transient behaviour occurs? I've filed https://bugzilla.mozilla.org/show_bug.cgi?id=1668051 to discuss.
Reporter
Comment 21•4 years ago
Kagami was able to create a new profile
To be clear, I used an empty profile that I created earlier for testing.
Comment 22•4 years ago
(In reply to Kagami :saschanaz from comment #21)
Kagami was able to create a new profile
To be clear, I used an empty profile that I created earlier for testing.
OK, that's not what I understood. Having said that, I think comment 19 still stands (if the build is broken, even using an existing profile would break).
Comment 23•4 years ago
Sadly, this seems to have ground out. My belief is that the update mechanism was not involved; this sounds much more like further fallout from the startup caching work, perhaps interacting with XPI files in ways that we don't understand (similar, perhaps, to Bug 1656515). But we just don't know at this point :(
Clearing NI since I answered in #c20.
Comment 24•4 years ago
The current theory is that these types of errors are not due to faulty updates (or installs) but instead due to I/O errors of some sort when reading the omnijar. Refiling as such so that it gets triaged next to the YSOD meta ticket (Bug 1675823).
Comment 25•4 years ago
(not really an XML issue. If XML parser gets bogus data, it is expected to fail.)
Comment 26•3 years ago
The severity field is not set for this bug.
:mossop, could you have a look please?
For more information, please visit auto_nag documentation.
Updated•3 years ago
Comment 27•3 years ago
Using developer edition. This problem has re-surfaced for me except reported ERL is 743, stops use of browser. Have not changed any settings since. What info is required?
Comment 28•3 years ago
What info is required?
As much detail as possible about what you did right before it started happening, and if you get it to "fix itself" what steps led to that.
We're struggling to create so-called "steps to reproduce", and we're hunting it down based on vague descriptions of what people do that leads to the problem, but at the moment we don't even know whether it always happens after an update or whether the update is completely unrelated.
Comment 29•3 years ago
Ok. My background includes php/mysql/frontend dev, more recently cybersecurity.
I also have a LOT of bookmarks in the dev edition, and would like to extract these back to beta version.
The problem with "what was I doing before the error message" is:
- I was doing a lot:
  webex
  jetbrains/php
  jetbrains/pycharm/mysql-connector/hashlib
  Dragon
  Word
  excel
  virtual box/windows/ubuntu/kali
  git
  the problems with Virt Box inhabitants wiping IP settings serendipitously
  the problems with Dell BSD DRIVER_POWER_STATE_FAILURE with no real resolution (Nvidia p620 perhaps) serendipitously
  etc
- the updates are frequent
and, frankly, I have been impressed with how well a dev edition has operated without complaint with 7 days, 18 hours per day use for a v long time.
Given direction, I can find the current version/date, and I assume you know the xml file involved - does it change frequently?
Comment 30•3 years ago
I assume you know the xml file involved - does it change frequently?
The problem is not related to the file in question. It is a red herring.
The correct file, in certain circumstances unknown to us, loads incomplete or empty, and that results in the XML error.