Open Bug 1585379 Opened 1 year ago Updated 1 year ago

Crash in [@ mozilla::dom::WebIDLGlobalNameHash::GetEntry]

Categories

(Core :: DOM: Bindings (WebIDL), defect, P1)

defect

Tracking

()

Tracking Status
firefox71 - wontfix
firefox72 --- wontfix

People

(Reporter: marcia, Unassigned)

References

Details

(Keywords: crash, regression)

Crash Data

This bug is for crash report bp-aae99eb2-5803-4c93-ada6-1c46b0191001.

Seen while looking at nightly crash stats: https://bit.ly/2oa486v. 17 crashes/18 installs in the 7 seven days. This signature appears on 69, but only 1 crash in the same time period.

No comments and no correlations. The spike in nightly started in 20190927215713. Possible regression range based on that build id: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=1dc1a755079a15e35cb234db511e52ab463b2f42&tochange=d23a0a0ffa93fe273c571592bdbff6aa68f6ca2b

Bug 1583819 and Bug 1583636 are in that changeset. Adding some ni's.

Top 10 frames of crashing thread:

0 xul.dll mozilla::dom::WebIDLGlobalNameHash::GetEntry dom/bindings/RegisterBindings.cpp:6773
1 xul.dll mozilla::dom::WebIDLGlobalNameHash::DefineIfEnabled dom/bindings/WebIDLGlobalNameHash.cpp:66
2 xul.dll mozilla::dom::Window_Binding::_resolve dom/bindings/WindowBinding.cpp:19646
3 xul.dll js::NativeGetProperty js/src/vm/NativeObject.cpp:2618
4 xul.dll nsOuterWindowProxy::get dom/base/nsGlobalWindowOuter.cpp:908
5 xul.dll js::Proxy::get js/src/proxy/Proxy.cpp:352
6 xul.dll js::GetProperty js/src/vm/Interpreter.cpp:4542
7 xul.dll static bool Interpret js/src/vm/Interpreter.cpp:2798
8 xul.dll static bool js::RunScript js/src/vm/Interpreter.cpp:424
9 xul.dll js::ExecuteKernel js/src/vm/Interpreter.cpp:813

Flags: needinfo?(echen)
Flags: needinfo?(bzbarsky)

Hmm.

Bug 1583819

This is about adding error reporting for a situation that doesn't actually arise in our code. It did not change the resulting generated code.

Bug 1583636

Same thing: adds a compile-time error for a situation we don't have now (and should not introduce) but doesn't change any runtime code.

In fact, I verified that across the linked regression range the only change in the generated code is from parseFromSafeString being added in bug 1467970.

Looking at the crash reports, they look as follows:

  1. https://crash-stats.mozilla.com/report/index/d6c64064-ec51-4337-9ddb-02c650191001 -- null-deref crash at https://crash-stats.mozilla.com/sources/highlight/?url=https://gecko-generated-sources.s3.amazonaws.com/2d71ade1ee5dcceb98186bb69feec56300eabe5cf2bebda5e587008815dcf508313c10ab29313479bf4c45e20bcfae6f29c7da55cd4fca96077d1307129836cb/dom/bindings/RegisterBindings.cpp&line=6617#L-6617 (the if (js::LinearStringHasLatin1Chars(jsString)) line in WebIDLGlobalNameHash::GetEntry).
  2. https://crash-stats.mozilla.com/report/index/aae99eb2-5803-4c93-ada6-1c46b0191001 random-address-looking crash at https://crash-stats.mozilla.com/sources/highlight/?url=https://gecko-generated-sources.s3.amazonaws.com/ab0089bb81ddd7061494c678cead99acc844ed52fa5933461c2e927996a57283f86504ec1d16130735386e92a58f541a6fdc153d95e3cdc52276595da5d43eef/dom/bindings/RegisterBindings.cpp&line=6773#L-6773 (the auto& entry = mozilla::perfecthash::Lookup line in the latin1-chars case).
  3. https://crash-stats.mozilla.com/report/index/5ba4bf7d-7e18-40a7-8dbb-fa4540191001 -- Like #2.
  4. https://crash-stats.mozilla.com/report/index/527741f6-ebb3-413d-8101-edcac0191001 -- Near-0 crash at https://crash-stats.mozilla.com/sources/highlight/?url=https://gecko-generated-sources.s3.amazonaws.com/7ca00c9553714b125a902859f86882aebd94866127ab04415cac59b7d48c1dd0ca0b3fe044bd68c5dfb9ddfaf42a54b1606bea4bf7964ced0dcb5d94dae47ac0/dom/bindings/RegisterBindings.cpp&line=6774#L-6774 (the auto& entry = mozilla::perfecthash::Lookup line in the non-latin1-chars case).
  5. https://crash-stats.mozilla.com/report/index/c24f7242-c8ed-436f-9f67-a77330191001 -- Like #2.
  6. https://crash-stats.mozilla.com/report/index/0b097293-6877-41b4-933b-c26920191001 -- Like #2
  7. https://crash-stats.mozilla.com/report/index/2880d83b-12bb-4b48-9052-f3c350191001 -- Like #2
  8. https://crash-stats.mozilla.com/report/index/3f5adfd4-fc36-4719-a433-80f9e0190930 -- Like #2
  9. https://crash-stats.mozilla.com/report/index/05104ea0-f1f7-49ca-b3bc-713b50190930 -- Like #2
  10. https://crash-stats.mozilla.com/report/index/c383e8ea-e6b0-4a48-80a5-688610190930 -- Like #2
  11. https://crash-stats.mozilla.com/report/index/0aa51846-bf2b-4b03-96a3-457c90190930 -- Like #2
  12. https://crash-stats.mozilla.com/report/index/318315a5-f804-472a-82ec-a4a040190930 -- Like #2
  13. https://crash-stats.mozilla.com/report/index/0df6ff51-c04b-455c-b88c-48c5d0190930 -- Like #2
  14. https://crash-stats.mozilla.com/report/index/f063ed65-4d6a-4ed5-9a51-e479d0190930 -- Like #2
  15. https://crash-stats.mozilla.com/report/index/35a9ce71-deec-46ac-8055-0214f0190930 -- Like #4
  16. https://crash-stats.mozilla.com/report/index/fbe4832a-0cdd-4a78-9355-9affe0190929 -- Like #2
  17. https://crash-stats.mozilla.com/report/index/8cb8158f-b452-4d95-af65-8316a0190929 -- Like #2
  18. https://crash-stats.mozilla.com/report/index/88383fcf-8ca8-4657-92ac-cec460190929 -- Like #1
  19. https://crash-stats.mozilla.com/report/index/2d2cbc31-fc4b-450d-9c63-dcc610190929 -- unclear; no useful stack
  20. https://crash-stats.mozilla.com/report/index/e0eadc42-6ccf-40c1-88ad-cd1010190929 -- Like #2
  21. https://crash-stats.mozilla.com/report/index/0be31386-fc8d-4593-897b-ce4c30190928 -- Like #2

Except the crash address is pretty often 0x8012500000, not actually that random.

All of this suggests that the string landing in GetEntry is corrupt in some way.

Looking at nsGlobalWindowInner::DoResolve it does return right up front if !JSID_IS_STRING(aId). So that part should all be fine if the string that comes into DoResolve is valid to start with.

There were changes to SpiderMonkey strings in bug 1578339 in the linked range. I'm not sure whether those could be an issue here.

It could also be memory corruption from pretty much anywhere....

Flags: needinfo?(bzbarsky)

ccing the people from bug 1578339 too.

Priority: -- → P1

[Tracking Requested - why for this release]: Apparent crash regression

js::LinearStringHasLatin1Chars just checks a flag and doesn't run, at that point, code changed in bug 1578339. js::GetLatin1LinearStringChars doesn't run code changed there, either. So if bug 1578339 is to blame, we'd need a story of how bug 1578339 caused creation of the string in a bogus state.

I re-read the patches for bug 1578339 and, AFAICT, nothing these changes the coupling of the bit flag and the storage type. Also, with one exception, the memory allocations aren't changed. Considering the fuzzing and other usage history of the relevant Rust code, I find it extremely unlikely that any of the operations changed would have written out of their bounds and corrupted something that way.

The one allocation change is this bit that changed the manner of allocation in one case:
https://hg.mozilla.org/mozilla-central/rev/314012b65b23#l6.30

But when those two lines have run, the string is no longer a Latin1 string, so with the stacks reported here, those lines should not have run, so a subtle bug there doesn't appear to explain this.

My real concern is that something is introducing memory corruption into strings here, and this code is just running into it... I really wish we had some way to reproduce this. :(

Flags: needinfo?(echen)

This signature appears on 69, but only 1 crash in the same time period.

The crash data summary now shows 10 installs of 68.0.2 crashing, and the crash location looks the same as in Nightly.

I looked at the URLs - not much to go on there:

Interesting that the crashes in 68 are labeled as startup crashes.

At least for nightly, this could have been caused by bug 1575370 given the timing. See also bug 1584820 for other resulting crashes that looks sort of like this one.

That wouldn't explain the 68 or 69 crashes...

See Also: → 1584820

Is this still happening at elevated levels now that bug 1584820 is fixed?

Flags: needinfo?(mozillamarcia.knous)

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #9)

Is this still happening at elevated levels now that bug 1584820 is fixed?

On 71 nightly, the last crashes I see are in 20191007092954.

There is also one crash in 70.0rc2 - https://crash-stats.mozilla.org/report/index/c501dc7b-be4d-4866-bf23-7e7b50191018

Flags: needinfo?(mozillamarcia.knous)

Right, but is that higher than the background level we had before in 68/69? That is, do we still need to be tracking this for 71?

Flags: needinfo?(mozillamarcia.knous)

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #11)

Right, but is that higher than the background level we had before in 68/69? That is, do we still need to be tracking this for 71?

If your concerns in comment #5 are no longer a worry, I think we can stop tracking this bug.

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #11)

Right, but is that higher than the background level we had before in 68/69? That is, do we still need to be tracking this for 71?

68 release had a total of 27 crashes (68.0 + two dot releases). 69 release had 3 total crashes. Nightly had a total of 29 crashes. So it seems it was about the same overall as what we saw in 68.

Flags: needinfo?(mozillamarcia.knous)

If your concerns in comment #5 are no longer a worry, I think we can stop tracking this bug.

They're still a worry in general, but if the level has gone back down to what it used to be, then the elevated level of memory corruption was coming from the same issue as bug 1584820....

And if the level is back down to background, then we should mark this fixed....

You need to log in before you can comment on or make changes to this bug.