Closed Bug 1866409 Opened 8 months ago Closed 8 months ago

nsRFPService::RandomizePixels() takes 100% CPU on maps.google.com if compiled by GCC

Categories

(Core :: Privacy: Anti-Tracking, defect)

defect

Tracking

()

RESOLVED FIXED
122 Branch
Tracking Status
relnote-firefox --- 120+
firefox120 + fixed
firefox121 + fixed
firefox122 + fixed

People

(Reporter: stransky, Assigned: emilio)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

downstream bug: https://bugzilla.redhat.com/show_bug.cgi?id=2251202

If Firefox is compiled with CGG nsRFPService::RandomizePixels() takes 100% CPU on maps.google.com and the maps can't be used.

Profile: https://share.firefox.dev/3GhYrYE

I tested latest trunk with gcc-13.2.1-4.fc38.x86_64 / Fedora 38 and the bug is here. Trunk built with clang doesn't show this bug.

Flags: needinfo?(stransky)

If numNoises is 255, then the loop will iloop (i = 255 won't end, and i++ will likely wrap to 0). Note: this is undefined behavior, so compiler/options/opt can affect this. (thanks to penguin42 on Matrix for finding this!)

I) think the problem here is the loop limit

// Ensure at least 20 random changes may occur.
uint8_t numNoises = std::clamp<uint8_t>(rnd3, 20, 255);

for (uint8_t i = 0; i <= numNoises; i++) {

I debugged with gdb and to me it looks like I have numNoises=255 - but gdb wont actually print it out for me.

Is

for (uint8_t i=0; i<= 255; i++) {

a solid loop?
(Note the <= and the uint8_t) ?

(I'd filed this as https://bugzilla.redhat.com/show_bug.cgi?id=2251620 )

Yeah it seems the loop is just wrong.

Flags: needinfo?(stransky)
Flags: needinfo?(jh)

If numNoises is 255 the loop never ends because it wraps around to 0.

Assignee: nobody → emilio
Status: NEW → ASSIGNED
Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5d9ee03c3be9
Fix RFP canvas noise randomization loop. r=timhuang

Just FYI, for me the bug is always reproducible. So that means numNoises is always 255 for me (?).
Seems weird to me as I thought numNoises should be random.

(In reply to Marcel from comment #7)

Just FYI, for me the bug is always reproducible. So that means numNoises is always 255 for me (?).

Well, or that it gets executed enough times such that it statistically always happens at least once, right?

But it might be that gcc just sees that the loop can't terminate if the value is >=255 and thus it assumes that never happens and removes the clamping or what not. Hard to say without looking at the actual machine code.

Compilers will absolutely do stuff like that. And it varies by compiler and opt level

Now now, compilers aren't always that mean!
I've prodded at this with gdb a bit, and I can see there are at least a few different values.
Using the Fedora packages, for me it looks like the current count is at the bottom of R11 and the limit as the bottom of R15.
As well as the FF value, I've seen fb,2e,b3,2e,6c,b1
Which leaves the question of wth is g.maps doing reading the image back so much;
I've got a theory - I think it's the little 'Layers' button at the bottom left - I think it's generating a thumbnail to use for that button.

Status: ASSIGNED → RESOLVED
Closed: 8 months ago
Resolution: --- → FIXED
Target Milestone: --- → 122 Branch

This bad loop seems to cause issues with the Dark Reader addon as well:
https://github.com/darkreader/darkreader/issues/11834
https://github.com/darkreader/darkreader/issues/11943

Lots of reports from Fedora there, and in the profiles shared (1, 2) we are stuck in nsRFPService::RandomizePixels(), called from CanvasRenderingContext2D.getImageData().

Comment on attachment 9365551 [details]
Bug 1866409 - Fix RFP canvas noise randomization loop. r=timhuang,tjr

Beta/Release Uplift Approval Request

  • User impact if declined: comment 0, and probably also other hangs in the wild.
  • Is this code covered by automated tests?: Unknown
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: none
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Trivial fix.
  • String changes made/needed: none
  • Is Android affected?: Yes
Attachment #9365551 - Flags: approval-mozilla-release?
Attachment #9365551 - Flags: approval-mozilla-beta?

Comment on attachment 9365551 [details]
Bug 1866409 - Fix RFP canvas noise randomization loop. r=timhuang,tjr

Approved for 121.0b5.

Attachment #9365551 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment on attachment 9365551 [details]
Bug 1866409 - Fix RFP canvas noise randomization loop. r=timhuang,tjr

Approved for 120.0.1 dot release

Attachment #9365551 - Flags: approval-mozilla-release? → approval-mozilla-release+
Duplicate of this bug: 1867437

(In reply to Martin Stránský [:stransky] (ni? me) from comment #0)

I tested latest trunk with gcc-13.2.1-4.fc38.x86_64 / Fedora 38 and the bug is here. Trunk built with clang doesn't show this bug.

I looked at what the loop code looks like in Firefox 120 for Windows x64 (so before patch, compiled with clang too):

// edx is rnd3
  cmp     dl, 21
  mov     r11d, 20
  cmovae  r11d, edx
// r11b is now std::clamp<uint8_t>(rnd3, 20, 255)
  inc     r11b
  movzx   edx, r11b
  cmp     dl, 2
  mov     r11d, 1
  cmovae  r11d, edx
// r11b is now 1 if std::clamp<uint8_t>(rnd3, 20, 255) was 255
// otherwise, r11b is now std::clamp<uint8_t>(rnd3, 20, 255) + 1
  [...]
loop:
  [...]
  dec     r11b
  jne     loop

When applied to a rnd3 value in 0..254, this code yields a loop with rnd3+1 turns, as expected from reading the C++ source code. When applied to a rnd3 value of 255 however, it yields a loop with only 1 turn, and not an infinite loop like we may expect. I think that clang didn't care about guaranteeing a particular behavior when rnd3 is 255 because this case breaks forward progress (in simple terms, "infinite loops without side effects are undefined behavior"). I believe that the case where rnd3 had a value of 255 was therefore undefined behavior, thus explaining the difference in behavior across different builds.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: