Closed Bug 398336 Opened 12 years ago Closed 12 years ago

Opening a very wide <select> freezes the entire OS

Categories

(Core :: Widget: Cocoa, defect, P3, critical)

x86
macOS
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: cbook, Assigned: stanshebs)

References

()

Details

(Keywords: hang, regression, testcase)

Attachments

(1 file, 3 obsolete files)

Loading the testcase from Bug 398158 (https://bugzilla.mozilla.org/attachment.cgi?id=283018)(and clicking on this bar) freeze Trunk Builds and the entire mac system.

Its no regression from bug 398157, because Gran Paradiso Alpha 8 mac-builds and 2007100104 Minefield have also the same problem.
Flags: blocking1.9?
correct url from Bug 398157 for the testcase is https://bugzilla.mozilla.org/attachment.cgi?id=283018
Summary: Testcase from Bug 398157 freeze Intel Maschines → Testcase from Bug 398157 freeze Intel Machines
(In reply to comment #0)
> Loading the testcase from Bug 398158

sorry, i meant Bug 398157 of course.
I can reproduce.  It made my machine hang for about 2 minutes (lots of disk activity), but then the machine became responsive again.
Assignee: nobody → joshmoz
Component: Layout → Widget: Cocoa
Keywords: hang, testcase
QA Contact: layout → cocoa
Summary: Testcase from Bug 398157 freeze Intel Machines → Opening a very wide <select> freezes the entire OS
Jesse, can you run Shark while doing that?

It would be nice to find out what is causing this.
Flags: blocking1.9? → blocking1.9+
Target Milestone: --- → mozilla1.9 M10
Assignee: joshmoz → stanshebs
A couple interesting data points. If I save the testcase into a file, and load the file, the select box is small as it should be; it's only crazy large and machine-hanging if I click on the link in this bug. Also, if I manually set the <select> to be width:60000 or whatever, I get a very wide select, but it all works quickly and correctly. So I think the wide select is a symptom of some other bug, memory corruption perhaps.
Are you saving it as "Web Page, Complete" or "HTML only"?  The "Complete" option screws up large widths (bug 373875).
OK, after a rather painfully slow isolation process, I've determined that this is a regression that does not happen with the 2007-06-20 nightly, and affects 2007-06-21 and subsequent versions. I only tested by clicking on the attachment in comment #1 above, and then clicking in the scrollbar.

As an added bit of confusion, if you run a browser that does not exhibit this bug, then subsequently all the buggy browsers will work OK! It's necessary to reboot in order to see the buggy browsers fail again. (Presumably this is because we are managing to scramble private data in an Apple framework, not the first time this has happened.)
(Excuse me, not "scrollbar", but "select bar".)
Priority: -- → P3
Target Milestone: mozilla1.9 M10 → ---
Turns out that while the older nightlies don't fail, browsers built from source still do. I've traced it to the Gecko units patch in bug 177805, landed back in February; previous code works, apply patch, rebuild, now it fails.
Blocks: pixels
Keywords: regression
Based on a little watching with Quartz Debug, I'm currently most suspicious that the (empty) menu that drops down from the <select> has crazy size - it's the first thing to get drawn after the long hang, and it's so long it goes offscreen.


FWIW: see Bug 363858 about why that <select>s are this huge in the first place.
OK, there are actually two problems here. First is the question of what to do with an empty <select>; under some circumstances (not entirely clear when), the "unconstrained" value 1<<30 migrates from layout into widget creation, resulting in dropdowns that are 17 million pixels wide ( 1<<30 / 60 ). Cocoa seems to be able to handle these generally - the select dropdown is drawn correctly, scrolls correctly, etc - and it doesn't take very long to process because it all clips to the browser. But a 17-million-pixel popup window is another story. The setFrame: method simply takes a long time to run, and hogs the machine while it's running; manual creation of a million-wide <select> takes only a few seconds, but if you wave the mouse around, you can see the Dock not responding etc. So I think we simply need to constrain popups to not be wider than the screen, and that will cover both unintentional and intentional creation of very wide <select> boxes.

Sounds reasonable to me, I guess the question is do we just constrain the width at the widget level or stop layout from ever giving us widths like that.
I checked out Safari - they have no problem with the ultrawide widget, but constrain the popup to somewhere around 900 pixels wide, even if the widget is small enough to fit entirely on the screen. A hacky patch to constrain popup size works well, and not only eliminates the possibility for malicious html coding to be a problem, but reduces the layout bug to a minor mishandling of a testcase, since it can only happen with an empty and untitled <select> with an empty table inside, a situation not commonly seen on real websites. :-)

I filed bug 404281 about layout passing nasty widths into widget. Since that is done, this bug should probably focus on practical constraints at the widget level. Questions to be answered are:

1. What is an unacceptable size for nsIWidget instances?
2. Given an unacceptable size request, should we just make the widget as large as we can in the unacceptable directions(s) or do nothing and return failure?
3. Do we want to impose different constraints per platform or do we want to do something in nsBaseWidget?
Depends on: 404281
No longer depends on: 404281
Blocks: 404281
I don't think there's a problem with allowing widgets to be arbitrarily large, because the containing window gets scrollbars and you can roll around to see it all. Windows (including popups), however, should never be created or programmatically resized to have any part of them offscreen - what if there is no title bar or resize gizmo? The user has no way to maneuver the window to see the undrawn part. Only the user should be able to decide to size or position a window to be partially offscreen. This can be fixed in nsCocoaWindow alone I think.


Although we now have a patch to prevent some kinds of bad widths going through layout, we still need a lower-level patch to forestall both accidental and intentional extreme sizes getting into Cocoa and causing hangs. I propose 100,000 pixels as a limit, which allows designers to do some freaky things should that ever seem like a good idea, but which is well below the point at which Cocoa starts to get bogged down in its internal processing.
Attachment #290271 - Flags: review?(joshmoz)
So sayeth the documentation of -[NSWindow setFrame:display:]:

"Note that the window server limits window position coordinates to ±16,000 and sizes to 10,000."

I suppose the docs might be technically correct, in the sense that if most of the window is offscreen, there's no way to know if a requested size has been honored or clipped. But *some* part of the window machinery gets gradually slower when one requests larger and larger windows - the delay is small but noticeable at one million pixels, and gets worse from there.

(In reply to comment #19)
> I suppose the docs might be technically correct, in the sense that if most of
> the window is offscreen, there's no way to know if a requested size has been
> honored or clipped. But *some* part of the window machinery gets gradually
> slower when one requests larger and larger windows - the delay is small but
> noticeable at one million pixels, and gets worse from there.

Sure, I wasn't clear: I was suggesting we limit to 10,000 pixels, not 100,000 :)
Come to think of it, the 10,000-pixel number has got to be obsolete, because you could get to that now using four big screens side by side, expensive but quite plausible.

How about we just set a limit of 100,000 in xpwidgets so it applies to all platforms. The actual number itself is something we could debate for a long time and it wouldn't make much difference in the end. I picked 100,000 because it probably encompasses most unforseen crazy things but is also small enough that we know it doesn't cause a problem on Mac. If Windows freaks out with 100,000 we could lower it, or make the # a macro, or whatever.
100000 isn't that big.  Plenty of long pages are taller than 100000 pixels.  (Think of pages that are more than 100 PgDn's high.)  Don't some of those have widgets over the whole page?
True, we'd need to constrain this to top-level windows in which case maybe a generic solution isn't a good idea.
Comment on attachment 290271 [details] [diff] [review]
Patch to prevent extremely large windows

I think we should fail if the window size is unreasonable. I don't like the idea of just giving it a different size, makes the API unnecessarily complicated and covers up bugs.
Attachment #290271 - Flags: review?(joshmoz) → review-
The patch modified to return an error code works, but it's unfortunate from the user interface point of view; you click on the select, it highlights, and that's all. You have to click a couple times to get the never-seen popup menu to go away and get back to an unhighlighted bar. Yes, there's a message to the console, but normal users aren't going to see that. Anyway, if that's acceptable, then I'll post the patch for review.
That is the expected behavior. That won't happen once the layout fix lands.
Basically the same patch as before, but we return with an error instead of clipping the sizes. (The browser keeps running, only window creation/resize fails).
Attachment #290271 - Attachment is obsolete: true
Attachment #290756 - Flags: review?(joshmoz)
Comment on attachment 290756 [details] [diff] [review]
Modified patch, error out on bad window sizes

Can you post another patch that calls "CheckSize" "WindowSizeAllowed" and has it return a c++ bool?  Then just check that bool and return NS_ERROR_FAILURE from the callers if that is false. That name is more informative and there is no need to gecko-ize this method with gecko error codes.
Actually, I don't see any reason that WindowSizeAllowed has to be part of that object. Please just make it a C helper function in nsCocoaWindow.mm, not exposed in the header.
Attached patch Simplified version of last patch (obsolete) — Splinter Review
Made the requested changes.
Attachment #290756 - Attachment is obsolete: true
Attachment #290774 - Flags: review?(joshmoz)
Attachment #290756 - Flags: review?(joshmoz)
Comment on attachment 290774 [details] [diff] [review]
Simplified version of last patch

Use BOOL instead of PRBool on checkin.

Note - we decided to name it that way for legibility despite the fact that all callers reverse the return value. Not something we overlooked.
Attachment #290774 - Flags: superreview?(roc)
Attachment #290774 - Flags: review?(joshmoz)
Attachment #290774 - Flags: review+
Attachment #290774 - Flags: superreview?(roc) → superreview+
Final version for checkin.
Attachment #290774 - Attachment is obsolete: true
landed on trunk
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Flags: in-testsuite?
You need to log in before you can comment on or make changes to this bug.