Closed Bug 541897 Opened 14 years ago Closed 14 years ago

[10.6] endless console logging of kCGErrorIllegalArgument with Flash and Silverlight content

Categories

(Camino Graveyard :: Plug-ins, defect)

x86
macOS
defect
Not set
major

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: phiw2, Assigned: stuart.morgan+bugzilla)

References

Details

(Keywords: regression, Whiteboard: [camino-2.0.6])

Attachments

(7 files, 1 obsolete file)

STR - with default profile, Flash 10.0r42, Flashblock OFF
1. load http://adobe.com/products/flashplayer
2. Hide Camino (Cmd-H)

AR logging start.

This doesn't stop until the window with the active content is closed.
I cannot reproduce this with Firefox 3.0.16 (or newer versions)

From Console log (first occurences):
1/25/10 12:18:52 PM	Camino[4492]	kCGErrorIllegalArgument: CGSGetWindowBounds: NULL window
1/25/10 12:18:52 PM	Camino[4492]	kCGErrorFailure: Set a breakpoint @ CGErrorBreakpoint() to catch errors as they are logged.
1/25/10 12:18:52 PM	[0x0-0x3d83d8].org.mozilla.camino[4492]	Mon Jan 25 12:18:52 yugao.local Camino[4492] <Error>: kCGErrorIllegalArgument: CGSGetWindowBounds: NULL window
1/25/10 12:18:52 PM	[0x0-0x3d83d8].org.mozilla.camino[4492]	Mon Jan 25 12:18:52 yugao.local Camino[4492] <Error>: kCGErrorFailure: Set a breakpoint @ CGErrorBreakpoint() to catch errors as they are logged.
1/25/10 12:18:52 PM	Camino[4492]	kCGErrorIllegalArgument: CGSGetWindowBounds: NULL window
1/25/10 12:18:52 PM	[0x0-0x3d83d8].org.mozilla.camino[4492]	Mon Jan 25 12:18:52 yugao.local Camino[4492] <Error>: kCGErrorIllegalArgument: CGSGetWindowBounds: NULL window

(via the forum: http://forums.mozillazine.org/viewtopic.php?f=12&t=1682805)
As far as I can tell, nothing bad happens in Camino after this.
It only spams the error console endlessly, and if my older log files are indicative, suppress any other logging.
Modifying Josh's test plugin to log SetWindow calls and seeing what happens when a window is hidden in the two browsers might be interesting (as would seeing what the backtrace is when breaking as instructed in the logging, to see what the browser call triggering this is--although my guess would be the null event, given the spam, in which case that doesn't really help)
(In reply to comment #2)
> Modifying Josh's test plugin to log SetWindow calls and seeing what happens

Does smichaud's (bug 441880) already do that?  I can't tell if any of those bits of logging are what we'd want here or not….  (I also don't know where to find Josh's.)
Armen is getting bitten by this pretty badly.
All right so this bug is causing serious issues for me. I received ~27000 entries over a span of approximately 50 minutes in my logs. Whether directly, or indirectly through Spotlight attempting to index my logs, this brings my machine to it's knees until it becomes completely unresponsive and I have to do a hard reboot.

So, I installed 1.08 of the DebugEventsPlugin, hit test.html, played some Flash, hid Camino, then showed it again. The resulting log events from system.log are captured here: http://pastebin.org/83747

I hope this is helpful in debugging the cause of this issue. In the mean time I guess the way to avoid the issue is to never hide Camino in OS X 10.6.
For comparison purposes, this is the logging I get when running Steven's plug-in with the test.html file in Camino 2.0.1 and Fx3.0.16 on 10.5.8.

Two interesting things:
1) The first time I ran this in Firefox, Firefox logged a lot more stuff--on a delay.  It looks like it's double-logging stuff and delaying writing to the Console (see the timestamps).

1a) After that, I was able to run the tests (repeatedly) without triggering the double-logging/delay somehow, so the second log block seems valid.

3) Camino appears not to trigger a NPP_SetWindow call on hiding, and, either the plug-in's logging is wrong (only looking for ToolbarWindows and other non-embedding window types?), or we end up calling the "hide" NPP_SetWindow on showing!

This is on 10.5.8 with release builds, but I tried with my 2.1 debug build and didn't notice a material difference in the logging.
I have a second capture of log events that is hopefully more clear here:
http://pastebin.org/83756

Also note that after my comments and the marker where I am "showing Camino
again" it took me a second to switch back to Camino after inserting the marker
so, those events in there all happened in that timeframe. They stop after
Camino is unhidden.
(In reply to comment #7)
> For comparison purposes, this is the logging I get when running Steven's
> plug-in with the test.html file in Camino 2.0.1 and Fx3.0.16 on 10.5.8.

To be clear, the exact methodology was this:

1) Launch {Camino|Firefox} with a fresh profile
2) Position the browser window so that I can see the Console's system.log and the mark button; close the browser window
3) Position the cursor over the Console window's "mark" button
4) Open test.html via the keyboard, wait for logging to stop
5) Carefully Cmd-click the "mark" button
6) Hide Camino with Cmd-H
7) Wait for logging to stop
5) Carefully click the "mark" button (Console.app is active now)
8) Show Camino via Cmd-Tab
9) Wait for logging to stop
10) Carefully Cmd-click the "mark" button
11) Cmd-Q, and make sure the only additional plug-in logging generated is the shutdown logging

This prevents any spam from spurious mouse-moves, etc.

My log in comment 7 encompasses the logging generated from step 5 to step 9.

Armen's logging so far has been too noisy (and too hard to tell where the "mark"s are) to really tell anything, but it does look like the plug-in is logging more on 10.6--some focus changes, as well as that the windows are BrowserWindows.
Chris did a logging session on 10.6, and while it's far less noisy, I'm not sure what it tells us, other than Firefox is generating stuff completely differently than us :-(  (I've snipped the duplicate messages from the fx log for this attachment; full output is http://pastebin.mozilla.org/700243.)

In particular, Fx seems to make more SetWindow calls, almost in pairs. On hide, they don't seem to call it with a null window (contra 10.5); on show, it's one with a ToolbarWindow and one with null.

We get one on show and one on hide, both with null windows (both cgContextWindow and pluginWindow). Also, on opening the page, we make 5 calls, 3 with cgContextWindow (null), pluginWindow <BrowserWindow:0x1a81cef0>, and 2 with cgContextWindow <BrowserWindow: 0x1a81cef0>, pluginWindow <BrowserWindow: 0x1a81cef0>.
All the same text from comment 10, with 100% more of Chris's logs.
Attachment #424189 - Attachment is obsolete: true
This got logged hundreds of times with a Gecko 1.9.2 based build running on 10.6.2. Same str. Flash logs the same thing.
Also, just noticed, the log messages get spit out when you minimize the window as well.
And another thing, this becomes problematic when it triggers logfile rotation, bzip instances (along with mds) start consuming 100% CPU when that occurs...
Experiencing the same thing, but not with all web sites.  I can minimize many sites, even with multiple tabs, and not get the problem.  Other sites cause the problem right away.  One for sure is Yahoo mail http://us.mg1.mail.yahoo.com.  Minimize a multiple tab window with Yahoo and other sites, the logging starts going nuts.  Remove the Yahoo tab and minimize, no logging.
FYI... OS X 10.6.3  How I found this issue and this bug page is that I had several log files taking up 10 GB of space...
Update... I downloaded 1.6.10... No logging issue whatsoever, so I'll be sticking with this version.  Obviously a 2.0 problem.
SetWindow turned out to be a red herring, it looks like; Gecko isn't clipping the plugin in either browser, and Flash is calling HIPointConvert in both every idle tick. And even if it is clipped (scrolled off the screen), it still happens, just on the slower timer.

The proximal cause is that GetNativeWindowFromWindowRef is succeeding on the HIPointConvert WindowRef destination argument in Firefox, but failing in Camino, returning the NULL that is then passed into CGSGetWindowBounds.

Why that difference, I'm not sure. Something internal to the implementation of the two window classes, maybe? But since they are both NSWindow subclasses, I'm not sure what.

Options for going forward here, as I see it:
1) Did deeper into why GetNativeWindowFromWindowRef is failing (maybe mucking around in internal window structures to try to figure out why Firefox doesn't have this problem).
2) Wait for the Cocoa event model to make this all go away (1.9.3 IIRC).
I had hoped there would be an option 3 around forcing the plugin to be clipped on hide/minimize, but apparently that wouldn't actually help.

I don't have the time/energy to do 1, I'm afraid.
so why would clipping not help again?
Not sure if this is caused by exactly the same underlying issue, but following the STR in comment 9, I get similar errors. This is with a 1.9.2 build on 10.5.8.
Two things people can do to help here (unless you’re awesome with a debugger, in which case please jump on comment 18):

1) See if Safari on 10.6 displays similar logging

2) Find the regression range.  Comment 17 asserts this doesn't happen in 1.6.x, so, if that's true, sometime between the MOZILLA_1_8_BRANCH branching and the release of 2.0.x, a code change (probably in Gecko, but possibly in Camino) did something that causes the logging to start.

http://ftp.mozilla.org/pub/mozilla.org/camino/nightly/

1.8 branched on 2005-08-12, so you may want to start from the present and work backwards since the odds are better that it's a newer checkin rather than an older one.

In 2005 and early 2006 (through mid-April), trunk builds are those that *don't* have "-1.0" style designator on their directories.

Beginning in mid-April 2006, trunk builds have the "-trunk/" designator on the directories.

Starting in late June 2008, trunk builds have the "-2.0-M1.9/" designator instead (starting mid-July 2009, this switches to "-2.1-M1.9/" instead, when the Camino 2.0 branch began, but since 2.0 has the bug, it's probably not necessary to switch here).
This is with a 1.9.2 build on 10.4.11 and Flash 10.1.53.7.
Attachment #440628 - Attachment mime type: application/octet-stream → text/plain
I can already tell that -on 10.6- the issue is present with the 20080115 build [2.0a1pre (1.9b3pre 2008011500)].
On 10.5.8 / Intel

* Version 2.0.2 (1.9.0.18 2010021619)
  with Shockwave Flash 10.0 r45 OK (no logging)
  with Shockwave Flash 10.1 r53 OK (no logging)

* Version 2.1a1pre (1.9.2.4pre 20100412215207)
  Version 2.1a1pre (1.9.2.2pre 20100222003750)
  with Shockwave Flash 10.0 r45  logging
  with Shockwave Flash 10.1 r53  logging
If it is the drawing models, that should also mean using Flash 9 should not cause the logging in any of the build+OS pairs where Flash 10/10.1 do cause it: http://kb2.adobe.com/cps/406/kb406791.html
(In reply to comment #26)
> If it is the drawing models, that should also mean using Flash 9 should not
> cause the logging in any of the build+OS pairs where Flash 10/10.1 do cause it:
> http://kb2.adobe.com/cps/406/kb406791.html

Er, no, you need a version of Flash 9 prior to r60, it looks like (see bug 395983).

Note also that bug 344427 caused bug 395983, which in turn caused a host of bugs (including bug 419187, which only showed up in Camino and Sm, and included a bunch of fun CG errors, and which I guess was resolved by bug 408898).

So there's a rabbit hole, but it's not *too* deep (and maybe Stuart can spot something in the widget and gfx changes there that has a bad assumption?).
(In reply to comment #26)
> If it is the drawing models, that should also mean using Flash 9 should not
> cause the logging in any of the build+OS pairs where Flash 10/10.1 do cause it:
> http://kb2.adobe.com/cps/406/kb406791.html

I was thinking the same, except not really.
10.6 - Flash 9,0,262,0 installed
Version 2.1a1pre (1.9.2.4pre 20100412215207)
Version 2.1a1pre (1.9.0.19pre 2010032701)
logging - even worse:
4/22/10 10:56:29 AM	Camino[71767]	*** process 71767 exceeded 500 log message per second limit  -  remaining messages this second discarded ***

Can't test on 10.5.8 right now. Someone's working on that machine (the horror).
(In reply to comment #28)
> I was thinking the same, except not really.
> 10.6 - Flash 9,0,262,0 installed

Yeah, I was wrong and corrected myself in comment 27; you need 9.0r47 or older (i.e., prior to r60).

I probably have a copy somewhere (since I added r16-r124 to https://wiki.mozilla.org/CrashKill/Mac_Flash_Identifiers), but it's either on the PB getting upgraded or on some unknown CD backup.
Well, mine is PPC-only.  Adobe has http://kb2.adobe.com/cps/142/tn_14266.html, which has a 235 MB Flash 9 Players archive, which hopefully has an Intel 9r47.
Yup, that is the one - it contains UB builds.

Tested on 10.6 with Camino-Gecko1.9.2 and flash 9 r47:
hiding or minimising the window does _not_ trigger any crazy logging.
Some sanity checking with the following builds:

Version 2.1a1pre (1.9.0.19pre 2010042201)
Version 2.0.2 (1.9.0.18 2010021619)
Version 2007032611 (1.2+)
Version 2007032701 (1.2+)

Flash 9,0,47,0 --> no logging
(no  Flash r 60 available in that archive)
Flash 9,0,115,0 --> triggers logging starting with the 2007032701 build
Blocks: 344427
Summary: [10.6] endless logging of kCGErrorIllegalArgument with flash content → [10.6] endless logging of kCGErrorIllegalArgument with Flash and Silverlight content
In addition to nsCocoaWindow.mm (which is the XUL NSWindow subclass), there's nsWindowMap.mm, which has a lot of focus and activation code and apparently gets lists of native windows, and which has a lot of special-casing for Camino (specifically, from the comments, stuff Gecko would send to nsCocoaWindows in XUL apps gets sent directly to our NSViews).
Same problem with Camino 2.1a1pre.
Problem still exists with 2.0.4 released this week.  Camino 2 is crippled.  Why fix other stuff with this huge problem?

I'm moving to Google Chrome.
Yes, clearly it's in everyone's best interests to stop releasing security updates that are already written (by people in the Gecko community who have nothing to do with Camino) until we fix a bug that's already in shipped builds.

Please stop spamming this bug to tell us it's not fixed. If it were fixed, the bug would have been closed.
Summary: [10.6] endless logging of kCGErrorIllegalArgument with Flash and Silverlight content → [10.6] endless console logging of kCGErrorIllegalArgument with Flash and Silverlight content
I have reported this problem to Adobe (Flash bug 2747326), but it has been resolved WONTFIX.
Chris, can you give a link to that bug? Unless Adobe has recently changed their numbering format, that bug number doesn't make sense. (At least, I can't find it on the Flash Player JIRA at bugs.adobe.com.)
Sorry. That bug number is from Adobe's internal bug database. (I am an Adobe employee.) I included that internal bug number for future reference, in case the Flash team decides to revisit this problem.

If this really is a Flash bug (even though it does not repro with Firefox+Flash but does repro with Camino+Silverlight), I can try to escalate the bug inside Adobe.
Well, the issues seems to be (from what I found) that calling HIPointConvert on a minimized/hidden Camino window fails. Since the plugin is the one making that call, and it doesn't work right, it's arguably a Flash bug that it is making that call without checking the window state first.

(The flip side is that there may be something we can do to make it not fail, since it doesn't fail in Firefox, but it's not clear that it's anything but an accident of implementation that it works in Firefox.)
Actually, here's an interesting question:
Does anyone have a 10.6 machine with a GMA 950 integrated graphics card (e.g., a bunch of older MacBooks (not MacBook Pros) and Minis have them)?

If so, does YouTube (or some other site with Flash that doesn't use wmode=opaque/transparent) cause this logging in Safari? That specific configuration should use the Carbon event model even in Safari. If it happens there, maybe Adobe would reconsider :)
(In reply to comment #41)
> Does anyone have a 10.6 machine with a GMA 950 integrated graphics card

*raises hand*

I'm not sure I'm doing this correctly, but no, I can't find anything in windowserver.log while safari is playing a youtube video and is hidden with cmd-H.

This is on 10.6.4 with whatever flash version comes with the OS, and Safari 5.0.2.
...or pretty much any other log that I could find (windowserver.log seemed to have the most recent kCGErrors)
I had an inspiration and made a test app--sure enough this is trivially reproducible there--then played with the window properties. Turns out all we need to do is turn off one-shot on BrowserWindow. Given the documentation for that method we probably don't want it on anyway (I bet we've just never given it any thought and thus had the default), so win-win.

On trunk we should do this in the nib--what OS are we using for that nib these days?

For branch, I'll attach the one line patch momentarily so we don't have to mess with localization.
Flags: camino2.0.6+
(In reply to comment #40)
> Since the plugin is the one making that call, and it doesn't work right, it's arguably
> a Flash bug that it is making that call without checking the window state first.

Or, given the root cause, not really. Sorry you filed a bogus bug on our behalf!
Attached patch branch fixSplinter Review
Assignee: nobody → stuart.morgan+bugzilla
Status: NEW → ASSIGNED
Attachment #488659 - Flags: superreview?(mikepinkerton)
Comment on attachment 488659 [details] [diff] [review]
branch fix

sr=pink. do you want a comment explaining why this is necessary? it seems rather random.
Attachment #488659 - Flags: superreview?(mikepinkerton) → superreview+
I figured since it's branch only it's not worth the comment; we'll uncheck it in the nib on trunk.
Wow. Looking at the documentation for setOneShot: the bug seems pretty obvious now. Nice catch!
For reference, here's the BrowserWindow.nib as landed (one shot unchecked).

http://hg.mozilla.org/camino/rev/78f199411aa9
Comment on attachment 488659 [details] [diff] [review]
branch fix

And branch patch landed on CAMINO_2_0_BRANCH in advance of 2.0.6. :D

As always, these fixes will be in tomorrow's nightly builds (2.0.6pre and 2.1a1pre).
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [camino-2.0.6]
Verified with both Silverlight and Flash with

Version 2.1a1pre (1.9.2.13pre 20101108001259)
Version 2.0.6pre (1.9.0.19pre 2010110800)


now with 100% more usable console logs :-)
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: