Closed Bug 1152921 Opened 5 years ago Closed 5 years ago

[e10s] crash when attempting to print, due to fatal assertion in chromium IPC code

Categories

(Core :: Printing: Output, defect)

x86_64
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla40
Tracking Status
e10s + ---
firefox40 --- fixed

People

(Reporter: dbaron, Assigned: dbaron)

References

Details

(Keywords: assertion, crash)

Attachments

(4 files, 1 obsolete file)

I crash every time when attempting to print from a Linux debug build, and have since e10s was enabled.  It's pretty annoying that it crashes the whole build rather than failing in some better way.

It's an abort from inside chromium IPC code.  In PPrintingParent.cpp, the stack goes through this Read call:

    if ((!(Read((&((__v)->isFramesetDocument())), __msg, __iter)))) {
        FatalError("Error deserializing 'isFramesetDocument' (bool) member of 'PrintData'");
        return false;
    }


The chromium IPC call is asserting that the bool it's reading is either 0 or 1, and it's in fact 99.
Just to confirm, this is DEBUG-only? Also, can you make sure this still reproduces after a clobber build? It sounds like the sort of thing that might happen with a broken build.
Flags: needinfo?(dbaron)
The assertion is DEBUG-only, yes.

I'm pretty sure AUTOCLOBBER has made me clobber multiple times in the time I've been seeing this.

(What is the expected behavior of printing on Linux e10s right now?  No-op?  A different crash?)
Flags: needinfo?(dbaron)
(Also, given that the IPC code asserts that reads of bools are 0 or 1 -- perhaps it should assert the same about writes?)
This is what valgrind says about the sending side.
I think the basic problem is that only nsPrintOptionsWin::SerializeToPrintData touches isFramesetDocument() (and even then not all the time), whereas the other implementations leave it uninitialized, which isn't OK.
This fixes the crash in Linux debug builds bringing up the print dialog,
though there's still a crash when closing it.
Attachment #8592476 - Flags: review?(wmccloskey)
Assignee: nobody → dbaron
Status: NEW → ASSIGNED
Comment on attachment 8592476 [details] [diff] [review]
Always initialize platform-specific booleans in PrintData before sending over IPC

Review of attachment 8592476 [details] [diff] [review]:
-----------------------------------------------------------------

I don't really know this code. Mike, is there any reason why we're not filling these fields in with the correct values?
Attachment #8592476 - Flags: review?(wmccloskey) → review?(mconley)
Comment on attachment 8592476 [details] [diff] [review]
Always initialize platform-specific booleans in PrintData before sending over IPC

(In reply to Bill McCloskey (:billm) from comment #8)
> Comment on attachment 8592476 [details] [diff] [review]
> Always initialize platform-specific booleans in PrintData before sending
> over IPC
> 
> Review of attachment 8592476 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I don't really know this code. Mike, is there any reason why we're not
> filling these fields in with the correct values?

Probably just because I didn't realize I needed to initialize the fields in the case they weren't being used. :(

To be extra safe, perhaps we should auto-initialize all of the platform specific fields, just to protect against the individual implementations changing?
Attachment #8592476 - Flags: review?(mconley) → review+
(In reply to Mike Conley (:mconley) - Needinfo me! from comment #9)
> To be extra safe, perhaps we should auto-initialize all of the platform
> specific fields, just to protect against the individual implementations
> changing?

Do you mean in the constructor?

That said, this patch is in the base class, not the GTK implementation, so the idea is that the base class method (which the win/gtk overrides call for most of their work) would always initialize everything.
Flags: needinfo?(mconley)
What I'm asking is whether there's some value we should be using instead of false. For example, why do we only initialize isFramesetFrameSelected on Windows? It seems like we could do it on all platforms. Even if it's not used, it seems cleaner to me.
This fixes a crash in debug builds (due to uninitialized booleans) when
canceling a print dialog.
Attachment #8592517 - Flags: review?(mconley)
(In reply to Bill McCloskey (:billm) from comment #11)
> What I'm asking is whether there's some value we should be using instead of
> false. For example, why do we only initialize isFramesetFrameSelected on
> Windows?

Going over my notes, it looks like I did this because the Windows implementation of nsPrintDialogUtil.cpp attempts to access the nsIWebBrowserPrint[1] from the content to be printed.

An nsIWebBrowserPrint, however, does not serialize nicely. Instead, I took the subset of items that nsPrintDialogUtil reads from nsIWebBrowserPrint, send those up via the PrintData struct, and then wrap the PrintData struct in a "mock" nsIWebBrowserPrint[2] that just returns those values that nsPrintDialogUtils needs.

> It seems like we could do it on all platforms. Even if it's not
> used, it seems cleaner to me.

I'd be fine with that. I can file a follow-up bug to do that work.


[1]: https://hg.mozilla.org/mozilla-central/file/388f5861dc7d/embedding/components/printingui/win/nsPrintDialogUtil.cpp#l1044
[2]: https://dxr.mozilla.org/mozilla-central/source/embedding/components/printingui/ipc/PrintDataUtils.cpp#69
Flags: needinfo?(mconley)
Filed bug 1154519.

(In reply to David Baron [:dbaron] ⏰UTC-7 from comment #10)
> (In reply to Mike Conley (:mconley) - Needinfo me! from comment #9)
> > To be extra safe, perhaps we should auto-initialize all of the platform
> > specific fields, just to protect against the individual implementations
> > changing?
> 
> Do you mean in the constructor?
> 
> That said, this patch is in the base class, not the GTK implementation, so
> the idea is that the base class method (which the win/gtk overrides call for
> most of their work) would always initialize everything.

I completely didn't realize that we initialize strings, but we don't initialize the primitive types. billm just explained it to me, so disregard what I said.
Filed bug 1154522 to consider a way to address this more generally.
I think I'd like to see how bug 1154522 goes before I review this other patch. If bug 1154522 goes through, it'll make this second patch unnecessary.

Actually it'll avoid the crash in both cases.
FWIW, I discussed initialization with bent prior to writing the second patch, and he thought we should (instead of initializing) do what's in the patch, at least partly because the valgrind warnings are useful in catching bugs and initializing would suppress those.
(And bent dictated most of the second patch over my shoulder, thus the credit to him in the From: line.)
I'll update the second patch with a comment and an assertion that I should have had before...
This fixes a crash in debug builds (due to uninitialized booleans) when
canceling a print dialog.
Attachment #8593081 - Flags: review?(mconley)
Attachment #8592517 - Attachment is obsolete: true
Attachment #8592517 - Flags: review?(mconley)
Thanks dbaron - I'll see if I can prod bug 1154522 to some kind of resolution.
tracking-e10s: --- → +
Comment on attachment 8593081 [details] [diff] [review]
Don't send an uninitialized PrintData over IPC when cancelling print dialog (or failing ShowPrintDialog for other reasons)

Review of attachment 8593081 [details] [diff] [review]:
-----------------------------------------------------------------

Sounds like bug 1154522 is unlikely to be resolved soonish, and as bent pointed out, this should result in less IPC data transfer in the failure case.

Thanks dbaron!
Attachment #8593081 - Flags: review?(mconley) → review+
https://hg.mozilla.org/mozilla-central/rev/da7ce9761d8e
https://hg.mozilla.org/mozilla-central/rev/a8cfb09ecc21
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
You need to log in before you can comment on or make changes to this bug.