Closed Bug 1217190 Opened 4 years ago Closed 4 years ago

Printing is broken with e10s on OS X since a43f6bb86588

Categories

(Core :: Printing: Output, defect)

Unspecified
macOS
defect
Not set

Tracking

()

VERIFIED FIXED
mozilla45
Tracking Status
e10s m8+ ---
firefox44 + verified
firefox45 --- verified
b2g-v2.5 --- fixed

People

(Reporter: mconley, Assigned: mconley)

References

Details

(Keywords: regression, Whiteboard: [Possible workaround in comment 12])

Attachments

(3 files)

STR:

1) Ensure e10s is enabled
2) Attempt to print any web page

ER:

The page should be printed.

AR:

Nothing happens, and the browser doesn't allow you to try printing again, complaining that "Some printing functionality is currently not available".

mozregression places the blame on a43f6bb86588, from bug 1209930.
ted - any idea why updating clang might have impacted printing?
Blocks: 1209930
tracking-e10s: --- → ?
Flags: needinfo?(ted)
Keywords: regression
Depends on: 1217194
Let's not blame the compiler before bug 1217194 is fixed.
I don't have any particular ideas. If it persists after fixing that bug I'd try to track down the function that's failing and see if the codegen is different.
Flags: needinfo?(ted)
Assignee: nobody → mconley
Huh. I guess a straight backout isn't possible. :/
I was able to reproduce this issue on Firefox 44.0a1 (2015-10-23) with e10s enabled using Mac OS X 10.10.4. This is not reproducible on Firefox 43.0a2 (2015-10-22) with e10s enabled.
Flags: qe-verify+
So something that bolsters the claim that a43f6bb86588 is at fault, is that I can't reproduce this in a local build. It's only on builds from Mozilla where this problem seems to rear its head.

This suggests that a change to our build steps introduced the problem.

Ehsan - do you know if there is a way I can (temporarily) start using the same version of Clang that our build machines use?
Flags: needinfo?(ehsan)
Yes. Clone https://github.com/mozilla/build-tooltool, and then run:

tooltool.py fetch -m browser/config/tooltool-manifests/macosx64/releng.manifest

This will download and extract the toolchain we use on the build machines.
Flags: needinfo?(ehsan)
Unstated step there: you then need to set CC=/path/to/clang/bin/clang CXX=/path/to/clang/bin/clang++ (tooltool.py will unpack clang in the cwd).
Hrm, there must be more to it that I've failed to do:

 0:19.98 checking for --build-id option to ld... no
 0:20.01 checking for --ignore-unresolved-symbol option to ld... no
 0:20.03 checking if toolchain supports -mssse3 option... yes
 0:20.05 checking if toolchain supports -msse4.1 option... yes
 0:20.08 checking for x86 AVX2 asm support in compiler... yes
 0:20.10 checking whether the C++ compiler supports -Wno-unused-local-typedef... yes
 0:20.13 checking whether the C++ compiler supports -Wno-inline-new-delete... yes
 0:20.13 checking whether the C++ compiler supports -Wno-unused-local-typedef... (cached) yes
 0:20.15 checking whether the C++ compiler supports -Wno-extended-offsetof... yes
 0:20.17 checking for 64-bit OS... yes
 0:20.32 checking for iOS target... no
 0:20.32 /Users/mikeconley/Projects/mozilla-central/configure: line 9381: test: argument expected
 0:20.35 checking for -dead_strip option to ld... no
 0:20.38 checking for -allow_heap_execute option to ld... no
 0:20.41 checking for valid debug flags... no
 0:20.41 configure: error: These compiler flags are invalid: -g
 0:20.41 ------ config.log ------
 0:20.41 configure:9463: checking for -allow_heap_execute option to ld
 0:20.41 configure:9474: /Users/mikeconley/Projects/build-tooltool/clang/bin/clang -o conftest  -std=gnu99 -fno-strict-aliasing -Qunused-arguments   -Wl,-allow_heap_execute conftest.c  1>&5
 0:20.41 ld: library not found for -lcrt1.10.6.o
 0:20.41 clang-3.8: error: linker command failed with exit code 1 (use -v to see invocation)
 0:20.41 configure: failed program was:
 0:20.41 #line 9467 "configure"
 0:20.41 #include "confdefs.h"
 0:20.41
 0:20.41 int main() {
 0:20.41 return 0;
 0:20.41 ; return 0; }
 0:20.41 configure:10333: checking for valid debug flags
 0:20.41 configure:10344: /Users/mikeconley/Projects/build-tooltool/clang/bin/clang -c  -std=gnu99 -fno-strict-aliasing -g -Qunused-arguments  conftest.c 1>&5
 0:20.41 configure:10338:10: fatal error: 'stdio.h' file not found
 0:20.41 #include <stdio.h>
 0:20.41          ^
 0:20.41 1 error generated.
 0:20.41 configure: failed program was:
 0:20.41 #line 10337 "configure"
 0:20.41 #include "confdefs.h"
 0:20.41 #include <stdio.h>
 0:20.41 int main() {
 0:20.41 printf("Hello World\n");
 0:20.41 ; return 0; }
 0:20.41 configure: error: These compiler flags are invalid: -g
 0:20.41 *** Fix above errors and then restart with               "/Applications/Xcode.app/Contents/Developer/usr/bin/make -f client.mk build"
 0:20.41 make[2]: *** [configure] Error 1
 0:20.41 make[1]: *** [/Users/mikeconley/Projects/mozilla-central/obj-debug/Makefile] Error 2
 0:20.41 make: *** [build] Error 2
 0:20.44 330 compiler warnings present.

Halp?
Bug 1217190 - Make PrintingParent return an nsresult when attempting to show print progress. r?bobowen
Via several try builds, some logging, and some luck, I think I've narrowed this down to a problem with how we attempt to show print progress on OS X (which does, and always will, fail, since nsPrintingPromptServiceX returns NS_ERROR_NOT_IMPLEMENTED for ShowProgress).

A workaround is to try setting print.show_print_progress to false in about:config, and then try printing.
Whiteboard: [Possible workaround in comment 12]
Hey ehsan - when you have a second, can you try the workaround I suggested in comment 12 to see if you're able to print? That works for me locally.
Flags: needinfo?(ehsan)
That seems to work around the issue as you suspected.
Flags: needinfo?(ehsan)
I think I've narrowed this down to an uninitialized bool that gets pickled strangely by the chromium IPC code in the e10s case.

Try build with speculative fix (and logging): https://treeherder.mozilla.org/#/jobs?repo=try&revision=eb81717eae79
This is a sad tale, where the main character is my flawed mental model of how outparams work over IPC.

I had assumed that if you passed an outparam over IPC initialized to some value, and the receiver didn't touch it, that the outparam would still have the same value when the message had returned.

This is NOT the case.

The generated IPC code on the receiver side will pass uninitialized memory to the receiving method, and then if the receiver doesn't touch it, that uninitialized memory will be passed back to the sender as the value, overwriting the initialized value that the outparam had been originally set to.
Attachment #8679081 - Flags: review?(bobowen.code)
Comment on attachment 8679081 [details]
MozReview Request: Bug 1217190 - Make PrintingParent return an nsresult when attempting to show print progress. r?bobowen

Bug 1217190 - Make PrintingParent return an nsresult when attempting to show print progress. r?bobowen
Bug 1217190 - Make sure to initialize outparam in PrintingParent::RecvShowProgress. r?bobowen
Attachment #8680793 - Flags: review?(bobowen.code)
Comment on attachment 8679081 [details]
MozReview Request: Bug 1217190 - Make PrintingParent return an nsresult when attempting to show print progress. r?bobowen

https://reviewboard.mozilla.org/r/23365/#review21243

::: embedding/components/printingui/ipc/nsPrintingProxy.cpp:154
(Diff revision 2)
> +  if (NS_FAILED(rv)) {

Looks like existing code doesn't warn when this fails, so I agree not warning here makes sense.
Attachment #8679081 - Flags: review?(bobowen.code) → review+
Comment on attachment 8680793 [details]
MozReview Request: Bug 1217190 - Make sure to initialize outparam in PrintingParent::RecvShowProgress. r?bobowen

https://reviewboard.mozilla.org/r/23683/#review21247

::: embedding/components/printingui/ipc/PrintingParent.cpp:36
(Diff revision 1)
> +  *notifyOnOpen = false;

Should we change IPDL generation so that the outparams work more like normal ones?

If not we should put something in place to try and catch when they haven't been set by the Recv*.
Attachment #8680793 - Flags: review?(bobowen.code) → review+
https://reviewboard.mozilla.org/r/23683/#review21247

> Should we change IPDL generation so that the outparams work more like normal ones?
> 
> If not we should put something in place to try and catch when they haven't been set by the Recv*.

Yeah, I'm going to file a follow-up to investigate ways of filing off this footgun.
https://hg.mozilla.org/integration/mozilla-inbound/rev/265a761a1248e0a26b0f51709ec6d235cead8fba
Bug 1217190 - Make PrintingParent return an nsresult when attempting to show print progress. r=bobowen

https://hg.mozilla.org/integration/mozilla-inbound/rev/ffb2a9217a0cccabc1a3a380982c90a2a51ef2db
Bug 1217190 - Make sure to initialize outparam in PrintingParent::RecvShowProgress. r=bobowen
Filed follow-up bug 1220168 for trying to address the footgun.
https://hg.mozilla.org/mozilla-central/rev/265a761a1248
https://hg.mozilla.org/mozilla-central/rev/ffb2a9217a0c
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
Comment on attachment 8679081 [details]
MozReview Request: Bug 1217190 - Make PrintingParent return an nsresult when attempting to show print progress. r?bobowen

Approval Request Comment
[Feature/regressing bug #]:

Bug 1209930 

[User impact if declined]:

Users with e10s enabled on OS X are unlikely to be able to print.

[Describe test coverage new/current, TreeHerder]:

This patch has baked over the weekend. Manually tested. Unfortunately, our automated test story for printing is pretty sad.

[Risks and why]: 

Very low risk. Only affects e10s.

[String/UUID change made/needed]:

None.
Attachment #8679081 - Flags: approval-mozilla-aurora?
Comment on attachment 8680793 [details]
MozReview Request: Bug 1217190 - Make sure to initialize outparam in PrintingParent::RecvShowProgress. r?bobowen

See above.
Attachment #8680793 - Flags: approval-mozilla-aurora?
Comment on attachment 8679081 [details]
MozReview Request: Bug 1217190 - Make PrintingParent return an nsresult when attempting to show print progress. r?bobowen

This is a critical bug, the sooner we uplift the patch, the sooner we can get more test coverage. Let's uplift to Aurora44.
Attachment #8679081 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8680793 [details]
MozReview Request: Bug 1217190 - Make sure to initialize outparam in PrintingParent::RecvShowProgress. r?bobowen

Aurora44+
Attachment #8680793 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Vasicila, could you please verify this is fixed on an Aurora build once the fix lands on Aurora branch? Sometime after the next day or two would be good. Many thanks!
Flags: needinfo?(vasilica.mihasca)
I'd like to track this bug as it seems surprising that such a basic scenario regressed.
(In reply to Ritu Kothari (:ritu) from comment #33)
> I'd like to track this bug as it seems surprising that such a basic scenario
> regressed.

Yes, it's unfortunate that we don't have much in the way of automated testing for actually sending data to a printer. That would have caught this.
(In reply to Ritu Kothari (:ritu) from comment #31)
> Vasicila, could you please verify this is fixed on an Aurora build once the
> fix lands on Aurora branch? Sometime after the next day or two would be
> good. Many thanks!

Verified fixed on Firefox 45.0a1 (2015-11-05) and Firefox 44.0a2 (2015-11-05) under Mac OS X 10.9.5 and 10.10.4. The printing works as expected.
Status: RESOLVED → VERIFIED
Flags: needinfo?(vasilica.mihasca)
removing the b2g 2.5 flag since this commit has been reverted due to an incorrect merge, sorry for the confusion
Duplicate of this bug: 1217194
You need to log in before you can comment on or make changes to this bug.