MOZ_CAN_RUN_SCRIPT related Thunderbird build failures of 2020-04-04
Categories
(MailNews Core :: Backend, defect, P1)
Tracking
(thunderbird76 fixed)
Tracking | Status | |
---|---|---|
thunderbird76 | --- | fixed |
People
(Reporter: mkmelin, Assigned: benc)
References
Details
Attachments
(3 files, 1 obsolete file)
2.68 KB,
patch
|
mkmelin
:
review+
wsmwk
:
approval-comm-beta+
|
Details | Diff | Splinter Review |
16.69 KB,
patch
|
mkmelin
:
review+
|
Details | Diff | Splinter Review |
115.41 KB,
patch
|
Details | Diff | Splinter Review |
+++ This bug was initially created as a clone of Bug #1619914 +++
Thought I had fixed it since it build locally, but this check requires this to be set.
ac_add_options --enable-clang-plugin
I don't see a very easy solution here. Just adding MOZ_CAN_RUN_SCRIPT doesn't work for NS_IMETHODIMPs like nsMsgCopyService::CopyMessages
MOZ_CAN_RUN_SCRIPT_BOUNDARY NS_IMETHODIMP seem to help there, but that gives errors such as
/home/magnus/Code/tb/mozilla3/comm/mailnews/base/src/nsMessenger.cpp:1411:7: error: arguments must all be strong refs or caller's parameters when calling a function marked as MOZ_CAN_RUN_SCRIPT (including the implicit object argument). 'mTxnMgr' is neither.
Any good ideas?
An explanation of how it's supposed to work is apparently here: https://searchfox.org/comm-central/rev/14bbd9a262f7ca60026345024fbd2ee461c04268/mozilla/build/clang-plugin/CanRunScriptChecker.cpp#14-48
Reporter | ||
Updated•5 years ago
|
Reporter | ||
Comment 1•5 years ago
|
||
Updated•5 years ago
|
I realize why you took this action, but this is entirely backwards. What really needs to be done is to enable clan-plugin on all developers builds so that this does not happen.
Actually, what would probably be better for situations such as this would be if you could specify --enable-clang-plugin=warn to make normally fatal errors be warnings, rather than building without it altogether.
Reporter | ||
Comment 5•5 years ago
|
||
We'll enable it again as soon as we get CI build working with it enabled - just can't afford not to find other problems that could creep in while we figure it out. The real fix doesn't appear super trivial for a quick weekend action.
I was not criticizing you for deciding to build without it. I was trying to make a comment on what should be done for proper development going forward. There is no reason for any developers to not be building with the clang-plugin enabled. Secondly if something odd happens and something lads it would be better if there were a way to make all error clang-plugin checks to be warnings so that you don't have to disable it completely so there is still a mechanism to see new things landing that are issues.
Comment hidden (Intermittent Failures Robot) |
Comment 9•5 years ago
|
||
Another possibility is to have another build job in CI that can run daily with --enable-clang-plugin-alpha set. (I'm assuming these checks will show up there prior to being being made default.) The idea being that we will get a little more notice that something is coming. Make is a tier 2 job so it doesn't throw anything off.
Comment 10•5 years ago
|
||
(In reply to mac198442 from comment #4)
Actually, what would probably be better for situations such as this would be if you could specify --enable-clang-plugin=warn to make normally fatal errors be warnings, rather than building without it altogether.
Is that a feature suggestion? It doesn't work:
0:04.37 mozbuild.configure.options.InvalidOptionError: --enable-clang-plugin takes 0 values
Comment 11•5 years ago
|
||
Updated•5 years ago
|
Comment 12•5 years ago
|
||
(In reply to Rob Lemley [:rjl] from comment #10)
(In reply to mac198442 from comment #4)
Actually, what would probably be better for situations such as this would be if you could specify --enable-clang-plugin=warn to make normally fatal errors be warnings, rather than building without it altogether.
Is that a feature suggestion? It doesn't work:
0:04.37 mozbuild.configure.options.InvalidOptionError: --enable-clang-plugin takes 0 values
I am sorry that was a feature suggestion. Probably does not belong in this bug.
Reporter | ||
Comment 13•5 years ago
|
||
I think it's not a problem to do a beta with it disabled.
Reporter | ||
Updated•5 years ago
|
Assignee | ||
Comment 14•5 years ago
|
||
(In reply to Magnus Melin [:mkmelin] from comment #0)
MOZ_CAN_RUN_SCRIPT_BOUNDARY NS_IMETHODIMP seem to help there, but that gives errors such as
/home/magnus/Code/tb/mozilla3/comm/mailnews/base/src/nsMessenger.cpp:1411:7: error: arguments must all be strong refs or caller's parameters when calling a function marked as MOZ_CAN_RUN_SCRIPT (including the implicit object argument). 'mTxnMgr' is neither.Any good ideas?
I think it's mostly a matter of adding [can_run_script]
decorators in .idl files, to all the interface methods which might eventually invoke a script (or call something else which might invoke a script). [can_run_script]
adds the MOZ_CAN_RUN_SCRIPT
into the generated C++ for you.
In practice, I think it'll be a case of the [can_run_script]
methods in the M-C interfaces causing [can_run_script]
ripples, spreading upward through the C-C codebase. A bit tedious, but all sane and sensible stuff.
There are a few other conditions to satisfy - eg scripts might decide to hold a reference to any params you pass in, way beyond the life of that function call. That includes the this
pointer. So you've got to be careful about refcounting stuff you pass in. I can see the static analysis preventing all kinds of subtle footguns here!
Anyway, I've made a start on it. Will see how far the [can_run_script]
ripples spread tomorrow!
Comment 15•5 years ago
|
||
Comment 16•5 years ago
|
||
Assignee | ||
Comment 17•5 years ago
|
||
Uploading what I've got so far as I might not be around so much over the next couple of days.
- marks most of the MOZ_CAN_RUN_SCRIPT functions needed to get the static analysis working, except for a couple of places where we're implementing non-can_run_script M-C interfaces. But they shouldn't be too hard to sort out.
- still a bunch of calls to fix up where we need to observe the parameter restrictions. Mostly trivial stuff, just tedious. Usually just a matter of adding a local nsCOMPtr<> to be really explicit that the parameter is live.
So I don't think we're too far off being able to compile with the static analysis back on.
Updated•5 years ago
|
Assignee | ||
Comment 19•5 years ago
|
||
Hmm. The changes are still rippling outward. It's hard to gauge progress as it only prints out a few error messages at a time (and of course, each fix has the potential to cause other errors).
I'm going to have one more run on it tomorrow, and if that doesn't nail them all I'll step back and reassess. Most likely that means establishing a minimal MOZ_CAN_RUN_SCRIPT_BOUNDARY perimeter, and logging a bug to try and better define it all.
I think the bigger issue is that we've got no clear boundary between the native and and script sides. Virtually any XPCOM object could (in theory) be implemented in script.
In practice, a lot of interfaces only have native implementations, so it probably makes sense to go through and document the ones we're prepared to commit to being non-scriptable, for a start.
And even then, the unit tests mock up a lot of these otherwise-native-only interfaces in script (which is a great capability we'd want to retain).
Comment 20•5 years ago
|
||
IMHO, MOZ_CAN_RUN_SCRIPT_BOUNDARY is the way to go, otherwise you'll end up with a huge patch, see:
https://searchfox.org/comm-central/search?q=MOZ_CAN_RUN_SCRIPT_BOUNDARY&case=false®exp=false&path=mailnews
Assignee | ||
Comment 21•5 years ago
|
||
(In reply to Jorg K (CEST = GMT+2) from comment #20)
IMHO, MOZ_CAN_RUN_SCRIPT_BOUNDARY is the way to go, otherwise you'll end up with a huge patch
For sure. That's the way I'll go now.
But it's essentially bypassing all the C++/script boundary static analysis, which would be nice to have.
I figured it was worth a shot to see if we could go the whole hog and mark everything "correctly". As various M-C interfaces are marked, there might have been a restricted effect in C-C code which was managable.... but no. The ripples quickly turn into a tsunami :-(
Oh well. For now I think its most important to just have the rest of the static analysis going, even if we don't get the benefit of the C++/script boundary checking. But further down the road, it'd be really nice to really nail down where the native/script boundaries lie in TB. I think that'd clarify a whole load of other issues too.
Comment 22•5 years ago
•
|
||
FYI: Adding [can_run_script]
to editor interface methods means that the method calls may run mutation event listeners, mutation observers, input
event listeners, selectionchange
event listeners and layout information observers (I don't know the detail of this). JS can destroy the editor and can change the DOM tree. Therefore, each caller of them needs to guarantee the method's class lifetime and arguments' lifetime. On the other hand, if callers are sure that there are only safe event listeners which don't do that. I.e., cannot be handled by web contents (including HTML email), in other words, it changes only in chrome and which is not touchable from addons, the callers can be marked as MOZ_CAN_RUN_SCRIPT_BOUNDARY
and it's better strategy than marking all callers as MOZ_CAN_RUN_SCRIPT
.
Reporter | ||
Comment 23•5 years ago
|
||
If you had a more complete wip patch for the [can_run_script], can you attach it?
And should we do both? Use MOZ_CAN_RUN_SCRIPT_BOUNDARY but mark stuff as [can_run_script] where it's needed?
Comment 24•5 years ago
|
||
Does this actually block 76 beta2?
Assignee | ||
Comment 25•5 years ago
|
||
This one is the tightest MOZ_CAN_RUN_SCRIPT_BOUNDARY
circle I could draw around the nsITransaction
methods. Gets it compiling for me with ac_add_options --disable-clang-plugin
in my mozconfig.
Here's the approach I used, as I'm sure this will come up again:
- Where possible add the MOZ_CAN_RUN_SCRIPT_BOUNDARY
decoration to the method declaration in the
.h` file (seems to be the preferred approach). - In a lot of places the offending method is declared in a
NS_DECL_...
macro, so there I decorated the method definition in the.cpp
file instead. - Any calls to
[can_run_script]
(akaMOZ_CAN_RUN_SCRIPT
) methods need to ensure none of the arguments can possibly be destroyed during the call - includingthis
. So I addednsCOMPtr<>
orRefPtr<>
locals where required.
Assignee | ||
Comment 26•5 years ago
|
||
This is the abortive attempt to use MOZ_CAN_RUN_SCRIPT
and avoid the use of MOZ_CAN_RUN_SCRIPT_BOUNDARY
. There was a bit more, but I'm afraid I ditched it in frustration.
I still don't entirely grasp the intent of the MOZ_CAN_RUN_SCRIPT
analysis, in the context of C-C.
Comment 22 seems to imply that it's for script running in addons or "in-the-wild" (I guess we don't have too much of that, although presumably there's some issue with html emails..).
Are we able to nail down where those boundaries lie, and extend out our MOZ_CAN_RUN_SCRIPT
circle appropriately?
That said, it seems that a lot of the checking is applicable to any C++/Javascript boundary. But doing that kind of sucks in pretty much our entire codebase...
I don't know. I love the approach of having static analysis detect potentially dodgy calls to script land, but I don't quite grasp how we can best use it. Happy to have a go at improving the approach if anyone knows which direction we should go in...
In any case, if my minimal MOZ_CAN_RUN_SCRIPT_BOUNDARY
patch works, at least we can reenable static analysis (and hopefully set or recommend it as a default for developers doing local builds).
Reporter | ||
Comment 27•5 years ago
|
||
Comment 28•5 years ago
|
||
Reporter | ||
Comment 29•5 years ago
|
||
Comment 30•5 years ago
|
||
Assignee | ||
Comment 31•5 years ago
|
||
(In reply to Magnus Melin [:mkmelin] from comment #29)
Comment on attachment 9141604 [details] [diff] [review]
So should we salvage some of this? Is it useful?
No, I think that way madness lies. We end up with just about every single function requiring MOZ_CAN_RUN_SCRIPT
.
I think, for now, using MOZ_CAN_RUN_SCRIPT_BOUNDARY
to fence off code which calls MOZ_CAN_RUN_SCRIPT
/[can_run_script]
methods is the way to go. Ideally we'd have a clear idea of where the script/native boundaries really are, and use MOZ_CAN_RUN_SCRIPT
up to there, but I don't think we have that clear delineation.
Reporter | ||
Comment 32•5 years ago
|
||
Have you done a try build with https://hg.mozilla.org/comm-central/rev/3cdb2ce79d79 backed out?
Assignee | ||
Comment 33•5 years ago
|
||
(In reply to Magnus Melin [:mkmelin] from comment #32)
Have you done a try build with https://hg.mozilla.org/comm-central/rev/3cdb2ce79d79 backed out?
I assume you meant backing out https://hg.mozilla.org/comm-central/rev/0d3828ea7422 also?
Try build here, which I think looks OK, as far as this bug goes:
Reporter | ||
Updated•5 years ago
|
Comment 34•5 years ago
|
||
Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/f249a765cc4b
Backed out changeset 3cdb2ce79d79 (bug 1627437) now that 1627437 is fixed (adjustments for nsITransaction::Do/UndoTransaction MOZ_CAN_RUN_SCRIPT). r=backout
https://hg.mozilla.org/comm-central/rev/eaf2127db4f3
Backed out changeset 0d3828ea7422 (for beta/release) now that 1627437 is fixed (adjustments for nsITransaction::Do/UndoTransaction MOZ_CAN_RUN_SCRIPT). r=backout DONTBUILD
Comment 35•5 years ago
|
||
backout bugherder uplift |
Thunderbird 76.0b3:
https://hg.mozilla.org/releases/comm-beta/rev/da0a04cfc049
Comment 36•5 years ago
|
||
bugherder uplift |
Thunderbird 76.0b3:
https://hg.mozilla.org/releases/comm-beta/rev/c6fef8bbb3fc
Thunderbird 76.0b3:
https://hg.mozilla.org/releases/comm-beta/rev/12de722b5dc2
Updated•5 years ago
|
Description
•