Closed Bug 776017 Opened 12 years ago Closed 12 years ago

Submit addon for admin review to bypass validation (fixes: Yoono addon fails on validation during upload)

Component: addons.mozilla.org Graveyard :: Add-on Validation
Type: defect
Severity: major
Status: RESOLVED FIXED
Closed: 2012-10-18
Reporter: eviljeff
Assignee: basta
Attachments: 1 file
After uploading the xpi, validation always fails with a "timeout after 90 seconds" error.

I've reproduced the error on prod and staging.
basta, anything we can do to speed this up? Any immediate workarounds you can suggest for the developer?

$ time ./addon-validator -o json --determined ~/tmp/yoono_7.7.12_full.xpi
...
real	1m0.605s
user	0m45.263s
sys	0m13.983s
Of the 60 seconds that the timeout is set to locally, 52 involve JS processing. Of those 52 seconds, ~30 are simply the calls out to Spidermonkey to parse the JS; 7 seconds alone are spent parsing JSON. There's a LOT of JS in the add-on.

My best advice is to simply cut out unnecessary JS and use approved libraries (like Underscore) whenever possible instead of reinventing the wheel. Concatenating JS files that are loaded together will also reduce overhead. Otherwise, this isn't really a defect in the validator, it's just that there's more work to be done than can be accomplished in the 90 seconds that the add-on has to validate.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
This shouldn't be a fatal error that prevents the add-on from being uploaded to AMO, though. At the very least it should let the upload proceed, and we'll decide whether the add-on has unnecessary JS and fits policy. Ideally, any results gathered before the timeout is hit should be returned.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
I stand by my original resolution of WONTFIX:

1. We can make the timeout message a warning, but then nobody would pay attention to it.
2. JavaScript validation happens at the very end of validation. If validation times out, the bulk of JS validation will simply get tossed out.
3. Compatibility bumps will also time out and compatibility warnings and errors will not be raised.
4. The add-on is 11.4MB (!) decompressed, and the vast majority of that is JavaScript. If the validator isn't able to process the contents of the package in 90 seconds, it's doubtful that a human would be capable of finding a problem buried in all that code. (It's not that I don't have faith in our reviewers; it's the sheer volume of the code.) We've only ever seen one add-on like this before (which one escapes me), but we should have an alternate channel for getting add-ons like this reviewed.

As a sidebar, do we have policies relating to the performance impact of add-ons? If so, I'd say we should reject this one on those principles. The amount of resources consumed by this add-on is conceivably astronomical.
There isn't an alternative way of getting the add-on onto AMO. If you want to implement a checkbox we (staff) can click to bypass validation, or an alternative upload form, then I'm okay with that.

Yoono is a featured addon and has been on AMO for a long while.  The previous version was uploaded fine and doesn't seem significantly different in terms of scale.  The symptoms point to validator changes (perhaps more checks?) rather than add-on changes.

We don't need to review all the code - just what's changed - and we don't prohibit large and complicated addons by policy.  Performance concerns are out of the scope of this bug really.
Thanks Andrew!

Matt, we do take the performance recommendations for add-ons into account as much as we can, which is for instance why we don't merge all of our scripts, so that we can load some of them only if and when they are needed.
In this release, a large part of the modifications are related to namespacing some stuff to make sure there is no interference with other add-ons, as requested by the reviewer from the previous release.

Another large part of the modifications deals with removing vendor prefixes in CSS: -moz-border-radius-bottomleft becoming border-bottom-left-radius, so that the appearance is still fine in FF 14.
The rest is mostly the replacement of the Facebook REST API with Graph API since the REST API is deprecated.

The size of the add-on is very similar to the previous one if not even a little smaller.

If we did sometimes re-invent the wheel, it's because Yoono came before the wheel (namely Underscore), at a time when things like Prototype wouldn't do, performance-wise as well as security-wise, since it contained evals, for instance.

Yoono is very rich in features (try it ;), which is why it's big. I guess you don't assume it's all junk that we forgot to delete :)
It was shown to make people stick with Firefox when the Firefox-based Rockmelt browser was first released :)

However, I did find a few obsolete files; removing them lowers the size of the XPI by about 60 KB, which is not much and still won't make it pass validation. And we are working on a new feature that is really big and not even included in this release...

If no other (or just one other) extension hits the timeout issue, then increasing it should have no impact...
 
We are aware it's a lot to review, which is why we provide a diff file by email for each new submission we make, to help reviewers deal with the volume.

Thank you all for your understanding!
Xavier: Thanks for adding your feedback. To address a few things:

- The validator is unlike Firefox or other Mozilla apps in that the performance of the script itself doesn't impact execution time; the size of the script does. Even if files aren't loaded by the add-on, they still get validated. Our custom JS interpreter isn't built to accurately execute your code; it's designed to find patterns that could cause problems with your add-on's execution.

The reason we're sensitive to having many files is that--in order to provide you with useful filenames, context snippets, and line numbers in your validation report--we have to fire up an instance of Spidermonkey for every chunk of JS in your add-on. The more chunks of JavaScript, the more calls to Spidermonkey.

One recommendation I'd make is to try to concatenate JavaScript files that would otherwise be loaded together. By concatenating these, fewer (expensive) calls to Spidermonkey need to be made, fewer strings need to be built, allocated, and deallocated, etc., helping the validation to move a lot more quickly.
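
For illustration, a concatenation build step could be as simple as the following sketch (the file names and layout are hypothetical, not taken from Yoono's actual tree):

# concat_js.py -- hypothetical build step: merge scripts that are always
# loaded together into one file, so validation makes one Spidermonkey call
# instead of several.

# Assumption: these modules are always loaded by the same overlay.
SOURCES = ["content/js/util.js", "content/js/network.js", "content/js/ui.js"]

def concatenate(sources, target):
    with open(target, "w") as out:
        for path in sources:
            with open(path) as src:
                out.write("/* --- %s --- */\n" % path)  # keep some traceability
                out.write(src.read())
                out.write("\n;\n")  # guard against a missing trailing semicolon

if __name__ == "__main__":
    concatenate(SOURCES, "content/js/bundle.js")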

I would also highly recommend looking into shimming libraries like Underscore or jQuery in where you can. I recognize that you've got a lot of legacy code, but each file that can be replaced with a library gets immediate perf benefits (both during validation and likely during actual usage). The validator knows to ignore files that it recognizes, preventing that code from being validated unnecessarily. I see that you've already got a copy of jQuery included, though it's not a canonical copy. Consider downloading the official 1.6.3 release and including it (unmodified) instead.

Another way to improve validation performance is to minimize the number of warnings and notices being generated. An example that I can see from the validation output is to remove json.js, JSON.js, and json2.js (which are rendered rather defunct in the last 15 versions or so of Firefox), which generate a handful of errors. Each bit of output requires the validator to grab a copy of the file and clip it to form a context snippet.
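
Schematically, building each context snippet costs something like the following per reported message, which is why fewer messages means faster runs (names are illustrative, not the validator's actual code):

def context_snippet(file_lines, lineno, radius=2):
    # Clip a few lines around the offending (1-based) line for the report.
    start = max(0, lineno - 1 - radius)
    return "\n".join(file_lines[start:lineno + radius])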

I'll talk to Kumar about getting the timeout raised, but that's a temporary (and with a big enough bump, potentially dangerous) fix. The timeout is to catch bugs in the validator that lead to infinite loops and to prevent add-ons that consume incredible amounts of resources from fubaring the celery boxes. There's also a timeout on celery tasks themselves which will cause validation to fail in nondescript, 500-error kinds of ways. Having a validator-level timeout prevents that from happening and returns (at the very least) a list of problems found up until the timeout occurred.
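
As a rough sketch of this kind of validator-level timeout (illustrative names, not the validator's actual internals; signal-based alarms are Unix-only):

import signal

class ValidationTimeout(Exception):
    pass

def _on_alarm(signum, frame):
    raise ValidationTimeout()

def run_with_timeout(validate, package, seconds=90):
    # `validate(package, results)` is assumed to append findings to `results`
    # as it goes, so partial output survives a timeout.
    results = []
    signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        validate(package, results)
    except ValidationTimeout:
        results.append({"type": "error", "message": "Validation timed out."})
    finally:
        signal.alarm(0)  # always clear the pending alarm
    return results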

In the meantime, try refactoring with some of the tips I mentioned above.
As a temporary workaround we can try upping the timeout. Let's try it on -dev to see if it lets this add-on through.

Possible long-term solutions:

1. From eviljeff: Allow editors to bypass the validator with a checkbox or something so that authors can at least submit an xpi

2. Split validation into subtasks. This is the best solution but would require a big change. Instead of running all the validation in one task, we could validate each file in a separate task and stitch the results together later (see the sketch below this comment).

Since this add-on is only the 2nd we know of that exceeds our timeout, solution #1 might be better to try first.
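
For reference, a minimal sketch of the shape solution #2 could take with celery's primitives (the task names and the run_js_checks helper are hypothetical):

from celery import chord, shared_task

@shared_task
def validate_file(path):
    # Run the per-file JS checks; run_js_checks is an assumed helper that
    # returns a list of messages for one file.
    return run_js_checks(path)

@shared_task
def merge_reports(per_file_reports):
    # Stitch the per-file results back into a single report.
    merged = []
    for report in per_file_reports:
        merged.extend(report)
    return merged

def validate_addon(js_paths):
    # A chord fans out one task per file, then calls the merge task with
    # the collected results.
    return chord(validate_file.s(p) for p in js_paths)(merge_reports.s())
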
Could we have the timeout be configurable per task? A checkbox where only editors could submit an addon and then choose not to have it time out. I still think the value of having the validator run should be kept.
The option would also have to disable Celery's built-in timeouts. We have a validator timeout to begin with so that we can at least get *something* out of the validation process if it takes a long time rather than getting an error code back.
Yep, the celery timeout is per process so we can't adjust that (but we can set it really high). We can adjust the validator timeout at runtime based on an editor setting, sure.
Hi again, any information on this issue?
Since Facebook broke its API again, we would like to release pretty soon; otherwise our users can't add their Facebook account.

Thanks
I increased the timeout by about fifteen seconds. If it hasn't gone live already, it'll go live tomorrow.
Hi again and thanks.

Unfortunately, either it's not enough, or it is not live yet :(
It actually timed out at 80 seconds, although my measurement can't be accurate since validation seems to start while the upload progress bar has not yet reached 100% (the title goes from 'sending' to 'validating' at about 80%).

I'll try again later.

If I can share my 2 cents about solutions proposed earlier:

To me it would make more sense to modify Spidermonkey so that one instance can process several files (okay, easier said than done) instead of launching one instance per file.

This would benefit ALL the thousands of extensions at each validation, even if not by much, and would significantly decrease the total load on the validation servers.
With recent features such as sockets, workers, etc., I would not be surprised if complex, hence big, extensions (like Yoono) became less rare than they are right now.

A faster validation time would also encourage developers to actually take the warnings into consideration, fix them, and submit again, knowing they wouldn't have to wait too long for the results.

We had a similar issue with the Yahoo JS code compressor that we use. The Java program would only process one file at a time, and launching one VM for each file would take an unrealistic amount of time, so we modified it so that it could process several files, which it now does in a mere few seconds. The modification was admittedly probably easier than the one Spidermonkey would require.
(In reply to Xavier from comment #14)
> To me it would make more sense to modify Spidermonkey so that one instance
> can process several files (okay, easier said than done) instead of launching
> one instance per file.

I've already discussed this with a few of the other guys on the team and there are some issues that are preventing us from doing this:

Approach 1:
If we concatenate all the JS, that would speed up the process significantly. Unfortunately, a single syntax error would eliminate all validation for the entire add-on. Syntax errors aside, it would be impossible to get useful line numbers and contexts out of the results.

Approach 2:
Spin up a single Spidermonkey instance and iterate over the many JS files. This has a number of problems: 1.) If validation crashes hard, lots and lots of temp files get left on the disk. If the disk fills up, the server fails hard. 2.) There are no huge perf gains, since a large chunk of the work takes place in the reflection code, which would still be run individually and synchronously.

Approach 3:
Use Python threads to parallelize calls to Spidermonkey. This has some benefits, but they are almost wholly negated by the complexity introduced into the codebase. Most of the objects in the validator are not thread safe or immutable, meaning most of the code would need to be rewritten. The other problem is that we can't simply spam the CPU with processes: if an add-on has a few hundred or a few thousand JS files, it's unreasonable to spin up a few hundred processes that each want to max out a hardware thread. Instead, we'd need to use a pool, which (even with a relatively high cap) would still incur overhead in spinning up processes, since Python's GIL makes all of those things block.
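
For concreteness, the pooled version of this approach would look roughly like the sketch below (the binary path and driver script name are assumptions); experiments described later in this bug found it to be a net slowdown:

import subprocess
from multiprocessing.pool import ThreadPool

SPIDERMONKEY = "/usr/bin/js"  # assumed location of the SM shell

def parse_one(path):
    # One process spin-up per file: exactly the overhead under discussion.
    # "reflect.js" stands in for whatever driver script feeds SM the source.
    proc = subprocess.Popen([SPIDERMONKEY, "reflect.js", path],
                            stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return path, out

def parse_all(paths, workers=8):
    # A bounded pool, so hundreds of JS files don't become hundreds of
    # simultaneous processes; the GIL still serializes much of the work.
    pool = ThreadPool(workers)
    try:
        return dict(pool.map(parse_one, paths))
    finally:
        pool.close()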

Approach 4:
Break the validation of each file into its own celery task and have the server figure it out. This is viable, but it's not at all a small undertaking. It's not reasonable for developers to need Rabbit installed locally to test add-ons with the CLI, so the existing code would still need to work. _Lots_ of code would need to be written and rewritten, with little benefit for anybody but a handful of add-ons.

> This would benefit ALL the thousands of extensions at each validation, even
> if not by much, and would significantly decrease the total load on the
> validation servers.
> With recent features such as sockets, workers, etc., I would not be
> surprised if complex, hence big, extensions (like Yoono) became less rare
> than they are right now.

In general, the vast majority of validations take under a few seconds. Internally, we test add-ons and monitor the performance of the validator, and over 90% of runs take under three or four seconds. Strictly speaking, there are very, very few add-ons that have enough JavaScript to even come close to the timeout.

I should also note that the benefits that I mentioned in comment #7 (especially those regarding the use of libraries) are not insignificant. Simply using canonical versions of libraries rather than re-minified versions is an easy benefit because it skips a few hundred thousand node traversals. Refactoring old code to effectively use the libraries that you're already including also helps validator performance (and in theory, real-life performance thanks to JIT in Firefox) in a non-insignificant way.
Hi all,

Unfortunately, there is no way we can refactor that much code (about 260K lines) for now, let alone deal with the no-regression tests on all the platforms (Google Chrome, iPhone, XULRunner application for Mac and Windows, and Firefox) that use all or part of this code. We simply don't have the resources for that, as much as we wish we did.

Our only alternative is to have the latest release (7.7.14) available from our website only, but most of the 400k users are stuck with release 7.7.10 for now :(

In October, Facebook will make breaking changes to its OAuth token handling, which we handle in 7.7.14, and which will cause 7.7.10 to no longer connect to Facebook.

Twitter has already announced its 1.1 API, which will also break 7.7.10 badly.

There is not much difference in size (and no difference at all in architecture, libraries or number of files) between release 7.7.10, which passed validation, and the new ones that do not. So are we not overlooking another possible cause for the timeout?

It would be amazingly great if validation could be skipped with a check box somewhere :)

Thanks all for all the thoughts given to this issue so far !
Hi Xavier, 
I understand refactoring all the code is infeasible in the short term, but just replacing a few libraries with versions taken directly from the developer (e.g. jQuery) will make a difference on its own. (And that is required now for any new files anyway.)

A checkbox to bypass validation isn't likely to be implemented particularly quickly, and isn't really a solution to the underlying issues either.
Although we've had regressions in the past when updating jQuery, I'm willing to try that, but I need to know what the validator uses to recognize jQuery as such.
I'm surprised it doesn't already, since we are using vanilla 1.6.3.
Just tried again, with jQuery and jQueryUI 1.8.0, still timing out.
Is there a chance that it's not recognized?
Is the file name important?

One more argument "against" such a large refactoring is that there is absolutely no guarantee that it would then pass validation.
Also, our next big feature will add about 50K lines of Javascript.

We can't put all our other projects (vital for our company's future) on hold on the mere chance that we would meet the requirements of an intermediate tool, requirements that the target itself does not really have (even if meeting them might theoretically benefit the target).

Not having a validation process for big applications does not encourage feature-rich applications that really add value to the browser.

It would be sad to have Firefox, with all its APIs, workers, embedded SQLite, sockets, and all, just see more to-do list management extensions... ;)

Let me know if you have other ideas I could test, thanks a lot.
An idea I had this morning, posting so it's not lost:

We catch the timeout as usual, but instead of bubbling the error up we re-submit the validation with a flag having it skip the complex JS processing.  The new validation results are stored as usual and, instead of going into the regular approval queue, the new version gets automatically flagged for admin review (with a note saying why).

I don't think that would be too technically difficult, since the flow (upload into a queue) is mostly unchanged; it's just a matter of flagging it. A simpler and faster version would be to skip the second validation altogether and simply put the timed-out package into the admin queue, saying it is too large to process. Then the admin reviewers could manually review it (including running the validator themselves from the command line).
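
In rough pseudocode, the proposed flow would be along these lines (all names here are hypothetical; nothing in this sketch is implemented yet):

def handle_upload(xpi):
    results = run_validation(xpi)  # assumed existing entry point
    if results.timed_out:
        # Re-run without the expensive JS traversal, then route the add-on
        # to admins instead of the regular approval queue.
        results = run_validation(xpi, skip_js_processing=True)
        flag_for_admin_review(xpi, reason="validation timed out; JS checks skipped")
    return results
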
That sounds good to me!
(In reply to Xavier from comment #18)
> Although we've had regressions in the past when updating jQuery, I'm willing
> to try that, but I need to know what the validator uses to recognize jQuery
> as such.
> I'm surprised it doesn't already, since we are using vanilla 1.6.3.

The validator compares a hash of the file to one pulled from jquery.com (or wherever), so the file has to be identical.  Even re-saving the file in an IDE can mess up the encoding slightly, and any build scripts need to skip it.
Filenames aren't important.

I just realised we didn't have jquery 1.8 so that wouldn't work till the next push anyway.
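
In other words, the recognition step is an exact-hash lookup along these lines (the table contents and hash choice are illustrative; the real validator ships its own list of known hashes):

import hashlib

KNOWN_LIBRARY_HASHES = {
    "<sha256 of the official jquery-1.6.3.min.js>": "jquery-1.6.3.min.js",
}

def recognized_library(file_bytes):
    # Any byte-level difference (re-saving, re-minifying, a build script
    # touching the file) changes the digest and defeats recognition.
    digest = hashlib.sha256(file_bytes).hexdigest()
    return KNOWN_LIBRARY_HASHES.get(digest)  # None: validate the file fully
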
Thanks Andrew,

So I guess jQueryUI would not benefit from this, since it can be customized to only load useful stuff?


And thanks Will and Jorge, this feature would be uber awesome.
(In reply to Xavier from comment #23)
> Thanks Andrew,
> 
> So I guess jQueryUI would not benefit from this, since it can be customized
> to only load useful stuff?

If its the full file it works (e.g. http://ajax.googleapis.com/ajax/libs/jqueryui/1.8.23/jquery-ui.min.js) but otherwise no.
Assignee: nobody → mattbasta
Target Milestone: --- → 2012-09-06
Target Milestone: 2012-09-06 → 2012-09-13
I've done some experiments with parallelizing Spidermonkey access, though this doesn't provide any worthwhile benefit in CPython due to the GIL. Some quick benchmarks actually showed a slowdown of about 10% with this code, mostly due to contention around the subprocess module and locking and unlocking of semaphores to prevent process spamming.

My next plan involves running all JavaScript parsing through a single instance of Spidermonkey, since that seems to be the single largest bottleneck in any given validation run. Currently, Spidermonkey processes are synchronously spun up and torn down for every script that's encountered. Ideally, a list of files would be passed to SM, which would iteratively parse each one and store the result in a corresponding temporary file. The files can be read and decoded asynchronously as the Spidermonkey process generates them. I can use the code from the previous experiment to make all this happen. I'll be working on this in the coming weeks.

This is quick and dirty and should (for the time being) solve timeout issues for exceedingly large add-ons.
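
A sketch of that plan (the driver script name and its line-per-result protocol are assumptions about the design, not the final code):

import json
import subprocess

def parse_batch(js_paths, sm_binary="/usr/bin/js"):
    # One Spidermonkey spin-up for the whole batch instead of one per file.
    proc = subprocess.Popen([sm_binary, "driver.js"],
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    # Assumption: the driver reads newline-separated paths on stdin and
    # emits one JSON parse result per line, in the same order.
    out, _ = proc.communicate("\n".join(js_paths))
    return [json.loads(line) for line in out.splitlines() if line.strip()]
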
Hi guys,

Facebook changed some data format in its API (a string is now an object), which makes it impossible to add a Facebook account into the Yoono extension :(

Most of our users use Yoono for Facebook.

Is there a way we could submit our fix? Thanks
Summary: Yoono addon fails on validation during upload → Submit addon for admin review to bypass validation (fixes: Yoono addon fails on validation during upload)
https://github.com/mozilla/amo-validator/commit/6635895bc2d1011b4e34d6a10593334f5cb988c5

This commit will cause the validator to skip exhaustive validation for any add-on that contains more than 1000 JS files. On a cold run, Yoono validates in 13 seconds. I'll refrain from reiterating the drawbacks of this.
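
Schematically, the committed behavior amounts to something like this (the actual implementation is in the linked commit):

from zipfile import ZipFile

JS_FILE_LIMIT = 1000  # threshold from the commit above

def should_skip_exhaustive(xpi_path):
    # Count the scripts up front; past the limit, fall back to lightweight
    # validation and emit a warning so editors can see what was skipped.
    with ZipFile(xpi_path) as xpi:
        js_count = sum(1 for name in xpi.namelist()
                       if name.lower().endswith(".js"))
    return js_count > JS_FILE_LIMIT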

This effectively implements an equivalent solution to the proposal in comment #20.
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Thanks Matt. This appears to only affect the validator. Can you open an additional bug (or bugs) to hook this up to the submission flow? We need to flag these add-ons for reviewers so that they know to spend additional time on the JS. Also, this needs a requirements.txt bump so it arrives in Zamboni.
The requirements file has been bumped.

The patch reports the big add-on with a warning message for now, so the editors can see (from the validation report) that the exhaustive validation process didn't take place. Did you want to show more/other messages?
hm, I guess that'll probably work. I thought we might need to send it to a different review queue but it will go into pending like any other add-on so that's probably fine. Thanks!
Thanks all.
Is it already online? I tried submitting and didn't see a difference in the result, so I'm unsure whether the file is queued for review or not.
This fix will be pushed live later today. I recommend that you try again tomorrow, and reopen the bug if the problem persists.
♬ ♩ ♪ Oops it did it again ♫ ♩ ♬

I'm still getting the Timeout issue :(
Increasing severity due to Britney Spears references.
Severity: normal → major
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 2012-09-13 → 2012-10-18
I hit it, baby, one more time and it works for me:

http://imgur.com/a/oBK6p
Just in case anyone needs a vaccine against the depravity in comment 33, for which the poster should be ashamed,

♫ Help! I need somebody. Help! Not just anybody. Help! You know I need someone. Help! ♫


♬ I know the pieces fit, 'cause I watched them tumble down. No fault, none to blame... ♬


♪ Freude, schöner Götterfunken, Tochter aus Elysium... ♪
I may be missing something obvious, but we don't seem to be using the same interface; here is what I have:
http://imgur.com/PDBWs

(And, yes, Kris, I'm ashamed, what do you think. Sounds like the vaccine came in too late anyway. And next time I'll do it Oppa Gangnam Style)
Hi guys, any news? Should I threaten with more Britney references? Thanks.
comment 35 makes this sound like it worked.  Kris or Jorge, can you try?
But comment #37 shows in a screenshot that it fails, at least for me, and I'm stuck there...
If you would like me to send the xpi I'm trying to submit, let me know.
Thanks again.
That would be useful.
Ok Matt, email with attachment on its way, thanks.
Lowered the threshold for aborting the exhaustive JS traversal:

https://github.com/mozilla/amo-validator/commit/c402e4a82cfa262f30fa6d8bb29a05be53caeee7
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
I confirm it's no longer timing out, it's *great*, thanks a lot!
(In reply to Xavier from comment #44)
> I confirm it's no longer timing out, it's *great*, thanks a lot!

awesome.  thanks everyone.
Status: RESOLVED → VERIFIED
Reopening, as the implementation only addresses this specific case of an add-on with an obscene number of JS files. This still seems to be an issue for Zotero, which merely has an obscene amount of code.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
The validator timeout is preventing us from submitting a Firefox 17–compatible version of Zotero.

Recent versions of Zotero now also fail to validate, so this is not due to any changes in Zotero.
There isn't inherently a problem with having lots of JS to validate; the vast majority of the overhead in JS validation comes from the system calls to spin up Spidermonkey, which doesn't impact add-ons with relatively low numbers of JS files.

There has been a sharp increase in validation times in recent weeks that I'm debugging on bug 812538. That bug should be used to capture any information about Zotero and other recently failing add-ons.
Status: REOPENED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Blocks: 813274
I am still having problems uploading the new version (for FF17) of my add-on: https://addons.mozilla.org/en-US/firefox/addon/key-manager/.
I had the same problem when I tried to submit the FF16 compatibility update.

When I try to upload the Windows-only version (file size: ~2.2MB), I get an "Unexpected server error while validating." message after a long pause.
When I try to upload the version for all platforms (file size: ~7.9MB), I get a "Timeout" error immediately.

I have approximately 110 JS files, irrespective of platform. Is that too much?
The add-on also contains binary files. 

Any suggestion how to get around this problem? 
Is there any way I can get a one-time exemption from the validation error until this problem is fixed?

Thanks.
Please send your file to amo-admin-reviews AT mozilla.org and we'll help you.
Product: addons.mozilla.org → addons.mozilla.org Graveyard