Show warnings about compressed/obfuscated/minified code

NEW
Unassigned

Status

addons.mozilla.org Graveyard
Add-on Validation
P2
normal
7 years ago
2 years ago

People

(Reporter: kmag, Unassigned)

Tracking

(Blocks: 1 bug)

unspecified
2015-03
Dependency tree / graph

Details

(Whiteboard: [ReviewTeam:P1])

Attachments

(1 attachment)

User-Agent:       Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110513 Firefox/6.0a1
Build Identifier: 

There should be a fairly simple heuristic to detect compressed/obfuscated/minified code. Density of space vs. non-space characters or lack of indentation should be sufficient. Minified code in hosted add-ons should trigger the following warnings:

• Code compression or minification serves little purpose in add-ons as XPIs are deflate compressed and downloaded rarely. Please consider submitting your code in a human readable form.

• If you choose to submit your code in a non-human-readable form, it must be reviewed by an administrator and may incur a significant delay. Please leave instructions on where to obtain a human readable form in your review notes, even if you've already sent it to amo-admin-reviews@mozilla.org

I'm hoping that this will cut down on some needless minification and otherwise make the binary review process somewhat smoother.

Reproducible: Always
We already have bug 540028 in the pipeline, so I think this warning would be more along the lines of "send us the source using the new submission form".
Whiteboard: [required amo-editors]
I'd also like to point out that some authors have said that minifying their code actually reduced their startup time, so that may be something worth investigating. I wouldn't be so keen to saying it's pointless unless we're sure.
Priority: -- → P3
Target Milestone: --- → Q3 2011
I've done these tests, both with a huge codebase (Pentadactyl) as well as with simpler test add-ons. Aside from being absurd in principle (especially given that most methods of loading code use the startup cache), I haven't been able to measure any difference whatsoever.
So, Jorge, is this something we want?
Assignee: nobody → mbasta

Comment 5

7 years ago
If we do this, I'd like to base it on the statistics that I'm going to be gathering with Diggy. I'm planning a Diggy feature that will log add-on properties (metadata, message counts, source files, etc.) of all of AMO's add-ons into a map/reduceable form.

Based on this data, I could come up with some statistics and plot some graphs which would let us come up with a sensible threshold. If nobody has any objections, I can have something ready fairly soon.
(In reply to comment #4)
> So, Jorge, is this something we want?

I think it's useful, yes, but I think we should fix bug 540028 first and then have this as a warning so developers know the next step they need to follow. Adding it as a dependency.
Depends on: 540028
Target Milestone: Q3 2011 → Future
Reclassifying editor bugs and changing to a new whiteboard flag. Spam, spam, spam, spam...
Whiteboard: [required amo-editors] → [ReviewTeam]
(In reply to Jorge Villalobos [:jorgev] from comment #6)
> (In reply to comment #4)
> > So, Jorge, is this something we want?
> 
> I think it's useful, yes, but I think we should fix bug 540028 first and
> then have this as a warning so developers know the next step they need to
> follow. Adding it as a dependency.

Since bug 540028 isn't going to be fixed soon, I think we can look into this again as an independent bug. Having a warning for this sort of code is definitely useful.
Target Milestone: Future → ---

Comment 9

5 years ago
This is going to be important for the feature profile stuff that's coming up. Moving to the Marketplace since it'll get more love that way. I'll put the code in valcom so that the love gets spread back to amo-validator.
Component: Admin/Editor Tools → Validation
Priority: P3 → P2
Product: addons.mozilla.org → Marketplace
Version: unspecified → 1.5

Updated

5 years ago
Assignee: mattbasta → nobody
Whiteboard: [ReviewTeam] → [ReviewTeam:P1]
I'm moving this back to AMO.  The Marketplace validator is being rewritten in JS currently which will distance it from AMO's validator.
Component: Validation → Add-on Validation
Product: Marketplace → addons.mozilla.org
Target Milestone: --- → 2014-08
Version: 1.5 → unspecified
@jorge: can you summarize the needs now that bug 540028 is fixed? I'd prefer to have them clear before starting working on it, thanks :)
Flags: needinfo?(jorge)
We need the validator to detect minified and obfuscated code, and show a warning for it. The warning should be the same we show for binaries. The one I suggested on IRC is:

"Compiled binaries, as well as minified or obfuscated scripts (excluding known libraries) need to have their sources submitted separately for review. Make sure that you use the source code upload field to avoid having your submission rejected."

The binary detection warning should change to this new text as well.
Flags: needinfo?(jorge)
I'm not entirely sure what should trigger this warning. The simplest thing would be to detect extremely long lines of code. Unfortunately, these are often strings which, though they should ideally not be that long, usually do not qualify as obfuscated code.

Ideally, we'd detect, e.g., more than a dozen statements which occur on the same line.

We should also, given the landing of bug 540028, ask for confirmation when attempting to submit a new version/add-on without an accompanying source archive when these warnings, or binary file warnings, have been triggered.
Counting the number of ";" per line would give you a good idea of how minified the code is I guess. Obfuscated code is way more difficult to detect though. How will you compare sources to minified/obfuscated files to make sure that it's the same code?
(In reply to David Larlet [:davidbgk] from comment #14)
> How will you compare sources to minified/obfuscated files to
> make sure that it's the same code?

We require developers to send us instructions on how they generate the minified/obfuscated files so we can reproduce the process.
I think that long lines with large numbers of semicolons or commas would be an acceptable way to detect minified code.

For obfuscated code, I can think of a few things:

• Lack of indentation.
• High percentages of extremely short variable names.
• Certain variable name patterns, which I can try to track down.
• A few patterns for specific obfuscation methods, which js-beautify uses for specific deobfuscation techniques. Examples: "eval(function(p,a,c,k,e,r)" (or similar), '^var _0x[a-f0-9]+ ?\= ?\['
Target Milestone: 2014-08 → 2014-09
Target Milestone: 2014-09 → 2014-10
Assignee: nobody → kmaglione+bmo

Comment 17

3 years ago
I'm just curious, if a developer were to bypass these detections, for example:  

Step1: Obfuscate and Minify
Step2: Obfuscate and Unminify (Using different applications of course) 

Would the code still be reviewable? and would the code still be detected as Minified or Obfuscated? 

-Null
It would not be reviewable without sources. If machine reformatting were sufficient, sources would never be necessary.

It probably would not be automatically detected, but such add-ons are not a major part of the problem we're trying to solve here.

Comment 19

3 years ago
Correction to my previous comment In "Step2" i meant: *Unobfuscate and Unminify*. 

I'm not sure i understand which add-on's you are referring to, but as i understand it, the new warning(s) would help determine the readability of sections of source, which is sometimes easy to determine, while other times it can be more difficult.
Target Milestone: 2014-10 → 2015-03
Created attachment 8623630 [details]
Screen Shot 2015-06-17 at 14.30.15.png

This might be a stupid solution, but why not simply displaying the warning in the "add-on submission checklist"?

There's also a mention in the label for the "upload sources".

I'm not very comfortable with the added complexity that detecting obfuscation and minification would add to the validator codebase, with a very limited gain it seems (only hoping to detect the obfuscation/minification before a reviewer does).

Also, this detection would have to not be run on the included libraries (jquery, underscore...), because it would be a false positive.

What I'm saying is: let's not complexify the codebase if the gain is low.
(Assignee)

Updated

2 years ago
Product: addons.mozilla.org → addons.mozilla.org Graveyard
Assignee: kmaglione+bmo → nobody
You need to log in before you can comment on or make changes to this bug.