If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

SecReview: [HTML5] Re-implement HTML sanitizer using the HTML5 parser

RESOLVED FIXED

Status

mozilla.org
Security Assurance: Review Request
RESOLVED FIXED
6 years ago
4 years ago

People

(Reporter: curtisk, Assigned: freddyb)

Tracking

Details

(Whiteboard: [completed secreview][fuzzing][score:0::Low][Fx])

Attachments

(1 attachment, 2 obsolete attachments)

SecReview tracking bug
Actions regarding the review of the dependent bug should be tracked here.
Risk/Priority Ranking Exercise https://wiki.mozilla.org/Security/RiskRatings

Priority: N/A

Operational: 0 - N/A
User: 0 - N/A
Privacy: 0 - N/A
Engineering: 2 - Normal
Reputational: 0 - N/A

Priority Score: 0
Whiteboard: [pending secreview][fuzzing] → [pending secreview][fuzzing][score:0::Low]

Comment 2

5 years ago
I plan to fuzz the HTML sanitizer for memory safety bugs, but it's also a security surface that needs manual review.
Assignee: jruderman → nobody

Comment 3

5 years ago
jesse, can you fuzz it to see which tags and JS code it lets through? It shouldn't let any JS code whatsoever through.
Flags: needinfo?(jruderman)

Comment 4

5 years ago
My experience so far is that fuzzing is not effective at finding policy bugs, because it's hard to describe the policy in a way that (1) the fuzzer can check and (2) makes sense for the kinds of invalid input that the fuzzer generates.

I can check for <*:script> tags, but there are so many other forms of scripts (javascript: URLs, data: URLs in some cases, on* attributes) and plugins...
Flags: needinfo?(jruderman)
(Assignee)

Comment 5

5 years ago
I suggest this also get's a manual review from somebody websec :)

Comment 6

5 years ago
Jesse, are there a bunch of HTML test files for various means to run Javascript, preferably in a mean / non-obvious / attacking way? We could run all these and see whether the scripts run.
dan can you weigh in on who should do the manual review, would this be good for freddy?
Flags: needinfo?(dveditz)
Sure, he seems to be interested.
Assignee: nobody → fbraun
Flags: needinfo?(dveditz)

Comment 9

5 years ago
Because Freddy asked:
This is used e.g. by Thunderbird as security feature. e.g. many enterprise installations run old versions like thunderbird 3.1, and we don't want enterprises being commercially spied on by simply sending HTML emails that exploit known Gecko bugs. Similarly, I want to be protected against yet unknown Gecko bugs.

So, for the purpose of security, we should run with this assumption (note: assumption, not intended usecase, although some Firefox devs have done similar things with similar implementations, against my advise):
You get an HTML document from an attacker (say, an HTML email from a corporate spy), and you run it through the sanitizer, and then you insert it into chrome (!) HTML as-is. If there is any way for the attacker to do anything outside the document frame, this is a security bug.

Comment 10

5 years ago
Comment 4 wrote:
> I can check for <*:script> tags, but there are so many other forms of scripts
> (javascript: URLs, data: URLs in some cases, on* attributes) and plugins...

Indeed. The core difficulty in writing - and security-checking - a sanitizer is to *think* of all these forms and ways, and not to miss any.
(Assignee)

Comment 11

5 years ago
I want to poke at the HTML sanitizer, but I am wondering how to actually use it. I suppose I would need to build a testbed that allows me to supply some input files and then returns some (hopefully) sanitized output files. Well and I know very little to no C++ :)
Any suggestions?
(In reply to Frederik Braun [:freddyb] from comment #11)
> I want to poke at the HTML sanitizer, but I am wondering how to actually use
> it. I suppose I would need to build a testbed that allows me to supply some
> input files and then returns some (hopefully) sanitized output files. Well
> and I know very little to no C++ :)
> Any suggestions?

Wayne, do you have some idea on who can help Freddy out here?
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #12)
> Wayne, do you have some idea on who can help Freddy out here?
Joshua is probably closest to this code at the moment.
Flags: needinfo?(Pidgeot18)
Going from how mime calls the HTML sanitizer, it looks like a simple xpcshell script can be written to load a file, call nsIParserUtils::sanitize, and write the result to a file.
Flags: needinfo?(Pidgeot18)
Whiteboard: [pending secreview][fuzzing][score:0::Low] → [pending secreview][fuzzing][score:0::Low][Fx]
(Assignee)

Updated

4 years ago
Depends on: 892473
(Assignee)

Updated

4 years ago
No longer depends on: 892473
(Assignee)

Comment 16

4 years ago
I have set up a testbed to perform xpcshell tests against the nsIParserUtils.sanitize() function.

My current setup:
I have taken XSS vectors from html5sec.org (namely the payloads and items as json) and ran them across the existing sanitizer and gathered results (attack vector and sanitized version) in a JSON file as output.

My next step would be applying the new sanitizer patches against my comm-central and comparing sanitized results with those that I gathered from the old parser for each attack vector.
Applying the patches fails. Can you help me set this up *or* look into why applying your patches against comm-central fails, Henri? I get quite a few failed hunks..
Flags: needinfo?(hsivonen)
(Assignee)

Comment 17

4 years ago
Created attachment 775642 [details]
xpcshell test to work with sanitizer

Attaching current xpcshell test environment (the test doesn't return true or false, I just used it to generate the sanitizer output data).
(Assignee)

Comment 18

4 years ago
Oh wait...the patch has already been commited.
I'll try assembling a few other test vectors and then make the outcome of this secreview a real xpcshell test that can be commited as well...
Flags: needinfo?(hsivonen)
(Assignee)

Comment 19

4 years ago
I want to make this a self-contained xpcshell test and wonder what the best practice is for carrying data along. Can I put arbitrary json files along the test in the same directory? I'd like to avoid carrying a huge chunk of data within the test file itself (huge meaning about 3KB of JSON, e.g. http://www.pastebin.mozilla.org/2644563)
(Assignee)

Comment 20

4 years ago
Created attachment 779159 [details] [diff] [review]
test case for html sanitizer

This is a working patch that adds a test for the sanitizer component. Feedback highly appreciated :)
Attachment #775642 - Attachment is obsolete: true
Attachment #779159 - Flags: review?
(Assignee)

Updated

4 years ago
Attachment #779159 - Flags: review? → review?(hsivonen)
Comment on attachment 779159 [details] [diff] [review]
test case for html sanitizer

r+ as far as the technical stuff goes. I don't know how the html5sec.org vectors are licensed. Please make sure that this patch is OK license-wise.
Attachment #779159 - Flags: review?(hsivonen) → review+
(Assignee)

Comment 22

4 years ago
The content is CC 3.0 BY licensed. I think we're fine if I add a line that points to the repo (https://code.google.com/p/html5security/). Should we ask a legal person for that?
Gerv, do we allow CC-by 3.0 for third-party-originating test data in m-c? We'd need a licensing legend somewhere in any case, right?
Flags: needinfo?(gerv)
For code not shipped in our products, anything open source will do, and I'd say that includes CC-BY.

If it doesn't ship with Firefox, no need for a licensing legend anywhere except the directory in question.

Please add a LICENSE file to the directory with a copy of CC-BY 3.0.

Gerv
Flags: needinfo?(gerv)

Comment 25

4 years ago
Could we reformat the JSON so that it's readable, please?
(Assignee)

Comment 26

4 years ago
Created attachment 789463 [details] [diff] [review]
test_sanitizer.patch

Carrying over Henri's r+.
I have pretty-printed the JSON as Ben suggested and added a file containing the CC-BY 3.0 license + proper attribution to the html5 security project.
Attachment #779159 - Attachment is obsolete: true
Attachment #789463 - Flags: review+
(Assignee)

Updated

4 years ago
Keywords: checkin-needed
Whiteboard: [pending secreview][fuzzing][score:0::Low][Fx] → [completed secreview][fuzzing][score:0::Low][Fx]
https://hg.mozilla.org/integration/mozilla-inbound/rev/5f7da9fe6cf5

In the future, please make sure your patches have a commit message suitable for landing before requesting checkin.
Keywords: checkin-needed
And because we like it when things actually build after pushing a patch...
https://hg.mozilla.org/integration/mozilla-inbound/rev/4dceda951fba

Which also leaves me feeling really warm and fuzzy about how thoroughly this was tested prior to requesting checkin.
https://hg.mozilla.org/mozilla-central/rev/5f7da9fe6cf5
https://hg.mozilla.org/mozilla-central/rev/4dceda951fba
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.