Closed Bug 461204 Opened 16 years ago Closed 4 years ago

Boundary delimiter for HTTP file posts is static. That is wrong according to RFC.

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla74
Tracking Status
blocking1.9.2 --- -
status1.9.2 --- wanted
blocking1.9.1 --- -
status1.9.1 --- wanted
firefox74 --- fixed

People

(Reporter: podpalch, Assigned: tjr)

References

(Blocks 1 open bug)

Details

(Keywords: privacy, sec-low, Whiteboard: [sg:low][tor 22919][adv-main74-])

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3 (.NET CLR 3.5.30729)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3 (.NET CLR 3.5.30729)

Here is HTTP request post content, i have intentionaly used file with wrong boundary. 

-----------------------------41184676334
Content-Disposition: form-data; name="uplFile1"; filename="simple_file_post.txt"
Content-Type: text/plain

!!!FireFox:

-----------------------------41184676334
Content-Disposition: form-data; name="uplFile"; filename="POSTDATA.txt"
Content-Type: text/plain

123456789
-----------------------------41184676334
Content-Disposition: form-data; name="MAX_FILE_SIZE"

16384
-----------------------------41184676334--
-----------------------------41184676334
Content-Disposition: form-data; name="uplFile2"; filename="POSTDATA.txt"
Content-Type: text/plain

123456789
-----------------------------41184676334
Content-Disposition: form-data; name="MAX_FILE_SIZE"

16384
-----------------------------41184676334--

Reproducible: Always

Steps to Reproduce:
1.) Add -----------------------------41184676334\r\n to body of file.
2.) Submit file from step 1, using following form:
 <form action = "1.php" enctype="multipart/form-data" method = "post">
 <input type = "file" id = "uplFile" name = "uplFile">
 <input type = "submit" value = "Send">
 </form>
3.) Server will NOT parse request from firefox normally
Actual Results:  
Server side will FAIL.

Expected Results:  
Boundary SHOULD be unique. At least make boundary random
Component: Build Config → Networking
Product: Firefox → Core
QA Contact: build.config → networking
Component: Networking → HTML: Form Submission
QA Contact: networking → form-submission
Uh... Here's where the boundary string is built:

847   //
848   // Build boundary
849   //
850   mBoundary.AssignLiteral("---------------------------");
851   mBoundary.AppendInt(rand());
852   mBoundary.AppendInt(rand());
853   mBoundary.AppendInt(rand());

Looks pretty random to me....
That code is in nsFormSubmission.cpp, for what it's worth.
Blocks: 464071
Whiteboard: [sg:investigate]
But we don't call srand, as bz explained later in bug 464071 comment 13.
Status: UNCONFIRMED → NEW
blocking2.0: --- → ?
Ever confirmed: true
Whiteboard: [sg:investigate] → [sg:low]
From 464071:

3.2 Firefox 3 (Gecko) – all platforms

The boundary string Firefox 3 generates ([20]) has a very predictable structure
and contents. The boundary string is a concatenation of the following:
  * 27 hyphens
  * 3 samples of the CRT rand() function, serialized as integers.

This logic has not changed since Firefox 2.0.0.0. Mozilla Firefox 3 does not
call srand(), except in Linux and Solaris. By convention, using rand() without
calling srand() is equivalent to calling srand(1) before the first invocation
of the rand() function (this is documented behavior for Windows MSVCRT – [21]
and for Mac OS/X – [22]). Hence, all Firefox instances running on the same O/S
generate the same sequence of random numbers, and the only difference between
them is how far each one is in this sequence. This is hereby defined as the
"CRT mileage" of the browser process. This mileage is affected by the
consumption rate of rand(). File uploads are not the only "consumer" of rand(),
but there aren’t too many consumers altogether.

In order to find the mileage, the attacker can start with seed=1, and roll
forward the PRNG until 3 consecutive outputs yield the boundary string.

In Linux and Solaris, it seems that srandom() is called (probably in the
platform-specific crash-reporter code), and since in Linux and Solaris, srand()
is a wrapper for srandom(), the net result is that the CRT PRNG is actually
seeded. The seed is the process startup time (in seconds resolution). srandom()
is typically invoked before the JS PRNG is seeded, with up to few seconds
apart. Thus, knowing the JS PRNG seeding time and subtracting 0-5 seconds from
it yields the correct CRT PRNG seed (hence there’s very little additional
information in the CRT PRNG seed over the JS PRNG seed). It is therefore very
easy to calculate the CRT mileage for Linux and Solaris as well.

To summarize: the boundary string is predictable, and it leaks the "mileage" of
the browser.
Severity: minor → normal
blocking1.9.1: --- → ?
blocking1.9.2: --- → ?
status1.9.1: --- → ?
status1.9.2: --- → ?
OS: Windows XP → All
status2.0: --- → ?
Please don't spam the flags like this, Reed - we've talked about this. This is an sg:low, and doesn't obviously block any of the releases. We'd consider a nominated patch, of course.
blocking1.9.1: ? → -
blocking1.9.2: ? → -
blocking2.0: ? → ---
status1.9.1: ? → ---
status1.9.2: ? → ---
status2.0: ? → ---
Keywords: privacy
Someone reported to the Security email alias that this bug could be used as part of a CSRF attack against a site processing uploads and relying only on boundary strings to validate requests.  Is anyone willing to explore making a patch to properly re-seeding before calling rand() per comment 3?
There are a few misunderstandings on this thread about how rand() works.  Firstly, it is platform specific, some platforms (I've seen the Linux source code, it seems to) automatically seed it for you.  Secondly, seeding it does not make it unpredictable.  The simplest implementations of it are just a linear congruential PRNG, which do nothing more than multiply the current random number with a certain prime, and add another prime, to get the next random number.  More complex implementations may require getting more random numbers to obtain the internal state, but it is not difficult.  Reseeding per request I don't think is adequate, as that will most likely be time based, which an attacker can easily predict using javascript.

While there are all sorts of things you can do that will make potential attacks harder, by far the best solution is to use a platform specific cryptographically secure random number generator.  On Windows this would be the CryptGenRandom call on the win32 API, on UNIX like OS's this would be reading some random data from the non blocking /dev/urandom.  This isn't just my opinion, using these was the choice of other cross platform projects such as both the Python and Ruby language designers, while the Java language designers used /dev/urandom on UNIX and their own SHA1 based random number generator on Windows.
The chatter for this bug unfortunately was in bug 464071.

I've tried to copy only the interesting bits into this bug. If people feel I've done a bad job, I'm sorry. Please rant to me in private when I get back from my summer vacation in a month (do hold your breath, I won't read anything you rant about earlier for at least 6 weeks).

Bug 464071 comment 4 Wan-Teh Chang 2008-11-10 11:44:16 PST
> If you need unpredictable pseudorandom numbers, you need to use
> NSS's PK11_GenerateRandom function.

Bug 464071 comment 5 Jeff Walden 2008-11-10 11:59:02 PST
> Just to raise the issue (I think it's not actually a problem, but I wouldn't
> consider myself an RFC 2616 expert), I assume because our form submission
> requests have Content-Length headers, the predictability of form submission
> boundaries can't be used to perform HTTP request splitting, right?  (Except in
> the sense of being able to send two files in the same request where the actual
> form only offered space for one, that is -- but that's not the traditional
> meaning.)

Bug 464071 comment 7 Boris Zbarsky 2008-11-10 13:32:23 PST
> If people care, we could also use the uuid generator for form submission
> boundaries (of course that still falls back to random() on everything but
> Mac/Linux).  Not sure whether the uuid generators on Win/Mac are good enough
> for what we want here, though.

Bug 464071 comment 8 Amit Klein 2008-11-11 23:30:01 PST
> I think the new angle here is that even if you don't care about the actual
> randomness/predictability of Math.random(), the fact that the seed can be
> recovered is an issue in itself. This is because the seed is sensitive, as it
> can be used to identify the browser instance, and it leaks information.

Bug 464071 comment 9 Amit Klein 2008-11-11 23:33:32 PST
>> Just to raise the issue (I think it's not actually a problem, but I wouldn't
>> consider myself an RFC 2616 expert), I assume because our form submission
>> requests have Content-Length headers, the predictability of form submission
>> boundaries can't be used to perform HTTP request splitting, right?  (Except in
>> the sense of being able to send two files in the same request where the actual
>> form only offered space for one, that is -- but that's not the traditional
>> meaning.)

> That's my understanding (i.e. Content-Length determines overall message size,
> parts are internal to the message), though I'm sure there are people who know
> better than me on this list.

Bug 464071 comment 11 Boris Zbarsky 2008-11-12 05:30:00 PST
> I think we should use bug 461204 for the boundary string (and perhaps
> bug 475585 for Math.random).

Bug 464071 comment 12 Amit Klein 2008-11-12 06:12:24 PST
>> I think we should use bug 461204 for the boundary string (and perhaps the
>> exiting Math.random bug for Math.random).

> I hope I'm not nitpicking, but the particular problem described in bug 461204
> seems to be around *static* boundary string, which may very well be a separate
> issue (one which is not normally encountered - in all my experiments with
> Firefox 2/3 I did not encounter such situation). 

> The issue I submitted earlier this week is about the predictability of the
> boundary string, and implications of its predictability (e.g. calculating the
> browser "mileage").

Bug 464071 comment 13 Boris Zbarsky 2008-11-12 06:23:30 PST
> The string is static because we use rand() without seeding it, as described in
> this bug...  Fixing that involves overhauling how the string is generated,
> which is exactly what's needed to fix the boundary-string issue you submitted.

Bug 464071 comment 14 Amit Klein 2008-11-12 06:35:22 PST
>> The string is static because we use rand() without seeding it, as described in
>> this bug...  Fixing that involves overhauling how the string is generated,
>> which is exactly what's needed to fix the boundary-string issue you submitted.

> Oh, right. I missed the part where this is supposed to be the *first* file
> upload submission. So yes, there's a point in routing the multipart boundary
> issue to bug 461204. At the same time, may I point out .. that
> fixing bug 461204 per-se involves merely seeding srand(),
> whereas fixing the issue
> I pointed out requires overhauling the string generation. 
> Ultimately it's your call, of course.

Bug 464071 comment 15 Boris Zbarsky 2008-11-12 06:44:47 PST
> Maybe I misunderstood the paper, but it sounded like the problem with the
> boundary was that it was predictable due to the seeding either not happening or
> the seed value being recoverable from the JS random number generator seed.  So
> just using a better seed for srand() should help there, as I understand.  Or am
> I missing something?

Bug 464071 comment 16 Amit Klein 2008-11-12 07:17:51 PST
>> Maybe I misunderstood the paper, but it sounded like the problem with the
>> boundary was that it was predictable due to the seeding either not happening or
>> the seed value being recoverable from the JS random number generator seed.  So
>> just using a better seed for srand() should help there, as I understand.  Or am
>> I missing something?

> There are two distinct (in my opinion) issues, although they share the same
> concept, sort-of.

> The first is the preditability of the [rand()] PRNG, and the ability to
> recover the seed used.
> The second is the predictability of the boundary string, which uses the CRT
> rand/srand PRNG. The root cause in this case is the predictability of the CRT
> rand/srand PRNG (at least on Windows and Mac OS/X, and to some degree on Linux
> too, and quite possibly on other O/Ses as well). Additionally, since srand is
> not called, the seed is static and always produces the same sequence (of which,
> bug 461204 is a symptom).

> Fixing one does not automatically fixes the other, and vice versa.

> Additionally, with the boundary string issue, just seeding srand() with a good
> seed won't get rid of the issue. First, the PRNG will still be predictable, so
> knowing the current value would make it easy (at least on Windows and Mac OS/X)
> to know the next value (read: next boundary string). And quite possibly the
> PRNG can be rolled back to the point where seeding took place, and thereby
> identify the seed used and be able to track the browser (this may depend on the
> seeding scheme).

Bug 464071 comment 17 Boris Zbarsky 2008-11-12 07:29:54 PST
> Any PRNG with a published algorithm will let you predict the next result from
> the previous one.  So as long as we're using a PRNG at all for the boundary
>string (and not one-way hashing the result or anything) you'll be able to
> predict future boundaries.  I'm not clear on what the threat in predicting
> future boundaries is.

Bug 464071 comment 19 Amit Klein 2008-11-12 07:48:33 PST
>> Any PRNG with a published algorithm will let you predict the next result from
>> the previous one. 

> Eh? I disagree... Why is this an inherent attribute of any PRNG? it is only
> correct if you can deduce the internal state of the PRNG given a sequence of
> outputs. However, a cryptographic PRNG would not let you do that.

>> So as long as we're using a PRNG at all for the boundary
>> string (and not one-way hashing the result or anything) you'll be able to
>> predict future boundaries.  I'm not clear on what the threat in predicting
>> future boundaries is.

> You can use it e.g. for submitting arbitrary data as files to 3rd party
> sites(i.e. controlling the inner-part headers) as part of a CSRF attack.

>> The tracking is the real issue, as I understand it.  How does the seeding
>> scheme affect the ability to roll back the PRNG?  Just in terms of being able
>> to determine when you've rolled back to the original seed? 

> Yes. Suppose you seed with 3 bytes of true random noise, and zero-out the
> fourth byte (this is a cotrived example, of course), then the seed is quite
> good, but if I'm able to roll back the PRNG I can determine the seed by the
> zero byte, which will then give me what I need - an identifier for the browser.

>> For the seed we
>> should be able to use truly random bytes, which ought to obviate this problem.

> This still depends on the implementation of the CRT rand/srand PRNG. Suppose it
> allows seeding by 32 bits, but its internal state is much larger (e.g. 48 bits,
> 64 bits or even 992 bits which is the case with glibc - Linux). Then an
> attacker can still identify the seed (if the PRNG is weak) and the browser
> mileage.

Bug 464071 comment 20 Boris Zbarsky 2008-11-21 12:46:51 PST
> Note that we have nsIRandomGenerator if we really want, by the way.  For the
> form boundary that seems perfectly reasonable to me.

Bug 464071 comment 21 Amit Klein 2008-11-23 00:03:38 PST
>> Note that we have nsIRandomGenerator if we really want, by the way.  For the
>> form boundary that seems perfectly reasonable to me.

> There are two questions:
> a. Is nsIRandomGenerator implemented as a cryptographic PRNG? I assume so, but
> I didn't really dig inside the code. It would be nice to know the algorithm
> used.
> b. Is the PRNG seeded with enough high-entropy bits? this is usually a trickier
> question...

> Note that if (a) holds, but (b) doesn't, it's still quite bad for the boundary
> strings, because an attacker may be able to enumerate the possible seeds, and
> eliminate false guesses using one or more boundary string samples.

Bug 464071 comment 22 Boris Zbarsky 2008-11-23 07:14:15 PST
> nsIRandomGenerator just directly calls PK11_GenerateRandom.  Per comment 4, I
> would assume it's fine (in particular, that it is in fact a cryptographic PRNG,
> and that the seed is sufficiently long truly random data).  One of the NSS
> developers would have to comment on the details, though.

Bug 464071 comment 23 Nelson Bolyard 2008-11-23 11:18:23 PST
> PK11_GenerateRandom is a good cryptographic PRNG. To use it, you must have
> first initialized NSS.  That initialization will have seeded the PRNG 
> adequately.  One of the properties of this PRNG is that, without knowing
> the internal state (which is very large), one cannot predict the output,
> even if one has collected all previous inputs.

Bug 464071 comment 26 Johnathan Nightingale 2009-01-14 08:55:12 PST
>> nsIRandomGenerator just directly calls PK11_GenerateRandom.  Per comment 4, I
>> would assume it's fine (in particular, that it is in fact a cryptographic PRNG,
>> and that the seed is sufficiently long truly random data).  One of the NSS
>> developers would have to comment on the details, though.

> Are you proposing wiring Math.random() to nsIRandomGenerator, or just using
> nsIRandomGenerator to seed the JS PRNG?

Bug 464071 comment 27 Boris Zbarsky 2009-01-14 09:21:48 PST
>> Are you proposing wiring [form rand()] to nsIRandomGenerator, or just using
>> nsIRandomGenerator to seed the JS PRNG?

> I'm proposing using it instead of random() for the form submission multipart
> separator, so that it would actually not be the same from run to run as it is
> now).  This would have the side effect that it would no longer be possible to
> go back and forth from those strings to Math.random() results.

Bug 464071 comment 28 Amit Klein 2009-01-14 09:28:23 PST
> If you only give it a strong seed, without changing the weak algorithm, the
> overall system is still vulnerable. Perhaps a bit less vulnerable, but still
> vulnerable. 

> An attacker can still find the internal PRNG state at any given moment (because
> the PRNG algorithm is weak). And then, the attacker can find out e.g. how many
> times [rand()] was called in between checkpoints. An attacker can also
> still mount some form of user tracking. If the attacker knows the state at time
> T1, and then the state at time T2, and the attacker can also assume that say no
> more than 1000 [rand()] values were consumed by FF in between, then the
> attacker can roll forward the state from T1 up to 1000 iterations, and compare
> each intermediate state with the state at T2. If there's a match, the attacker
> can say with high likelihood of being correct, that the same browser was used
> in T1 and in T2. Hence user tracking (more demanding on the CPU though).

Bug 464071 comment 32 Andreas Gal 2009-09-15 14:57:56 PDT
> The main remaining worries if each [JS] context has its own random number generator
> are:

> 1) the stream of random numbers a script gets will become more predictable (we
> remove any noise from other consumers of the RN stream). This should be fine [for JS].
> Its already pretty predictable now.

It might be possible to have distinct "contexts" for each "domain-port" pair, but I'm not sure this makes sense. It might if we need to ensure that one evil domain isn't aware of how we construct data for another victim domain. Or it might not if we're relying on not being able to disclose all entropy from our random number generator.

> 3) if the RNG is per context, content might be able to spoof/predict/steer the
> numbers [form submission] gets if [form submission] executes on the same context
> [for all domains] (which right now it usually does).
Component: HTML: Form Submission → DOM: Core & HTML
Whiteboard: [sg:low] → [sg:low][tor 22919]

Using a weak RNG for the form boundary allows a website operator to perform several
attacks on users (as outlined in https://trac.torproject.org/projects/tor/ticket/22919)

These include:

  • Identifying Windows users based on the unseeded RNG
  • Identify the number of form submissions that have occurred cross-origin between same-origin submissions

Additionally, a predictable boundary makes it possible to forge a boundary in the middle
of a file upload.

Assignee: nobody → tom
Pushed by tritter@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c7fb319ba21e
Improve the random number generator for the boundaries in multipart/form-data r=smaug
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla74
Whiteboard: [sg:low][tor 22919] → [sg:low][tor 22919][adv-main74-]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: