Closed Bug 415470 Opened 17 years ago Closed 13 years ago

javascript atob() detects invalid Base64 encoding and throws an exception

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED INVALID

People

(Reporter: telega, Unassigned)

Details

Attachments

(2 obsolete files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b3pre) Gecko/2007121705 Minefield/3.0b3pre Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b3pre) Gecko/2007121705 Minefield/3.0b3pre Base64 decoding of string "wuDsIO/w6Pjr7iDr6Pft7uUg8e7u4fnl7ejl=" using javascript atob() function gives "Illegal character" error. But decoding the same string using some other Base64 decoder, for example PHP base64_decode() function, successfully decodes it. The problem seems to be in trailing "=" character, because when this character is omitted, string is decoded by atob() successfully. But other decoders decode the original string without errors. Reproducible: Always Steps to Reproduce: 1.Open the following test.html file in browser: <script> atob("wuDsIO/w6Pjr7iDr6Pft7uUg8e7u4fnl7ejl="); <script> Actual Results: An exception occurs. Expected Results: String should be decoded without errors.
Component: General → DOM: Level 0
Product: Firefox → Core
QA Contact: general → general
The trailing "=" in this case is unneeded - the length of the encoded string is a multiple of 4 (since the encoded string's length is 27, a multiple of 3). nsGlobalWindow:Atob() uses the PL_Base64Decode function from NSPR to do the actual decoding. It treats extraneous trailing padding characters as invalid and fails: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/nsprpub/lib/libc/src/base64.c&rev=3.7&mark=342-344#312 RFC 4648 (which obsoletes RFC 3458) says (section 3.3 paragraph 2): If more than the allowed number of pad characters is found at the end of the string (e.g., a base 64 string terminated with "==="), the excess pad characters MAY also be ignored. So it appears that ignoring this fault might be OK, from a spec point of view. I'm not sure whether it is feasible to make that change to NSPR, though.
Assignee: nobody → wtc
Component: DOM: Level 0 → NSPR
Product: Core → NSPR
QA Contact: general → nspr
Version: unspecified → other
Attached patch potential patch (obsolete) — Splinter Review
This fixes the specific case of allowing an extra trailing "=" character. I'm not sure it's worth taking this fix unless there's a compelling reason to be compatible with PHP's base64 decoding function.
Attached patch potential patch (obsolete) — Splinter Review
Oops, I uploaded the wrong patch.
Attachment #301206 - Attachment is obsolete: true
FYI. Bug 227290 is same problem on Mail&News in decoding of mime-encoded header. And, by patch for Bug 227290, Mail&News can tolerate one excess padding char. By the way, why common routine is not used for decoding?
(In reply to comment #4) > By the way, why common routine is not used for decoding? It looks like PL_Base64Decode is used in both cases - for that specific caller the base64 string was analyzed before being passed in, and modified. My patch only handles case 1 + "=", from bug 227290 comment 11. If we want to make this change we should perhaps fix PL_Base64Decode to allow cases 1-3+'=' from bug 227290 comment 11, and back-out the patch from bug 227290. Another option would be to do the "fixup" of the invalid encoding in nsGlobalWindow::Atob(), before calling PL_Base64Decode. wtc, Nelson: do you think it would be feasible to fix this bug at the NSPR level?
This is not an NSPR "bug", per se. NSPR's decoder correctly decodes base64 strings that are properly encoded, and correctly reports errors for improperly encoded Base64 strings. It is not a "bug" for NSPR's decoder to report errors for improperly encoded Base64 strings, since it is defined and required to work that way. So the NSPR changes requested in comment 5 constitute an enhancement request, asking that NSPR's decoder silently ignore certain encoding errors. Interoperability is based on conformance to protocol standards. The notion that interoperability failures caused by nonconforming protocol implementations in other products constitute a "bug" in Mozilla products is offensive. People merely ask that Mozilla be changed, rather than that the other products be fixed, because they have learned that sometime Mozilla is more responsive to such requests than are the developers of other products. Security components of Mozilla products rely on NSPR's base64 decoder to be exact and correct, and to NOT silently ignore errors that are important from a security perspective. The security uses of base64 rely on exact decoding. And that is what we have today, a decoder that requires exactitude with respect to the specification. Having such a decoder is a requirement. NSPR's existing decoder meets that requirement. Perhaps some applications ALSO want a different decoder, one that tends to silently ignore errors (in addition to the decoder we now have). Perhaps a new NSPR decoder API could be written to satisfy their wants without weakening the existing API's security. Or perhaps the calling application should attempt to detect encoding errors that are common among other implementations of that same application, if those errors are not important to that application's security. Surely the MIME code should not generally abandon the use of an exact decoder, and begin to use a decoder that ignores errors in all circumstances when decoding base64. It would be seriously wrong to ignore base64 encoding errors while decoding ANY components of an S/MIME message. But perhaps when decoding messages that have no pretense of security, it would be OK to ignore encoding errors for SOME (not all) components of the messages, such as Subject lines (the subject of bug 227290). From the perspective of JavaScript, the function atob must surely have a definition that decides if it does or does not silently ignore encoding errors. If it is not defined to ignore errors, then perhaps JavaScript needs a new function, atob_ignoreerrors, for this purpose. If JavaScript's atob function is defined to ignore certain encoding errors, then I suggest that it should not impose upon all other base64 decoders that they also must ignore errors. I'd like this bug to go back to be a JavaScript bug. I suggest that, if you want JavaScript's atob function to forgive encoding errors, and you want that forgiveness to occur in an NSPR decider, then please file a separate RFE against NSPR, requesting the addition of a new API for a Base64 decoder that silently ignores certain encoding errors, and let it block this bug. Any new NSPR RFE should specify exactly which decoding errors should be ignored. It should be a specification for a new decoder, rather than a vague request to "ignore errors". It should specify exactly which errors to ignore, and whether they should cause the decoder to simply stop and return without reporting an error, or whether (and how) the decoder should try to skip over them and continue.
Summary: javascript atob() gives error → javascript atob() detects invalid Base64 encoding and throws an exception
(In reply to comment #6) > This is not an NSPR "bug", per se. NSPR's decoder correctly decodes base64 > strings that are properly encoded, and correctly reports errors for improperly > encoded Base64 strings. It is not a "bug" for NSPR's decoder to report errors > for improperly encoded Base64 strings, since it is defined and required to > work that way. Defined and required by what specification? What specification defines NSPR's Base64 decoder's behavior? I couldn't find any MUST-level requirements for NSPR's current behavior in the RFC I quoted, but I didn't look closely. If other commonly used base64 decoders accept the "invalid" encoding (as this bug's comment 0 and bug 227290 seem to imply), then it would seem wise to at least consider changing NSPR to match the de-facto behavior seen in other implementations, and that's why I asked the question in comment 5. I am not familiar with the security implications of making that change, or NSPR's compatibility constraints, so if you think those alone are enough to WONTFIX this bug for NSPR, that's fine (we can turn it into a DOM bug and re-evaluate). I just don't think that "that encoding is incorrect" is a valid reason to WONTFIX, given that there doesn't seem to be a specification that defines error handling for this case, and that other decoders handle this case gracefully.
(In reply to comment #7) > given that there doesn't seem to be a specification that defines error handling > for this case, and that other decoders handle this case gracefully. Put more succinctly: what makes the encoding in comment 0 "invalid"?
On its face, this bug is about JavaScript. I have little stake in how this JavaScript bug is resolved. But (as I said) if you want the "fix" for this "bug" to be in NSPR, then IMO you're asking for a new NSPR function that behaves differently than the present one. That should be a separate RFE for NSPR. I don't oppose doing such an RFE for NSPR. I oppose a change to the definition of the existing Base64 decoder'upon which NSS depends, changing it to start silently ignoring encoding errors.
RFC 4648 says (excerpted here) Implementations MUST reject the encoded data [in error] unless the *specification referring to this document* explicitly states otherwise. Such specifications may instead state, as MIME does, that characters [in error] should simply be ignored when interpreting data. [...] Furthermore, such specifications MAY ignore the pad character, "=", [...] The point is that allowing erroneous encodings to be accepted is something that specifications OTHER THAN this RFC may explicitly allow, but this RFC does not itself allow for them. Such other specifications would typically be protocol specifications (if RFCs) or perhaps specifications of other languages or libraries (such as JavaScript). Maybe JavaScript's atob() function is specified to ignore such errors. If so, I have no problem with JavaScript's atob function implementing that. But since not all specifications that incorporate RFC 4648 call for that, the base implementation ought not to impose that on all of its users. The history of crypto-based security it littered with many incidents where a vulnerability was found because the receiver of a (supposedly) cryptographically secured document (or packet) was too generour in what it accepted. NSS has had its share of these issues, as have PGP and many other products. (The huge set of RSA signature vulnerabilities found in many products a year or so ago were all due to this.) So, the "generous in what you accept" maxim tends NOT to apply to anything related to crypto security (and that includes S/MIME messages).
OK. Moving back to Core::DOM.
Assignee: wtc → nobody
Component: NSPR → DOM
Product: NSPR → Core
QA Contact: nspr → general
Version: other → unspecified
Attachment #301208 - Attachment is obsolete: true
NSS doesn't use NSPR's base64 decoder: http://lxr.mozilla.org/seamonkey/ident?i=PL_Base64Decode NSS has its own base64 decoder. NSS itself doesn't use its own base64 decoder except in the 'atob' and 'pwdecrypt' commands and the 'blapitest' and 'sdrtest' test programs. The comments in the source code suggest that the NSS base64 decoder also ignores extraneous padding characters. In fact, it even adds missing padding characters to the input. Search for "B64_PAD" in http://lxr.mozilla.org/security/source/security/nss/lib/util/nssb64d.c
Status: UNCONFIRMED → NEW
Ever confirmed: true
At issue is the base 64 decoding used outside of NSS in security related tasks, such as decoding of S/MIME messages.
Status: NEW → RESOLVED
Closed: 13 years ago
OS: Windows XP → All
Hardware: x86 → All
Resolution: --- → INVALID
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: