Last Comment Bug 666562 - HTTP link header field title param: quoted-string not processed correctly
: HTTP link header field title param: quoted-string not processed correctly
Status: RESOLVED FIXED
[inbound]
:
Product: Core
Classification: Components
Component: DOM (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla8
Assigned To: Julian Reschke
:
: Andrew Overholt [:overholt]
Mentors:
http://greenbytes.de/tech/tc/httplink...
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-06-23 05:55 PDT by Julian Reschke
Modified: 2011-08-09 01:12 PDT (History)
2 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
implement quoted-string unescaping (3.99 KB, patch)
2011-07-16 08:31 PDT, Julian Reschke
bzbarsky: review+
Details | Diff | Splinter Review

Description Julian Reschke 2011-06-23 05:55:25 PDT
User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0
Build Identifier: 

When parsing the title param in an HTTP link header field, the quoted-string form isn't processed properly with respect to unescaping.

Reproducible: Always

Steps to Reproduce:
1. Visit <http://greenbytes.de/tech/tc/httplink/#simplecsstitleq> for overview.
2. Run linked test at <http://greenbytes.de/tech/tc/httplink/simplecsstitleq.asis>
3. The CSS should be loaded with the properly unescaped title observable from the loaded page.

Actual Results:  
It is not.
Comment 1 Boris Zbarsky [:bz] (still a bit busy) 2011-06-23 08:09:20 PDT
This code should generally be using the mime header param class, no?
Comment 2 Julian Reschke 2011-06-23 08:18:28 PDT
(In reply to comment #1)
> This code should generally be using the mime header param class, no?

In a perfect world: yes.

Problem 1: the MIME header param class carries lots of historical baggage, bugs, and workarounds. We shouldn't extend those to new header fields that don't come with the same history.

Problem 2: the formats aren't as uniform as they should be. For instance, RFC 5988 allows quoted media types without making them use the quoted-string syntax. So parsers need special rules per *parameter* (I personally consider this a bug in RFC 5988, but before reporting errata over there I'd like to explore how close we can come with a sane parsing algorithm).

That being said: controlled code-reuse would be good; both for decoding quoted-string (just refactored in mime header parser a few weeks ago), and also in decoding 2231/5987 (which we'll need for bug 663057)
Comment 3 Julian Reschke 2011-07-08 05:32:36 PDT
I think it would be useful to have a generic tokenizer that can process header field values as used in Content-Type, Content-Disposition, Link and several other header fields.

It would take a set of header field values (several for those header fields using the "," notation), and return a set of lists containing each token, where each token would come with:

- token name
- token value (which could be null it there was no "=")
- flags (was-quoted-string etc)

Parsers for individual header fields could then process the output of the generic tokenizer, such as applying 2231/5987 unescaping, 2047 handling (sigh), enforcing constraints like "first token must be disposition type", and so on.

I could give this a try in plain C, or do a p-o-c in Java or javascript. Would that be useful?
Comment 4 Julian Reschke 2011-07-16 08:31:56 PDT
Created attachment 546328 [details] [diff] [review]
implement quoted-string unescaping

this patch implements quoted-string unescaping, fixing the test failures in <http://greenbytes.de/tech/tc/httplink/#simplecsstitleq> and <http://greenbytes.de/tech/tc/httplink/#simplecsstitleq2>.

(it also removes redundant constants for PRUnichars)
Comment 5 Boris Zbarsky [:bz] (still a bit busy) 2011-07-22 08:36:05 PDT
Comment on attachment 546328 [details] [diff] [review]
implement quoted-string unescaping

This looks fine, but are we sure we don't want the unescaping for single-quoted strings?
Comment 6 Julian Reschke 2011-07-22 16:39:21 PDT
My plan is to actually remove the support for single-quoting, see bug 672079.
Comment 7 Marco Bonardo [::mak] 2011-07-25 06:13:54 PDT
http://hg.mozilla.org/mozilla-central/rev/0f58aa05f18d

Note You need to log in before you can comment on or make changes to this bug.