Align with Fetch on data: URLs

NEW
Unassigned

Status

P3
normal
a year ago
6 months ago

People

(Reporter: annevk, Unassigned)

Tracking

(Depends on: 4 bugs, Blocks: 1 bug, {meta})

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

a year ago
In https://github.com/whatwg/fetch/pull/579 I'm working on a revised standard for data: URLs to put all issues related to them in Firefox and across browsers to bed forever.

There are corresponding tests over at https://github.com/w3c/web-platform-tests/pull/6890.

Both are currently somewhat blocked because it is not clear to me what the best strategy for MIME types is. The RFC definition doesn't work, as

  text/html;

cannot be treated as an error and neither can

  text/html;unknown

but how exactly we should preserve missing or invalid parameters is unclear. Ideas on that are very much welcome over in https://github.com/whatwg/mimesniff/issues/30.
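To make the two cases above concrete, here is a minimal sketch of a lenient parser that never fails on a trailing `;` or on a parameter without `=`. The names `LenientMime` and `parse_lenient` are hypothetical, it does not handle quoted parameter values, and keeping bare segments in a separate list is only one possible answer to the open question, not what any standard says.

```rust
/// Result of a lenient parse. Nothing here is treated as an error.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
struct LenientMime {
    /// The "type/subtype" part, ASCII-lowercased.
    essence: String,
    /// Well-formed `name=value` parameters, in order of appearance.
    params: Vec<(String, String)>,
    /// Non-empty segments without '=' (e.g. "unknown"), kept so the
    /// caller can decide whether to preserve or drop them.
    bare: Vec<String>,
}

/// Assumption: segments are separated by ';' and quoted strings are
/// not supported in this sketch.
fn parse_lenient(input: &str) -> LenientMime {
    let mut parts = input.split(';');
    // `split` always yields at least one item, so the fallback is never hit.
    let essence = parts.next().unwrap_or("").trim().to_ascii_lowercase();
    let mut params = Vec::new();
    let mut bare = Vec::new();
    for part in parts {
        let part = part.trim();
        if part.is_empty() {
            // Trailing ';' as in "text/html;": skip, do not fail.
            continue;
        }
        match part.split_once('=') {
            Some((name, value)) => params.push((
                name.trim().to_ascii_lowercase(),
                value.trim().to_string(),
            )),
            // No '=' as in "text/html;unknown": record it, do not fail.
            None => bare.push(part.to_string()),
        }
    }
    LenientMime { essence, params, bare }
}
```

With this sketch, `parse_lenient("text/html;")` and `parse_lenient("text/html;unknown")` both produce the essence `text/html`; the second additionally records `unknown` as a bare segment.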

I'm going to mark all data: URL bugs that are blocked on a better processing definition as blocking this bug to make sure the solution covers all of them. I think we should start fixing them one-by-one even if the standard hasn't landed yet as there are clear improvements we could make over the status quo.
Blocks: 1226983
Keywords: meta
Sorry, should already have marked these meta bugs as P3.
Priority: -- → P3
(Reporter)

Updated

10 months ago
Depends on: 908413
(Reporter)

Comment 2

7 months ago
Tests have landed and the Fetch Standard has been updated:

  https://fetch.spec.whatwg.org/#data-urls
Anne, can you link to any tests we fail?  It's not obvious to me where the tests are.
Thanks.  I guess wpt.fyi is not updated yet.  I don't see them there.
I’ve implemented this in Rust: https://github.com/servo/rust-url/tree/master/data-url

Though this code takes &str (UTF-8) as input, whereas Gecko might want something that works on &[u16] directly, to avoid converting and copying a potentially-long string.
Gecko's URIs are stored as UTF-8 strings, I believe.  They're definitely stored as 1-byte strings.

What they're _not_, at least in the data: case, is stored as a _single_ string.  Right now Gecko does parse the full string, with a resulting extra copy, but bug 1333899 was aiming to stop doing that, and it would be good to plan for it.  Ideally, there would be an API that takes the substring starting right after ':' and going up to (but not including) the '#' and parses that.
There isn’t exactly that public API in the code linked above, but it could be added.
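As a rough sketch of what such an entry point could look like (this is not the API of the data-url crate linked above; `data_url_payload` and `split_payload` are hypothetical names): the first helper isolates the part after `:` and before `#`, and the second takes that already-isolated substring, which is the piece a caller like Gecko would pass in directly. Both return borrowed slices, so nothing is copied.

```rust
/// Hypothetical helper: returns the part of a data: URL that follows
/// "data:" and precedes '#', borrowing from the input.
fn data_url_payload(url: &str) -> Option<&str> {
    let (scheme, rest) = url.split_once(':')?;
    if !scheme.eq_ignore_ascii_case("data") {
        return None;
    }
    // Keep everything up to (but not including) the first '#'.
    Some(rest.split('#').next().unwrap_or(rest))
}

/// Hypothetical entry point that accepts the already-isolated payload
/// and splits it at the first ',' into (header, body). Returns None
/// when there is no ',' at all. Decoding is out of scope here.
fn split_payload(payload: &str) -> Option<(&str, &str)> {
    payload.split_once(',')
}
```

A caller that already stores the pieces separately would skip `data_url_payload` and call `split_payload` on its own substring.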