browser_misused_characters_in_strings.js doesn't work correctly with string definitions spanning multiple lines

RESOLVED FIXED in Firefox 48

Status

()

Firefox
General
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: nhnt11, Assigned: nhnt11)

Tracking

Trunk
Firefox 48
Points:
---

Firefox Tracking Flags

(firefox48 fixed)

Details

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(1 attachment)

(Assignee)

Description

2 years ago
From bug 1259859 comment 34:

It appears this test doesn't work for strings that contain newline characters in their definitions. I noticed this after a try push for bug 1240594 (which removes aboutCertError.dtd).

Specifically, that bug moves the certerror.introPara string to netError.dtd and splits it into 3 lines. I changed the whitelist to reflect that this string is now in netError.dtd, but I ended up with a failure because it never got removed from the whitelist.

Keeping the entire string in one line eliminates this failure. This seems like a bug in the test to me, because afaik string definitions in DTDs can span multiple lines.
(Assignee)

Comment 1

2 years ago
Created attachment 8738859 [details]
MozReview Request: Bug 1262648 - Make ENTITY matching regexp allow multiline string values in browser_misused_characters_in_strings.js. r=jaws

Review commit: https://reviewboard.mozilla.org/r/44721/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/44721/
Attachment #8738859 - Flags: review?(jaws)
(Assignee)

Comment 2

2 years ago
So, my apologies for the crazy regex I ended up with.

I started out just changing a "." to "[\s\S]" so that newlines are matched, but that led to greedy vs lazy behavior to correctly match within the same entity, and then some other issue, and another one.... and I ended up just writing a proper comprehensive regex to match full ENTITY tags.

You can see and play with my test cases here: http://www.regexpal.com/?fam=94585

If you think this is overkill (which I would agree with) I'll see if I can start over and come up with something simple that works with our codebase (i.e. making assumptions about code style).

Comment 3

2 years ago
Don't make assumptions, l10n is fuzz testing those.

https://dxr.mozilla.org/mozilla-central/source/python/compare-locales/compare_locales/parser.py#215 has a decent parser, we have a few years of experience with that one.
(Assignee)

Comment 4

2 years ago
https://reviewboard.mozilla.org/r/44721/#review41505

Found a much more straightforward, obvious regex.
(Assignee)

Comment 5

2 years ago
Comment on attachment 8738859 [details]
MozReview Request: Bug 1262648 - Make ENTITY matching regexp allow multiline string values in browser_misused_characters_in_strings.js. r=jaws

Found a better regex
Attachment #8738859 - Flags: review?(jaws) → review-
(Assignee)

Comment 6

2 years ago
Comment on attachment 8738859 [details]
MozReview Request: Bug 1262648 - Make ENTITY matching regexp allow multiline string values in browser_misused_characters_in_strings.js. r=jaws

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/44721/diff/1-2/
Attachment #8738859 - Attachment description: MozReview Request: Bug 1262648 - Make ENTITY matching regex more comprehensive in browser_misused_characters_in_strings.js. r=jaws → MozReview Request: Bug 1262648 - Make ENTITY matching regexp allow multiline string values in browser_misused_characters_in_strings.js. r=jaws
Attachment #8738859 - Flags: review- → review?(jaws)
(Assignee)

Comment 7

2 years ago
New regex is adapted from the file Axel linked to. It doesn't work when there are escaped quotes within the string value (e.g. <!ENTITY some.key "<span class=\"some value\">">), but since it seems we've been using that parser for a while now I think the precedent and simplification of the regex justify the lack of support for that edge case. Also I could not find any such example in our current code base [1].

[1] https://dxr.mozilla.org/mozilla-central/search?q=regexp%3A<!ENTITY\s%2B[\w\.]%2B\s%2B("[^"]*\\"[^"]*"|'[^']*\\'[^']*')\s*>
Comment on attachment 8738859 [details]
MozReview Request: Bug 1262648 - Make ENTITY matching regexp allow multiline string values in browser_misused_characters_in_strings.js. r=jaws

https://reviewboard.mozilla.org/r/44721/#review41597

Please add a dummy DTD file that is opened by the test and confirms that we will support multiline strings even if we aren't shipping any.

The rest of the patch looks fine.
Attachment #8738859 - Flags: review?(jaws)
Comment on attachment 8738859 [details]
MozReview Request: Bug 1262648 - Make ENTITY matching regexp allow multiline string values in browser_misused_characters_in_strings.js. r=jaws

https://reviewboard.mozilla.org/r/44721/#review42413

r=me if you add a new DTD that is opened by the test to confirm that multiline strings are interpreted correctly.
Attachment #8738859 - Flags: review+
(Assignee)

Comment 10

2 years ago
Comment on attachment 8738859 [details]
MozReview Request: Bug 1262648 - Make ENTITY matching regexp allow multiline string values in browser_misused_characters_in_strings.js. r=jaws

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/44721/diff/2-3/
r=me on the updated patch
(Assignee)

Comment 12

2 years ago
https://hg.mozilla.org/integration/fx-team/rev/5ad0f8d2d7dffcf57ff4d935f5dfb4781683091e
Bug 1262648 - Make ENTITY matching regexp allow multiline string values in browser_misused_characters_in_strings.js. r=jaws

Comment 13

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/5ad0f8d2d7df
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
status-firefox48: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 48
You need to log in before you can comment on or make changes to this bug.