Open Bug 1437914 Opened 2 years ago Updated 9 months ago

Typing 25-30 chars on an input with expensive regexp pattern causes browser to hang

Categories

(Core :: DOM: Editor, defect, P3)

58 Branch
defect

Tracking

Tracking Status
firefox58 --- affected
firefox60 --- affected

People

(Reporter: james, Unassigned)

Details

(Keywords: perf, Whiteboard: [qf:p5])

Attachments

(3 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36

Steps to reproduce:

Create a straightforward <input> element like so:

<input
       name="emailAddress"
       pattern="\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,4})+"
       value=""
       type="text" />

Created a jsbin for ease of testing: https://jsbin.com/rucajimawe/1/edit?html,output


Actual results:

Try to type around 25-30 characters. For example, when typing a-z, I notice extreme slowness/lag starting around x/y/z.


Expected results:

There should not be any lag.
FWIW, this seems to be related to the inefficient pattern used. With a different (similar, but not identical) regex, for example:

\w+[\w.\-+]*@\w+[\w.\-+]*.\w{2,6}

the issue does not occur.
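
A quick way to compare the two patterns outside the browser is sketched below. It assumes the usual `pattern` semantics, where the whole value has to match, so each source is wrapped in `^(?:...)$`. The helper name `compilePattern` and the sample values are made up for illustration. Both patterns accept an ordinary address and reject a string with no `@`; the difference lies in how much work a backtracking engine does before rejecting, since `\w+([\.-]?\w+)*` can split one run of word characters into repeated groups in very many ways, while the character-class version leaves far fewer ways to split it.

```javascript
// Hypothetical helper: anchor a pattern the way <input pattern> does,
// i.e. the entire value has to match.
function compilePattern(source) {
  return new RegExp('^(?:' + source + ')$');
}

// The two patterns quoted in this bug.
const slowPattern = compilePattern(String.raw`\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,4})+`);
const fastPattern = compilePattern(String.raw`\w+[\w.\-+]*@\w+[\w.\-+]*.\w{2,6}`);

// Made-up sample values; kept short so the slow pattern stays cheap.
const samples = ['user.name@example.com', 'abcdefghijklmnop'];

const results = samples.map((value) => ({
  value,
  slow: slowPattern.test(value),
  fast: fastPattern.test(value),
}));

for (const r of results) {
  console.log(`${r.value}: slow=${r.slow} fast=${r.fast}`);
}
```

The second sample is deliberately only 16 characters long; lengthening it is what triggers the slowdown with the first pattern.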

Even so, the web-based regex testers I've used to try to debug this, such as https://regexr.com/ and https://regex101.com/, don't hang (this may be completely unrelated, but I thought I'd mention it anyway).
Hi James!

I've reproduced the issue on Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101 Firefox/60.0 ID:20180213100127

https://perf-html.io/from-addon/calltree/?hiddenThreads=&thread=4&threadOrder=0-2-4-5-6-7-8-9-10-11-1-3&v=2

Thanks!
Alex
Status: UNCONFIRMED → NEW
Ever confirmed: true
Hi!

Oops!

The right link for the profile: https://perfht.ml/2G9SPi9

Thanks!
Alex
Looks like most time is spent in js::irregexp::InterpretCode.
Component: Layout: Form Controls → JavaScript Engine
kannan, do you know who could take a look at this? (or if this is a dupe of a known issue)

See profile in comment 3 -- scroll down to "Content (8 of 8)" and you'll see the hang towards the end of the profile (with a long red hang-bar, and black "keypress" handler bars ~doubling in size with each successive keypress -- with the last handler clocking in at ~8 seconds long.)
Flags: needinfo?(kvijayan)
Keywords: perf
Whiteboard: [qf]
Here's a testcase that just directly runs this regexp on some strings, in JS, and reports the durations (per "Date.now()") via console.log.

Firefox Nightly results:
Spent 2ms running regexp on 'abcdefghijklmnop'
Spent 32ms running regexp on 'abcdefghijklmnopqrstuv'
Spent 62ms running regexp on 'abcdefghijklmnopqrstuvw'
Spent 124ms running regexp on 'abcdefghijklmnopqrstuvwx'
Spent 234ms running regexp on 'abcdefghijklmnopqrstuvwxy'
Spent 456ms running regexp on 'abcdefghijklmnopqrstuvwxyz'
Spent 910ms running regexp on 'abcdefghijklmnopqrstuvwxyz1'

Chrome results:
Spent 1ms running regexp on 'abcdefghijklmnop'
Spent 31ms running regexp on 'abcdefghijklmnopqrstuv'
Spent 60ms running regexp on 'abcdefghijklmnopqrstuvw'
Spent 120ms running regexp on 'abcdefghijklmnopqrstuvwx'
Spent 228ms running regexp on 'abcdefghijklmnopqrstuvwxy'
Spent 450ms running regexp on 'abcdefghijklmnopqrstuvwxyz'
Spent 891ms running regexp on 'abcdefghijklmnopqrstuvwxyz1'

Looks like we're about equivalent to Chrome, in terms of raw regexp execution time (and in particular, this regexp is pretty slow in both browsers on strings of length 20+)
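
For readers without access to the attachment, a minimal sketch of this kind of measurement might look like the following. It is not the attached testcase: the helper name `timeRegexp` is hypothetical, the pattern is anchored on the assumption that it should behave like the `pattern` attribute, and the strings are shorter than the ones above so that it finishes quickly.

```javascript
// Pattern from the report, anchored so the whole string must match
// (an assumption about how the original testcase ran it).
const pattern = new RegExp(
  '^(?:' + String.raw`\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,4})+` + ')$'
);

// Hypothetical helper: run the regexp once and report elapsed milliseconds.
function timeRegexp(regexp, input) {
  const start = Date.now();
  const matched = regexp.test(input);
  return { input, matched, elapsedMs: Date.now() - start };
}

// Shorter strings than in the attachment; per the numbers above,
// each extra character roughly doubles the time.
const inputs = ['abcdefghijklmnop', 'abcdefghijklmnopqr', 'abcdefghijklmnopqrst'];
const timings = inputs.map((input) => timeRegexp(pattern, input));

for (const t of timings) {
  console.log(`Spent ${t.elapsedMs}ms running regexp on '${t.input}'`);
}
```

`Date.now()` has millisecond resolution at best, so very short runs will show up as 0ms or 1ms.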
Here's a version of the author's JSBin.

Based on my observations, it seems like we spend the same amount of time evaluating the regexp as Chrome, when we do run it.  But we run it more eagerly.  Specifically, I think the difference between us and Chrome here is:
 - We validate (i.e. run the regexp) on *every* character-press.
 - Chrome only validates when you click outside the textfield.

So, you can immediately observe Firefox getting janky, as you type -- whereas with Chrome, it's less noticeable because it doesn't happen while you're typing.

So, this is likely a DOM/Editor bug rather than a JS bug -- it's not that our regexp is slower than Chrome's, but rather that our text field is invoking it more aggressively.
Component: JavaScript Engine → Editor
Flags: needinfo?(kvijayan)
Summary: Typing 25-30 chars on an input with pattern causes browser to hang → Typing 25-30 chars on an input with expensive regexp pattern causes browser to hang
Attachment #8951089 - Attachment description: testcase 2 (based on author's jsbin): regexp-validated textfield → testcase 2 (based on author's jsbin): regexp-validated textfield (WARNING, may hang your content process if you type more than a couple characters)
Actually, on further testing, Chrome is janky on each keypress here, too, just like we are -- BUT, they're only janky if I'm hovering my cursor over the textbox. No idea why -- maybe their valid/invalid styling depends on that hover state.

So they're approximately as bad as we are, if you hover the textbox - i.e. they're not doing the "only validate when you click outside" thing that I was guessing at in comment 7.  But they do have one nice difference that I noticed, even in the bad scenario -- they appear to throttle their regexp-check to happen once per frame rather than once per keypress, or something like that.

e.g. if I type a bunch of characters in rapid succession into https://bugzilla.mozilla.org/attachment.cgi?id=8951089 (while hovering the textfield), Chrome makes a "batch" of those characters appear instantly, and only *then* does it hang forever. :)  As compared to Firefox, which hangs a bit with each successive character.
So I think there are incremental wins we can get here (e.g. by batching updates like Chrome does), but I'm not sure those can really make this much less painful than it already is (the bad cases will still be pretty bad).
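
The batching idea can be sketched in page script as follows. This only illustrates the coalescing behaviour described above, not how either browser implements it internally; `makeBatchedValidator` and its `schedule` parameter are hypothetical names.

```javascript
// Hypothetical helper: coalesce many validation requests into one run per
// scheduled tick. `schedule` is injectable; in a page it would typically be
// requestAnimationFrame, otherwise this falls back to setTimeout.
function makeBatchedValidator(validate, schedule) {
  const defer = schedule ||
    (typeof requestAnimationFrame === 'function'
      ? (cb) => requestAnimationFrame(cb)
      : (cb) => setTimeout(cb, 16));
  let pending = false;
  let latestValue = '';

  return function requestValidation(value) {
    latestValue = value;      // always remember the newest value
    if (pending) return;      // a run is already queued for this tick
    pending = true;
    defer(() => {
      pending = false;
      validate(latestValue);  // one validation for the whole batch
    });
  };
}

// Usage with a manual scheduler so the batching is easy to observe.
const queue = [];
const seen = [];
const requestValidation = makeBatchedValidator(
  (value) => seen.push(value),
  (cb) => queue.push(cb)
);

for (const value of ['a', 'ab', 'abc', 'abcd']) {
  requestValidation(value);   // four "keypresses" before the frame fires
}
queue.splice(0).forEach((cb) => cb()); // simulate the frame firing
console.log(seen);            // only the latest value was validated
```

As noted above, this reduces how often the expensive check runs but does not make a single run any cheaper.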

Having said that: I checked Edge on both testcases, and they claim to be able to run the RegExp ~instantly!  On the first attachment here (from comment 6), Edge claims to spend 0ms on all of the provided strings (as compared to ~900ms on the longest one, in Chrome and Firefox).

So maybe there are some JS engine optimizations to be done... (or maybe Edge is misinterpreting the RegExp somehow).  So, perhaps worth having a JS hacker take a look at that after all, to try to account for the 0ms vs 900ms difference between us and Edge here on the first testcase. --> restoring my needinfo=djvj
Flags: needinfo?(kvijayan)
Attachment #8951086 - Attachment description: testcase just directly running the regexp in JS → testcase 1 (just directly running the regexp in JS on strings of up to 27 characters)
Here's a screencast demonstrating Edge 16's performance on both testcases here [recorded on my Linux machine using BrowserStack.com].  They have 0ms measurements on the first testcase, and no noticeable delay when typing arbitrarily many characters on the second testcase.

(Note that I haven't tested them for *correctness*, so that's worth wondering about / validating. But as long as they're not cheating somehow, this is a sign we could be doing a heck of a lot better here.)
(In reply to Daniel Holbert [:dholbert] from comment #5)
> kannan, do you know who could take a look at this? (or if this is a dupe of a
> known issue)
> 
> See profile in comment 3 -- scroll down to "Content (8 of 8)" and you'll see
> the hang towards the end of the profile (with a long red hang-bar, and black
> "keypress" handler bars ~doubling in size with each successive keypress --
> with the last handler clocking in at ~8 seconds long.)

Sorry for the late reply.  I think the last person to work with the irregexp stuff was bhackett.  But it seems that you've confirmed that this is not a regex speed issue anyway?
Flags: needinfo?(kvijayan)
(In reply to Kannan Vijayan [:djvj] from comment #11)
> But it seems that you've confirmed that this is not a
> regex speed issue anyway?

When comparing to Edge, there does seem to be a regexp speed issue (per comment 10).
Priority: -- → P3
James - do you know of any real-world sites that are affected by this?  That would help our prioritization here.
Flags: needinfo?(james)
In practice, no. As mentioned, the regex is pretty bad anyway. It was being used on a client's site, but I've since rewritten the regex so they are no longer affected. In reality, I think this is a bit of an edge case, and whilst the performance here is indeed bad, most developers would pick up on the inefficiency and rework things themselves. I guess this is a "nice to have", but I certainly wouldn't put it high in priorities!
Flags: needinfo?(james)
OK, thanks! That was my guess / gut feeling as well.
Whiteboard: [qf] → [qf:p5][qf:f64]
Whiteboard: [qf:p5][qf:f64] → [qf:p5:f64]
Whiteboard: [qf:p5:f64] → [qf:p5]

The solution here is an evaluation time limit on an input pattern, e.g. 10ms ought to be ample. I have no idea if there is technical infrastructure for doing this.
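
Page script cannot interrupt a synchronous regexp match on its own thread, so a limit like this would have to live in the engine. Purely to illustrate the behaviour such a limit would give, here is a minimal sketch that runs outside the browser in Node.js, whose `vm` module accepts a `timeout` option. The helper name `testWithTimeLimit` is hypothetical, the 10ms default is the figure suggested above, and none of this is meant as a description of how Gecko would implement it.

```javascript
const vm = require('vm');

// Hypothetical helper: evaluate `source` against `input` with a time budget.
// Returns { matched, timedOut }; a timed-out check reports matched as null.
function testWithTimeLimit(source, input, limitMs = 10) {
  try {
    const matched = vm.runInNewContext(
      'new RegExp("^(?:" + source + ")$").test(input)',
      { source, input },
      { timeout: limitMs }
    );
    return { matched, timedOut: false };
  } catch (err) {
    if (err && err.code === 'ERR_SCRIPT_EXECUTION_TIMEOUT') {
      return { matched: null, timedOut: true };
    }
    throw err; // e.g. a syntax error in the pattern
  }
}

const source = String.raw`\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,4})+`;
console.log(testWithTimeLimit(source, 'user.name@example.com'));
console.log(testWithTimeLimit(source, 'abcdefghijklmnopqrstuvwxyz1234'));
```

What a timed-out check should mean for validity (treat as valid, treat as invalid, or something else) is a separate design question that this sketch leaves open by returning `null`.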

Chrome suffers from this same bug.

There is, however, an aspect of this that only applies to Firefox which makes it weaponisable: values are validated immediately, even in an unmounted and inert DOM such as HTML defangers use. I’ve filed bug 1559731 for that aspect of it.

For optimal robustness, a time limit would need to apply to all patterns being evaluated at a given point. Doing so probably requires the fix I suggested in bug 1559731 (lazily evaluating patterns) anyway.
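
As a rough model of what lazy evaluation means here, the sketch below defers the pattern check until validity is actually queried and caches the answer per value. The class name `LazyPatternField`, its `evaluations` counter, and the simple pattern used in the usage lines are all hypothetical and exist only to make the behaviour visible; this is not a description of the actual DOM implementation.

```javascript
// Hypothetical model of a field whose pattern check is deferred until
// something actually asks for validity.
class LazyPatternField {
  constructor(source) {
    this.regexp = new RegExp('^(?:' + source + ')$');
    this.currentValue = '';
    this.cachedValid = null;   // null means "not computed for this value"
    this.evaluations = 0;      // counter, only here to make the laziness visible
  }

  set value(newValue) {
    this.currentValue = newValue;
    this.cachedValid = null;   // invalidate, but do not run the regexp yet
  }

  get value() {
    return this.currentValue;
  }

  get valid() {
    if (this.cachedValid === null) {
      this.evaluations += 1;
      // An empty value is treated as valid, since pattern checks skip it.
      this.cachedValid =
        this.currentValue === '' || this.regexp.test(this.currentValue);
    }
    return this.cachedValid;
  }
}

const field = new LazyPatternField(String.raw`\w+@\w+\.\w{2,4}`);
for (const value of ['u', 'us', 'user@example.com']) {
  field.value = value;         // three updates, no regexp runs so far
}
console.log(field.valid, field.evaluations);
```

With this shape, a value that is set but never queried costs nothing, which is the property the inert-DOM case depends on.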
