Open Bug 890552 Opened 12 years ago Updated 3 years ago

Tumblr page causes permanent main thread hang inside regexp interpreter

Categories

(Core :: DOM: Core & HTML, defect, P5)

x86_64
Windows 7

People

(Reporter: kael, Unassigned)

Details

(Whiteboard: [platform-rel-Tumblr])

Reproduced in FF 24.0a2 (2013-07-03) 32-bit on 64-bit w7, FF 22 32-bit on 64-bit w8 (clean profile), and FF23 64-bit on 64-bit w7 (someone else's machine). The bug itself is kind of finicky, sometimes it doesn't ever hang and when I visited another page on the same tumblr the hang didn't happen. But I've hit it multiple times. It doesn't seem to hang in Chrome, but there is definitely weird stuff happening on this webpage. Here's a stack from pausing it in VS. The stack looks weird but from stepping around I couldn't find anything incorrect about it; it looks like it's running the regexp interpreter on manufactured strings to support some sort of querySelectorAll call?
----
mozjs.dll!JSC::Yarr::Interpreter<wchar_t>::matchDisjunction(JSC::Yarr::ByteDisjunction * disjunction=0x04fd5e80, JSC::Yarr::Interpreter<wchar_t>::DisjunctionContext * context=0x021a0044, bool btrack=false) Line 1264 + 0x9 bytes C++
mozjs.dll!JSC::Yarr::Interpreter<wchar_t>::matchNonZeroDisjunction(JSC::Yarr::ByteDisjunction * disjunction=0x04fd5e80, JSC::Yarr::Interpreter<wchar_t>::DisjunctionContext * context=0x021a0044, bool btrack=false) Line 1417 C++
mozjs.dll!JSC::Yarr::Interpreter<wchar_t>::matchParentheses(JSC::Yarr::ByteTerm & term={...}, JSC::Yarr::Interpreter<wchar_t>::DisjunctionContext * context=0x021a0000) Line 904 + 0x1a bytes C++
mozjs.dll!JSC::Yarr::Interpreter<wchar_t>::matchDisjunction(JSC::Yarr::ByteDisjunction * disjunction=0x04f98520, JSC::Yarr::Interpreter<wchar_t>::DisjunctionContext * context=0x021a0000, bool btrack=false) Line 1229 + 0xf bytes C++
mozjs.dll!JSC::Yarr::Interpreter<wchar_t>::interpret() Line 1441 C++
mozjs.dll!JSC::Yarr::interpret(JSContext * cx=0x00000003, JSC::Yarr::BytecodePattern * bytecode=0x04fa3100, const wchar_t * input=0xb1775680, unsigned int length=31, unsigned int start=0, unsigned int * output=0x04747010) Line 1968 C++
mozjs.dll!js::RegExpShared::execute(JSContext * cx=0x22cc5a00, const wchar_t * chars=0xb1775680, unsigned int length=31, unsigned int * lastIndex=0x0043ef10, js::MatchPairs & matches={...}) Line 559 + 0x10 bytes C++
mozjs.dll!ExecuteRegExpImpl(JSContext * cx=0x0043ef60, js::RegExpStatics * res=0x22a55de0, js::RegExpShared & re={...}, JS::Handle<JSLinearString *> input={...}, const wchar_t * chars=0x0000001c, unsigned int length=168380416, unsigned int * lastIndex=0x00000003, js::MatchConduit & matches={...}) Line 119 + 0xe bytes C++
mozjs.dll!js::ExecuteRegExp(JSContext * cx=0x22cc5a00, JS::Handle<JSObject *> regexp={...}, JS::Handle<JSString *> string={...}, js::MatchConduit & matches={...}) Line 583 + 0x23 bytes C++
mozjs.dll!regexp_exec_impl(JSContext * cx=0xb0e9aec0, JS::CallArgs args={...}) Line 631 + 0x1d bytes C++
mozjs.dll!js::regexp_exec(JSContext * cx=0x00000001, unsigned int argc=513289856, JS::Value * vp=0x0043f060) Line 648 + 0x2a bytes C++
> xul.dll!nsSimpleContentList::Release() Line 135 + 0x9c bytes C++
xul.dll!nsRefPtr<nsHttpTransaction>::~nsRefPtr<nsHttpTransaction>() Line 881 C++
xul.dll!mozilla::dom::DocumentBinding::querySelectorAll(JSContext * cx=0x00000000, JS::Handle<JSObject *> obj={...}, nsIDocument * self=0x00000000, const JSJitMethodCallArgs & args={...}) Line 2730 + 0x9 bytes C++
xul.dll!mozilla::dom::DocumentBinding::genericMethod(JSContext * cx=0x00000302, unsigned int argc=1191278976, JS::Value * vp=0x00000004) Line 7148 + 0x1c bytes C++
mozjs.dll!EnterIon(JSContext * cx=0x00000003, js::ion::EnterJitData & data={...}) Line 1838 C++
mozjs.dll!js::ion::Cannon(JSContext * cx=0x22cc5a00, js::RunState & state={...}) Line 1923 + 0x8 bytes C++
mozjs.dll!js::Invoke(JSContext * cx=0x22cc5a00, const JS::Value & thisv={...}, const JS::Value & fval={...}, unsigned int argc=0, JS::Value * argv=0x0043f6c0, JS::Value * rval=0x0043f690) Line 531 + 0x9 bytes C++
mozjs.dll!JS_CallFunctionValue(JSContext * cx=0x22cc5a00, JSObject * objArg=0x71350640, JS::Value fval={...}, unsigned int argc=0, JS::Value * argv=0x0043f6c0, JS::Value * rval=0x0043f690) Line 5755 + 0x2f bytes C++
xul.dll!mozilla::dom::Function::Call(JSContext * cx=0x22cc5a00, JS::Handle<JSObject *> aThisObj={...}, const nsTArray<JS::Value> & arguments={...}, mozilla::ErrorResult & aRv={...}) Line 39 + 0x1f bytes C++
xul.dll!mozilla::dom::Function::Call<nsCOMPtr<nsISupports> >(const nsCOMPtr<nsISupports> & thisObj={...}, const nsTArray<JS::Value> & arguments={...}, mozilla::ErrorResult & aRv={...}, mozilla::dom::CallbackObject::ExceptionHandling aExceptionHandling=265842968) Line 52 + 0x20 bytes C++
xul.dll!nsGlobalWindow::RunTimeoutHandler(nsTimeout * aTimeout=0xa733e160, nsIScriptContext * aScx=0x1cd52c00) Line 10208 C++
xul.dll!nsGlobalWindow::RunTimeout(nsTimeout * aTimeout=0x2aafb9a0) Line 10447 C++
xul.dll!nsGlobalWindow::TimerCallback(nsITimer * aTimer=0x0678f680, void * aClosure=0x2aafb9a0) Line 10693 C++
xul.dll!nsTimerImpl::Fire() Line 543 + 0x7 bytes C++
xul.dll!nsThread::ProcessNextEvent(bool mayWait=true, bool * result=0x0220f140) Line 626 + 0x14 bytes C++
ntdll.dll!_NtSetEvent@8() + 0x15 bytes
nss3.dll!_MD_CURRENT_THREAD() Line 314 C
nss3.dll!PR_Unlock(PRLock * lock=0x022531a0) Line 316 C
xul.dll!mozilla::OffTheBooksMutex::Unlock() Line 79 + 0x9 bytes C++
xul.dll!NS_ProcessNextEvent(nsIThread * thread=0x02217301, bool mayWait=false) Line 238 + 0x21 bytes C++
xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate=0x0122a0e0) Line 82 + 0xa bytes C++
xul.dll!MessageLoop::RunHandler() Line 213 C++
xul.dll!MessageLoop::Run() Line 187 C++
xul.dll!nsBaseAppShell::Run() Line 165 C++
xul.dll!nsAppShell::Run() Line 113 + 0x9 bytes C++
xul.dll!XREMain::XRE_mainRun() Line 3856 + 0xa bytes C++
xul.dll!XREMain::XRE_main(int argc=1, char * * argv=0x0088ea90, const nsXREAppData * aAppData=0x0043fce4) Line 3924 + 0x6 bytes C++
xul.dll!XRE_main(int argc=1, char * * argv=0x0088ea90, const nsXREAppData * aAppData=0x0043fce4, unsigned int aFlags=0) Line 4126 + 0xd bytes C++
firefox.exe!do_main(int argc=1, char * * argv=0x0088ea90, nsIFile * xreDirectory=0x02217040) Line 272 + 0x13 bytes C++
firefox.exe!NS_internal_main(int argc=1, char * * argv=0x0088ea90) Line 632 + 0xc bytes C++
firefox.exe!wmain(int argc=0, wchar_t * * argv=0x001d1a00) Line 112 C++
firefox.exe!__tmainCRTStartup() Line 552 + 0x17 bytes C
kernel32.dll!@BaseThreadInitThunk@12() + 0x12 bytes
ntdll.dll!___RtlUserThreadStart@8() + 0x27 bytes
ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes
----
Kevein, does the yarr code not terminate, or does it just get called a lot? Trying to understand what the DOM issue here is....
Er, I meant "Kevin" of course.
Flags: needinfo?(kevin.gadd)
It looked like yarr was terminating; I think I got up to JSC::Yarr::interpret and could see the string being matched against (it was very short). But I'm not 100% sure, because the debugger fell over while I was stepping in and out.
Flags: needinfo?(kevin.gadd)
OK. Is this a regression? Does the problem appear in Firefox 22, say?
Flags: needinfo?(kevin.gadd)
Quoting from original description: Reproduced in FF 24.0a2 (2013-07-03) 32-bit on 64-bit w7, FF 22 32-bit on 64-bit w8 (clean profile), and FF23 64-bit on 64-bit w7 (someone else's machine).
Flags: needinfo?(kevin.gadd)
Oddly I just tried reproducing in a clean profile on Nightly on this machine and it wouldn't repro, but it still reproduces in Aurora. I do think there's a risk of this tumblr changing such that the 'bad content' isn't there anymore (a bunch of it has already changed) but I don't know what to do about it. Is there a way to save the current state of the page such that it will be more likely to reproduce the bug? I can't think of one. I have a couple memory dumps from hung processes that I'll hang onto just in case, but I'd like to be able to ensure this can actually be investigated.
Er, sorry I missed the mention of 22 in there... What about older releases? You can save the page as "web page, complete" and see if the saved version reproduces the problem. If they're not doing too much AJAXy stuff, it should work. Part of my issue here is that the stack in comment 0 is bogus: querySelectorAll can't invoke regexp matching... So it's hard to tell what's really going on.
I can confirm that the regexp matcher does terminate. I can walk all the way up the stack to RunTimeoutHandler. I actually did repro it against nightly, it just seems to be much harder. The trigger seems to be scrolling to the right; it looks like this page has an infinite scrolling mechanism and perhaps the 'hang' is actually layout or something else happening on an extremely large, layout-intense page. The interval between message pumps seems to increase as the page scrolls until it basically hangs. Regexp looks like it's not the only thing running here; it's probably just enough of the CPU time that I happened to hit it every time I paused. If I pause->continue repeatedly I see other stuff happening but it's all under RunTimeoutHandler. Not sure if that makes this DOM or not. If it helps, here's one of the selector strings I saw being parsed for a querySelectorAll call during the hang: "#54014449730 .phorow:not(.row1)" If I step, it appears that this string gets passed to ParseSelectorString repeatedly, but at different memory offsets. It looks like querySelectorAll is being called repeatedly with the same selector, and the caller looks like jitcode.
Sorry, just saw your new comment. I think I agree that the original stack is partially bogus; I keep seeing stacks like that where there are some number of garbage frames between EnterIon or EnterBaseline and the actual code. I don't know what to do about it since I can't step through the jitcode to see what's actually above it. Is there something wrong with the symbols I'm getting from the symbol server? The mInterval on these timeouts that are hanging is 200, so it looks like it's a ton of timeouts firing at once, not one slow or 0ms interval timeout that's tying the browser up. gTimeoutsRecentlySet claims to be 149629, which explains a lot if true. The telemetry 'timeoutsRan' counter appears to be at 74814 right now in this paused browser process. So maybe this is just entirely bad content JS and not anything to do with the Firefox DOM. Not sure why I can't get it to repro in Chrome, or why it repros so easily in my normal browser configuration. Viewport zoom breaking the viewport calculation in their infinite scroll code, maybe?
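The timer numbers above suggest callbacks piling up faster than the main thread can drain them. A minimal back-of-the-envelope sketch of that reading follows. It is a toy model, not Firefox's timer code: the function names are hypothetical, and the per-callback cost is an assumed input that was never measured in this bug. Only the 200 ms interval comes from the comment above.

```python
import math


def callback_time_per_round_ms(active_timers: int, cost_ms: float) -> float:
    """Main-thread time needed to run every active timer callback once."""
    return active_timers * cost_ms


def is_saturated(active_timers: int, interval_ms: float, cost_ms: float) -> bool:
    """True when one round of callbacks takes at least a full timer interval,
    so the next round is already due and the event loop never goes idle."""
    return callback_time_per_round_ms(active_timers, cost_ms) >= interval_ms


def timers_to_saturate(interval_ms: float, cost_ms: float) -> int:
    """Smallest number of repeating timers that keeps the main thread busy."""
    return math.ceil(interval_ms / cost_ms)
```

Under an assumed cost of 2 ms per callback, timers_to_saturate(200, 2) is 100, so a page that keeps adding 200 ms timers without cancelling old ones would stop yielding to the message pump long before it reached tens of thousands of them.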
If the Chrome profiler is accurate (I did notice that if I scrolled for long enough in Chrome, much longer than FF, it began to bog down), the script in question is from the tumblr theme being used, http://www.tumblr.com/theme/35230, and it's calling jQuery's 'each' function and taking forever. It's strange to me that it is seemingly an order of magnitude faster in Chrome, though, since the browser never really hangs there. But it is definitely taking up lots of time. Maybe they cache some data structure to make this pattern perform better?
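To illustrate the kind of caching being guessed at here, the sketch below memoizes a parser keyed on the selector string, so repeated queries with an identical selector parse it only once. This is purely hypothetical: parse_selector is a toy stand-in that splits on whitespace, run_timeouts is an invented driver, and nothing in this bug confirms that Chrome or Firefox does this.

```python
from functools import lru_cache
from typing import Tuple


@lru_cache(maxsize=256)
def parse_selector(selector: str) -> Tuple[str, ...]:
    """Toy stand-in for a selector parser: split on descendant combinators.

    A real CSS parser is far more involved; only the caching is the point.
    """
    return tuple(selector.split())


def run_timeouts(selector: str, times: int) -> None:
    """Simulate many timer callbacks that all query the same selector."""
    for _ in range(times):
        parse_selector(selector)
```

After run_timeouts("#54014449730 .phorow:not(.row1)", 1000), parse_selector.cache_info() reports a single miss, meaning the parse body ran once and the other 999 lookups were served from the cache.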
> Not sure if that makes this DOM or not.
Unclear. It could just be a buggy page that runs a ton of JS, too...
> Is there something wrong with the symbols I'm getting from the symbol server?
Probably not, but unwinding through the stack replacement that Ion does can be interesting. Worth trying a debug build, possibly.
> and the caller looks like jitcode.
In a debug build you could look up what the JS code involved is.
> since the browser never really hangs there
Well, the browser UI and the page JS don't run in the same process in Chrome. Does interaction with the page hang there?
I'm not sure if this is the same bug, but I found a Tumblr page that consistently hangs Firefox (and also Chrome and Edge) whenever I access it. Warning: don't follow the link below unless you want your browser to hang! http://100worstpeopleontwitter.tumblr.com/post/28832853798/67-laura-snapes
platform-rel: --- → ?
Whiteboard: [platform-rel-Tumblr]
platform-rel: ? → ---
Priority: -- → P5
Severity: normal → S3