779581 - Unrecoverable browser hang (seemingly in the network stack/image loader)

Reporter

Description

•

12 years ago

I accidentally opened up an old test document from an app that I worked on that used an embedded HTML renderer to display UI. For some reason, this document hangs Firefox solid without any crash/hang recovery possible.

I can't tell precisely what is going wrong, but if I look at it in the debugger, it seems to be trapped in an infinite loop attempting to load resources from a custom URI scheme that the embedded renderer exposed (editor://) - under normal circumstances i would just expect this to fail and not display a resource, since the loads are being triggered by <img> tags. Instead, it seems to display a few broken images and then completely choke and never recover.

I don't necessarily think that this page should *work* in FF, but opening it should definitely not make the browser choke and hang. Chrome handles it just fine and it used to work in FF back when I was working on it.

Honza Bambas (:mayhemer)

Comment 1

•

12 years ago

Can you reproduce with a clean new profile?

http://support.mozilla.com/en-US/kb/Managing-profiles#w_creating-a-profile

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Comment 2

•

12 years ago

I could.

Katelyn Gadd (:kael)

Reporter

Comment 3

•

12 years ago

I tested in a local trunk build with a clean profile, yes.

Patrick McManus [:mcmanus]

Comment 4

•

12 years ago

it repros easily with a windows nightly. hard for me to get a stack right now cause i'm in constant meetings but the ui thread is obviously spinning at 100%

Katelyn Gadd (:kael)

Reporter

Comment 5

•

12 years ago

When I paused a few times in a local build in the debugger, it seemed to be stuck in nsImageLoadingContent::LoadImage, on a call to StringToURI. The actual URI in question seemed to bounce between two different URIs (both with the editor:// prefix), so probably two different images.

Loic

Comment 6

•

12 years ago

It's a regression. Disabling HWA doesn't help.

Mozregression range:
m-c
good=2010-11-16
bad=2010-11-17
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a42e9b001bc8&tochange=ad227939db82

Keywords: regression

Loic

Comment 7

•

12 years ago

Maybe this suspected bug:
Ehsan Akhgari — Bug 612565 - Don't recurse into document modification listeners for editable documents if the modification to the document is made on the bogus BR node; r=roc a=blocking-beta8+ This fixes an unresponsive script dialog error for Etherpad pages (possibly among other websites).

Honza Bambas (:mayhemer)

Comment 8

•

12 years ago

Is anybody actively diagnosing this?  I could take a look...

Boris Zbarsky [:bzbarsky]

Comment 9

•

12 years ago

I think the key part is this, in doneLoading:

  var doneLoading = function (evt) {
    tiles.appendChild(tile);

this will kick off a new load for the tile off the append, and when that load completes the append will happen again, which will do a remove and readd and start a _new_ load, etc.

And each time through will kick off a loadTile on the next index to boot.

We have some protection against unnecessary loads, but that relies on being able to create an nsIURI from the src, which may be what's failing here.

I have no idea why the regression range in comment 6 matters here.

Katelyn Gadd (:kael)

Reporter

Comment 10

•

12 years ago

The number of loads should be bounded though - iirc it's set to 200 - which would suggest that it should recover from the hang eventually.

Or are you saying that doneLoading has a bug because - for example - the load event can be fired multiple times, or the error event can be fired multiple times, or both events can be fired? I'll admit I don't think I ever considered those possibilities when writing this originally.

Loic

Comment 11

•

12 years ago

(In reply to Boris Zbarsky (:bz) [In and out Aug 1 - 10, out Aug 11-20] from comment #9)
> I have no idea why the regression range in comment 6 matters here.

Before 2010-11-16: no hang, Firefox is able to display the page with the HTML editor
http://i.imgur.com/HvyKU.png

After 2010-11-17: UI hangs, almost impossible to navigate in Firefox.

Boris Zbarsky [:bzbarsky]

Comment 12

•

12 years ago

> Or are you saying that doneLoading has a bug

Yes.  Its error/load handler moves the image in the DOM, but moving the image starts a new load which calls error/load, etc.  So yes, they can fire multiple times on a given image, if you move it around, at least in Gecko right now.

> Before 2010-11-16: no hang

I get that.  What I don't see is what exactly in that range would have changed the behavior or why it used to work before.

Katelyn Gadd (:kael)

Reporter

Comment 13

•

12 years ago

Thanks Boris. Can you point me to the relevant part of the relevant spec(s) that explains what circumstances would cause multiple load/error events? This probably affects other applications I've written and it's probably something other developers I know have been doing too.

What approach would avoid triggering this? Would display: none not trigger a new load because it only affects layout and display, not DOM structure?

Boris Zbarsky [:bzbarsky]

Comment 14

•

12 years ago

The spec situation here is ... unclear.  Moving a node in the DOM can change its base URI (due to xml:base).  So what we do is we start a load every time a node is moved in the DOM which the HTML spec says nothing about, but the HTML spec also doesn't consider xml:base.  Then we optimize the load away if the URI is the same as the URI of the last load and the last load didn't get blocked by content policy (this last check is because in general content policies depend on position in the DOM!).

In this case the URI is editor://something, which is an unknown URI scheme.  Unknown protocols are flagged as URI_DOES_NOT_RETURN_DATA, and nsNoDataProtocolContentPolicy blocks such loads for images so we don't end up optimizing away loads on movement in the DOM in this case.  But all of that has been in place since 2007 or so.

(no longer active)

Comment 15

•

12 years ago

Please let me know if I can help with this bug somehow.  The stuff that is being discussed here is not very familiar to me, FWIW!

Patrick McManus [:mcmanus]

Comment 16

•

8 years ago

unfortunately the repro URL is gone

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → INCOMPLETE

Bugzilla

Quick Search

Unrecoverable browser hang (seemingly in the network stack/image loader)

Categories

(Core :: Networking, defect)

Tracking

()

People

(Reporter: kael, Unassigned)

References

(
URL
)

Details

(Keywords: regression)

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Comment 12

Comment 13

Comment 14

Comment 15

Comment 16