Closed Bug 1616004 Opened 4 years ago Closed 4 years ago

Crash in [@ mozilla::image::RecyclingSourceSurface::Unmap], broken nsHttpTransaction.mConnection

Categories

(Core :: Networking: HTTP, defect, P1)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED DUPLICATE of bug 1617142
Tracking Status
firefox-esr68 --- unaffected
firefox73 --- unaffected
firefox74 --- unaffected
firefox75 --- fixed

People

(Reporter: gsvelto, Unassigned)

References

Details

(Keywords: crash, regression, regressionwindow-wanted)

Crash Data

This bug is for crash report bp-7e670035-6f2c-4886-bde6-27cae0200215.

Top 10 frames of crashing thread:

0 xul.dll mozilla::image::RecyclingSourceSurface::Unmap image/RecyclingSourceSurface.h:42
1 xul.dll mozilla::net::nsHttpTransaction::Close netwerk/protocol/http/nsHttpTransaction.cpp:1112
2 xul.dll mozilla::net::nsHttpConnectionMgr::OnMsgCancelTransaction netwerk/protocol/http/nsHttpConnectionMgr.cpp:2525
3 xul.dll mozilla::net::ConnEvent::Run netwerk/protocol/http/nsHttpConnectionMgr.cpp:258
4 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1220
5 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:481
6 xul.dll mozilla::net::nsSocketTransportService::Run netwerk/base/nsSocketTransportService2.cpp:1135
7 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1220
8 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:481
9 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:332

The stacks aren't great here but this looks like a NULL-pointer access within a RecyclingSourceSurface object. The mSurface field is possibly null. The oldest crash has buildid 20200214130325.

Component: Graphics → ImageLib
Priority: -- → P3

My buildid is 20200223093951 and hitting the same bug here. ( https://crash-stats.mozilla.org/report/index/feee5d7b-ed87-4fc9-84fa-0981b0200223 )

This bug bothers me a lot these days. (There are some other crashes caused by mozilla::net::nsHttpConnection::OnWriteSegment)

(In reply to slanterns.w from comment #2)

This bug bothers me a lot these days. (There are some other crashes caused by mozilla::net::nsHttpConnection::OnWriteSegment)

https://i.loli.net/2020/02/24/OZoy6FMREqv4cLS.png

I've hit this crash five times over the last week, three times today.

100% of the 131 crash reports are from Windows (7, 8.1, and 10). Perhaps this a Windows-specific issue or just that a lot of Nightly users use Windows.

The oldest crash has buildid 20200214130325.

Here is the mozilla-central pushlog for the 24-hours priority to build id 20200214130325:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=2f6870dd1b99edaba1de4d2aa97f3910a640c5bf&tochange=ca97340e64722330f45043b3fa8a819ad7c59835

@ Hiro, could this RecyclingSourceSurface crash be a regression from enabling lazy image loading in bug 1613611?

Flags: needinfo?(hikezoe.birchill)
Keywords: regression
Regressed by: 1613611

yeah, plausible. But I have no knowledge about network stack.

Honza, though I believe this is not the case what you were concerned about in a review comment, any insight on this crash?

Also, can someone who can see the sites caused in the crash reports see whether there is any loading=lazy images in the sites? document.querySelector("img[loading=lazy]") would be sufficient. If there are iframes, it's not sufficient though.

Flags: needinfo?(hikezoe.birchill) → needinfo?(honzab.moz)

Looking at these crash reports' URLs, I don't see a pattern, though one of the crashing pages was a tutorial about lazy loading images. 😁 That strong suggests this crash is related to lazy loading.

https://medium.com/@Grigorkh/native-lazy-loading-of-images-and-iframes-6931fe455632

NI to :aosmond too since the crash happens at RecyclingSourcesSurface::Unmap(). mSurface seems null there. Is it possible that RecylingSourceSurface::Unmap() gets called via nsHttpTransaction::Close? I couldn't find such code paths at all.

Flags: needinfo?(aosmond)
See Also: → 1617880

hiro, I don't believe it is possible. However since this question has been asked before, I filed bug 1617880 to use NotNull with RecyclingSourceSurface just to make sure nothing is going wrong.

Flags: needinfo?(aosmond)

The nsHttpTransaction.mConnection seems to be broken. Rather closing this bug, this seems security sensitive.

Group: core-security, network-core-security

This is very unlikely caused by bug 1613611

Dragana, Kershaw, do you know about anything we have recently landed that would cause this? Could it be somehow socket process related? It's not clear which process is crashing from the report.

Component: ImageLib → Networking: HTTP
Priority: P3 → P1
No longer regressed by: 1613611
Summary: Crash in [@ mozilla::image::RecyclingSourceSurface::Unmap] → Crash in [@ mozilla::image::RecyclingSourceSurface::Unmap], broken nsHttpTransaction.mConnection

FYI these crashes are all happening in the main process, on the Socket thread.

FYI, the crashes usually happen when I close a tab (the browser freeze, and then crash).

Honza: the crashes appear to all be a null address which doesn't seem so scary. Can you explain what security issue you're worried about so we can give this an appropriate security rating?

Group: core-security

Nightly no longer crashes these days.

Interesting.

I was looking at the code and we do a non-null check just before the line we do the dereference.

  if (mConnection) {
    connReused = mConnection->IsReused(); // <-- crashing here, with a vtable pointing at a different object, hence mConnection seems freed/broken
    isHttp2 = mConnection->Version() >= HttpVersion::v2_0;
  }

Could be a race (mConnection is nullified on a different thread between the two lines and mozilla::image::RecyclingSourceSurface::Unmap is a red herring), too.

The fact that the crash is gone or morphed could be because of recently landed bug 1582663 (ten days ago, on 28-02-20).

Flags: needinfo?(kershaw)
Flags: needinfo?(honzab.moz)
Flags: needinfo?(dd.mozilla)

(In reply to Honza Bambas (:mayhemer) from comment #15)

Interesting.

I was looking at the code and we do a non-null check just before the line we do the dereference.

  if (mConnection) {
    connReused = mConnection->IsReused(); // <-- crashing here, with a vtable pointing at a different object, hence mConnection seems freed/broken
    isHttp2 = mConnection->Version() >= HttpVersion::v2_0;
  }

Could be a race (mConnection is nullified on a different thread between the two lines and mozilla::image::RecyclingSourceSurface::Unmap is a red herring), too.

The fact that the crash is gone or morphed could be because of recently landed bug 1582663 (ten days ago, on 28-02-20).

There is only one place where mConnection is setting to null on main thread.
But it's less likely that nsHttpTransaction::Close is called while the same transaction is destructing.

Flags: needinfo?(kershaw)

I know what is the problem here http3Session is dereferencing connection that is nullptr. This has been fixed in 1617142.

Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(dd.mozilla)
Resolution: --- → DUPLICATE
Group: network-core-security
You need to log in before you can comment on or make changes to this bug.