Closed Bug 1366643 Opened 3 years ago Closed 3 years ago

Tabbrowser should drop bfcache entry after cloned a SHEntry

Categories

(SeaMonkey :: Tabbed Browser, defect, blocker)

defect
Not set
blocker

Tracking

(seamonkey2.52 fixed)

VERIFIED FIXED
seamonkey2.52
Tracking Status
seamonkey2.52 --- fixed

People

(Reporter: freesamael, Assigned: freesamael)

References

Details

(Keywords: verifyme)

Attachments

(2 files)

We're noticing that tabbrowser in seamonkey may move a history entry from a session history to another in bug 1363036.

Although the code seems to work for many years, an nsISHEntry is in fact tightly coupled to the nsISHistory it belongs to, and shouldn't be added to another nsISHistory.

A better way of doing this is to clone the entry and then add the cloned entry to newSH.
(In reply to Samael Wang [:freesamael] from comment #0)
> We're noticing that tabbrowser in seamonkey may move a history entry from a
> session history to another in bug 1363036.
> 
> Although the code seems to work for many years, an nsISHEntry is in fact
> tightly coupled to the nsISHistory it belongs to, and shouldn't be added to
> another nsISHistory.
> 
> A better way of doing this is to clone the entry and then add the cloned
> entry to newSH.

So DXR has incorrect info, the entry is cloned, just it didn't abandonBFCacheEntry().
Summary: Tabbrowser should clone the nsISHEntry before adding it to another nsISHistory → Tabbrowser should drop bfcache entry after cloned a SHEntry
Samael,

thanks for looking into this. I will build a test version and review the patch after work.
(In reply to Frank-Rainer Grahl from comment #3)
> Samael,
> 
> thanks for looking into this. I will build a test version and review the
> patch after work.

In case it feels confusing, what I'm trying to do is that instead of making some hacks inside gecko (that's what I did and got r-'d in bug 1363036), just make the caller avoid doing unsafe operations.
Comment on attachment 8869913 [details] [diff] [review]
Drop the bfcache entry after cloning a SHEntry

Halfway done but no cigar yet. I backed out all patches from the mozilla tree. Do I need to include another one? 

I am afraid the crash report is not good for much because no symbols are available:

https://crash-stats.mozilla.com/report/index/16408508-5e6e-49f1-8ec0-dc9ee0170522

With the patch SeaMonkey no longer crashes when you close a tab. But it does so when undoing the close. I suspect it crashes somehwere in suite\common\src\nsSessionStore.js

Could it be in restoreHistory:

history.addEntry(this._deserializeHistoryEntry(tabData.entries[i],
                                aIdMap, aDocIdentMap), true);

If you don't have any idea I will probably need to do a debug build.
Attachment #8869913 - Flags: review-
Severity: normal → blocker
Version: unspecified → Trunk
(In reply to Frank-Rainer Grahl from comment #5)
> Comment on attachment 8869913 [details] [diff] [review]
> Drop the bfcache entry after cloning a SHEntry
> 
> Halfway done but no cigar yet. I backed out all patches from the mozilla
> tree. Do I need to include another one? 
> 
> I am afraid the crash report is not good for much because no symbols are
> available:
> 
> https://crash-stats.mozilla.com/report/index/16408508-5e6e-49f1-8ec0-
> dc9ee0170522
> 
> With the patch SeaMonkey no longer crashes when you close a tab. But it does
> so when undoing the close. I suspect it crashes somehwere in
> suite\common\src\nsSessionStore.js
> 
> Could it be in restoreHistory:
> 
> history.addEntry(this._deserializeHistoryEntry(tabData.entries[i],
>                                 aIdMap, aDocIdentMap), true);
> 
> If you don't have any idea I will probably need to do a debug build.

In case you're interested, bug 1365897 is from my crash in the SeaMonkey nightly
from 13 May. It includes a stack trace with symbols and I get the same crash at every startup in nightly builds since then. That bug was then RESOLVED DUPLICATE to bug 1363036.

Even without Socorro symbols, if the crashing build is accompanied by a .crashreporter-symbols.zip and/or a .crashreporter-symbols-full.zip, I could produce a human-readable trace out of a crash, though the process is labour-intensive so I prefer to use a nightly build since Socorro would have its symbols on hand.

And if you want to have me test an experimental build, I'm willing. I'm on Linux64 and, as I said, it would be best if you know how to build the .crashreporter-symbols[-full].zip together with the .tar.bz2
(In reply to Tony Mechelynck [:tonymec]. (NEEDINFO me if you want my attention) from comment #6)
> I get the same crash at every startup in nightly builds since then. 

This must be frustrating, I'm very sorry for the pain I caused. I added MOZ_DIAGNOSTIC_ASSERT to generate crashes on nightly intentionally for hunting bug 1363036, that why bug 1365897 was marked duplicate. There may have some ways to workaround bug 1363036 but I was more worry about the bugs hidden behind.

The new patch in review in bug 1363036 will stop the crash, and will reject moving entries across session history (which is really an unsafe / problematic operation). It will also generate a warning message when these unexpected nsISHistory.addEntry / replaceEntry calls happen. The warning message will help us to find what pieces Seamonkey needs to modified for this bug, if I could't fix this bug soon enough.
(In reply to Samael Wang [:freesamael] from comment #7)
> (In reply to Tony Mechelynck [:tonymec]. (NEEDINFO me if you want my
> attention) from comment #6)
> > I get the same crash at every startup in nightly builds since then. 
> 
> This must be frustrating, I'm very sorry for the pain I caused. I added
> MOZ_DIAGNOSTIC_ASSERT to generate crashes on nightly intentionally for
> hunting bug 1363036, that why bug 1365897 was marked duplicate. There may
> have some ways to workaround bug 1363036 but I was more worry about the bugs
> hidden behind.
> 
> The new patch in review in bug 1363036 will stop the crash, and will reject
> moving entries across session history (which is really an unsafe /
> problematic operation). It will also generate a warning message when these
> unexpected nsISHistory.addEntry / replaceEntry calls happen. The warning
> message will help us to find what pieces Seamonkey needs to modified for
> this bug, if I could't fix this bug soon enough.

Oh, I just use builds not having the crash, currently an experimental build made by frg, including the r- patch from bug 1363036 and a few others, such as his own patch for bug 1364677. Before that I used the 12 May nightly, which was built before the crash appeared.

I don't know if the MOZ_DIAGNOSTIC_ASSERT is included in this "home-compiled" build, but I'm attaching the sysout/syserr log from my present run just in case. If the diagnostic messages from this patch are present, I would expect them to be somewhere near the begin.

Apart from the above-mentioned patches, this build is defined as follows:
UA:"Mozilla/5.0 (X11; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0 SeaMonkey/2.52a1" 
ID:20170521130302 en-US 
c-c:9285e79dfd098b4d5e6de6c12550e72c98387ce4 
m-c:5b74bbf20e803e299790d266fc6ebf5d53b7a1b7
(In reply to Frank-Rainer Grahl from comment #5)
> Comment on attachment 8869913 [details] [diff] [review]
> Drop the bfcache entry after cloning a SHEntry
> 
> Halfway done but no cigar yet. I backed out all patches from the mozilla
> tree. Do I need to include another one? 
> 
> I am afraid the crash report is not good for much because no symbols are
> available:
> 
> https://crash-stats.mozilla.com/report/index/16408508-5e6e-49f1-8ec0-
> dc9ee0170522
> 
> With the patch SeaMonkey no longer crashes when you close a tab. But it does
> so when undoing the close. I suspect it crashes somehwere in
> suite\common\src\nsSessionStore.js
> 
> Could it be in restoreHistory:
> 
> history.addEntry(this._deserializeHistoryEntry(tabData.entries[i],
>                                 aIdMap, aDocIdentMap), true);
> 
> If you don't have any idea I will probably need to do a debug build.

This crash hits a different assertion here:
http://searchfox.org/mozilla-central/rev/6c2dbacbba1d58b8679cee700fd0a54189e0cf1b/docshell/shistory/nsSHistory.cpp#1931

It's also an assertion I added before for bug hunting. I thought there's no way a shisotry can be assigned to another docshell.

I suspect it's caused by tabbrowser.restoreTab:
>            // reattach the old history
>            b.webNavigation.sessionHistory = hist;

Using browser.swapFrameLoaders is probably a better way of implementing restoreTab but it needs careful verification to avoid regression, so I'll just remove the assertion from gecko, and try not break this use case in gecko.
Thanks. Let me know if you have a preliminary patch ready.

The SeaMonkey sessionstore/-restore and history stuff is old and needs an urgent update. Already looked at it a few times and if I find some time I will align it with the one in browser. 

I think we may hit a third bug after this is fixed. Backed out yours and sometimes get an out of bounds crash when doing an undo. But first things first :)
(In reply to Frank-Rainer Grahl from comment #10)
> Thanks. Let me know if you have a preliminary patch ready.
> 
> The SeaMonkey sessionstore/-restore and history stuff is old and needs an
> urgent update. Already looked at it a few times and if I find some time I
> will align it with the one in browser. 
> 
> I think we may hit a third bug after this is fixed. Backed out yours and
> sometimes get an out of bounds crash when doing an undo. But first things
> first :)

Would you try the latest patch in bug 1363036 (along with the patch in this bug)?
Flags: needinfo?(frgrahl)
Will do. A bit busy today but should have time tomorrow after work.
Flags: needinfo?(frgrahl)
(In reply to Samael Wang [:freesamael] from comment #11)
> (In reply to Frank-Rainer Grahl from comment #10)
> > Thanks. Let me know if you have a preliminary patch ready.
> > 
> > The SeaMonkey sessionstore/-restore and history stuff is old and needs an
> > urgent update. Already looked at it a few times and if I find some time I
> > will align it with the one in browser. 
> > 
> > I think we may hit a third bug after this is fixed. Backed out yours and
> > sometimes get an out of bounds crash when doing an undo. But first things
> > first :)
> 
> Would you try the latest patch in bug 1363036 (along with the patch in this
> bug)?

I am doing win64 and linux-64 builds with those patches so people can use those to test.  My builds are at https://www.wg9s.com/mozilla/seamonkey/  Linux build should be available by 8:30 AM EDT and Windows by 11:30.
Bill could you check if an undo tab close crashes with a stadard assert (in which case we might still have an issue here) or an element out of bounds assert (which might be another problem).
(In reply to Frank-Rainer Grahl from comment #14)
> Bill could you check if an undo tab close crashes with a stadard assert (in
> which case we might still have an issue here) or an element out of bounds
> assert (which might be another problem).

Sure I can do that.  when you said you did not have time i figured you meant not time to do a build so offered my builds to help.  Can you give me steps to do that caused a crash before?  I just want to be sure I am testing the correct thing.
Bill actually both today. 

Open two or three tabs. Navigate in one to a few websites. Close it. Undo the Close and use Back in it more than one time. If it crashes post the crash id.
(In reply to Frank-Rainer Grahl from comment #16)
> Bill actually both today. 
> 
> Open two or three tabs. Navigate in one to a few websites. Close it. Undo
> the Close and use Back in it more than one time. If it crashes post the
> crash id.
OK to make it trickier i will also try doing the same killing the browser process in the middle and recover doing session restore and then undo the tab close.
(In reply to Frank-Rainer Grahl from comment #16)
> Bill actually both today. 
> 
> Open two or three tabs. Navigate in one to a few websites. Close it. Undo
> the Close and use Back in it more than one time. If it crashes post the
> crash id.

OK trying that procedure it was simple to reproduce a patch on my build form yesterday which had the patch here but not the ones form bug 1363036.  I was unable to reproduce a crash with my build form today which has both the patch form here and the 2 form bug 1363036.
I tried the current SeaMonkey nightly in my day-to-day profile and it crashed while restoring the session, crash bp-83576d9c-bee5-459d-ba17-aa4190170523 [@ nsSHEntry::SetSHistory ] (with symbols)

This is a Mozilla-built nightly described by
20170523003001
https://hg.mozilla.org/comm-central/rev/bd34d80df5ff4929617aaede1ec44230f54d65f7
https://hg.mozilla.org/mozilla-central/rev/5bc1c758ab57

I suppose the Socorro report is the same as in bug 1365897 duped to bug 1363036.

Now I'm going to test Bill's L64 build from comment 13. Stay tuned to this station, further news will be announced as we get them.
(In reply to Tony Mechelynck [:tonymec]. (NEEDINFO me if you want my attention) from comment #19)
> I tried the current SeaMonkey nightly in my day-to-day profile and it
> crashed while restoring the session, crash
> bp-83576d9c-bee5-459d-ba17-aa4190170523 [@ nsSHEntry::SetSHistory ] (with
> symbols)
> 
> This is a Mozilla-built nightly described by
> 20170523003001
> https://hg.mozilla.org/comm-central/rev/
> bd34d80df5ff4929617aaede1ec44230f54d65f7
> https://hg.mozilla.org/mozilla-central/rev/5bc1c758ab57
> 
> I suppose the Socorro report is the same as in bug 1365897 duped to bug
> 1363036.
> 
> Now I'm going to test Bill's L64 build from comment 13. Stay tuned to this
> station, further news will be announced as we get them.

I hope my builds are useful in figuring this out.  the patch for this bug is to fix the crash when you close a tab.  Crashes during an undo close tab or a session restore are what the patch on bug 1363036 is intended to fix.
With Bill's build, as follows:
UA:"Mozilla/5.0 (X11; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0 SeaMonkey/2.52a1" 
ID:20170523003001 en-US 
c-c:bd34d80df5ff4929617aaede1ec44230f54d65f7 
m-c:f8a8a6fbf9bdc62d32031af249b76fbd257f58d5

No startup crash. No crash with the procedure in comment #16. No relevant new console messages compared with attachment 8870275 [details] AFAICT.
My build is actually with the same patches as today's nightly plus those documented on my build page.  I do these builds daily today was the first time I actually tried to do them at 0030 Pacific time because I am not up then so was previously doing them around 6am the next morning my time.  They did not work today.  I think I fixed everything so that my builds will just be the same as the real nightly but with my extra patches.  That is the intent. I have been doing this for quite some time with Firefox builds.
Duplicate of this bug: 1367258
Thanks for the help. Will check it out one last time later when I am home and r+ and check in the patch here then.
Duplicate of this bug: 1367373
Comment on attachment 8869913 [details] [diff] [review]
Drop the bfcache entry after cloning a SHEntry

Samael,

as Bill and Tony already have reported: This fix together with the two in Bug 1363036 fix the close and undo issues. 

I am also no longer see my suspected third bug:
> ElementAt(aIndex = 4294967295, aLength = 0)
> Crash Reason 	EXCEPTION_BREAKPOINT

Big thanks again for helping out. Checking this one in.
Attachment #8869913 - Flags: review- → review+
https://hg.mozilla.org/comm-central/rev/69c2a5c364a16698a552dc4d2cb39958e356cfde
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → seamonkey2.52
Alas, SeaMonkey is currently not building, due (I'm told) to bug 1367337. Once it starts building again, I'll be sure to test it with my previouly "always crashing" session.
UA:"Mozilla/5.0 (X11; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0 SeaMonkey/2.52a1" 
ID:20170527011755 en-US 
c-c:49a68316ebabe052b42cee27985fdbc1015feb45 
m-c:ebad93e117700d8e2d65573b824beb18a8cc2030

Building has resumed, this is the first hourly after the interruption and it did not crash at startup. If the crash does not reappear in a week I shall declare the fix Verified for L64. Other platforms I cannot test, except maybe L32-on-L64 whi I don't regard as significant.

A couple of weird things: Unlike the 12-May nightly and frg's home build, it displays very few of the current session's favicon (on my "flowing tabs" on several rows by means of userChrome.css) and (considering that I load tabs "on demand") tabs other than the first don't load spontaneously after making them current but only after clicking the Go button. These quirks are, I think, unrelated to the present bug though, I may report them in a different bug if I get around to it and it won't prevent "verifying" this bug.
Keywords: verifyme
Whiteboard: [set Verified for L64 if not REOPENED on 2017-06-03, see comment # 29]
VERIFIED for lack of dissent to comment #29 after not only a week, but even 1½ months.
Status: RESOLVED → VERIFIED
Whiteboard: [set Verified for L64 if not REOPENED on 2017-06-03, see comment # 29]
You need to log in before you can comment on or make changes to this bug.