Closed Bug 274784 (blazinglyfastback) Opened 20 years ago Closed 19 years ago

Make back and forward blazingly fast and side-effect free

Categories

(Core :: DOM: Navigation, defect, P2)

defect

Tracking


RESOLVED FIXED
mozilla1.8beta2

People

(Reporter: brendan, Assigned: bryner)

References

(Depends on 12 open bugs, Blocks 1 open bug)

Details

(Keywords: perf, Whiteboard: (o72555) Set browser.sessionhistory.max_viewers to nonzero to test)

Attachments

(9 files, 6 obsolete files)

1.36 KB, patch
shaver
: review+
Details | Diff | Splinter Review
316 bytes, text/html
Details
501 bytes, text/html
Details
146.10 KB, patch
bzbarsky
: review-
bzbarsky
: superreview+
brendan
: approval1.8b2+
Details | Diff | Splinter Review
146.49 KB, patch
Details | Diff | Splinter Review
688 bytes, text/html
Details
690 bytes, text/html
Details
164.66 KB, patch
Details | Diff | Splinter Review
1.57 KB, patch
dbaron
: review+
dbaron
: superreview+
Details | Diff | Splinter Review
This bug is where the real work to fix bug 38486 will go on. That bug is a pile of noise and a useless soapbox on which free-lunch-demanders wish to stand. /be
Status: NEW → ASSIGNED
Priority: -- → P2
Target Milestone: --- → mozilla1.8alpha6
Can this bug block bugs 91351 and 163993 as the original bug did? I think it's important for those tracking (though I realize it may attract people to this bug).
Bug 91351 seems fine. I'm not marking it as blocking 163993, since it's not known whether this idea is a good thing yet.
Blocks: 91351
In the death-bug, I said: To do this properly, I think you have to separate the JS global scope object from the "window" object; which is something I think we should do for security reasons anyway. I don't see why scripting is a big deal; if we're going back/forward to a page which has already "loaded", there is no reason to fire onload events. Brendan then said, "we must fire onload when reloading from history, for backward compatibility -- so must every other browser" And Hixie said: <Hixie> opera doesn't do that <Hixie> it's the source of many of our dom bugs <Hixie> (to do with history) <Hixie> but we consider them worth the price of lightning fast back/forward
I could see a pref for fast vs. compatible back/forward being worth the UI, but before conceding that point, perhaps we could do some analysis of pages that break if they don't get reloaded (from cache) on session history navigation (back/forw). /be
Ian, can you give a few examples of URLs that break without onload-on-Back? /be
Not off-hand, no.
Interestingly enough, I just came across a page of discussion on this bug, except in reverse and for Opera: http://www.quirksmode.org/bugreports/archives/2004/11/load_and_unload.html - if you're still looking for an example of a page broken by this functionality, I'm sure you could ask him. It seems as though the Opera folks really haven't found a solution either; they just went in the opposite direction. Would it be possible to simply not load from cache if the page includes onLoad or onUnload events? That would improve performance (storage speed notwithstanding) for most pages, while retaining compatibility for JavaScript-generated pages, wouldn't it? Oh, also, does anyone have some kind of estimate regarding how much storage this'd take?
Is it not feasible to cache the work of the parser? I was under the impression the XUL cache does that. It won't cause any back/forward quirks, and it could be used for all page loads, not just for backing up to a previous page. If an html page is to be loaded from the disk cache, check for an associated parser cache entry. If it's there, use it to load the page.
But I, at least, don't want the parser results when I go back. I want exactly the same DOM tree that I left when I went forward. I'm pretty sure that not firing the onload event would be the correct behavior in this case; I just want to see what parts of the existing web this would break, and why. Perhaps we need to implement the "onenter" event mentioned in the linked opera thread.
hm, I guess not firing onload would break things that do stuff that's not reflected in the DOM... (window opening... alerts... confirm()... others?)
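The breakage worried about here — onload work that isn't reflected in the DOM — can be illustrated with a small simulation. This is purely an illustrative sketch (not actual Gecko code): a page whose onload mutates external state behaves differently depending on whether Back refires onload.

```javascript
// Sketch: a page's onload increments an external visit counter (think
// cookie, counter in another frame, alert(), etc.). With the classic
// reload-on-Back behavior the counter advances again; with a cached
// restore that skips onload, it does not. All names are illustrative.
function simulateNavigation(fireOnloadOnBack) {
  const external = { visits: 0 };
  const onload = () => { external.visits++; };

  onload();                        // initial page load
  // ...user navigates away, then goes Back...
  if (fireOnloadOnBack) onload();  // reload-from-cache behavior refires onload
  return external.visits;
}
```

For a page that only twiddles DOM state in onload, skipping the second call is harmless; for one that changes external state, the two behaviors diverge exactly as the comment describes.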
A lot of sites use onload events or top-level scripts, but most of these sites would not be broken by DOM-storing. For example, a site that just twiddles element positions would not be affected, and a site that prompts for your age while loading or in onload would be *improved* by DOM-storing. The only onload actions that might need to be redone are actions that change external state, such as variables in another frame/window or a cookie. I would expect many sites that use scripts to alter external state to have onunload events that undo those changes. Is that true -- do most sites broken by Opera's DOM-storing feature use onunload? If so, Firefox could use the following heuristic: store the DOM in history unless the site has an onunload event. To summarize the possibilities: A) Store everything. Drop support for onunload. (Opera) B) Store unless site uses onunload. (My suggestion) C) Store unless site uses onunload, onload, or inline scripts. (Comment 7) (A) or (B) could be combined with adding support for onenter and onleave.
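The three options in the comment above can be written as a one-function caching policy. This is an illustrative decision sketch only — the policy letters match the comment, but the flag names and function are invented for the example, not Gecko code.

```javascript
// Option "A": store everything (Opera's approach).
// Option "B": store unless the site uses onunload.
// Option "C": store unless it uses onunload, onload, or inline scripts.
// `page` is a plain object of boolean flags; names are illustrative.
function shouldStoreDOM(policy, page) {
  switch (policy) {
    case "A": return true;
    case "B": return !page.hasUnload;
    case "C": return !(page.hasUnload || page.hasLoad || page.hasInlineScript);
    default:  throw new Error("unknown policy: " + policy);
  }
}
```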
What I might like to see is an option to have Back/Forward behave essentially as though each page were in a separate tabbed window, because that is how I have been working around the current behavior: when I want to follow a link from which I expect I may want to go Back, I open the link in a new tab, then use Shift+Ctrl+Tab to go Back and Ctrl+Tab to go Forward; when a page is out of date, I do Ctrl+R. When combined with using tabs for branches in the browsing path/sequence, this approach is not ideal, but for plain linear browsing I find it works great, so I think having the equivalent behavior using Back/Forward rather than tabs would be even better. I guess the key thing is that I can be the one who decides the behavior regarding reloading, rather than the browser having to figure out what to do for me, when it may not have enough understanding of what is going on at the web-application level to make a good choice. I would still want an option to have the browser treat Back/Forward as movement along a path of pages where only one page can be visible/visited at a time, rather than the other model, in which any page that has been visited may still be visible in its previous state regardless of whatever pages have been visited since then, and it is up to the user to worry about treating the set of pages as a sequence.
What do you plan to do about plugins? Re comment 11: I agree that (B) is reasonable. However, that makes the assumption that onbeforeunload handlers are idempotent while onunload handlers are not. The way onbeforeunload works (prompt using a returned string) suggests that typical onbeforeunload handlers probably ought to be idempotent. I wonder if we could get away with making the same assumption about onunload. It doesn't seem entirely unreasonable. (How widely used are both of these?)
There's another tricky case: forms that use onclick/onsubmit. These scripts could be setting hidden values or making other modifications to form fields before they are submitted. OTOH, these pages are already in trouble, since form fields are restored when you go back to a page. But there are actually pages that break on this already (can't come up with an example off the top of my head). I think the best way to deal with those pages is some sort of 'onenter' or 'onreturn' event. Also, it would be useful to be able to tag a form field (or maybe an entire form) for being automatically reset on this event. This is useful for things like navigation selects, or search fields (it's always confusing returning to a Google search page and seeing the value you searched for _leaving_ that page, rather than the value you searched for _getting_ that page). In fact, this stuff could be independent of this bug, though if we did store the DOM it would be useful if the event indicated whether the DOM was restored or just the form values. Something for the WHATWG; I'll see if I can write up a formal proposal...
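For what it's worth, the event described here is close to what Gecko eventually shipped as pageshow/pagehide, where the event's persisted flag tells the page whether its DOM was restored from the cache. A hedged handler sketch (the field names beyond event.persisted are illustrative):

```javascript
// 'onenter'-style handler, modeled on the pageshow event: when the page
// comes back out of the cache (event.persisted === true), reset a search
// field so the user doesn't see the value they typed *leaving* the page.
// `searchField` stands in for a DOM input element; names are illustrative.
function handlePageShow(event, searchField) {
  if (event.persisted) {
    searchField.value = ""; // DOM was restored from cache: clear stale input
  }
  return searchField.value;
}
```

In a real page this would be wired up as `window.addEventListener("pageshow", ...)`; the sketch takes plain objects so the logic stands alone.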
Actually, what happens if I leave a page, resize the window, then go back to the page? Is there an onload event? Is there an onresize event? If there is neither, how is the page to reposition things if it does the usual "position it all in JS" thing? As for doing prototype stuff like XUL, it's been thought of.... it might still happen. It wouldn't really solve this bug as filed, though, I suspect.
(In reply to comment #15) > Actually, what happens if I leave a page, resize the window, then go back > to the page? There's an onresize event.
Whiteboard: (o72555)
Target Milestone: mozilla1.8alpha6 → mozilla1.8beta2
Blocks: 203448
I'm not gonna have time for this till 1.9, but bryner has already started hacking on it, and we've been talking. He's the man for this bug, so reassigning. /be
Assignee: brendan → bryner
Status: ASSIGNED → NEW
Hixie, anyone: do any of Opera, Safari, or IE implement an onreturn or onenter event handler as sicking proposed? (Isn't onback the obvious handler name? Hmm, not if you go forward. onhistorymove?) It would be good to build on a de-facto standard, if one exists and doesn't suck. /be
Alias: blazinglyfastback
We were this close back when jband and I last messed around with this odd case. The idea is that you have o1, you want to resolve id in it, but o1[id] doesn't exist. o1's class has a new-resolve hook that defines id in some o2 not == o1 and not on o1's prototype chain. That should work, because all callers of js_LookupProperty* (direct or via OBJ_LOOKUP_PROPERTY) respect the out params. This patch fixes the bug that broke it. For this bug, consider window == o1, an inner window object that's N:1 against the window, where N is the length of session history as o2. bryner, find me on IRC if you run into further problems. They should all be fixable. Thanks, /be
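The actual fix here is in the C JSAPI resolve hook, but the window split it enables — an outer window (o1) that forwards lookups to whichever inner window (o2) is current, N:1 across session history — can be sketched as a JS analogy. This is purely illustrative of the lookup indirection, not of the JSAPI change itself.

```javascript
// JS analogy for the outer/inner window split: the outer object owns no
// properties itself; every get/set is forwarded to the *current* inner
// window, which changes as you navigate through session history.
// Names are illustrative.
function makeOuterWindow(getInner) {
  return new Proxy({}, {
    get(_, prop) { return getInner()[prop]; },
    set(_, prop, value) { getInner()[prop] = value; return true; }
  });
}
```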
Attachment #177491 - Flags: review?(shaver)
Comment on attachment 177491 [details] [diff] [review] fix so JSNewResolveOp can return an unrelated object In my feverish haze, that looks great! r=shaver
Attachment #177491 - Flags: review?(shaver) → review+
Comment on attachment 177491 [details] [diff] [review] fix so JSNewResolveOp can return an unrelated object Checked in, although this won't be needed for now, if ever, by this bug. /be
Depends on: 285404
bryner's got a patch, it's getting there. Please don't spam this bug with notes about other bugs. Yeah, once bryner's patch lands, and if you go back within the limit of in-memory-cached history items, we will comply with RFC 2616. Woo. /be
Blocks: 288462
*** Bug 288432 has been marked as a duplicate of this bug. ***
Attached patch work in progress (obsolete) — Splinter Review
This is where I'm at right now. There are a number of known issues here: - random crashes, often when closing windows - plugins need to be stopped and restarted - timeouts need to be stopped and restarted - need to fire a new event so that password autocomplete can run when going back - random rendering problem with reload button on mac - the cache size needs to be tweaked, and ideally based on available RAM - need to handle window size changes between saving and restoring presentation - do something with marquee
One source of crashes is when an iframe has focus as you close a window -- this has to do with the non-sticky case in DocumentViewerImpl::Hide. Working on that now.
bryner, what's the plan for loads that aren't done when we leave a document? Say an image is still loading? Or the document itself is not done loading?
(In reply to comment #27) > bryner, what's the plan for loads that aren't done when we leave a document? > Say an image is still loading? Or the document itself is not done loading? Brian and I spoke about this today. I think we agreed that partially loaded pages should not be put in the back cache since they might compete against the newly loading page for network and CPU bandwidth. Either that, or we should find some way to suspend the loading of an incomplete page... but that sounds very complicated.
> since they might compete against the newly loading page for network and CPU > bandwidth. I was thinking more in terms of sharing the same loadgroup, so affecting onload and the like... ;) I agree that just not putting incompletely loaded pages in the back cache is a simple solution for the time being.
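The rule agreed on here — don't cache a presentation whose loadgroup still has active requests other than the request for the new document — reduces to a small predicate. An illustrative sketch only; the real check lives in nsDocShell::CanSavePresentation and the names below are invented.

```javascript
// Decide whether a presentation may be saved in the back cache: every
// request still active in the loadgroup must be the request for the
// newly loading document. An in-flight image, XMLHttpRequest, etc.
// disqualifies the page. Names are illustrative, not Gecko code.
function canSavePresentation(activeRequests, newDocumentRequest) {
  return activeRequests.every(req => req === newDocumentRequest);
}
```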
Attached patch patch for review (obsolete) — Splinter Review
I think this still has a few problems. In fact, I'm pretty sure of it. But I think we should get this in, turned on, so that we can start getting crashes isolated with Talkback, etc. and we can have this stable by Firefox 1.1.
Attachment #179734 - Attachment is obsolete: true
Attachment #181003 - Flags: superreview?(jst)
Attachment #181003 - Flags: review?(roc)
This is targeted at 1.8b3, not b2, right?
I'll do what I can but I think bz-review would be invaluable also.
I'd certainly like to see it for b2 so that we can get testing exposure in the Firefox developer preview. Unless it turns out to be so bad that it masks all other problems with the developer preview, in which case we'd be best off getting it in as soon as possible after that release.
I've been using bf-cache builds for days and not had any problems that would warrant it being turned off. This should totally be checked in, without a pref.
> This should totally be checked in, without a pref. Pref'd on to be sure, but without a pref to disable just in case?!? A simple pref that lets me disable bf-cache to see if it is the culprit in some instance would be invaluable IMO.
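Per the whiteboard note on this bug, the knob in question is browser.sessionhistory.max_viewers: nonzero enables caching that many viewers, 0 disables it. A user.js sketch (exact default values aside, this is how the pref being argued for would be used):

```javascript
// user.js sketch: disable the back/forward cache to rule it out when
// debugging, or set a small nonzero value to enable it for testing.
user_pref("browser.sessionhistory.max_viewers", 0);  // disabled
// user_pref("browser.sessionhistory.max_viewers", 5);  // cache 5 viewers
```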
Some non-substantive comments: Shouldn't a Sanitize() failure cause the DOM to automatically not be cached? It looks like we just ignore the return from Sanitize()... Also, what happens if content fires DOMPageRestore events? Wouldn't autofill trigger on those? Do we need some isTrusted checks here? I'd really rather we stopped putting our custom events in the DOM* namespace. How about Moz* event names instead?
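The isTrusted concern above is that content could synthesize a DOMPageRestore event and trick autofill into firing. A minimal guard sketch (illustrative only — the event name is the one proposed in this bug, and the handler is not actual chrome code):

```javascript
// Only act on restore events generated by the browser itself: content-
// synthesized events have isTrusted === false and must be ignored, or a
// hostile page could trigger password autofill. Names are illustrative.
function shouldHandleRestore(event) {
  return event.type === "DOMPageRestore" && event.isTrusted === true;
}
```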
mIsGoingAway is never set for a document with this patch, as far as I can see. That seems wrong. It'll lead to extra work in RemoveStyleSheet() when the document is finally destroyed. It may also be worth seeing when we currently enter GetScriptGlobalObject() while mIsGoingAway is true, since the behavior of those codepaths will change with this patch (in particular, if they actually need a script global they'll break). I'm assuming you did some basic testing and verified that the destructors for the documents you evict from the cache are in fact firing, right? Removing the code that tears down the DOM in SetScriptGlobalObject is a tad scary, but if we can do that without effectively leaking, that's great. Some subclasses override SetScriptGlobalObject and do nontrivial things in it. If you're setting it to null and then back to non-null on the same document, for example, this should cause nsImageDocument to create a new root and call SetRootContent on it, triggering some asserts. Worth checking on this and nsPluginDocument's behavior. If a document is loaded but there are still network requests in the loadgroup (background images, loads started _after_ the onload event, etc), we don't deal at all well, as far as I can tell -- either they end up in the loadgroup of the new document we're loading (which is the same exact loadgroup), or we kill them and probably don't restart them properly on restore. We need to figure out how to do this right (while recalling that at the moment reflow events, say, look like a foreground request in the loadgroup). More comments later....
I'd vote for no pref because we're getting drowned in prefs. If this is checked in _with_ a pref then I would say we absolutely must remove at least two other prefs first. (I'm talking hidden prefs here of course, not UI prefs.)
Blocks: 290394
I was using this patch for two days, and had to go back to a default build because of intermittent problems with keyboard focus. Keyboard navigation was sometimes totally broken, and while usually a click inside the window helped, that trick didn't work here. Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8b2) Gecko/20050413, GTK2/Xft
I've been using a build with this patch for two days without any problems. It works great. I built Linux builds with this, and a few folks used them - also without a problem. Today I'm going to build a Windows one, to get feedback from Win users.
Comment on attachment 181003 [details] [diff] [review] patch for review roc said it was fine to switch reviewer of record to bz, who has started already. roc's gonna have a look too. jst is on vacation still, so I'm going to sr. At this point, apart from overt bugs and design mistakes that will be hard to fix the longer we wait, I'd like to see review happen in the next couple of days, so we can get this patch in and get wider testing. /be
Attachment #181003 - Flags: superreview?(jst)
Attachment #181003 - Flags: superreview?(brendan)
Attachment #181003 - Flags: review?(roc)
Attachment #181003 - Flags: review?(bzbarsky)
I'm not going to be able to do a thorough review of this till early next week....
Attached patch new patch (obsolete) — Splinter Review
Issues fixed since the last patch: - Added a null check in the accessibility code so that we don't crash if the document has no container. I don't actually think this is related to my changes but I hit it while testing, and aaronl's recent checkin seems to be the culprit. - Added additional checking in nsDocShell::CanSavePresentation to avoid saving the presentation when we're reloading the page. - Added save and restore of the docshell tree (oops) - Refactored timeout suspend/resume code and made sure to suspend and resume for child windows as well.
Attachment #181003 - Attachment is obsolete: true
Attachment #181463 - Flags: superreview?(brendan)
Attachment #181463 - Flags: review?(bzbarsky)
Comment on attachment 181003 [details] [diff] [review] patch for review removing obsolete requests
Attachment #181003 - Flags: superreview?(brendan)
Attachment #181003 - Flags: review?(bzbarsky)
The patch changes the behavior on this simple testcase. Replace the CGI with one that does JPEG push or XMLHttpRequest activity or whatever to get an idea of where this could become a pretty major issue. The problem, of course, is that a page transition stops all network activity in a docshell's loadgroup. This is desirable, since the loadgroup belongs to the _docshell_. But then when we fast-back we don't restart the network requests we killed.
I'd like to see some documentation in nsDocShell.h. Specifically, I'd like to see documentation as to the lifetime and ownership model of mContentViewer, and documentation on what the three methods you're adding do, when it's OK to call them, what happens if aURI doesn't match the URI in aSHEntry in RestorePresentation(), and what's an acceptable SHEntry to pass to RestorePresentation. I'm still sorting through the subframe navigation handling, so if you actually wrote down somewhere how that works with this patch (given that cloning the session history entries doesn't clone this cached presentation stuff), let me know please... I suspect it works out in some cases because we only actually do a load when we have a different history entry, on the specific docshell the entry is different for, but doesn't this mean that going back to the result of a subframe navigation across a toplevel load will be slow?
More things to check on I thought of... 1) Is there an issue with the cached prescontexts holding a backpointer to the docshell (the "container")? It's an nsWeakPtr, so it's not that it can start pointing to deallocated memory or anything, but could something in the prescontext or some prescontext consumer make a call on the docshell when the docshell doesn't expect it? I seem to recall anchors getting the docshell for traversal via this pointer. What happens if someone fires a (trusted, so say from chrome) DOM click event on a pointer to a node in a document that we've moved away from? 2) Are there issues with the device context being shared by the cached viewmanagers and the current live one? I'm not sure what could arise here; roc might know this part better... So if we go back to a frameset from a non-frame page, do we create new docshells for the subframes? And if so, do they create new device contexts? Or is the device context held by the content viewer?
(In reply to comment #45) > a docshell's loadgroup. This is desirable, since the loadgroup belongs to the > _docshell_. But then when we fast-back we don't restart the network requests > we killed. I'm going to post a new patch in a minute that avoids this problem by not caching the presentation if there are any active requests in the load group (other than the request for the new document). That way we don't have to figure out how to restart something like an XMLHttpRequest. (In reply to comment #46) > them, what happens if aURI doesn't match the URI in aSHEntry in That would be bad and should not happen. Do you know of a case where it does? > I'm still sorting out through the subframe navigation handling, so if you > actually wrote down somewhere how that works with this patch (given that cloning > the session history entries doesn't clone this cached presentation stuff), let > me know please... I suspect it works out in some cases because we only actually > do a load when we have a different history entry, on the specific docshell the > entry is different for, but doesn't this mean that going back to the result of a > subframe navigation across a toplevel load will be slow? I'm not entirely clear on how it does or should work either, or I would. Do you think just copying the nsIContentViewer pointer to the cloned entry would fix this? It seems like if you do that, once you restore one of the history entries, you would need to nuke the saved presentation pointer from any other entries that are using it.
(In reply to comment #48) > traversal via this pointer. What happens if someone fires a (trusted, so say > from chrome) DOM click event on a pointer to a node in a document that we've > moved away from? > The intent is that the cached document is completely dead to events. You have to explicitly go to session history to get a pointer to one, so I don't think we need to plug this at every entry point. > 2) Are there issues with the device context being shared by the cached > viewmanagers and the current live one? I'm not sure what could arise here; roc > might know this part better... So if we go back to a frameset from a non-frame > page, do we create new docshells for the subframes? And if so, do they create > new device contexts? Or is the device context held by the content viewer? If we go back to a frameset, the subframe docshells are cached with the toplevel content viewer (because the subdocuments will have a reference to them). So they will retain their original device contexts.
> not caching the presentation if there are any active requests in the load group What if the active request is a pending reflow? Or a pending image error or load event? Same thing? As far as that goes, what _is_ the plan for pending PLEvents of various sorts when the presentation is cached? Do we want them firing on the cached presentation? > That would be bad and should not happen. Then why do you need both the URI and the SHEntry? If the URI is always the URI in the SHEntry, just pass the latter? > Do you think just copying the nsIContentViewer pointer to the cloned entry > would fix this? I'm not sure, for the same reasons you're not sure (ownership issues, Hide/Show ordering, etc). > You have to explicitly go to session history to get a pointer to one Why? What if you're holding a pointer to a document (say in a different frame on the same website) and then the document gets unloaded before you do something with it?
Attached patch new patch (obsolete) — Splinter Review
Changes since the last patch: - Added a check for active requests in the docshell's load group, and skip caching the presentation if there are any. - Got rid of nsDocument::mIsGoingAway in favor of checking for null SGO. - Fixed nsEventListenerManager::HasUnloadEvents to check for beforeunload as well. - Made nsPluginDocument recreate its mStreamListener when restored from session history. - Moved nsISecureBrowserUI ownership to the docshell, so that we can always get at this to cache it (it formerly lived on the XUL <browser>, in a way that was inaccessible from C++). Added capture and restore of security state so that alerts fire as expected when going back/forward. - Added suspend and resume of meta-refresh loads. To do this I ended up adding to nsITimer so that you can retrieve the nsITimerCallback object. This doesn't address Boris's most recent comments yet, but I wanted to get this new code up.
Attachment #181463 - Attachment is obsolete: true
Attachment #181797 - Flags: superreview?(brendan)
Attachment #181797 - Flags: review?(bzbarsky)
(In reply to comment #51) > What if the active request is a pending reflow? Or a pending image error or > load event? Same thing? > > As far as that goes, what _is_ the plan for pending PLEvents of various sorts > when the presentation is cached? Do we want them firing on the cached presentation? > No, I don't really want anything to fire on the cached presentation. If it's not something that we can figure out how to suspend and resume, then I'd say it should be a no-cache condition. I'd rather start off overly-conservative and then gradually expand the types of pages that we can cache. > Then why do you need both the URI and the SHEntry? If the URI is always the URI > in the SHEntry, just pass the latter? > Could... aURI originates in LoadHistoryEntry if aSHEntry is non-null, so it looks like they definitely have to correspond. > > You have to explicitly go to session history to get a pointer to one > > Why? What if you're holding a pointer to a document (say in a different frame > on the same website) and then the document gets unloaded before you do something > with it? The document needs to act as if it has no presentation, in that case.
> - Got rid of nsDocument::mIsGoingAway in favor of checking for null SGO. Note that a document created via the DOMImplementation should have a null SGO, last I checked. Not sure about documents loaded via XMLHttpRequest and such, but that may be the case there too... I need to think about how embedding apps are affected by the secure browser UI change. In general, any time we have to use nsIDocShell in our chrome that means the functionality is not available to embeddors (since nsIDocShell is not frozen and never will be). > No, I don't really want anything to fire on the cached presentation. Then we probably need to audit all places where we create and post PLEvents and figure out what's supposed to happen for all of them.... Similar for various timers (the caret blink timer comes to mind). > The document needs to act as if it has no presentation, in that case. I'm not sure this patch accomplishes that. One other thing to watch out for -- computed style. Once someone gets a computed style object, we need to somehow invalidate it (or something?) when the document is cached.
(In reply to comment #54) > I need to think about how embedding apps are affected by the secure browser UI > change. In general, any time we have to use nsIDocShell in our chrome that > means the functionality is not available to embeddors (since nsIDocShell is not > frozen and never will be). > Darin and I went through this... the secure browser UI object is not accessible via any public interface on nsWebBrowser, either. So it should be no change. > > No, I don't really want anything to fire on the cached presentation. > > Then we probably need to audit all places where we create and post PLEvents and > figure out what's supposed to happen for all of them.... Similar for various > timers (the caret blink timer comes to mind). Right, we could also try harder to suspend things that might post such events. Even if we did that, it might not hurt to have a global "inactive" flag for a presentation that would kill anything like this. > > > The document needs to act as if it has no presentation, in that case. > > I'm not sure this patch accomplishes that. > No, I don't think it does. I'd say unless there are immediate known issues, that could be addressed later. > One other thing to watch out for -- computed style. Once someone gets a computed > style object, we need to somehow invalidate it (or something?) when the document > is cached. It might not hurt to see what Safari does in this case (and the above case, as well).
So just to be clear, I've somewhat accepted that we plan to check this in with known issues. I'd like to make sure those issues are not ones that affect the basic architecture of the design. Things like deciding on the ownership of things, a clear statement of our design invariants, and a plan for enforcing those invariants are, in my mind, prerequisites for sorting out what issues can be solved in the current design....
Attached patch new patchSplinter Review
Made sure to disable any active carets before putting a document into session history. Also added lots of comments in nsDocShell.h about how this all works.
Attachment #181797 - Attachment is obsolete: true
Attachment #181828 - Flags: superreview?(brendan)
Attachment #181828 - Flags: review?(bzbarsky)
Comment on attachment 181828 [details] [diff] [review] new patch bryner, great work. sr+a=me with the major issues below fixed before checkin; minor issues for your consideration only. My marks carry over if you attach a new patch fixing these issues shortly. With bz agreeing, I think you should check this in as soon as possible. Followup bugs should be filed and fixed as usual for any big patch, particularly on the XXX and XXXbryner issues, which should, if you follow Nat Friedman's wise advice, be FIXME: <bug#> comments instead. (So do file followup bugs now). It would be good to have a followup bug about better global-LRU memory pressure instead of the at-most-five-by-default pref. I'll file it if you like; let me know. /be Major issues: CopyJSPropertyArray should use the JS_GetStringChars, JS_GetUCPropertyAttributes, JS_GetUCProperty, and JS_DefineUCProperty APIs to handle full Unicode. If any of those APIs fail, there was a JS error reported (OOM) or set as a pending exception (other than OOM), so you should propagate failure from CopyJSPropertyArray. It looks like user-defined window-level getters and setters (which will be enumerable) will be replaced by undefined values, which is bad. Instead of + jsval propvalue; + if (!::JS_GetProperty(cx, aSource, propname_str, &propvalue)) { + NS_WARNING("property was found, but get failed"); + continue; + } + + PRBool res = ::JS_DefineProperty(cx, aDest, propname_str, propvalue, + NULL, NULL, attrs); you want to declare JSPropertyOp getter, setter; call JS_GetUCPropertyAttrsGetterAndSetter (I just added this API) instead of JS_GetPropertyAttributes; and define the (Unicode-named) property with getter and setter passed instead of NULL and NULL. Why don't you have to save and restore mTimeoutInsertionPoint as well as mTimeouts? Minor issues: Nit: use two-space indenting in nsDocument::Sanitize(). 
Nit: localize input and form to their respective for loop body blocks in nsDocument::Sanitize() Nit: SubDocEnumData *args = NS_STATIC_CAST(SubDocEnumData*, arg); inSubDocHashEnum to avoid data->data (args->data, nice). In light of that: SubDocEnumArgs, not SubDocEnumData. DOMPageRestore is ok by me -- we should standardize this via whatwg or (hah! I'll believe it when I see it) the re-forming w3c DOM group. Uber-nit: in nsDocShell::CanSavePresentation, // difficult (i.e. XMLHttpRequest). s/i.e./e.g./ In nsDocShell::RestorePresentation, // Pulling the viewer out of the CachedPresentation will thaw() it. thaw refers to some old name for what method? open?
Attachment #181828 - Flags: superreview?(brendan)
Attachment #181828 - Flags: superreview+
Attachment #181828 - Flags: approval1.8b2+
Status: NEW → ASSIGNED
Flags: blocking1.8b2+
Oh, and bryner found what I should have remembered to mention: intervals need mWhen updating on resume. And we agreed timeouts and intervals should resume with whatever time remains in the current timeout or interval period, to avoid skewing outstanding timeouts from any related intervals. /be
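The agreed-upon scheme — suspend by converting each absolute deadline into time remaining, resume by rebasing that remainder onto the current clock — can be modeled outside of Gecko. This is a minimal standalone sketch; the Timeout struct and function names here are illustrative stand-ins for nsGlobalWindow's timeout list, not the real code:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative stand-in for a window timeout/interval entry; the real
// code keeps an absolute firing time (mWhen) per timeout.
struct Timeout {
    int64_t when;      // absolute deadline while running, time remaining while suspended
    int64_t interval;  // repeat period for intervals, 0 for one-shot timeouts
};

// On suspend, convert each absolute deadline to "time remaining",
// clamped at zero so overdue timeouts fire immediately on resume.
void SuspendTimeouts(std::vector<Timeout>& timeouts, int64_t now) {
    for (Timeout& t : timeouts)
        t.when = std::max<int64_t>(0, t.when - now);
}

// On resume, rebase the remaining time onto the current clock. Because
// only the base changes, the phase between a timeout and any related
// interval is preserved, which is the skew problem described above.
void ResumeTimeouts(std::vector<Timeout>& timeouts, int64_t now) {
    for (Timeout& t : timeouts)
        t.when += now;
}
```

For example, a timeout due at t=130 suspended at t=100 keeps 30 units of remaining time, and after resuming at t=500 fires at t=530.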
Comment on attachment 181463 [details] [diff] [review] new patch patch is obsolete. Removing review-request.
Attachment #181463 - Flags: superreview?(brendan)
Attachment #181463 - Flags: review?(bzbarsky)
Comment on attachment 181797 [details] [diff] [review] new patch Patch is obsolete, removing review-request.
Attachment #181797 - Flags: superreview?(brendan)
Attachment #181797 - Flags: review?(bzbarsky)
Running with this, lots of windows, tabs, plugins, maps.google.com... crashed here:

#0  0x006837a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x008ea624 in raise () from /lib/tls/libpthread.so.0
#2  0x080545fe in nsProfileLock::FatalSignalHandler ()
#3  <signal handler called>
#4  0xb6f00c9e in nsDocShell::CanSavePresentation () from /home/brendan/src/firefox-opt/mozilla/dist/bin/components/libdocshell.so
#5  0xb6f03458 in nsDocShell::CreateContentViewer () from /home/brendan/src/firefox-opt/mozilla/dist/bin/components/libdocshell.so
#6  0xb6f0c7ea in nsDSURIContentListener::DoContent () from /home/brendan/src/firefox-opt/mozilla/dist/bin/components/libdocshell.so

Looking at the disassembly before $eip in frame 4, I'm thinking

    nsCOMPtr<nsIDocument> doc = do_QueryInterface(pWin->GetExtantDocument());
    if (doc->HasRequests(aNewRequest))
        return PR_FALSE;

needs a null check. bryner?

/be
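For what it's worth, the fix being suggested is just a null guard before the dereference, since do_QueryInterface can legitimately return null. A trivial standalone model of the guarded check (Document here is a stand-in for nsIDocument, not the real interface):

```cpp
#include <cassert>

// Stand-in for nsIDocument; only the one method the check needs.
struct Document {
    bool hasRequests;
    bool HasRequests() const { return hasRequests; }
};

// Sketch of the corrected check: refuse to cache the presentation when
// the document pointer is null instead of dereferencing it.
bool CanSavePresentation(const Document* doc) {
    if (!doc)
        return false;            // no document: nothing safe to cache
    return !doc->HasRequests();  // active requests also block caching
}
```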
Nits: printf("Copied window property: %s\n", propname_str); won't work too well now that propname_str is jschar *. /be
###!!! ASSERTION: Already have a root content! Clear out first!: '!mRootContent ', file c:/mozilla.org/trunk2/mozilla/content/base/src/nsDocument.cpp, line 1501

I'm guessing that's comment 37? Things get really bad after that assert.
Perhaps nsGenericElement::ShouldFocus should default to false when there is no script global? That would make a lot more sense to me...
Comment on attachment 181828 [details] [diff] [review] new patch Drive-by review comments: +class WindowStateHolder : public nsISupports [...] + JSRuntime *mRuntime; In Mozilla there's only ever one JSRuntime, and you can always get to that through the nsIJSRuntimeService, so no need to store that here IMO. Or is this potentially needed after the service is no longer reachable? If so, add comments about that. Also nsDocument::Sanitize() should use 2-space indentation.
Attachment #181828 - Flags: superreview+ → superreview?(brendan)
Comment on attachment 181828 [details] [diff] [review]
new patch

I finally built a tree with this patch and did a run with XPCOM_MEM_LEAK_LOG set. Just start up and shut down. With this patch, that leaks 22 nsDocument objects (and all sorts of other stuff that they're holding references to, of course). That's not so good.

Then I tried using this patch. Loaded http://lxr.mozilla.org/seamonkey/source/layout/base/nsCSSFrameConstructor.cpp then clicked the "CVS Log" link. So far so good. Then clicked back (which was fast). Then clicked forward. Then clicked back and watched nsCSSFrameConstructor be reparsed. I guess the problem is that I'm using SeaMonkey, so the max_viewers pref ended up being 0. But if it's zero, why did we cache the viewer for nsCSSFrameConstructor the first time around? That should be fixed.

If I start the browser, open a new tab, load about:config in the new tab, filter for "svg", right-click on the svg.enabled pref, toggle it, and shut down the browser, I leak two more documents (24 instead of 22). That may just be because I loaded more XUL documents, of course.

I got some assertions that seem to be caused by this patch:

###!!! ASSERTION: Can't get interface requestor: 'ifreq', file /home/bzbarsky/mozilla/xlib/mozilla/extensions/typeaheadfind/src/nsTypeAheadFind.cpp, line 2096

###!!! ASSERTION: failed to get the nsIScriptGlobalObject. bug 95465?: 'boundGlobal', file /home/bzbarsky/mozilla/xlib/mozilla/content/xbl/src/nsXBLPrototypeHandler.cpp, line 423

The former I can reproduce at will. Start the browser, hit Ctrl-T to open a new tab, then focus the content area and hit Ctrl-W to close the tab. Gives that assert and some nice frame manager warnings. The latter assert, I'm not quite sure how I got... I managed to reproduce it once more while messing with about:config for the "svg" pref, though.

With this patch, I still fail the "resuming network activity" testcase, most of the time (more on this when I comment on the docshell changes).
When I _do_ pass it, going forward again I get:

###!!! ASSERTION: Request list is not empty.: 'mRequests.entryCount == 0', file /home/bzbarsky/mozilla/xlib/mozilla/netwerk/base/src/nsLoadGroup.cpp, line 410
###!!! ASSERTION: Foreground URLs are active.: 'mForegroundCount == 0', file /home/bzbarsky/mozilla/xlib/mozilla/netwerk/base/src/nsLoadGroup.cpp, line 411

On the "forms in cached document submit" testcase, if I just click the link in the subframe and then go back (without clicking the button), I get:

###!!! ASSERTION: Unexpected root view: 'rootView == origView->GetViewManager()->RootViewManager()->GetRootView()', file /home/bzbarsky/mozilla/xlib/mozilla/view/src/nsViewManager.cpp, line 4108

on every paint... Sounds like restoring viewmanager state for subdocuments doesn't quite work.

Moving on to the patch itself:

>Index: browser/app/profile/firefox.js

Please make a similar change to browser-prefs.js. Or maybe the C++ code should default to something nonzero here, so embedders get this feature automatically (and those who don't want it have to disable it explicitly)?

>Index: content/base/public/nsIDocument.h
>@@ -52,6 +52,7 @@
>+#include "pldhash.h"

Why? I don't see you using it anywhere in this header...

>+  /**
>+   * Sanitize the document by resetting all input elements and forms that have
>+   * autocomplete=off to their default values.

Do we really want to be that specific? What effect, if any, should Sanitize have on XForms? Might want to file a followup bug on that. Certainly not an initial-landing blocker.

>+  typedef PRBool (*nsSubDocEnumFunc)(nsIDocument *aDocument, void *aData);

What does the return value mean?
>Index: content/base/src/nsDocument.cpp >@@ -1616,7 +1618,12 @@ nsDocument::RemoveStyleSheet(nsIStyleShe >+ // If there is no script global object, we assume that either: >+ // 1) Document teardown is in progress >+ // 2) The document is in session history not being shown; it >should not be That's not a good assumption for data documents, etc (though they may not ever get RemoveStyleSheet called on them; I'm not sure). Wouldn't it make sense to key off of mInDestructor here, if that's the only place where we remove all our content? >@@ -1832,7 +1839,7 @@ nsDocument::GetScriptGlobalObject() cons >+ if (!mScriptGlobalObject) { > nsCOMPtr<nsIInterfaceRequestor> requestor = > do_QueryReferent(mDocumentContainer); I don't see anything in this patch that changes a document's container... So it looks to me like cached documents will point to their original docshell as the container, and will point to the window in that docshell as their script global. Is that really desirable? jst might be able to say for sure, but it seems wrong to me... Then again, it's no different from what we already had if someone held a pointer to a document past the point where the window loaded a new document... >@@ -1847,37 +1854,6 @@ nsDocument::GetScriptGlobalObject() cons > nsDocument::SetScriptGlobalObject(nsIScriptGlobalObject *aScriptGlobalObject) I'll bet that the changes to this function are causing the leaks I observed, fwiw. >+nsDocument::Sanitize() Is there a reason we don't have to worry about autocomplete="off" <textarea> elements? >+SubDocHasRequests(PLDHashTable *table, PLDHashEntryHdr *hdr, >+ PRUint32 number, void *arg) >+ PRBool hasRequest = subdoc ? PR_FALSE : subdoc->HasRequests(nsnull); That should be the other way around? 
>Index: content/events/public/nsIEventListenerManager.h >+ * Allows us to quickly determine if we have unload or beforeunload s/if/whether/ >Index: content/events/src/nsDOMEvent.cpp >+ "DOMPageRestore", "submit", "reset", "change", "select", "input", "paint", I'm still not so happy with putting this in the DOM spec event name namespace. I guess if jst and brendan both feel this part is ok I'll let it be, though... >Index: content/html/content/src/nsGenericHTMLElement.cpp I still think it's odd that we plan to allow anchors in cached documents to perform page traversals in the relevant docshell (which they get off the prescontext). See the "anchors in cached documents" testcase. Note that that testcase updates the status bar too. As the "forms submit" testcase shows, we also allow forms to submit... It sounds like we need a much clearer concept of what is and what is not allowed in these cached documents and then we need to make sure that we actually satisfy these constraints somehow... We also allow focus() and blur() to do something useful. That's probably wrong... For Focus() it can be fixed easily, but blur() basically assumes that as long as we have a prescontext blurring is ok. I'm not sure we can just pretend to have no prescontext either -- there are various DOM changes that would need to be propagated to the frame tree as things stand (inline style changes, image URI changes, etc). Maybe it would make sense to drop the cached DOM if it gets mutated? >Index: content/html/document/src/nsPluginDocument.cpp >+ } else if (!mStreamListener) { >+ mStreamListener = new nsMediaDocumentStreamListener(this); >+ // XXX no way to indicate OOM I'm not sure what this is trying to do... The mStreamListener is what data is being pumped into by necko when the document starts loading. We really shouldn't need it after we finish loading the document, I suspect; we certainly shouldn't need to recreate it on fast-back... 
If you're trying to make the plugin show up again on back, that's not the way to do it; you'd have to reopen the network stream yourself or something. To test, load a full-page plugin (http://www.copyright.gov/legislation/dmca.pdf is a good example), go to another page, go back. Notice lack of plugin after back. I don't see changes to nsImageDocument here. Didn't I point out that this patch breaks it? To test, load an image (https://bugzilla.mozilla.org/mozilla-banner.gif will do nicely), then go somewhere, then go back. Note the assert. After this operation, we have a DOM node around that thinks it's in the DOM when it really isn't. After that, some finagling should probably cause it to crash. >Index: content/xbl/src/nsXBLPrototypeHandler.cpp >- focusController = privateWindow->GetRootFocusController(); >+ if (privateWindow) >+ focusController = privateWindow->GetRootFocusController(); I think if we have no privateWindow in here we should be bailing out altogether... I can't think of a reason we'd want to run XBL command handlers in a cached document, in general.
Attachment #181828 - Flags: superreview?(brendan) → superreview+
As I said a few weeks ago on IRC, I don't think a cache of 5 documents per-session-history object is the right approach. I think we should determine when to evict documents based on a global MRU list (perhaps around 25?). Users who keep lots of tabs or windows open probably don't want 5 documents hanging around for all of them, especially ones that haven't been touched for days.
(In reply to comment #70) > As I said a few weeks ago on IRC, I don't think a cache of 5 documents > per-session-history object is the right approach. I think we should determine > when to evict documents based on a global MRU list (perhaps around 25?). Users > who keep lots of tabs or windows open probably don't want 5 documents hanging > around for all of them, especially ones that haven't been touched for days. I'm sure everyone agrees, I mentioned darin and I discussing this last week -- but this should be a followup bug. There's no point in adding to the hurdles for the patch in this bug. Please file one. /be
> I'm sure everyone agrees, I mentioned darin and I discussing this last week --
> but this should be a followup bug. There's no point in adding to the hurdles
> for the patch in this bug. Please file one.

Yeah, brendan and I spoke about possibly implementing a memory pressure observer since not all pages are equal in size and it is difficult to estimate the size of a page. We of course only implement the memory pressure system on Windows, but in theory it seems like the right approach. We monitor overall memory usage by the browser, and then ask subsystems to reduce memory usage when we see the overall memory usage of the browser increasing. Of course, something simple for bfcache in the meantime seems ideal ;-)
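For illustration, the browser-global MRU that dbaron suggests (one list capped at roughly 25 viewers, rather than five per session history) might look like this. This is only a sketch under assumed names (ViewerMRU, Touch), not the eventual Gecko implementation; string keys stand in for (session history, entry) pairs, and a real version would call Destroy() on the evicted content viewer rather than just dropping a key:

```cpp
#include <cassert>
#include <list>
#include <string>

// Sketch of a browser-global MRU of cached presentations.
class ViewerMRU {
public:
    explicit ViewerMRU(size_t cap) : mCap(cap) {}

    // Touch marks an entry most-recently-used, inserting it if absent
    // and evicting the least-recently-used entry when over capacity.
    // Returns the evicted key, or "" if nothing was evicted.
    std::string Touch(const std::string& key) {
        mList.remove(key);        // O(n), fine for a small cap like 25
        mList.push_front(key);
        if (mList.size() > mCap) {
            std::string victim = mList.back();
            mList.pop_back();     // real code would destroy this viewer
            return victim;
        }
        return "";
    }

    size_t Size() const { return mList.size(); }

private:
    size_t mCap;
    std::list<std::string> mList;  // front = most recently used
};
```

Because the cap is global, a user with many idle tabs only pays for the N most recently touched presentations overall, rather than five per tab.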
Comment on attachment 181828 [details] [diff] [review] new patch >Index: docshell/base/nsDocShell.cpp >+#include "nsICaret.h" I'm not really so thrilled with making docshell depend on caret... If we can move caret-disabling into presshell (which already knows about caret), great. If not, I guess I can deal. >+nsDocShell::SetSecurityUI(nsISecureBrowserUI *aSecurityUI) >+{ >+ mSecurityUI = aSecurityUI; Do we want to assert here if both mSecurityUI and aSecurityUI are non-null? Or do we really support that correctly? Note to self: This adds another strong ref from docshell to window. Need to update the map. >+nsDocShell::SuspendRefreshURIs() >+ for (PRUint32 i = 0; i < n; ++i) { >+ timer = do_QueryElementAt(mRefreshURIList, --n); That doesn't look right (both incrementing i and decrementing n). In particular, say n == 3. We put the callback from slot 2 in slot 0, then put the callback from slot 1 in slot 1. The callback in slot 0 got lost. >+nsDocShell::CanSavePresentation(nsIRequest *aNewRequest) >+ nsCOMPtr<nsPIDOMWindow> pWin = do_QueryInterface(mScriptGlobal); >+ if (pWin && pWin->IsLoading()) Can pWin be null here? >+ nsCOMPtr<nsIDocument> doc = do_QueryInterface(pWin->GetExtantDocument()); If so, this will crash. I think it makes more sense to return false on null pWin, really. >+ if (doc->HasRequests(aNewRequest)) Unfortunately, this call is far too late... This method is being called when we're setting up the new content viewer, but all network activity for the old document was already stopped by the Stop() call in nsDocShell::InternalLoad. This is why this patch still fails the "resuming network activity" testcase... >+ // Only save presentation for "normal" loads and link loads. Anything else >+ // probably wants to refetch the page, so caching the old presentation >+ // would be incorrect. It might be safe to also cache for LOAD_NORMAL_REPLACE, but that's followup bug material.... 
>+  // If there is an unload or beforeunload listener on the window, then we do
>+  // not save the presentation into the session history cache.

What if there's an unload or beforeunload listener on a subframe of this window? In general, it seems to me that we want to save only if we can save the presentation for this docshell _and_ all its descendants.

>+nsresult
>+nsDocShell::CaptureState()

This method, or something like it, needs to be called recursively on all kids of this docshell. We don't want to do the "save the kids in the SHEntry" part for all descendants, and we probably don't want to save the script global object's properties (since we're not clearing them), but we _do_ want to suspend all timeouts and intervals in subframes, suspend refresh timers, capture any security UI state as needed, etc. As you pointed out we do suspend timeouts and intervals on the kids, but refresh timers in the kids look like they would still fire.

>+nsDocShell::RestorePresentation(nsISHEntry *aSHEntry)
>+  if (!viewer) {
>+    return NS_ERROR_FAILURE;

I think NOT_AVAILABLE would make more sense here.

>+  nsresult rv = mContentViewer->Open();
>+  NS_ENSURE_SUCCESS(rv, rv);

On failure here we have both the SHEntry and the docshell holding a ref to the viewer, which violates the invariant we have about it being owned by one or the other. Fix that?

>+  // restore the sticky state of the viewer
>+  PRBool sticky;
>+  aSHEntry->GetSticky(&sticky);
>+  mContentViewer->SetSticky(sticky);

It's a little confusing to me that we're saving this state in the viewer but restoring it here.... I guess by this point the viewer doesn't have a ref to the SHEntry, right? Might be worth a comment here that we're undoing what Destroy() on the viewer did.

>+  // save the previous content viewer size
>+  nsRect oldBounds(0, 0, 0, 0);
>+  aSHEntry->GetViewerBounds(oldBounds);

Not only are we not saving anything here, oldBounds is completely unused...
So I'm not sure what this code or the code that saved the bounds in the SHEntry is trying to do, exactly. >+ // Insert the new root view at the correct location in the view tree. >+ if (rootViewParent) { ... >+ if (oldPresShell) >+ oldPresShell->InvalidateViewManagers(); Did you mean to call that on the _new_ shell? I don't see us fixing up the root viewmanager pointer for the new viewmanager anywhere; that's probably causing those asserts I saw. >+ RefreshURIFromQueue(); >+ >+ mOSHE = aSHEntry; >+ EndPageLoad(nsnull, nsnull, NS_OK); EndPageLoad already calls RefreshURIFromQueue, no? So probably no need to do it here. I assume that this avoids firing onload from EndPageLoad because mEODForCurrentDocument is true, right? May be worth documenting that here. >+ if (rootSH && (mLoadType & LOAD_CMD_HISTORY)) { Can we get here if this is not a history load? Perhaps it would make more sense to assert up front in this method that |mLoadType & LOAD_CMD_HISTORY|? >+ nsCOMPtr<nsIPresShell> shell; >+ nsDocShell::GetPresShell(getter_AddRefs(shell)); >+ >+ if (shell) { (when setting the new size) This is duplicating the code for the case when we have a rootViewParent. Perhaps we should unconditionally do some of this earlier so we don't have two copies of this code within 20-some lines of each other? >+ if (mScriptGlobal) { This method earlier asserts that it got a privWin, so mScriptGlobal isn't null by now (or we've crashed). >+ mScriptGlobal->HandleDOMEvent(presContext, &restoreEvent, nsnull, >+ NS_EVENT_FLAG_INIT, &status); I assume there's a reason we don't need to allocate the private data here? >+ // Restart plugins >+ if (shell) >+ shell->StartPlugins(); Is it safe to do this synchronously? That is, can't the plugins run random script as they instantiate (if nothing else, by firing broken-plugin events)? If they can, we could reenter docshell code from in here.... is that something we're OK with? Where do we restore caret visibility? Or do we? 
>@@ -4727,7 +5209,10 @@ nsDocShell::CreateContentViewer(const ch
>+  if (savePresentation) {
>+    // The old content viewer will be saved during the call to Embed().
>+    rv = CaptureState();
>+    gSavingOldViewer = PR_TRUE;
>+    // XXX this isn't really fatal, necessarily...
>+    // should we try to fire unload if NS_FAILED(rv) ?

Indeed. Also, if CaptureState fails we want to NOT cache the presentation (because we may not have disabled parts of it that need to be disabled).

>+  gSavingOldViewer = PR_FALSE;

I'd be much happier with a member var here, just in case Embed causes something else to happen in some other docshell somewhere... If we can do that without increasing docshell size, great. Otherwise, I'll take bigger size in return for safer, clearer code.

>@@ -5005,6 +5499,7 @@ nsDocShell::SetupNewViewer(nsIContentVie
>+  nsCOMPtr<nsIDOMDocument> oldDocument;

That seems to be write-only. Is it just holding a reference so the document won't die, or something else? If this really needs to be here, please document why so it won't get removed?

>@@ -5627,6 +6131,20 @@ nsDocShell::InternalLoad(nsIURI * aURI,
>+  if (!CanSavePresentation(nsnull))
>+    FireUnloadNotification();
>+
>+  mFiredUnloadEvent = PR_FALSE;
>+
>+  CaptureState();

Do we really want to do that if CanSavePresentation returned false? I don't think we do.

>Index: docshell/base/nsDocShell.h
>@@ -395,6 +397,49 @@ protected:
>+  // Captures the state of the supporting elements of the presentation
>+  // (the "window" object, docshell tree, meta-refresh loads, and security
>+  // state) and stores them on |mOSHE|.
>+  nsresult CaptureState();

Are there non-fatal error returns? If so, which?

>+  // Restores the presentation stored in |aSHEntry|.
>+  nsresult RestorePresentation(nsISHEntry *aSHEntry);

Again, there are basically nonfatal error returns (eg if we couldn't cache the presentation for aSHEntry for some reason).
I'm still not quite sure the order of the various calls in this code is right; I'd have to spend another day or so sorting through this stuff to make sure... Titles are probably not updating because SetTitle is called by the content sink on the document, which passes it on to the docshell. So on restore we probably want to get the title from the document and set it on ourselves. The fact that we're not firing any of the usual progress notifications other than OnLocationChange probably means we're breaking some extensions. I'd like to see some reasoning behind why we think it's OK to not fire the other expected web progress listener notifications here (and darin should probably check that reasoning out, since we're basically changing the behavior of the frozen progress listener apis). >Index: docshell/base/nsIContentViewer.idl It would be nice to document >@@ -68,4 +69,21 @@ interface nsIContentViewer : nsISupports >+ * (*) unless the document is currently being printed What will happen if it _is_ being printed? One other thing that sort of bothers me in all this code is that if we fail somewhere we leave the SHEntry holding references to all sorts of stuff (eg whatever propertties were set on the window). It'd really be good to clean that up better. >Index: dom/public/base/nsPIDOMWindow.h >@@ -139,6 +145,12 @@ public: >+ // Returns an object containing the window's state. >+ virtual nsresult SaveWindowState(nsISupports **aState) = 0; This should probably indicate that it also suspends all current timeouts, etc, and stores them all in the state.... That is, this not only saves things into the state but also changes the state of the window. >Index: dom/public/coreEvents/nsIDOMLoadListener.h >+ /** >+ * Processes a DOMPageRestore event. This is dispatched when a page's >+ * presentation is restored from session history (onload does not fire >+ * in this case). Will this fire only for the "topmost" page being restored (unlike onload, which fires for all subframes too)? 
That's what the code looked like to me... Should this have a behavior more like onload? >Index: dom/src/base/nsGlobalWindow.cpp >+ WindowStateHolder(JSContext *cx, JSObject *aObject, nsGlobalWindow *aWindow); Document the args? >+ JSObject* GetObject() { return mJSObj; } Document this? Same for other methods on this class? >+ nsCOMPtr<nsIEventListenerManager> mListenerManager; >+ nsCOMPtr<nsIDOMElement> mFocusedElement; >+ nsCOMPtr<nsIDOMWindowInternal> mFocusedWindow; I assume you've checked these for cycles and there are no problems? >+CopyJSPropertyArray(JSContext *cx, JSObject *aSource, JSObject *aDest, >+ if (!::JS_IdToValue(cx, props->vector[i], &propname_value) || >+ !JSVAL_IS_STRING(propname_value)) { >+ NS_WARNING("Failed to copy non-string window property\n"); Can that actually happen? If so, shouldn't we actually propagate the error and fall back on normal loads instead of just warning and leading to broken behavior? Is "non-string" referring to the value of the property? Or its name? I'm not really so much following this code, so I'm assuming Brendan has given it a careful review. >+ if (!::JS_GetProperty(cx, aSource, propname_str, &propvalue)) { >+ NS_WARNING("property was found, but get failed"); Again, we should error out. >+ JSObject *stateObj = ::JS_NewObject(cx, &sWindowStateClass, NULL, NULL); >+ NS_ENSURE_TRUE(stateObj, NS_ERROR_OUT_OF_MEMORY); So we have a newly allocated unrooted JSObject... >+ CopyJSProperties(cx, mJSObject, stateObj); And we make JS_DefineUCProperty calls in here. Those can trigger GC. >+ *aState = new WindowStateHolder(cx, stateObj, this); Which means that stateObj may have been GCed by now. I think we want to do all this copying either in the WindowStateHolder or at least after we've created the holder and rooted stateObj. 
>+nsGlobalWindow::RestoreWindowState(nsISupports *aState)
>+  nsIDOMElement *focusedElement = holder->GetFocusedElement();
>+  if (focusedElement) {
>+    nsCOMPtr<nsIDOMNSHTMLElement> htmlElement =
>+      do_QueryInterface(focusedElement);
>+    if (htmlElement) {
>+      htmlElement->Focus();

Hmm... So back() will pull focus into the new document we navigate to in most cases (since we left it by clicking a link in it, probably, so focus was in it). I guess that makes some sense.... Shouldn't we focus the window if the element wasn't HTML or XUL? For that matter, don't we have APIs to focus arbitrary elements? Shouldn't we use them?

Also, calling Focus() on HTML elements fires the onfocus handler on that element, as well as onblur handlers for whatever element used to have focus. Given the codepath into this code, is that really something we want happening here? Consider handlers starting another load in this very window, or handlers closing the window, or whatever....

>+nsGlobalWindow::SuspendTimeouts()
>+  PRInt64 now = PR_IntervalNow();
>+  t->mWhen = PR_MAX(0, t->mWhen - now);

No can do. First of all, there is no '-' operator for PRInt64, in general. Second of all, there is no '>' operator for it (which is needed by PR_MAX). Either use nsInt64, or use the LL_* macros as needed here.

>+nsGlobalWindow::ResumeTimeouts()
>+  t->mWhen += now;

Same here.

>Index: dom/src/base/nsGlobalWindow.h
>+  virtual NS_HIDDEN_(void) SuspendTimeouts();
>+  virtual NS_HIDDEN_(nsresult) ResumeTimeouts();

Why are these virtual? And it's worth documenting what they do (esp. wrt kids).

>@@ -215,6 +219,8 @@ public:
>+  friend class WindowStateHolder;

This will lead to build bustage on some of our platforms, since there is no prior decl of |class WindowStateHolder| (a forward-decl would suffice, outside the nsGlobalWindow class declaration).

>Index: layout/base/nsDocumentViewer.cpp
> DocumentViewerImpl::Destroy()
>+  // This is after Hide() so that the user doesn't see the inputs clear.
>+ if (mDocument) >+ mDocument->Sanitize(); Sanitize() can fail. If it does, we'll have sensitive data left sitting around in the cached document. Not only should we be checking the rv here, we should be propagating it back to someplace that nukes the cached viewer (so that on back to a document we failed to Sanitize we'll reparse it). >@@ -1323,8 +1386,9 @@ DocumentViewerImpl::Stop(void) >+ if (mEnableRendering && (mLoaded || mStopped) && mPresContext) { > mPresContext->SetImageAnimationMode(imgIContainer::kDontAnimMode); >+ } Note that this means that image animations in cached documents are disabled, right? I don't see us reenabling them anywhere... Basic testing shows that on back to a document with animated images they're all not animated. >Index: layout/base/nsPresShell.cpp >+void >+PresShell::StartPlugins() What about plugins in subdocuments? I don't see us starting those... Needs to happen. >+PresShell::StopPlugins() Same issue here. >Index: netwerk/base/public/nsISecureBrowserUI.idl >+interface nsISecureBrowserUIState : nsISupports Why do we need an empty interface for this? Why not just use nsISupports? >Index: security/manager/boot/src/nsSecureBrowserUIImpl.h >+ nsresult HandleStateChange(PRBool aIsToplevel, nsIURI *aURI, >+ PRUint32 aLoadFlags, PRUint32 aProgressStateFlgs); This seems to be unused. >Index: toolkit/components/passwordmgr/base/nsPasswordManager.cpp Doesn't the password manager in extensions/wallet also need updating? Note that if we delivered the expected progress/state notifications this change would not be necessary.... >Index: toolkit/content/widgets/browser.xml The xpfe version also needs to be updated, no? >Index: view/src/nsViewManager.cpp >+nsViewManager::InvalidateHierarchy() >+ mRootViewManager = parent->GetViewManager()->RootViewManager(); >+ NS_ADDREF(mRootViewManager); Assert that mRootViewManager != this here, ok? 
>Index: xpfe/components/shistory/public/nsISHEntry.idl
>+  /**
>+   * Saved child docshells corresponding to contentViewer. There are weak
>+   * references since it's assumed that the content viewer's document has
>+   * an owning reference to the subdocument for each shell.

That's not a valid assumption, since the DOM of the content viewer's document can be mutated arbitrarily by anyone who happened to grab a pointer to it before we left the page. In fact, the assumption that this list has anything to do with the content viewer document is not a valid assumption as things stand.

>+  void addChildShell(in nsIDocShellTreeItem shell);
>+  nsIDocShellTreeItem childShellAt(in long index);
>+  void clearChildShells();

Documentation, please? In particular it's not clear how one would go about enumerating the child shells in a sane way. If the index is out of bounds (which bounds?), do we return null, throw, or crash? Does addChildShell add to the end?

>@@ -77,12 +84,21 @@ nsSHEntry::nsSHEntry(const nsSHEntry &ot
>+  , mSticky(other.mSticky)
>+  , mViewerBounds(other.mViewerBounds)

Do we really want to copy those, but none of the other restoration state, when cloning?

>Index: xpfe/components/shistory/src/nsSHistory.cpp
>@@ -104,10 +109,10 @@ nsSHistory::Init()

Do we care that the pref we added isn't live? And that there's no way for the user to manually flush out this cache?

>-    nsCOMPtr<nsIPrefBranch> defaultBranch;
>-    prefs->GetDefaultBranch(nsnull, getter_AddRefs(defaultBranch));

This would regress bug 126826.

>@@ -171,7 +176,10 @@ nsSHistory::AddEntry(nsISHEntry * aSHEnt
>+    EvictContentViewers(mIndex - 1, mIndex);

I suspect this happens before we store the content viewer in the entry at mIndex - 1, which is why caching 0 viewers doesn't work (caches one or two after all).
Comment on attachment 181828 [details] [diff] [review] new patch OK. So overall design comments now: As I mentioned earlier, I think changing what happens wrt web progress listener notifications (at least the state ones) on session history traversal is a really bad idea. To do this patch right, we need to define what the behavior is for various accesses to the cached document (which is available to various content and chrome scripts). What happens on DOM modification? Calls to DOM interfaces? There needs to be a clear idea of what codepaths are safe to execute in a cached document. Once we have that, we need to make sure that those code paths are the only ones that we execute. There further needs to be a clear concept of how much of the cached presentation should be visible to scripts (eg via computed style) and we need to enforce that. Once we have this codepaths thing sorted out, we need to look at the various ways of triggering them, including the various PLEvents we post in Gecko and the various timers we set up (blink timer come to mind here, as do the object frame timers, XUL listbox timers, menu timers, combobox PLEvents, restyle events, selection scroll events, image box frame events, viewmanager invalidate events, etc). I would like to second David's comment about having a global cache here instead of a per-session-history one. I would also like to say, in case anyone actually cares to listen, that I feel that trying to force this patch in for 1.8/1.1 is a grave mistake. It's nowhere close to being ready as it stands, I suspect it will lead to a number of hard-to-detect regressions that we won't catch before final because of the limited testing schedule. That is, assuming we're still planning to branch for final sometime this June. If we've decided to add another few months to the schedule, then maybe (but not definitely) we could have this working by then.
Attachment #181828 - Flags: review?(bzbarsky) → review-
(In reply to comment #71)
> I'm sure everyone agrees, I mentioned darin and I discussing this last week --
> but this should be a followup bug. There's no point in adding to the hurdles
> for the patch in this bug. Please file one.

Since when is it the responsibility of design and code reviewers to file bugs on the review comments that the patch author chooses to ignore?

And sorry, but I think doing something vaguely reasonable regarding eviction does belong in this bug and should block it from landing. Maybe the fancy memory pressure stuff doesn't need to, but a simple change like the one I suggested would help a lot. I've put in quite a few weeks of work on memory leaks in this release cycle alone, mostly cleaning up after other people's messes. I'd rather not see all that work shot to pieces by a patch that's IMO too risky to land anytime in the month before the freeze, never mind nearly a month after the freeze.

I'll say more bluntly what I've said somewhat gently before: I am opposed to this patch landing for 1.8. Getting this right for all the corner cases that we need to get right in order to be able to ship is going to take time. We also need to ensure that the API that we provide to web pages for the case where our heuristics don't do the right thing is sufficient and well-documented. And then there are crashes, memory leaks, and catching up from lost testing on all the other things that we're trying to stabilize for the release.

I'm tired of planning my own work responsibly around a proposed release schedule and then having everybody else ignore the schedule, start *building feature lists* around or after the freeze, and leave me doing stabilization-before-a-release work for nine months out of the year. And I certainly have no plans to do another leak-cleanup cycle for this release, since I already did that in late February and late March, which was an appropriate time given the schedule that I foolishly believed.
> Since when is it the responsibility of design and code reviewers to file bugs
> on the review comments that the patch author chooses to ignore?

It's not, I didn't imply that, and bryner hasn't chosen to ignore anything here. It's my opinion, and I said so, that we shouldn't have to do everything in one fell swoop. That can limit testing and serialize work through one hacker and one or a few reviewers, compared to the alternative of followup bugs. I shouldn't have presumed you agreed; sorry about that.

> And sorry, but I think doing something vaguely reasonable regarding eviction
> does belong in this bug and should block it from landing. Maybe the fancy
> memory pressure stuff doesn't need to, but a simple change like the one I
> suggested would help a lot. I've put in quite a few weeks of work on memory
> leaks in this release cycle alone, mostly cleaning up after other people's
> messes. I'd rather not see all that work shot to pieces by a patch that's IMO
> too risky to land anytime in the month before the freeze, never mind nearly a
> month after the freeze.

Then the only hope is that this patch goes on a branch for two months, where in spite of lots of conflicting changes and other demands, bryner manages to keep it alive. That looks like a risky plan too.

There's an alternative, but it can't be done without more people helping, even if you personally don't: we get the patch reviewed and fixed more, get it in for an alpha (not necessarily the first alpha), and divide the resulting regression and perf/footprint fixing among a number of people larger than just bryner.

> I'll say more bluntly what I've said somewhat gently before: I am opposed to
> this patch landing for 1.8. Getting this right for all the corner cases that
> we need to get right in order to be able to ship is going to take time.

I think it will take more than just time.
If it's serialized through one person on a branch, besides the risk of conflicts and other demands leading to that branch becoming a dead limb, the regressions that one person with some review help will never find still must be confronted when it lands. That may be a better plan for everyone else hacking on the code, but it may be a worse plan than the alternative sketched above for users. It's hard to say, but I believe a shippably fixed version of this patch is more valuable for users than almost any other tradeable big-ticket item. It may not be worth as much as a bunch of little fixes to severe bugs, though.

There's another point I'd like to make: bz especially, and others including me, talk about documenting things like docshell, cleaning up interfaces, reforming bad old code. But that is never going to happen as an end in itself -- nor should it! It should happen as part of an aggressive improvement to end-user function such as this patch attempts. That is the best way to guide trade-offs away from diminishing returns.

I am still in favor of trying to get this patch into an alpha soon.

/be
Sorry for the typos; it's late and it has been a very long day.

In a nutshell, my last point comes down to the observation that we should not put off reform as a separate exercise.

My next-to-last point, which I am not sure is a worthwhile generalization, is that forcing ambitious plans to serialize through strong hackers on (virtual or real) branches is limiting our ability to benefit end users and actually improve our codebase at the same time. I think this is likely a half-truth, but my gut says there is something real and important to it. I wonder, if we did more pair programming, whether we could get more done, more quickly, and in certain particular dimensions, even if it meant fewer good results among all dimensions.

/be
> I am still in favor of trying to get this patch into an alpha soon. pref'ed off with zero regressions in the pref'ed off case, I should have said (it should go without saying!). /be
At some point, I remember there being talk about getting this patch into the tree pref'd off by default. What happened to that idea?

This is a big patch, and a valuable one IMO. Getting the code into the tree will not only avoid bit-rot but also allow other people to more easily test and contribute to the patch. Please correct me if I'm mistaken, but it would seem that there should be some pinch-point (or at most a couple) where we can decide whether or not to enable this feature. Trying to get this patch to be 100% perfect before everyone uses it by default is the right goal, but that doesn't mean that it has to be 100% perfect before it is checked into the tree.

This patch sort of reminds me of the HTTP/1.1 pipelining patch. If I had not checked in the patch -- pref'd off -- then it would never have landed. It is still not good enough to be enabled by default, but yet people do enable it. Over time, people have reported bugs, and the feature has slowly improved as a result. Now, in some ways pipelining is probably not the best example since it still isn't enabled by default after all these years ;-) That has to do with insurmountable server compatibility issues, but I wouldn't know that if users hadn't tried the feature and reported sites that were broken.

This isn't to say that Boris isn't correct in saying that more analysis is needed. Of course, the more we know about the system, the better we can ensure that this patch "hits on all cylinders." I'm just saying that perhaps we're close to a point where we can learn even more from user feedback as well.

So, are we anywhere close to be able to land this pref'd off? ;-)
First, people seem to be conveniently forgetting that we're in a _beta_ Gecko milestone. The _second_ beta, in fact. As David points out, we've already gone through a bunch of post-alpha stabilization work on Gecko in the _first_ beta cycle (most done by David, I admit). Just because Firefox is having an alpha for its UI doesn't exactly make this an "alpha milestone".

Second, most of the patch is in code that really shouldn't be changing much in beta and the following final cycle if we're doing things right. So I think it should be reasonable to maintain it until we reopen for 1.9a. If desired, I will put my time where my mouth is and stick this in a tree all alone and repost the merged-to-tip patch here when 1.9a opens.

Third, minimal requirements for getting this in even preffed off would be (at least; I may be missing something):

1) Having the preffing off actually work
2) Not leaking documents
3) Being very very sure that the ownership model changes in document viewer are OK.
4) Deciding how much we're willing to have not work with the code preffed on (eg, do we need to audit the code for security issues as the DOM is mutated, or do we clearly say that enabling this code can wipe your hard drive and melt your monitor?).

The main issue here, of course, is that even preffed off, the patch changes some very very fragile code in somewhat nontrivial ways. We'd need to make sure those changes don't break things either, basically.
> First, people seem to be conveniently forgetting that we're in a _beta_ Gecko
> milestone. The _second_ beta, in fact.

No, I think everyone is quite aware of that (and the complications of refocusing the project on a different major release vehicle). You know as well that this talk is motivated by a desire to make Firefox 1.1 as compelling to users as it can be. Let's not kid ourselves, Firefox is what puts our project on the map. Maybe we are trying for too much... ok, that is a fair argument.

> As David points out, we've already gone
> through a bunch of post-alpha stabilization work on Gecko in the _first_ beta
> cycle (most done, by David, I admit). Just because Firefox is having an alpha
> for its UI doesn't exactly make this an "alpha milestone".

Yes, lots of great work has been done to improve Gecko for the 1.8 milestone. No one wants to regress Gecko or undo all that good work ;-)

> The main issue here, of course, is that even preffed off the patch changes
> some very very fragile code in somewhat nontrivial ways. We'd need to make
> sure those changes don't break things either, basically.

OK, so it sounds like there is currently no way to pref it off completely. That's definitely a concern.
I pulled last night and built [SeaMonkey] with attachment 181884 [details] [diff] [review]. I didn't set any prefs.

1. open browser (home page is slashdot)
2. load http://www.microsoft.com/whdc/devtools/debugging/default.mspx
3. hit back, then forward
4. the url bar shows http://www.microsoft.com/whdc/devtools/debugging/default.mspx but the page is still slashdot

This happens every other time.
Between today and the 1.1 release there is time to fix bugs that this patch may regress. And having this feature for 1.1 is IMO extremely important for marketing reasons. There is a lot of interest around this stuff, and I believe that bryner won't be left alone working on fixes for regressions that may happen.
re: comment 82, you also get the following asserts when you do that:

###!!! ASSERTION: User did not call nsIContentViewer::Close: '!mDocument', file mozilla/layout/base/nsDocumentViewer.cpp, line 516
###!!! ASSERTION: User did not call nsIContentViewer::Destroy: '!mPresShell && !mPresContext', file mozilla/layout/base/nsDocumentViewer.cpp, line 522

This happens even with the max_viewers pref set to 5.
(In reply to comment #83)

To clarify my comment: I'm just trying to say that if we were able to focus resources on this bug, getting it in for 1.1 would be extremely valuable IMO. It would give us a lot of good press, and would be a key argument in the promotion of Fx 1.1. What I'm not trying to say is that bugs and regressions should be ignored and this patch checked in asap.
Successful software releases are built around features that help the product's marketability. This is one such feature. I shouldn't have to remind anyone here that Mozilla is facing a threat from Microsoft this summer. I'm not saying we should do a poor job of any one feature, but we need to all be on the same page here. Shipping incremental improvements to existing code as new releases is not the way we ensure our vitality - we need at least a few high impact end user features. Everyone *needs* to understand this, otherwise we will run up against these problems again and again and again. I think there's a lot more we can do here as a team that recognizes the importance of this moment in time before we write this feature off. Anyhow, enough spam from me.
I filed this bug because the previous bug, bug 38486, became overloaded with advocacy. Let's not go there again.

The problem here is not advocacy against the feature, as a feature. Nor is it a difference in values about regressions (bryner doesn't want to regress Firefox 1.1). There probably is a difference in judgment about the value of expediting and favoring this feature in spite of the trunk "beta" freeze, but again, that can't be the main problem, or the patch would be in more working order and we would be arguing only about less tangible risks and release philosophies.

The problem is that the patch is not ready to land as is. It will cause a storm of dups, and bad user experiences at the margin that will be counterproductive in the developer preview. That's no good, given that bryner is probably pretty crispy right now.

To make progress by allowing selected others to enable, test, and even help fix regressions, the change needs to cause no regressions when landed pref'ed off. That's what I've talked to bryner about getting working, and that is what I will support when landing, as I've already said.

So let's do that, or else start a newsgroup thread about why this feature should not be expedited, if possible.

/be
bug got my vote! just tried the feature in Opera. I'm amazed...
If we're shipping Firefox 1.1 with this turned on, we need to have the maximum amount of testing. Checking this in preffed off won't get us adequate testing. In fact all it will do is introduce a second codepath which will later cause support problems (not to mention that each codepath multiplies the QA cost). I understand that there is a Gecko cycle and that this is the wrong time to land something like this in the Gecko cycle. But there is also a Firefox cycle and this is the right time to land this in the Firefox cycle. The Firefox cycle is what will ensure the Gecko engine remains relevant in the future. Design problems should be fixed before being checked in. But it needs to go in.
No me-too/gee-whiz bugspam or else; comment 88 is the last one I expect to see.

If bryner comes up with a patch that can be pref'ed off by default without any regression in that state, then I will make a benevolent dictator decision to land that patch, so that bryner, myself, and anyone else willing to help (there will be many) can work to get the design and implementation in agreement with the (mostly hidden, as usual with our codebase) requirements.

This does not mean that I will decide anything else right now, including whether this feature will be on for 1.1. No one can decide that right now -- we lack the information to make an informed decision given the current schedule, and the schedule itself is too uncertain to decide.

The pref can be temporary, but it is necessary to land this so bryner does not give up out of frustration. It's unreasonable to expect him to do it all; it is also unreasonable to turn it on and coerce everyone not signed up to help to put up with the regressions likely or known from the current patch. So pref'ing it off and landing it soon are the right moves. I heard dbaron agree in person, which is great. I would like to win bz's agreement.

There shouldn't be mistrust that people (specifically, bryner, me, dbaron) will do the wrong thing, (allow us to) ship an unstable release, or walk away from unfixed design or implementation bugs. If we have that level of mistrust, we shouldn't even be here.

Yes, this feature is late, and it may not make it for 1.8 / Firefox 1.1. We are making a special case for it, to get it checked in pref'd off, to avoid risking the work being lost, and to get more help while not breaking too many people too soon. That's an executive decision you can blame me for. The buck stops here.

/be
If the leaks are fixed, I have no problem with this landing preffed off. I agree that that would prevent it bitrotting and would allow people who want to work on it to do so, which is a very good thing in this case. We have good precedent for this sort of thing with the "scary viewmanager" of olde, which benefited a good bit from the testing it got this way, iirc.
(In reply to comment #69)
> >+nsDocument::Sanitize()
>
> Is there a reason we don't have to worry about autocomplete="off" <textarea>
> elements?

Microsoft basically defined the behavior of this attribute, and they only define it to be valid for form and input elements. We don't honor autocomplete=off for textareas for purposes of saving form data, either.

> >+nsDocShell::SetSecurityUI(nsISecureBrowserUI *aSecurityUI)
> >+{
> >+  mSecurityUI = aSecurityUI;
>
> Do we want to assert here if both mSecurityUI and aSecurityUI are non-null? Or
> do we really support that correctly?

It's just a pointer; what can happen incorrectly? Granted, I wouldn't expect this situation to arise, but I don't see what bad thing happens if it does.

> >+  // If there is an unload or beforeunload listener on the window, then we do
> >+  // not save the presentation into the session history cache.
>
> What if there's an unload or beforeunload listener on a subframe of this
> window? In general, it seems to me that we want to save only if we can save the
> presentation for this docshell _and_ all its descendants.
>
> >+nsresult
> >+nsDocShell::CaptureState()
>
> This method, or something like it, needs to be called recursively on all kids
> of this docshell. We don't want to do the "save the kids in the SHEntry" part
> for all descendants, and we probably don't want to save the script global
> object's properties (since we're not clearing them), but we _do_ want to
> suspend all timeouts and intervals in subframes, suspend refresh timers,
> capture any security UI state as needed, etc. As you pointed out, we do suspend
> timeouts and intervals on the kids, but refresh timers in the kids look like
> they would still fire.
>
> >+    mScriptGlobal->HandleDOMEvent(presContext, &restoreEvent, nsnull,
> >+                                  NS_EVENT_FLAG_INIT, &status);
>
> I assume there's a reason we don't need to allocate the private data here?
I had to allocate it for the load event so that HandleDOMEvent would not try to set the event's target to the chrome event handler. Instead, the event gets its target from the pres context, which is correct. For the restore event, I am dispatching to the window object, which will set up the target correctly.

> >+  // Restart plugins
> >+  if (shell)
> >+    shell->StartPlugins();
>
> Is it safe to do this synchronously? That is, can't the plugins run random
> script as they instantiate (if nothing else, by firing broken-plugin events)?
> If they can, we could reenter docshell code from in here.... is that something
> we're OK with?

I'm not really sure. Maybe jst knows.

> Where do we restore caret visibility? Or do we?

We don't; it happens as part of restoring focus. Thinking about it, maybe we should just blur the element instead of messing with the caret.

> >Index: docshell/base/nsDocShell.h
> >@@ -395,6 +397,49 @@ protected:
>
> >+  // Captures the state of the supporting elements of the presentation
> >+  // (the "window" object, docshell tree, meta-refresh loads, and security
> >+  // state) and stores them on |mOSHE|.
> >+  nsresult CaptureState();
>
> Are there non-fatal error returns? If so, which?

I don't think so... if anything here fails, we should not store the content viewer in history.

> >+  // Restores the presentation stored in |aSHEntry|.
> >+  nsresult RestorePresentation(nsISHEntry *aSHEntry);
>
> Again, there are basically nonfatal error returns (eg if we couldn't cache the
> presentation for aSHEntry for some reason).

I'm not really following you here; maybe I'm not sure what you mean by "fatal". Fatal in what sense?

> I'm still not quite sure the order of the various calls in this code is right;
> I'd have to spend another day or so sorting through this stuff to make sure...
> Titles are probably not updating because SetTitle is called by the content sink
> on the document, which passes it on to the docshell.
> So on restore we probably
> want to get the title from the document and set it on ourselves. The fact that
> we're not firing any of the usual progress notifications other than
> OnLocationChange probably means we're breaking some extensions. I'd like to
> see some reasoning behind why we think it's OK to not fire the other expected
> web progress listener notifications here (and darin should probably check that
> reasoning out, since we're basically changing the behavior of the frozen
> progress listener apis).

Darin and I talked at length about this and decided to avoid firing progress listener notifications, if at all possible. This is to avoid having to create some kind of dummy request/channel.

> One other thing that sort of bothers me in all this code is that if we fail
> somewhere we leave the SHEntry holding references to all sorts of stuff (eg
> whatever properties were set on the window). It'd really be good to clean
> that up better.

Agreed, but I'm not sure where offhand. Most everything is only relevant if there is a content viewer set (or the content viewer is waiting to go into SH), so maybe it makes sense to do a consistency check between the content viewer and the history entry after we return from SavePresentation and RestorePresentation. If the content viewer doesn't have a pointer to the history entry, or vice versa, all of the supporting attributes of the history entry could be nulled out.

> >Index: dom/public/coreEvents/nsIDOMLoadListener.h
>
> >+  /**
> >+   * Processes a DOMPageRestore event. This is dispatched when a page's
> >+   * presentation is restored from session history (onload does not fire
> >+   * in this case).
>
> Will this fire only for the "topmost" page being restored (unlike onload, which
> fires for all subframes too)? That's what the code looked like to me... Should
> this have a behavior more like onload?

You're right, it does fire only for the topmost page.
Perhaps it should fire for the subframes, since they were restored as well.

> >+  nsCOMPtr<nsIEventListenerManager> mListenerManager;
> >+  nsCOMPtr<nsIDOMElement> mFocusedElement;
> >+  nsCOMPtr<nsIDOMWindowInternal> mFocusedWindow;
>
> I assume you've checked these for cycles and there are no problems?

As far as I know, correct.

> >+CopyJSPropertyArray(JSContext *cx, JSObject *aSource, JSObject *aDest,
>
> >+    if (!::JS_IdToValue(cx, props->vector[i], &propname_value) ||
> >+        !JSVAL_IS_STRING(propname_value)) {
> >+      NS_WARNING("Failed to copy non-string window property\n");
>
> Can that actually happen? If so, shouldn't we actually propagate the error and
> fall back on normal loads instead of just warning and leading to broken
> behavior? Is "non-string" referring to the value of the property? Or its
> name?

Brendan said this might be possible with E4X, where the property name can be an object (?), iirc.

> >+nsGlobalWindow::RestoreWindowState(nsISupports *aState)

----

More later; I'm up to here with either answering these here or fixing the code (minus the assertions noted -- I can test for those after fixing other things up and see if they're still around).
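The save heuristics discussed in this review -- skip caching when the page (or, per bz's comment, any subframe document) has an unload or beforeunload listener, and short-circuit when the max_viewers pref is 0 -- can be sketched roughly as follows. Names, types, and structure here are invented for illustration; this is not the actual nsDocShell::CanSavePresentation code:

```cpp
#include <cassert>

// Illustrative model only: a "document" knows whether it has an
// unload/beforeunload listener and what its subframe document is
// (a single-child chain, to keep the sketch simple).
struct Doc {
  bool hasUnloadListener;
  const Doc* subframe;
};

// Returns true if this presentation may be stored in the bfcache.
static bool CanSavePresentation(const Doc* doc, int maxViewers) {
  if (maxViewers == 0) {
    return false;  // bfcache disabled via browser.sessionhistory.max_viewers
  }
  // Saving is only safe if this document _and_ all its descendants are
  // free of unload/beforeunload listeners, which would otherwise have
  // to fire when the page goes away.
  for (const Doc* d = doc; d; d = d->subframe) {
    if (d->hasUnloadListener) {
      return false;
    }
  }
  return true;
}
```

Under this model, a page with an onunload handler anywhere in its frame tree takes the ordinary load path on history traversal, so its unload events still fire and backward compatibility is preserved.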
> Darin and I talked at length about this and decided to avoid firing progress
> listener notifications, if at all possible. This is to avoid having to create
> some kind of dummy request/channel.

To be clear, this applies to onStateChange, onStatusChange, and onProgressChange. We will still need to generate onLocationChange and onSecurityChange events.

There are cases already where onLocationChange may be called with a null request. As for onSecurityChange, none of the consumers in our tree use the request that is passed to it, presumably because all of the interesting information is available from the associated browser's nsISecureBrowserUI object. We decided that it would be best to try to issue onSecurityChange events with a null request to avoid having to create a dummy request. I asked bryner to post a message to n.p.m.embedding to give people a heads up about this, and I also said that this may end up requiring a dummy request... the challenge with that, however, is in constructing a dummy request. Does it QI to nsIChannel? If so, does it return a non-null value for nsIChannel::securityInfo? And, if so... what value?

Otherwise, I see no problem suppressing the other notifications, including onStateChange. Boris: can you think of any problems with this?
> It's just a pointer, what can happen incorrectly?

Nothing per se, but I can't think of a non-broken reason it'd be getting set twice....

> Thinking about it, maybe we should just blur the element instead of messing
> with the caret.

Would that work with caret browsing mode? For that matter, does setting focus (to the window, say) in caret browsing mode reenable the caret if it's been disabled? That sounds odd to me, but...

> I'm not really following you here, maybe I'm not sure what you mean by
> "fatal".

Fatal as in "cases where the return value doesn't indicate that you should just bail out immediately". That is, cases that could happen during normal browser operation that we signal via an error nsresult. For example, if there was no presentation cached in aSHEntry (which is perfectly fine), this method will throw. It would be good to define exactly what it will throw then (and preferably make sure it never throws it otherwise)...

> maybe it makes sense to do a consistency check between the content viewer and
> the history entry after we return from SavePresentation and
> RestorePresentation.

That sounds reasonable, yes. It'd only have to be done on error return too...

> Perhaps it should fire for the subframes since they were restored as well.

If it's meant to serve as a general onload replacement for chrome consumption, that might make sense... but see below.
On the web progress listener issue, talked to Darin some more. The problem, of course, is that of figuring out which approach breaks the smallest number of consumers due to assumptions they make about correlation between changes in what a docshell is showing and WPL notifications. Some specific issues we discussed:

1) The WPL notifications need a request. I think this can be dealt with by just storing the document's channel in the document. The HTML document already does, and the XML document does until end of pageload. The latter could use a boolean to indicate whether the channel is still "live", and we could push mChannel up to nsDocument. This isn't quite perfect, since the channel isn't in the state it normally would be in when STATE_START, say, happens (eg Darin pointed out its cache entry is closed), but better than nothing. ccing sicking and peterv in case they have thoughts against moving this to nsDocument...

2) The WPL notifications are tightly tied to the firing of load and unload events, and consumers may be expecting said events. This is a case of which approach breaks fewest consumers... If we do send WPL notifications and people get confused because of the lack of load/unload, we could even consider dispatching load/unload events for fastbacked documents to chrome only. That may also break things, of course, if said chrome munges the document onload in a non-idempotent way. Have to try to figure out which breaks more...

3) Which notifications to send. I think onProgressChange makes no sense, nor does onStatusChange. onStateChange with the minimal flags we can use to indicate completion of a full pageload (with all subframes) is probably the best way to go here. If we want to be really close to what we do now, we could even do it for the child docshells of the docshell doing fastback....

Thoughts? Do we have a good idea of what extensions are actually assuming and what our UI code is assuming?
For now the one data point I've seen is that password manager depends on STATE_DOCUMENT|STATE_STOP notifications. Note that it depends on getting them for subframes too, by the way. We can fix it up with DOMPageRestore listeners if those are fired on all subdocuments, but again I'm worried that this is a common usage pattern for WPL.
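To illustrate the compatibility concern concretely, here is a hedged sketch of a browser-status-handler-style listener. The names and the simplified signature are invented (the real frozen interface is nsIWebProgressListener), but it shows the one adjustment such consumers would need under these proposals: tolerating a null request on notifications fired during a session history traversal.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the request/channel normally passed to
// progress listener callbacks.
struct Request {
  std::string name;
};

// Hypothetical embedder-side status handler. With bfcache enabled,
// onLocationChange (and onSecurityChange) may arrive with a null
// request, so the listener must not dereference it unconditionally.
struct BrowserStatusHandler {
  std::string location;

  void OnLocationChange(const Request* request, const std::string& uri) {
    location = uri;  // the URI is always available
    if (request) {
      // Request-derived detail is strictly optional.
      location += " (via " + request->name + ")";
    }
  }
};
```

A listener written this way behaves identically for normal loads and degrades gracefully when a fastback traversal supplies no request.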
Other thoughts I had about this, in general:

1) It may be worth it to have a way for embeddors (or extensions) to veto fastback on a per-document basis (that is, give them a hook into CanSavePresentation). Future-proofing, if nothing else. For example, XForms can then hook into that for whatever XForms has in the way of password inputs.

2) Do we need a way for embeddors to tell that a fastback operation is starting (DOMPageRestore tells them it's done)? I guess if we implement WPL notifications they'll know the document in the docshell is changing. The question is whether they need to know specifically that it's changing because of fastback.
The only important UI progress listener is the browser status handler. At least in the Suite, it needs to see start/stop state, location and security changes as these already have to be synthesized by the tabbrowser when switching tabs. Thinking of tabbrowser, the other notification it sends when switching tabs is the favicon, which expects to see DOMLinkAdded notifications. Otherwise it would need some way of persisting the favicon to restore it during a DOMPageRestore. Thinking of DOMLinkAdded notifications, the link toolbar uses a combination of unload, load, DOMHeadLoaded, DOMLinkAdded and select events. However I think adding support for DOMPageRestore would suffice. In fact, I'd quite like tabbrowser to generate DOMPageRestore events instead of select events. Oh, and I don't think Password Manager should try to restore anything.
Actually I think we can live without state changes, they're only used to force the URL bar to update to the latest page's URL even if it's been typed into.
we must send STATE_STOP notification to listeners. people are using it to figure out when a page finished loading, according to questions asked in the newsgroups ;) a11y may depend on it too...
On a practical note, we're going to be rev'ing the extension version for FF 1.1, so extension authors will be having to update anyway... as long as we clearly document the notification changes, I think they can probably deal.
Yes, but that doesn't help embeddors so much. Again, nsIWPL is a frozen interface, so we should preserve backwards compat in its functioning as much as we can (perfectly if we could, but I don't think we can do that with fastback...)
(In reply to comment #101)
> Yes, but that doesn't help embeddors so much. Again, nsIWPL is a frozen
> interface, so we should preserve backwards compat in its functioning as much as
> we can (perfectly if we could, but I don't think we can do that with fastback...)

Well, part of the reasoning that Darin and I had was that max_viewers would be enabled in application-specific prefs, so embedders would have to opt in.
Right. I'm not saying people can't work around this and all. I'm saying we should do what we can to make it as easy as possible for this feature to be used with minimal code changes in the rest of the Mozilla tree, extensions, embedding apps, etc.
> Right. I'm not saying people can't work around this and all. I'm saying we
> should do what we can to make it as easy as possible for this feature to be used
> with minimal code changes in the rest of the Mozilla tree, extensions, embedding
> apps, etc.

This should have zero impact on non-firefox stuff (no workarounds required)... that is, unless other products enable bfcache as well. For example, pink has said that he's interested in enabling bfcache for camino. I'm sure he'll have to change some things to support bfcache, but that's ok.
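For reference -- and as this bug's whiteboard notes -- opting in currently comes down to a single pref, so an application that wanted the bfcache would ship something like the following in its default prefs file (the value 5 here is just an example, not a recommended setting):

```js
// Nonzero caches up to this many content viewers in session history;
// 0 (the current default while this work is preffed off) disables
// the bfcache entirely.
pref("browser.sessionhistory.max_viewers", 5);
```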
Ideally, every single Gecko consumer will be able to enable any Gecko feature desired with minimal pain... That's all I'm saying. I'm really not so happy with the "it works in Firefox, so that's ok" approach, and I hope that's not what we're talking about here...
Please, nobody (least of all Darin!) is giving embedders the finger here. Embedders who want to use a more complex history and back/forward model will have to deal with that more complex model. We are giving them the _opportunity_ to use a bfcache, should they want to do the app-side work to make the user experience whatever they desire it to be. Nobody is forcing it on them, and I don't think it's "ideal" to spend effort to prevent embedders from having to write some code here if they want to take advantage of the bfcache.

But arguing about an ideal world doesn't seem to be useful here, even if we were to agree on what "ideal" was. If existing embeddings of Gecko can continue unchanged when using a bfcache-capable Gecko, and observe the same behaviour, then we're exactly in the right place; investment in making it a freer lunch for embedders who _do_ want to use bfcache should wait until we know more about what it means to "enable" bfcache, have some idea of what the useful knobs are, and can spot cliches in embedder use.

Right now, we only know of one app's use of it (Firefox), and we have far too many embedding interfaces that were frozen on the basis of single data points.
bz, what do you think about this going in? The only changes here that will happen regardless of the pref being enabled are (as far as I know):

- nsDocument::SetScriptGlobalObject(nsnull) changes
- No longer releasing mDocument in DocumentViewerImpl::Close()

If you're comfortable with those changes by themselves, I think this can go in with minimal risk. Here's the complete set of changes since the last patch:

- Get rid of default enabled state for firefox
- Added documentation for subdocument enumeration in nsIDocument
- Added nsIDocument::Destroy(), which does what SetScriptGlobalObject(nsnull) used to do. This is called when the document viewer releases its reference, and fixes the leaks bz pointed out.
- Fixed grammar for nsIEventListenerManager.h comments
- Got rid of mStreamListener recreation in nsPluginDocument, it's bogus.
- Changed nsDocShell gSavingOldViewer to be a member variable
- Fixed SuspendRefreshURIs loop copy+paste error
- Made CanSavePresentation short-circuit if max_viewers == 0
- Moved caret disabling into nsIPresShell::Freeze
- Made RestorePresentation take aCanSavePresentation argument, so that the check can happen before Stop() is called.
- Added comments and error checking to RestorePresentation
- Merged jst event changes
- Got rid of unneeded GetDOMDocument call in SetupNewViewer
- Clarified comments in several files.
- Made CopyJSPropertyArray a little more bulletproof
- Made sure nsGlobalWindow mIsDocumentLoaded is set to true after restoring from shistory
Attachment #182197 - Flags: review?(bzbarsky)
shaver, I'm not saying that we _have_ to do something more complicated. Just that it may be worth considering doing it if the win (a better and more consistent platform) is there and is worth it. I somewhat suspect it is, but I'd welcome data either way. Do we have a reasonable way yet of contacting embedders and extension writers about changes like this?

bryner, I'll try to look at the pref'd off patch tonight, but if that doesn't happen it'll have to be Sunday....
One change I forgot to list: - never cache full page plugin documents
er, the interface in question (http://lxr.mozilla.org/seamonkey/source/uriloader/base/nsIWebProgressListener.idl) is already frozen. what is the way to get notified that the current page changed, or is changing, when fastback is involved (if we don't support nsIWebProgressListener for this purpose)?
(In reply to comment #110)
> what is the way to get notified that the current page changed, or is changing,
> when fastback is involved (if we don't support nsIWebProgressListener for this
> purpose)?

The DOMPageRestore event.
biesi: WPL.onLocationChange is still called. onLocationChange is the event that tells you when the contents of the webprogress object have really changed. using onStateChange for that purpose is somewhat problematic anyways since STATE_START doesn't mean the thing is going to be loaded in that webprogress anyways, and STATE_STOP happens pretty late. See the WPL implementation here: http://friedfish.homeip.net/~darinf/xulrunner/content/mybrowser/mybrowser.js I believe that that WPL would still function correctly with this bfcache.
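A rough sketch of the distinction described above, in plain JavaScript with mocked objects (the listener shape and STATE_* constants mirror nsIWebProgressListener, but nothing here is real XPCOM and the sequences are invented for illustration): location tracking via onLocationChange keeps working across a bfcache-style navigation even though no STATE_START/STATE_STOP pair fires.

```javascript
// Hypothetical mock of the two notification sequences described above.
const STATE_START = 1, STATE_STOP = 16;

function makeListener() {
  return {
    currentURI: null,
    loadsStarted: 0,
    onStateChange(webProgress, request, flags, status) {
      if (flags & STATE_START) this.loadsStarted++;
    },
    // Fired for both fresh loads and bfcache restores in this sketch.
    onLocationChange(webProgress, request, location) {
      this.currentURI = location;
    },
  };
}

const listener = makeListener();

// A fresh network load fires the full sequence.
listener.onStateChange(null, null, STATE_START, 0);
listener.onLocationChange(null, null, "http://example.org/a");
listener.onStateChange(null, null, STATE_STOP, 0);

// A bfcache "back" fires only onLocationChange, so state-change-based
// trackers miss it, but location tracking stays correct.
listener.onLocationChange(null, null, "http://example.org/b");

console.log(listener.currentURI);   // "http://example.org/b"
console.log(listener.loadsStarted); // 1
```

The point of the sketch is the one darin makes: a listener keyed off onLocationChange sees the restore, while one counting STATE_START/STATE_STOP pairs does not.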
DOMPageRestore doesn't help me if I'm watching the global nsIWebProgress. hm, yeah, onLocationChange does, I think... let's hope it suffices...
> onLocationChange is the event that tells you when the contents of the > webprogress object have really changed. That notification is not fired for initial loads in subframes, so currently if you want to keep track of documents as they're being loaded you listen to onStateChange...
ok, so, I probably missed it in the last two dozen comments... but what exactly was the problem with the suggestion in comment 95? it seems like it would be an easy way to keep the semantics of nsIWebProgressListener.
(In reply to comment #73)
> >+nsGlobalWindow::SuspendTimeouts()
> >+ PRInt64 now = PR_IntervalNow();
> >+ t->mWhen = PR_MAX(0, t->mWhen - now);
> No can do. First of all, there is no '-' operator for PRInt64, in general.
> Second of all, there is no '>' operator for it (which is needed by PR_MAX).
> Either use nsInt64, or use the LL_* macros as needed here.

It was my understanding that the need for the LL_ macros was a limitation of the old Mac Codewarrior build, and is not necessary for any platforms we currently compile on.

> >Index: dom/src/base/nsGlobalWindow.h
> >+ virtual NS_HIDDEN_(void) SuspendTimeouts();
> >+ virtual NS_HIDDEN_(nsresult) ResumeTimeouts();
> Why are these virtual? And it's worth documenting what they do (esp. wrt kids).

Because at one point I had them on nsPIDOMWindow... I'll fix that.

> in the cached document. Not only should we be checking the rv here, we should
> be propagating it back to someplace that nukes the cached viewer (so that on
> back to a document we failed to Sanitize we'll reparse it).
>
> >@@ -1323,8 +1386,9 @@ DocumentViewerImpl::Stop(void)
> >+ if (mEnableRendering && (mLoaded || mStopped) && mPresContext) {
> > mPresContext->SetImageAnimationMode(imgIContainer::kDontAnimMode);
> >+ }
> Note that this means that image animations in cached documents are disabled,
> right? I don't see us reenabling them anywhere... Basic testing shows that on
> back to a document with animated images they're all not animated.

Oops, what I really meant to do here was to leave the animations enabled, since we want to avoid the possibility that the new document shares images (which would also have their animations stop). We need to do some work in imglib to make the animations not be shared between documents.

> >Index: netwerk/base/public/nsISecureBrowserUI.idl
> >+interface nsISecureBrowserUIState : nsISupports
> Why do we need an empty interface for this? Why not just use nsISupports?

The idea was to avoid unnecessary QI'ing around.

> >Index: toolkit/components/passwordmgr/base/nsPasswordManager.cpp
> Doesn't the password manager in extensions/wallet also need updating?

I'm going to hold off, pending the outcome of the WebProgressListener discussion.

> >Index: xpfe/components/shistory/public/nsISHEntry.idl
> >+ /**
> >+ * Saved child docshells corresponding to contentViewer. There are weak
> >+ * references since it's assumed that the content viewer's document has
> >+ * an owning reference to the subdocument for each shell.
> That's not a valid assumption, since the DOM of the content viewer's document
> can be mutated arbitrarily by anyone who happened to grab a pointer to it
> before we left the page.
>
> In fact, the assumption that this list has anything to do with the content
> viewer document is not a valid assumption as things stand.

Is your suggestion, then, to throw out the cached presentation when the DOM is mutated? Or just to hold strong references here?

> >@@ -77,12 +84,21 @@ nsSHEntry::nsSHEntry(const nsSHEntry &ot
> >+ , mSticky(other.mSticky)
> >+ , mViewerBounds(other.mViewerBounds)
> Do we really want to copy those, but none of the other restoration state, when
> cloning?

I'm not actually sure which things it's safe to copy (because I don't entirely understand the circumstances where we clone a history entry). I haven't been thinking about this with the possibility of multiple history entries hanging on the same content viewer (and I can imagine it causing all sorts of problems). So maybe we don't want to copy any of the state... what do you think?

> >Index: xpfe/components/shistory/src/nsSHistory.cpp
> >@@ -104,10 +109,10 @@ nsSHistory::Init()
> Do we care that the pref we added isn't live?

I don't, really... there's no UI for it.

> And that there's no way for the user to manually flush out this cache?

In firefox, when you clear history it clears session history as well. It might be that that's too coarse, but I'd prefer to solve that by making improvements to the eviction behavior rather than exposing something to the user.

> >- nsCOMPtr<nsIPrefBranch> defaultBranch;
> >- prefs->GetDefaultBranch(nsnull, getter_AddRefs(defaultBranch));
> This would regress bug 126826.

Oh, wow, that was intentional. I will restore it and add a comment so that someone doesn't do this again.

> >@@ -171,7 +176,10 @@ nsSHistory::AddEntry(nsISHEntry * aSHEnt

---- up to here now, more later
> It was my understanding that the need for the LL_ macros was a limitation

It's a limitation of any platform without an 8-byte long long.... We use them pretty consistently in our code, and I'd rather we didn't start misusing the type. If we want to avoid the macros, we should just use nsInt64.

> The idea was to avoid unnecessary QI'ing around.

But QI'ing from what to what, exactly? Anyone could implement this interface (which is even scriptable!). So the cast from it to a concrete class in nsSecureBrowserUIImpl::TransitionToState is equally dangerous no matter whether the interface being cast from is nsISupports or nsISecureBrowserUIState...

> Is your suggestion, then, to throw out the cached presentation when the DOM is
> mutated?

I think this is the only thing that makes sense in general, yes... I don't think we want to have DOM mutations that happen on the hidden document affect the state of what the user fast-backs to. The other option would be to clone the DOM wholesale on mutation and have the clone to fast-back to while the original is mutated. That could take some work, though. Then again, any solution to this issue will take work. :(

> I'm not actually sure which things it's safe to copy (because I don't entirely
> understand the circumstances where we clone a history entry)

We clone in a few cases I can think of:

1) (Most common) Subframe pageloads. These clone the entire session history tree and then replace the subtree rooted at the relevant subframe with the new session history entry for that subframe.
2) When handing out a currentDescriptor via nsIWebPageDescriptor. That clone should probably have anything related to bfcache cleared in it before it's returned.
3) When we do a view-source load via nsIWebPageDescriptor. If #2 is done, this should be a non-issue.

Those might well be it. For now I think copying none of the state is safest, since it'll just mean that fastback doesn't happen sometimes when maybe it could.
I _think_ the approach you took to eviction (getting the entry via the transaction) will find the entries with the viewers correctly...
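The deadline bookkeeping being debated above (`t->mWhen = PR_MAX(0, t->mWhen - now)` on suspend, and the reverse on resume) can be sketched with plain numbers. This is an illustration only, with invented names; the real code uses PRInt64/nsInt64, which is exactly what the operator discussion is about.

```javascript
// Sketch of the timeout suspend/resume bookkeeping discussed above.
function suspendTimeouts(timeouts, now) {
  // Convert each absolute deadline into "time remaining", clamped at 0
  // so already-due timeouts fire immediately on resume.
  for (const t of timeouts) {
    t.when = Math.max(0, t.when - now);
  }
}

function resumeTimeouts(timeouts, now) {
  // Convert remaining time back into absolute deadlines.
  for (const t of timeouts) {
    t.when = now + t.when;
  }
}

const timeouts = [{ when: 1500 }, { when: 900 }];
suspendTimeouts(timeouts, 1000);  // remaining: 500, and 0 (overdue, clamped)
resumeTimeouts(timeouts, 4000);   // new deadlines: 4500 and 4000
console.log(timeouts.map(t => t.when)); // [ 4500, 4000 ]
```

The clamp matters: a timeout that was already overdue when the page was cached should fire as soon as the page is restored, not at a negative offset.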
Comment on attachment 182197 [details] [diff] [review]
pref'd off patch, with some review comments addressed

Not repeating earlier review comments that haven't been addressed, on the assumption that there are plans to address them.

>+ /**
>+ * Notify the document that its associated ContentViewer is being destroyed.
>+ * This releases circular references so that the document can go away.

So this isn't guaranteed to be called (since not all documents have a content viewer), right? That's sort of what we have now, but may be worth documenting anyway. And please file a followup bug on figuring out why this is needed at all.

>Index: content/base/src/nsDocument.cpp
>+nsDocument::Destroy()
>+ // XXX HACK ALERT! If the script context owner is null, the document

Fix the comment? This has nothing to do with the script context owner anymore.

>+ // This has to be done before we actually
>+ // set the script context owner to null so that the content elements
>+ // can remove references to their script objects.

Is that still relevant? If so, we have a problem.... If not, it should probably be removed.

>Index: docshell/base/nsDocShell.cpp
>+nsDocShell::CanSavePresentation(nsIRequest *aNewRequest)
>+ // If there is an unload or beforeunload listener on the window, then we do
>+ // not save the presentation into the session history cache.

Just realized that this doesn't check for such listeners on subdocuments. It probably needs to....

>@@ -5588,6 +6053,11 @@ nsDocShell::InternalLoad(nsIURI * aURI,
>+ // Check for saving the presentation here, before calling Stop().
>+ // This is necessary so that we can catch any pending requests.
>+ PRBool savePresentation =
>+ aSHEntry ? CanSavePresentation(nsnull) : PR_FALSE;

This looks wrong, fwiw. aSHEntry is always null for normal loads, so we'll always end up with a PR_FALSE here on, say, a link click. Then we'll stop current network activity, etc, etc. So the checking for network activity is still broken with this patch, if fastback is enabled. Doesn't affect things with it disabled, of course.

>+ // If we have a saved content viewer in history, restore and show it now.
>+ if (aSHEntry) {
>+ if (!savePresentation)
>+ FireUnloadNotification();

I don't see why we're doing this (and I should have caught this the first time). This will fire unload here if we can't save the presentation. Why? The old code didn't use to fire unload here.... We need to fire unload if we _restore_ a presentation in this block, but that's handled inside RestorePresentation(). In any case, this is changing the current codepath for history loads in a way that looks wrong to me, and does so even when fastback is disabled.

>Index: docshell/base/nsIContentViewer.idl
>+ * (*) unless the document is currently being printed, in which case
>+ * it will never be saved in session history.

But we would have skipped firing unload on it anyway? That doesn't sound so great... Not an issue for in-page listeners (since we check for those), but chrome listening for unload on the content will get a little confused. It might even do that in the non-printing case, though, since we also skip firing unload there (and the user may never hit back). Not an issue for the preffed-off patch, I guess.

>Index: layout/base/nsDocumentViewer.cpp
> DocumentViewerImpl::~DocumentViewerImpl()
>- NS_ASSERTION(!mDocument, "User did not call nsIContentViewer::Close");

Is it normal to have an mDocument here now? If not, perhaps we should leave a warning here? But I'm guessing that it is?

The rest looks ok. Note that I didn't read any of the code that's not supposed to execute when fastback is off. So the only parts I really reviewed are the change to docshell to make sure turning it off really leaves it off, the changes to codepaths that are executed when it's off, and the ownership stuff...
The issue with firing unload on history traversals needs to be fixed; once that's fixed I guess I'm ok with this being checked in, preffed off, as long as the final code we decide to enable gets reviewed before we enable it.
Attachment #182197 - Flags: review?(bzbarsky) → review-
Attached patch new pref'd off patch (obsolete) — Splinter Review
This patch addresses all of the review comments in this bug, except for a few which I'll mention at the end. Here are the changes from the last patch:

- Replaced nsIDocument::HasRequests with CanSavePresentation, which is also overridden by nsPluginDocument to always return false. This checks for pending requests and for unload handlers, recursively.
- Change nsGenericElement::ShouldFocus to not allow focus if there is no window.
- Fixed nsImageDocument::Destroy/SetScriptGlobalObject.
- Fixed nsXBLPrototypeHandler::ExecuteHandler to just bail if it can't get a window.
- Made refresh URI suspend/resume recursive
- Removed special case in CanSavePresentation for max_viewers == 0, since I fixed the problems in nsSHistory (see below).
- Changed securityState to be an nsISupports.
- Made DOMPageRestore fire on all subframes, bottom-up.
- Added a mechanism to make the saved presentation state of a history entry self-consistent. Call this whenever we had an error saving or restoring a presentation.
- Made RestorePresentation return a boolean for whether the presentation was restored; this makes it so there is no error return if we simply hadn't saved the presentation.
- On Brendan/jst advice, get rid of saving/restoring exception state in CopyJSPropertyArray. Instead, use LookupUCProperty, and explicitly exclude the location property since we prevent it from being defined.
- Clean up focus logic in RestoreWindowState -- don't special case various element types, and don't steal focus away if the toplevel window is not active.
- Use nsInt64 where appropriate in nsGlobalWindow::SuspendTimeouts.
- Make the document viewer unset the document container and prescontext link handler when the document stores itself into session history. These are restored in Open(). This makes both the form-submit and link-traversal testcases work as expected.
- Yank the content viewer back out of history if Sanitize() fails.
- Make sure that content viewer eviction runs _after_ storing the content viewer into history (not after creating the history entry). This makes the max_viewers == 0/1 cases work correctly.
- Made plugin stop/restore and caret hiding be recursive.
- Made password manager only process trusted DOMPageRestore events.
- Fixed an unrelated error in nsTypeAheadFind that I found while testing caret browsing (missing QI entry for nsIObserver).
- Eliminated need for nsIViewManager::InvalidateHierarchy. Just update the pointers automatically whenever a root view is inserted or removed. Roc thought this would be a good approach; this case is not common at all.
- Made sure to clone none of the presentation state for history entries.

Now, the issues that I did not address:

- It may still be possible for focus and blur to have effects in cached documents
- The presentation is not cleared out when the DOM is mutated
- I didn't do anything with respect to WebProgressListener notifications; darin and I continue to feel that we should basically try this and see what breaks.
- I didn't do anything to address the possibility of reentering docshell code due to plugin start. I believe this could trivially be put off on a PLEvent but I think a wait-and-see approach is best.
- Going back fires a focus event on the last-focused content; I still want to test IE and Safari and copy their behavior on this.
- I did not do anything to make the cache be global instead of per session-history instance.
- I have not yet filed a follow-up bug to determine why nsIDocument::Destroy is needed.

All in all I think things are pretty good with this patch, both with it turned off and with it turned on. I'm not seeing any assertions or crashes anymore.
Attachment #182197 - Attachment is obsolete: true
Attachment #182539 - Flags: review?(bzbarsky)
That patch also doesn't address the issue with network requests being killed and restarted. Just try it on the very first testcase I posted in the bug. I pointed out in comment 73 that the current approach to dealing with the issue simply doesn't work.... I'd really prefer that we leave the short-circuit in place, since that will make this a lot easier to review. I simply won't have the time to adequately review the bulk of this patch this week.
Jesus, let's get this in and be done with it already. I can and will fix bugs later. I'm not disappearing tomorrow. If you can't review it this week, we need to find someone who can so that this can go in for alpha.

(In reply to comment #120)
> That patch also doesn't address the issue with network requests being killed and
> restarted. Just try it on the very first testcase I posted in the bug. I
> pointed out in comment 73 that the current approach to dealing with the issue
> simply doesn't work....
>
> I'd really prefer that we leave the short-circuit in place, since that will make
> this a lot easier to review. I simply won't have the time to adequately review
> the bulk of this patch this week.
Just in case people still are not clear on my position, I want this patch to land pref'd off without regressions, ASAP. I trust bryner to respond to bz's review comments and tests, and do whatever else it takes to fix this bug in the right time frame, with some ability to rest, recuperate, get help, etc. So I think this patch should land tomorrow.

/be
Let me rephrase. I believe that the network-activity testcase is not an architectural problem and can be fixed by tweaking the CanSavePresentation logic. I don't think that there is anything right now that should block this from landing turned off, and very little, if anything, that should block it from being turned on. The short-circuit path is convenient for limiting regressions in the short-term, but I think we really want this to just be an immediate ejection from session history, and I think that's pretty safe. Safe enough for 1.1 alpha, in fact.
Perhaps I need to clarify:

1) CanSavePresentation is being called at the wrong time to possibly be able to fix the network-activity testcase. If by "tweak the logic" you mean "call it at a few other places in the load sequence", I buy that, perhaps. If you mean "edit the logic of the existing method", there's no way that will help.

2) I believe the short-circuit is the right way to go in general, not just for my review convenience, since it avoids a fair amount of work for cases when this is disabled. Given that this being disabled is our proposed embedding scenario for the near term, we should make that case reasonable.

3) I'm getting mixed signals here. We want this to land preffed off without regressions but today? It'll take me on the order of 15-20 hours of reviewing time to be able to sign off on that without the short-circuit. With the short-circuit I can do that in 2-3 hours.

So I'm going to review this with the assumption that the short-circuit will be put back in (which means I'll not review any code called by CanSavePresentation or any code triggered when that returns true). Then you can land it if you want as far as I'm concerned, as long as that code _is_ reviewed (and tested, dammit!) before the pref is enabled.
I'm not sending mixed messages, I'm trusting bryner to do the right thing.

If the patch lands pref'd off and we do a bunch of extra work that's thrown away on every back and forward, we are technically not regressing correctness, but we are wasting memory and cycles. The short-circuit code avoids this. But is it really necessary before landing? I said "without regressions".

bryner: will everyone who runs default builds (with the pref set to 0) notice the lack of the short circuit?

/be
It sounds like Thunderbird does not turn off session history, so without the short circuit this patch will cause presentations to be saved and thrown away on every message load over top of a previous message view. That's a case for fixing Thunderbird and other such apps to disable session history altogether in their embeddings. But it also suggests that the short circuit is the safer, more conservative way to handle the pref, besides being more efficient for the pref'd off case.

So I'm in favor of the short circuit. But, I still trust bryner to do that as an easy step after landing. At this point followup bugs seem much better for tracking open issues.

/be
Is there a mechanism to turn off session history for a particular docshell/<xul:browser>/whathaveyou?
It sounded like Thunderbird didn't disable history, but looking is better than listening -- http://lxr.mozilla.org/aviary101branch/source/mail/base/content/messageWindow.xul#146. There may be other <browser> tags that lack that attribute setting, though. Anyway, please land and fix the short circuit in a separate patch. /be
We should assume that there are embeddings out there that do not turn off session history and won't have any chance to rev code before being used with Gecko 1.8. I'm thinking of Eclipse mainly (I have not tested it), and there may be others.
Comment on attachment 182539 [details] [diff] [review]
new pref'd off patch

>Index: content/base/src/nsDocument.cpp
>+nsDocument::CanSavePresentation(nsIRequest *aNewRequest)
>+ if (request && request != aNewRequest) {
...
>+ return PR_TRUE;

PR_FALSE, right?

>Index: content/html/document/src/nsImageDocument.cpp
>+nsImageDocument::Destroy()

This needs to call the superclass Destroy().

>Index: docshell/base/nsDocShell.cpp
>+nsDocShell::FireRestoreEvents()
>+{
>+ // These events fire bottom-up, so call this on our children first.
>+ PRInt32 n = mChildList.Count();

Followup bug -- if event handlers modify the DOM, this can walk off the end of the array. In general, it's worth sorting out what should happen here if the DOM gets mutated (shell subtrees get removed, say). Similarly, the caller of this method needs to deal with the docshell being removed from the docshell tree; I'm not sure it does.

>+nsDocShell::RestorePresentation(nsISHEntry *aSHEntry, PRBool aSavePresentation,
>+ // save the previous content viewer size

We're getting it, not saving it....

>Index: docshell/base/nsIDocShell.idl
>+ void suspendRefreshURIs();
>+ void resumeRefreshURIs();
>+ void fireRestoreEvents();

I really wish we didn't need to put these on a "public" scriptable API... Maybe we can do something like the GetAsDocLoader thing we have... or something.

>Index: layout/base/nsPresShell.cpp
>+FreezeSubDocument(nsIDocument *aDocument, void *aData)

Later on, might be worth doing the recursion in document, not presshell, and freezing all presentations, not just the one at 0.

Let's file separate bugs on the remaining issues, since they're starting to get lost here... With the short-circuit restored, this is ok to land, I think. I still haven't really had a chance to trace the details of the session history stuff, but it looks reasonable at first glance. Thank you for bailing out on Sanitize failure!

r=bzbarsky (with short-circuit in)
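For the record, a common way to keep a bottom-up dispatch like FireRestoreEvents safe against handlers that mutate the tree is to snapshot the child list before recursing. This is only a sketch with invented names in plain JavaScript, not the actual nsDocShell code or the fix tracked by the followup bug:

```javascript
// Bottom-up event dispatch over a mutable tree. Snapshotting `children`
// before recursing means a handler that removes siblings can't shift the
// iteration out from under us (the "walk off the end" hazard noted above).
function fireRestoreEvents(shell, fired) {
  const kids = shell.children.slice(); // snapshot before dispatching
  for (const kid of kids) {
    fireRestoreEvents(kid, fired);
  }
  if (shell.onRestore) shell.onRestore();
  fired.push(shell.name);
}

const child2 = { name: "child2", children: [] };
const child1 = {
  name: "child1",
  children: [],
  // A hostile handler that mutates the tree mid-dispatch.
  onRestore() { root.children.length = 0; },
};
const root = { name: "root", children: [child1, child2] };

const fired = [];
fireRestoreEvents(root, fired);
console.log(fired); // [ "child1", "child2", "root" ] -- child2 still fires
```

A snapshot answers the array-bounds problem only; shells removed from the tree by a handler would still need an explicit liveness check before dispatching to them, as the review comment points out.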
Attachment #182539 - Flags: review?(bzbarsky) → review+
Actually, there is one more serious issue. Open a new window. Load an image (full-page image). Close the window. We crash when trying to remove event listeners off a null event target. This happens no matter what the pref is set to. The problem is that mScriptGlobalObject is null by the time we get into Destroy() there. Setting and unsetting of listeners on the global object should remain in SetScriptGlobalObject, not move to Destroy().
Just to clarify, the r= holds with that crash fixed.
(In reply to comment #127)
> Is there a mechanism to turn off session history for a particular
> docshell/<xul:browser>/whathaveyou?

xul:browser has a 'disablehistory' attribute whose presence prevents (or at least ought to prevent) session history from being used.
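For reference, a minimal sketch of such an element (the id and flex values here are illustrative, not taken from any real file; only the disablehistory attribute is the point):

```xml
<!-- Session history disabled for this browser; per the comment above,
     it is the presence of the attribute that matters. -->
<browser id="messagepane" disablehistory="true" flex="1"/>
```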
Depends on: 292890
- Fixed inverted return value of nsDocument::HasRequests
- Fixed nsImageDocument to deal with the window event listeners in SetScriptGlobalObject (the mImageContent listener is still unhooked in Destroy). Also make sure to call nsMediaDocument::Destroy().
- Restored the short-circuit in nsDocShell::CanSavePresentation. Reordered the checks in that method to minimize the overhead in cases where we can't save the presentation (doc->CanSavePresentation() becomes last since it does a recursive crawl of the subdocuments).
Attachment #182539 - Attachment is obsolete: true
That last patch looks good to go to me.
Can someone set a review flag on the last patch?
(In reply to comment #136)
> Can someone set a review flag on the last patch?

https://bugzilla.mozilla.org/show_bug.cgi?id=274784#c132
Depends on: 292903
I'm getting a build error on Win32 cygwin MinGW which seems to be due to this bug's patch:

e:/mozilla/source/mozilla/dom/src/base/nsGlobalWindow.cpp: In function `nsresult CopyJSPropertyArray(JSContext*, JSObject*, JSObject*, JSIdArray*)':
e:/mozilla/source/mozilla/dom/src/base/nsGlobalWindow.cpp:5928: error: invalid static_cast from type `jschar*' to type `PRUnichar*'
e:/mozilla/source/mozilla/dom/src/base/nsGlobalWindow.cpp:5962: error: invalid static_cast from type `jschar*' to type `PRUnichar*'
make[5]: *** [nsGlobalWindow.o] Error 1
Attached patch mingw fixSplinter Review
Comment on attachment 182608 [details] [diff] [review]
pref'd off patch w/ short circuit and crash fix

- In CopyJSPropertyArray():

+ JSString *propname = JSVAL_TO_STRING(propname_value);
+ jschar *propname_str = ::JS_GetStringChars(propname);
+ NS_ENSURE_TRUE(propname_str, NS_ERROR_FAILURE);
+
+ // We exclude the "location" property because restoring it this way is
+ // problematic. It will "just work" without us explicitly saving or
+ // restoring the value.
+
+ if (!nsCRT::strcmp(NS_STATIC_CAST(PRUnichar*, propname_str),
+ NS_LITERAL_STRING("location").get())) {
+ continue;
+ }
+
+ size_t propname_len = ::JS_GetStringLength(propname);

There is an nsDependentJSString class that you could use here, but you do need the jschar pointer too so I'm not sure it's worth it.

Oh, and on an unrelated note, I wonder if the state-saving code could be triggered from a timeout? If it could, then there are some nasty timer issues to deal with. When an XPCOM timer fires, we may run more than one JS timeout if the XPCOM timer fires late (which is not that uncommon), or if there is more than one timeout with the same deadline. In such a case, we need to stop firing timeouts if we end up saving state, and we need to make sure we never save timeouts that are on the stack (if this code is triggered from a timeout there will be a timeout in the timeout list that's not heap allocated; look for dummy_timeout in nsGlobalWindow::RunTimeout()). Separate bug, I guess, but it needs fixing, I think (or are location changes from JS *guaranteed* to be async?).
Dunno if you guys want comments in here for side effects or not. Some side effects:

1) There seems to be a short pause before the browser starts loading the page... actually I think the page loads before it is displayed, except for secure pages (see #3).
2) The favicon is not changing when I go back and forward... the second page I go to, its favicon stays in the address bar.
3) Not instant back/forward with secure servers, e.g. PayPal, Bugzilla, usaa.com... it also takes a long time for the page to display.
File new bugs for new problems, please. Also, study up on this bug before reporting intentional design decisions as bugs (and don't report the paint suppression delay as a bug, if that's what comment 141 #1 is). /be
Component: DOM: Level 0 → History: Session
Depends on: 292926
Blocks: 292934
No longer blocks: 292923, 292933, 292934
Depends on: 292923
No longer depends on: 292926
Blocks: 292933
No longer blocks: 292933
Depends on: 292934
Depends on: 292938
Blocks: 292941
No longer blocks: 292941
Depends on: 292941
Depends on: 292945
Depends on: 292998
This was checked in on 5/4. Leaving the bug open since this is not yet turned on.
(In reply to comment #143) > This was checked in on 5/4. Leaving the bug open since this is not yet turned on. What's the pref to enable this in the nightlies? (I apologise for the bugspam if this was mentioned earlier.)
Alex, I think it should already be enabled in latest builds. The number of pages cached is held in the preference browser.sessionhistory.max_entries and is defaulted to 50, I think.
max_viewers and defaults to 0, actually.
Whiteboard: (o72555) → (o72555)Set browser.sessionhistory.max_viewers to nonzero to test
(In reply to comment #146)
> max_viewers and defaults to 0, actually.

If max_viewers determines the number of pages to cache and it's set to 0 by default, does that mean that the bfcache is disabled by default, in effect?
Depends on: 293049
(In reply to comment #147)
> If max_viewers determines the number of pages to cache and it's set to 0 by
> default, does that mean that the bfcache is disabled by default, in effect?

It is currently disabled by default while testing is carried out, and as a compromise while other, lower-risk patches that some people consider more important are attended to.
No longer depends on: 293049
Whiteboard: (o72555)Set browser.sessionhistory.max_viewers to nonzero to test → (o72555)
Depends on: 293049
(In reply to comment #144)
> What's the pref to enable this in the nightlies?

See http://weblogs.mozillazine.org/chase/archives/008085.html

(In reply to comment #145)
> The number of pages cached is held in the preference
> browser.sessionhistory.max_entries and defaults to 50, I think.

That's the number of pages to store in the session history (Go menu, Back/Forward menus). browser.sessionhistory.max_viewers is the relevant preference.

(In reply to comment #147)
> If max_viewers determines the number of pages to cache and it's set to 0 by
> default, does that mean that the bfcache is disabled by default, in effect?

browser.sessionhistory.max_viewers is the number of pages to store (I think that's per window/tab right now but may become global later). A value of 5 would be sensible. By default, it's not set at all, so the bfcache is disabled.

<rant>Most of these questions could be answered just by reading the bug.</rant>
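To spell out the preference described above as a config fragment: a user.js line like the following enables the bfcache in a test profile (the value 5 is the "sensible" value suggested in this comment; 0 or unset leaves the bfcache disabled).

```js
// Enable the back/forward cache for testing: keep up to 5 cached
// page presentations per window/tab.
user_pref("browser.sessionhistory.max_viewers", 5);
```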
No longer depends on: 285404
This is a great feature. I'm loving it. I would like to report some side effects I have seen:

- When pressing Back or Forward, the page view first turns grey, and then the page loads from the cache. This isn't the normal behavior used when following a link; I think it should be the same.
- Animated GIFs on a page you go Back or Forward to no longer animate.
- As I think I have read in a previous comment, the website icon is no longer displayed on a page that you go Back or Forward to.
Please report any side effects or issues in new bugs blocking this one if you want them to be fixed.
Depends on: 293074
Whiteboard: (o72555) → (o72555) Set browser.sessionhistory.max_viewers to nonzero to test
No longer depends on: 293074
Related crash cases: #293076
This might have caused bug 293135? The bug is reproducible _without_ enabling the pref.
Depends on: 293135
Depends on: 293076
Depends on: 293173
Depends on: 293175
Depends on: 293176
Depends on: 293179
Depends on: 293234
Depends on: 293235
Depends on: 293022
Depends on: 293270
Depends on: 293386
Depends on: 293534
Depends on: 293572
I think that this bug should no longer block 1.8b2. Instead, I think it should block aviary1.1, and once the time comes, the discussion about whether to enable or disable this feature should happen here.
Depends on: 293588
Depends on: 293698
Depends on: 293700
Depends on: 293709
(http://www.animatedgif.net/sitemessages/download/download.shtml)
1. Open a page with several animated GIFs and select View Image on one of them.
2. View the GIF, then press Back. Only the one you clicked keeps animating; the others are stuck.
Filed #293797
Depends on: 293797
No longer depends on: 293797
Depends on: 293793
Depends on: 293606
Depends on: 294258
Depends on: 294231
We'll be tracking blocking regressions individually, so I'm pulling this bug off the blocking-1.8b2 radar.
Flags: blocking1.8b2+
Depends on: 294454
Depends on: 294610
Depends on: 293754
Depends on: 295085
Blocks: majorbugs
Depends on: 295591
No longer blocks: majorbugs
Depends on: 295656
Depends on: 295931
Depends on: 296575
Depends on: 296745
Depends on: 296760
Depends on: 297122
Depends on: 292987
Depends on: 293403
Depends on: 297488
Depends on: 297887
Depends on: 297945
Depends on: 298077
Depends on: 298125
No longer depends on: 292948
Depends on: 298133
No longer depends on: 297122
No longer depends on: 297945
Depends on: 298112
Depends on: 298622
Depends on: 298794
Depends on: 297926
Depends on: 299205
Depends on: 299547
Depends on: 299556
No longer depends on: 299205
Depends on: 300411
Depends on: 299853
Depends on: 300533
Depends on: 300538
Depends on: 300642
Depends on: 300663
Depends on: 300800
Depends on: 300802
Depends on: 300863
No longer depends on: 300863
Depends on: 300572
Depends on: 300944
Depends on: 300956
Depends on: 301354
Depends on: 301358
Depends on: 301397
No longer depends on: 293175
Depends on: 293175
Depends on: 301516
Depends on: 301517
Depends on: 298377
Depends on: 302290
Depends on: 299514
Depends on: 302956
Depends on: 303013
Depends on: 303151
Depends on: 303059
Depends on: 303237
Depends on: 303255
No longer depends on: 303237
Depends on: 303311
Depends on: 303360
Depends on: 303327
Depends on: 303267
Depends on: 303637, 303638
Depends on: 303904
Depends on: 304003
Depends on: 304288
Depends on: 304764
Depends on: 304639
Depends on: 305129
Depends on: 305167
Depends on: 300756
Depends on: 304994
Depends on: 303839
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b4) Gecko/20050823 Firefox/1.0+

http://www.key.ru -- DPA2 doesn't keep the page in memory; it redownloads/rerenders it when clicking the back or forward buttons. If this is a bug, will it be fixed?
Depends on: 305954
Depends on: 305995
Depends on: 306283
(In reply to comment #158)
> http://www.key.ru -- DPA2 doesn't put the page in memory and redownload/rerender
> it when clicking back or forward buttons. If it's a bug, will it be fixed?

It's bug 292941.
Blocks: 304083
Depends on: 308194
Depends on: 297575
No longer blocks: 38486
*** Bug 38486 has been marked as a duplicate of this bug. ***
Blocks: 164421
Depends on: 310738
Depends on: 311791
Depends on: 312027
Depends on: 312117
Depends on: 312816
No longer blocks: 304083
Depends on: 304083
Depends on: 314400
Depends on: 307178
Depends on: 314288
Depends on: 314549
Depends on: 317380
Depends on: 319527
Depends on: 306862
Depends on: 322725
Depends on: 328722
Depends on: 328752
done.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Depends on: 333050
Depends on: 319551
Depends on: 309993
Depends on: 338542
Depends on: 314362
Depends on: 343258
Component: History: Session → Document Navigation
QA Contact: ian → docshell
Depends on: 343747