Closed Bug 209275 Opened 21 years ago Closed 15 years ago

Mozilla doesn't update link's/hrefs when changing base href

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: webmaster, Assigned: justin.lebar+bug)

References

(Blocks 1 open bug)

Details

(Keywords: testcase)

Attachments

(2 files, 14 obsolete files)

246 bytes, text/html
Details
39.60 KB, patch
Details | Diff | Splinter Review
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4b) Gecko/20030507
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4b) Gecko/20030507

When updating the base.href, Mozilla doesn't update any href's on the page. They
still all use the old base.href, even though the base.href has been succesfully
changed.

Reproducible: Always

Steps to Reproduce:
1. load a page
2. change the base.href though javascript
3. hover over any link and you'll see it still uses the old base.href value

Actual Results:  
Links still pointed to URL which used the old base.href

Expected Results:  
When changing the base-URL, Mozilla should have updated any links on that page
Also not working in 1.4-RC2 and FireBird 0.6
Summary: Mozilla doesn't update link's/hrefs when changing base href → Mozilla doesn't update link's/hrefs when changing base href
Still exists in Mozilla 1.4. Can someone please confirm this?
jst, do we want to support this?
Ian, what do you think?  Should dynamic changes to the "href" attribute of
<base> change the base URI?
Sure, why not... :-D
QA Contact: desale → ian
Well... Here are some reasons for why not:

1)  If done "properly" it means reloading and reexecuting all scripts, reloading
    all style sheets, reloading all frames and objects and images.  This is
    unreasonable.  Changing just the hrefs of <a> elements without changing the
    hrefs of anything else makes no sense.

2)  Even doing just hrefs of links will take up time and add extra code.
Well, what happens if the xml:base attribute on the root element is changed 
dynamically? That definitely _should_ work, according to the xml:base and DOM3 
Core specs. And since DOM3 Core says the baseURI is calculated live from xml:
base and <html:base href="">, it seems that dynamically changing base href 
should indeed work.

I agree it means redoing a lot of work (although not scripts -- once they are 
executed that's it, in our model). But you asked... :-)
> Well, what happens if the xml:base attribute on the root element is changed 
> dynamically?

That works.  All base href lookups that happen after that will get the new base.
 All lookups that happened before do not.  So none of that reloading stuff
happens.  Somewhat inconsistent, but then xml:base is a rather bad idea overall
in my mind (as designed; it could have been less evil, but it got hit with a
dose of XML-ness).

> I agree it means redoing a lot of work

Not if we just do what we do for xml:base

> although not scripts

Consistency is the hobgoblin of little minds, eh?  ;)

I'm mostly opposed to adding code to the codebase to support an edge case
behavior of legacy HTML element when that edge case behavior is ill-defined and
internally inconsistent when defined in any "reasonable" way.
> [changing xml:base dynamically] works.  All base href lookups that happen 
> after that will get the new base.

Ah. Ok. Then do that for changing <html:base href="">.


> xml:base is a rather bad idea overall in my mind (as designed; it could have
> been less evil, but it got hit with a dose of XML-ness).

Dude, you think everything is bad idea overall. :-)


>> although not scripts 
> Consistency is the hobgoblin of little minds, eh?  ;)

Scripts don't reexecute when you move the node -- I was applying that same idea.
> Dude, you think everything is bad idea overall. :-)

The fact is, the vast majority of standards are poorly thought through in terms
of their performance implications and their interaction with other existing or
developing standards.   Having to check whether your image's URI changed every
single time the DOM changes around it is pretty ridiculous, for example.  Have
to do it if xml:base is to be supported, though.

Confirming this bug; the way to do this would be to override SetAttr on
nsHTMLSharedElement and if it's a <base> and the href is being set to clobber
the document's base URI.  Some changes to the content sinks (HTML and XML) may
also be needed if those cache the base URI (the change would be to not cache
it).  Finally, the current content-sink code that handles standard <base> can be
removed, since <base> will handle that itself.  Probably can't ditch the HTML
_baseHref stuff, though.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: helpwanted
OS: Windows XP → All
Hardware: PC → All
Whiteboard: [good first bug]
Is Bug 286809 related/a dupe of this bug here or is that another issue?
(In reply to comment #11)
> Is Bug 286809 related/a dupe of this bug here or is that another issue?

Duplicate.
(In reply to comment #11)
> Is Bug 286809 related/a dupe of this bug here or is that another issue?

Duplicate.
(In reply to comment #12)
> (In reply to comment #11)
> > Is Bug 286809 related/a dupe of this bug here or is that another issue?
> Duplicate.

Well, almost. The difference between the bugs is that in Bug 286809, changes 
are made to base href BEFORE the base href is used. It tries to dynamically set 
it at loading. Here, in 209275, changes to base href are made after loading. 
The only diffence between the bugs is how base href is used. The bug/issue 
itself is the same.

Blocks: 286809
No longer blocks: 286809
Depends on: 286809
In bug 286809 I described why I set this one to depend on 286809:  If that one
is fixed, it does not imply this one is fixed, whereas if this one is fixed it
means that 286809 is also fixed.  Hence, this one being "fixed" depends on that
one being fixed.

However, I would add my opinion that the resolution of this bug would seem to
depend on what the definition of <BASE HREF=...> really is.  Should it mean the
base reference that SUBSEQUENT LOADS are performed with respect to OR should it
mean the base reference that all current pages must be relative to (at all
times).  It seems to me that this second variant is rather more cumbersome. 
Whereas my verbiage might be cleaned up, it seems that even so this variant
entails a higher degree of overhead (and, of course subsequently loaded code is
bound to change the <BASE HREF=...> too).

In short, I don't see this one as a bug.  It's expected behaviour to me.

Csaba Gabor from Vienna
Component: DOM: HTML → DOM: Core & HTML
QA Contact: ian → general
So, is this still a bug or is there a patch to fix it?
This is still a bug.  Latest stable build and latest public beta.

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1b3) Gecko/20090305 Firefox/3.1b3
It's worth nothing that other browsers, from Safari to Chrome to Opera to IE, have this functional.  So they've long figured out a way to make this work without too many problems, I assume.
Attached file Testcase
Keywords: testcase
Blocks: 500328
Attached patch Patch v1 (obsolete) — Splinter Review
First pass at a patch.  I have two concerns about this patch:

1) I update the document's base URI whenever any <base> tag's href is changed.  It might be better to update the base URI only when the first <base>'s href changes, but it might not matter.

2) Whenever an <a> tag is removed from the document's link map, it needs to have its cached href cleared, and it should be the case that IsInDoc() returns false for all links not in the link map.  Otherwise, we can get stale hrefs in some links.  We need to be sure that this invariant is true.

If that's too hard, we could explicitly turn caching on/off when we add/remove from the link map by adding a new nsAttrValue::ValueType which we check before we cache the resolved href.  This is another piece of state we'd have to keep synchronized, but at least it's explicit state as opposed to implicit state.
Assignee: general → jlebar
Status: NEW → ASSIGNED
Attachment #396761 - Flags: review?(bzbarsky)
Attachment #396761 - Flags: review?(bzbarsky) → review?(peterv)
Attached patch Patch v1.1 (obsolete) — Splinter Review
Oops; I forgot to hg add my testcase to the last patch.
Attachment #396761 - Attachment is obsolete: true
Attachment #397901 - Flags: review?
Attachment #396761 - Flags: review?(peterv)
Attachment #397901 - Flags: review? → review?(bzbarsky)
Attached patch Patch v2 (obsolete) — Splinter Review
Fixes a number of issues with the previous patch.  The biggest change is that the document now keeps track of its first <base> tag.  Changes to any other <base>'s href don't affect the document.
Attachment #397901 - Attachment is obsolete: true
Attachment #399390 - Flags: review?(bzbarsky)
Attachment #397901 - Flags: review?(bzbarsky)
Justin, would you please comment on what the rationale is for only taking action on the first setting of the <base> tag vs. your earlier fix of taking action whenever the <base> tag is changed?

It seems to me that the prior version would also fix bug 286809 which is the only use case given for this bug, whereas the fresh patch would specifically prevent fixing that bug.  Or have I misunderstood something?
Csaba, Justin's patch is taking action on any setting of the first <base> tag, not on first setting of anything.  Please reread carefully.
Blocks: 286809
No longer depends on: 286809
Blocks: 515401
Attachment #399390 - Flags: review?(bzbarsky) → review-
Comment on attachment 399390 [details] [diff] [review]
Patch v2

>+++ b/content/base/public/nsIContent.h

Need an IID change here.

>+++ b/content/base/public/nsIDocument.h

>+   * Iterates over the document's link map and updates the href of all the
>+   * links.  This should be called whenever the base URI of the document
>+   * changes.
>+   */
>+  virtual void RefreshLinkHrefs() = 0;

That's not actually what the function does, though, right?

>+++ b/content/base/src/nsDocument.cpp
> nsDocument::SetBaseURI(nsIURI* aURI)

Shouldn't you be comparing to GetBaseURI, not mDocumentBaseURI, for purposes of equalBases?

> nsDocument::GetBaseURI(nsAString &aURI)

>+  nsCOMPtr<nsIURI> baseURI = static_cast<nsIDocument*>(this)->GetBaseURI();

Why do you need the cast and the nsCOMPtr?  Just call nsIDocument::GetBaseURI() directly, no?

>+nsDocument::RefreshLinkHrefs()

>+  DestroyLinkMap();

Hmm.  So the thing here is that this will clear non-<a> links from the link map too, right?  What puts them back?  I bet one can write a (failing) test for this.

Also, where do we trigger restyles for links that aren't <a> on base URI change?  Again, I bet one can write a (failing) test for this.

>+  PRUint32 childCount;
>+  nsIContent * const * child = GetChildArray(&childCount);
>+  for (PRUint32 i = 0; i < childCount; i++) {
>+    RefreshLinkHrefsRecursive(*child);
>+    child++;

How about just:

  RefreshLinkHrefsRecursive(GetRootContent())

?  Probably conditioned on GetRootContent() not being null.

>+nsDocument::SetFirstBaseNode(nsIContent *elem)
>+    NS_ASSERTION(elem->Tag() == nsGkAtoms::base,
>+                 "Setting base node to a non <base> element?");

Shouldn't you assert the namespace too?

>+++ b/content/html/content/src/nsGenericHTMLElement.cpp
>+  // We may have to re-resolve all our cached hrefs when the document's base
>+  // URI changes. The document keeps track of the links it contains, so it can
>+  // re-resolve those links; for links not in the document, we simply don't
>+  // cache the URI.
>+  if (isURIAttr && GetOwnerDoc() == GetCurrentDoc()) {

The point is the base URI depends on the owner document, but the current document is what keeps track of links.  So if the two don't match, we shouldn't cache.

>+   * If this element's href attribute is a lazy URI value, this method clears
>+   * that cached value.
>+   *
>+   * Overrides empty implementation in nsIContent.
>+   */
>+  virtual void RefreshCachedHref();

Why is this on nsGenericHTMLElement, not on nsHTMLAnchorElement?

>+++ b/content/html/content/src/nsHTMLSharedElement.cpp
>+  nsresult SetAttr(PRInt32 aNameSpaceID, nsIAtom* aName,

What about UnsetAttr?

>+already_AddRefed<nsIURI> ResolveBaseURI(nsIContent* elem)

>+  nsCOMPtr<nsIDocument> doc = elem->GetOwnerDoc();
>+  nsCOMPtr<nsIURI> currentURI = doc->GetDocumentURI();

Why do those need to be strong refs?

>+  nsCOMPtr<nsIURI> newBase;
....
>+  NS_IF_ADDREF(newBase);

This _used_ to not compile.  Too bad it does now.  :(

>+  return newBase.get();

The right way to do this is |return newBase.forget()| without the manual addref.

>+nsHTMLSharedElement::SetAttr(PRInt32 aNameSpaceID, nsIAtom* aName,

>+    nsCOMPtr<nsINode> firstBaseNode = doc->GetFirstBaseNode();

Why do we need the strong ref here?

>+nsHTMLSharedElement::BindToTree(nsIDocument* aDocument, nsIContent* aParent,
>+                                nsIContent* aBindingParent,
>+                                PRBool aCompileEventHandlers)

Weird indent here.

>+    nsCOMPtr<nsIDocument> doc = GetOwnerDoc();
>+    NS_ENSURE_TRUE(doc, NS_ERROR_FAILURE);

Why do we need to do that, exactly?  Which <base> should this <base> be the <base> for?  I'd assume it should have no effect unless it's in its owner document, right?

>+      aDocument->SetFirstBaseNode(this);
>+      nsCOMPtr<nsIURI> newBaseURI = ResolveBaseURI(this);
>+      doc->SetBaseURI(newBaseURI);

Shouldn't SetFirstBaseNode handle the SetBaseURI call?  That would make more sense to me...

>+  if (elem->Tag() == nsGkAtoms::base)
>+    return elem;

That'll find lovely non-HTML <base> tags.

>+nsHTMLSharedElement::UnbindFromTree(PRBool aDeep, PRBool aNullParent)
>+    nsCOMPtr<nsIDocument> doc = GetOwnerDoc();

How could this be null here?  And shouldn't you just grab the doc up front, before the parent-class unbind?

>+    // I'd love to use NS_GetContentList here, but I can't because that flushes
>+    // content notifications.

Why is it a problem to flush content stuff under unbindfromtree?

If it actually is (check with sicking?), please file a followup bug to nuke this code once we land the html5 parser and remove content flushes?

>+    nsIContent * const * child = doc->GetChildArray(&childCount);

GetRootContent, again.

>+    // Resolve the new base URI relative to the document's current URI
>+    nsCOMPtr<nsIURI> newBaseURI = ResolveBaseURI(newBaseNode);
>+
>+    // the document checks that it's legal to set this base.  If it isn't, we
>+    // set the base to null and hope that works.
>+    nsresult rv = doc->SetBaseURI(newBaseURI);
>+    if (NS_FAILED(rv))
>+      doc->SetBaseURI(nsnull);

An example of why SetFirstBaseNode should handle this: the behavior here is not identical to that of BindToTree.

Would it make more sense to update the mFirstBaseNode via nsDocument::ContentInserted/Appended/Removed, perhaps?
Blocks: 515819
Attached file Patch v2.1 (obsolete) —
I discovered that I'd misunderstood the spec.  We're supposed to ignore all but the first <base> *with an href*.  The code and tests now reflect this.

Comments on the comments on the patch:

> >+already_AddRefed<nsIURI> ResolveBaseURI(nsIContent* elem)
> 
> >+  nsCOMPtr<nsIDocument> doc = elem->GetOwnerDoc();
> >+  nsCOMPtr<nsIURI> currentURI = doc->GetDocumentURI();
> 
> Why do those need to be strong refs?

I changed them, but I'm not clear on what the rules are for using raw pointers vs nsCOMPtr. I was operating under the assumption that I should pretty much never use raw pointers, but apparently that's not right.

===========

> >+  if (elem->Tag() == nsGkAtoms::base)
> >+    return elem;
> 
> That'll find lovely non-HTML <base> tags.

Do I need to do anything additional other than check the namespace is XHTML?

===========

> >+nsHTMLSharedElement::UnbindFromTree(PRBool aDeep, PRBool aNullParent)
> >+    nsCOMPtr<nsIDocument> doc = GetOwnerDoc();
> 
> How could this be null here?  And shouldn't you just grab the doc up front,
> before the parent-class unbind?

I have no idea; I just generally sprinkle null checks on every pointer I get.  In any case, I now grab the document upfront.

===========

> Would it make more sense to update the mFirstBaseNode via
> nsDocument::ContentInserted/Appended/Removed, perhaps?

The question is whether the awkward public Get/SetFirstBaseNodeWithHref interface is worse than moving all that <base>-specific code from nsHTMLSharedElement, where it makes sense, into the clean, small AttributeChanged/ContentInserted/ContentRemoved/etc functions in nsIDocument.

My gut feeling is that it's better to have a suboptimal interface than to clutter nsDocument in this case, but I'm happy to change it if you feel like it should be the other way.
Attachment #399915 - Flags: review?(bzbarsky)
> I was operating under the assumption that I should pretty much never use raw
> pointers

For safety, probably true.  But it has some unsavory performance implications...

> Do I need to do anything additional other than check the namespace is XHTML?

No.  Just do elem->NodeInfo()->Equals() with the right arguments.

> I just generally sprinkle null checks on every pointer I get. 

That tends to lead to confusing code and people later cargo-culting the null-check.

I'll review with this in mind.

> The question is whether the awkward public Get/SetFirstBaseNodeWithHref
> interface is worse 

Right.  I'm not sure it is; let's try it for now.  That's why that was a question, not a review requirement... ;)
Attachment #399390 - Attachment is obsolete: true
Attached patch Patch 2.1.1 (obsolete) — Splinter Review
Uses NodeInfo->Equals() for determining whether a node is a base, as suggested.

Also adds a missing return statement.  I wish it was harder for those warnings to get lost in the noise of a build...
Attachment #399915 - Attachment is obsolete: true
Attachment #399922 - Flags: review?(bzbarsky)
Attachment #399915 - Flags: review?(bzbarsky)
Attached patch Patch v2.1.2 (obsolete) — Splinter Review
Added a test to make sure that links' cached hrefs are cleared when they're removed from the document.  No code changes.
Attachment #399922 - Attachment is obsolete: true
Attachment #400091 - Flags: review?(bzbarsky)
Attachment #399922 - Flags: review?(bzbarsky)
Attached patch Patch v2.1.3 (obsolete) — Splinter Review
Adds cycle collection to nsDocument for new member.  I think I did this right...
Attachment #400091 - Attachment is obsolete: true
Attachment #400112 - Flags: review?(bzbarsky)
Attachment #400091 - Flags: review?(bzbarsky)
Comment on attachment 400091 [details] [diff] [review]
Patch v2.1.2

>+++ b/content/base/public/nsIDocument.h
>+  /**
>+   * Re-resolves the hrefs of all the <a>s in the document, and updates their
>+   * styles.
>+   */
>+  virtual void RefreshLinkHrefs() = 0;

This is only called inside nsDocument, so can be an nsDocument private
function, right?  No need for it to live on nsIDocument.

> nsDocument::SetBaseURI(nsIURI* aURI)
>+    equalBases = !curBaseURI;

Why not compare to curBaseURI here too?  I doubt it'd ever be null, btw, since it's just the return value of GetBaseURI(), so you're always setting equalBases to false here.

>+  virtual void Visit(nsIContent* aContent) {
>+    // We can't call ContentStateChanged here, because that may modify the link
>+    // map.

Hmm.  Good catch.  I thought it couldn't, but it clearly can.  Should document this in the other visitor too.

>+  PRUint32 childCount;
>+  nsIContent * const * child = elem->GetChildArray(&childCount);
>+  for (PRUint32 i = 0; i < childCount; i++) {
>+    DropCachedHrefsRecursive(*child);
>+    child++;

How about:

  PRUint32 childCount;
  nsIContent * const * child = elem->GetChildArray(&childCount);
  nsIContent * const * end = child + childCount;
  for ( ; child != end; ++child) {
    DropCachedHrefsRecursive(*child);
  }

instead?  Less incrementing and all.

>+nsDocument::RefreshLinkHrefs()
>+  for (PRUint32 count = visitor.contentVisited.Count(), i = 0; i < count; i++) {
>+    ContentStatesChanged(visitor.contentVisited[i],
>+                         nsnull, NS_EVENT_STATE_VISITED);

You probably want:

  MOZ_AUTO_DOC_UPDATE(this, UPDATE_CONTENT_STATE, PR_TRUE);

around that loop.  I know the other map visitor wasn't doing that; it should too.

>+nsDocument::SetFirstBaseNodeWithHref(nsIContent *elem)
>+  if (elem) {
>+    NS_ASSERTION(elem->Tag() == nsGkAtoms::base,
>+                 "Setting base node to a non <base> element?");
>+    NS_ASSERTION(elem->GetNameSpaceID() == kNameSpaceID_XHTML,
>+                 "Setting base node to a non XHTML element?");
>+  }

Just move these asserts to after the !elem early return?

>+  if (elem) {

No need to null-check here; this is after the !elem early return.

>+++ b/content/base/src/nsDocument.h
>+  nsCOMPtr<nsIContent> mFirstBaseNodeWithHref;

You should presumably add this to the document's cycle collection, right?

>+++ b/content/html/content/src/nsHTMLSharedElement.cpp
>+

Don't add that blank line (in the class decl).

>+nsHTMLSharedElement::SetAttr(PRInt32 aNameSpaceID, nsIAtom* aName,
>+    return NS_OK;

No need; just fall through to the |return rv| below.

>+FindBaseRecursive(nsIContent * const elem)
>+  // Ignore the value of the href returned by GetAttr

HasAttr, please.  Much less overhead.

>+  PRUint32 childCount;
>+  nsIContent * const * child = elem->GetChildArray(&childCount);
>+  for (PRUint32 i = 0; i < childCount; i++) {

Similar change here to the nsDocument loop.

>+nsHTMLSharedElement::UnsetAttr(PRInt32 aNameSpaceID, nsIAtom* aName,
>+    nsCOMPtr<nsIContent> newBaseNode = FindBaseRecursive(doc->GetRootContent());

That doesn't need to be a strong pointer.

>+nsHTMLSharedElement::BindToTree(nsIDocument* aDocument, nsIContent* aParent,
>+  nsAutoString ignore;

HasAttr().

>+    nsCOMPtr<nsINode> curBaseNode = aDocument->GetFirstBaseNodeWithHref();

Can be a weak pointer.

>+nsHTMLSharedElement::UnbindFromTree(PRBool aDeep, PRBool aNullParent)
>+  nsCOMPtr<nsIDocument> doc = GetOwnerDoc();

Shouldn't that be GetCurrentDoc()?  That's what you use elsewhere in this code.  That would incidentally mean no need for the IsInDoc() check: if the GetCurrentDoc() is null, you're not in a document.

>+    // If we're not the first base node, then we don't need to do anything.
>+    if (!doc->GetRootContent() || this != doc->GetFirstBaseNodeWithHref())

Shouldn't need that first check, right?  We can't be the first base node if there is no root content...

>+    nsCOMPtr<nsIContent> newBaseNode = FindBaseRecursive(doc->GetRootContent());

Can be a weak pointer.

I didn't read the tests very carefully.  Let me know if I should.
Attachment #400091 - Attachment is obsolete: false
You should probably add mFirstBaseNodeWithHref to unlink too, not just traverse.
Comment on attachment 400112 [details] [diff] [review]
Patch v2.1.3

Still need to address the other comments.
Attachment #400112 - Flags: review?(bzbarsky) → review-
(In reply to comment #31)
> You should probably add mFirstBaseNodeWithHref to unlink too, not just
> traverse.

Is there a doc which explains the relevant rules here?  I noticed that most things in traverse aren't in unlink.
> Is there a doc which explains the relevant rules here? 

Whatever is in traverse should generally be in unlink.

nsDocument is a bit of an exception; the comment at the end of its unlink implementation explains why.  I think unlinking mFirstBaseNodeWithHref should be quite safe, though, and should be done.
(In reply to comment #30)
> >+++ b/content/base/public/nsIDocument.h
> >+  /**
> >+   * Re-resolves the hrefs of all the <a>s in the document, and updates their
> >+   * styles.
> >+   */
> >+  virtual void RefreshLinkHrefs() = 0;
> 
> This is only called inside nsDocument, so can be an nsDocument private
> function, right?  No need for it to live on nsIDocument.

I call RefreshLinkHrefs from the docshell in the pushstate patch, but I only call it after SetDocumentURI.  I wouldn't mind calling RefreshLinkHrefs from SetDocumentURI when mBaseURI is null, but we need to be sure that we don't start triggering full DOM walks during normal browsing.

SetDocumentURI is *not* called when I click on an anchor link.  If you can't think of any other times when SetDocumentURI causing a DOM walk would be problematic, then let's make this change.

> > nsDocument::SetBaseURI(nsIURI* aURI)
> >+    equalBases = !curBaseURI;
> 
> Why not compare to curBaseURI here too?  I doubt it'd ever be null, btw, since
> it's just the return value of GetBaseURI(), so you're always setting equalBases
> to false here.

I'm not sure what you're suggesting here.  Do you mean I should call |curBaseURI->Equals(aURI)|?  At least in nsStandardURL, that will either crash or immediately fail, so I might as well set equalBases to PR_FALSE.  |equalBases = !curBaseURI| is clearer to me, but I think equalBases = PR_FALSE is fine, with a comment.  Maybe I'm missing something here?

> >+nsHTMLSharedElement::UnbindFromTree(PRBool aDeep, PRBool aNullParent)
> >+  nsCOMPtr<nsIDocument> doc = GetOwnerDoc();
> 
> >+    // If we're not the first base node, then we don't need to do anything.
> >+    if (!doc->GetRootContent() || this != doc->GetFirstBaseNodeWithHref())
> 
> Shouldn't need that first check, right?  We can't be the first base node if
> there is no root content...

I'm not sure what circumstances might cause GetRootContent to be null, so I'm not sure how we guarantee that this == doc->GetFirstBaseNodeWithHref() implies doc->GetRootContent().  I have a similar check in UnsetAttr; should I get rid of that one, too?

Patch is forthcoming once we clear up these last few issues.
Attached patch Patch v2.1.4 (obsolete) — Splinter Review
Applies changes from comment #30, except for those mentioned comment #35.  I'm turning in my computer to IT now, so I wanted to get this patch up just in case my various backups fail.
Attachment #400091 - Attachment is obsolete: true
Attachment #400112 - Attachment is obsolete: true
> If you can't think of any other times when SetDocumentURI causing a DOM walk
> would be problematic

It's called in the following cases:

1) NS_NewXMLDocument (no DOM here).
2) nsDocument::ResetToURI (right after blowing away the DOM)
3) CloneDocHelper, which is called from cloneNode on a document before cloning
   the kids.

So there is nothing in the DOM in all cases when it's called.  We could even add an assert to that effect if we want (or at least assert no root content or something).

> Do you mean I should call |curBaseURI->Equals(aURI)|?

Oh, I see the problem.  This is the aURI == null case.  Right.  The right thing to do here is to compare curBaseURI (the old base URI) to mDocumentURI (the new base URI), right?

> I'm not sure what circumstances might cause GetRootContent to be null

Not having any elements in the DOM.

Reading over the UnbindFromTree code again, it could be that the root element got removed.  In that case, GetRootContent() will in fact not be null, and in fact we would be able to find our own <base> in that situation... (the one that's in the process of being removed from the document, along with everything else!).  See nsDocument::RemoveChildAt.  So that would actually not work correctly.  We should add a test for that too.

Maybe the right solution is to make FindBaseRecursive take an nsINode*const instead, and do FindBaseRecursive(doc) in UnbindFromTree and UnsetAttr?  I think that should work...
Modulo the issues left in comment 35, that last patch looks terrific!
Attached patch Patch v2.1.5 (obsolete) — Splinter Review
I'm either writing a bad testcase for the issue in comment #37, writing bad code to fix it, or both.  All relevant code is in the latest patch.

In the test, I load into an iframe a document with a <base>.  During the document's onload, it calls document.open() and document.write().  The document.write() writes an <a>, whose href I expect to be resolved relative to the document's URI, not the <base> which was erased (?) by the document.write().

Both before and after the changes I made in this patch to FindBaseRecursive, the <a> is resolved relative to the page's old base href.  It appears that calling document.open() doesn't unbind the <base> from the tree, because after the open() call, mFirstBaseNodeWithHref is non-null.  I'm not sure what the correct behavior should be.

So the questions are: Does this test the issue we were trying to fix by changing FindBaseRecursive to take an nsINode?  If so, how do we modify our solution so the testcase works?  If not, does this expose a new issue, or is this expected/acceptable behavior?
Attachment #400195 - Attachment is obsolete: true
> after the open() call, mFirstBaseNodeWithHref is non-null.

Before or after your changes?

> Does this test the issue we were trying to fix by changing FindBaseRecursive
> to take an nsINode?

No, because document.open() explicitly sets the mDocumentBaseURI of the resulting document to the GetBaseURI() of the calling document.  See http://hg.mozilla.org/mozilla-central/annotate/ff296aa200bb/content/html/document/src/nsHTMLDocument.cpp#l1946

> If not, does this expose a new issue, or is this expected/acceptable behavior?

The latter, for now.

I think the right way to test the nsINode thing is to document.removeChild(document.documentElement) on a document with a <base> and then createElement a new <html> element, append to the document, append a <body> to the <html>, append an <a> to the <body> and see what the base href is.

Can do it using both createElement/appendChild and using innerHTML on the <html> for extra checking if desired.
And you'd probably want to verify that without the nsINode change that test fails.
(In reply to comment #40)
> > after the open() call, mFirstBaseNodeWithHref is non-null.
> 
> Before or after your changes?
This happens with the latest patch.  I can test with the previous version, but I don't suspect that would change anything; do you?

Maybe I should just explicitly clear mFirstBaseNodeWithHref in document.open().
Attached patch Patch v2.1.6 (obsolete) — Splinter Review
Added testcase for removing the document node.  The test fails before we changed FindBaseRecursive to take an nsINode.

I also verified that both before and after we changed FindBaseRecursive, mFirstBaseNodeWithHref is non-null after a document.open() (assuming, of course, that the document has a <base> with an href).  I'm not sure if we care to clear it during nsHTMLDocument::OpenCommon().
Attachment #400701 - Attachment is obsolete: true
Attachment #400851 - Flags: review?(bzbarsky)
> but I don't suspect that would change anything

Nope.  Don't think it would.

> mFirstBaseNodeWithHref is non-null after a document.open() 

Ah, I see why.  See the XXXbz comment in nsDocument::ResetToURI right around the UnbindFromTree call.

Want to just fix that code and remove the XXX comment as part of this patch?  Just moving the RemoveChildAt to before the ContentRemoved call should do it.
Attached patch Patch v2.1.7 (obsolete) — Splinter Review
Modified nsDocument::ResetURI as suggested.  Verified that document.open() now causes mFirstBaseNodeWithHref to be null.
Attachment #400851 - Attachment is obsolete: true
Attachment #400891 - Flags: review?(bzbarsky)
Attachment #400851 - Flags: review?(bzbarsky)
Comment on attachment 400891 [details] [diff] [review]
Patch v2.1.7

In nsDocument::SetBaseURI, the |oldBase| branch assumes newBase is non-null.  Probably better not to do that.  Maybe structure the code like so:

if (oldBase && newBase) {
 // Call Equals
} else {
  equalBases = !oldBase && !newBase;
}

And maybe readd the document.open test that was failing before?

r=bzbarsky with those two nits.
Attachment #400891 - Flags: review?(bzbarsky) → review+
Attached patch Patch for checkin (obsolete) — Splinter Review
Forwarding review from last version.  I think we're good to go!
Attachment #400891 - Attachment is obsolete: true
Attachment #401597 - Flags: review+
Get rid of that printf() in nsDocument.cpp?  And then yes, I think we are.  Will land Sunday or Monday if it doesn't happen before then.
Attached patch Patch for checkin (take two) (obsolete) — Splinter Review
Argh.  For real this time.
Attachment #401597 - Attachment is obsolete: true
Attachment #401599 - Flags: review+
Keywords: checkin-needed
Pushed http://hg.mozilla.org/mozilla-central/rev/00377b2a7c75
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Whiteboard: [good first bug]
backed out due to mochitest timeouts. http://hg.mozilla.org/mozilla-central/rev/c193c5dc321c
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
JavaScript error: http://localhost:8888/tests/content/html/content/test/file_bug209275_3.html, line 15: missing ) after argument list
JavaScript error: http://localhost:8888/tests/content/html/content/test/file_bug209275_3.html, line 1: load is not defined
For those following along from home, we pushed http://hg.mozilla.org/mozilla-central/rev/00377b2a7c75 and backed out http://hg.mozilla.org/mozilla-central/rev/00377b2a7c75 because the test was hanging.  It should work now.
Attachment #401599 - Attachment is obsolete: true
Today just isn't my day.  This patch should be the right one.
Attachment #402758 - Attachment is obsolete: true
Keywords: checkin-needed
Pushed http://hg.mozilla.org/mozilla-central/rev/018f433b13f5
Status: REOPENED → RESOLVED
Closed: 15 years ago15 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.