Closed Bug 539193 Opened 15 years ago Closed 11 years ago

gatherTextUnder() in utilityOverlay.js doesn't work.

Categories

(Firefox :: General, defect)

5 Branch
x86
Windows XP
defect
Not set
minor

Tracking

()

RESOLVED DUPLICATE of bug 845363

People

(Reporter: ease, Unassigned)

Details

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100105 Firefox/3.6
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100105 Firefox/3.6

There is a bug in gatherTextUnder() in Mozilla Firefox/chrome/browser.jar/content/browser/utilityOverlay.js.
When the traversal runs out of siblings and there is no uncle/aunt node (a next sibling of the parent), the function exits.
For example, <A><B>Hello</B></A>.
A corrected version of gatherTextUnder() should be:

function gatherTextUnder ( root ) 
{
  var text = "";
  var node = root.firstChild;
  var depth = 1;  
  var needVisitChildren = true;
  while ( node && depth > 0 ) {
    // See if this node is text.
    if ( node.nodeType == Node.TEXT_NODE ) {
      // Add this text to our collection.
      text += " " + node.data;
    } else if ( node instanceof HTMLImageElement) {
      // If it has an alt= attribute, use that.
      var altText = node.getAttribute( "alt" );
      if ( altText && altText != "" ) {
        text = altText;
        break;
      }
    }
    // Find next node to test.
    // First, see if this node has children.
    if ( needVisitChildren && node.hasChildNodes() ) {
      // Go to first child.
      node = node.firstChild;
      depth++;
    } else {
      // No children, try next sibling.
      if ( node.nextSibling ) {
        needVisitChildren = true;
        node = node.nextSibling;
      } else {
        // Visited all children.
        needVisitChildren = false;
        node = node.parentNode;
        depth--;
      }
    }
  }
  // Strip leading whitespace.
  text = text.replace( /^\s+/, "" );
  // Strip trailing whitespace.
  text = text.replace( /\s+$/, "" );
  // Compress remaining whitespace.
  text = text.replace( /\s+/g, " " );
  return text;
}

Reproducible: Always
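
For illustration only, here is a minimal sketch of the failure mode described above, run against plain-object stand-ins for DOM nodes so it works outside a browser. The helpers el() and txt() and the names walkUncle() and walkFixed() are hypothetical. walkUncle() assumes the shipped loop stepped to node.parentNode.nextSibling when a node had no next sibling, which is what the "uncle/aunt" wording suggests. walkFixed() mirrors the loop proposed in the report, without the image handling. The sample tree nests two levels deep, because under that assumption this is where trailing text gets skipped.

```javascript
// Numeric values of Node.TEXT_NODE and Node.ELEMENT_NODE.
const TEXT = 3, ELEMENT = 1;

// Hypothetical helpers that build a tiny linked tree of plain objects.
function txt(data) {
  return { nodeType: TEXT, data, parentNode: null, nextSibling: null, firstChild: null };
}
function el(tag, ...children) {
  const node = { nodeType: ELEMENT, tag, parentNode: null, nextSibling: null, firstChild: null };
  children.forEach((child, i) => {
    child.parentNode = node;
    child.nextSibling = children[i + 1] || null;
  });
  node.firstChild = children[0] || null;
  return node;
}

// ASSUMED shape of the shipped walk: jump straight to the uncle/aunt.
function walkUncle(root) {
  let text = "";
  let node = root.firstChild;
  let depth = 1;
  while (node && depth > 0) {
    if (node.nodeType === TEXT) text += " " + node.data;
    if (node.firstChild) {
      node = node.firstChild;
      depth++;
    } else if (node.nextSibling) {
      node = node.nextSibling;
    } else {
      node = node.parentNode.nextSibling; // null when there is no uncle/aunt
      depth--;
    }
  }
  return text.trim().replace(/\s+/g, " ");
}

// Walk from the report: climb one level at a time, and do not
// descend again into a node that has already been visited.
function walkFixed(root) {
  let text = "";
  let node = root.firstChild;
  let depth = 1;
  let needVisitChildren = true;
  while (node && depth > 0) {
    if (node.nodeType === TEXT) text += " " + node.data;
    if (needVisitChildren && node.firstChild) {
      node = node.firstChild;
      depth++;
    } else if (node.nextSibling) {
      needVisitChildren = true;
      node = node.nextSibling;
    } else {
      needVisitChildren = false;
      node = node.parentNode;
      depth--;
    }
  }
  return text.trim().replace(/\s+/g, " ");
}

// <a><b><i>Hello</i></b> world</a>
const tree = el("a", el("b", el("i", txt("Hello"))), txt(" world"));
console.log(walkUncle(tree)); // "Hello"
console.log(walkFixed(tree)); // "Hello world"
```

In this sketch the first walk stops as soon as the <i> element turns out to have no next sibling, so " world" is never reached; the second walk climbs back to <b> and continues with its sibling.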
Component: Developer Tools → General
QA Contact: developer.tools → general
Reporter, are you still seeing this issue with Firefox 3.6.13 or later in safe mode? If not, please close. These links can help you in your testing.
http://support.mozilla.com/kb/Safe+Mode
http://support.mozilla.com/kb/Managing+profiles

You can also try to reproduce in Firefox 4 Beta 8 or later; there are many improvements in the new version: http://www.mozilla.com/en-US/firefox/all-beta.html
Whiteboard: [CLOSEME 2011-1-30]
gatherTextUnder() in utilityOverlay.js hasn't changed a bit in 3.6.13 or 4 beta 8, so the bug is still open.
Whiteboard: [CLOSEME 2011-1-30]
Version: unspecified → 3.6 Branch
What exactly is the bug (how can I reproduce it)?
This can be an issue only when someone uses the gatherTextUnder() API.
I found this issue when I wanted to use the internal JS API, but
now (FF5) I can't find utilityOverlay.js any more.
Maybe you can close this issue now.
(In reply to comment #4)
> I found this issue when I wanted to use the internal JS API, but
> now (FF5) I can't find utilityOverlay.js any more.

I'd say it's pretty much alive
http://mxr.mozilla.org/mozilla-central/source/browser/base/content/utilityOverlay.js#331
Is it still available to add-on developers?
Then it's back on the table.
Version: 3.6 Branch → 5 Branch
I tried rewriting gatherTextUnder() using only text.replace instead; it seems to be working.

Before:
http://jsfiddle.net/gh8xN/11/

After:
http://jsfiddle.net/tneMc/10/

Test case taken from 369341:
http://jsfiddle.net/tneMc/9/
CQD. What a brilliant idea! It seems to work.
Status: UNCONFIRMED → RESOLVED
Closed: 13 years ago
Resolution: --- → WORKSFORME
I just wonder which is more efficient:

the original method (recursive DOM tree traversal) or
CQD's method (innerHTML -> text replace)?

Can anyone tell me?
Before
http://jsfiddle.net/gh8xN/33/

After
http://jsfiddle.net/tneMc/11/

It's about 20~25% faster on my machine. 

I am personally more worried that the match pattern used to strip img tags would fail in some unforeseen circumstance...
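
To make that worry concrete, here is a small sketch of one way a tag-stripping pattern can go wrong. The pattern below is hypothetical and is not taken from the linked fiddles; it only stands in for the general "strip img with a regular expression" idea. It shows that a ">" inside an attribute value ends the match early.

```javascript
// HYPOTHETICAL pattern, assumed for illustration; the fiddle's actual
// expression is not reproduced here.
const stripImg = (html) => html.replace(/<img[^>]*>/gi, "");

const plain = '<img src="a.gif" alt="intern"> Link text';
const tricky = '<img src="a.gif" alt="1 > 0"> Link text';

console.log(JSON.stringify(stripImg(plain)));  // " Link text"
console.log(JSON.stringify(stripImg(tricky))); // " 0\"> Link text"
```

In the second case the match stops at the ">" inside the alt value, so part of the tag leaks into the result. Walking the DOM avoids this class of problem because the markup has already been parsed.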
Doesn't look like this was actually resolved...
Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: WORKSFORME → ---
I think I may be running into this exact problem with an extension I maintain. If the problem I'm about to describe doesn't match up with this report, let me know and I'll file a new bug.

My extension CoLT uses the gContextMenu.linkText() function, which subsequently calls the gatherTextUnder() function on the specified link. Using an example from the middle column of http://tagesschau.de/ (choose any large-text link in that column), the returned text is the word "intern" instead of the link's text. The reason this happens is due to the way the site has organized their link structure. Here's what a sample link looks like on that site:

<h2><a href="/some/location"><img src="/some/image.gif" alt="intern" /> Link text here</a></h2>

The gatherTextUnder() function does not find the text within the link since it sees the image first. In this case, it gets the alt attribute from the image, and returns that as the text. This is obviously incorrect behavior, as the link has associated text; it just doesn't appear first within the element.

It seems to me that this function should only return the alt attribute of a nested image if the associated link has no text at all within it (i.e. the alt attribute should be the last possibility for what to return).

Here's a look at the Firefox linkText() function that I'm using: http://mxr.mozilla.org/mozilla1.9.1/source/browser/base/content/nsContextMenu.js#1203
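
As a rough sketch of the behavior suggested above (collect the link's text first and use an image's alt attribute only when no text is found), something like the following could work. This is not Firefox code: gatherTextPreferText(), makeText() and makeElement() are hypothetical names, and the plain-object nodes only imitate the few DOM properties the walk needs (nodeType, data, localName, firstChild, nextSibling, getAttribute).

```javascript
// Hypothetical stand-ins for DOM nodes so the sketch runs outside a browser.
function makeText(data) {
  return { nodeType: 3, data, firstChild: null, nextSibling: null };
}
function makeElement(localName, attrs, children) {
  children.forEach((child, i) => { child.nextSibling = children[i + 1] || null; });
  return {
    nodeType: 1,
    localName,
    firstChild: children[0] || null,
    nextSibling: null,
    getAttribute: (name) => (name in attrs ? attrs[name] : null),
  };
}

// Hypothetical helper: prefer real text, fall back to the first non-empty alt.
function gatherTextPreferText(root) {
  let text = "";
  let firstAlt = "";
  (function visit(node) {
    for (let child = node.firstChild; child; child = child.nextSibling) {
      if (child.nodeType === 3) {               // Node.TEXT_NODE
        text += " " + child.data;
      } else if (child.localName === "img") {
        const alt = child.getAttribute("alt");
        if (alt && !firstAlt) firstAlt = alt;   // remember it, but keep walking
      } else {
        visit(child);
      }
    }
  })(root);
  text = text.trim().replace(/\s+/g, " ");
  return text || firstAlt;                      // alt is the last resort
}

// <a href="/some/location"><img src="/some/image.gif" alt="intern" /> Link text here</a>
const link = makeElement("a", { href: "/some/location" }, [
  makeElement("img", { src: "/some/image.gif", alt: "intern" }, []),
  makeText(" Link text here"),
]);
console.log(gatherTextPreferText(link)); // "Link text here"
```

A link that contains only the image would still yield "intern" with this approach, so image-only links keep a usable label.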
I forgot to mention that I see this problem in Firefox 9.0.1 (Windows 7 64-bit) using CoLT 2.5.4 (the extension simply provides an easy way to access this internal Firefox functionality).
Status: REOPENED → RESOLVED
Closed: 13 years ago → 11 years ago
Resolution: --- → DUPLICATE