Closed Bug 1282752 Opened 8 years ago Closed 7 years ago

[New Find Toolbar] The highlight of a long String does not respect the structure of the page

Categories

(Toolkit :: Find Toolbar, defect, P2)

50 Branch
defect

Tracking

()

VERIFIED FIXED
Tracking Status
firefox50 --- disabled
firefox51 --- disabled
firefox52 --- disabled
firefox53 --- disabled
firefox54 --- verified

People

(Reporter: cirdeiliviu, Assigned: mikedeboer)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files, 4 obsolete files)

Attached image screenshot_issue.png
[Affected platforms]: Windows 10, Mac OS X 10.10, Windows 7

[Affected versions]: Nightly 50, Build ID 20160627030215

[Steps to reproduce]:

1. Open Nightly (be sure you have "findbar.modalHighlight" and "findbar.highlightAll" set to true).
2. Go to http://www.nytimes.com/2016/06/28/world/europe/brexit-bregret-european-union-the-interpreter.html?_r=0
3. Press CTRL+F to open Find toolbar.
4. Search for a long String that appears on this page (e.g. "WASHINGTON — In the days since Britons voted to leave the European Union, the so-called “Brexit”").
 
[Expected Result]: The String is found and it is highlighted with yellow. The highlight respects the structure of the page (columns, rows) and the text remains in the original position.

[Actual Result]: The highlight of the String does not respect the structure in which it's placed. For example if the text is displayed on multiple rows, when it becomes highlighted is shown on a single row overlapping other text or images on the page.

See the attached screenshot for better understanding.
Assignee: nobody → mdeboer
Status: NEW → ASSIGNED
Iteration: --- → 51.3 - Sep 19
Points: --- → 5
Flags: qe-verify+
Flags: firefox-backlog+
Blocks: 1291278
Comment on attachment 8790691 [details]
Bug 1282752 - use the awesome new Range.getRectsAndTexts() API to fetch the text content for each rect of a range.

https://reviewboard.mozilla.org/r/78390/#review77346

::: toolkit/modules/FinderHighlighter.jsm:832
(Diff revision 2)
>        dict.frames.set(frame, null);
>    },
>  
>    /**
>     * Update the content, position and style of the yellow current found range
>     * outline the floats atop the mask with the dimmed background.

s/the/that/

::: toolkit/modules/FinderHighlighter.jsm:838
(Diff revision 2)
>     * @param {Array}  [textContent] Array of text that's inside the range. Optional,
>     *                               defaults to an empty array
>     * @param {Object} [fontStyle]   Dictionary of CSS styles in camelCase as
>     *                               returned by `_getRangeFontStyle()`. Optional
>     */
> -  _updateRangeOutline(dict, textContent = [], fontStyle = null) {
> +  _updateRangeOutline(dict, textContent = null, fontStyle = null) {

This comment doesn't match the default argument here. It is now defaulting to null instead of an empty array.

::: toolkit/modules/FinderHighlighter.jsm:853
(Diff revision 2)
> -      fontStyle = this._getRangeFontStyle(range);
>      // Text color in the outline is determined by kModalStyles.
>      delete fontStyle.color;
>  
> -    if (textContent.length)
> -      outlineNode.setTextContentForElement(kModalOutlineId + "-text", textContent.join(" "));
> +    let rects = this._getRangeRects(range);
> +    textContent = textContent || this._getRangeContentArray(range);

After removing the previous outline nodes we should return if textContent.length is 0.

::: toolkit/modules/FinderHighlighter.jsm:893
(Diff revision 2)
> +    }
> +
> +    const kModalOutlineTextId = kModalOutlineId + "-text";
> +    let i = 0;
> +    for (let rect of rects) {
> +      let text = (i == rectCount - 1) ? textContent.slice(i).join(" ") : textContent[i];

Can you explain this a bit, perhaps with a comment?

The way that I read this is that if the current rect is the last rect, then text is set to the rest of the textContent with single spaces injected between the text. Otherwise text is set to the current textContent for the matching rect.

How does this handle text with multiple spaces between like searching for "fire     fox"?
Attachment #8790691 - Flags: review?(jaws) → review+
(In reply to Jared Wein [:jaws] (please needinfo? me) from comment #4)
> The way that I read this is that if the current rect is the last rect, then
> text is set to the rest of the textContent with single spaces injected
> between the text. Otherwise text is set to the current textContent for the
> matching rect.

Do you mind if I copy this verbatim as the comment? It's exactly what I meant.

> How does this handle text with multiple spaces between like searching for
> "fire     fox"?

It handles it just fine, because there will be textNodes present for the whitespace. The case I'm accounting for above is apathetic, but nonetheless good to have to avoid JS errors.
Pushed by mdeboer@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9d0ff9c806d7
support find terms that span multiple lines whilst using modal highlighting in the find toolbar. r=jaws
Carrying over the 51 tracking flag from the duplicate bug.
This doesn't seem to be fixed for me, using a build from just after this landed.

I tried the STR from comment 0. Before I've activated find-in-page, the first line of text ends with the word "European". After I activate find-in-page and search for the sentence, the word "Union" is brought up to the first line (though the comma remains on the second line), and awkward bits of highlighted text & blank area are left on the second line.

I also tried using my testcase on duplicate bug 1303035 (attachment 8791610 [details]), searching for "lot of words Please" (which straddles the first line-wrapping point on my machine).  The full phrase "lot of words Please" gets pulled up onto the first line, when find-in-page is active for that phrase.

I'm using a local build from http://hg.mozilla.org/integration/autoland/rev/f564985a3552 , which includes the changeset for this bug (linked in comment 7).
Flags: needinfo?(mdeboer)
(I can imagine that the fix for bug 1300869 ("don't allow whitespace wrapping in find toolbar modal highlighting boxes") might be interfering with the ability for the fix here to actually do its job...)
(Adding "leave-open" to prevent this from getting auto-closed when it merges to central -- since AFAICT, this isn't actually fixed.)
Keywords: leave-open
bug 1302024's case is also not yet fixed.
You are both right. I need a way to extract the contents of the range per rect that I receive from `Range#getClientRects()`.
I don't suppose one of you might know how?
Flags: needinfo?(mdeboer)
OK, I now know - after a few experiments - that document.caretPositionFromPoint() is not the thing to use here to get the text in a certain rect.
There's nothing else in platform code that can help me here I think, so I'll need to doctor something out myself.
(In reply to Mike de Boer [:mikedeboer] from comment #17)
> OK, I now know - after a few experiments - that
> document.caretPositionFromPoint() is not the thing to use here to get the
> text in a certain rect.
> There's nothing else in platform code that can help me here I think, so I'll
> need to doctor something out myself.

Morris, I see you've touched the CollectClientRects() code before and my know about this, or could point me in the right direction: would you be able to implement a scriptable, chrome-only function that would return a list of DocumentFragments:
 1. For each rect a fragment
 2. each fragment contains the (text) contents of that rect, even if it's in the middle of the textContent in a textNode (as is done with SplitDataNode() in CutContents())

So, basically it's CutContents(), but cloning the content instead of removing it and sorted by ClientRects.

I can understand if you're not really into this sort of thing right now, but would you know someone or a team who would?

Thanks in advance!
Flags: needinfo?(mtseng)
I'm not very sure how to do this. Mats might know this.
Flags: needinfo?(mtseng) → needinfo?(mats)
I'm not sure what you're asking for, but if you want the DOM ranges
for the matching content, then that is represented as a Selection
internally:
http://searchfox.org/mozilla-central/rev/8910ca900f826a9b714607fd23bfa1b37a191eca/dom/base/nsISelectionController.idl#30
You should be able to get the DOM nodes and offsets from that I suppose.
Flags: needinfo?(mats)
... there's also a Range.getClientRects() that you could use on the ranges in that
selection, if that helps.
Yeah, that's what I'm doing, but what I _really_ need is:
1) I do range.getClientRects().
2) For each rect, I need to get the _exact_ content that it encompasses.
2.1) Thus cloneContents() and extractContents() are utterly useless, because they return (and/ or extract) all the nodes that all the rects together encompass.

Example:
           +----------+
           | The Quick|
           +----------+
+---------+
|Brown Fox| Jumps Over The Lazy Dog
+---------+

Output of `getClientRects()`: [DOMRect(), DOMRect()]
Output of `cloneContents()`: DocumentFragment<<DOMNode<' The Quick Brown Fox'>>>

Open question that I'd like to answer: Which piece of text belongs to which DOMRect?

My request is to have a method (chrome only most probably, but perhaps something worth standardizing in the future to make it available to the Web) that returns a list of DocumentFragments where each fragment contains the contents of each respective DOMRect.

Output of `getClientRects()`: [DOMRect(), DOMRect()]
Output of `cloneContentsOfRects()`: [DocumentFragment<<DOMNode<` The Quick'>>>,
                                     DocumentFragment<<DOMNode<`Brown Fox'>>>]

Because at the moment, I don't have any means at my disposal to determine where goes what.

Does this make things clearer and do you understand what I'm asking for? Thanks for your time!
Flags: needinfo?(mats)
OK, I understand what you're asking for.  As far as I know, we don't have any methods
that does that exactly.

If you look at nsRange::CollectClientRects, we do have the DOM node + offsets there
so it might be possible to tweak or fork that code to collect something you can use?
Flags: needinfo?(mats)
Yeah, that's what I was thinking! Unfortunately I don't have the necessary skills to do this, so do you know someone who could help out here?
Flags: needinfo?(mats)
Someone on the DOM or Layout teams I guess, if you can find someone who has
time to do it.  (I don't, not in the near future anyway, sorry.)
Perhaps @jst or @jet has a better idea who can take it on?
Flags: needinfo?(mats)
Let's ask! :) Please see comment 18 and comment 22.
Flags: needinfo?(jst)
Flags: needinfo?(bugs)
Severity: normal → major
Priority: -- → P2
I'll get this sorted. Please find me on IRC to talk about timing & details. Thx!
Flags: needinfo?(jst)
Flags: needinfo?(bugs)
Unassigning, whilst waiting on platform team support.
Assignee: mdeboer → nobody
Status: ASSIGNED → NEW
Iteration: 51.3 - Sep 19 → ---
Points: 5 → ---
Blocks: 1309155
Hi Jet, any luck finding someone to work on this? Thanks!
Flags: needinfo?(bugs)
Assignee: nobody → bwerth
I'll take this on, per a conversation with Jet.
Flags: needinfo?(bugs)
(In reply to Mike de Boer [:mikedeboer] from comment #22)
> Yeah, that's what I'm doing, but what I _really_ need is:
> 1) I do range.getClientRects().
> 2) For each rect, I need to get the _exact_ content that it encompasses.

Mike, I've just pushed up some patches that implement a new method, getClientRectsAndTexts() that returns a struct holding both these pieces of data. rectList is the same array as you'll get from getClientRects(). textList is an array of text, corresponding to the items in rectList. Please take a look at the try build https://treeherder.mozilla.org/#/jobs?repo=try&revision=06e7b1af1d01 and see if the API works for you, then give me whatever feedback you have.
Flags: needinfo?(mdeboer)
Exciting!! I'll take a look as I possibly can!
It seems like the test build is only for Windows and OS X so I can't check it out :(
Is there a GNU/Linux try build?

I came here form a duplicate of this bug :)
(In reply to andrei from comment #38)
> It seems like the test build is only for Windows and OS X so I can't check
> it out :(
> Is there a GNU/Linux try build?

Sorry I didn't put that in the initial try build. I've updated the build at https://treeherder.mozilla.org/#/jobs?repo=try&revision=06e7b1af1d01 and it now contains linux and linux64 builds.
(In reply to andrei from comment #38)
> It seems like the test build is only for Windows and OS X so I can't check
> it out :(
> Is there a GNU/Linux try build?

I fear I may have created some confusion. The patches I've provided and the try builds only contain the functionality to provide a new Range getClientRectsAndTexts function. The try build does not include the initial patch in this bug. What I will do now is to spin off the new API into a separate helper bug, and when that's landed then Mike can make some progress on the patch to solve the original issue.
Depends on: 1314080
Attachment #8805647 - Attachment is obsolete: true
Attachment #8805648 - Attachment is obsolete: true
Attachment #8805649 - Attachment is obsolete: true
Attachment #8805650 - Attachment is obsolete: true
Brad, you're awesome! I used it locally and can confirm that with this API I can
1) fix the issue(s)
2) simplify the code in FinderHighlighter.jsm a bit.

I'm crazy happy about this! What do you think the performance hit is, compared to invoking just `range.getClientRects()`?
Flags: needinfo?(mdeboer) → needinfo?(bwerth)
(In reply to Mike de Boer [:mikedeboer] from comment #41)
> I'm crazy happy about this! What do you think the performance hit is,
> compared to invoking just `range.getClientRects()`?

I'm glad to hear it meets the need. The additional computation is light and shouldn't affect performance. The new code does not trigger reflow; it just does a localized traversal of the DOM tree to extract text, one local traversal per client rect. There may be edge cases where a highly nested DOM tree (like text at the bottom of a thousand nested spans) would be moderately more expensive, but those cases are very unlikely. And the code is certainly cheaper than doing paired calls to getClientRects and cloneContents.
Flags: needinfo?(bwerth)
(In reply to Brad Werth [:bradwerth] from comment #42)
> And the code
> is certainly cheaper than doing paired calls to getClientRects and
> cloneContents.

...which is exactly what I'm doing right now, so this'll be a net perf win. Again, I couldn't be more pleased.
This feature will be disabled in 51.
Assignee: bwerth → nobody
Assignee: nobody → mdeboer
Status: NEW → ASSIGNED
Blocks: 1309275
Pushed by mdeboer@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d3fc2da97133
use the awesome new Range.getRectsAndTexts() API to fetch the text content for each rect of a range. r=jaws
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Keywords: leave-open
Resolution: --- → FIXED
Verified as fixed using the latest Firefox Nightly 54.0a1(20170124030205) on Windows 10 x64, Ubuntu 16.04 and Mac OS X 10.11. 

I still see some issues on this page when searching for long strings that contains links. I logged bug 1334088 for this.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: