264412 - (innertext) Add support for element.innerText

Reporter

Description

•

20 years ago

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20040913 Firefox/0.10
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20040913 Firefox/0.10

It seems that for some reason, the innerText property is not updatable with the
script in the above URL.

Reproducible: Always
Steps to Reproduce:
1. Go to the URL listed in this bug report
2.
3.

Actual Results:  
The innerText properties of the p, td, div and span elements are undefined when
the javascript starts, and they never get updated by the script.
The input element does actually get updated if the script is downloaded and
modified to take out the null checking.

Expected Results:  
The innerText property of the p, div, span, and td elements should have been
updated via the javascript.

If the innerText property in the 'clock.js' script is replaced by innerHTML,
then everything works OK

Phil Ringnalda (:philor)

Comment 1

•

20 years ago

Gecko (and thus the Mozilla suite, Firefox, Netscape, Camino, Galeon, ...) never
has supported innerText -
http://www.mozilla.org/docs/web-developer/upgrade_2.html#dom_unsupp

If you actually meant to file an enhancement bug (on Browser - DOM Level 0) for
innerText support, well, in an era when we support document.all as long as you
don't ask first, who knows?

Status: UNCONFIRMED → RESOLVED

Closed: 20 years ago

Resolution: --- → INVALID

Paul_Adams

Reporter

Comment 2

•

20 years ago

This does beg the question why do you continue to support innerHTML?!!

Gérard Talbot

Comment 3

•

20 years ago

*** Bug 282317 has been marked as a duplicate of this bug. ***

John Vandenberg

Comment 4

•

19 years ago

*** Bug 310661 has been marked as a duplicate of this bug. ***

Phil Ringnalda (:philor)

Comment 5

•

18 years ago

*** Bug 338801 has been marked as a duplicate of this bug. ***

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 6

•

18 years ago

*** Bug 287395 has been marked as a duplicate of this bug. ***

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 7

•

18 years ago

*** Bug 317330 has been marked as a duplicate of this bug. ***

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 8

•

18 years ago

*** Bug 339017 has been marked as a duplicate of this bug. ***

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 9

•

18 years ago

*** Bug 254174 has been marked as a duplicate of this bug. ***

:Gavin Sharp [email: gavin@gavinsharp.com]

Updated

•

18 years ago

Alias: innertext

:Gavin Sharp [email: gavin@gavinsharp.com]

Updated

•

18 years ago

Status: RESOLVED → VERIFIED

:Gavin Sharp [email: gavin@gavinsharp.com]

Updated

•

14 years ago

Assignee: bugzilla → nobody

Component: General → DOM

Product: Firefox → Core

QA Contact: general → general

Aryeh Gregor (:ayg) (no longer with Mozilla)

Comment 10

•

14 years ago

Is Gecko interested in implementing this if a detailed spec is written?  It's complicated, but considerably more useful than textContent for getting a plaintext version of an HTML element, and every other browser has it.  WebKit seems not to be interested in dropping it, and it would be nice if we could get interop here.

Status: VERIFIED → REOPENED

Ever confirmed: true

Resolution: INVALID → ---

Mounir Lamouri (:mounir)

Updated

•

14 years ago

Component: DOM → DOM: Core & HTML

OS: Windows XP → All

Hardware: x86 → All

Version: unspecified → Trunk

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 11

•

14 years ago

Why innerText is more useful than textContent?
Based on http://msdn.microsoft.com/en-us/library/ms533899%28v=vs.85%29.aspx
it is quite bizarre API.

Aryeh Gregor (:ayg) (no longer with Mozilla)

Comment 12

•

14 years ago

innerText returns the element's HTML contents converted to plaintext.  Conceptually, it works the same way as Selection stringification -- in fact, WebKit uses the same algorithm for both, and I'm going to specify both as the same algorithm.  textContent just concatenates the text node descendants in tree order, so it preserves indentation and so forth.  innerText collapses runs of whitespace to a single space, inserts newlines after block elements, etc.

I don't have specific use-cases offhand for why anyone would want to use innerText instead of textContent.  The large majority of authors seem to only set it, in which case aliasing it to textContent would work, and in fact Opera treats it similarly to textContent (although they apparently have some compat bugs as a result).  But given that everyone needs to implement the plaintext conversion algorithm for Selection stringification anyway, I don't see any reason not to provide innerText too.  The additional implementation burden should be low, and it makes things easier for authors if all browsers implement it instead of all but Firefox.

Aryeh Gregor (:ayg) (no longer with Mozilla)

Comment 14

•

14 years ago

I got a list of webpages using innerText and went through a whole bunch.  Most of them use it either for setting (where it behaves the same as textContent), or for getting for elements where it behaves basically the same as textContent anyway (e.g., only text node children).  The most interesting use I found was to transform a comment to plaintext for a quote, in two places:

1) http://www.cristianofino.net/post/Comment-Toolbar-plugin-per-WordPress.aspx

This uses innerText for a quotation feature -- it's part of an actively-maintained Wordpress plugin, <http://wordpress.org/extend/plugins/comment-toolbar/>.  The relevant JS is at <http://www.cristianofino.net/User%20Controls/CommentToolbar.js>.  To get the quote when you click the "Quota" button on a comment, it basically stringifies the Selection if there is one, otherwise takes the innerText of the post you clicked on, otherwise takes the textContent.  It so happens that this works okay, because there's no block-level markup in the comments and <br>'s and newlines are paired up exactly.  But given more general markup, Firefox (and Opera and IE9) would look much worse here, with lots of extra whitespace stuck in.

2) http://blog.livedoor.jp/geinin_matome/archives/1080186.html

Similar to (1) -- hover over the links like ">>66" to get a tooltip with a plaintext quote in WebKit or IE.  Again, no interesting markup that would make a big difference between innerText and textContent, but there could have been.

Aryeh Gregor (:ayg) (no longer with Mozilla)

Comment 15

•

13 years ago

I filed a bug asking for this to be specced in HTML5 using a relatively trivial algorithm (textContent but with <br> converted to "\n"):

http://www.w3.org/Bugs/Public/show_bug.cgi?id=13145

I did this because I was reminded of the research and discussion I did about this last February:

http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-February/030179.html

Notable from that link is that I found some pages that do break on Firefox because of missing innerText, e.g.:

http://beihai.gov.cn/5863/2010_4_28/5863_98289_1272423962000.html?open=1

At the time of this writing, a large chunk of content is missing in Firefox 6.0a2 which is present in Chrome 14 dev and Opera 11.50:

"4月26日上午，北海船检局召开中心组成员《新跨越》专题学习会。     会议根据广西区港航管理局《关于在全区港航系统开展推动广西水运事业新跨越专题学习讨论活动的通知》（港航政治[2010]59号文精神，围绕推动广西水运事业新跨越主题，认真组织学习了广西区人民政府..."

That's because of the following script:

<script>var strCol=document.getElementById("contentID").innerText;if(strCol.length>150){document.write(strCol.substring(0,150)+"...");}else{document.write(strCol);}</script>

The script does nothing in Gecko because the conditional checks the length of undefined, which is a fatal error.  It works in all other browsers.

Comment hidden (advocacy)

Hi Folks chrome's users have more than firefox,It's time we started to do it .right?

Wesley Johnston (:wesj)

Updated

•

12 years ago

Blocks: ringmark-ring1

Comment hidden (advocacy)

Every morning I wake up and feel so sad that Firefox is the only browser doesn't support innerText.
I'm amateur programmer, but if I'll be more skilled I would give any help to make Firefox fill this gap.
textContent has a different behavior and in may cases can't be the solution.

David Bruant

Updated

•

12 years ago

Keywords: dev-doc-needed

danya.postfactum

Comment 18

•

11 years ago

As I see innerText is in spec TODO list (http://wiki.whatwg.org/wiki/Specs_todo).
Looks like nobody is going to write it. So, we neve get innerText in Firefox ? :(

Yuri

Comment 22

•

10 years ago

This really needs to be fixed. Here is the test case just in case http://jsfiddle.net/azpnmxuc/
Otherwise people will keep creating duplicates, like they did before.

Karl Dubost💡 :karlcow

Comment 23

•

9 years ago

WebCompat issue created because of lack of support of innerText 
https://github.com/webcompat/web-bugs/issues/633

Karl Dubost💡 :karlcow

Comment 24

•

9 years ago

Other previously reported Webcompat issues (some are now invalid or resolved):
* Bug 942989
* Bug 281825
* Bug 187698
* Bug 131793
* Bug 358349 
* Bug 25918
* Bug 148812
* Bug 169598
* Bug 220538 
* Bug 471755

Karl Dubost💡 :karlcow

Updated

•

9 years ago

Blocks: 1130981

Karl Dubost💡 :karlcow

Comment 25

•

9 years ago

Adding for context, an interesting blog post with textContent and innerText, performances computations, etc.
http://perfectionkills.com/the-poor-misunderstood-innerText/

Yuri

Comment 26

•

9 years ago

innerText should be faster than innerHtml, because the former doesn't need any parsing. Seems like browsers should be motivated to support it well as a good practice.

Juriy "kangax" Zaytsev

Comment 27

•

9 years ago

Would Mozilla be interested implementing innerText after all? As Karl mentioned, I've outlined use-cases in a recent blog post. I believe innerText is more than just a pesky webcompat offender. I'll try to work on spec in the meantime.

Karl Dubost💡 :karlcow

Comment 28

•

9 years ago

Another Web Compatibility issue https://webcompat.com/issues/1051

Karl Dubost💡 :karlcow

Comment 29

•

9 years ago

Another Web Compatibility issue https://webcompat.com/issues/1071#issuecomment-101837467

Mike Taylor [:miketaylr]

Comment 30

•

9 years ago

Apart from the web compatibility gains, Juriy has written a great summary of innerText and a possible way forward in terms of a (pseudo) spec. If Chrome and IE are willing to converge, great. If not, we should pick one of the flavors and implement it.

http://perfectionkills.com/the-poor-misunderstood-innerText/#naive-spec
http://kangax.github.io/jstests/innerText/

Summary: innerText property on various elements not updatable with javascript → Add support for element.innerText

Mike Taylor [:miketaylr]

Updated

•

9 years ago

Blocks: 1170774

:shell escalante

Updated

•

9 years ago

tracking-p11: --- → +

Whiteboard: [compat]

Karl Dubost💡 :karlcow

Updated

•

9 years ago

Blocks: 914252

Karl Dubost💡 :karlcow

Updated

•

9 years ago

Blocks: 942989

Karl Dubost💡 :karlcow

Comment 31

•

9 years ago

https://www.w3.org/Bugs/Public/show_bug.cgi?id=13145
https://lists.w3.org/Archives/Public/public-webapps/2014JulSep/thread.html#msg580
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-February/030179.html

Hallvord R. M. Steen [:hallvors]

Comment 32

•

9 years ago

Breaks Google Calendar (and probably GMail) navigation rendering: https://bugzilla.mozilla.org/show_bug.cgi?id=914252#c16

Karl Dubost💡 :karlcow

Updated

•

9 years ago

See Also: → https://github.com/whatwg/compat/issues/5

Hallvord R. M. Steen [:hallvors]

Updated

•

9 years ago

Comment 33

•

9 years ago

Breaks Facebook status update field resizing - bug 1197166

Status: REOPENED → NEW

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 34

•

9 years ago

Breaks clicking on erroneous words in xkcd SimpleWriter. http://blog.xkcd.com/2015/09/22/a-thing-explainer-word-checker/

Carlos Alén Silva

Comment 35

•

9 years ago

For what is worth, please note that the original URL ( http://www.sharepointcustomization.com/resources/codesamples/TimeZoneClocks/clock_example.htm ) is gone.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 36

•

9 years ago

We need to implement something here, because regardless of whether innerText is a desirable feature, Web developers are using it and sites are broken in Firefox without it.

I propose we implement the last version of Aryeh's proposed spec, here: https://rawgit.com/timdown/rangy/master/fiddlings/spec/innerText.htm

Olli, does that sound OK to you?

Assignee: nobody → roc

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

9 years ago

Flags: needinfo?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 37

•

9 years ago

Given that this is so bizarre feature, I'd like to hear feedback from anne and bz too.

But yes, I agree, we need to implement something here.

Flags: needinfo?(bzbarsky)

Flags: needinfo?(annevk)

Anne (:annevk)

Comment 38

•

9 years ago

What makes the most sense to me is that getting behaves equivalently to stringifying a selection. So once selection is defined, this feature would be too. That matches what WebKit is doing if memory serves me.

Flags: needinfo?(annevk)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 39

•

9 years ago

Yeah, that has just the issue that selection isn't properly specified.

How differently would we behave if we just used our selection stringification?

Flags: needinfo?(bugs)

Juriy "kangax" Zaytsev

Comment 40

•

9 years ago

Good question. I have no time now but if someone could run it through my test — http://kangax.github.io/jstests/innerText/ (which also shows how other implementations differ among each other; and which I'm planning to update with couple more options soon)

Boris Zbarsky [:bzbarsky]

Comment 41

•

9 years ago

I think in the short term we should do whatever is simplest here.  If just stringifying a selection is simpler for us than implementing Aryeh's somewhat stillborn spec, we should do that.  Compat for this stuff is pretty much nonexistent anyway.  :(

Flags: needinfo?(bzbarsky)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 42

•

9 years ago

My priorities are in this order:
0) Don't break stuff that currently works.
1) Implement a reasonably Web-compatible version of innerText in Gecko.
2) Have a spec for innerText that browsers converge on.
3) Make innerText consistent with selection.toString().

I think perhaps the best way to proceed is to extend nsPlainTextSerializer with new options until it supports Aryeh's draft spec, then implement innerText using nsPlainTextSerializer, and iterate on the implementation and draft spec until we've shipped something Web-compatible enough. Then we can try changing selection.toString() to use the same options.

Juriy, do you know of any issues with the 2011 draft? https://rawgit.com/timdown/rangy/master/fiddlings/spec/innerText.htm

Flags: needinfo?(kangax)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 43

•

9 years ago

(In reply to Boris Zbarsky [:bz] from comment #41)
> I think in the short term we should do whatever is simplest here.  If just
> stringifying a selection is simpler for us than implementing Aryeh's
> somewhat stillborn spec, we should do that.  Compat for this stuff is pretty
> much nonexistent anyway.  :(

While that's true, I would like to have a spec, and I'm not confident that just pushing our current selection.toString() behavior onto the Web would be good for the Web or us.

Juriy "kangax" Zaytsev

Comment 44

•

9 years ago

> Juriy, do you know of any issues with the 2011 draft? 

Unfortunately, I haven't had a chance to look into Aryeh's test thoroughly. Glancing at it now, the algorithm seems correct but it's probably missing few things. I see that Aryeh spec'd tabs between table-cell elements — this mimics webkit behavior but not IE. It also looks like inline-block elements don't create newlines — this has been the case in IE<=10 but not anymore (including webkit). There's no mention of what iframes resolve to (webkit ignores them, IE doesn't); or <canvas>, or <video>, or <select>. Well, <select> would resolve to its content (due to display:inline?) so that's IE's behavior but not WebKit. Finally, visibility:hidden mimics WebKit's but not IE behavior.

Yeah, this is a sad state of affairs.

Like I mentioned in a post, my understanding is that we need to:

1) Write a spec
2) Try to converge IE and WebKit/Blink behavior
3) Implement spec'd behavior in Firefox

The hardest part is — of course — choosing which behavior to go with. Considering drastic difference between IE and WebKit, I doubt there's any serious compat issues lurking here. Most of the problems are from not implementing _any_ version of innerText.

I'd love to see what FF's selection behavior is — is it closer to WK or IE's innerText implementations? Perhaps that's the starting point.

Flags: needinfo?(kangax)

Juriy "kangax" Zaytsev

Comment 45

•

9 years ago

*Aryeh's spec thoroughly

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 46

•

9 years ago

When in doubt we should probably just follow Chrome, which I expect they mostly got from Webkit.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 47

•

9 years ago

Here are the results I get on Juriy's tests using selection.toString():

Newlines between paragraphs: 2
Newlines after table caption: 0
Newlines after block, non-paragraph elements (e.g. <div>): 1
Includes display:none elements: no
Includes visibility:hidden elements: no
Includes <select> content: no
Includes <button> content: no
Includes <canvas>, <video> fallback content: canvas no, video yes
Includes <iframe> fallback content: no
Includes <script>, <style> content: no
Creates leading space for anonymous inline boxes (including <li>): yes
Creates newline around inline, block-styled elements (e.g. <span>): no
Respects white-space:pre: yes
Creates (only) spaces between block, inline-styled elements (e.g. <div>): no
Puts tabs between <td> tags: yes
Preserves upper/lower-cased text: no

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 48

•

9 years ago

Looking at nsPlainTextSerializer, it's pretty complex. Some changes we'd want to make would be quite disruptive, e.g. right now many HTML elements (such as <span>) ignore IsElementBlock() in nsPlainTextSerializer::DoOpenContainer. I'd feel a lot more comfortable about having a separate implementation for innerText that we can evolve independently. I also think it would be good to deploy an implementation of innerText whose behavior is as simple as possible, so that the spec can be as simple as possible while still be being Web-compatible. Maybe we should try picking the simpler behavior everywhere IE and Chrome differ.

Karl Dubost💡 :karlcow

Comment 49

•

9 years ago

Roc,

First of all. Very cool. That's a very good news. Thanks for taking this. 

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (offline Oct 1-4) from comment #43)
> While that's true, I would like to have a spec

That's partly the goal of the opened issue on the webcompat spec
https://github.com/whatwg/compat/issues/5
You are welcome to suggest prose or writing along the implementation and we can adjust step by step. 

I'm pretty sure Mike, Hallvord and I also can dig out old Web Compat issues and see how we fare with regards to them with the implementation you will be doing. That would be nice test cases. And specifically how it would work in the face of code like these ones which try to address the Web Compatibility issue. 


----
> function innerText (elem) {
>   return elem.innerText || elem.textContent;
> }

 module.exports = innerText;
In https://github.com/joshwnj/marki/blob/475b400cb4ad9a2c1d43fe46bebb0f518c397c05/src/inner-text.js

---- 

> if (el.textContent !== undefined)
>   el.textContent = string;
> else
>   el.innerText = string;
In https://github.com/HubSpot/youmightnotneedjquery/blob/ef987223c20e480fcbfb5924d96c11cd928e1226/comparisons/elements/set_text/ie8.js

----

> function text (el, value) {
>   if (arguments.length === 2) {
>     el.innerText = el.textContent = value;
>   }
>   if (typeof el.innerText === 'string') {
>     return el.innerText;
>   }
>   return el.textContent;
> }

In https://github.com/bevacqua/insignia/blob/53742c6929332b7c776d85dade8d7d82f524195b/text.js


----

> // innerText polyfill for Firefox
> ((document) => {
>   const dummy = document.createElement('div');
>     Object.defineProperty(HTMLElement.prototype, 'innerText', {
>       get: function()  { return this.textContent },
>       set: function(v) { this.textContent = v; }
>     });
>   }
> })(document);
In https://github.com/ng-kyoto/ng-kyoto.github.io/blob/9315fb7b24baec5a1614b4692bfedf9c0fc267dc/app/utils/innertext-polyfill.ts

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 50

•

9 years ago

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) (offline Oct 1-4) from comment #48)
> Looking at nsPlainTextSerializer, it's pretty complex.
It is, which is one reason why we for innerHTML, which is quite well spec'ed, moved out from using ns*Serializer (at least in HTML case). Performance was another reason (bug 744830 comment 51).
FragmentOrElement::GetMarkup does the innerHTML serialization.

> I'd feel a lot more comfortable
> about having a separate implementation for innerText that we can evolve
> independently. I also think it would be good to deploy an implementation of
> innerText whose behavior is as simple as possible, so that the spec can be
> as simple as possible while still be being Web-compatible. Maybe we should
> try picking the simpler behavior everywhere IE and Chrome differ.
This all sounds good

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 51

•

9 years ago

(In reply to Karl Dubost :karlcow from comment #49)
> I'm pretty sure Mike, Hallvord and I also can dig out old Web Compat issues
> and see how we fare with regards to them with the implementation you will be
> doing. That would be nice test cases. And specifically how it would work in
> the face of code like these ones which try to address the Web Compatibility
> issue.

So far I have not been able to find any specific Web compat issues that require more than just making innerText an alias for textContent. I suspect they exist since IE's behavior has changed over time; I've asked Travis if he can find out. That's the information we really need.

(I looked at Webkit history for information about how and why their implementation has evolved, but changes to their plaintext serializer (TextIterator) do not typically mention why the change was made, and linked Webkit Bugzilla bugs are not helpful. Webkit's TextIterator is used for clipboard and other features so it's unclear which, if any, of their behavior is actually important for Web apps.)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 52

•

9 years ago

Though actually, we need to not just come up with a spec that's the minimum required for Web compatibility, but something that IE and Chrome at least are willing to implement, so we might need something more complicated than the minimum to satisfy their perceived needs rather than just their actual needs...

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 53

•

9 years ago

I have created a proposed spec plus testsuite here: https://github.com/rocallahan/innerText-spec

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 54

•

9 years ago

Attached file MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz — Details

Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats

Attachment #8675486 - Flags: review?(mats)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 55

•

9 years ago

Attached file MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats — Details

Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Attachment #8675487 - Flags: review?(mats)

Attachment #8675487 - Flags: review?(bugs)

Chris Peterson [:cpeterson]

Updated

•

9 years ago

Keywords: feature

Mats Palmgren (inactive)

Comment 56

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

https://reviewboard.mozilla.org/r/22369/#review20043

::: layout/generic/nsIFrame.h:1907
(Diff revision 1)
>     * @param aSkippedMaxLength  Maximum number of characters to return

The GetRenderedText() documentation needs to be updated.

::: layout/generic/nsIFrame.h:1922
(Diff revision 1)
> +  virtual RenderedText GetRenderedText(uint32_t aFlags = 0,

It's a bit hard to understand what the zero means at the call site:
GetRenderedText(0, aContentOffset, aContentOffset + 1);
I think you should add a CONTENT_TEXT_OFFSETS to make it clear that the flag passed matches the offsets:
GetRenderedText(CONTENT_TEXT_OFFSETS, aContentOffset, aContentOffset + 1);

Do you think it's likely that we'll add additional flags here in the future?  If not, then I suggest you make it an enum class to avoid typos like:
GetRenderedText(aContentOffset, aContentOffset + 1);
Actually, I think we should do that now anyway.  Then if we need more flags in the future we add "operator|" etc.

Alternatively, could we bundle these three params into an object, DOMOffsets/RenderedOffsets subclassed of SomeOffsets, or something like that? e.g.
GetRenderedText(DOMOffsets(aContentOffset, aContentOffset + 1));
GetRenderedText(RenderedOffsets(aRenderedOffset, aRenderedOffset + 1));

::: layout/generic/nsTextFrame.h:264
(Diff revision 1)
> -  virtual nsresult GetRenderedText(nsAString* aString = nullptr,
> +  virtual RenderedText GetRenderedText(uint32_t aFlags = 0,

aFlags = CONTENT_TEXT_OFFSETS would make this clearer too

::: layout/generic/nsTextFrame.cpp:8888
(Diff revision 1)
> -  uint32_t validCharsLength = 0;
> +  bool setOffsets = false;

|setOffsets| is a confusing name. It's used like so:
if (!setOffsets) {
  // set the offsets
  setOffsets = true;
}
Which appears to be the opposite of what the name suggests.
Perhaps |haveOffsets| is better?

Attachment #8675486 - Flags: review?(mats)

Mats Palmgren (inactive)

Comment 57

•

9 years ago

Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

https://reviewboard.mozilla.org/r/22371/#review20051

r=mats with those nits addressed as you see fit

::: dom/base/nsRange.cpp:3201
(Diff revision 1)
> +  int mRequiredLineBreakCount;

Why use a signed type for mRequiredLineBreakCount?
And why int instead of int32_t?

::: dom/base/nsRange.cpp:3206
(Diff revision 1)
> +    while (mRequiredLineBreakCount > 0) {

This loop seems silly.  Why not just do:

    if (mRequiredLineBreakCount > 0) {
      if (!mString.IsEmpty()) {
        mString.Append('\n');
      }
      mRequiredLineBreakCount = 0;
    }

::: dom/base/nsRange.cpp:3228
(Diff revision 1)
> +static bool IsVisibleAndNotInReplacedElement(nsIFrame* aFrame)

add a newline after |bool| please

::: dom/base/nsRange.cpp:3279
(Diff revision 1)
> +  if (aFrame->GetType() != nsGkAtoms::tableCellFrame) {

And nsGkAtoms::bcTableCellFrame ?

::: dom/base/nsRange.cpp:3325
(Diff revision 1)
> +  nsIContent* currentNode = static_cast<nsIContent*>(mStartParent.get());

Can you use AsContent() here?

::: dom/base/nsRange.cpp:3328
(Diff revision 1)
> +    nsGenericDOMDataNode* t = static_cast<nsGenericDOMDataNode*>(mStartParent.get());

'auto' makes this pattern more readable, IMO.

::: dom/base/nsRange.cpp:3358
(Diff revision 1)
> +          nsIFrame::RenderedText text = f->GetRenderedText(0, 0, UINT32_MAX);

Just GetRenderedText() seems clearer.

::: dom/base/nsRange.cpp:3371
(Diff revision 1)
> +          result.Append(NS_LITERAL_STRING("\n"));

Append('\n') is shorter (three places)

::: dom/base/nsRange.cpp:3373
(Diff revision 1)
> +        uint8_t display = f->StyleDisplay()->mDisplay;

No need to declare a variable for this, just use the expression directly in the 'switch'.

Attachment #8675487 - Flags: review?(mats) → review+

Mats Palmgren (inactive)

Comment 58

•

9 years ago

FYI, nsDocumentEncoder has some special code for handling ShadowRoot
and IsHTMLElement(nsGkAtoms::rp) here:
http://mxr.mozilla.org/mozilla-central/source/dom/base/nsDocumentEncoder.cpp#107
did you consider those?

Mats Palmgren (inactive)

Comment 59

•

9 years ago

It would be good to have test that checks <table><tfoot>x</tfoot><tbody>y</tbody></table>
since those are reordered in rendering.  And some tests using shadow dom.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 60

•

9 years ago

https://reviewboard.mozilla.org/r/22369/#review20043

> It's a bit hard to understand what the zero means at the call site:
> GetRenderedText(0, aContentOffset, aContentOffset + 1);
> I think you should add a CONTENT_TEXT_OFFSETS to make it clear that the flag passed matches the offsets:
> GetRenderedText(CONTENT_TEXT_OFFSETS, aContentOffset, aContentOffset + 1);
> 
> Do you think it's likely that we'll add additional flags here in the future?  If not, then I suggest you make it an enum class to avoid typos like:
> GetRenderedText(aContentOffset, aContentOffset + 1);
> Actually, I think we should do that now anyway.  Then if we need more flags in the future we add "operator|" etc.
> 
> Alternatively, could we bundle these three params into an object, DOMOffsets/RenderedOffsets subclassed of SomeOffsets, or something like that? e.g.
> GetRenderedText(DOMOffsets(aContentOffset, aContentOffset + 1));
> GetRenderedText(RenderedOffsets(aRenderedOffset, aRenderedOffset + 1));

I thought I would have to add more flags but in the end I didn't. So making this an enum class parameter, and moving it to the end, seems like the way to go.

> |setOffsets| is a confusing name. It's used like so:
> if (!setOffsets) {
>   // set the offsets
>   setOffsets = true;
> }
> Which appears to be the opposite of what the name suggests.
> Perhaps |haveOffsets| is better?

Renamed to haveOffsets.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 61

•

9 years ago

https://reviewboard.mozilla.org/r/22371/#review20065

::: dom/base/nsRange.cpp:3201
(Diff revision 1)
> +  int mRequiredLineBreakCount;

It can only be 0, 1 or 2. I'll make it int8_t.

::: dom/base/nsRange.cpp:3206
(Diff revision 1)
> +    while (mRequiredLineBreakCount > 0) {

Because we need to add as many newlines as mRequiredLineBreakCount says. In particular sometimes (when a <p> is present) we need to add two newlines.

::: dom/base/nsRange.cpp:3279
(Diff revision 1)
> +  if (aFrame->GetType() != nsGkAtoms::tableCellFrame) {

Yes, good catch.

::: dom/base/nsRange.cpp:3325
(Diff revision 1)
> +  nsIContent* currentNode = static_cast<nsIContent*>(mStartParent.get());

Sure.

::: dom/base/nsRange.cpp:3328
(Diff revision 1)
> +    nsGenericDOMDataNode* t = static_cast<nsGenericDOMDataNode*>(mStartParent.get());

OK

::: dom/base/nsRange.cpp:3358
(Diff revision 1)
> +          nsIFrame::RenderedText text = f->GetRenderedText(0, 0, UINT32_MAX);

Sure.

::: dom/base/nsRange.cpp:3371
(Diff revision 1)
> +          result.Append(NS_LITERAL_STRING("\n"));

OK, though I have had to add an Append(char) overload for that.

::: dom/base/nsRange.cpp:3373
(Diff revision 1)
> +        uint8_t display = f->StyleDisplay()->mDisplay;

OK

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 62

•

9 years ago

(In reply to Mats Palmgren (:mats) from comment #58)
> FYI, nsDocumentEncoder has some special code for handling ShadowRoot

I don't understand what that code is for. It was added in bug 806506 without explanation. I think we should ignore that for now, but let's ask William. Unless I hear otherwise I'd like to completely ignore shadow DOM, like we're ignoring CSS anonymous content. I'll write some tests to check that it's ignored. Chrome excludes shadow DOM content from "innerText".

> and IsHTMLElement(nsGkAtoms::rp) here:
> http://mxr.mozilla.org/mozilla-central/source/dom/base/nsDocumentEncoder.
> cpp#107

Yes ... clearly <rp> contents should be included in innerText, though Chrome doesn't do it. I'll fix that in the spec and here.

> It would be good to have test that checks <table><tfoot>x</tfoot><tbody>y</tbody></table>
> since those are reordered in rendering.  And some tests using shadow dom.

Sure.

Flags: needinfo?(wchen)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

9 years ago

Attachment #8675486 - Flags: review?(mats)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 63

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 64

•

9 years ago

Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Mats Palmgren (inactive)

Updated

•

9 years ago

Attachment #8675486 - Flags: review?(mats) → review+

Mats Palmgren (inactive)

Comment 65

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

https://reviewboard.mozilla.org/r/22369/#review20107

r=mats with the issue below fixed

::: layout/generic/nsIFrame.h:1907
(Diff revision 2)
>     * @param aSkippedMaxLength  Maximum number of characters to return

The updated documentation is still missing.

Mats Palmgren (inactive)

Comment 66

•

9 years ago

https://reviewboard.mozilla.org/r/22371/#review20111

::: dom/base/nsRange.cpp:3267
(Diff revision 2)
> +static int

You're still using 'int' here.  I guess you want int8_t?

::: dom/base/nsRange.cpp:3386
(Diff revision 2)
> +          result.Append('\n');

Oh, right, 'result' is the accumulator object here.  I misread it as a string type the last time which is why I suggested to use Assign(char).  Adding an overload is fine, but don't feel like you have to if you want to revert that bit.

Mats Palmgren (inactive)

Comment 67

•

9 years ago

(It's fine, but I still don't get why you want to use a signed type though.)

Mats Palmgren (inactive)

Comment 68

•

9 years ago

We're not compatible with Chrome on this test (table.innerText):
<table>
<tfoot><tr><td>footer</tfoot>
<thead><tr><td style="visibility:collapse">thead</thead>
<tbody><tr><td>tbody</tbody>
</table>

You may want to fix that, and perhaps also update the spec?

Please also add tests with visibility:collapse on (each) row-group, row, and cell.

Mats Palmgren (inactive)

Comment 69

•

9 years ago

We should probably also add tests for grid/flex items that use 'order':
<div style="display:grid"><x style="order:1">1</x><x>2</x></div>
<div style="display:flex"><x style="order:1">1</x><x>2</x></div>

"2" is rendered before "1", but the innerText should be "1\n2"
I suppose (since the <x>'s are blockified).

Probably also worth having tests with visibility:collapse on some items,
since it's supported at least for flexbox:
http://www.w3.org/TR/css-flexbox-1/#visibility-collapse
(and Grid might support it too in the future)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 70

•

9 years ago

https://reviewboard.mozilla.org/r/22369/#review20197

::: layout/generic/nsIFrame.h:1907
(Diff revision 2)
>     * @param aSkippedMaxLength  Maximum number of characters to return

Fixed

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 71

•

9 years ago

https://reviewboard.mozilla.org/r/22371/#review20199

::: dom/base/nsRange.cpp:3267
(Diff revision 2)
> +static int

Yes.

I'm using a signed type because especially for limited-range types, it's safer. Bugs like for (...; x >= 0; ...) can't bite.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 72

•

9 years ago

(In reply to Mats Palmgren (:mats) from comment #68)
> We're not compatible with Chrome on this test (table.innerText):
> <table>
> <tfoot><tr><td>footer</tfoot>
> <thead><tr><td style="visibility:collapse">thead</thead>
> <tbody><tr><td>tbody</tbody>
> </table>
> 
> You may want to fix that, and perhaps also update the spec?

I'd rather not change anything here. I don't think there's anything clearly wrong with what the spec and our implementation do, changing this to match Chrome would add complexity, and I'd be surprised if it causes Web compatibility problems in practice.

> Please also add tests with visibility:collapse on (each) row-group, row, and
> cell.

Done.

(In reply to Mats Palmgren (:mats) from comment #69)
> We should probably also add tests for grid/flex items that use 'order':
> <div style="display:grid"><x style="order:1">1</x><x>2</x></div>
> <div style="display:flex"><x style="order:1">1</x><x>2</x></div>
> 
> "2" is rendered before "1", but the innerText should be "1\n2"
> I suppose (since the <x>'s are blockified).

Done.

> Probably also worth having tests with visibility:collapse on some items,
> since it's supported at least for flexbox:
> http://www.w3.org/TR/css-flexbox-1/#visibility-collapse
> (and Grid might support it too in the future)

Done.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 73

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 74

•

9 years ago

Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 75

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3b4cc6272d38

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 76

•

9 years ago

Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 77

•

9 years ago

https://reviewboard.mozilla.org/r/22371/#review20203

I've updated <rp> handling a little bit to ensure that visiblity:hidden <rp>s don't contribute to the result of innerText.

William Chen [:wchen]

Comment 78

•

9 years ago

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #62)
> (In reply to Mats Palmgren (:mats) from comment #58)
> > FYI, nsDocumentEncoder has some special code for handling ShadowRoot
> 
> I don't understand what that code is for. It was added in bug 806506 without
> explanation. I think we should ignore that for now, but let's ask William.
> Unless I hear otherwise I'd like to completely ignore shadow DOM, like we're
> ignoring CSS anonymous content. I'll write some tests to check that it's
> ignored. Chrome excludes shadow DOM content from "innerText".
If I recall correctly, the code is to make serializing from a selection in shadow DOM work. In some cases, selection within a shadow DOM is described by a range where the root container is the shadow root and there is code that checks if the root container is visible before serializing the contents. It looks like innerText is supposed to work similar to serializing a selection, and given that selection with shadow DOM is not well defined and behaves poorly in all implementations (spec only says user agents provide a best attempt), it's not clear what behavior we actually want for textContent so whatever is easiest is probably fine for now. It should be more clear once selection and shadow DOM is specified.

Flags: needinfo?(wchen)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 79

•

9 years ago

OK great, thanks.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 80

•

9 years ago

These patches caused some accessibility test failures that I'm fixing up now.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 81

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/ab657569f554a1b1dae9575fb95ca5a94949442a
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

https://hg.mozilla.org/integration/mozilla-inbound/rev/a396f4262479a5695c918f59bacd542ba21448b0
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 82

•

9 years ago

Hmm, this landed already? Is my review needed here? 
(Sorry, I've had some other stuff to review too)

Guilherme Lima

Comment 83

•

9 years ago

Roc landed by mistake, it seems:
b70e89c03c56	Robert O'Callahan — Revert incorrectly committed changes ab657569f554 and a396f4262479

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 84

•

9 years ago

I landed it by mistake and backed it out immediately. So it still needs your review.

Flags: needinfo?(bugs)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 85

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=326d0927d8a0

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 86

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=bfdc26cd9ea4

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 87

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4e7a7c3d1a11

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 88

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

The test changes here are to adjust for the fact that
nsTextFrame::GetRenderedText can now trim whitespace from the end of lines
that end in a hard line break.

Attachment #8675486 - Attachment description: MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats → MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

Attachment #8675486 - Flags: review?(mzehe)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 89

•

9 years ago

Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 90

•

9 years ago

https://reviewboard.mozilla.org/r/22369/#review20995

mats: I'm adding aPushedFrame = true; in nsLineLayout.cpp. This fixes a bug that was revealed by the accessibility tests, where a line was not getting SetLineWrapped(true) even though we had a soft line break, because nsBlockFrame::ReflowInlineFrame did not see a pushed frame. So innerText could strip spaces before soft line breaks when it shouldn't.

This also fixes a bug in GetRenderedText where whitespace trimming was not being taken into account when we compute the length of the rendered text for a frame, and so our rendered text offsets were off sometimes. Also revealed by the accessibility tests.

marcoz: The accessibility test changes are entirely due to the new code trimming trailing whitespace when it's before a hard line break and collapsible.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 91

•

9 years ago

https://reviewboard.mozilla.org/r/22371/#review20997

The changes here are very minor, just a static-analysis fix and a couple of tests to catch the bug about lines not being marked as having a soft line break.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 92

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

The test changes here are to adjust for the fact that
nsTextFrame::GetRenderedText can now trim whitespace from the end of lines
that end in a hard line break.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 93

•

9 years ago

Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 94

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

The test changes here are to adjust for the fact that
nsTextFrame::GetRenderedText can now trim whitespace from the end of lines
that end in a hard line break.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 95

•

9 years ago

Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 96

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

The test changes here are to adjust for the fact that
nsTextFrame::GetRenderedText can now trim whitespace from the end of lines
that end in a hard line break.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 97

•

9 years ago

Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

9 years ago

Attachment #8675486 - Flags: review+ → review?(mats)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

9 years ago

Attachment #8675487 - Flags: review+ → review?(mats)

Marco Zehe (:MarcoZ)

Comment 98

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

https://reviewboard.mozilla.org/r/22369/#review21029

r=marcoz

::: accessible/generic/HyperTextAccessible.cpp:1989
(Diff revision 4)
> -  // Only get info up to original offset, we know that will be larger than skipped offset
> +  *aRenderedOffset = text.mOffsetWithinNodeRenderedText;

Can we keep the comment, please?

::: accessible/generic/HyperTextAccessible.cpp:2013
(Diff revision 4)
> -  // We only need info up to skipped offset -- that is what we're converting to original offset
> +  *aContentOffset = text.mOffsetWithinNodeText;

Same here.

r=me with the two nits fixed. Thanks!

Attachment #8675486 - Flags: review?(mzehe) → review+

Mats Palmgren (inactive)

Comment 99

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

https://reviewboard.mozilla.org/r/22369/#review21059

r=mats

::: layout/generic/nsTextFrame.cpp:9009
(Diff revisions 3 - 4)
> +        NS_ASSERTION(result.mOffsetWithinNodeRenderedText >= offsetInRenderedString &&

Please use MOZ_ASSERT instead.

Attachment #8675486 - Flags: review?(mats) → review+

Mats Palmgren (inactive)

Comment 100

•

9 years ago

Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

https://reviewboard.mozilla.org/r/22371/#review21061

Attachment #8675487 - Flags: review?(mats) → review+

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 101

•

9 years ago

+struct InnerTextAccumulator {
{ goes to its own line
+  nsString& mString;
+  int8_t mRequiredLineBreakCount;
Could you put these to the end of the struct.

void Append(char ch)
s/ch/aCh/

+static bool
+ElementIsVisible(nsIContent* aContent)
+{
+  Element* elem = aContent->AsElement();
+  if (!elem) {
+    return false;
+  }
You must check aContent->IsElement() before using AsElement(). AsElement just does cast.
(and in DOM world its name would be GetAsElement() if it might return null.)


+  RefPtr<nsStyleContext> sc = nsComputedDOMStyle::GetStyleContextForElement(
+      elem, nullptr, nullptr);
2 spaces for indentation.


How does ElementIsVisible behave if the element is inside display: none; ? And how should it work?


GetRequiredInnerTextLineBreakCount
So <p<>'s linebreak handling never depends on its styling.


+enum TreeTraversalState {
+  AT_NODE,
+  AFTER_NODE
+};
Could you please add some comment what the states are about.

+void
+nsGenericHTMLElement::GetInnerText(mozilla::dom::DOMString& aValue,
+                                   mozilla::ErrorResult& aError)
you can drop mozilla::dom and mozilla:: here

+{
+  if (!GetPrimaryFrame(Flush_Layout)) {
+    RefPtr<nsStyleContext> sc =
+        nsComputedDOMStyle::GetStyleContextForElementNoFlush(this, nullptr, nullptr);
2 space for indentation

+  RefPtr<nsRange> range;
+  nsresult rv = nsRange::CreateRange(static_cast<nsINode*>(this), 0, this,
+      GetChildCount(), getter_AddRefs(range));
2 spaces for indentation

Hmm, GetInnerText looks a bit malloc heavy.  Did you compare the performance to other engines?
One simple optimization would be to not create Range object all the time. Either reuse one (and let Document to keep it alive) or
don't use Range at all but just pass the right 'startParent' and 'endParent' to GetInnerTextNoFlush.
In fact, since you just pass 'this' as start and end, why do we need Range at all?
GetInnerTextNoFlush could be just a static method taking startParent and endOffset, or if we want to support using if for other cases too, 
make it take start/endParent and also start/endOffset. But no need for creating Range object all the time.

+  nsString str;
+  const char16_t* s = aValue.BeginReading();
+  const char16_t* end = aValue.EndReading();
nsAutoString for str.

+      if (!str.IsEmpty()) {
+        RefPtr<nsTextNode> textContent =
+            new nsTextNode(NodeInfo()->NodeInfoManager());
2 spaces for indentation

+      already_AddRefed<mozilla::dom::NodeInfo> ni =
+          NodeInfo()->NodeInfoManager()->GetNodeInfo(nsGkAtoms::br,
ditto

+      str.SetLength(0);
.Truncate()


So I'd like to see a bit less malloc heavy (especially given that Range is cycle collectable object) setup for the getter, and it should be rather easy to do.

Flags: needinfo?(bugs)

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

9 years ago

Attachment #8675487 - Flags: review?(bugs) → review-

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 102

•

9 years ago

(In reply to Marco Zehe (:MarcoZ) from comment #98)
> ::: accessible/generic/HyperTextAccessible.cpp:1989
> (Diff revision 4)
> > -  // Only get info up to original offset, we know that will be larger than skipped offset
> > +  *aRenderedOffset = text.mOffsetWithinNodeRenderedText;> Can we keep the comment, please?

I don't think we should keep it, because the fact "we know that will be larger than skipped offset" is no longer relevant. The new code doesn't depend on that assumption.

> ::: accessible/generic/HyperTextAccessible.cpp:2013
> (Diff revision 4)
> > -  // We only need info up to skipped offset -- that is what we're converting to original offset
> > +  *aContentOffset = text.mOffsetWithinNodeText;
> 
> Same here.

I don't think this comment really makes sense anymore either.

Flags: needinfo?(mzehe)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 103

•

9 years ago

(In reply to Mats Palmgren (:mats) from comment #99)
> ::: layout/generic/nsTextFrame.cpp:9009
> (Diff revisions 3 - 4)
> > +        NS_ASSERTION(result.mOffsetWithinNodeRenderedText >= offsetInRenderedString &&
> 
> Please use MOZ_ASSERT instead.

I'm not a fan of this change, since I think an error in rendered text offsets is not worth interrupting a test run for, but I don't really care so I'll do it.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 104

•

9 years ago

(In reply to Olli Pettay [:smaug] from comment #101)
> +struct InnerTextAccumulator {
> { goes to its own line

Done.

> +  nsString& mString;
> +  int8_t mRequiredLineBreakCount;
> Could you put these to the end of the struct.

Done.

> void Append(char ch)
> s/ch/aCh/

Done

> +static bool
> +ElementIsVisible(nsIContent* aContent)
> +{
> +  Element* elem = aContent->AsElement();
> +  if (!elem) {
> +    return false;
> +  }
> You must check aContent->IsElement() before using AsElement(). AsElement
> just does cast.
> (and in DOM world its name would be GetAsElement() if it might return null.)

I made ElementIsVisible take Element* and moved AsElement() to the caller. The caller only calls it if IsHTMLElement(nsGkAtoms::rp) is true, which I guess is enough to justify calling AsElement().

> +  RefPtr<nsStyleContext> sc = nsComputedDOMStyle::GetStyleContextForElement(
> +      elem, nullptr, nullptr);
> 2 spaces for indentation.

Done.

> How does ElementIsVisible behave if the element is inside display: none; ?
> And how should it work?

It returns true. It's only supposed to check the visiblity style. This is because <rp> is normally display:none but we want to include it in innerText --- but we *don't* want to include it if it's visibility:hidden.

> GetRequiredInnerTextLineBreakCount
> So <p<>'s linebreak handling never depends on its styling.

Correct. I'm guessing that making linebreak handling depend on CSS style (e.g. 'margin') would be unnecessarily complicated.

> +enum TreeTraversalState {
> +  AT_NODE,
> +  AFTER_NODE
> +};
> Could you please add some comment what the states are about.

Sure.

> +void
> +nsGenericHTMLElement::GetInnerText(mozilla::dom::DOMString& aValue,
> +                                   mozilla::ErrorResult& aError)
> you can drop mozilla::dom and mozilla:: here

Done.

> +{
> +  if (!GetPrimaryFrame(Flush_Layout)) {
> +    RefPtr<nsStyleContext> sc =
> +        nsComputedDOMStyle::GetStyleContextForElementNoFlush(this, nullptr,
> nullptr);
> 2 space for indentation

Done.

> +  RefPtr<nsRange> range;
> +  nsresult rv = nsRange::CreateRange(static_cast<nsINode*>(this), 0, this,
> +      GetChildCount(), getter_AddRefs(range));
> 2 spaces for indentation

Done.

> Hmm, GetInnerText looks a bit malloc heavy.  Did you compare the performance
> to other engines?

No. I don't know of any use-cases where performance would matter, though I suppose that won't stop people writing silly benchmarks.

> One simple optimization would be to not create Range object all the time.
> Either reuse one (and let Document to keep it alive) or
> don't use Range at all but just pass the right 'startParent' and 'endParent'
> to GetInnerTextNoFlush.
> In fact, since you just pass 'this' as start and end, why do we need Range
> at all?

Because I'd like to reuse this to implement Selection.toString().

> GetInnerTextNoFlush could be just a static method taking startParent and
> endOffset, or if we want to support using if for other cases too, 
> make it take start/endParent and also start/endOffset. But no need for
> creating Range object all the time.

OK, I'll do that.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 105

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

The test changes here are to adjust for the fact that
nsTextFrame::GetRenderedText can now trim whitespace from the end of lines
that end in a hard line break.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 106

•

9 years ago

Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Attachment #8675487 - Flags: review- → review?(bugs)

Marco Zehe (:MarcoZ)

Comment 107

•

9 years ago

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #102)
> (In reply to Marco Zehe (:MarcoZ) from comment #98)
> > ::: accessible/generic/HyperTextAccessible.cpp:1989
> > (Diff revision 4)
> > > -  // Only get info up to original offset, we know that will be larger than skipped offset
> > > +  *aRenderedOffset = text.mOffsetWithinNodeRenderedText;> Can we keep the comment, please?
> 
> I don't think we should keep it, because the fact "we know that will be
> larger than skipped offset" is no longer relevant. The new code doesn't
> depend on that assumption.

OK, thanks for the clarification! Just wanted to make sure we don't lose any important information. This goes for the second one, too.

Flags: needinfo?(mzehe)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 108

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

Note 
  struct RenderedText {
    nsString mString;
    uint32_t mOffsetWithinNodeRenderedText;
    int32_t mOffsetWithinNodeText;
    RenderedText() : mOffsetWithinNodeRenderedText(0),
        mOffsetWithinNodeText(0) {}
  };
that mString shows up pretty badly in a silly testcase like 
http://mozilla.pettay.fi/moztests/innertext_nonsimple_nostyle_test.html
That test has two parts, and it is the latter part where mString handling shows up.
The first 500 runs is reflow bounded (can we do anything to that?), and the latter part is malloc heavy.
Using nsAutoString and not nsString reduces allocations quite a bit. Still behind blink's performance, but at least a bit closer.

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 109

•

9 years ago

Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

+IsVisibleAndNotInReplacedElement(nsIFrame* aFrame)
+{
+  if (!aFrame || !aFrame->StyleVisibility()->IsVisible()) {
+    return false;
+  }
+  for (nsIFrame* f = aFrame->GetParent(); f; f = f->GetParent()) {
+    if (f->GetContent() && f->GetContent()->IsHTMLElement(nsGkAtoms::button)) {
+      continue;
+    }
+    if (f->IsFrameOfType(nsIFrame::eReplaced)) {
+      return false;
+    }
+  }
+  return true;
+}
This shows up in performance profiles quite a bit (less than the nsString though). Anything we can do to it?

+nsRange::GetInnerTextNoFlush(DOMString& aValue, ErrorResult& aError,
+                             nsINode* aStartParent, uint32_t aStartOffset,
+                             nsINode* aEndParent, uint32_t aEndOffset)
I think aStartParent and aEndParent should be nsIContent objects since that is what the algorithm assumes.
(aEndParent->AsContent() isn't safe if aEndParent isn't nsIContent)


+void
+nsGenericHTMLElement::GetInnerText(mozilla::dom::DOMString& aValue,
+                                   mozilla::ErrorResult& aError)
+{
+  if (!GetPrimaryFrame(Flush_Layout)) {
+    RefPtr<nsStyleContext> sc =
+      nsComputedDOMStyle::GetStyleContextForElementNoFlush(this, nullptr, nullptr);
+    if (!sc || sc->StyleDisplay()->mDisplay == NS_STYLE_DISPLAY_NONE ||
+        !GetCurrentDoc()) {
GetCurrentDoc() is deprecated, and I think we want this to deal with shadow dom too, so IsInComposedDoc() should work fine.



+  // Batch possible DOMSubtreeModified events.
+  mozAutoSubtreeModified subtree(nullptr, nullptr);
+
+  // Scope firing mutation events so that we don't carry any state that
+  // might be stale
+  {
+    // We're relying on mozAutoSubtreeModified to keep a strong reference if
+    // needed.
+    nsIDocument* doc = OwnerDoc();
+
+    // Optimize the common case of there being no observers
+    if (nsContentUtils::HasMutationListeners(doc, NS_EVENT_BITS_MUTATION_NODEREMOVED)) {
+      subtree.UpdateTarget(doc, nullptr);
+      // Keep "this" alive during mutation listener firing
+      kungFuDeathGrip = this;
+      nsCOMPtr<nsINode> child;
+      for (child = nsINode::GetFirstChild();
+           child && child->GetParentNode() == this;
+           child = child->GetNextSibling()) {
+        nsContentUtils::MaybeFireNodeRemoved(child, this, doc);
+      }
+    }
+  }

// Batch possible DOMSubtreeModified events.
mozAutoSubtreeModified subtree(doc, nullptr);
FireNodeRemovedForChildren();

should work here.

Attachment #8675487 - Flags: review?(bugs) → review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 110

•

9 years ago

(In reply to Olli Pettay [:smaug] from comment #108)
> Note 
>   struct RenderedText {
>     nsString mString;
>     uint32_t mOffsetWithinNodeRenderedText;
>     int32_t mOffsetWithinNodeText;
>     RenderedText() : mOffsetWithinNodeRenderedText(0),
>         mOffsetWithinNodeText(0) {}
>   };
> that mString shows up pretty badly in a silly testcase like 
> http://mozilla.pettay.fi/moztests/innertext_nonsimple_nostyle_test.html
> That test has two parts, and it is the latter part where mString handling
> shows up.
> The first 500 runs is reflow bounded (can we do anything to that?)

Not the way it's currently specced, no.

> and the
> latter part is malloc heavy.
> Using nsAutoString and not nsString reduces allocations quite a bit. Still
> behind blink's performance, but at least a bit closer.

OK. The question is, is this something worth optimizing?

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 111

•

9 years ago

According to my profile 2/3 of the time is actually in nsTextFrame::GetRenderedText for your benchmark. 1/7 is in IsVisibleAndNotInReplacedElement. The string memory overhead (Linux) seems to be quite low, maybe 1/15.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 112

•

9 years ago

Comment on attachment 8675486 [details]
MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

The test changes here are to adjust for the fact that
nsTextFrame::GetRenderedText can now trim whitespace from the end of lines
that end in a hard line break.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 113

•

9 years ago

Comment on attachment 8675487 [details]
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

Attachment #8675487 - Flags: review+ → review?(bugs)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 114

•

9 years ago

Attached file MozReview Request: Bug 264412. Optimize GetRenderedText. r=mats — Details

Bug 264412. Optimize GetRenderedText. r=mats

With these changes we're slightly faster than Chrome on the non-reflowing part of
Olli's testcase.

Attachment #8681097 - Flags: review?(mats)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 115

•

9 years ago

I(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #111)
> According to my profile 2/3 of the time is actually in
> nsTextFrame::GetRenderedText for your benchmark. 1/7 is in
> IsVisibleAndNotInReplacedElement. The string memory overhead (Linux) seems
> to be quite low, maybe 1/15.

I misread the profile. A lot of the time in GetRenderedText was string processing; Append(char) seems to have significant fixed overhead. The third patch here avoids that. I also made changes to IsVisibleAndNotInReplacedElement to make it somewhat faster, and made a tweak so we call it roughly half as often.

Mats Palmgren (inactive)

Comment 116

•

9 years ago

Comment on attachment 8681097 [details]
MozReview Request: Bug 264412. Optimize GetRenderedText. r=mats

https://reviewboard.mozilla.org/r/23769/#review21253

::: layout/generic/nsTextFrame.cpp:8906
(Diff revision 1)
> +    if (ch == '\n') {

Perhaps this is more readable?

    if ((ch == '\n' && !aStyle->NewlineIsSignificant(aFrame)) ||
        (ch == '\t' && !aStyle->TabIsSignificant())) {
      ch = ' ';
    }

::: layout/generic/nsTextFrame.cpp:8918
(Diff revision 1)
> +  if (aStyle->mTextTransform != NS_STYLE_TEXT_TRANSFORM_NONE) {

Why exclude TRANSFORM_NONE up front?
The 'switch' is a NOP for that value already so it doesn't seem necessary.
Is it just an optimization for the common case?
If so, you should probably point that out for clarity, and perhaps also
make it an early return instead?

::: layout/generic/nsTextFrame.cpp:9058
(Diff revision 1)
> +          trimmedOffsets.GetEnd() - iter.GetOriginalOffset());

iter.GetOriginalOffset() hides a subtraction and this expression is
repeated in a couple of places (also in a loop).
I think you should make this a local var, say originalOffset.
That should make the std::min fit on one line?
Also, you can avoid the addition inside the loop, like so:

auto end = originalOffset + runLength;
for (int32_t i = originalOffset; i < end; ++i) {
  ... CharAt(i)

It might be worth getting rid of iter.GetOriginalOffset()
in the while-loop condition too unless it complicates
the code too much.

Attachment #8681097 - Flags: review?(mats)

Olli Pettay [:smaug][bugs@pettay.fi]

Comment 117

•

9 years ago

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #110)
> (In reply to Olli Pettay [:smaug] from comment #108)
> > and the
> > latter part is malloc heavy.
> > Using nsAutoString and not nsString reduces allocations quite a bit. Still
> > behind blink's performance, but at least a bit closer.
> 
> OK. The question is, is this something worth optimizing?
Well, we've spent quite some time to optimize innerHTML for example since many or probably most of
DOM related benchmarks have some silly micro benchmarks for it. 
Now with innerText supported I wouldn't be surprised benchmarks adding tests for it.
The reason I tested the performance here was to make sure we aren't way of the performance of others.

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

9 years ago

Attachment #8675487 - Flags: review?(bugs) → review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 118

•

9 years ago

(In reply to Mats Palmgren (:mats) from comment #116)
> ::: layout/generic/nsTextFrame.cpp:8906
> (Diff revision 1)
> > +    if (ch == '\n') {
> 
> Perhaps this is more readable?
> 
>     if ((ch == '\n' && !aStyle->NewlineIsSignificant(aFrame)) ||
>         (ch == '\t' && !aStyle->TabIsSignificant())) {
>       ch = ' ';
>     }

Sure

> ::: layout/generic/nsTextFrame.cpp:8918
> (Diff revision 1)
> > +  if (aStyle->mTextTransform != NS_STYLE_TEXT_TRANSFORM_NONE) {
> 
> Why exclude TRANSFORM_NONE up front?
> The 'switch' is a NOP for that value already so it doesn't seem necessary.
> Is it just an optimization for the common case?
> If so, you should probably point that out for clarity, and perhaps also
> make it an early return instead?

Sorry, that was left over from previous work. Removed.

> ::: layout/generic/nsTextFrame.cpp:9058
> (Diff revision 1)
> > +          trimmedOffsets.GetEnd() - iter.GetOriginalOffset());
> 
> iter.GetOriginalOffset() hides a subtraction and this expression is
> repeated in a couple of places (also in a loop).
> I think you should make this a local var, say originalOffset.
> That should make the std::min fit on one line?
> Also, you can avoid the addition inside the loop, like so:
> 
> auto end = originalOffset + runLength;
> for (int32_t i = originalOffset; i < end; ++i) {
>   ... CharAt(i)
> 
> It might be worth getting rid of iter.GetOriginalOffset()
> in the while-loop condition too unless it complicates
> the code too much.

I'd rather not do these things because they complicate the code and I very much doubt there's any measurable performance impact.

Mats Palmgren (inactive)

Comment 119

•

9 years ago

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #118)
> I'd rather not do these things because they complicate the code and I very
> much doubt there's any measurable performance impact.

Well, I didn't quite like the formatting here:
      runLength = std::min(runLength,
          trimmedOffsets.GetEnd() - iter.GetOriginalOffset());

so I figured a local var would fix that:
      auto originalOffset = iter.GetOriginalOffset();
      runLength = std::min(runLength, trimmedOffsets.GetEnd() - originalOffset);

and when you have that you might as well use it in the for-loop.

Oh well, it's a matter of opinion I guess. :-)

So r=mats with the TRANSFORM_NONE check removed.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 120

•

9 years ago

Comment on attachment 8681097 [details]
MozReview Request: Bug 264412. Optimize GetRenderedText. r=mats

Bug 264412. Optimize GetRenderedText. r=mats

With these changes we're slightly faster than Chrome on the non-reflowing part of
Olli's testcase.

Attachment #8681097 - Flags: review?(mats)

Mats Palmgren (inactive)

Updated

•

9 years ago

Attachment #8681097 - Flags: review?(mats) → review+

Mats Palmgren (inactive)

Comment 121

•

9 years ago

Comment on attachment 8681097 [details]
MozReview Request: Bug 264412. Optimize GetRenderedText. r=mats

https://reviewboard.mozilla.org/r/23769/#review21259

r=mats if my question below isn't an issue

::: dom/base/nsRange.cpp:3403
(Diff revision 2)
> -    } else {
> +    if (currentNode == endNode && currentState == endState) {
> +      break;

Hmm, doesn't this skip the AFTER_NODE action for the endNode?

::: layout/generic/nsTextFrame.cpp:9050
(Diff revision 2)
> +      runLength = std::min(runLength,
> +          trimmedOffsets.GetEnd() - iter.GetOriginalOffset());

I'd prefer if the second arg to std::min lined up with the first.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 122

•

9 years ago

(In reply to Mats Palmgren (:mats) from comment #121)
> ::: dom/base/nsRange.cpp:3403
> (Diff revision 2)
> > -    } else {
> > +    if (currentNode == endNode && currentState == endState) {
> > +      break;
> 
> Hmm, doesn't this skip the AFTER_NODE action for the endNode?

That's the same behavior as before. We stop as soon as we reach the state endNode/endState. And that's OK. For example, when HTMLElement calls GetInnerTextNoFlush, we pass the subject element and its GetChildCount() for aEndParent/aEndOffset. So we stop when currentNode is aEndParent and currentState is AFTER_NODE, which is before we've done the AFTER_NODE actions for aEndParent, which is correct per spec.

Mats Palmgren (inactive)

Comment 123

•

9 years ago

OK, good.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 124

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=f919b665924c

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 125

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/95346f49d048f5abb3d60df4beec4fe4ef412017
Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz

https://hg.mozilla.org/integration/mozilla-inbound/rev/5ebc59281c25fbb8ea288f24797b9ece1fdb21a5
Bug 264412. Implement HTMLElement.innerText. r=smaug,mats

https://hg.mozilla.org/integration/mozilla-inbound/rev/9160f08660b8290559e427fd80d617edd86fe2a6
Bug 264412. Optimize GetRenderedText. r=mats

Carsten Book [:Tomcat]

Comment 126

•

9 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/95346f49d048
https://hg.mozilla.org/mozilla-central/rev/5ebc59281c25
https://hg.mozilla.org/mozilla-central/rev/9160f08660b8

Status: NEW → RESOLVED

Closed: 20 years ago → 9 years ago

status-firefox45: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla45

Binyamin

Comment 127

•

9 years ago

Please update the compatibility table in https://developer.mozilla.org/en-US/docs/Web/API/Node/innerText - supported since Firefox 45.

Juriy "kangax" Zaytsev

Comment 128

•

9 years ago

I think there's a bug.

<p style="visibility:hidden"><span style="visibility:visible">p</span></p><div>div</div>

This chunk produces "p\ndiv" in FF but "p\n\ndiv" in Chrome. 

Chrome behavior seems to make more sense to me. Also, removing visibility:hidden/visibility:visible from <p> produces more sensible double \n in FF right away.

Juriy "kangax" Zaytsev

Comment 129

•

9 years ago

Besides the above mentioned combination (which I assume to be a bug), I have so far found 2 differences in Chrome and FF: Chrome doesn't respect "whitespace: pre-line" (actually none of the browsers do besides FF) which I think is a bug in Chrome (and IE/Edge). Chrome puts 1 newline between paragraph and non-paragraph (like div) whereas FF puts 2, which I think is correct, and which IE10+/Edge do too.

The updated compat table — http://kangax.github.io/jstests/innerText/

I'm planning to add few more tests (from the ones Robert added with this commit) to find out if there's more deviations among implementations. And will try to file tickets with Chrome as well.

Marco Zehe (:MarcoZ)

Updated

•

9 years ago

Depends on: 1221043

Juriy "kangax" Zaytsev

Comment 130

•

9 years ago

Quick note about performance. 

On my test page, I get 25-30ms vs. 35-40ms. in Chrome vs. Firefox (jsperf is unusable so my benchmark is crude) — so a pretty significant/noticeable difference. This might be because Chrome cuts corners.

The reason this is important is because some apps need to request innerText quite frequently on large documents when editing text and having to calculate position of cursor in formatted text.

P.S. Edge seems to perform really bad (~100ms) although I'm checking it through VirtualBox so perhaps that's what's affecting it.

Carsten Book [:Tomcat]

Comment 131

•

9 years ago

bugherder uplift

https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/95346f49d048
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/5ebc59281c25
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/9160f08660b8

status-b2g-v2.5: --- → fixed

Carsten Book [:Tomcat]

Comment 132

•

9 years ago

removing the b2g 2.5 flag since this commit has been reverted due to an incorrect merge, sorry for the confusion

status-b2g-v2.5: fixed → ---

Josh Matthews [:jdm]

Updated

•

9 years ago

Depends on: 1226293

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 133

•

9 years ago

(In reply to Juriy "kangax" Zaytsev from comment #128)
> <p style="visibility:hidden"><span
> style="visibility:visible">p</span></p><div>div</div>
> 
> This chunk produces "p\ndiv" in FF but "p\n\ndiv" in Chrome. 
> 
> Chrome behavior seems to make more sense to me. Also, removing
> visibility:hidden/visibility:visible from <p> produces more sensible double
> \n in FF right away.

The FF/spec behavior here is intentional.

Handling CSS 'visibility' in innerText is tricky because in CSS 'visibility' does not affect layout, but that doesn't make sense in the context of 'innerText'. (The most faithful mimicry would be to replace text with spaces, perhaps, but that seems crazy and no browser does it.) So I've chosen to take a simple approach in FF and the spec where visibility:visible elements are just ignored for innerText. Chrome has a halfway approach where visibility:hidden suppresses some related text but not a <p>'s extra newlines *if the <p> has visible descendants* (or something like that).

We could make the spec a little more complicated to match Chrome here, but I'm reluctant to do that without a strong reason. Chrome's <p> handling has some definitely undesirable features, e.g. abc<p>123</p>def gives "abc\n123\n\ndef", i.e. you get a "bottom margin" but not a "top margin", which makes no sense, so even if Chrome's <p>-related behavior had a simple spec (which I doubt), we wouldn't really want to follow it. I very much doubt this case will matter in practice.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 134

•

9 years ago

(In reply to Juriy "kangax" Zaytsev from comment #129)
> Besides the above mentioned combination (which I assume to be a bug), I have
> so far found 2 differences in Chrome and FF: Chrome doesn't respect
> "whitespace: pre-line" (actually none of the browsers do besides FF) which I
> think is a bug in Chrome (and IE/Edge). Chrome puts 1 newline between
> paragraph and non-paragraph (like div) whereas FF puts 2, which I think is
> correct, and which IE10+/Edge do too.
> 
> The updated compat table — http://kangax.github.io/jstests/innerText/
> 
> I'm planning to add few more tests (from the ones Robert added with this
> commit) to find out if there's more deviations among implementations. And
> will try to file tickets with Chrome as well.

Thanks!!!

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 135

•

9 years ago

(In reply to Juriy "kangax" Zaytsev from comment #130)
> On my test page, I get 25-30ms vs. 35-40ms. in Chrome vs. Firefox (jsperf is
> unusable so my benchmark is crude) — so a pretty significant/noticeable
> difference. This might be because Chrome cuts corners.
>
> The reason this is important is because some apps need to request innerText
> quite frequently on large documents when editing text and having to
> calculate position of cursor in formatted text.

OK ... it would be great if you can file a bug with a specific testcase on this.

Flags: needinfo?(kangax)

Tantek Çelik

Comment 136

•

9 years ago

From a spec standpoint, just to confirm, this from 2011-02-03 is the latest there is right?

https://rawgit.com/timdown/rangy/master/fiddlings/spec/innerText.htm

Karl Dubost💡 :karlcow

Comment 137

•

9 years ago

Tantek,
here by Roc
https://github.com/rocallahan/innerText-spec

Binyamin

Comment 138

•

9 years ago

http://rocallahan.github.io/innerText-spec/ is WHATWG approved innerText spec (https://github.com/whatwg/compat/issues/5#issuecomment-168049752)

Firefox Nightly 46 passes all tests (http://rocallahan.github.io/innerText-spec/setter-tests.html, http://rocallahan.github.io/innerText-spec/getter-tests.html).

Chrome issue https://code.google.com/p/chromium/issues/detail?id=573309

Alice0775 White

Updated

•

9 years ago

Blocks: 998390

devan.shah14

Comment 139

•

8 years ago

When we try to fetch innerText when it is not set at all, it used return undefined and now it is returning "" (empty string), can this be changed so that in the case that innerText is not set then it comes as undefined and not as an empty string. Technically if you have not set any thing it should be undefined, or was there a reason to actually have the default be "" empty string.

Juriy "kangax" Zaytsev

Comment 140

•

8 years ago

innerText is not supposed to return something that was set before. It returns element's textual representation, which always exists and should always be a string.

Flags: needinfo?(kangax)

Loic

Updated

•

8 years ago

Depends on: 1260025

Kohei Yoshino [:kohei]

Updated

•

8 years ago

Depends on: 1268833

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

8 years ago

Depends on: 1288975

Loic

Updated

•

8 years ago

Depends on: 1290937

Karl Dubost💡 :karlcow

Updated

•

3 years ago

MozReview Request: Bug 264412. Refactor nsIFrame::GetRenderedText API to be more sane. r=mats,marcoz 9 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 40 bytes, text/x-review-board-request	MatsPalmgren_bugz : review+ MarcoZ : review+	Details
MozReview Request: Bug 264412. Implement HTMLElement.innerText. r=smaug,mats 9 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 40 bytes, text/x-review-board-request	MatsPalmgren_bugz : review+ smaug : review+	Details
MozReview Request: Bug 264412. Optimize GetRenderedText. r=mats 9 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 40 bytes, text/x-review-board-request	MatsPalmgren_bugz : review+	Details