Closed
Bug 1391538
Opened 7 years ago
Closed 7 years ago
nsTextFragment for the text node in <input> or <textarea> should not store text as single byte characters even if all characters are less than U+0100
Categories
(Core :: DOM: Core & HTML, enhancement)
Core
DOM: Core & HTML
Tracking
()
RESOLVED
FIXED
mozilla57
Tracking | Status | |
---|---|---|
firefox57 | --- | fixed |
People
(Reporter: masayuki, Assigned: masayuki)
References
(Blocks 1 open bug)
Details
Attachments
(1 file)
nsTextFragment is the storage of nsGenericDOMDataNode (nsTextNode). It stores text as single byte characters if all characters are less than 256. However, this packing makes damage of dynamical change of text nodes in web apps.
I *guess* that we don't need such saving anymore, however, at least in this bug, we should disable the single byte mode for text nodes in <input> and <textarea>.
Comment 1•7 years ago
|
||
Sounds reasonable to disable the memory optimization for input and textarea. I guess for any native anonymous content?
nsTextFragment methods could take a param to tell whether mIs2b mode should be forced and then
textnode implementation just passes information whether it is native anonymous or something like that.
(Removing the !mIs2b mode everywhere is way more controversial. It might increase memory usage quite a bit.)
Assignee | ||
Comment 2•7 years ago
|
||
Assignee | ||
Comment 3•7 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #1)
> Sounds reasonable to disable the memory optimization for input and textarea.
> I guess for any native anonymous content?
I have no idea. What's for example? (Anyway, should be changed in other bugs.)
> nsTextFragment methods could take a param to tell whether mIs2b mode should
> be forced and then
> textnode implementation just passes information whether it is native
> anonymous or something like that.
I add nsTextFrame::MarkAsMaybeModifiedFrequently() and calling it before calling SetText() or SetData(). I think that's easier to understand what is specifying rather than bool argument of constructor.
> (Removing the !mIs2b mode everywhere is way more controversial. It might
> increase memory usage quite a bit.)
Yeah, but it's only Western websites and no Emojis nor Unicode hyphens, spaces etc. I wonder the scan cost of adding text may make users feel worse than saving memory usage since if web content loads an image, it'll eats memory more than what we're saving in nsTextFragment.
Comment 4•7 years ago
|
||
(In reply to Masayuki Nakano [:masayuki] (JST, +0900) from comment #3)
> (In reply to Olli Pettay [:smaug] from comment #1)
> > Sounds reasonable to disable the memory optimization for input and textarea.
> > I guess for any native anonymous content?
>
> I have no idea. What's for example? (Anyway, should be changed in other
> bugs.)
I mean since the content of <input> and <textarea> is native anonymous content, and it is fast to get data about a node being native anonymous, all such nodes can probably behave the same way.
>
> > nsTextFragment methods could take a param to tell whether mIs2b mode should
> > be forced and then
> > textnode implementation just passes information whether it is native
> > anonymous or something like that.
>
> I add nsTextFrame::MarkAsMaybeModifiedFrequently() and calling it before
> calling SetText() or SetData(). I think that's easier to understand what is
> specifying rather than bool argument of constructor.
Hmm. And what does MarkAsMaybeModifiedFrequently do? Do we have spare bits somewhere?
> Yeah, but it's only Western websites and no Emojis nor Unicode hyphens,
> spaces etc.
It is also about whitespace
Assignee | ||
Comment 5•7 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #4)
> (In reply to Masayuki Nakano [:masayuki] (JST, +0900) from comment #3)
> > (In reply to Olli Pettay [:smaug] from comment #1)
> > > Sounds reasonable to disable the memory optimization for input and textarea.
> > > I guess for any native anonymous content?
> >
> > I have no idea. What's for example? (Anyway, should be changed in other
> > bugs.)
> I mean since the content of <input> and <textarea> is native anonymous
> content, and it is fast to get data about a node being native anonymous, all
> such nodes can probably behave the same way.
Ah, okay.
> > > nsTextFragment methods could take a param to tell whether mIs2b mode should
> > > be forced and then
> > > textnode implementation just passes information whether it is native
> > > anonymous or something like that.
> >
> > I add nsTextFrame::MarkAsMaybeModifiedFrequently() and calling it before
> > calling SetText() or SetData(). I think that's easier to understand what is
> > specifying rather than bool argument of constructor.
> Hmm. And what does MarkAsMaybeModifiedFrequently do? Do we have spare bits
> somewhere?
Yeah, it sets new flag with nsINode::SetFlags() and nsTextFragment::SetTo() callers sets bool result of nsINode::GetFlags() to new argument.
> > Yeah, but it's only Western websites and no Emojis nor Unicode hyphens,
> > spaces etc.
> It is also about whitespace
If you say about shared storage of whitespaces in nsTextFragment, I think that they can be made to char16_t* if we do it.
Comment hidden (mozreview-request) |
Comment 7•7 years ago
|
||
mozreview-review |
Comment on attachment 8899020 [details]
Bug 1391538 - nsTextFragment for text nodes in <input> or <textarea> shouldn't store text as single byte characters even if all characters are less than U+0100
https://reviewboard.mozilla.org/r/170342/#review175570
Comments fixed, r+.
Not about this bug, but we might want to measure how often JS wants the data of text nodes, and perhaps mark such text nodes with NS_MAYBE_MODIFIED_FREQUENTLY.
Also storing the data using StringBuffer might make sense.
::: dom/base/nsTextFragment.cpp:334
(Diff revision 1)
> nsTextFragment::Append(const char16_t* aBuffer, uint32_t aLength, bool aUpdateBidi)
> {
> // This is a common case because some callsites create a textnode
> // with a value by creating the node and then calling AppendData.
> if (mState.mLength == 0) {
> - return SetTo(aBuffer, aLength, aUpdateBidi);
> + return SetTo(aBuffer, aLength, aUpdateBidi, mState.mIs2b);
Why this way?
Couldn't you just add aForce2b param to Append and then the only caller of this method would use it.
That would be at least easier to follow.
::: editor/libeditor/EditorBase.cpp:4741
(Diff revision 1)
> +already_AddRefed<nsTextNode>
> +EditorBase::CreateTextNode(nsIDocument& aDocument,
> + const nsAString& aData)
> +{
> + RefPtr<nsTextNode> text = aDocument.CreateEmptyTextNode();
> + text->MarkAsMaybeModifiedFrequently();printf("%p, Marked!\n", text.get());
You have some left over printf here. Remove it.
Attachment #8899020 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 8•7 years ago
|
||
mozreview-review-reply |
Comment on attachment 8899020 [details]
Bug 1391538 - nsTextFragment for text nodes in <input> or <textarea> shouldn't store text as single byte characters even if all characters are less than U+0100
https://reviewboard.mozilla.org/r/170342/#review175570
> Why this way?
> Couldn't you just add aForce2b param to Append and then the only caller of this method would use it.
> That would be at least easier to follow.
Ah, I misunderstood about nsTextFragment::AppendTo(). Okay, I can do it simply. (I thought that a lot of AppendTo() calls Append().)
> You have some left over printf here. Remove it.
Oops! Thank you for your nice catch!
Comment hidden (mozreview-request) |
Assignee | ||
Comment 10•7 years ago
|
||
Comment 11•7 years ago
|
||
Pushed by masayuki@d-toybox.com:
https://hg.mozilla.org/integration/autoland/rev/346b38526bdc
nsTextFragment for text nodes in <input> or <textarea> shouldn't store text as single byte characters even if all characters are less than U+0100 r=smaug
Comment 12•7 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Assignee | ||
Comment 13•7 years ago
|
||
Hmm, today's Nightly, I see 2ms in nsTextFragment.
https://perfht.ml/2wmO18s
> mozilla::EditorBase::SetTextImpl(mozilla::dom::Selection &,nsAString const &,mozilla::dom::Text &)
> nsGenericDOMDataNode::SetData(nsAString const &)
> nsGenericDOMDataNode::SetTextInternal(unsigned int,unsigned int,char16_t const *,unsigned int,bool,CharacterDataChangeInfo::Details *)
> nsTextFragment::SetTo(char16_t const *,int,bool,bool)
> mozilla::CheckedInt<unsigned int>::operator*=<unsigned __int64>(unsigned __int64)
and
> nsGenericDOMDataNode::SetTextInternal(unsigned int,unsigned int,char16_t const *,unsigned int,bool,CharacterDataChangeInfo::Details *)
> nsTextFragment::SetTo(char16_t const *,int,bool,bool)
> nsTextFragment::UpdateBidiFlag(char16_t const *,unsigned int)
> HasRTLChars(char16_t const *,unsigned int)
I should struggle with them. Perhaps, we can skip something for the both case.
Comment 14•7 years ago
|
||
Note that this may not just affect memory. At one point we had a layout fast-path for ASCII-only text that was triggered only in the m1b textfragment case. I don't know whether we still have that.
Comment 15•7 years ago
|
||
FWIW, I did some <textarea> profiling with lots of text and couldn't immediately see 2b usage related slowness when pasting or modifying the text.
(and need to disable spellcheck in such testing, since it is still too slow and happening too often. I'll file some bugs.)
Comment 16•7 years ago
|
||
Except that HasRTLChars does show up.
You need to log in
before you can comment on or make changes to this bug.
Description
•