62188 - Support start number for lists in plain text output

Reporter

Description

•

24 years ago

Spinoff of bug 61173. We will start all numbered lists on 1 regardless if someone copy'n'pastes the number 4711 to 4717 in a list. There is support in html to specify number of an item in a list. We should look at that in the plaintext converter.

Tanu Mutreja

Comment 1

•

23 years ago

Bratell, would you please provide some more information about this bug. I tried with following: I copied a list of following from the browser : 1. one 2. two 3. three 4. four then pasted it on notepad and got the following : 1. one 2. two 3. three 4. four but when I copied : 2. two 3. three 4. four and pasted it on notepad I got : * two * three * four whereas if I pasted the same on composer I got: 1. two 2. three 3. four Do you mean that the numbering is maintained during HTML serialization but not in PlainTextSerialization. Precisely, do we want 1. two 2. three 3. four to appear while copying on the notepad.

Daniel Bratell

Reporter

Comment 2

•

23 years ago

What I meant this bug for was to add support to nsPlainTextSerializer for the start attribute in <ol>. That way the editor/copy code could add that attribute to the copied list items so that they would have the same numbers in the cut'n'paste. If the list is a ranking and I copy the part between 80 and 89 into a list to show how bad my friend is, it's bad that the output list is numbered 1 to 10. That you got an unordered list must be because the copy code doesn't output a <ol> around the copied <li>:s. Maybe this is the reason they don't? So to answer your question, I want notepad to get the following: 2. two 3. three 4. four Anyway, it's problably quite simple and safe to add the support in nsPlainTextSerializer. May be a one-liner almost, but I have bigger fish to catch right now so I will probably not look at it sometime soon and when that is done the copy code have to be changed to use the start attribute and I have no idea how simple or difficult that is.

Tanu Mutreja

Comment 3

•

23 years ago

Attached patch Patch to fix this. (obsolete) — Details — Splinter Review

Currently OL and LI tags are not being passed to the serializer so we have no clue about the ordinal number of LI elements. Problem got more complicated in order to take care of nested lists, partially selected ordered lists (where we may completely loose track of the ordinal number for LI if "value" attribute is not explicitly set with that LI) and to keep the view source cleaner to maximum possible extent. Also we could not set any other temporary attribute for internal processing of LI value as that attribute would not be interpreted correctly if we paste the content to any other HTML editor. I made following modifications to the code to take care of copying list items with 'n' levels of nesting and pasting them to some plain text editor / HTML editor / MSWord etc. : 1. Added OL and LI in nsHTMLCopyEncoder::IncludeInContext(). 2. Added a member variable |mIsCopying| to nsDocumentEncoder, which is true only when the content is coming via nsHTMLCOPYEncoder. 3. While copying the partial list (more complicated when the selected part is having nested lists too) as we loose track of the actual ordinal number of LI elements if it's not explicitly set, we added a temporary "value" attribute to the first LI element of an OL within the selected range. Based on this, ordinal numbers for the remaining LIs are calculated during serialization. 4. To get the value of this temporary attribute, we start from the first selected LI item and traverse the DOM Tree back till we get either a previous sibling LI node with an explicit "value" attribute or there is no more previous sibling to traverse. In case we don't get any sibling node with a value attribute, we need to decide the ordinal number based on "start" attribute of the parent OL node. If parent OL does not contain explicitly mentioned "start" attribute, set it to a default value of 0. Also to get the right ordinal, we need to keep track of the number of previous siblings we traversed. 5. Now as the "value" attributes are added to the original DOM Nodes, we have to avoid displaying them while "view source". To take care of this, as soon as we add that temporary "value" attribute we push the node in a stack and later when the destructor of nsDocumentEncoder is called after the copy part is done, we clean up nodes kept in the stack for removal of the added "value" attribute. Remove the stack once this clean up is done. 6. By this time, clipboard content is supposed to contain clearly mentioned ordinal number wherever required, so during HTMLSerialization ordinal numbers are correctly generated. 7. In PlainText Serializer, we defined a data structure a keep start attribute of an OL and number of LI nodes traversed. Later whenever LI tag is encountered, we check if it contains explicit "value" attribute otherwise we read the stack to get the start value and number of siblings traversed before(For that OL). Addition of these two values gives the right ordinal value for the LI node.

Daniel Bratell

Reporter

Comment 4

•

23 years ago

Comment on attachment 69038 [details] [diff] [review] Patch to fix this. I can only comment on the nsPlainTextSerializer part but I think you're making things a little more complicated than needed. In the nsPlainTextSerializer there is already a stack that is used to keep track of the next number for the current ordered list, mOLStack. you see the mOLStack[mOLStackIndex-1]++, where the number is increased (removed by your patch). When you encounter the value attribute you can just set that value directly instead of increasing by one. I don't see you ever using the startval number you're saving, except for copying it to other state structs. Just using the existing stack should make your patch much smaller.

Daniel Bratell

Reporter

Comment 5

•

23 years ago

Giving the bug to Tanu, since he/she has made good progress on it.

Assignee: bratell → tmutreja

Akkana Peck

Comment 6

•

23 years ago

RemoveElementAt might be a little cleaner than ReplaceElementAt(nsnull...), though I expect the implementations are pretty much the same. I agree with Daniel about using the existing stack; or if you can't use it, please delete it or integrate it into your new variable, so we don't have two different OL stacks being maintained.

Tanu Mutreja

Comment 7

•

23 years ago

Attached patch Incorporating the suggestions... (obsolete) — Details — Splinter Review

Thanks Daniel & Akkana! I think I really complicated the stuff. Modifying nsPlainTextSerializer here, as per your suggestions.

Vinay Badami

Updated

•

23 years ago

Keywords: patch

Whiteboard: [patch need r/sr=]

Akkana Peck

Comment 8

•

23 years ago

Comment on attachment 69222 [details] [diff] [review] Incorporating the suggestions... Much simpler, thanks! Looks fine to me, r=akkana. Personally I'd probably move the declaration of startAttr inside the next block along with startVal, but I'll leave the decision on that to you.

Attachment #69222 - Flags: review+

Daniel Bratell

Reporter

Comment 9

•

23 years ago

Comment on attachment 69222 [details] [diff] [review] Incorporating the suggestions... Yes, this looks fine. If I may ask for a favour, that would be to comment this: if (mULCount + mOLStackIndex == 0) - EnsureVerticalSpace(1); + EnsureVerticalSpace(0); Those constants are fragile since the serializer is used in so many places.

Johnny Stenback (:jst)

Comment 10

•

23 years ago

Comment on attachment 69222 [details] [diff] [review] Incorporating the suggestions... I'm not at all happy about us using a "value" attribute in the DOM for storing the number of an list item, this is something that should be dealt with in the text output code, w/o touching the DOM tree. For all we know someone else could've stored valuable information in the "value" attribute on the list items. Keep this information in the text output code.

Attachment #69222 - Flags: needs-work+

Tanu Mutreja

Comment 11

•

23 years ago

Johnny, here we are not setting "value" attribute if LI is already having it. But I see no other way to overcome special cases like the one below: If the original list is something like: <OL> <LI>ONE</LI> <LI VALUE = 100>TWO</LI> <LI>THREE</LI> <OL> <LI>FOUR</LI> </LI>FIVE</LI> </OL> <LI>SIX</LI> </OL> and then if I make a selection for copy starting from the LI having "THREE", we almost loose the context information for this node. I could not see any other way to set an ordinal as 101 for this node. Only if I set a value attribute to this node and pass further, serializer can know the actual ordinal for this one otherwise serializer will always take this with an ordinal as 3. So, as far as I understand your concern here, we are not doing anything with an LI having "value" attribute so there is no possibility of loosing any information. Moreover, as we are manipulating this additional attribute only while copying and cleaning the node as soon as we are done with copy, there should be no eventual impact of this. Once I'm clear with this part, I'll update the patch for incorporating other suggestions too.

Johnny Stenback (:jst)

Comment 12

•

23 years ago

Even if we're sure we don't loose value attributes when copying LI's, modifying the DOM tree while doing the copy is IMO not a good design. Why not simply have a trivial hash (plhash, or pldhash) that you store the "value"'s in as you see them (i.e. key on the LI's nsIDOMNode pointer value or something, and store the "value" in the hash based on that key), and then use that when you serialize the document? That would leave the DOM tree alone (and not cause reflows, mutation events, and who knows what as a result of setting the "value" attributes) and isolate this logic completely inside the serializer code. This is not hard, and is IMO well worth doing.

Tanu Mutreja

Comment 13

•

23 years ago

Johnny & Akkana, when we use nsPlainTextSerializer for pasting the content on text editor, we have Parser Nodes instead of DOM nodes. So putting the DOM nodes / elements on hash may not work because while pasting it we will not be having handle to the DOM. (Correct me, if I'm wrong here !!!) Other solution that I can think now is to set the value attribute to LIs while doing HTML serialization(during "copy") in nsHTMLContentSerializer. If feasible, that would just be shifting the logic from nsDocumentEncoder--> nsHTMLContentSerializer. Please suggest if you have anything better in mind and see if the above makes some sense.

Johnny Stenback (:jst)

Comment 14

•

23 years ago

If we're serializing to plaintext from parser nodes there is no problem here, right? In that case we'll be outputting every LI that the parser hands us, and we'll have every LI so we can easily compute the numbers for every LI as we serialize them. However, if we're serializing a partial DOM tree into HTML, we'll have a DOM tree, and we can use the hash ahd output the value atributes that are needed to make the output correct.

Tanu Mutreja

Comment 15

•

23 years ago

Attached patch Moving the logic from nsDocumentEncoder to nsHTMLContentSerializer... (obsolete) — Details — Splinter Review

In this patch I've shifted the logic of setting "value" attributes to LI from nsDocumentEncoder --> nsHTMLContentSerializer and hence we are no more modifying the original DOM tree. Daniel, I'm still continuing with the EnsureVerticalSpace(0), as on skipping this I see 2 new line characters whenever I paste an OL to a plain text editor. Please let me know if I'm missing some cases where you think it mandatory to comment it out.

Daniel Bratell

Reporter

Comment 16

•

23 years ago

Please test sending OL:s in plain text mails to. If that looks ok, it's fine with me and r=bratell for the nsPlaintextSerializer part. Just make sure this matches the tests so we don't break tinderbox.

Johnny Stenback (:jst)

Updated

•

23 years ago

Attachment #70251 - Flags: superreview+

Johnny Stenback (:jst)

Comment 17

•

23 years ago

Comment on attachment 70251 [details] [diff] [review] Moving the logic from nsDocumentEncoder to nsHTMLContentSerializer... - In nsHTMLContentSerializer::IsFirstChildOfOL(): + if (state->isFirstListItem) + return PR_TRUE; + else + return PR_FALSE; + } + else + return PR_FALSE; two else-after-return's, doing it this way is much cleaner: + if (state->isFirstListItem) + return PR_TRUE; + + return PR_FALSE; + } + + return PR_FALSE; Other than that, sr=jst

Tanu Mutreja

Updated

•

23 years ago

Status: NEW → ASSIGNED

Tanu Mutreja

Updated

•

23 years ago

Whiteboard: [patch need r/sr=] → [patch need a=]

Tanu Mutreja

Comment 18

•

23 years ago

Attached patch Including jst's comments. — Details — Splinter Review

In nsHTMLContentSerializer::IsFirstChildOfOL(), replaced + if (state->isFirstListItem) + return PR_TRUE; + else + return PR_FALSE; + } + else + return PR_FALSE; by + if (state->isFirstListItem) + return PR_TRUE; + + return PR_FALSE; + } + + return PR_FALSE;

Tanu Mutreja

Updated

•

23 years ago

Attachment #69038 - Attachment is obsolete: true

Tanu Mutreja

Updated

•

23 years ago

Attachment #69222 - Attachment is obsolete: true

Tanu Mutreja

Updated

•

23 years ago

Attachment #70251 - Attachment is obsolete: true

Tanu Mutreja

Updated

•

23 years ago

Target Milestone: --- → mozilla1.0

Asa Dotzler [:asa]

Comment 19

•

23 years ago

Comment on attachment 71463 [details] [diff] [review] Including jst's comments. a=asa (on behalf of drivers) for checkin to 0.9.9 and the mozilla trunk. Also, moving reviews forward to the current patch. Let's try to keep the relevant reviews attached to the relevant patches. Thanks.

Attachment #71463 - Flags: superreview+

Attachment #71463 - Flags: review+

Attachment #71463 - Flags: approval+

scottputterman

Comment 20

•

23 years ago

Moving Netscape owned 0.9.9 and 1.0 bugs that don't have an nsbeta1, nsbeta1+, topembed, topembed+, Mozilla0.9.9+ or Mozilla1.0+ keyword. Please send any questions or feedback about this to adt@netscape.com. You can search for "Moving bugs not scheduled for a project" to quickly delete this bugmail.

Target Milestone: mozilla1.0 → mozilla1.2

Tanu Mutreja

Updated

•

23 years ago

Whiteboard: [patch need a=] → [has-approval]

Vinay Badami

Comment 21

•

23 years ago

Fixed with checkin D:\mozilla\content\base>cvs commit cvs commit: Examining . cvs commit: Examining public cvs commit: Examining src Checking in public/nsIContentSerializer.h; /cvsroot/mozilla/content/base/public/nsIContentSerializer.h,v <-- nsIContentSe rializer.h new revision: 1.10; previous revision: 1.9 done Checking in src/nsDocumentEncoder.cpp; /cvsroot/mozilla/content/base/src/nsDocumentEncoder.cpp,v <-- nsDocumentEncode r.cpp new revision: 1.63; previous revision: 1.62 done Checking in src/nsHTMLContentSerializer.cpp; /cvsroot/mozilla/content/base/src/nsHTMLContentSerializer.cpp,v <-- nsHTMLCont entSerializer.cpp new revision: 1.40; previous revision: 1.39 done Checking in src/nsHTMLContentSerializer.h; /cvsroot/mozilla/content/base/src/nsHTMLContentSerializer.h,v <-- nsHTMLConten tSerializer.h new revision: 1.14; previous revision: 1.13 done Checking in src/nsPlainTextSerializer.cpp; /cvsroot/mozilla/content/base/src/nsPlainTextSerializer.cpp,v <-- nsPlainTextS erializer.cpp new revision: 1.48; previous revision: 1.47 done Checking in src/nsPlainTextSerializer.h; /cvsroot/mozilla/content/base/src/nsPlainTextSerializer.h,v <-- nsPlainTextSer ializer.h new revision: 1.18; previous revision: 1.17 done Checking in src/nsXMLContentSerializer.cpp; /cvsroot/mozilla/content/base/src/nsXMLContentSerializer.cpp,v <-- nsXMLConten tSerializer.cpp new revision: 1.21; previous revision: 1.20 done Checking in src/nsXMLContentSerializer.h; /cvsroot/mozilla/content/base/src/nsXMLContentSerializer.h,v <-- nsXMLContentS erializer.h new revision: 1.10; previous revision: 1.9 done

Status: ASSIGNED → RESOLVED

Closed: 23 years ago

Resolution: --- → FIXED

Aleksey Nogin

Comment 22

•

23 years ago

This commit added a compiler warning content/base/src/nsHTMLContentSerializer.cpp:1147 `struct nsHTMLContentSerializer::olState * state' might be used uninitialized in this function Would not it be better to change it : --- nsHTMLContentSerializer.cpp 2002/03/12 08:21:51 1.40 +++ nsHTMLContentSerializer.cpp 2002/03/12 20:47:53 @@ -1144,13 +1144,13 @@ if (parentName.EqualsIgnoreCase("OL")) { olState defaultOLState(0, PR_FALSE); - olState* state; + olState* state=nsnull; if (mOLStateStack.Count() > 0) state = (olState*)mOLStateStack.ElementAt(mOLStateStack.Count()-1); /* Though we should never reach to a "state" as null at this point as all LI are supposed to be inside some OL and OL tag should have pushed a state to the mOLStateStack.*/ - if (!state || mOLStateStack.Count() == 0) + if (!state) state = &defaultOLState; if (state->isFirstListItem)

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Johnny Stenback (:jst)

Comment 23

•

23 years ago

This should've been filed as a new bug, but let's not worry about that now. sr=jst for the above patch.

Asa Dotzler [:asa]

Comment 24

•

23 years ago

a=asa (on behalf of drivers) for checkin to the 1.0 trunk of the patch in comment 22

Vinay Badami

Comment 25

•

23 years ago

Fix checked in for ayn2@cornell.edu D:\mozilla\content\base\src>cvs commit cvs commit: Examining . Checking in nsHTMLContentSerializer.cpp; /cvsroot/mozilla/content/base/src/nsHTMLContentSerializer.cpp,v <-- nsHTMLContentSerializer.cpp new revision: 1.42; previous revision: 1.41 done

Status: REOPENED → RESOLVED

Closed: 23 years ago → 23 years ago

Resolution: --- → FIXED

Joe Francis

Comment 26

•

23 years ago

I had to back out part of the patch. reopening

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Joe Francis

Comment 27

•

23 years ago

clearing milestone

Target Milestone: mozilla1.2alpha → ---

Chris Casciano

Comment 28

•

23 years ago

Does bug 136595 fit into this discussion at all, or is the list-style-type/CSS stuff another layer on top of it all which should be dealt with separately?

Tanu Mutreja

Comment 29

•

23 years ago

Backed part of the patch: @@ -1300,7 +1303,9 @@ tag.get() == nsHTMLAtoms::h3 || tag.get() == nsHTMLAtoms::h4 || tag.get() == nsHTMLAtoms::h5 || - tag.get() == nsHTMLAtoms::h6) { + tag.get() == nsHTMLAtoms::h6 || + tag.get() == nsHTMLAtoms::ol || + tag.get() == nsHTMLAtoms::li) { return PR_TRUE; } Though the remaining patch has not much meaning without this, we did not back the whole patch out to make sure that the problem to be fixed in future is just a smart addition of OL, LI, UL in nsDocumentEncoder::IncludeInContext(). I'm opening a new bug for that and marking it dependent on that.

Tanu Mutreja

Updated

•

23 years ago

Depends on: 140541

Whiteboard: [has-approval] → [#140541 will fix this]

Jim Crumley

Comment 30

•

23 years ago

*** Bug 149956 has been marked as a duplicate of this bug. ***

Chris Casciano

Comment 31

•

22 years ago

Any update on this? Just running through some old bugs i'm cc'd on and see no movement in almost 7 months

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

•

19 years ago

Depends on: 316775

Phil Ringnalda (:philor)

Updated

•

16 years ago

QA Contact: sujay → dom-to-text

BugBot [:suhaib / :marco/ :calixte]

Comment 32

•

3 years ago

The bug assignee didn't login in Bugzilla in the last 7 months, so the assignee is being reset.

Assignee: t_mutreja → nobody

BMO Automation

Updated

•

2 years ago

Severity: minor → S4

Patch to fix this. 23 years ago Tanu Mutreja 12.63 KB, patch		Details \| Diff \| Splinter Review
Incorporating the suggestions... 23 years ago Tanu Mutreja 9.78 KB, patch	akkzilla : review+	Details \| Diff \| Splinter Review
Moving the logic from nsDocumentEncoder to nsHTMLContentSerializer... 23 years ago Tanu Mutreja 17.01 KB, patch	jst : superreview+	Details \| Diff \| Splinter Review
Including jst's comments. 23 years ago Tanu Mutreja 17.00 KB, patch	asa : review+ asa : superreview+ asa : approval+	Details \| Diff \| Splinter Review