Closed Bug 282317 Opened 21 years ago Closed 21 years ago

IHTMLElement::innerText property incorrectly implemented - returns a value including HTML tags

Categories

(Core Graveyard :: Embedding: ActiveX Wrapper, defect)

x86
Windows XP
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.8beta2

People

(Reporter: rlking, Assigned: bzbarsky)

Details

Attachments

(1 file)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.7.5) Gecko/20041110 Firefox/1.0
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.7.5) Gecko/20041110 Firefox/1.0

The IHTMLElement::innerText property is intended to return the text associated with an HTML element, including the text of any embedded HTML tags, but not the tags themselves. This distinguishes it from the IHTMLElement::innerHTML property, which includes the full HTML within the element.

As an example, consider the following HTML snippet:

    <P>The big brown <B>fox</B> jumps over the lazy <I>dog</I>.</P>

If an IHTMLElement interface is obtained for this <P> element, then the following values should be returned by the innerHTML and innerText properties:

    innerHTML: The big brown <B>fox</B> jumps over the lazy <I>dog</I>.
    innerText: The big brown fox jumps over the lazy dog.

However, Version 1.7.1 of the Mozilla ActiveX Control returns the same value for both properties, this value being that expected for innerHTML. In other words, innerText incorrectly returns HTML tags along with the text.

Reproducible: Always

Steps to Reproduce:
1. Create a new VB6 standard exe project.
2. Place the Mozilla ActiveX control on the form.
3. Add a command button, and insert the following code:

    Private Sub Command1_Click()
        Dim doc As IHTMLDocument2
        Dim element As IHTMLElement
        MozillaBrowser1.navigate "http://www.mozilla.org" ' any URL will do
        Do Until MozillaBrowser1.readyState = READYSTATE_COMPLETE: DoEvents: Loop
        Set doc = MozillaBrowser1.document
        Set element = doc.All(0) 'element 0 is the <HTML> element
        MsgBox element.innerText
    End Sub

4. Run the project and click the button.

Actual Results:
The message box displays the complete HTML inside the <HTML> tag, instead of merely the text. (Note: in the case of the URL given in the code sample, the HTML displayed in the message box is truncated due to limitations on the size of text a message box can display.)

Expected Results:
No HTML tags should be included in the displayed message box text.
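To make the expected difference concrete, here is a minimal C++ sketch over a toy node tree. It is not the ActiveX control's code: `Node`, `text`, `elem`, `outerHTML`, `innerHTML`, `innerText` and `samplePara` are all hypothetical names invented for illustration, and the text extraction is a plain concatenation of descendant text, which ignores any layout-dependent behaviour a real browser might apply.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy DOM node (hypothetical): a text node has an empty tag,
// an element has a tag name and child nodes.
struct Node {
    std::string tag;             // empty for text nodes
    std::string text;            // used only by text nodes
    std::vector<Node> children;  // used only by elements
};

Node text(const std::string& s) { return Node{"", s, {}}; }

Node elem(const std::string& tag, std::vector<Node> kids) {
    return Node{tag, "", std::move(kids)};
}

// Serialize a node with its own tags.
std::string outerHTML(const Node& n) {
    if (n.tag.empty()) return n.text;
    std::string out = "<" + n.tag + ">";
    for (const Node& c : n.children) out += outerHTML(c);
    return out + "</" + n.tag + ">";
}

// Markup of the children only: tags of descendants are kept.
std::string innerHTML(const Node& n) {
    std::string out;
    for (const Node& c : n.children) out += outerHTML(c);
    return out;
}

// Text of all descendants, with every tag dropped.
std::string innerText(const Node& n) {
    if (n.tag.empty()) return n.text;
    std::string out;
    for (const Node& c : n.children) out += innerText(c);
    return out;
}

// The <P> element from the report above.
Node samplePara() {
    return elem("P", {text("The big brown "), elem("B", {text("fox")}),
                      text(" jumps over the lazy "), elem("I", {text("dog")}),
                      text(".")});
}
```

Calling `innerHTML(samplePara())` and `innerText(samplePara())` in this sketch yields the two strings listed in the report; the bug is that the control returned the first string for both properties.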
Mozilla has supported the DOM 3 Core textContent attribute since Mozilla rv 1.5.
http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#Node3-textContent

innerText is an MSIE proprietary extension, while textContent is part of an official W3C TR.
(In reply to comment #1)
> Mozilla has been supporting DOM 3 Core textContent attribute since Mozilla rv 1.5.
> http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/core.html#Node3-textContent
>
> innerText is a MSIE proprietary extension while textContent is part of an
> official W3C TR.

Thanks for the info. However, I would suggest that if innerText is to do anything at all, it should do the same as Microsoft intended. If the textContent property is implemented, presumably it should not be a difficult task to make innerText return the same value as textContent?

Furthermore, it's not immediately obvious to me how to access textContent in an automation context. Any hints?
> However, I would suggest that if innerText is to do anything at all, it should
> do the same as Microsoft intended.

If textContent is an attribute in official W3C TR, why then can't we use it for MSIE? If innerText is already implemented, it should be very easy to map textContent to innerText.

"Microsoft is committed to working with the World Wide Web Consortium (W3C) to implement W3C-approved HTML standards, and has confirmed its pledge to work through W3C and other standards bodies on enhancements to HTML and other key Web technologies."
Microsoft on Web standards, 2001

FYI, MSIE 5+ and all modern versions of browsers support perfectly, flawlessly DOM 2 CharacterData attributes and methods too. See for yourself with Safari 1.x, Opera 7+, MSIE 5+, any Gecko-based browser:
http://www.gtalbot.org/HTMLJavascriptCSS/DOM2CharacterData.html

So there is no need anymore to use/rely on innerText or to create code forks or to create multiple code for multiple browsers.

> If the textContent property is implemented, presumably it should not be a
> difficult task to make innerText return the same value as textContent?

It's the other way around. If a proprietary extension is already implemented and doing the same as an attribute of an official W3C TR, then it should be very easy to map such attribute to the proprietary stuff. I do not say this to annoy you because I said exactly that to opera dev. team and even opened a bugfile (bug 155598 at Opera's BTS) on implementing DOM 3 textContent a few months ago.
http://www.gtalbot.org/BrowserBugsSection/Opera7Bugs/Opera7Bugs.html

Resolving as DUPLICATE of bug 264412

*** This bug has been marked as a duplicate of 264412 ***
Status: UNCONFIRMED → RESOLVED
Closed: 21 years ago
Resolution: --- → DUPLICATE
(In reply to comment #3)

Gerard, I'm really not satisfied with your response. The subject of this bug is the Mozilla ActiveX control, not the browser itself. This is about how one can manipulate the DOM from an external program, not from within the script in a webpage.

As far as I'm aware, the only interfaces supported by the Mozilla ActiveX control are those defined by Microsoft for Internet Explorer 4, i.e. IHTMLDocument2, IHTMLElement, etc. The website for the control specifically points to the Microsoft documentation for these interfaces, and there is no other documentation provided. [Actually I've just noticed that the website has been updated within the last couple of days, and there is no longer a direct link to the Microsoft documentation: however it is absolutely clear that the Microsoft IE4 interfaces are supported, albeit partially, and the intention is for the Mozilla ActiveX control to be a direct replacement for the Microsoft WebBrowser control.]

Given that the ActiveX control supports the Microsoft interfaces, even if only a subset, it seems common sense to me that any method or property that is implemented should work according to the Microsoft specification, especially when there is no other way to achieve the purpose of the method or property.

So I'm NOT suggesting that innerText should be available to script operating within a webpage, but that specifically the implementation of the IHTMLElement::innerText property should conform to its specification.

It may be that there is some other way to achieve the effect of this property, but it's not obvious what it might be. If there is a way, please spell it out for me.
Status: RESOLVED → UNCONFIRMED
Resolution: DUPLICATE → ---
> I'm really not satisfied with your response.

I replied to your comments within the context of Bugzilla. Bugzilla is not a help/support discussion forum but a place to file, confirm, investigate and fix bugs happening in Mozilla software. All done by volunteers.

> Given that the ActiveX control supports the Microsoft interfaces, even if only a
> subset,

You say only a subset. Maybe you should send your requests, demands to the ActiveX control developer then.

> it seems common sense to me that any method or property that is
> implemented should work according to the Microsoft specification, especially
> when there is no other way to achieve the purpose of the method or property.

"especially when there is no other way to achieve the purpose of the method or property": I very strongly doubt that.

> So I'm NOT suggesting that innerText should be available to script operating
> within a webpage,

That is not how I understood your "I would suggest that if innerText is to do anything at all, it should do the same as Microsoft intended."

> but that specifically the implementation of the
> IHTMLElement::innerText property should conform to its specification.

What if someone else (module owner, reviewer, ...) tells you it cannot conform to such an external proprietary specification?
Gérard Talbot, you are being an ass. Did you notice the product and component for this bug? This IS the place to, as you put it, "send your requests, demands to the ActiveX control developer". Given that you clearly had no idea what this bug was talking about, why did you comment on it at all?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Attached patch: Patch
Adam, this just implements get/setInnerText in terms of nsIDOM3Node.textContent. I can't actually compile/test it, but it's straightforward enough that I think the changes should be fine. Thoughts?
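The approach described above, a property getter and setter that simply forward to a textContent-style API, can be sketched roughly as follows. This is a toy model and not the attached patch: `ToyNode`, `ToyElementWrapper`, `makeSample` and the bool return values are hypothetical stand-ins, and the real control works with COM interfaces, BSTRs and nsIDOM3Node rather than `std::string`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM node with a textContent-style API.
struct ToyNode {
    bool isText = false;
    std::string data;               // text node payload
    std::vector<ToyNode> children;  // element children

    // Concatenated text of this node and all descendants.
    std::string getTextContent() const {
        if (isText) return data;
        std::string out;
        for (const ToyNode& c : children) out += c.getTextContent();
        return out;
    }

    // On an element: drop all children and, if the string is non-empty,
    // add a single text node. On a text node: replace its data.
    void setTextContent(const std::string& s) {
        if (isText) { data = s; return; }
        children.clear();
        if (!s.empty()) {
            ToyNode t;
            t.isText = true;
            t.data = s;
            children.push_back(t);
        }
    }
};

// Hypothetical wrapper playing the role of the element object:
// innerText is implemented purely in terms of textContent.
class ToyElementWrapper {
public:
    explicit ToyElementWrapper(ToyNode* node) : node_(node) {}

    bool get_innerText(std::string* out) const {
        if (!node_ || !out) return false;
        *out = node_->getTextContent();
        return true;
    }

    bool put_innerText(const std::string& value) {
        if (!node_) return false;
        node_->setTextContent(value);
        return true;
    }

private:
    ToyNode* node_;  // not owned
};

// Builds <P>The big brown <B>fox</B></P> as a toy tree.
ToyNode makeSample() {
    ToyNode lead;  lead.isText = true;  lead.data = "The big brown ";
    ToyNode fox;   fox.isText = true;   fox.data = "fox";
    ToyNode bold;  bold.children.push_back(fox);
    ToyNode para;  para.children.push_back(lead);
    para.children.push_back(bold);
    return para;
}
```

The point of the design is that the wrapper holds no text logic of its own, so innerText cannot drift away from what textContent returns.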
Attachment #175374 - Flags: review?(adamlock)
> Did you notice the product and component
> for this bug? This IS the place to, as you put it, "send your requests, demands
> to the ActiveX control developer".

Richard, I may have wrongly understood the whole thing here. I did not try to intentionally annoy you or frustrate you. If I have upset you, then please accept my sincere apologies. I have tried to seek help regarding this bug and I have failed. Errors of this sort can happen and will happen in any large organization. I thought that MS-VB6 was the original developer of the ActiveX control.

On another area, I have written a personal email to Mr Boris Zbarsky asking for some clarifications on some other issues.
Removing myself from the CC list
(In reply to comment #8)
> Richard, I may have wrongly understood the whole thing here. I did not try to
> intentionally annoy you or frustrate you. If I have upset you, then please
> accept my sincere apologies.

Gerard, no problem. I wasn't offended or upset. I've been developing software for long enough (30 years) to know that misunderstandings are all too easy, and a careful debate is far more productive than getting emotional.

Since you've removed yourself from the Cc list, I suppose you won't see this response, but it's here for the record.
Hi Boris,

I tested your patch. All OK. Maybe you could use an nsDependentString instead of the nsAutoString in the put_ method. I haven't had any news from Adam for some time, so you can ask me for review, if needed.

Alex
I just copied the innerHTML put method... To use an nsDependentString I'd have to sort out the scoping and lifetime issues with OLE2W. If I don't hear back from Adam in the near future, I'll ask you for review.
Comment on attachment 175374 (Patch)
r=adamlock
Attachment #175374 - Flags: review?(adamlock) → review+
Assignee: adamlock → bzbarsky
Fixed.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.8beta2
Product: Core → Core Graveyard