Closed Bug 967493 Opened 10 years ago Closed 8 years ago

DevTools Inspector should display HTML Entities not evaluate them

Categories

(DevTools :: Inspector, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: codacodercodacoder, Assigned: jyeh)

References

(Blocks 1 open bug)

Details

(Whiteboard: [btpp-fix-later][good taipei bug])

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:26.0) Gecko/20100101 Firefox/26.0 (Beta/Release)
Build ID: 20131205075310

Steps to reproduce:

Used ≡ in the HTML.


Actual results:

Inspector evaluates the entity and displays it (like a browser would).


Expected results:

Using ≡ in the HTML should be echoed directly in the Inspector.
Component: Untriaged → Developer Tools: Inspector
Version: 26 Branch → 29 Branch
Status: UNCONFIRMED → NEW
Ever confirmed: true
Version: 29 Branch → Trunk
Attached image scrreenshot.png
Here's a screenshot showing the problem.
OS: Windows 7 → All
Hardware: x86_64 → All
See Also: → 1256756
Triaging (filter on CLIMBING SHOES).
Priority: -- → P3
Whiteboard: [btpp-backlog]
I have been reminded of this bug in this twitter thread:
https://twitter.com/kevinSuttle/status/725042908823478272
The person made a test page to demonstrate the issue:
http://codepen.io/kevinSuttle/pen/ZWMqrw

Basically, the fact that the inspector evaluated 
<input readonly value="I can't be edited">

to
<input readonly value="I can't be edited">

made it really hard for the user to understand what was happening in the inspector. The text node looked like an actual node because of the angle brackets but appeared greyed-out.

Bumping to P2.
Priority: P3 → P2
Whiteboard: [btpp-backlog] → [btpp-fix-later]
Whiteboard: [btpp-fix-later] → [btpp-fix-later][good taipei bug]
Hi Patrick,

I am interesting in this bug, so I take it and try to figure out how to work on it.
Will you be the mentor of this bug?

Thanks :)
Assignee: nobody → jyeh
(In reply to Joseph Yeh [:joseph] from comment #4)
> I am interesting in this bug, so I take it and try to figure out how to work
> on it.
> Will you be the mentor of this bug?
Thanks for your interest. Yes I can help you with this bug.
Make sure you go through our getting started guide first: https://developer.mozilla.org/en-US/docs/Tools/Contributing
In terms of code, you will be mostly interested in 2 things:
- the back-end (where the DOM nodes and text content lives), the NodeActor is the class that represents a given DOM node and sends its textcontent to the inspector: https://dxr.mozilla.org/mozilla-central/source/devtools/server/actors/inspector.js#215
- the front-end (the toolbox itself, where the inspector panel lives), the MarkupView is the panel that displays nodes in the inspector, using information that the NodeActor sent: https://dxr.mozilla.org/mozilla-central/source/devtools/client/inspector/markup/markup.js#2259-2270
Flags: needinfo?(pbrosset)
Hi Patrick,

Sorry, I was busy fixing other bugs so it took a while for this patch. 

I trace the code a little bit and found a way to replace the HTML Entities when showing in the TextEditor and also setting the node value in NodeActor. It works but I think this is probably not the best solution. Can you give me some feedback from this patch?

BTW, do you prefer using MozReview for reviewing?

Thanks!
Attachment #8753700 - Flags: feedback?(pbrosset)
Comment on attachment 8753700 [details] [diff] [review]
0001-Bug-967493-DevTools-Inspector-should-display-HTML-Entities-not-evaluate-them.patch

Review of attachment 8753700 [details] [diff] [review]:
-----------------------------------------------------------------

A patch on bugzilla or a review on mozreview are both fine. I don't care. Just choose the one you prefer.

This is not the right approach unfortunately. Here you are replacing all < and > with their corresponding HTML entities. But the goal here is to show all HTML entities the user has entered in the HTML source and show them as-authored in the inspector panel.
There are more than just &lt; and &gt; entities, and we don't want to replace all, just show all the text content as it was authored.

Having said this, I'm wondering if this is at all possible. I've been playing with this a bit, and couldn't find a way to get the authored text out of the walker. Indeed, HTML entities seem to be evaluated when parsing HTML.
This would explain why other browsers don't show them either in their devtools.

Am I missing something or should we just close this bug as WONTFIX?
I don't see a way of getting the authored HTML source (apart from downloading it again, as we do for stylesheets and scripts in the style-editor and debugger), and then comparing it to the rendered DOM to find out actual source texts.
Attachment #8753700 - Flags: feedback?(pbrosset) → feedback-
Tom, I need someone to tell me if I'm wrong in comment 7.
Flags: needinfo?(ttromey)
(In reply to Patrick Brosset <:pbro> from comment #7)
>
> Having said this, I'm wondering if this is at all possible. I've been
> playing with this a bit, and couldn't find a way to get the authored text
> out of the walker. Indeed, HTML entities seem to be evaluated when parsing
> HTML.
> This would explain why other browsers don't show them either in their
> devtools.
> 
> Am I missing something or should we just close this bug as WONTFIX?
> I don't see a way of getting the authored HTML source (apart from
> downloading it again, as we do for stylesheets and scripts in the
> style-editor and debugger), and then comparing it to the rendered DOM to
> find out actual source texts.

Well, thanks for being honest, Patrick.  This outcome has been my expectation all along. If you use the inspector's HTML editor to create a new node, add &equiv;&equiv;&equiv;&equiv; you see exactly the same issue: after DOM insertion/update, the original is gone.

Some of the wording in the docs should change - this tool doesn't actually "inspect the HTML of the page", it inspects the DOM using a different layout to the (new) DOM inspector.

Some genius will hopefully figure out a way of treating the original HTML source like source in sourcemaps processing.
(In reply to Patrick Brosset <:pbro> from comment #8)
> Tom, I need someone to tell me if I'm wrong in comment 7.

That's my belief as well, though I don't know it with 100% certainty.
Flags: needinfo?(ttromey)
No longer blocks: top-inspector-bugs
Hi Patrick, 

So should we close this bug as WONTFIX? I am not sure if I need to work on it.

Thanks!
Flags: needinfo?(pbrosset)
You're right, sorry I forgot to do it.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(pbrosset)
Resolution: --- → WONTFIX
Crying over spilt (and long turned sour) milk here but...

This bug (as originally reported, not the meta-mess it turned into) just bit me again. I had a div, zero padding. It contained a textarea width 100%, zero margins. Below the textarea, inside the div, was a space some 30px or so high. 

ME: Where's that space coming from???

Box Model panel reported the correct values for the css as described.

Rules panel reported the correct values.

Same for Computed panel.

Inspector showed the (I thought) correct HTML.

What the hell...?

15-20 minutes later, I see a stray &emsp; after the textarea - which, recall, has 100% width so the &emsp; moves to the next "line" creating the offending space.

<rant>
You can add all the bells and whistles you like, but until you fix the *basic* stuff, the Inspector and the rest of the failures (see list above) cannot do what the docs claim they can do.
</rant>
May I suggest to implement this like Firebug does? As explained in bug 839818 comment 5 Firebug allows to switch the display of all entity characters independently of whether they were authored as entities or not.

This is surely less good than only displaying authored entities as such, but would solve Codacoder's use case.

Sebastian
Blocks: firebug-gaps
Flags: needinfo?(pbrosset)
See Also: → 839818
(In reply to Sebastian Zartner [:sebo] from comment #14)
> May I suggest to implement this like Firebug does? As explained in bug
> 839818 comment 5 Firebug allows to switch the display of all entity
> characters independently of whether they were authored as entities or not.
> 
> This is surely less good than only displaying authored entities as such, but
> would solve Codacoder's use case.
> 
> Sebastian
Let me file a separate bug for this. This sounds interesting, and Firebug has it. I don't want to reuse this bug here though, let's start from a clean base: bug 1277828
Flags: needinfo?(pbrosset)
> Having said this, I'm wondering if this is at all possible. I've been playing with this a bit, and
> couldn't find a way to get the authored text out of the walker. Indeed, HTML entities seem to be 
> evaluated when parsing HTML.
> This would explain why other browsers don't show them either in their devtools.

Just wondering, can't this be fixed upstream (in the walker)? Is there a bug reported upstream, or to a firefox subteam? 

I've just been hit by this bug myself, and what would be obviously fixed in seconds (here a line break issue due to &nbsp;) caused many hours due to not seeing the obvious. Manually switching the view like proposed in Bug 1277828 is not an alternative because for this I would already somehow have a clue about what's causing the issue. And then I could just view "View Source"...
Flags: needinfo?(pbrosset)
(In reply to Jens from comment #16)
> I've just been hit by this bug myself, and what would be obviously fixed in
> seconds (here a line break issue due to &nbsp;) caused many hours due to not
> seeing the obvious.

this.

> Manually switching the view like proposed in Bug 1277828
> is not an alternative because for this I would already somehow have a clue
> about what's causing the issue. And then I could just view "View Source"...

Precisely. And if you were not the author of the original HTML... well, the possible causes are endless.

I can understand, obviously, why the Inspector needs to "police" and "sanitize" the (potentially broken) HTML sent down the wire before displaying it and, as a result, it will always be an "interpretation". But to propose yet another "layer" - an interpretation of the interpretation - is not "engineering", it's guesswork at best, misleading at worst.

We need someone with the knowledge to truly fix it - properly - to actually suffer this issue in the way you and I have. Then, maybe, it'll get fixed.
(In reply to Codacoder from comment #13)
> This bug (as originally reported, not the meta-mess it turned into) just bit
> me again. I had a div, zero padding. It contained a textarea width 100%,
> zero margins. Below the textarea, inside the div, was a space some 30px or
> so high. 
> 
> ME: Where's that space coming from???
> 
> Box Model panel reported the correct values for the css as described.
> 
> Rules panel reported the correct values.
> 
> Same for Computed panel.
> 
> Inspector showed the (I thought) correct HTML.
> 
> What the hell...?
> 
> 15-20 minutes later, I see a stray &emsp; after the textarea - which,
> recall, has 100% width so the &emsp; moves to the next "line" creating the
> offending space.
This might be a separate issue.
The inspector actually ignores text nodes that only contain white spaces. So the text node generated by the &emsp; text in the source doesn't even get displayed in the inspector. So even if we managed to show entities as authored as comment 0 proposes, the text node here wouldn't be visible.

"Empty" text nodes are a common source of layout problems, as you explained here. I have started to look at the possibility to toggle empty text nodes in the inspector, and even be able to highlight them in the page.
See this screenshot for instance: https://dl.dropboxusercontent.com/u/714210/empty-text-node-visible.png
Here the text node generated by &emsp; is shown in the inspector, and hovering over it highlights it in the page. So at least you know something is there.

Now to actually fix what comment 0 describes, we'd need to do what is done in the CSS Rules panel. There, the CSS object model is used to know the list of rules and properties, but the actual stylesheet text content is consulted to extract the authored CSS code for these properties. This is possible because we have line:col location information for each property.

The walker API we use to navigate the DOM tree doesn't give us this information as far as I know. If it did, that would be one way to get the authored text content for the DOM nodes.
Of course this would only work for the markup that comes in the HTTP response. Anything that is inserted later in the DOM with Javascript wouldn't work in the same fashion.

I'll file a bug on the platform side to investigate this further.
Flags: needinfo?(pbrosset)
See Also: → 1296313
(In reply to Patrick Brosset <:pbro> from comment #18)

> > 15-20 minutes later, I see a stray &emsp; after the textarea - which,
> > recall, has 100% width so the &emsp; moves to the next "line" creating the
> > offending space.
> This might be a separate issue.

Thanks Patrick.

Yes, I think perhaps you're right. Had it been &something-visible; I'd have found it sooner/immediately.

> The inspector actually ignores text nodes that only contain white spaces. So
> the text node generated by the &emsp; text in the source doesn't even get
> displayed in the inspector. So even if we managed to show entities as
> authored as comment 0 proposes, the text node here wouldn't be visible.

Perhaps display hard-coded "whitespace-generating-entities" as black rectangles?  Just an idea... if possible.

> "Empty" text nodes are a common source of layout problems, as you explained
> here. I have started to look at the possibility to toggle empty text nodes
> in the inspector, and even be able to highlight them in the page.
> See this screenshot for instance:
> https://dl.dropboxusercontent.com/u/714210/empty-text-node-visible.png
> Here the text node generated by &emsp; is shown in the inspector, and
> hovering over it highlights it in the page. So at least you know something
> is there.

Agreed. That would be better (remember my "nightmare" scenario - I may *not* be the original author so sometimes it's left to my imaginative powers to sift through the (seemingly endless) possibilities that could be causing the issue).
 
> Now to actually fix what comment 0 describes, we'd need to do what is done
> in the CSS Rules panel. There, the CSS object model is used to know the list
> of rules and properties, but the actual stylesheet text content is consulted
> to extract the authored CSS code for these properties. This is possible
> because we have line:col location information for each property.

Right. And thanks for the explanation. I guess, ideally, I'd like to "x-ray" the page and see "in negative form" what's there. Hey, perhaps that's a new tool ;) (Knowing Moz, you've probably used that name elsewhere already - how about MRI? ;))
 
> The walker API we use to navigate the DOM tree doesn't give us this
> information as far as I know. If it did, that would be one way to get the
> authored text content for the DOM nodes.

I think I see your future ;)

> Of course this would only work for the markup that comes in the HTTP
> response.

As you say, "of course".

> I'll file a bug on the platform side to investigate this further.

Link here please so I can follow along?

Thanks again.
(In reply to Codacoder from comment #19)
> > I'll file a bug on the platform side to investigate this further.
> 
> Link here please so I can follow along?
Bug 1296313
(In reply to Patrick Brosset <:pbro> from comment #18)
> This might be a separate issue.
> The inspector actually ignores text nodes that only contain white spaces. So
> the text node generated by the &emsp; text in the source doesn't even get
> displayed in the inspector. So even if we managed to show entities as
> authored as comment 0 proposes, the text node here wouldn't be visible.
> 
> "Empty" text nodes are a common source of layout problems, as you explained
> here. I have started to look at the possibility to toggle empty text nodes
> in the inspector, and even be able to highlight them in the page.
> See this screenshot for instance:
> https://dl.dropboxusercontent.com/u/714210/empty-text-node-visible.png
> Here the text node generated by &emsp; is shown in the inspector, and
> hovering over it highlights it in the page. So at least you know something
> is there.

Thanks for the in-depth explaination, Patrick! (I'm happy to spent you a beer when you're in Berlin at ViewSource ;) Do you have a bug number I could follow here? I guess it's something else than Bug 1296313
(In reply to Jens from comment #21)
> Do you have a bug number I could
> follow here? I guess it's something else than Bug 1296313
I didn't file a bug yet for helping debug issues related to empty text nodes. The screenshot I shared was just of a hack I did locally, but I haven't yet thought this through enough to file a bug for it.
Update: Chrome converts *some* characters to their entities. See:
https://cs.chromium.org/chromium/src/third_party/WebKit/Source/devtools/front_end/elements/ElementsTreeOutline.js?q=mappedcharToEntity&sq=package:chromium&l=114&dr=C
So things like &nbsp; &ensp; &emsp; etc. are visible in the inspector
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: