Bugzilla

Comment 3

•

5 years ago

Looking into it:

it looks like jsbin write's this HTML to be evaluated:

"<!DOCTYPE html>
<script>(function(){window.__blocked={methods:[\"open\",\"print\",\"alert\",\"prompt\",\"confirm\"],old:{}};for(var m in __blocked.methods){try {__blocked.old[m]=window[m];window[m]=function(){};}catch(e){}}})()</script>

<html>
<head>
  <meta charset=\"utf-8\">
  <meta name=\"viewport\" content=\"width=device-width\">
  <title>JS Bin</title>
<style id=\"jsbin-css\">

</style>
</head>
<body>

<script>try {window.runnerWindow.proxyConsole.log('Before breakpoint');
debugger;
window.runnerWindow.proxyConsole.log('After breakpoint');
} catch (error) { throw error; }

//# sourceURL=null.js
</script>
</body>
</html>
<!--jsbin live harness--><script>(function(){for(var m in __blocked.methods){try{window[m]=__blocked.old[m];delete __blocked;}catch(e){}};})()</script>"

https://dbg-iframe-inline-source.glitch.me

Comment 4

•

5 years ago

Thanks for the report. Here's a smaller STR, which highlights how the iframe might be created and with the inline script.

Comment 5

•

5 years ago

I think the issue comes from the fact that the logic in TabSources assumes that if a source was introduced by JS, then it must be JS, which in this case is misleading because the source was HTML for the iframe.

Comment 6

•

5 years ago

Brian, would you be able to take a look at this? It seems like there might be a case that we're missing with dynamically created iframes.

Assignee: nobody → bhackett1024

Updated

•

5 years ago

Flags: needinfo?(bhackett1024)

Assignee

Comment 7

•

5 years ago

I think there are a couple options here, though neither is straightforward.

Show script-created HTML in the same way that we do normal HTML that was streamed over the network. This requires extending the logic that just landed in bug 1538056 to supply HTML content to the devtools in the script-created HTML case. That isn't very hard, and I have a patch for it. Where I ran into trouble is what to do when a page pauses before the script-created HTML has finished. <script> tags in script-created HTML run as soon as possible after the document.write() call happens, and if they pause in the debugger before a document.close() call happens, then what file contents do we show to the user when pausing? In the streaming case, we can just wait for the stream to finish and then show the complete HTML, but in the script-created case the HTML won't finish until after the target unpauses and it calls document.close() (or maybe it never does!). So, handling this correctly requires showing a partial file to the user when the pause occurs, and after unpausing the file contents need to be updated as new HTML appears. In principle that can be done, but it's also kind of opening up a can of worms to deal with a rather obscure case.
A simpler option is to continue showing the inline source as we are now, but recognize that the source has a start line other than 1 and bias the line numbers we use when communicating with the client so that it correctly renders the pause location, breakpoint positions, and so forth. This shouldn't be too hard to do by extending the hacky parts for inline script column positions that also just landed in bug 1538056. This approach would be sort of cementing those hacks in place, however.

I'm inclined to continue pushing forward on the first option, unless I hear otherwise. To do this we'll need to be able to associate sources with the UUID of the parser they originated from --- with script-created HTML we can't use the URL as a key for determining the full HTML associated with an inline script. This would allow the server to distinguish between sources from different parses of content at the same URL, which is an issue that was brought up in the review of bug 1538056, though I'm still not sure how much of a problem it actually is.

Flags: needinfo?(bhackett1024)

Comment 8

•

5 years ago

Pulling in Jim + Logan + Harald.

From a user perspective, I would prefer showing the inline script in the editor, but I am not sure what is realistic and i am not inclined to cement hacks. Bug 1538056 landed today and this is not how I was hoping to celebrate 3 months of work :)

e.g. https://dbg-iframe-inline-source.glitch.me

try {
    //window.runnerWindow.proxyConsole.log('yo')
debugger
} catch (error) { throw error; }

Flags: needinfo?(lsmyth)

Flags: needinfo?(jimb)

Flags: needinfo?(hkirschner)

Comment 9

•

5 years ago

My concern with the first approach is that it means the code the user sees in the debugger editor is not the code that the user seems on their filesystem/on the network. Personally, I'd find that surprising. It also means that if we wanted to one day allow editing/saving of files via the devtools editor (like Chrome's workspaces), we'd have immediate problems because users would be editing the version with the document.write output in it.

Another downside that comes to mind would be around error messages. If one of these scripts throws an error, is the line number of that error based on the original HTML file, or the HTML file with the injected HTML? It seems like the user would expect that to map back to their editor, but we couldn't realistically do that if we render with the injected HTML.

To me, the best options seem like:

Treat the scripts as standalone files. I think this is what the second bullet point above is also mentioning? We could also consider naming them based on position, e.g. `foo.html (inline script #1) for each script tag so they show up in the tree, if that's a concern, but that could be for the future.
Treat document.write fragments as their own HTML source. This could be of nice and allow our existing infrastructure for line numbers and such to work by avoiding needing to rewrite the lines? I don't know if the injected inline scripts are getting line numbers based on the overall HTML file, or just the injected HTML snippet. Similiar to the JS case, I could imagine the file tree having foo.html and foo.html (fragment #1) or something in the long run too.

Flags: needinfo?(lsmyth)

Assignee

Comment 10

•

5 years ago

The injected HTML is handled downstream by Gecko in all the same ways as HTML that is streamed over the network, it's just that it comes in from document.write() calls instead of network packets. Inline scripts in the injected HTML will have line numbers relative to the start of the injected HTML, and so will errors reported via those scripts. That does make it harder to go with the second option in comment 7, as we would want to adjust line numbers for errors, stacks, etc. to match what the client is showing. (This concern also applies to the adjustment which bug 1538056 does, but that only changes column numbers so it isn't as problematic.)

An injected document's content can't be mixed together with that from a streamed document: calling document.open() wipes out everything in the document, streamed or otherwise. That point motivated the first option in comment 7; injected HTML documents are just a different kind of HTML document, so why not treat them in the same way. The HTML doesn't map to anything on disk or that was delivered by the network, but that same concern applies to eval'ed scripts. Using document.write() is essentially creating an eval'ed HTML document.

I don't think treating each document.write() call as a separate source in the client would work. For one, the line numbers will still need to be adjusted since the script/error lines are relative to the point the document was originally written. Additionally, document.write() just supplies text data for the parser, so an inline script or other HTML could be split between different write() calls and we could end up with a source actor being associated with multiple client sources, which doesn't really make sense.

Comment 11

•

5 years ago

An injected document's content can't be mixed together with that from a streamed document: calling document.open() wipes out everything in the document, streamed or otherwise. That point motivated the first option in comment 7; injected HTML documents are just a different kind of HTML document, so why not treat them in the same way. The HTML doesn't map to anything on disk or that was delivered by the network, but that same concern applies to eval'ed scripts. Using document.write() is essentially creating an eval'ed HTML document.

I think I'm not following something here. I thought mixing the content together so that the document.write results show up in the final result source text what what your option #1 is proposing. I think I'm just not following what your option #1 would actually show the user? Would you be able to elaborate on what the experience would be for users? What do they see in the sources panel and editor when the pause in a document.write-injected inline script?

Additionally, document.write() just supplies text data for the parser, so an inline script or other HTML could be split between different write() calls

Great point, I missed that.

Assignee

Comment 12

•

5 years ago

(In reply to Logan Smyth [:loganfsmyth] from comment #11)

I think I'm not following something here. I thought mixing the content together so that the document.write results show up in the final result source text what what your option #1 is proposing. I think I'm just not following what your option #1 would actually show the user? Would you be able to elaborate on what the experience would be for users? What do they see in the sources panel and editor when the pause in a document.write-injected inline script?

Option #1 would treat the written HTML as a different source from the original HTML for the document (if there is any). Over the course of its lifetime a given document can have at most one streamed HTML source, and at most one written HTML source (AFAICT, after calling document.write() and document.close(), further document.write() calls continue appending to the document). From the debugger client's perspective, these are separate sources, with different source actors. It has to be that way, really; when a document's contents are overwritten, its old scripts continue to exist, and need to be associated with the old HTML contents.

Below is a self contained example that overwrites the document and has scripts in both the old and new version (it also pauses at a debugger statement and exhibits this bug). When the client is viewing the source associated with the actor from the streamed HTML, it shows that streamed HTML. Likewise, when the client is viewing the source associated with the actor in the written HTML, it shows all HTML that has been written so far, excluding any original streamed content (which was wiped out when the writing began). When pausing at the debugger statement, it shows the the first <div> and the <script>, and after resuming it updates the source text so that the second <div> is shown as well.

The original HTML source would be shown in the sources tree with its usual title. The written source should have a different title, and could be shown in the source tree or kept out of it, like we do with eval sources.

<div>Hello</div>
<script>
setTimeout(() => {
document.write(`
<div>There</div>
<sc` + `ript>
debugger;
document.write("<div>Surprise!</div>");
</sc` + `ript>
`);
}, 1000);
</script>

Comment 13

•

5 years ago

Option #1 would treat the written HTML as a different source from the original HTML for the document (if there is any). Over the course of its lifetime a given document can have at most one streamed HTML source, and at most one written HTML source (AFAICT, after calling document.write() and document.close(), further document.write() calls continue appending to the document). From the debugger client's perspective, these are separate sources, with different source actors. It has to be that way, really; when a document's contents are overwritten, its old scripts continue to exist, and need to be associated with the old HTML contents.

Got it. And to verify, the "written HTML source" for your snippet above would still include the HTML from the original HTML source, but it would then also include the HTML content injected via document.write? It seems like it would have to, since the written HTML is interleaved with the original HTML. That I find concerning because it means that there either
a) a Debugger.Source would have to apply in multiple source actors, not a single one, which adds complexity around breakpoint handling
b) a Debugger.Source would apply to either the original or written, making the user experience of empty-lines/breakpoints in the UI confusing

Thoughts?

I think my preference would still be to treat inline scripts that were not in the original direct HTML source the same way we're treat evaled code. It is just an anonymous snippet of JS.

Assignee

Comment 14

•

5 years ago

(In reply to Logan Smyth [:loganfsmyth] from comment #13)

Got it. And to verify, the "written HTML source" for your snippet above would still include the HTML from the original HTML source, but it would then also include the HTML content injected via document.write? It seems like it would have to, since the written HTML is interleaved with the original HTML.

No, it would only include the written HTML. When an HTML document is streamed in from the network, and someone later calls document.write() or document.open() on it, all nodes in the document are cleared out. This includes the <script> elements, but the scripts associated with those elements will continue to run --- they have a life of their own once they start, independent from the element that created them. Returning to the example in comment 12, there would be two HTML sources in the client. The streamed source would be:

<div>Hello</div>
<script>
setTimeout(() => {
document.write(`
<div>There</div>
<sc` + `ript>
debugger;
document.write("<div>Surprise!</div>");
</sc` + `ript>
`);
}, 1000);
</script>

The written source when paused at the debugger statement would be:

<div>There</div>
<script>
debugger;
document.write("<div>Surprise!</div>");
</script>

After the script in the written source finishes, the written source would be updated to:

<div>There</div>
<script>
debugger;
document.write("<div>Surprise!</div>");
</script>
<div>Surprise!</div>

Each HTML source would have a distinct set of source actors, which are the source actors for its inline scripts. This is all exactly how we do it today. The differences would be:

a) HTML sources might not have a title / show up in the sources pane.
b) The contents of an HTML source can grow over time.

I think my preference would still be to treat inline scripts that were not in the original direct HTML source the same way we're treat evaled code. It is just an anonymous snippet of JS.

It is, yeah, but we'll need to fixup line numbers associated with errors, breakpoints, pause locations, etc. for the script so that they match the inline source rather than the written HTML the script is contained in. From Gecko's perspective the written HTML is no different from the original streamed HTML, so the inline script can have a start line different from one, unlike the other eval scripts we deal with.

Comment 15

•

5 years ago

No, it would only include the written HTML. When an HTML document is streamed in from the network, and someone later calls document.write() or document.open() on it, all nodes in the document are cleared out.

Gotcha.

It is, yeah, but we'll need to fixup line numbers associated with errors, breakpoints, pause locations, etc. for the script so that they match the inline source rather than the written HTML the script is contained in. From Gecko's perspective the written HTML is no different from the original streamed HTML, so the inline script can have a start line different from one, unlike the other eval scripts we deal with.

Since this is certainly non-trivial, this seems to come down to a combination of what final design we want, what avoids the most complexity, and what do we think will be most useful for users.

From a technical standpoint, I'm very concerned about the scale of changes that would be necessary to support source text that can change. The debugger assumes that source content is static all over the places. Having to change that is something that I personally find pretty scary and highly likely to introduce bugs or have us miss things.

From the user perspective, I'm unsure that users want or care about a few of strictly the tokens that were written into the page, without the context in which they were written. If I load an HTML file with

<body><div>other stuff<script>document.write("</div>");</script>

is it useful in any way for the user to have an additional source with just </div>?

In that context, it feels like all users will actually care about is the ability to view and interact with the script itself, and in that context, treating the inline script as a simple anonymous eval-like thing seems in line with that.

I agree that it may well be complex to have to update the position for these scripts, but I think that complexity is worth it when compared against user expectations and the complexity introduced into the debugger client, were we to have a secondary HTML file that could have its content update over time.

From Gecko's perspective the written HTML is no different from the original streamed HTML, so the inline script can have a start line different from one, unlike the other eval scripts we deal with.

Would we be able to detect HTML injected with reentrant logic and injected after the last network packet arrives? We could use that to switch from using the HTML positioning logic to defaulting to a blank URL at line 1 column 0 when instantiating scripts for these.

Assignee

Comment 16

•

5 years ago

(In reply to Logan Smyth [:loganfsmyth] from comment #15)

I agree that it may well be complex to have to update the position for these scripts, but I think that complexity is worth it when compared against user expectations and the complexity introduced into the debugger client, were we to have a secondary HTML file that could have its content update over time.

Sure, that sounds good, that was my main question in comment 7. I'll work on modifying the line numbers for references to these scripts to be relative to the start of the source instead of the start of the written HTML (the second option in comment 7).

From Gecko's perspective the written HTML is no different from the original streamed HTML, so the inline script can have a start line different from one, unlike the other eval scripts we deal with.

Would we be able to detect HTML injected with reentrant logic and injected after the last network packet arrives? We could use that to switch from using the HTML positioning logic to defaulting to a blank URL at line 1 column 0 when instantiating scripts for these.

I don't think we need to do this explicitly, we just need the source to recognize when it is not considered an inline script (which is the case here already) but that its start line is something other than 1. In such cases the line numbers associated with the source need to be biased.

Assignee

Comment 17

•

5 years ago

Attached file Bug 1559583 Part 1 - Add Debugger.Source.startLine, r=jimb. — Details

Assignee

Comment 18

•

5 years ago

Attached file Bug 1559583 Part 2 - Pad source contents with blank lines when they don't start at line 1, r=loganfsmyth. — Details

Depends on D40911

:Harald Kirschner :digitarald

Assignee

Comment 19

•

5 years ago

These patches use a third option, which is somewhat cheesy but much simpler and less invasive than either of the other two. When the server is sending a source's text to the client, and it notices the source's start line is greater than one, it just pads the source contents with blank lines so that the text it sends has the source's start on the right line. In a long written HTML document this could add a lot of blank lines to the source, but since these sources aren't included in the sources pane, users can't browse to them, and if we select a location in the source (by pausing, selecting a console message location, and so forth), then the debugger will scroll so that the right line is in view, and it doesn't really matter how many blank preceding lines there are.

Marco Mucci [:MarcoM]

Updated

•

5 years ago

Status: NEW → ASSIGNED

Comment 20

•

5 years ago

clearing ni?

Flags: needinfo?(hkirschner)

Pulsebot

Comment 21

•

5 years ago

Pushed by bhackett@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c6d224b63023
Part 1 - Add Debugger.Source.startLine, r=jimb.
https://hg.mozilla.org/integration/autoland/rev/e9636f3aa216
Part 2 - Pad source contents with blank lines when they don't start at line 1, r=loganfsmyth.

Andrei Ciure[:aciure]

Comment 22

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/c6d224b63023
https://hg.mozilla.org/mozilla-central/rev/e9636f3aa216

Status: ASSIGNED → RESOLVED

Closed: 5 years ago

status-firefox70: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → Firefox 70

Gregg Tavares

Comment 23

•

5 years ago

I must be missing it but this doesn't seem to be fixed

Tried Nightly 71.0a1 (2019-09-16) (64-bit)

Still doesn't work

https://i.imgur.com/fiOByHY.gif

Here's Chrome for comparison

https://i.imgur.com/nH5bfaK.gif

Gregg Tavares

Comment 24

•

5 years ago

sorry, maybe that last comment is a different bug

https://bugzilla.mozilla.org/show_bug.cgi?id=1562824

Gregg Tavares

Comment 25

•

5 years ago

Testing this issue though it's not fixed for me.

Nightly 71.0a1 (2019-09-16) (64-bit)

https://i.imgur.com/plmCLBb.gif

Chrome

https://i.imgur.com/cUGWEFv.gif

Jim Blandy :jimb

Updated

•

5 years ago

Flags: needinfo?(jimb)

:Harald Kirschner :digitarald

Comment 26

•

5 years ago

Brian, should we re-open this bug or file a new bug for the failing test in comment 25?

Flags: needinfo?(bhackett1024)

Assignee

Updated

•

5 years ago

Depends on: 1591743