Closed Bug 1288084 Opened 8 years ago Closed 5 years ago

[jsdbg2] Define properties to obtain the startLine/Column and endLine/Column of a source.

Categories

(Core :: JavaScript Engine, defect, P2)

defect

Tracking

()

RESOLVED FIXED

People

(Reporter: ejpbruel, Assigned: hsivonen)

References

Details

Attachments

(4 files, 5 obsolete files)

The Chrome Debugging Protocol defines an event to be fired when a new script is parsed/failed to parse (note that a script in the Chrome protocol corresponds to a source in the Firefox protocol). See:

https://chromedevtools.github.io/debugger-protocol-viewer/1-1/Debugger/#event-scriptParsed

The JSON RPC packet for these events contains parameters named startLine/Column and endLine/Column. These represent the line/column offset of the script within the resource with the given URL, and the (length of) the last line of the script, respectively.

The Debugger API currently does not provide a way to obtain this information (there is a way to obtain a startLine/lineCount from a script, but this is relative to the source that introduced the script, not relative to the resource that introduced the source).

The documentation for Debugger.Source mentions two properties, enclosingStart/lineCount to be implemented as future plans. See:

https://developer.mozilla.org/en-US/docs/Tools/Debugger-API/Debugger.Script

This is close to what we need, but does not seem to offer an easy way to get the startColumn/endColumn (we could obtain the endColumn by simply counting the number of characters on the last line of the source, but the startColumn seems less obvious).

Assuming that the design of these future plans is not set in stone, I'd like to propose redesigning this API so that it matches the way we intend to use it. In other words, instead of defining two properties enclosingStart and lineCount, let's define four properties startLine/Column and endLine/Column, that correspond to the parameters on the scriptParsed/scriptFailedToParse events.
Whiteboard: [devtools-html] [triage]
Blocks: 1263289
Flags: qe-verify-
Priority: -- → P2
Whiteboard: [devtools-html] [triage] → [devtools-html]
Going to start looking into this.
Assignee: nobody → ejpbruel
I made some good progress on this, but I've ran into some issues.

When a new JavaScript source is compiled, we want the start line of that source relative to the resource that introduced it. We pass this information to SpiderMonkey via the lineno property on the CompileOptions class. If we store this lineno property on the resulting ScriptSource, we can expose it with the Debugger.Source.prototype.startLine property.

I have written a few tests, and for static script elements this approach seems to work fine: in this case, lineno is the line number of the script element in the HTML document, which is exactly what we want.

For dynamically inserted script elements, the desired behavior is less clear: in this case, lineno is always 1. This makes sense if dynamically inserted script elements don't have a well-defined position in the HTML document. Is this indeed the behavior we want?

For static event handlers, the current behavior seems wrong: in this case, too, lineno is always 1. I would expect lineno to be the line number of the event handler in the HTML document. Is this behavior by design? If so, why? If not, can we change it?

Finally, I noticed that although we set the line of the source in nsScriptLoader.cpp, we don't seem to set the column. Getting the start column of source is also something we'd like to be able to do. Is it possible to implement this and how hard would this be?

Henri, my understanding is that you are the person to talk to about issues related to HTML parsing, so can you help me answer the above questions? Thank you in advance for your help!
Flags: needinfo?(hsivonen)
Attached patch WIP (obsolete) — Splinter Review
For reference, here is the WIP patch with the failing test.
I don't think we want to just blindly expose `CompileOptions::lineno` via `Debugger.Source.prototype.startLine`. In those cases where there is no enclosing document, `startLine` should return something like `undefined`.
(In reply to Jim Blandy :jimb from comment #4)
> I don't think we want to just blindly expose `CompileOptions::lineno` via
> `Debugger.Source.prototype.startLine`. In those cases where there is no
> enclosing document, `startLine` should return something like `undefined`.

Agreed, I will take that into account in the next version of the patch. It might be worthwhile to see what the Chrome protocol returns in that case.
(In reply to Eddy Bruel [:ejpbruel] from comment #2)
> I made some good progress on this, but I've ran into some issues.
> 
> When a new JavaScript source is compiled, we want the start line of that
> source relative to the resource that introduced it. We pass this information
> to SpiderMonkey via the lineno property on the CompileOptions class. If we
> store this lineno property on the resulting ScriptSource, we can expose it
> with the Debugger.Source.prototype.startLine property.
> 
> I have written a few tests, and for static script elements this approach
> seems to work fine: in this case, lineno is the line number of the script
> element in the HTML document, which is exactly what we want.
> 
> For dynamically inserted script elements, the desired behavior is less
> clear: in this case, lineno is always 1. This makes sense if dynamically
> inserted script elements don't have a well-defined position in the HTML
> document. Is this indeed the behavior we want?
> 
> For static event handlers, the current behavior seems wrong: in this case,
> too, lineno is always 1. I would expect lineno to be the line number of the
> event handler in the HTML document. Is this behavior by design? If so, why?
> If not, can we change it?
> 
> Finally, I noticed that although we set the line of the source in
> nsScriptLoader.cpp, we don't seem to set the column. Getting the start
> column of source is also something we'd like to be able to do. Is it
> possible to implement this and how hard would this be?
> 
> Henri, my understanding is that you are the person to talk to about issues
> related to HTML parsing, so can you help me answer the above questions?
> Thank you in advance for your help!

Needinfo ping.
(In reply to Eddy Bruel [:ejpbruel] from comment #2)
> For static event handlers, the current behavior seems wrong: in this case,
> too, lineno is always 1. I would expect lineno to be the line number of the
> event handler in the HTML document. Is this behavior by design? If so, why?
> If not, can we change it?

It is by design in the sense that our DOM nodes other than script element nodes don't track source location info to avoid the size overhead in all DOM nodes.

The event handler compilation isn't triggered by the HTML parser directly. The parser just sets the attributes and then DOM code takes care of passing the strings to the JS compiler.

It looks like event handler compilation happens synchronously when the attribute is set. smaug, is this so? If it is, we could do something ugly like sticking the line number is some global(-ish) variable before setting the attribute so that event handler compilation could get the line number out of band without bloating the DOM nodes themselves (and without having to add extra arguments to all the methods on the attribute setting path).

> Finally, I noticed that although we set the line of the source in
> nsScriptLoader.cpp, we don't seem to set the column. Getting the start
> column of source is also something we'd like to be able to do. Is it
> possible to implement this and how hard would this be?

Not hard. The Java version of the parser already tracks column info. However, because Gecko didn't care about the column, I omitted column tracking in the hope of a marginal perf win.

Basically, checkChar needs to be marked as not translated from Java (currently just reads from the buffer https://hg.mozilla.org/projects/htmlparser/file/dd08dec8acb7/src/nu/validator/htmlparser/impl/Tokenizer.java#l6512 ) and replaced with an implementation that also updates a column counter like this: https://hg.mozilla.org/projects/htmlparser/file/dd08dec8acb7/src/nu/validator/htmlparser/impl/ErrorReportingTokenizer.java#l230 . (I'd rather not hoist the implementation from ErrorReportingTokenizer to Tokenizer on the Java side in order to allow the case where error reporting isn't requested on the Java side to be JITted to faster code.)
Flags: needinfo?(hsivonen) → needinfo?(bugs)
(In reply to Henri Sivonen (:hsivonen) from comment #7)
> It looks like event handler compilation happens synchronously when the
> attribute is set. smaug, is this so?

Nope. See EventListenerManager::SetEventHandler. It has aDeferCompilation.
But we could still store the lineno in Listener.
Flags: needinfo?(bugs)
(In reply to Olli Pettay [:smaug] from comment #8)
> (In reply to Henri Sivonen (:hsivonen) from comment #7)
> > It looks like event handler compilation happens synchronously when the
> > attribute is set. smaug, is this so?
> 
> Nope. See EventListenerManager::SetEventHandler. It has aDeferCompilation.
> But we could still store the lineno in Listener.

Ok. I started diving into this. It looks like EventHandlerManager::CompileEventHandlerInternal always has access to an instance of Listener, regardless of whether or not compilation is deferred, so we can simply store the line number in the Listener in EventListenerManager::SetEventHandler, like Olli suggested.

What I have not yet figured out is how to get the line number in EventListenerManager::SetEventHandler.  Henri proposed using a global, so I assume that EventListenerManager::SetEventHandler is not called by the HTML parser directly. Otherwise, we could just pass the line number as an argument to the function.

How exactly does control flow from parsing an event handler to calling EventListenerManager::SetEventHandler? More specifically, where in the code should we set this global line number variable? Also, is EventListenerManager::SetEventHandler guaranteed to be called before the next event handler is parsed? If not, the line number information could be overwritten by the time we access it.
Flags: needinfo?(bugs)
Parser should set the global line number variable, somewhere.
It does call SetAttr on each attribute (and SetAttr then calls SetEventHandler), and before any event handler attribute it should set the variable and clear it afterwards.
EventListenerManager::SetEventHandler should effectively use something like
TakeCurrentHandlerNumber which takes the lineno and clears the global variable. That way Mutation Event Listeners modifying attributes won't end up reusing the variable.

The setup is a tad error prone, and I would name the variable something like gEventHandlerLinoForDevtools.
Flags: needinfo?(bugs)
(In reply to Olli Pettay [:smaug] from comment #10)
> Parser should set the global line number variable, somewhere.
> It does call SetAttr on each attribute (and SetAttr then calls
> SetEventHandler), and before any event handler attribute it should set the
> variable and clear it afterwards.
> EventListenerManager::SetEventHandler should effectively use something like
> TakeCurrentHandlerNumber which takes the lineno and clears the global
> variable. That way Mutation Event Listeners modifying attributes won't end
> up reusing the variable.
> 
> The setup is a tad error prone, and I would name the variable something like
> gEventHandlerLinoForDevtools.

Thanks Olli.

Henri, where in the parser are we calling SetAttr for event handlers? And how can I get the parser's knowledge about the current location at that point?
Flags: needinfo?(hsivonen)
(In reply to Eddy Bruel [:ejpbruel] from comment #11)
> Henri, where in the parser are we calling SetAttr for event handlers?

Normally at https://dxr.mozilla.org/mozilla-central/source/parser/html/nsHtml5TreeOperation.cpp#435

Also at https://dxr.mozilla.org/mozilla-central/source/parser/html/nsHtml5TreeOperation.cpp#322 when a second <body> adds attributes to the first <body>.

> And
> how can I get the parser's knowledge about the current location at that
> point?

The most memory-efficient (though illogical) place to transfer this info would be add a field for the line number to the attribute holder https://dxr.mozilla.org/mozilla-central/source/parser/html/nsHtml5HtmlAttributes.h (no need to spend time setting up the Java-to-C++ translation to get a proof-of-concept; I can do that later this week). Then set the line number (obtained by calling tokenizer->getLineNumber()) on the attribute holder in nsHtml5TreeBuilder::createElement (https://dxr.mozilla.org/mozilla-central/source/parser/html/nsHtml5TreeBuilderCppSupplement.h#71) and read it at the locations above.

This is memory-efficient, because the attribute holder is a pre-existing heap-allocated thing that can be extended with fields without bloating all tree ops. The logical place for this data would be in the tree op itself, but that would make *all* tree ops larger, since all ops have to be of the same size.
Flags: needinfo?(hsivonen)
(In reply to Olli Pettay [:smaug] from comment #10)
> Parser should set the global line number variable, somewhere.
> It does call SetAttr on each attribute (and SetAttr then calls
> SetEventHandler), and before any event handler attribute it should set the
> variable and clear it afterwards.
> EventListenerManager::SetEventHandler should effectively use something like
> TakeCurrentHandlerNumber which takes the lineno and clears the global
> variable. That way Mutation Event Listeners modifying attributes won't end
> up reusing the variable.
> 
> The setup is a tad error prone, and I would name the variable something like
> gEventHandlerLinoForDevtools.

Guys, there's got to be a better way than this. Haven't we regretted these sorts of "expedient" decisions often enough already? ("At long last, have you no shame, sir??") The devtools need the information they present to be trustworthy.
It should be trustworthy, when used right. Which is why the name should hint something about its usage here.

Adding some new parameters to SetAttr would be a bit overkill for this kind of case, IMO.
(and one would need to prove that adding new params wouldn't slow down attribute changes)
In theory we could add some AttrChangeContext* param to SetAttr and default to null. But just passing that param everywhere, only for this case... feels over-engineering.
I don't know the types here, which is why I hesitated to suggest anything more concrete, but:

Could you add a new virtual method to Element that takes an AttrChangeContext pointer, with a default definition that drops the pointer and just calls the ordinary definition? Then only the implementations that care would need to override it, and only the callers that have the data would need to call it.
(In reply to Olli Pettay [:smaug] from comment #14)
> It should be trustworthy, when used right.

But that's true of everything: threads, void *, nullable pointers, downcasts, ...

What matters is our experiences with using the construct in the past.
Well, SetAttr is overriden in many cases. So all those would need to pass around this new object.
Maybe it isn't too bad and it probably doesn't affect to performance much.

Looks like SetAttr would need to pass the object to AfterSetAttr which then calls AfterSetAttr.

But this global variable wouldn't be no worse than say, entering to a compartment before doing some JS stuff. Here one would enter to a "parsing attribute with known lineno" mode. No manual variable changing but some normal Auto* classes on stack.

I don't object passing some object around if it doesn't complicate code too much.

Would we have any other use for lineno or other parsing related information? Or other context stuff?
(In reply to Henri Sivonen (:hsivonen) from comment #12)
> (In reply to Eddy Bruel [:ejpbruel] from comment #11)
> > Henri, where in the parser are we calling SetAttr for event handlers?
> 
> Normally at
> https://dxr.mozilla.org/mozilla-central/source/parser/html/
> nsHtml5TreeOperation.cpp#435
> 
> Also at
> https://dxr.mozilla.org/mozilla-central/source/parser/html/
> nsHtml5TreeOperation.cpp#322 when a second <body> adds attributes to the
> first <body>.
> 
> > And
> > how can I get the parser's knowledge about the current location at that
> > point?
> 
> The most memory-efficient (though illogical) place to transfer this info
> would be add a field for the line number to the attribute holder
> https://dxr.mozilla.org/mozilla-central/source/parser/html/
> nsHtml5HtmlAttributes.h (no need to spend time setting up the Java-to-C++
> translation to get a proof-of-concept; I can do that later this week). Then
> set the line number (obtained by calling tokenizer->getLineNumber()) on the
> attribute holder in nsHtml5TreeBuilder::createElement
> (https://dxr.mozilla.org/mozilla-central/source/parser/html/
> nsHtml5TreeBuilderCppSupplement.h#71) and read it at the locations above.
> 
> This is memory-efficient, because the attribute holder is a pre-existing
> heap-allocated thing that can be extended with fields without bloating all
> tree ops. The logical place for this data would be in the tree op itself,
> but that would make *all* tree ops larger, since all ops have to be of the
> same size.

What is the point of (In reply to Henri Sivonen (:hsivonen) from comment #12)
> (In reply to Eddy Bruel [:ejpbruel] from comment #11)
> > Henri, where in the parser are we calling SetAttr for event handlers?
> 
> Normally at
> https://dxr.mozilla.org/mozilla-central/source/parser/html/
> nsHtml5TreeOperation.cpp#435
> 
> Also at
> https://dxr.mozilla.org/mozilla-central/source/parser/html/
> nsHtml5TreeOperation.cpp#322 when a second <body> adds attributes to the
> first <body>.
> 
> > And
> > how can I get the parser's knowledge about the current location at that
> > point?
> 
> The most memory-efficient (though illogical) place to transfer this info
> would be add a field for the line number to the attribute holder
> https://dxr.mozilla.org/mozilla-central/source/parser/html/
> nsHtml5HtmlAttributes.h (no need to spend time setting up the Java-to-C++
> translation to get a proof-of-concept; I can do that later this week). Then
> set the line number (obtained by calling tokenizer->getLineNumber()) on the
> attribute holder in nsHtml5TreeBuilder::createElement
> (https://dxr.mozilla.org/mozilla-central/source/parser/html/
> nsHtml5TreeBuilderCppSupplement.h#71) and read it at the locations above.
> 
> This is memory-efficient, because the attribute holder is a pre-existing
> heap-allocated thing that can be extended with fields without bloating all
> tree ops. The logical place for this data would be in the tree op itself,
> but that would make *all* tree ops larger, since all ops have to be of the
> same size.

This seems to contradict what Olli just said. As far as I can tell, EventListenerManager::SetEventHandler does not have access to nsHtml5HtmlAttributes. Wasn't that why we're considering a global in the first place?

Truth be told, now that I understand the code a bit better, I don't see the point in using a global here either: if we can obtain the current line number from the tokenizer in nsHtml5TreeBuilder::createElement, as Henri suggests, and nsHtml5TreeBuilder::createElement calls SetAttr, shouldn't we just pass the line number as an argument to SetAttr? Since each attribute can have a different line number, that seems like the natural thing to do.

What do we gain by passing this information via a global instead, other than making the intended data flow harder to follow?
(In reply to Olli Pettay [:smaug] from comment #17)
> But this global variable wouldn't be no worse than say, entering to a
> compartment before doing some JS stuff. Here one would enter to a "parsing
> attribute with known lineno" mode. No manual variable changing but some
> normal Auto* classes on stack.

When compartments were first implemented, from the beginning, we had hundreds of compartment-match assertions in SpiderMonkey; and indeed, we were sorting through compartment assertion failures for months. The success of compartments is only possible because we have had this thorough network of checks to catch mistakes. We persevered in this case because we had no alternative. In that light, I instead see our experience with compartments as showing just how hard globals are to get right.

> I don't object passing some object around if it doesn't complicate code too
> much.
> 
> Would we have any other use for lineno or other parsing related information?
> Or other context stuff?

I think most other contextual information is available from the element's document.
Needinfo for comment 18.
Flags: needinfo?(bugs)
You can pass the the lineno, or better to pass some more abstract "creationContext" to SetAttr, but be prepared to change several SetAttr and AfterSetAttr implementations (SetEventHandler is called in AfterSetAttr).
That could end up being a bit overkill given that we just need lineno in some rare case.
That is why the global lineno is still an option.
Flags: needinfo?(bugs)
If the concern is performance, I could live with using a global. If the concern is 'we'd have to change a lot of code' then I don't think the technical debt incurred later weighs up against the time we save now by using a global, so I'm going to try and pass the attribute location as an argument to SetAttr.
Here's a first attempt to add an extra argument to SetAttr for the creation context of an attribute. It's still a work in progress, but before I continue, I have a couple of questions:

1. Do you agree with the overall approach so far?
2. In particular, I've assumed that for now we only care about the creation
   context of an attribute when we are calling SetAttr from the HTML parser for
   an event handler. Rather than force all other callers of SetAttr to pass an
   explicit AttributeLocation, I've made it a default argument. Is that
   acceptable practice within DOM code?
3. For those classes that forward their implementation of SetAttr to the
   implementation of SetAttr on their base class, I've tried to forward the 
   AttributeLocation instead of relying on the default argument. Did I miss
   any egregious cases here?
4. The end goal here is to get the AttributeLocation to SetEventHandler. This
   is called from AfterSetAttr, which in turn is called from SetAttrAndNotify,
   so both these methods will need an AttributeLocation argument as well. 
   Moreover, since these methods are public, I'd like to use a default value
   here too. Is that ok?
5. Finally, and somewhat surprisingly, I noticed that UnsetAttr also calls 
   AfterSetAttr. Can we afford to simply pass a default location here?
Attachment #8782442 - Flags: feedback?(bugs)
(In reply to Eddy Bruel [:ejpbruel] from comment #23)
> Created attachment 8782442 [details] [diff] [review]

> 1. Do you agree with the overall approach so far?
Looks ok. You will need to update also all the AfterSetAttr implementations


> 2. In particular, I've assumed that for now we only care about the creation
>    context of an attribute when we are calling SetAttr from the HTML parser
> for
>    an event handler. Rather than force all other callers of SetAttr to pass
> an
>    explicit AttributeLocation, I've made it a default argument. Is that
>    acceptable practice within DOM code?
totally fine


> 3. For those classes that forward their implementation of SetAttr to the
>    implementation of SetAttr on their base class, I've tried to forward the 
>    AttributeLocation instead of relying on the default argument. Did I miss
>    any egregious cases here?
That is error prone indeed. Not sure how we can enforce that each SetAttr and AfterSetAttr deal with the case properly.
Static analysis?


> 4. The end goal here is to get the AttributeLocation to SetEventHandler. This
>    is called from AfterSetAttr, which in turn is called from
> SetAttrAndNotify,
>    so both these methods will need an AttributeLocation argument as well. 
>    Moreover, since these methods are public, I'd like to use a default value
>    here too. Is that ok?
That should be ok. And yes, this adds tons of stuff everywhere, but perhaps that is ok.


> 5. Finally, and somewhat surprisingly, I noticed that UnsetAttr also calls 
>    AfterSetAttr. Can we afford to simply pass a default location here?
Well, UnsetAttr calls AfterSetAttr with null value. So yes, default location is fine


FWIW, we're aiming to make SetAttr non-virtual and all the specialization should happen in Before/AfterSetAttr, so hopefully this code will become a bit nicer.
Comment on attachment 8782442 [details] [diff] [review]
Add an AttributeLocation argument to the SetAttr method (WIP).

I'd be still interested to know if AttributeLocation shows up in some way in performance profiles when doing attributes changes in a loop.
Attachment #8782442 - Flags: feedback?(bugs) → feedback+
(In reply to Eddy Bruel [:ejpbruel] from comment #18)
> This seems to contradict what Olli just said. As far as I can tell,
> EventListenerManager::SetEventHandler does not have access to
> nsHtml5HtmlAttributes. Wasn't that why we're considering a global in the
> first place?

I suggested a global mainly to avoid adding the run-time cost of propagating yet another argument through multiple layers to all post-parsing calls to SetAttr for non-event handler attributes.

> Truth be told, now that I understand the code a bit better, I don't see the
> point in using a global here either: if we can obtain the current line
> number from the tokenizer in nsHtml5TreeBuilder::createElement,

BTW, what I suggested about obtaining the line number gives you the line number of the ">" that ends the tag. Most often the whole tag is on one line, so this is OK. If you want correctness on the level of dealing with tags that have line breaks within the tag, that's doable, too, but more invasive. (I suggest doing what I suggested first and then extending the solution to per-attribute line numbers if really deemed necessary.)
(In reply to Olli Pettay [:smaug] from comment #25)
> Comment on attachment 8782442 [details] [diff] [review]
> Add an AttributeLocation argument to the SetAttr method (WIP).
> 
> I'd be still interested to know if AttributeLocation shows up in some way in
> performance profiles when doing attributes changes in a loop.

I'd be happy to profile these changes, if you can give me some pointers how to do so.
I've made further progress on this patch. I now propagate the AttributeLocation from SetAttr all the way to CompileEventHandlerInternal, where it is needed.

To minimize the number of callers that I had to change, I've used default values for AttributeLocation wherever possible. The only exception to this is when subclasses override a method on their base class, and the overridden method forwards to the base class method. In those cases, I forward the location argument as well.

Olli, please let me know whether you are happy with this patch as it is. If not, please let me know what you'd like top see changed.
Attachment #8782442 - Attachment is obsolete: true
Attachment #8782920 - Flags: feedback?(bugs)
(In reply to Henri Sivonen (:hsivonen) from comment #12)
> (In reply to Eddy Bruel [:ejpbruel] from comment #11)
> > Henri, where in the parser are we calling SetAttr for event handlers?
> 
> Normally at
> https://dxr.mozilla.org/mozilla-central/source/parser/html/
> nsHtml5TreeOperation.cpp#435
> 
> Also at
> https://dxr.mozilla.org/mozilla-central/source/parser/html/
> nsHtml5TreeOperation.cpp#322 when a second <body> adds attributes to the
> first <body>.
> 
> > And
> > how can I get the parser's knowledge about the current location at that
> > point?
> 
> The most memory-efficient (though illogical) place to transfer this info
> would be add a field for the line number to the attribute holder
> https://dxr.mozilla.org/mozilla-central/source/parser/html/
> nsHtml5HtmlAttributes.h (no need to spend time setting up the Java-to-C++
> translation to get a proof-of-concept; I can do that later this week). Then
> set the line number (obtained by calling tokenizer->getLineNumber()) on the
> attribute holder in nsHtml5TreeBuilder::createElement
> (https://dxr.mozilla.org/mozilla-central/source/parser/html/
> nsHtml5TreeBuilderCppSupplement.h#71) and read it at the locations above.
> 
> This is memory-efficient, because the attribute holder is a pre-existing
> heap-allocated thing that can be extended with fields without bloating all
> tree ops. The logical place for this data would be in the tree op itself,
> but that would make *all* tree ops larger, since all ops have to be of the
> same size.

It looks like the tokenizer is only available in nsHTML5TreeBuilderCppSupplement, so I assume that what you meant to do was obtain the line number there, and then pass that as an extra argument to nsHTML5TreeOperation::CreateElement.

I've tried that approach, but am running in to some trouble, because nsHTML5TreeOperation::CreateElement is also called from nsHTML5TreeOperation::Perform. How do I obtain the line number in that case?
Flags: needinfo?(hsivonen)
(In reply to Eddy Bruel [:ejpbruel] from comment #29)
> (In reply to Henri Sivonen (:hsivonen) from comment #12)
> > (In reply to Eddy Bruel [:ejpbruel] from comment #11)
> > > Henri, where in the parser are we calling SetAttr for event handlers?
> > 
> > Normally at
> > https://dxr.mozilla.org/mozilla-central/source/parser/html/
> > nsHtml5TreeOperation.cpp#435
> > 
> > Also at
> > https://dxr.mozilla.org/mozilla-central/source/parser/html/
> > nsHtml5TreeOperation.cpp#322 when a second <body> adds attributes to the
> > first <body>.
> > 
> > > And
> > > how can I get the parser's knowledge about the current location at that
> > > point?
> > 
> > The most memory-efficient (though illogical) place to transfer this info
> > would be add a field for the line number to the attribute holder
> > https://dxr.mozilla.org/mozilla-central/source/parser/html/
> > nsHtml5HtmlAttributes.h (no need to spend time setting up the Java-to-C++
> > translation to get a proof-of-concept; I can do that later this week). Then
> > set the line number (obtained by calling tokenizer->getLineNumber()) on the
> > attribute holder in nsHtml5TreeBuilder::createElement
> > (https://dxr.mozilla.org/mozilla-central/source/parser/html/
> > nsHtml5TreeBuilderCppSupplement.h#71) and read it at the locations above.
> > 
> > This is memory-efficient, because the attribute holder is a pre-existing
> > heap-allocated thing that can be extended with fields without bloating all
> > tree ops. The logical place for this data would be in the tree op itself,
> > but that would make *all* tree ops larger, since all ops have to be of the
> > same size.
> 
> It looks like the tokenizer is only available in
> nsHTML5TreeBuilderCppSupplement, so I assume that what you meant to do was
> obtain the line number there, and then pass that as an extra argument to
> nsHTML5TreeOperation::CreateElement.

No, I meant adding it as a field to nsHtml5HtmlAttributes. However, let's forget that. Attached is a patch that exposes per-attribute line numbers, so you don't need to do anything in nsHTML5TreeBuilderCppSupplement.

This patch is untested beyond checking that it compiles. Also, I'm already aware that this results in the wrong line number for unquoted attributes whose value is terminated by a line break and for valueless attributes that have a line break between them and the next non-whitespace character.

I can fix this before shipping this, but this patch should be enough to unblock your efforts.

That is, with this patch, you can call aAttributes->getLineNoBoundsCheck(i) in nsHtml5TreeOperation::CreateElement(). Note that it returns -1 when the attribute wasn't in the source code. (Can happen even with parser-created attributes in the case of <isindex>, plain text or view source.)

> I've tried that approach, but am running in to some trouble, because
> nsHTML5TreeOperation::CreateElement is also called from
> nsHTML5TreeOperation::Perform. How do I obtain the line number in that case?

The line number needs to travel in the nsHtml5HtmlAttributes object, since that object travels across threads.
Flags: needinfo?(hsivonen)
(In reply to Henri Sivonen (:hsivonen) from comment #30)
> Created attachment 8783308 [details] [diff] [review]
> Expose a line number for each attribute
> 
> (In reply to Eddy Bruel [:ejpbruel] from comment #29)
> > (In reply to Henri Sivonen (:hsivonen) from comment #12)
> > > (In reply to Eddy Bruel [:ejpbruel] from comment #11)
> > > > Henri, where in the parser are we calling SetAttr for event handlers?
> > > 
> > > Normally at
> > > https://dxr.mozilla.org/mozilla-central/source/parser/html/
> > > nsHtml5TreeOperation.cpp#435
> > > 
> > > Also at
> > > https://dxr.mozilla.org/mozilla-central/source/parser/html/
> > > nsHtml5TreeOperation.cpp#322 when a second <body> adds attributes to the
> > > first <body>.
> > > 
> > > > And
> > > > how can I get the parser's knowledge about the current location at that
> > > > point?
> > > 
> > > The most memory-efficient (though illogical) place to transfer this info
> > > would be add a field for the line number to the attribute holder
> > > https://dxr.mozilla.org/mozilla-central/source/parser/html/
> > > nsHtml5HtmlAttributes.h (no need to spend time setting up the Java-to-C++
> > > translation to get a proof-of-concept; I can do that later this week). Then
> > > set the line number (obtained by calling tokenizer->getLineNumber()) on the
> > > attribute holder in nsHtml5TreeBuilder::createElement
> > > (https://dxr.mozilla.org/mozilla-central/source/parser/html/
> > > nsHtml5TreeBuilderCppSupplement.h#71) and read it at the locations above.
> > > 
> > > This is memory-efficient, because the attribute holder is a pre-existing
> > > heap-allocated thing that can be extended with fields without bloating all
> > > tree ops. The logical place for this data would be in the tree op itself,
> > > but that would make *all* tree ops larger, since all ops have to be of the
> > > same size.
> > 
> > It looks like the tokenizer is only available in
> > nsHTML5TreeBuilderCppSupplement, so I assume that what you meant to do was
> > obtain the line number there, and then pass that as an extra argument to
> > nsHTML5TreeOperation::CreateElement.
> 
> No, I meant adding it as a field to nsHtml5HtmlAttributes. However, let's
> forget that. Attached is a patch that exposes per-attribute line numbers, so
> you don't need to do anything in nsHTML5TreeBuilderCppSupplement.
> 
> This patch is untested beyond checking that it compiles. Also, I'm already
> aware that this results in the wrong line number for unquoted attributes
> whose value is terminated by a line break and for valueless attributes that
> have a line break between them and the next non-whitespace character.
> 
> I can fix this before shipping this, but this patch should be enough to
> unblock your efforts.
> 
> That is, with this patch, you can call aAttributes->getLineNoBoundsCheck(i)
> in nsHtml5TreeOperation::CreateElement(). Note that it returns -1 when the
> attribute wasn't in the source code. (Can happen even with parser-created
> attributes in the case of <isindex>, plain text or view source.)
> 
> > I've tried that approach, but am running in to some trouble, because
> > nsHTML5TreeOperation::CreateElement is also called from
> > nsHTML5TreeOperation::Perform. How do I obtain the line number in that case?
> 
> The line number needs to travel in the nsHtml5HtmlAttributes object, since
> that object travels across threads.

Thanks for your help! I will take another shot at this tomorrow, armed with your patch.
Comment on attachment 8782920 [details] [diff] [review]
Propagate AttributeLocation from SetAttr to CompileEventHandlerInternal.

struct AttributeLocation {
nit, { goes to its own line.

I'm a bit worried whether AttributeLocation gets passed to all the needed SetAttr and AfterSetAttr impls. But don't have good suggestions how to make the situation better.


EventListenerManager::SetEventHandler shouldn't have optional aLocation, nor
SetEventHandlerInternal
Attachment #8782920 - Flags: feedback?(bugs) → feedback+
This patch should be ready for review. Note that none of these changes depend on Henri's patch. I will file a separate patch for those.

I couldn't think of a good way to test these changes, so I haven't written any tests for this patch. However, since the goal of this patch is to get the attribute location into the debugger API, we can test these changes there.

I already have such a test for the debugger API, but it requires this patch, the changes that depend on Henri's patch, and the changes in the debugger API, to land first. With all three patches applied, the test passes on my machine. I still have to test this extensively on try before landing it, of course.
Attachment #8782920 - Attachment is obsolete: true
Attachment #8784311 - Flags: review?(bugs)
These changes depends on the patch by Henri. It uses getLineNoBoundsCheck to obtain an AttributeLocation and pass it to SetAttr. In combination with the previous patch, this will cause the line number to be passed to the CompileOptions for each event handler, which was our goal.
Attachment #8784312 - Flags: review?(hsivonen)
Comment on attachment 8784312 [details] [diff] [review]
Obtain AttributeLocation from getLineNoBoundsCheck.

Please also check aFromParser and set the line number to your "unknown" value (apparently 1) if aFromParser is something other than FROM_PARSER_NETWORK, since the line number given by the parser when parsing innerHTML or document.write() is going to be confusing to the user of dev tools.

Sorry about my failure to realize and mention this earlier.
Attachment #8784312 - Flags: review?(hsivonen) → review-
New patch with comments by Henri addressed.
Attachment #8784839 - Flags: review?(hsivonen)
Attachment #8784312 - Attachment is obsolete: true
As Jim pointed out during our standup yesterday, this patch does not strictly depend on the previous patches in that we could land it if we changed the line number for static event handler source in the test to 1 (the purpose of the previous patches is to give us the *actual* line number of the event handler source).
Attachment #8777799 - Attachment is obsolete: true
Attachment #8784845 - Flags: review?(jimb)
Attachment #8784839 - Flags: review?(hsivonen) → review+
Thanks for the review Henri.

As you know, this patch depends on you changes to the HTML parser, so do you have an ETA on when you can land those for me? Thanks!
Flags: needinfo?(hsivonen)
(In reply to Eddy Bruel [:ejpbruel] from comment #38)
> As you know, this patch depends on you changes to the HTML parser, so do you
> have an ETA on when you can land those for me? Thanks!

Once the patch I'm attaching now passes review, you can push it to inbound along with your patches here. I can then land the corresponding changeset to the htmlparser repo shortly after.

This patch does a better job than the previous version where there are line breaks within a tag. Attributes that don't have a value report the line of the attribute name. Attributes that do have a value report the line that the value started on. Isindex propagates the original line numbers.
Attachment #8783308 - Attachment is obsolete: true
Flags: needinfo?(hsivonen)
Attachment #8785950 - Flags: review?(wchen)
Comment on attachment 8784311 [details] [diff] [review]
Propagate AttributeLocation from SetAttr to CompileEventHandlerInternal.

Please don't add mColumn if it won't be used at all. It just takes memory.

That fixed, or explained why mColumn needs to be there in AttributeLocation, r+
Attachment #8784311 - Flags: review?(bugs) → review+
Comment on attachment 8784845 [details] [diff] [review]
Implement Debugger.Source.prototype.startLine.

Review of attachment 8784845 [details] [diff] [review]:
-----------------------------------------------------------------

::: devtools/server/tests/mochitest/chrome.ini
@@ +37,5 @@
>  [test_css-properties_02.html]
>  [test_Debugger.Source.prototype.introductionScript.html]
>  [test_Debugger.Source.prototype.introductionType.html]
>  [test_Debugger.Source.prototype.element.html]
> +[test_Debugger.Source.prototype.startLine.html]

Can't we write shell-only tests for this, using the options to the shell `evaluate` function?

http://searchfox.org/mozilla-central/rev/0dd403224a5acb0702bdbf7ff405067f5d29c239/js/src/shell/js.cpp#5400

::: js/src/jsscript.cpp
@@ +2344,5 @@
>          if (!formatted)
>              return false;
>          filename_.reset(formatted);
> +
> +        lineno_ = options.introductionLineno;

Is this correct? The "introduction" data is supposed to call out the location of the function call (eval, Function, etc.) that introduced a script to the system. It's not at all the position of the script within a larger document.

@@ +2349,5 @@
>      } else if (options.filename()) {
>          if (!setFilename(cx, options.filename()))
>              return false;
> +
> +        lineno_ = options.lineno;

This, however, seems just right. ReadOnlyCompileOptions::column would be appropriate here, too.

::: js/src/jsscript.h
@@ +655,5 @@
>      SourceType data;
>  
>      // The filename of this script.
>      UniqueChars filename_;
> +    unsigned lineno_;

This should probably be named `enclosingLine_`. Needs a comment referring to "the enclosing document" or some such.

::: js/src/vm/Debugger.cpp
@@ +6742,5 @@
> +        ScriptSource* ss = sourceObject->source();
> +        return (uint32_t) ss->lineno();
> +    }
> +    ReturnType match(Handle<WasmInstanceObject*> wasmInstance) {
> +        return 0;

Since we're returning a garbage value here, it might as well be a garbage value in range. Code expecting a value greater than or equal to 1 will barf the first time it hits a wasm script.

@@ +7104,5 @@
>      return true;
>  }
>  
>  static const JSPropertySpec DebuggerSource_properties[] = {
> +    JS_PSG("startLine", DebuggerSource_getStartLine, 0),

Could this be named "enclosingLine", to match the unimplemented "enclosingStart" in the spec?
Attachment #8784845 - Flags: review?(jimb) → feedback+
Attachment #8785950 - Flags: review?(wchen) → review+
Whiteboard: [devtools-html]
Try push for the following patch:
- Expose a line number for each attribute, v2

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5b3ecf1c4e74&selectedJob=27672499
Whiteboard: [leave-open]
Pushed by ejpbruel@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/496081cb3214
Expose a line number for each attribute, v2; r=wchen
https://hg.mozilla.org/projects/htmlparser/rev/137df5cc6d3204519f8807bbfb6309d11bf5a4cb
Mozilla bug 1288084 - Expose a line number for each attribute in C++. r=wchen.
Assignee: ejpbruel → nobody
Keywords: leave-open
Whiteboard: [leave-open]

The leave-open keyword is there and there is no activity for 6 months.
:sdetar, maybe it's time to close this bug?

Flags: needinfo?(sdetar)

Henri, do you why we set the leave-open keyword on this, do you think we should keep it open or close this bug?

Flags: needinfo?(sdetar) → needinfo?(hsivonen)

(In reply to Steven DeTar [:sdetar] from comment #47)

Henri, do you why we set the leave-open keyword on this, do you think we should keep it open or close this bug?

I have no idea. Let's close.

Status: NEW → RESOLVED
Closed: 5 years ago
Flags: needinfo?(hsivonen)
Keywords: leave-open
Resolution: --- → FIXED
Assignee: nobody → hsivonen
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: