Closed Bug 1184498 Opened 9 years ago Closed 9 years ago

error recovery in parser

Categories

(L20n :: JS Library, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Pike, Assigned: Pike)

References

Details

Attachments

(1 file, 1 obsolete file)

I've got an idea on how to get to better error recovery, I'll take a stab at that today.

Basically, when creating the JunkEntry, search the end starting from _curWhatever rather+1 than pos.
Stas, gandalf, what do you think?

From the commit message: When finding parser errors, make the JunkEntry as small as possible, and continue parsing.
If there are adjancent JunkEntries, glue them together.
After parsing, getEntry on just the JunkEntry to create error messages.
Assignee: nobody → l10n
Status: NEW → ASSIGNED
Attachment #8634686 - Flags: feedback?(stas)
Attachment #8634686 - Flags: feedback?(gandalf)
PS: travis complains about my hack to actually get out of "this", I might need to create the that somewhere else.

Note, I find the use of "this" peculiar in the code, we're setting internal parsing state properties on the global export.
Comment on attachment 8634686 [details] [review]
Create minimal JunkEntries, non-adjacent, create error messages after the fact

That looks good to me. I'm a little bit worried about perf impact on the entries parser if we'll go for this strategy, but that's mostly because the getResource is growing in complexity.

The latter part of your patch can be hidden behind some simple flag that is set to true only if at least one JunkEntry is created so that we can skip it entirely in the optimistic scenario.
Attachment #8634686 - Flags: feedback?(gandalf) → feedback+
I added a shortcut to my original branch, but also refactored the code to just have one lingering junk entry around, as an alternative.

Compare https://github.com/l20n/l20n.js/compare/master...Pike:minimal-junk-one-at-a-time?expand=1 vs https://github.com/l20n/l20n.js/compare/master...Pike:minimal-junk?expand=1

Not sure which I like better.

I started looking into the entries parser, and the non-emit code path puzzles me, in particular if we're going on with parsing to find a better error message.

I am confused a bit in which case we'd only ever wanted the first possible error, and not errors coming later.
(In reply to Axel Hecht [:Pike] from comment #5)
> I added a shortcut to my original branch, but also refactored the code to
> just have one lingering junk entry around, as an alternative.
> 
> Compare
> https://github.com/l20n/l20n.js/compare/master...Pike:minimal-junk-one-at-a-
> time?expand=1 vs
> https://github.com/l20n/l20n.js/compare/master...Pike:minimal-junk?expand=1
> 
> Not sure which I like better.

I like that parseErrors is a separate function in minimal-junk-one-at-a-time.  However, AFAIK, it runs that.getEntry() on every subsequent JunkEntry found, vs. once per JunkEntries group in minimal-junk?

> I started looking into the entries parser, and the non-emit code path
> puzzles me, in particular if we're going on with parsing to find a better
> error message.
> 
> I am confused a bit in which case we'd only ever wanted the first possible
> error, and not errors coming later.

This might be legacy from when there was only one parser.  I'd be okay with removing the non-emit path entirely.
https://github.com/Pike/l20n.js/blob/68b87e74ba53119736c390745d93fbeb6b44cda5/src/lib/format/l20n/ast/parser.js#L42 glues the currently pending junk entry together with the newly incoming one.

parseError() is only called once we actually have a new successful entry to add the AST, and thus, we never end up with adjacent JunkEntries, nor with error messages that aren't related to that entry.

... And then at the very end.
(In reply to Staś Małolepszy :stas from comment #6)
> > I am confused a bit in which case we'd only ever wanted the first possible
> > error, and not errors coming later.
> 
> This might be legacy from when there was only one parser.  I'd be okay with
> removing the non-emit path entirely.

The non-emit path is only useful when we run the entries parser without an emitter - like in case of perf benchmarks or tools/parse.

We probably can remove it and just enforce passing some eventemitter in those cases.
Stas, do you think we should do this?
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #8)
> (In reply to Staś Małolepszy :stas from comment #6)

> The non-emit path is only useful when we run the entries parser without an
> emitter - like in case of perf benchmarks or tools/parse.

But then it stops ont he first error which is probably not very useful anyways ;)

> We probably can remove it and just enforce passing some eventemitter in
> those cases.

Yeah, totally, let's make `emit` a required argument.
Comment on attachment 8634686 [details] [review]
Create minimal JunkEntries, non-adjacent, create error messages after the fact

I started working on refactoring the parser return value:  instead of emitting we can simply return all errors since the whole parsing is synchronous anyways.  It's quite a task and I don't want to block you from working on the entries parser.  I suggest then to go ahead and add junk gluing from this PR to the entries parser now (just remove the non-emit path in parse()) and I'll rebase my wip accordingly next week or so.
Attachment #8634686 - Flags: feedback?(stas) → feedback+
Attached file PR on branch
I've went ahead with the more evolved code-path on the ast parser, that doesn't reparse if it doesn't have to.

The port to the entries parser is a bit over my head.

I got to write some tests, and an ad-hoc port, but the combination is failing, and I lack insight on how to go forward in a smart way.

Also, with emit() being late, the difference between parser errors and duplicateentry ones becomes more interesting. Don't have a good idea on that one yet.
Attachment #8634686 - Attachment is obsolete: true
Blocks: 1189841
This has landed already IIRC
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: