Closed Bug 1639564 Opened 4 years ago Closed 4 years ago

moz-phab is broken with --less-context / files above 4MB | `Unexpected leading character "@"` error

Categories

(Conduit :: moz-phab, defect, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: sg, Assigned: glob)

References

Details

(Keywords: conduit-triaged)

Attachments

(2 files)

When accessing https://phabricator.services.mozilla.com/D76139, the following error is shown.

Unexpected leading character "@" at line index 104!
Keywords: conduit-triaged
Priority: -- → P2

:sg -- can you please include which version of mercurial you are using, as well as how you submitted your patch to Phabricator?

Flags: needinfo?(sgiesecke)

https://sentry.prod.mozaws.net/operations/mozphab-prod/issues/8495417

Exception: Unexpected leading character "@" at line index 104!
  File "/phabricator/src/applications/differential/parser/DifferentialHunkParser.php", line 507, in DifferentialHunkParser::parseHunksForLineData
    throw new Exception(
  File "/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php", line 672, in DifferentialChangesetParser::process
    $hunk_parser->parseHunksForLineData($changeset->getHunks());
  File "/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php", line 660, in DifferentialChangesetParser::tryCacheStuff
    $this->process();
  File "/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php", line 816, in DifferentialChangesetParser::render
    $this->tryCacheStuff();
  File "/phabricator/src/applications/differential/parser/DifferentialChangesetParser.php", line 76, in DifferentialChangesetParser::renderChangeset
    return $this->render($this->rangeStart, $this->rangeEnd, $this->mask);
...
(19 additional frame(s) were not displayed)
Attached file data.text

Hunk dump from Phabricator DB

I suspect this is a bug in moz-phab or possibly your version of Mercurial, not Phabricator. The data or corpus of the single hunk that is causing this problem (which is parsed and sent to Phabricator) appears to include additional diff headers that are invalid, the first of which is at line 105 in the attached file. I'm not sure how those headers made it into the diff/hunk, but this requires further investigation.

Possible relevant code in moz-phab: https://github.com/mozilla-conduit/review/blob/master/mozphab/mercurial.py#L894

Attached is the full body from the hunk that's causing the problem (phabricator_differential.differential_hunk.data with id == 2442624)

Snippet below as an example:

    1  /******************************************************************************
    2  ** This file is an amalgamation of many separate C source files from SQLite
    3 -** version 3.31.1.  By combining all the individual C code files into this
    4 +** version 3.32.0.  By combining all the individual C code files into this
    5  ** single large file, the entire code can be compiled as a single translation
    6  ** unit.  This allows many compilers to do optimizations that would not be
...
  101  #endif
  102  #if SQLITE_CHECK_PAGES
  103    "CHECK_PAGES",
  104  #endif
  105 @@ -121,524 +121,521 @@ static const char * const sqlite3azCompi
  106  #endif
  107  #if SQLITE_DEFAULT_AUTOVACUUM
  108    "DEFAULT_AUTOVACUUM",
  109  #endif
...

Interesting. I submitted that using moz-phab. My moz-phab version is MozPhab (0.1.86) (at least I think that was the case when submitting that patch as well). My mercurial version is 4.9-1.fc30 (from Fedora 30).

Flags: needinfo?(sgiesecke)

I've pushed this patch to phabricator-dev
The bug shows up.
Mercurial 5.2, MacOS.
Latest MozPhab.

I believe this issue is manifesting because the number of lines of context to show (of sqlite3.c) is greater than the hard-coded environment.MAX_CONTEXT_SIZE value (currently set to 4 * 1024 * 1024), so it is defaulting to 100. moz-phab expects the diff to include a single header and the full context (and thus strips the headers accordingly) and does not seem to support hunks larger than 4,194,304 lines. For those diffs, the context size defaults to 100, and causes multiple headers to be included in the diff and thus breaks the functionality.

For the record, sqlite3.c's unified diff is 8,116,000 lines. One quick workaround could be to split this into multiple changesets that do not span the length of the file, or temporarily change the hard-coded context size value here.

Successful patch and relevant section of the code.

Component: Phabricator → moz-phab
Priority: P2 → P3

I am not sure, maybe this can resolved as WONTFIX if it's only an issue for this huge file. This patch isn't going to land, RyanVM is usually updating SQLite. I only need to have that temporarily in the stack, since the following patches require a newer SQLite version. I am ni'ing Ryan, so maybe he can tell if it's an issue for him.

Flags: needinfo?(ryanvm)

Fair enough -- I do think there should be an error shown rather than defaulting to multiple hunks per diff, as a cleaner behaviour, so that no invalid diffs are sent to Phabricator. :zalun thoughts?

https://phabricator.services.mozilla.com/D76860 seems to be working without error, though I do use phabricator.py in case that matters.

Flags: needinfo?(ryanvm)

:RyanVM -- that would make a difference, as it's up to the client to parse hunks/diffs and send them via the API; I imagine phabricator.py is doing it differently.

The code for processing lines is just completely broken in the presence of multiple hunks:
https://github.com/mozilla-conduit/review/blob/a7fe91713fadc98ce149483e22682e773977cd79/mozphab/mercurial.py#L877-L926
It needs to process the diff output and create a separate Diff.Hunk for each hunk in the diff.

If you use --less-context and your changes are less than 200 lines apart you'll be fine since the hunks will overlap and merge into one, but any more than that and you'll end up with hunk headers inside a hunk's corpus, which causes the error on phab. It's lucky moz-phab uses 100 as the "less" context value, instead of arc's 3, else this problem might show up a bit more (though I'm guessing --less-context isn't used that much).

Summary: `Unexpected leading character` error → --less-context is broken in moz-phab | `Unexpected leading character "@"` error

I reported this issue originally, and I wasn't using --less-context. I think that should be filed as a separate bug, or at least the title of this one should make it clear that this does not only happen with --less-context.

Sure, but third_party/sqlite3/src/sqlite3.c is larger than 4 * 1024 * 1024, at which point moz-phab automatically turns on --less-context for that file, so if any hunks are more than 200 lines apart it'll trigger the bug. I've updated the title again to hopefully capture this (but left out the hunk distance part so it doesn't become too unwieldy).

Summary: --less-context is broken in moz-phab | `Unexpected leading character "@"` error → moz-phab is broken with --less-context / files above 4MB | `Unexpected leading character "@"` error

Ah, now I understand that. Thanks for clarifying that! I just thought it might be unrelated issues causing the same error message, but now I understand the connection!

Assignee: nobody → glob
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Regressions: 1680836
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: