Closed Bug 1328880 Opened 7 years ago Closed 7 years ago

Log line numbers differ between Treeherder and the unified log viewer

Categories

(Tree Management :: Treeherder, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: RyanVM, Assigned: emorley)

References

Details

Not sure if this only affects Windows Taskcluster builds or others as well. See the log below:
https://treeherder.mozilla.org/logviewer.html#?job_id=66423492&repo=try&lineNumber=10049

The upper box shows the failures starting at line 10049, but line 10049 down below appears to be something entirely different.
Ok so this is strange. For this error line:

13:16:17     INFO -  z:/build/build/src/gfx/harfbuzz/src/hb-ot-layout.cc(190): error C3861: 'hb_ot_shaper_face_data_ensure': identifier not found

grepping the raw log shows it as being on line 10049:

$ grep -Hn 'hb-ot-layout.cc(190): error C3861' live_backing.log
live_backing.log:10049:13:16:17     INFO -  z:/build/build/src/gfx/harfbuzz/src/hb-ot-layout.cc(190): error C3861: 'hb_ot_shaper_face_data_ensure': identifier not found

Treeherder's log parser says 10049, agreeing with grep.

However Sublime Text 3 displays it on line 10295, and the unified log viewer says line 10297.

For reference:
* unified log viewer iframe target: https://taskcluster.github.io/unified-logviewer/?url=https://queue.taskcluster.net/v1/task/EPiPaSGqTOSM55lrgm2Z_w/runs/0/artifacts/public/logs/live_backing.log&lineHeight=13&lineNumber=10049&highlightStart=10049&highlightEnd=10049
* Raw log: https://queue.taskcluster.net/v1/task/EPiPaSGqTOSM55lrgm2Z_w/runs/0/artifacts/public/logs/live_backing.log
* Treeherder error steps (note these are zero based, so +1 to get what you see in the UI; bug 1328455) https://treeherder.mozilla.org/api/project/try/jobs/66423492/text_log_steps/
There are ansi escape sequences in the log (from the Hg progress bar printouts) - I believe some tools treat lines separated by these are separate lines, others not.

eg this should be one line, not several merged:

$ grep -Hn '1116497532/1682282333' live_backing.log
live_backing.log:67:clone [>                                               ]   52282940/1682282333clone [=>
                                       ]  100600264/1682282333 48sclone [==>
       ]  124223466/1682282333 51sclone [==>                                       ]  135834830/1682282333 1m01sclone [==>                                       ]  146062334/1682282333 1m07sclone [===>
                   ]  172637105/1682282333 1m04sclone [===>                                      ]  195402921/1682282333 1m04sclone [====>                                     ]  213024039/1682282333 1m05sclone [====>
                                 ]  232136530/1682282333 1m05sclone [=====>
 ]  245702771/1682282333 1m07sclone [=====>                                    ]  253259664/1682282333 1m10sclone [======>                                   ]  280914949/1682282333 1m07sclone [======>
               ]  293164457/1682282333 1m09sclone [======>                                   ]  315289004/1682282333 1m07sclone [=======>                                  ]  322761275/1682282333 1m10sclone [=======>
                             ]  330557041/1682282333 1m12sclone [=======>                                  ]
341263707/1682282333 1m13sclone [=======>                                  ]  357524859/1682282333 1m13sclone [========>
Unless all of the tools we have that interact with the logs all use the same control character stripping behaviour it seems that the best solution is to just avoid outputting them to the log in the first place.

gps, thoughts on turning off the Hg progress bar - or at least making it not use control characters?
Blocks: 1315808
Flags: needinfo?(gps)
Priority: -- → P1
Summary: Disagreement between failure summary and actual log line numbers on Windows TC builds → Log line numbers differ between Treeherder and the unified log viewer
The hg progress bar is configured to print every 1.0s so we have a good sense of how long certain operations take to complete. I'd strongly prefer to retain it.

The hg progress bar currently prints CR (\r) characters on every update. That should be the only thing resembling a "control character" emitted by Mercurial in automation. If you see otherwise, that's a bug.

I was going to say the fix should be for `run-task` to normalize \r to \n. But, it already does this at https://hg.mozilla.org/mozilla-central/annotate/845cc4dea57f/taskcluster/docker/recipes/run-task#l69 and I now see this bug is about Windows.

So, now I'm going to say the preferred solution is to get Windows TC to use run-task or an analog[ue]. This is long overdue IMO because it will help establish uniformity between all the platforms.

Anyway, you probably won't like the scope bloat of adding run-task to Windows. So the easy solution is to disable the Mercurial progress bars by either a) removing progress.assume-tty=true b) setting progress.disable=true. Alternatively, maybe there's some Windows [Power]shell magic you can do to translate the \r to either \n or \r\n.

Longer term, upstream Mercurial could be taught to emit alternate line endings. Or, a custom extension could be written to normalize bare \r to something else.
Flags: needinfo?(gps)
Assignee: nobody → emorley
Fixed as part of switching from urllib2 to requests in bug 1165356, by removing the custom gzip streaming implementation which was incorrectly splitting lines that didn't use LF as a line ending:
https://github.com/mozilla/treeherder/commit/6717d9f5fc71ff2477968b827a55344d765e13bd#diff-f7ee472a3c7a80338868b13567e6ce13L108

That commit is now on stage, and will be on prod in the next few days. (Note: It only fixes newly ingested logs, not existing)
Status: NEW → RESOLVED
Closed: 7 years ago
Depends on: 1165356
Resolution: --- → FIXED
Component: Treeherder: Log Viewer → TreeHerder
You need to log in before you can comment on or make changes to this bug.