Closed Bug 625887 Opened 14 years ago Closed 13 years ago

Log parsing on the TBPL server

Categories

(Tree Management Graveyard :: TBPL, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mstange, Assigned: mstange)

References

Details

Attachments

(4 files, 1 obsolete file)

If we want to use Buildbot instead of Tinderbox as our data source, something needs to take over the log parsing that Tinderbox does at the moment. That is:
 - Generating a log summary (which is displayed by TBPL in the blue box) and
 - Creating the HTML pages for logs, with the summary at the top and clickable
   links to the error lines.

I think the TBPL server is a good place for that. Doing the parsing using JavaScript right in the TBPL client would be very bandwidth and power consuming. I want TBPL to work in phone browsers, too, and downloading whole logs is just not a viable option there.
Attached file example short log
In my local patch queue I've got some PHP code that can already do most of what's necessary. Here's an example output.
FWIW, I think parsing in the tbpl/bbpl client is a better approach.  Comment 0 points at two issues
 - bandwidth.  Some build logs are ridiculous, uselessly large.  We should just fix that.
 - client-side parsing perf on handhelds.  I have a prototype log parser in JS that parse at ~1GB/sec because of our regexp JIT.  That rate is near the same order of magnitude as my memory bandwidth.  The same should be true for fennec, which means it should be able to parse logs at >10MB/sec.  (Who cares about other browsers on handhelds ;) .)

Parsing on the server leaves the server vulnerable to the kinds of accidental DoS's that knock out tinderbox on occasion.
Comment 2 is more of a philosophical aside, though, not intended to stall patches here that would help kill off tinderbox.
(In reply to comment #2)
> I have a prototype log parser in JS
> that parse at ~1GB/sec because of our regexp JIT.  That rate is near the same
> order of magnitude as my memory bandwidth.

Impressive. And probably an order of magnitude faster than regexps in PHP...
But yeah, we can still reimplement this in JavaScript after Tinderbox is gone and our logs are shorter.
(Forgot to add: parsing on the server side also prevents hacking the parser, for most intents and purposes.  That's extremely useful for transient failures that otherwise have to be grepped by hand.)
(In reply to comment #2)
>  - bandwidth.  Some build logs are ridiculous, uselessly large.  We should just
> fix that.

Fixing at the source is always a good idea :-) But I’m out of ideas there, except for fixing compile warnings and not printing the compile command line unless there are actually warnings or errors.
Building with -s would be the biggest win, yeah.  We also should stop dumping the env on every build step.
We could also print relative paths for test filenames and so forth.
(In reply to comment #7)
> We also should stop dumping the env on every build step.

In the original build logs, those are in a separate stream, that you can switch on or not. Also they're less annoying if you're only looking at the log for the step, instead of seeing everything pasted together. See http://people.mozilla.org/~axel/logs-example/#make_unpack for a mock-up of how the two relate.
The war-on-orange ElasticSearch db has some of this info already.  E.g., see

http://elasticsearch1.metrics.sjc1.mozilla.com:9200/logs/testfailures/_search?q=starttime:1295197392

(you'll need to be connected to MTP)

Using this for the stuff in the blue box would make it very fast, since nothing has to be parsed, it's just a REST query.

Some issues, though:  we only store this info for mozilla-central at present.  We may expand this to other trees, but it's not clear we'll be able to store this for all trees that tbpl is responsible for.  And, this is not quite real-time, there is some delay in getting this info into ES.

One possible approach is to query ES to see if it has data available for the log in question, and if so, use it, otherwise use the slower tbpl-server-parse approach outlined in previous comments.
Would it not actually be possible to get buildbot or some of the builders generating this data?

For example, unit test builders already parse some of the output so they know how many tests have passed/failed, surely they could upload a second, reduced, log with just the test results in?

Then you could still maintain the full log for those situations where people need to analyse in detail, but have smaller logs/files generated for test summaries and maybe html page summaries all on the builders - hence generated at source.
Blocks: 644869
The new PHP files will follow this rule: Files that are meant to be used by the outer world go into php/ directly, and files that are only included (which will be the lion's share) will go into php/inc/.
Attachment #532222 - Flags: review?(arpad.borsos)
I'm only separating this PHP file from all the others because it shows how things are going to work at a fundamental level:
Every result from log parsing will be cached in the local file system, and every cached thing is only generated once. Two parallel PHP execution threads are prevented from generating the same thing simultaneously by file system based synchronization, which is provided in this class.

Also, the cached files will all be gzipped. And they'll be sent to the browser in gzipped form.
Attachment #532225 - Flags: review?(arpad.borsos)
Have fun.

The entry points here are getParsedLog.php and getLogExcerpt.php. The cache directory structure looks like this:

cache/
  rawlog/
    4632818.txt.gz
  excerpt/
    4632818_unittest.txt.gz
    4632818_tinderbox_print.txt.gz
    4632818_reftest.txt.gz
  annotatedsummary/
    4632818_unittest.txt.gz
  parsedlog/
    4632818_full.txt.gz
    4632818_short.txt.gz

Security considerations:
 - the results of the get*.php files are public and have liberal access
   control headers.
 - database and file system contents are trusted
 - there are no secrets in the database
 - cookies aren't involved anywhere
 - user data enters into the system in three places via $_GET: in
   getParsedLog.php, in getLogExcerpt.php, and in inc/RunForLog.php.
 - file systetm access happens in ParallelFileGenerating.php, GzipUtils.php
   and RawGzLogDownloader.php
 - file names / paths are generated in RawGzLogDownloader::getLines,
   AnnotatedSummaryGenerator::ensureAnnotatedSummaryExists,
   LogParser::ensureExcerptExists and ParsedLogGenerator::ensureLogExists
   and only make use hardcoded strings and numbers (run IDs) from the database
 - connections to other servers are only established in one place: Logs are
   downloaded per FTP from ftp.mozilla.org (hardcoded) in
   RawGzLogDownloader.php
Attachment #532230 - Flags: review?(arpad.borsos)
Comment on attachment 532222 [details] [diff] [review]
move JSON.php into an inc/ folder [checked in]

Review of attachment 532222 [details] [diff] [review]:
-----------------------------------------------------------------

Hm, neither splinter nor bz diff view shows me the rename lines :(
Attachment #532222 - Flags: review?(arpad.borsos) → review+
Comment on attachment 532225 [details] [diff] [review]
add php/inc/ParallelFileGenerating.php

Review of attachment 532225 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the comment on flock()
Attachment #532225 - Flags: review?(arpad.borsos) → review+
Blocks: 656919
Comment on attachment 532230 [details] [diff] [review]
rest of the PHP gunk: log parsing and summary generation

Review of attachment 532230 [details] [diff] [review]:
-----------------------------------------------------------------

Very good patch, r+ with nits fixed. What do we do about the timezone issue?

::: php/getLogExcerpt.php
@@ +30,5 @@
> +    $rawErrorSummaryFilename = $logParser->ensureExcerptExists();
> +    if ($type != "annotated") {
> +      GzipUtils::passThru($rawErrorSummaryFilename, 'text/plain');
> +    } else {
> +      date_default_timezone_set('America/Los_Angeles');

Is there a way of doing the timezone resolution clientside to avoid hardcoding Mountain View times that do not mean anything to people not in MV?
Or maybe now that I see this is used as data attribute, it may be fine this way as long as it is not shown in the UI.

::: php/inc/FullLogGenerator.php
@@ +40,5 @@
> +  protected function generateHTML($summary, $fullLog) {
> +    $lines = $this->logParser->getLines();
> +    $linesWithErrors = $this->logParser->getFilteredLines();
> +    date_default_timezone_set('America/Los_Angeles');
> +    $logDescription = $this->machineType.' on '.date("Y-m-d H:i:s", $this->startTime);

For this static html, I guess we can’t really do anything about the timezone, but please include the timezone name there so people at least have a clue how to translate it :-D

@@ +43,5 @@
> +    date_default_timezone_set('America/Los_Angeles');
> +    $logDescription = $this->machineType.' on '.date("Y-m-d H:i:s", $this->startTime);
> +    $revLink = '<a href="../?branch='.$this->branch.'&rev='.$this->revision.'">push '.$this->revision.'</a>';
> +    return "<!DOCTYPE html>\n".
> +      "<html lang=en>\n".

Quote that attribute.

::: php/inc/GeneralErrorFilter.php
@@ +12,5 @@
> +
> +class GeneralErrorFilter implements LineFilter {
> +
> +  public function getType() {
> +    return "unittest";

As the Filter is named GeneralError, that should also be reflected inside the type. “unittest” seems to be a relic from tinderbox times, and as I see it, this filter is also used for builds and other types of jobs so unittest is not a good fit here.

@@ +25,5 @@
> +       || preg_match("/^C /", $line)  // . . . . . . . . . . . . . . . . . . . cvs merge conflict
> +       || preg_match("/Automation Error\:/", $line) // . . . . . . . . . . . . Release Automation
> +       || preg_match("/ error\([0-9]*\)\:/", $line) // . . . . . . . . . . . . C
> +       || preg_match("/\[checkout aborted\]/", $line)   // . . . . . . . . . . cvs
> +       || preg_match("/\: cannot find module/", $line)  // . . . . . . . . . . cvs

Do we really care about cvs anymore?

::: php/inc/LogParser.php
@@ +30,5 @@
> +  public function getExcerpt($asHTML = false) {
> +    $lines = $this->getLines();
> +    $filteredLines = $this->getFilteredLines();
> +    $excerptLines = array();
> +    foreach ($filteredLines as $errorNumber => $errorLine) {

As this seems to be a generalized file, I don’t think calling these vars “error” is a good idea, just make it $i => $lineno or something generic.

@@ +38,5 @@
> +      $excerptLines[] = $asHTML ?
> +        '<a href="#error'.$errorNumber.'">'.htmlspecialchars($lineMatch).'</a>' :
> +        $lineMatch;
> +    }
> +    return implode('', $excerptLines);

LineFilter->matchLine() returns the line with a trailing \n also the lines inside $lines do have a trailing \n as far as I see it.
Having the only whitespace inside the a tag in case of $asHTML is not really nice, but as I see that is inside a <pre> it seems acceptable.

::: php/inc/RunForLog.php
@@ +8,5 @@
> +  $mongo = new Mongo();
> +  $run = $mongo->tbpl->runs->findOne(array('_id' => +$_GET["id"]));
> +  if (!$run)
> +    die("Unknown run ID.");
> +  if (!isset($run['log']) || !$run['log'])

if (empty()) is what you want.

::: php/inc/ShortLogGenerator.php
@@ +90,5 @@
> +  protected function generateHTML($summary, $shortLog) {
> +    $lines = $this->logParser->getLines();
> +    $linesWithErrors = $this->logParser->getFilteredLines();
> +    date_default_timezone_set('America/Los_Angeles');
> +    $logDescription = $this->machineType.' on '.date("Y-m-d H:i:s", $this->startTime);

Same as for full log.

@@ +93,5 @@
> +    date_default_timezone_set('America/Los_Angeles');
> +    $logDescription = $this->machineType.' on '.date("Y-m-d H:i:s", $this->startTime);
> +    $revLink = '<a href="../?branch='.$this->branch.'&rev='.$this->revision.'">push '.$this->revision.'</a>';
> +    $header = "<!DOCTYPE html>\n".
> +      "<html lang=en>\n".

Quotes please.

@@ +98,5 @@
> +      "<title>Log - ".$logDescription."</title>\n".
> +      "<h1>Log</h1>\n".
> +      '<p class="subtitle">'.$logDescription." for ".$revLink."</p>\n".
> +      '<p class="viewFullLog"><a href="?id='.$this->runID.'&amp;full=1">View Full Log</a></p>'.
> +      '<p class="downloadFullLog"><a href="'.$this->logURL.'">Download Full Log</a></p>';

Maybe name that Raw Log or something, as it has no relation to the View Full Log above.

@@ +108,5 @@
> +        $shortLog;
> +    return $header.
> +      "<p>No errors or warnings found. See below for the full log.</p>\n".
> +      "<h2>Full Log</h2>\n".
> +      "<pre>".htmlspecialchars(implode("", $lines))."</pre>";

Do we really want to show the full log if there are no errors?
Attachment #532230 - Flags: review?(arpad.borsos) → review+
It's not at all the case that MVT is meaningless to people elsewhere. It's only partly a joke to call it Mozilla Standard Time - it's the timezone that all the hardware runs in, and the times in logs are in, and it can be very significant that a run started at 23:55 machine time. And since we've caught bugs by running tests with a timezone offset from GMT, and one which observes DST (like our users do), that's not likely to change. If we needed to drop one time, and only show one in the UI, it would be better to drop local, and only show the MST time.
Comment on attachment 532222 [details] [diff] [review]
move JSON.php into an inc/ folder [checked in]

http://hg.mozilla.org/users/mstange_themasta.com/tinderboxpushlog/rev/ebc45e912488
Attachment #532222 - Attachment description: move JSON.php into an inc/ folder → move JSON.php into an inc/ folder [checked in]
(In reply to comment #18)
> It's not at all the case that MVT is meaningless to people elsewhere. It's
> only partly a joke to call it Mozilla Standard Time - it's the timezone that
> all the hardware runs in, and the times in logs are in, and it can be very
> significant that a run started at 23:55 machine time. And since we've caught
> bugs by running tests with a timezone offset from GMT, and one which
> observes DST (like our users do), that's not likely to change. If we needed
> to drop one time, and only show one in the UI, it would be better to drop
> local, and only show the MST time.

Well you are right, MST is universally used when communicating (as in “look what the build at 20:12 did”). So going with that is the most sensible idea.
(In reply to comment #17)
> ::: php/getLogExcerpt.php
> @@ +30,5 @@
> > +    $rawErrorSummaryFilename = $logParser->ensureExcerptExists();
> > +    if ($type != "annotated") {
> > +      GzipUtils::passThru($rawErrorSummaryFilename, 'text/plain');
> > +    } else {
> > +      date_default_timezone_set('America/Los_Angeles');
> 
> Is there a way of doing the timezone resolution clientside to avoid
> hardcoding Mountain View times that do not mean anything to people not in MV?
> Or maybe now that I see this is used as data attribute, it may be fine this
> way as long as it is not shown in the UI.

Right, this one is not shown in the UI. It only appears in the comments left by TBPLBot.

> ::: php/inc/FullLogGenerator.php
> @@ +40,5 @@
> > +  protected function generateHTML($summary, $fullLog) {
> > +    $lines = $this->logParser->getLines();
> > +    $linesWithErrors = $this->logParser->getFilteredLines();
> > +    date_default_timezone_set('America/Los_Angeles');
> > +    $logDescription = $this->machineType.' on '.date("Y-m-d H:i:s", $this->startTime);
> 
> For this static html, I guess we can’t really do anything about the
> timezone, but please include the timezone name there so people at least have
> a clue how to translate it :-D

Added "T" in the date string (which currently translates to "PDT").

> @@ +43,5 @@
> > +    date_default_timezone_set('America/Los_Angeles');
> > +    $logDescription = $this->machineType.' on '.date("Y-m-d H:i:s", $this->startTime);
> > +    $revLink = '<a href="../?branch='.$this->branch.'&rev='.$this->revision.'">push '.$this->revision.'</a>';
> > +    return "<!DOCTYPE html>\n".
> > +      "<html lang=en>\n".
> 
> Quote that attribute.

Done.

> ::: php/inc/GeneralErrorFilter.php
> @@ +12,5 @@
> > +
> > +class GeneralErrorFilter implements LineFilter {
> > +
> > +  public function getType() {
> > +    return "unittest";
> 
> As the Filter is named GeneralError, that should also be reflected inside
> the type. “unittest” seems to be a relic from tinderbox times, and as I see
> it, this filter is also used for builds and other types of jobs so unittest
> is not a good fit here.

Renamed to "general_error".

> @@ +25,5 @@
> > +       || preg_match("/^C /", $line)  // . . . . . . . . . . . . . . . . . . . cvs merge conflict
> > +       || preg_match("/Automation Error\:/", $line) // . . . . . . . . . . . . Release Automation
> > +       || preg_match("/ error\([0-9]*\)\:/", $line) // . . . . . . . . . . . . C
> > +       || preg_match("/\[checkout aborted\]/", $line)   // . . . . . . . . . . cvs
> > +       || preg_match("/\: cannot find module/", $line)  // . . . . . . . . . . cvs
> 
> Do we really care about cvs anymore?

Don't think so. Removed those lines.
> 
> ::: php/inc/LogParser.php
> @@ +30,5 @@
> > +  public function getExcerpt($asHTML = false) {
> > +    $lines = $this->getLines();
> > +    $filteredLines = $this->getFilteredLines();
> > +    $excerptLines = array();
> > +    foreach ($filteredLines as $errorNumber => $errorLine) {
> 
> As this seems to be a generalized file, I don’t think calling these vars
> “error” is a good idea, just make it $i => $lineno or something generic.

Done.

> @@ +38,5 @@
> > +      $excerptLines[] = $asHTML ?
> > +        '<a href="#error'.$errorNumber.'">'.htmlspecialchars($lineMatch).'</a>' :
> > +        $lineMatch;
> > +    }
> > +    return implode('', $excerptLines);
> 
> LineFilter->matchLine() returns the line with a trailing \n also the lines
> inside $lines do have a trailing \n as far as I see it.
> Having the only whitespace inside the a tag in case of $asHTML is not really
> nice

I've added a new method "linkLine" which moves the \n after the </a>.

> ::: php/inc/RunForLog.php
> @@ +8,5 @@
> > +  $mongo = new Mongo();
> > +  $run = $mongo->tbpl->runs->findOne(array('_id' => +$_GET["id"]));
> > +  if (!$run)
> > +    die("Unknown run ID.");
> > +  if (!isset($run['log']) || !$run['log'])
> 
> if (empty()) is what you want.

Oh, good. I'll use that in bug 656902 (getRevisionBuilds.php), too.

> @@ +98,5 @@
> > +      "<title>Log - ".$logDescription."</title>\n".
> > +      "<h1>Log</h1>\n".
> > +      '<p class="subtitle">'.$logDescription." for ".$revLink."</p>\n".
> > +      '<p class="viewFullLog"><a href="?id='.$this->runID.'&amp;full=1">View Full Log</a></p>'.
> > +      '<p class="downloadFullLog"><a href="'.$this->logURL.'">Download Full Log</a></p>';
> 
> Maybe name that Raw Log or something

Done.

> @@ +108,5 @@
> > +        $shortLog;
> > +    return $header.
> > +      "<p>No errors or warnings found. See below for the full log.</p>\n".
> > +      "<h2>Full Log</h2>\n".
> > +      "<pre>".htmlspecialchars(implode("", $lines))."</pre>";
> 
> Do we really want to show the full log if there are no errors?

Yes, at least I do :-)
That's what Tinderbox did, so people are used to it. Moreover, I don't think showing nothing would make much sense: you usually open a log if you want to figure out what failed, and an empty log won't help you there.
Attachment #532230 - Attachment is obsolete: true
Blocks: 658543
Depends on: 661365
Depends on: 669000
Depends on: 682914
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Product: Webtools → Tree Management
Product: Tree Management → Tree Management Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: