992485 - Reduce the size of debug browser-chrome logs, exceeding the maximum log size and thus closing trunk and aurora trees

Reporter

Description

•

11 years ago

On mozilla-inbound, green Win7 browser-chrome logs are sizes like 52413446 bytes. The maximum log size is 52428800 bytes. On fx-team, because https://hg.mozilla.org/integration/fx-team/rev/e3564d9cd015 added four tests, we went from one or two of five Win7 debug b-c runs exceeding the maximum size to five of five. That's it, we're done. If, as is apparently the case with the bug 983948 thing with Linux debug browser-chrome timing out constantly, we're not going to do anything other than wait until the devtools tests get moved out of browser-chrome into their own suite, then we're not going to have any open desktop Firefox trunk trees until that happens. And no, you can't land your not-obviously-browser-chrome related patches because your thing isn't obviously related so you should be allowed to just ignore the problem.

Phil Ringnalda (:philor)

Reporter

Comment 1

•

11 years ago

Randomly, https://tbpl.mozilla.org/php/getParsedLog.php?id=37313795&tree=Mozilla-Inbound, a Win8 debug log exceeding the maximum size. Mac and Linux seem to be at a "comfortable" 45-47MB, but then, we have a lot of tests disabled on Mac and Linux.

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 2

•

11 years ago

Why do we have a maximum log size?

Phil Ringnalda (:philor)

Reporter

Comment 3

•

11 years ago

Buildbot something something bad things will happen to you if you increase it beyond that.

Phil Ringnalda (:philor)

Reporter

Comment 4

•

11 years ago

I'm pretty sure Dustin's actual explanation doesn't have the handwaving, can't remember whether there's an actual hard upper limit, or just the certainty that increasing it will eventually end with buildbot masters going up in flames.

Timothy Nikkel (:tnikkel)

Comment 5

•

11 years ago

I pulled a random win7 debug log. The most common line was INFO - [Parent 4056] WARNING: NS_ENSURE_TRUE(gIMM32Handler) failed: file c:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/widget/windows/nsIMM32Handler.cpp, line 254 with 20k, the next most common had 5k occurences, and the rest of the top 10 looked cross platform. Do we need that to be NS_ENSURE_TRUE? Can we just replace it with a quiet early return that doesn't note it in the log?

Flags: needinfo?(masayuki)

Flags: needinfo?(m_kato)

Timothy Nikkel (:tnikkel)

Comment 6

•

11 years ago

Attached patch make imm less spammy — Details — Splinter Review

What do you think?

Attachment #8402171 - Flags: review?(masayuki)

Attachment #8402171 - Flags: review?(m_kato)

Timothy Nikkel (:tnikkel)

Comment 7

•

11 years ago

This line http://hg.mozilla.org/mozilla-central/annotate/5fa70bd90a8b/browser/devtools/netmonitor/test/head.js#l214 is responsible for about 595 dumps that produce about 3465 bytes each for about 2mb total.

Ed Morley [:emorley]

Updated

•

11 years ago

Depends on: logspam

Ed Morley [:emorley]

Updated

•

11 years ago

Depends on: 984930

Ed Morley [:emorley]

Updated

•

11 years ago

Depends on: 819963

Phil Ringnalda (:philor)

Reporter

Comment 8

•

11 years ago

Gavin's going to get more with his https://hg.mozilla.org/integration/fx-team/rev/ed65a3fd835a for comment 7, but I should get around 1.2MB for https://hg.mozilla.org/integration/fx-team/rev/c527625c6727

Keywords: leave-open

Phil Ringnalda (:philor)

Reporter

Comment 9

•

11 years ago

That got us about two thirds of the way back to where aurora is, so if we're careful it maybe bought us a couple of weeks.

Bill Gianopoulos [:WG9s]

Comment 10

•

11 years ago

(In reply to Timothy Nikkel (:tn) from comment #7) > This line > http://hg.mozilla.org/mozilla-central/annotate/5fa70bd90a8b/browser/devtools/ > netmonitor/test/head.js#l214 > is responsible for about 595 dumps that produce about 3465 bytes each for > about 2mb total. Seems to me this output should more appropriately be created under control of a DEBUG_XXX define. It seems inappropriate for the automated tests.

Tim Taubert [:ttaubert] (inactive)

Updated

•

11 years ago

Depends on: 842511

Phil Ringnalda (:philor)

Reporter

Comment 11

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/ed65a3fd835a https://hg.mozilla.org/mozilla-central/rev/c527625c6727

Phil Ringnalda (:philor)

Reporter

Updated

•

11 years ago

Severity: blocker → critical

Dustin J. Mitchell [:dustin] (he/him)

Comment 12

•

11 years ago

The max log length is configurable (logMaxSize) and the current setting is pretty arbitrary. The problem is that the version of Buildbot Mozilla uses keeps that log in RAM, so increasing it to an arbitrarily large value may lead to OOMs on buildmasters when lots of builds are blabbering tens of megabytes at the same time. So: releng could increase this in the short term, but code changes to keep log lengths reasonable are the better long-term solution. I'd argue that anything outside of try shouldn't be more than, say, 5MB, but what do I know!

Makoto Kato [:m_kato]

Updated

•

11 years ago

Attachment #8402171 - Flags: review?(m_kato) → review+

Flags: needinfo?(m_kato)

Timothy Nikkel (:tnikkel)

Updated

•

11 years ago

Attachment #8402171 - Flags: review?(masayuki)

Flags: needinfo?(masayuki)

Timothy Nikkel (:tnikkel)

Comment 13

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/cc8f1cb05cbc

Girish Sharma [:Optimizer]

Comment 14

•

11 years ago

Here is one thought: In the whole log file, majority of the log size is coming from the full path of the test file before each info/is/ok assertion. This can be cleaned up as those full paths do not provide any extra information. So something like : 22:42:27 INFO - TEST-START | chrome://mochitests/content/browser/toolkit/devtools/server/tests/browser/browser_storage_dynamic_windows.js 22:42:28 INFO - console.log: GETDBNAME : file exists : true 22:42:28 INFO - console.log: opening file c:\users\cltbld~1.t-w\appdata\local\temp\tmpmntor9\storage\persistent\http+++test1.example.org\idb\1031932777i1db.sqlite 22:42:28 INFO - console.log: opening file c:\users\cltbld~1.t-w\appdata\local\temp\tmpmntor9\storage\persistent\http+++test1.example.org\idb\3686368546i2db.sqlite 22:42:28 INFO - console.log: returning db names for host http://test1.example.org idb1,idb2 22:42:28 INFO - console.log: GETDBNAME : file exists : false 22:42:28 INFO - TEST-PASS | chrome://mochitests/content/browser/toolkit/devtools/server/tests/browser/browser_storage_dynamic_windows.js | cookies storage actor is present 22:42:28 INFO - TEST-PASS | chrome://mochitests/content/browser/toolkit/devtools/server/tests/browser/browser_storage_dynamic_windows.js | Number of hosts for cookiesmatch 22:42:28 INFO - TEST-PASS | chrome://mochitests/content/browser/toolkit/devtools/server/tests/browser/browser_storage_dynamic_windows.js | Host test1.example.org is present ... 22:42:29 INFO - TEST-INFO | chrome://mochitests/content/browser/toolkit/devtools/server/tests/browser/browser_storage_dynamic_windows.js | Everything to be received is received. 22:42:29 INFO - TEST-INFO | chrome://mochitests/content/browser/toolkit/devtools/server/tests/browser/browser_storage_dynamic_windows.js | in clear 0 22:42:29 INFO - TEST-INFO | chrome://mochitests/content/browser/toolkit/devtools/server/tests/browser/browser_storage_dynamic_windows.js | in clear 1 22:42:32 INFO - TEST-INFO | MEMORY STAT vsize after test: 1333383168 22:42:32 INFO - TEST-INFO | MEMORY STAT vsizeMaxContiguous after test: 1893662720 22:42:32 INFO - TEST-INFO | MEMORY STAT residentFast after test: 412942336 22:42:32 INFO - TEST-INFO | MEMORY STAT heapAllocated after test: 159515792 22:42:32 INFO - INFO TEST-END | chrome://mochitests/content/browser/toolkit/devtools/server/tests/browser/browser_storage_dynamic_windows.js | finished in 5211ms can be converted to: 22:42:27 INFO - TEST-START | chrome://mochitests/content/browser/toolkit/devtools/server/tests/browser/browser_storage_dynamic_windows.js 22:42:28 INFO - console.log: GETDBNAME : file exists : true 22:42:28 INFO - console.log: opening file c:\users\cltbld~1.t-w\appdata\local\temp\tmpmntor9\storage\persistent\http+++test1.example.org\idb\1031932777i1db.sqlite 22:42:28 INFO - console.log: opening file c:\users\cltbld~1.t-w\appdata\local\temp\tmpmntor9\storage\persistent\http+++test1.example.org\idb\3686368546i2db.sqlite 22:42:28 INFO - console.log: returning db names for host http://test1.example.org idb1,idb2 22:42:28 INFO - console.log: GETDBNAME : file exists : false 22:42:28 INFO - TEST-PASS | cookies storage actor is present 22:42:28 INFO - TEST-PASS | Number of hosts for cookiesmatch 22:42:28 INFO - TEST-PASS | Host test1.example.org is present ... 22:42:29 INFO - TEST-INFO | Everything to be received is received. 22:42:29 INFO - TEST-INFO | in clear 0 22:42:29 INFO - TEST-INFO | in clear 1 22:42:32 INFO - TEST-INFO | MEMORY STAT vsize after test: 1333383168 22:42:32 INFO - TEST-INFO | MEMORY STAT vsizeMaxContiguous after test: 1893662720 22:42:32 INFO - TEST-INFO | MEMORY STAT residentFast after test: 412942336 22:42:32 INFO - TEST-INFO | MEMORY STAT heapAllocated after test: 159515792 22:42:32 INFO - INFO TEST-END | chrome://mochitests/content/browser/toolkit/devtools/server/tests/browser/browser_storage_dynamic_windows.js | finished in 5211ms This will potentially reduce the log file size to half, without loss of any meaningful information.

Carsten Book [:Tomcat]

Comment 15

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/cc8f1cb05cbc

:Felipe Gomes (needinfo for replies!)

Comment 16

•

11 years ago

(In reply to Girish Sharma [:Optimizer] from comment #14) > Here is one thought: > > In the whole log file, majority of the log size is coming from the full path > of the test file before each info/is/ok assertion. This can be cleaned up as > those full paths do not provide any extra information. So something like : Agree. Sometimes tests which fail to clean-up properly end up running listeners and checks during other tests, so it's useful to know which test is generating each output line. But having the test leaf name instead of the full path is probably enough and would still save a lot.

Girish Sharma [:Optimizer]

Comment 17

•

11 years ago

(In reply to :Felipe Gomes from comment #16) > (In reply to Girish Sharma [:Optimizer] from comment #14) > > Here is one thought: > > > > In the whole log file, majority of the log size is coming from the full path > > of the test file before each info/is/ok assertion. This can be cleaned up as > > those full paths do not provide any extra information. So something like : > > Agree. Sometimes tests which fail to clean-up properly end up running > listeners and checks during other tests, so it's useful to know which test > is generating each output line. But having the test leaf name instead of the > full path is probably enough and would still save a lot. iianw, the assertion methods is/ok/info etc are not available to the test after it is complete. So the test would simply go on giving out exceptions, which anyways will have the full url of the js file.

Dave Townsend [:mossop]

Comment 18

•

11 years ago

Wouldn't that make it more of a pain for sheriffs to figure out which components to file bugs in for test failures?

Girish Sharma [:Optimizer]

Comment 19

•

11 years ago

the test start and test end log still have names.

Dave Townsend [:mossop]

Comment 20

•

11 years ago

(In reply to Girish Sharma [:Optimizer] from comment #19) > the test start and test end log still have names. Yes, but that requires opening the log, sometimes the full log to find them

Girish Sharma [:Optimizer]

Comment 21

•

11 years ago

This should be an easy fix in the getLogExcerpt.php [0] file which shows the summary of failure when you click on failed run. [0] https://mxr.mozilla.org/webtools-central/source/tbpl/php/

Phil Ringnalda (:philor)

Reporter

Comment 22

•

11 years ago

Oh, convenient, this is still open, I'll borrow it for closure number two. Now it's Win8 debug, starting nearly simultaneously across mozilla-aurora and trunk trees, smelling a great deal like it's the result of one of the network dependencies that we keep letting slip in. We went from around 47.6MB on aurora to (so far) permanently over 52 across the span of a mobile/ push, a gaia push, and a push only touching LinuxNativeApp.js, so not a code change. Now we're running over (or only running a few thousand bytes under) on trunk trees too. Once again, everything's closed.

Severity: critical → blocker

OS: Windows 7 → Windows 8

Summary: Reduce the size of debug browser-chrome logs, exceeding the maximum log size and thus closing trunk trees → Reduce the size of debug browser-chrome logs, exceeding the maximum log size and thus closing trunk and aurora trees

:Gijs (he/him)

Comment 23

•

11 years ago

So this seems to be tests suddenly starting to hit the network. Causing stuff like this: 16:56:24 INFO - TEST-INFO | chrome://mochitests/content/browser/browser/base/content/test/general/browser_aboutHome.js | Console message: [JavaScript Warning: "Error in parsing value for 'unicode-bidi'. Declaration dropped." {file: "https://www.google.com/search?q=a+search&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:unofficial&client=firefox-a&channel=np&source=hp" line: 11 column: 63 source: "[dir='ltr'],[dir='rtl']{unicode-bidi:-moz-isolate;unicode-bidi:isolate}bdo[dir='ltr'],bdo[dir='rtl']{unicode-bidi:bidi-override;unicode-bidi:-moz-isolate-override;unicode-bidi:isolate-override}"}] and various other messages that echo all of google's minified CSS, and therefore fill up a lot of space really quickly and effectively. :-| Was there any change on infra which might explain why previously well-running tests are suddenly hitting the network without any code or test changes in the product they run on?

Phil Ringnalda (:philor)

Reporter

Comment 24

•

11 years ago

They've never been well-running (those are tests which perform a Google search, and then count on hitting browser.stop() quickly enough to prevent the search from actually happening), but the only possible infra change I can think of that would cause an increased frequency of not hitting stop() in time would be a very amusing "we made the network faster, so now you have less time to stop a request that you never should have created in the first place." I don't know what the actual cause for them is, but logs which manage to sneak under the limit seem to have a couple thousand instances of "WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80520012: file c:/builds/moz2_slave/m-aurora-w32-d-000000000000000/build/netwerk/base/src/nsFileStreams.cpp" while ones over the limit have more like 19000 of them.

Ryan VanderMeulen [:RyanVM]

Comment 25

•

11 years ago

browser_aboutHome.js was disabled by khuey. https://hg.mozilla.org/integration/mozilla-inbound/rev/86f9aa305c86

Ryan VanderMeulen [:RyanVM]

Comment 26

•

11 years ago

I filed bug 995041 to track fixing and re-enabling browser_aboutHome.js.

:Gavin Sharp [email: gavin@gavinsharp.com]

Comment 27

•

11 years ago

(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #25) > browser_aboutHome.js was disabled by khuey. > https://hg.mozilla.org/integration/mozilla-inbound/rev/86f9aa305c86 http://hg.mozilla.org/integration/mozilla-inbound/rev/6a21b8af35b7

Ryan VanderMeulen [:RyanVM]

Comment 28

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/86f9aa305c86 https://hg.mozilla.org/mozilla-central/rev/6a21b8af35b7

Erik Vold [:erikvold] (please needinfo? me)

Updated

•

11 years ago

Blocks: 993272

:Felipe Gomes (needinfo for replies!)

Updated

•

11 years ago

Depends on: 998861

Andrew McCreight [:mccr8]

Comment 29

•

9 years ago

The tree has been reopened for quite a while. If we start exceeding the size again, please needinfo me on the bug.

Status: NEW → RESOLVED

Closed: 9 years ago

Keywords: leave-open

Resolution: --- → FIXED