Closed Bug 934783 Opened 7 years ago Closed 7 years ago
Fix and re-enable the GPUAdapter
Bug 917496 will disable the GPUAdapterReporter because the "gpu-shared" and "gpu-dedicated" values are often bogus (i.e. huge negative values) on Win7 debug builds, which triggers an assertion in about:memory. They seems to be ok on other Windows builds. toolkit/components/aboutmemory/tests/test_aboutmemory5.xul (which bug 929797 will add) demonstrates the problem -- it fails more than half the time with a message like this: ERROR TEST-UNEXPECTED-FAIL | chrome://mochitests/content/chrome/toolkit/components/aboutmemory/tests/test_aboutmemory5.xul | Assertion count 1 is greater than expected range 0-0 assertions. See https://tbpl.mozilla.org/?tree=Try&rev=b9a80850a174 for an example try trun showing the test failures (though note that this link will expire in a few days or maybe weeks). I suspect that there is a problem with the structs in gfx/thebes/d3dkmtQueryStatistics.h, but I could be wrong.
7 years ago
OS: Mac OS X → Windows 7
7 years ago
No longer blocks: 929797
Finally got together a build on a Win7 machine. I seem to be able to always have bad values from child processes but not the main process. I am also regularly hitting what appears to be a some sort of race condition when running this test. Everything just halts with the child windows open for about a minute until a stack trace is output in the console. Other times there is just a stack trace from the child process as it was closing. As for the bogus values, if I make the test leave open the child processes and re-measure the memory moments later there are no bogus values.
I have found what looks to be a hack around this problem in the process hacker code. BytesCommitted on segment info needs to be recast to ULONG from its normal ULONGLONG which seem to make the value 0 all the time for Win7 sub-processes. I am not sure why this is needed but it the same values were seen in both process hacker and process explorer so I believe it should be ok. Still WIP as I am going to check its all still working on Win8 later then remove the 'reporter disabled' comments. njn: Would you mind checking that this does in fact fix the problem and/or send it to try?
Assignee: nobody → hughnougher
Status: NEW → ASSIGNED
I don't have a Windows machine suitable for this. And I believe you have try access now? It'd be quicker and simpler for everyone if you did the try run.
(In reply to Hugh Nougher [:Hughman] from comment #2) > BytesCommitted on segment info needs to be recast to ULONG from > its normal ULONGLONG which seem to make the value 0 all the time for Win7 > sub-processes. I had a hard time parsing this sentence. Do you mean this fix makes the value always 0 for all sub-processes, instead of some really high bogus value? In other words, the high 32 bits get set to some bogus value for sub-processes, and by casting you're setting them to zero? Or do you mean they're always zero *without* the fix? If the former, I personally think |bytesCommitted &= 0xffffffffULL;| would be clearer than the cast :)
Oh, and does this limit the amount of memory that can be reported for the main process to 32-bits as well? That seems like it could be bad on 64-bit builds (though those aren't tier1 right now) in a few years if someone gets a bad leak.
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #4) > Do you mean this fix makes the value always 0 for all > sub-processes, instead of some really high bogus value? > In other words, the high 32 bits get set to some bogus value for > sub-processes, and by casting you're setting them to zero? Or do you mean > they're always zero *without* the fix? The result of the cast is that 0 has appeared every time for both gpu-dedicated and gpu-shared on sub-processes. From my investigation over 3 machine I have access to I found Win8 never showed any problems and Win7 only showed it on one of my configurations. No idea why. The machine that was able to reproduce showed about 50% of the time a segment reported a value over 2^32 and the other times it was zero. Given there seemed to be an average 3 segments per process, some interesting numbers showed up. I never saw any problems with the main process leading me to believe it must be something to do with how Firefox shares surface(s) from the main process to the children. > I personally think > |bytesCommitted &= 0xffffffffULL;| would be clearer than the cast :) If that does the same as casting then I agree that it is clearer. I was just going with what was done at http://sourceforge.net/p/processhacker/code/HEAD/tree/2.x/trunk/plugins/ExtendedTools/gpumon.c#l434 (In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #5) > Oh, and does this limit the amount of memory that can be reported for the > main process to 32-bits as well? That seems like it could be bad on 64-bit > builds (though those aren't tier1 right now) in a few years if someone gets > a bad leak. The fix does not seem to be needed in Win8 as it always gives a proper ULL value so future 64bit will be OK (so far). (In reply to Nicholas Nethercote [:njn] from comment #3) > If the former, I believe you have try access now? Hmm. I have not heard anything about this. Unless it comes with editbugs access? I will attempt it tomorrow in any case. PS: I'm going to attempt to see if changing the segment info struct could give other results tomorrow but my hopes are not high as the main process is giving proper results.
> > If the former, I believe you have try access now? > Hmm. I have not heard anything about this. Unless it comes with editbugs > access? I will attempt it tomorrow in any case. Oh, I mixed up editbugs and try access. We should definitely give you level 1 (try) access. Details are at https://www.mozilla.org/hacking/committer/. File a bug to get the access. Bug 710590 is an example you can copy.
I didn't realise so much time had passed since the last time I had a patch. Anyway, new patch. Main changes compared to WIP is there is not an explicit cast to ULONG any more as that logic was moved to the stuct making naming obvious. I have to conclude that this strange anomaly only effects some configurations of Win7 and most likely versions of drivers. Tested on try a few times with minor changes, the most recent being https://tbpl.mozilla.org/?tree=Try&rev=1175fb96eca3
Comment on attachment 8409268 [details] [diff] [review] Patch Review of attachment 8409268 [details] [diff] [review]: ----------------------------------------------------------------- Apologies for the review delay. I don't really have any idea about these magic Windows structs, but if try server says it's getting reasonable results, that's good enough for me. Thanks.
Attachment #8409268 - Flags: review?(n.nethercote) → review+
Fixed commit message, carried review forward.
Comment on attachment 8413319 [details] [diff] [review] Patch Final Please just use checkin-needed :)
Attachment #8413319 - Flags: checkin? → checkin+
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla31
You need to log in before you can comment on or make changes to this bug.