Closed Bug 718169 Opened 13 years ago Closed 13 years ago

Missing symbols for ntdll.dll 6.1.7601.17725 and 6.0.6002.18541

Categories

(Toolkit :: Crash Reporting, defect)

Type: defect
Priority: Not set
Severity: normal

RESOLVED FIXED

People

(Reporter: scoobidiver, Assigned: ted)

Please add 'ntdll\.dll@0x.' to the skiplist as prefixSignatureRegEx.
Ted, I'm somewhat confused that a number of ntdll.dll signatures are popping up at all. I thought we could resolve all of those via the MS symbol servers. Have we been missing something there recently?
OK, I'm taking this off the skiplist request list. The problem is that we are missing symbols for an apparently new version of ntdll.dll:

ntdll.dll 	6.1.7601.17725 	D74F79EB1F8D4A45ABCD2F476CCABACC2 	wntdll.pdb


Skiplisting everything is no solution; we need to resolve the real problem, which is this missing symbol file.
Component: Infra → Breakpad Integration
Product: Socorro → Toolkit
QA Contact: infra → breakpad.integration
Summary: [skiplist] Add 'ntdll\.dll@0x.' to the prefixSignatureRegEx → Missing symbols for ntdll.dll 6.1.7601.17725
Now in a different report I'm seeing:
ntdll.dll 	6.1.7601.17725 	093D2CD7F95B4CC6B5318D405CC315662 	ntdll.pdb

Same version, different pdb name and debug ID? Smells strange.
That is odd, but I suppose it's possible that Microsoft issued a patch for the binary that didn't change its version number.

I'm looking into this. The symbol upload script appears to be running and attempting to upload things, but they don't seem to be getting to the symbol server.
Assignee: nobody → ted.mielczarek
It looks like the script has been screwed up for a while now. I logged on to the VM where it runs, and it was missing the read-only network mount where it can see the existing symbols. The log was full of attempts to upload 10,000+ symbol files at once, which were probably (silently) failing. I fixed the mount and re-ran it, and it successfully uploaded ~1000 symbol files.
This particular symbol file got uploaded, so I think the issue is fixed.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
(In reply to Ted Mielczarek [:ted, :luser] from comment #6)
> This particular symbol file got uploaded, so I think the issue is fixed.
The last crash occurs at 23:02 on January 16th.
Did you reprocess one day of crashes or is there something wrong with dates in Socorro?
(In reply to Scoobidiver from comment #7)
> The last crash occurs at 23:02 on January 16th.
Now the last crash occurs at 23:59 on January 17th.
Are all crashes of a PST day hidden in Socorro?
(In reply to Scoobidiver from comment #8)
> Now the last crash occurs at 23:59 on January 17th.
If no date and time are specified, it uses the current day at 0:00 PST.

It still happens:
https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A9.0.1&query_search=signature&query_type=contains&query=ntdll.dll%400x38dc9&reason_type=contains&date=01%2F18%2F2012%2002%3A46%3A51&range_value=1&range_unit=weeks&hang_type=any&process_type=any&do_query=1&signature=ntdll.dll%400x38dc9

There are also new crashes at ntdll.dll@0x39ef1 (6.1.7601.17725) and other addresses.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Blocks: 718944
Summary: Missing symbols for ntdll.dll 6.1.7601.17725 → Missing symbols for ntdll.dll 6.1.7601.17725 and 6.0.6002.18541
Status: REOPENED → NEW
Yes, this isn't going to fix everything right away. It should fix most new crashes. After it has a chance to run a few more times it should get the remaining symbols.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
(In reply to Scoobidiver from comment #8)
> (In reply to Scoobidiver from comment #7)
> > The last crash occurs at 23:02 on January 16th.
> Now the last crash occurs at 23:59 on January 17th.
> Are all crashes of a PST day hidden in Socorro?

1) What makes you think it's PST? It might actually be UTC now.

2) Anything linked from topcrasher reports will show full days according to UTC. A search should show you all processed crashes up to now (unless you give it a different date to start with).
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #11)
> 1) What makes you think it's PST? It might actually be UTC now.
At 8:38 CET (PST+9), it showed me crashes until January 16th.
At 10:41 CET, it showed me crashes until January 17th.
In addition, when I click Advanced search now (see https://crash-stats.mozilla.com/query?advanced=1), the Before field is filled in with 01/18/2012 05:34:53, while the current time in CET is 14:34.
(In reply to Ted Mielczarek [:ted, :luser] from comment #10)
> Yes, this isn't going to fix everything right away. It should fix most new
> crashes.
It doesn't fix new crashes:
https://crash-stats.mozilla.com/report/list?range_value=7&range_unit=days&signature=ntdll.dll%400x38dc9&version=Firefox%3A9.0.1
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Hm, ok. The symbol file for that is definitely present now:
[tmielczarek@dm-symbolpush01 symbols_os]$ ls -l wntdll.pdb/D74F79EB1F8D4A45ABCD2F476CCABACC2
total 1220
-rw-r--r-- 1 symbolfetch users 1244266 Jan 17 10:39 wntdll.sym
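
For reference, the directory listed above follows the layout visible in the module lines earlier in this bug: debug file name, then debug ID, then a .sym file named after the debug file. A minimal sketch of that mapping, where the function name `symbol_path` is hypothetical and not taken from any script mentioned here:

```python
from pathlib import PurePosixPath


def symbol_path(debug_file: str, debug_id: str) -> str:
    """Build '<debug_file>/<debug_id>/<name>.sym' (hypothetical helper).

    The layout is inferred from the paths quoted in this bug, e.g.
    wntdll.pdb/D74F79EB1F8D4A45ABCD2F476CCABACC2/wntdll.sym.
    """
    # Drop a trailing ".pdb" so that wntdll.pdb maps to wntdll.sym.
    if debug_file.lower().endswith(".pdb"):
        stem = debug_file[:-4]
    else:
        stem = debug_file
    return str(PurePosixPath(debug_file) / debug_id / (stem + ".sym"))


print(symbol_path("wntdll.pdb", "D74F79EB1F8D4A45ABCD2F476CCABACC2"))
```

Under this layout, the two module lines reported for version 6.1.7601.17725 (wntdll.pdb and ntdll.pdb) resolve to two different directories, so each needs its own symbol file.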

It's possible this is a problem with the symbol syncing from SJC->PHX. Jabba: were you the one that had investigated how Aravind's symbol syncing scripts worked? It looks like some symbols in /mnt/netapp/breakpad/symbols_os might not be getting synced.
I think it was jakem that looked at this in the past. CC'ing him and jason as well.
I see this symbol on the SJC side, but not PHX. It appears that the incremental sync isn't picking this up.

The incremental sync works by checking for new *-symbols.txt files inside the symbols dirs (symbols_os, symbols_ffx, etc). If one exists, it uses that file as input to rsync for the files to sync.
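
The selection step described above could be sketched roughly as follows. This is an illustration only, not the actual sync script: the function name `new_symbol_listings` and the use of file modification time as the "new" test are assumptions.

```python
from pathlib import Path


def new_symbol_listings(symbols_dir, last_run_epoch):
    """Return *-symbols.txt files in symbols_dir modified after last_run_epoch.

    Hypothetical helper: the real script's notion of "new" is not shown
    in this bug; modification time is assumed here for illustration.
    """
    return sorted(
        path
        for path in Path(symbols_dir).glob("*-symbols.txt")
        if path.stat().st_mtime > last_run_epoch
    )
```

Each returned listing would then be handed to rsync as the list of files to transfer. Which rsync option the real script uses is not stated here; `--files-from` is one option that accepts such a list.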

Normally this is fine, however the symbols_os/microsoftsyms-1.0-WINNT-*-symbols.txt files are formatted incorrectly for this usage. For example:

wntdll.pdb\D74F79EB1F8D4A45ABCD2F476CCABACC2\wntdll.sym

Note the Windows-style *backslashes* as directory separators, instead of forward slashes. When rsync reads that, it doesn't interpret the backslash as a directory separator... it treats it as an escape character for the next character (\D, \w), and therefore concludes that the file does not exist.

By contrast, here's a random line from ./symbols_ffx/firefox-9.0.1-WINNT-20111220165912-symbols.txt:

certutil.pdb/5313449FAB5D4BB08577108E10C678A42/certutil.sym

Similarly, the symbols.txt files for Linux and OSX seem to be fine... it's only the microsoftsyms-1.0-WINNT ones that are wrong.

Once a file has been processed (errors or not), the incremental script won't hit it again... it's no longer "new enough".

However, the weekly "do everything" script will catch these just fine... it doesn't pay attention to the contents of the symbols.txt files and just rsyncs everything. It takes about 10 hours or so (which is why it's only done weekly).


If you like, I can sync over this particular symbol file, or we can just wait till Sunday morning. This won't fix the bug though. For that we either need to modify how those files get generated (to use / instead of \) or we need to modify the incremental script to "correct" this before feeding rsync. I don't know anything about these files outside of how the incremental sync script uses them, so I don't know what's feasible here.
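Either option comes down to the same transformation on each listing entry. A minimal sketch, assuming entries are one relative path per line; `normalize_listing` is a hypothetical name and this is not the code of either script:

```python
def normalize_listing(lines):
    """Yield listing entries with '\\' replaced by '/', skipping blank lines.

    Hypothetical helper illustrating the separator fix discussed above.
    """
    for line in lines:
        entry = line.strip()
        if entry:
            yield entry.replace("\\", "/")


sample = [r"wntdll.pdb\D74F79EB1F8D4A45ABCD2F476CCABACC2\wntdll.sym" + "\n"]
print(list(normalize_listing(sample)))
```

Entries that already use forward slashes pass through unchanged, so applying this to the symbols_ffx listings would do no harm.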
Ah, thanks for finding that! I guess my script has always had that bug and I never noticed because it was never important. I'll fix the script to put the correct slashes in. I think we can probably wait for Sunday morning for these to sync, unless Kairo would like them synced now. (In which case maybe you can just sed | rsync your way to victory here.)
Sunday should be fine this time, if we fix it in the future. Now we know at least what's up, that's good.
The latest crashes occurred on January 21st before 20:00 UTC:
https://crash-stats.mozilla.com/report/list?range_value=7&range_unit=days&date=2012-01-23&signature=ntdll.dll%400x39ef1
https://crash-stats.mozilla.com/report/list?range_value=7&range_unit=days&date=2012-01-23&signature=hang%20|%20ntdll.dll@0x65cd4

(In reply to Ted Mielczarek [:ted, :luser] from comment #17)
> I'll fix the script to put the correct slashes in.
Is there any review needed for a script change?
No. I just landed some changes and tested them, and they seem to be working (see microsoftsyms-1.0-WINNT-20120121104147-symbols.txt on the symbol server if you have access).

http://hg.mozilla.org/users/tmielczarek_mozilla.com/fetch-win32-symbols/ if you're curious.

I also fixed an environment issue in the scheduled task on the Win32 VM that was preventing it from uploading the symbols properly. I'll verify that it runs properly tonight, but I'm going to call this FIXED for now.
Status: REOPENED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Blocks: 641026