Open Bug 1361055 Opened 8 years ago Updated 5 years ago

Scrape symbols from non-current OS X versions

Categories

(Socorro :: Symbols, task, P3)

Tracking

(Not tracked)

People

(Reporter: ted, Unassigned)

References

(Blocks 1 open bug)

Details

In bug 1301471 I got some scripts running that pull Apple system updates and scrape symbols out of them. Unfortunately, since we hadn't been running them on a regular basis, and Apple only offers the latest version for each major release as an update, we don't have symbols for the minor versions in between. For example, we have symbols for 10.12.4, but not 10.12.3 or older. Apple does still make the older update packages available at https://support.apple.com/downloads/macos, so we should be able to just download all of those packages and scrape the symbols out.
Assignee: nobody → ted
I'm trying out a thing where I download the update packages listed on the downloads/macos URL I mentioned in the previous comment: https://tools.taskcluster.net/task-group-inspector/#/E0XZYF9US_GM8WVvhvlB_w?_k=524s03
That run broke because one of the update packages had a PBZX payload that expanded to multiple parts, which we didn't handle. I pushed a fix for that scenario, tested it locally on the offending package, and kicked off a new job: https://tools.taskcluster.net/task-group-inspector/#/PQMltUwfTr6ynUFG9Zieww?_k=p4m8ck
I had been poking at this off and on, but it looks like someone at Apple didn't like this and blacklisted EC2 IP addresses from downloading the dmg files, so I'm not going to push any farther on this. It ought to be possible to run the same script from somewhere else, like a Mozilla office, and upload the results.
Assignee: ted → nobody
What do we do with this? Could it be that your script was too eager and triggered rate limits, or do you think they blacklisted blocks of EC2 addresses and that this is still the case? I clicked around in https://github.com/luser/breakpad-mac-update-symbols and basically concluded it's a bunch of stuff around repo_sync, which is also Python but written by someone else. Is there anything here I can help with, with Tecken in mind?
The scripts that power this are kind of ugly, but they work. The stuff I have running as a daily task in Taskcluster runs repo_sync to fetch the software updates that Apple publishes on their update servers, then tries to extract everything useful out of them.

The point of this bug was to backfill symbols for outdated OS versions. Like I said in comment 0, when Apple published the 10.12.4 update they pulled the 10.12.3 update from their servers. In the future this shouldn't be a problem: since we're checking daily, we should always get symbols for every point release. However, there are a few that we missed, and people still run outdated OS versions, so they show up in crash reports. The weekly missing symbols report this week, for example, shows:

libsystem_kernel.dylib 6434

We should be able to get symbols for those, but we just don't have them.

The scripts I worked on in this bug share some code with the cron job, but instead of using repo_sync they just call the JSON API that populates the downloads on this page: https://support.apple.com/downloads/macos

https://github.com/luser/breakpad-mac-update-symbols/blob/master/get_update_packages.py

Apple provides outdated updates as downloads there, so we'd just need to download a bunch of them, dump the symbols, and upload those symbols to the symbol server. It's possible that rate limiting the script would make it work; I just got frustrated and gave up. This isn't super critical, but it'd be nice to fill in some of the gaps in our macOS symbol coverage.
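For anyone picking this up, the rate-limiting idea could be sketched roughly as below. This is only an illustration and is not taken from get_update_packages.py: the function name `download_politely`, its parameters, and the interval and retry values are all hypothetical, and nobody in this bug has confirmed that spacing out requests actually avoids the block.

```python
import time
import urllib.request
from typing import Callable, Iterable, List, Tuple


def fetch_with_urllib(url: str) -> bytes:
    """Default fetcher: download one URL and return its body."""
    with urllib.request.urlopen(url, timeout=60) as resp:
        return resp.read()


def download_politely(
    urls: Iterable[str],
    fetch: Callable[[str], bytes] = fetch_with_urllib,
    min_interval: float = 5.0,   # assumed value, not a known Apple limit
    max_retries: int = 3,        # assumed value
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[Tuple[str, bytes]], List[str]]:
    """Fetch URLs one at a time, spacing out requests.

    Consecutive requests are at least min_interval seconds apart, and
    each retry of the same URL doubles that gap (exponential backoff).
    Returns (successes, failures), where successes is a list of
    (url, body) pairs and failures is a list of URLs that never worked.
    """
    done: List[Tuple[str, bytes]] = []
    failed: List[str] = []
    last_attempt = None
    for url in urls:
        for attempt in range(max_retries):
            if last_attempt is not None:
                # Required gap grows with each retry of this URL.
                gap = min_interval * (2 ** attempt)
                wait = gap - (clock() - last_attempt)
                if wait > 0:
                    sleep(wait)
            last_attempt = clock()
            try:
                done.append((url, fetch(url)))
                break
            except OSError:
                # urllib's URLError/HTTPError are OSError subclasses.
                continue
        else:
            # Every attempt failed; record it and move on.
            failed.append(url)
    return done, failed
```

The `fetch`, `sleep`, and `clock` parameters are injectable so the pacing logic can be exercised without touching the network. The failed list makes it possible to retry later from a different network if some packages stay blocked.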
Blocks: 1316675

Gabriele: Is this a duplicate of another bug covering getting system symbols?

Flags: needinfo?(gsvelto)

I believe this is something we haven't fixed yet. Marco, do we have a plan to scrape old/missing macOS symbols? The link provided in comment 0 still seems to have all downloads, all the way back to ancient versions of macOS 10.x.

Flags: needinfo?(gsvelto) → needinfo?(mcastelluccio)

We should try to do this at some point, but no clear timeline for now.

Flags: needinfo?(mcastelluccio)
Priority: -- → P3