"Address" columns in tables use string sort, not numerical sort
Categories
(Socorro :: Webapp, task, P3)
Tracking
(Not tracked)
People
(Reporter: mccr8, Assigned: willkg)
References
Details
Attachments
(2 files)
| Assignee | ||
Comment 1•5 years ago
|
||
Andrew: The link doesn't work. Is this still a problem? If so, can you give a new example?
| Reporter | ||
Comment 2•5 years ago
|
||
Yes, this is still a problem.
Here's some STR.
- Go to a crash for the signature [@ shutdownhang | mozilla::SpinEventLoopUntil<T> | mozilla::dom::quota::QuotaManager::Observer::Observe ], for instance bp-af3e2363-5872-4192-a10e-25d1b0200409.
- Click on "more reports"
- Click on the "reports" tab.
- Click on the "address" header to sort by address.
The list has things in "increasing" order, but then there are things like this:
0x17fe7e4a
0x1b09319b441
[...]
0x4b23a5f
0x1b09319b441 is much larger than 0x4b23a5f, but is listed before it, presumably because these are sorted as strings in lexicographical order rather than by their numerical value.
I'm not sure how much this really matters. I don't remember if there was some specific reason I wanted this for.
| Assignee | ||
Comment 3•5 years ago
|
||
I tweaked the subject and making it a P3. This is part of a bigger issue in that all the columns are string sort and that doesn't work for numbers, addresses, versions, etc.
| Assignee | ||
Comment 4•5 years ago
|
||
| Assignee | ||
Comment 5•5 years ago
|
||
| Assignee | ||
Comment 6•5 years ago
|
||
I pushed this to prod in bug #1680511. I looked at some signatures and it mostly works. If there are < 100 reports, then it works fine since it's doing all the sorting client-side. If there are > 100 reports, then it does a server-side sort which is still wrong.
I'm going to keep this open until we fix the server-side sorting, too.
| Reporter | ||
Comment 7•5 years ago
|
||
Thanks! This isn't the hugest deal, but sorting by addresses can be useful for looking for poison values that might be an indication of a security issue, and making the ordering easier to follow is helpful.
| Assignee | ||
Comment 8•4 years ago
|
||
All hex values including address are indexed as strings. For some of the fields, all the values of that field are the same length, so indexing it as a string and sorting that alphanumerically works fine. For example the "adapter vendor id" works like this.
For address, the values are of different lengths, so sorting it alphanumerically doesn't work.
I have a few ideas on how to fix this:
-
Left-pad the hex addresses with
0so they're all the same length. Then0x17fe7e4abecomes something like0x00000017fe7e4a. That's an easy fix and doesn't require rewriting anything major. It does mean that crash reports processed before the fix is pushed to production will have non-padded values and we'll have to wait until the range of time we're looking at only contains padded values. It also means that the values will show up left-padded in the interface everywhere. -
I figure out a way to index both the original value (
0x17fe7e4a) and a sort value (maybe convert it to an int?) and then rework everything to sort using the sort value if there is one. This is probably a complicated project, but it'd give us the ability to sort by alternate columns.
I'll look into option 2 for a bit and see how that might work. If it's too involved, I may go for option 1.
| Assignee | ||
Comment 9•4 years ago
|
||
I asked on #crashreporting and Jeff pointed out that 32-bit and 64-bit addresses are different and we should keep in mind the higher-order bit means things. Thus we can't just left-pad hex addresses.
I want to create a field that's used solely for server-side sorting of data. So if address is specified in _sort, then it'll use the sort field. I think we should create this field in the processor. Maybe we prefix fields like this with sortkey_. Then we add it to super search fields and adjust the supersearch code and I think that'll work.
For the sortkey, we talked about figuring out if the crash report was emitted from a system with a 32-bit or 64-bit architecture. We can glean that from cpu_arch. Then the key becomes a left-padded form of what's there and that should work.
I looked at cpu_arch and I don't think we have enough granularity there to determine the architecture. For example, what's "x86"? Some of the crash reports from Windows on x86 have isWow64 set in the Telemetry environment. I think that means it's a 32-bit process on a 64-bit architecture. So I think that counts as 32-bit. Does that mean the ones where isWow64=false are 64-bit? That seems fiddly.
I'm wondering whether we could determine a sortkey by converting the hex to an unsigned number. Will that be good enough for sorting?
| Assignee | ||
Comment 10•4 years ago
|
||
Jeff, Andrew: Does comment #9 summarize things correctly?
Do you think converting the hex to an unsigned number will be good enough for a sort key? We wouldn't display it anywhere--just sort with it.
Comment 11•4 years ago
|
||
cpu_arch = x86 should be sufficient for determining 32 bit. I believe cpu_arch is determined by the build so we don't have to worry about whether we're running on a 64 bit os or not.
Assuming the minidump processor is producing the formatted crash address I think this would be simplest to solve by just having the minidump processor pad out the crash address to the bitwidth. We should be able to do that and change nothing in socorro.
| Assignee | ||
Comment 12•4 years ago
|
||
I wrote up https://github.com/mozilla-services/minidump-stackwalk/issues/30 to cover fixing minidump-stackwalk to emit a left-padded address.
| Assignee | ||
Updated•4 years ago
|
| Assignee | ||
Comment 13•4 years ago
|
||
| Assignee | ||
Comment 14•4 years ago
|
||
| Assignee | ||
Comment 15•4 years ago
|
||
I pushed this to prod just now in bug #1712816. I'm marking this as FIXED, though It'll take a bit for enough data to be left-padded that it's sortable.
Description
•