Closed
Bug 201622
Opened 21 years ago
Closed 21 years ago
codesighs gives incorrect symbol sizes
Categories
(SeaMonkey :: Build Config, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bryner, Assigned: bryner)
References
Details
Attachments
(1 file, 4 obsolete files)
9.29 KB,
patch
|
brendan
:
superreview+
|
Details | Diff | Splinter Review |
(filing under Build Config because I couldn't find anything more appropriate) Codesighs can, in certain cases, report incorrect sizes for symbols on Linux. This leads to wrong attributed blame for code size increases. One particular case I found today is that string literals are completely anonymous (no symbol table entry), so based on the method nm uses to calculate symbol sizes, the size of all string literals up to the next symbol are added into the size of the previous symbol. This is just nm being lame, but there's not much we can do about that. Instead, we can convert codesighs to use readelf on linux, which knows how to look at the size field of the symbol table entry to give the correct size for the symbol. This will leave string literals unaccounted for, but I'd be in favor of fixing that later if we care.
Assignee | ||
Comment 1•21 years ago
|
||
This seems to work. I'm not entirely happy about using both a perl wrapper and a separate output processor, but it turns out some things are just easier to do in perl -- the sorting (which seems to not be necessary for codesighs, but is nice to have when examining the file by hand) and buffering up the lines until we [possibly] hit a .symtab table. Comparing the TSV files generated from readelf and nm, the readelf one uses non-rounded-up sizes for everything, so there will likely be an artificial decrease in code size if we switch to this. I think the improved accuracy makes it worth the hit of a one-time jump (down) in the results.
Assignee | ||
Comment 2•21 years ago
|
||
... maybe I'll lose elf2tsv.c and just write the whole thing in perl.
Comment 3•21 years ago
|
||
perl is better at all that sort of stuff. /be
Assignee | ||
Comment 4•21 years ago
|
||
Ok, this is much simpler. I'm converting the readelf output directly into codesighs TSV output. I'm also now getting real values for section names and object files within the library for each symbol.
Assignee | ||
Updated•21 years ago
|
Attachment #120302 -
Attachment is obsolete: true
Assignee | ||
Comment 5•21 years ago
|
||
Attachment #120361 -
Attachment is obsolete: true
Assignee | ||
Updated•21 years ago
|
Attachment #120366 -
Flags: superreview?(brendan)
Attachment #120366 -
Flags: review?(sfraser)
Assignee | ||
Comment 6•21 years ago
|
||
Anyone have any thoughts on whether we should try to account for symbol-less data, like string literals? One possibility is that if there's a gap between the end of a symbol and the beginning of the next one, we could generate a fake symbol name to fill that space, like __data_N, where N counts up from 0 for each library (or maybe __string_N since it's most likely a string).
Comment 7•21 years ago
|
||
Comment on attachment 120366 [details] [diff] [review] cleaned up a bit >Index: readelf_wrap.pl >=================================================================== >+# First read in the list of sections >+open(READELF_OUTPUT, "readelf -SW $ARGV[0] |") or die "readelf failed to run on $ARGV[0]\n"; ... >+open(READELF_OUTPUT, "readelf -sW $ARGV[0] | c++filt |") or die "readelf failed to run on $ARGV[0]\n"; How expensive is it to run readelf twice? Maybe run it once and store the output? >+ # Filter out types of symbols that we don't care about: >+ # - anything that's not of type OBJECT, FUNC, or FILE What other types are there? Should you include other types with non-zero sizes? Otherwise it looks good.
Attachment #120366 -
Flags: review?(sfraser) → review+
Assignee | ||
Comment 8•21 years ago
|
||
> How expensive is it to run readelf twice? Maybe run it once and store the > output? Good point. I can run 'readelf -sSW' and get both the section headers and the symbol table from the same run. > What other types are there? Should you include other types with non-zero sizes? The other types are: NOTYPE (symbols like gcc2_compiled, which are also 0 size) SECTION (section symbols; also 0 size) COMMON (I can't find these anywhere in real files. I suppose they could be non 0 size though.) TLS (thread local storage - I can't find any of these symbols either) So maybe I should just include all symbol types, and call anything DATA that's not of type FUNC.
Comment 9•21 years ago
|
||
We could try treating anything not FUNC as signal, and see how that goes. If we want to be strict about code size regressions, we'll need to avoid false positives like this string foozle, though. COMMON shouldn't show up unless someone is using Fortran, I bet. ;-) /be
Assignee | ||
Comment 10•21 years ago
|
||
Turns out we can't do what I wanted to do to figure out the object file a symbol comes from. For example, the patch above will mis-attribute all global functions and data (oops). We really need debugging info (either stabs or dwarf) to get this right. I'd like to get this patch in without the object file parts and come back and do a later revision to fix that. As for the debugging symbols, finding the right object file is done roughly the same way for both stabs and dwarf (find the SO before the symbol for stabs or the DW_TAG_compile_unit for dwarf). These can be dumped with 'objdump -G' and 'readelf -wi' respectively.
Assignee | ||
Updated•21 years ago
|
Attachment #120366 -
Flags: superreview?(brendan)
Assignee | ||
Comment 11•21 years ago
|
||
Attachment #120366 -
Attachment is obsolete: true
Assignee | ||
Updated•21 years ago
|
Attachment #120681 -
Flags: superreview?(brendan)
Comment 12•21 years ago
|
||
Comment on attachment 120681 [details] [diff] [review] patch with faulty object file guessing removed Extra newline after comment? + # module name + + print "$module\t"; Looks great -- just curious, under what conditions do we not find a .symtab that trumps a .dynsym section? /be
Attachment #120681 -
Flags: superreview?(brendan) → superreview+
Comment 13•21 years ago
|
||
I tried it, and tried to compare with an old-style .tsv file. They are not compatible, so codesighs was runnig very long to compare everything. Is this intentional? It now uses names like "rodata" instead of "R". I don't know the old names exactly. Anyway, the new ones look clearer. At least, I can understand them :) It also seems te be slower. Is that the perlscript, or is it readelf being slow?
Comment 14•21 years ago
|
||
The old readelf definitely is no speed daemon. The best you can do is use the readelf from elfutils, installed as eu-readelf on RHL9. It should be possible to compile the package oon RHL7.3 but I never tried.
Assignee | ||
Comment 15•21 years ago
|
||
Unfortunately, the output of readelf and eu-readelf isn't identical, so I'll probably have to rewrite the parsing to use eu-readelf.
Assignee | ||
Comment 16•21 years ago
|
||
Changes from the last patch: - Work with readelf or eu-readelf (prefer the latter, for speed) - Report space used in a section but not associated with a symbol (this will make sure we take into account string constants, etc). The extra space is reported as ".nosyms.<section>" I ran this with both readelf and eu-readelf and verified that the results were identical.
Assignee | ||
Updated•21 years ago
|
Attachment #120681 -
Attachment is obsolete: true
Assignee | ||
Updated•21 years ago
|
Attachment #132393 -
Flags: superreview?(brendan)
Comment 17•21 years ago
|
||
Comment on attachment 132393 [details] [diff] [review] revised patch >+while (<READELF_OUTPUT>) { >+ >+ if (!$in_symbols) { >+ # capture the 'null' section which has no name, so that the Wahhh, 2- vs. 4-space indentation unit alert! Other than that, sr=brendan@mozilla.org. /be
Attachment #132393 -
Flags: superreview?(brendan) → superreview+
Assignee | ||
Comment 18•21 years ago
|
||
checked in and reporting good numbers on comet and luna (the numbers are up a bit from the nm-based reporting - this is expected)
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Updated•20 years ago
|
Product: Browser → Seamonkey
You need to log in
before you can comment on or make changes to this bug.
Description
•