208563 - code size / code footprint keeps increasing

Reporter

Description

•

21 years ago

User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4) Gecko/20030529
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4) Gecko/20030529

This is only an error if you agree with the Mozilla development roadmap--

"how you can help

"C and C++ hackers: tinderbox now measures code footprint, so let's start
working to reduce it rather than increase it. Resist the all-too-common tendency
to add more code. Try to remove code, simplify over-complicated code, undo
premature optimizations, and purge gratuitous use of XPCOM and its
not-quite-XPCOM precursors. If you have to add a new feature, make sure it's
bundled in the right library, which may mean adding a new library."

http://www.mozilla.org/roadmap.html

checking the code status every day shows that it sometimes goes down--and it
went down a lot (~200kb) with 1.4a, but now code footprint is every bit as large
as it was when the roadmap was written.

getting below 15,000,000 would be interesting.


Reproducible: Sometimes

Steps to Reproduce:
1. write down the code size
2. wait for a few days
3. check code size

Actual Results:  
goes up, almost always.


Expected Results:  
gone a little bit down--maybe some libraries are redundant / should be separated
for use by everything / could be eliminated / could be used natively from each
system (zlib, for a minor example on some systems).

Jo Hermans

Comment 1

•

21 years ago

This is a bit vague. You're only looking at a specific report of Beast (NT 5)
Compare it with Comet (Linux) :
http://tegu.mozilla.org/graph/query.cgi?testname=codesize&tbox=comet&autoscale=1&units=bytes&days=7&avg=0&showpoint=2003:06:07:02:12:59,21069715

Boris Zbarsky [:bzbarsky]

Comment 2

•

21 years ago

You are looking at the wrong graph.  You want to look at codesize_embed (mZ on
the tinderbox page), not codesize.  codesize_embed includes just the gecko core;
codesize includes things like test applications, etc (so codesize went up a lot
when the automated cookie tests were checked in, eg, but codesize_embed did not
change).

Joseph Myers

Reporter

Comment 3

•

21 years ago

good! code size almost down to 15,130,000 bytes.

Joseph Myers

Reporter

Comment 4

•

21 years ago

Great! down to 15,090,000. Keep track.

Joseph Myers

Reporter

Comment 5

•

21 years ago

bad 15,110,000.

Joseph Myers

Reporter

Comment 6

•

21 years ago

less bad: barely less than 15110000.

Joseph Myers

Reporter

Comment 7

•

21 years ago

Beautiful! down to 15,040,000.

Joseph Myers

Reporter

Comment 8

•

21 years ago

up and down

Joseph Myers

Reporter

Comment 9

•

21 years ago

super bad: test went up to 15,065,000. oh, come on! go down to < 15M. everybody
should be watching this. it is possible for the browser to be much smaller.

code policy of ambitious C programmers: rewrite an item every six months,
without breaking externs.

Just look at this example. It is completely random, and happened to be chosen
from the file "nsprpub/lib/tests/string.c"--and I don't even know what it does.
I am going to show you how to fix code, make it easier to inspect, easier to
compile.

--old file, random lines--
    printf("Testing the Portable Library string functions:\n");

    if( 1
        && test_001()
        && test_001()
        && test_002()
        && test_003()
        && test_004()
        && test_005()
        && test_006()
        && test_007()
        && test_008()
        && test_009()
        && test_010()
        && test_011()
        && test_012()
        && test_013()
        && test_014()
        && test_015()
        && test_016()
        && test_017()
        && test_018()
        && test_019()
        && test_020()
        && test_021()
        && test_022()
        && test_023()
        && test_024()
        && test_025()
        && test_026()
        && test_027()
        && test_028()
        && test_029()
        && test_030()
        && test_031()
      )
    {
        printf("Suite passed.\n");
        return 0;
    }

--end (852 bytes)--

--new file, random lines--

printf("Testing the Portable Library string functions:\n");
if(
test_001() && test_001() && test_002() &&
test_003() && test_004() && test_005() &&
test_006() && test_007() && test_008() &&
test_009() && test_010() && test_011() &&
test_012() && test_013() && test_014() &&
test_015() && test_016() && test_017() &&
test_018() && test_019() && test_020() &&
test_021() && test_022() && test_023() &&
test_024() && test_025() && test_026() &&
test_027() && test_028() && test_029() &&
test_030() && test_031()
)
{
printf("Suite passed.\n");
return 0;
}

--end (553 bytes)--

  Furthermore, this simply increases code awareness. Whoever created this file
may wish to investigate whether or not the repeated functions "test_001() &&
test_001()" are actually a mistake. Not caring about code cleanliness would
never have discovered this potential mistake.

  Some misguided individuals (not the person who wrote this file
originally--when authoring code, it is supposed to be organized in whatsoever
fashion is useful only for the author, not for readers) may suggest that it was
a lot better when the lines were all in order, but think of this: suppose you
were about to edit tests 003, 0010, and 0027--it would be harder to jump across
a window instead of having all the code in front of you. Spreading code out can
be helpful or harmful; in this perfectly random case it was harmful in just the
same sense as the middle of a sentence being dropped onto the backside of a book
page. Yes, it is better when they are "together" but not when they are all apart.

  Just imagine that one-third only of the Mozilla code could be treated in
similar fashion. That's outrageous. The savings between 852 and 553 bytes is
about three eighths. Multiplying one-third of the code size of Mozilla (one
third of ~15,000,000 is 5,000,000) by 5/8 (since taking out three eighths leaves
five eighths) would be 3,125,000, almost eliminating two megabytes of source
code, and in fact, _making it better_ in all sorts of ways.

  It makes no difference at all whether the example is even a part of the
Mozilla source. It is an example of a file in the source for some reason, and it
is obvious that no one has looked at it for years.

  Things like indentation are much more useful in expressions like this (from
the same file):

            if( &array[i].str[ array[i].off ] != rv )
            {
                printf("FAIL %d: %s,%s/%lu -> 0x%x, not 0x%x+%lu\n", i, 
                       array[i].str ? array[i].str : "(null)",
                       array[i].sub ? array[i].sub : "(null)",
                       array[i].max, rv, array[i].str, array[i].off);

  And yet, that example is flawed because it is not useful to indent the if/else
clauses, especially when they are separated by tremendous lengths of
code--multiplying not only the point at which the indentation is no longer
useful because the eyes don't even see the part which it is being indented from
(a blank line is much better, because it shows, at the exact place where
separation is necessary, where the switch of logic is occurring), but also the
enormous number of spaces which must be added to the beginning of each line,
making plain text editing almost impossible, and killer, buggy software almost a
necessity for even the creation of less buggy software. Ironic!

  Further, those indentations have the effect of ruining the discussability of
code: I can predict that the entire excerpt has a 50% chance of being trashed
into something hideous, if not impossible to compile, by the wrapping
necessitated by such odoriferous indentation. Also, it serves well to note that
without the useless (useless: not have its use; not serving a purpose; having or
being of no use--and in this case only, which should not be generalized!, I have
explained why it has no use) spacing the code would have become:

if( &array[i].str[ array[i].off ] != rv )
{
    printf("FAIL %d: %s,%s/%lu -> 0x%x, not 0x%x+%lu\n", i, 
           array[i].str ? array[i].str : "(null)",
           array[i].sub ? array[i].sub : "(null)",
           array[i].max, rv, array[i].str, array[i].off);


  and should have saved us 72 bytes, besides reducing the chance of being
trashed to about 20%.

  This is the size regression Mozilla bug, since it exists for you to look at.

  Indentation isn't a problem, but uselessness is. That's my point.

Joseph Myers

Reporter

Comment 10

•

21 years ago

Here's another bad deal:

Inside of the top of many, many files is something like this:

/* -*- Mode: C++; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 2 -*- */
/* 
 * The contents of this file are subject to the Mozilla Public
 * License Version 1.1 (the "License"); you may not use this file
 * except in compliance with the License. You may obtain a copy of
 * the License at http://www.mozilla.org/MPL/
 * 
 * Software distributed under the License is distributed on an "AS
 * IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or
 * implied. See the License for the specific language governing
 * rights and limitations under the License.
 * 
 * The Original Code is the Netscape Portable Runtime (NSPR).
 * 
 * The Initial Developer of the Original Code is Netscape
 * Communications Corporation.  Portions created by Netscape are 
 * Copyright (C) 1998-2000 Netscape Communications Corporation.  All
 * Rights Reserved.
 * 
 * Contributor(s):
 * 
 * Alternatively, the contents of this file may be used under the
 * terms of the GNU General Public License Version 2 or later (the
 * "GPL"), in which case the provisions of the GPL are applicable 
 * instead of those above.  If you wish to allow use of your 
 * version of this file only under the terms of the GPL and not to
 * allow others to use your version of this file under the MPL,
 * indicate your decision by deleting the provisions above and
 * replace them with the notice and other provisions required by
 * the GPL.  If you do not delete the provisions above, a recipient
 * may use your version of this file under either the MPL or the
 * GPL.
 */

  When new code is edited, this should be deleted and those instructions shorted
to refer to a LICENSE file included at least once in a distribution, and
possibly within every subdirectory, unless a particular file itself requires a
different license. The very fact that a file may require a different license is
disregarded by the current behavior, because no one will notice that a file is
special (and uses a different license) with a code policy that puts a copyright
(seemingly the same) into every single file, with an entire paragraph of part of
the terms of agreement spelled out.

  Of course, this allows the code to be compressed a great deal (with 1KB of
every file being mostly identical), but it is not useful.

  Additionally, I have noticed that thousands of "XXX" comments are made which
nobody has even bothered to take out--assuming that they turned out not to need
any rewriting after all.

  Just look at these at the very beginning:

./nsprpub/config/win16.mk:# XXX FIXME: I doubt we use this.  It is redundant with
./nsprpub/lib/msgc/include/prgc.h:#define PR_ALLOC_ZERO_HANDLE 0x4             
/* XXX yes, it's a hack */
./nsprpub/lib/msgc/src/prmsgc.c:** XXX tune: put subtract of _sp->base into
_sp->hbits pointer?
./nsprpub/lib/msgc/src/prmsgc.c:      /* XXX do something about this */
./nsprpub/lib/msgc/src/prmsgc.c:        ** XXX:  
./nsprpub/lib/msgc/src/prmsgc.c:  ** XXX:  
./nsprpub/lib/msgc/src/prmsgc.c:    /* XXX I don't know how to make it wait (yet) */
./nsprpub/lib/msgc/src/prmsgc.c:  ** XXX:  
./nsprpub/lib/msgc/src/prmsgc.c:        /* XXX THIS IS NOT ACCEPTABLE*/
./nsprpub/lib/msgc/src/prmsgc.c:        /* XXX THIS IS NOT ACCEPTABLE*/
./nsprpub/lib/msgc/src/unixgc.c:/* XXX - This is disabled.  MAP_FIXED just does
not work. */
...
./nsprpub/pr/src/io/prprf.c:        /* XXX not quite sure here */
./nsprpub/pr/src/io/prprf.c:        /* XXX should use cpp */
./nsprpub/pr/src/io/prprf.c:        /* XXX not supported I suppose */
./nsprpub/pr/src/io/prprf.c:        /* XXX not quite sure here */
./nsprpub/pr/src/io/prprf.c:        /* XXX not supported I suppose */
./nsprpub/pr/src/io/prstdio.c:    /* XXX this could be better */
./nsprpub/pr/src/linking/prlink.c:        abort();/* XXX */
...
./security/nss/lib/certdb/genname.c:        /* XXX This URI hostname parsing is
wrong because it doesn't
./security/nss/lib/certdb/polcyxtn.c:           /* XXX - only one byte integers
right now*/
./security/nss/lib/certdb/secname.c:/* XXX This template needs to go away in
favor of the new SEC_ASN1 version. */
./security/nss/lib/certdb/secname.c:    ** XXX for now we are going to just
assume that a bitwise
./security/nss/lib/certdb/stanpcertdb.c:      default:  /* XXX added to quiet
warnings; no other cases needed? */

  Get what I'm saying? People who don't know anything at all about the code can
fix some of this stuff.

  Look for those places if you have the authority and pop five of them a
day--they are the minority, even though there are many, and they will get fixed.
Otherwise, at least take out the 'XXX' comment when you have checked it. Leave
it in if it has entertainment value, however.

  Another thing, I'm sure that some files are not used, not ever. Try making
everything in a directory at a time and then removing from CVS any supposed
source file whose atime has not been changed. Do this five times a day to any
directory you choose as well--or even once a day--and clean up a lot, and things
will be easier.

  It's not fun not to work on new things, but it's also not fun to have the new
things make no difference in a general landscape of untrimmed waste, keeping
your own improvements in its shadow.

Joseph Myers

Reporter

Comment 11

•

21 years ago

don't know if that advice about the LICENSE is legal or not.

Joseph Myers

Reporter

Comment 12

•

21 years ago

Quoting some good effort:
--
cvs removal of nsCookies.{h,cpp} and nsCookieManager.{h,cpp} will follow; to
find old blame info and logs for those files, look in attic.

burn the witch!

b=200632, r=alecf, sr=darin.
---

Thanks.

Joseph Myers

Reporter

Comment 13

•

21 years ago

size is exactly: 15,050,000.

Joseph Myers

Reporter

Comment 14

•

21 years ago

it's about 15,026,000. that's very good.

Joseph Myers

Reporter

Comment 15

•

21 years ago

it was probably this:

removing bitmap/png, screen capturing, and boxmodel colorpicking functionality
from DOM Inspector. 

b=211096, r=caillon, sr=jst 


  However, there should be no need to remove even unneeded items to reduce code
size. It is wonderful, though--do I need it, for example, screen capturing in
the DOM inspector? If no, it is useless. Of course, the decision is whose?

Joseph Myers

Reporter

Comment 16

•

21 years ago

down to 15,025,000; up to 15,055,000.

Joseph Myers

Reporter

Comment 17

•

21 years ago

down to 15,044,000

Joseph Myers

Reporter

Comment 18

•

21 years ago

15,042,000

Joseph Myers

Reporter

Comment 19

•

21 years ago

15,041,700

Joseph Myers

Reporter

Comment 20

•

21 years ago

15,070,000

Joseph Myers

Reporter

Comment 21

•

21 years ago

this is positive: down from about 15074000 to 15062000.

Joseph Myers

Reporter

Comment 22

•

21 years ago

15,063,000

Joseph Myers

Reporter

Comment 23

•

21 years ago

15,052,000

Joseph Myers

Reporter

Comment 24

•

21 years ago

15,092,000

BTW, Mac OS X icons too large and stupid and ugly.
Why not get some from pixelgirl?
http://www.pixelgirlpresents.com/

Someone ought to email her and suggest it.

Besides, it injures the uniform branding of Mozilla, unless FB is going to take
over that.

Boris Zbarsky [:bzbarsky]

Comment 25

•

21 years ago

Marking invalid for the following reasons:

1)  The number being tracked includes the browser, all extensions and all
    automated tests.  This is not a useful metric -- someone checking in some
    automated tests makes this number go up, though this is not an increase from
    the point of view of the roadmap (or anyone sane).  Again, see comment 2.
2)  The comments starting with comment 9 have nothing to do with this bug --
    removing whitespace or comments from source files will have absolutely no
    effect on _compiled_ codesize.

Status: UNCONFIRMED → RESOLVED

Closed: 21 years ago

Resolution: --- → INVALID

Joseph Myers

Reporter

Comment 26

•

21 years ago

sorry

Myk Melez [:myk] [@mykmelez]

Updated

•

20 years ago

Product: Browser → Seamonkey