Closed Bug 98533 Opened 18 years ago Closed 16 years ago

[minimo] Remove JS from prefs backend; use lightweight parser instead [was: store in .properties]

Categories

(Core :: Preferences: Backend, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla1.6alpha

People

(Reporter: alecf, Assigned: darin.moz)

References

Details

(Keywords: memory-footprint, perf, Whiteboard: [ready to land])

Attachments

(3 files, 3 obsolete files)

This has been a pet-peeve of mine and recently it's become a dependency nightmare.

Basically, with the exception of autoconfig, there is NO reason that preferences
needs to be using JavaScript to read and write prefs. 

Why is this bad?
1) Basically all we do with prefs.js is use JS to read it, and then output it in
a very well known format. It's overkill.
2) dragging in JS creates a a dependency tree such that any library that makes
use of prefs must also include JS - which sucks for embeddors. (ok, so right now
they need JS anyway, but this wont always be true!)
3) it's a performance and bloat issue - we don't need to create a JS context and
drag in the JS parser and interpreter just to read in a set of name/value pairs. 

This is obviously a bit of blue-sky, but I figured I'd file the bug given the
discussion in bug 89137.
Attached file initpref.js
Attached file macprefs.js
that's almost all autoconfig stuff - defining variables that will be set up in
the autoconfig JS context (which, per 89137, will still get a JS context)
alecf: I agree with all points (although "embeddors won't need JS" must refer to
mythical embeddors who don't want a modern-era web browser, which requires JS,
DOM, HTML4, CSS, HTTP1.1, etc.).  Way back when, the switch to store prefs in JS
was done with an eye toward some kind of autoconfig-y scripted prefs future, but
that wasn't a good reason to store prefs themselves as JS source.  Which leads
to reason #4, you forgot:

4) prefs.js has been a point of security and privacy vulnerability.

The worst attack I know of involved redefining user_pref and the like so as to
capture all of one's pref settings.  This attack relied on other holes, to be
sure, but defense in depth suggests closing it off by storing prefs in other
than an interpreted scripting language format!

/be
Sorry, got involved in in IM session....

As I noted in bug 89137, I tried this one afternoon. Hacking out the basic JS
ties wasn't that difficult. But when I launched mozilla, I immediately hit
snags. On launch, the above attachements are executed (I've attached the mac .js
file, but the others are similar). As you can see, there is JS being executed
here. This will need to be resolved before we can go forward with this plan.

I don't know where this is used specifically, so if alec is correct... great. But
the mime type stuff worrries me.
I think this is a great idea and would probably give us a significant
performance win on startup.

In response to the security question, Brendan, are you suggesting we change the
format of the pref files themselves, in addition to the parsing mechanism? Will
each line no longer start with pref or user_pref? If we don't change the format,
then we haven't really improved our security story - and there are good reasons
not to change the format.
How much of startup does pref JS evaluation take?  Last I saw, not much.

I'm assuming we can change the format, with some work in the profile migration
code.  Who's with me?

/be
I'm talking about changing the format as well, into something understood by
nsIPersistentProperties (i.e. same as string bundles, and part of xpcom)

network.ftp.userid=alecf
some.integer.id=4
a.boolean.value=true

and the pref code can be smart about evaluating types.
(i.e. if I call getCharValue on a.boolean.value, I get back "true")
Depends on: 89137
Blocks: 7251, 110908
Target Milestone: --- → mozilla0.9.9
we use xml at all edges of mozilla, why not in the prefs file ?

to increase the startup speed the prefs file could also be added in a user
specific "xul cache" and only rewritten if the prefs has changed.

see also bug #17199 and the xslt transformation to solve this
Target Milestone: mozilla0.9.9 → mozilla1.0
Moving Netscape owned 0.9.9 and 1.0 bugs that don't have an nsbeta1, nsbeta1+,
topembed, topembed+, Mozilla0.9.9+ or Mozilla1.0+ keyword.  Please send any
questions or feedback about this to adt@netscape.com.  You can search for
"Moving bugs not scheduled for a project" to quickly delete this bugmail.
Target Milestone: mozilla1.0 → mozilla1.2
Depends on: 132140
Keywords: perf
Keywords: footprint
Suggesting following changes:
Platform: PC --> All
OS: Windows 2000 --> All

Bug 62755 is about sparating prefs in to categories so unnecessary prefs won't
need to be included in embedded versions. These two bug will on implementation
effect eachother so maybe some dependencies are in order?
xml is overkill and gives us no useful benefits, just like js. properties 
format sounds fine and would mean that prefs would require fewer components 
(xml would mean that prefs would require htmlparser)
OS: Windows 2000 → All
Hardware: PC → All
yes xml is too heavy weight here - there are many low level components which
need pref access but do not need any kind of markup-language capability. This
allows those components to be re-used without bringing in the htmlparser.
Blocks: 92580
Taking some of Brian's bugs.
Assignee: bnesse → ccarlen
Target Milestone: mozilla1.2alpha → Future
*** Bug 199326 has been marked as a duplicate of this bug. ***
Summary: Remove JS from prefs backend → Remove JS from prefs backend, store in .properties
Summary: Remove JS from prefs backend, store in .properties → [minimo] Remove JS from prefs backend, store in .properties
so, this isn't exactly a switch to .properties, but i think this is a good,
small step toward the same goal.  anyways, changing the preferences file format
sounds painful from a migration point of view.	this patch avoids that problem.
Attachment #128922 - Attachment is obsolete: true
Attachment #128923 - Flags: superreview?(brendan)
Attachment #128923 - Flags: review?(alecf)
-> me
Assignee: ccarlen → darin
Comment on attachment 128923 [details] [diff] [review]
v1.1 patch : caught a typo in the previous patch

Aesthetic comment: return PRBool is clearer than Unix-ish -1/0, especially when
callers test != 0.

>+static int
>+pref_GrowBuf(char **buf, char **cur, char **end)
>+{
 . . .
>+    *buf = (char*) realloc(*buf, bufLen);
>+    if (!*buf)
>+        return -1;
>+
>+    *cur = *buf + curPos;
>+    *end = *buf + bufLen;
>+    return 0;
>+}

Here you will skip the next line after a line of the form "#\n", because
PREF_PARSE_COMMENT_MAYBE_START insists on at least one character before the
newline and after the #.  Also, you should handle // comments.

>+        case PREF_PARSE_INIT:
>+            switch (b) {
>+            case '/':       /* begin comment block or line? */
>+            case '#':       /* accept shell style comments */
>+                state = PREF_PARSE_COMMENT_MAYBE_START;
>+                break;
>+            case 'u':       /* indicating user_pref */
>+                ps->fuser = 1;
>+                break;

Wouldn't it be best here to consume all of "user_pref" and set fuser iff you
found that exact string?  And complain if you don't find the other valid kinds
(how many are there nowadays?) of pref-setting JS function names.  We don't
want malformed input to get away with murder (in the dire case of some
multiple-exploit attack).

>+        /* name parsing */
>+        case PREF_PARSE_UNTIL_NAME:
>+            if (b == '"')
>+                state = PREF_PARSE_NAME;
>+            break;

Otherwise b had better b white space, or a /*...*/ inline comment char, right?

>+        case PREF_PARSE_NAME:
>+            if (ps->nbcur == ps->nbend) {
>+                if (pref_GrowBuf(&ps->nb, &ps->nbcur, &ps->nbend) != 0)
>+                    state = PREF_PARSE_UNTIL_EOL; /* unable to grow buffer; skip this line */
>+            }
>+            if (b == '"') {
>+                *ps->nbcur++ = '\0';
>+                state = PREF_PARSE_UNTIL_VALUE;
>+            }
>+            else
>+                *ps->nbcur++ = b;
>+            break;

Do we need separately allocated nb and vb buffers?  Why not isolate the strings
in place in the pref buffer and NUL-terminate 'em there (overwriting their
closing quote characters)?

>+
>+        /* value parsing */
>+        case PREF_PARSE_UNTIL_VALUE:
>+            if (b == ',') {
>+                state = PREF_PARSE_VALUE;
>+                ps->vtype = PREF_INVALID;
>+            }
>+            break;

More states needed again, to tighten up parsing to exclude invalid inputs.

>+        case PREF_PARSE_VALUE:
>+            if (ps->vtype == PREF_INVALID) {
>+                if (b == '"') {
>+                    ps->vtype = PREF_STRING;
>+                    break;      /* wait next char */
>+                }
>+                else if (b == 't' || b == 'f')
>+                    ps->vtype = PREF_BOOL;

Full match, either "true" or "false".

>+                else if (isdigit(b) || (b == '-'))
>+                    ps->vtype = PREF_INT;

else error, ignore this line (if not the whole file?).

>+            switch (ps->vtype) {
>+            case PREF_STRING:
>+                /* skip char if start of escape sequence */
>+                if (!ps->fesc && b == '\\')
>+                    ps->fesc = 1;
>+                else {
>+                    /* treat '"' as data if preceeded by a backslash */
>+                    if (ps->fesc || b != '"')
>+                        *ps->vbcur++ = b;
>+                    else {
>+                        *ps->vbcur++ = '\0';
>+                        state = PREF_PARSE_LINE_COMPLETE;
>+                    }
>+                    ps->fesc = 0;
>+                }

Here I expect you'll need to handle all JS escape sequences: \[bfnrtv], \xHH
for two hex digits H, \uXXXX for four hex digits spelling a Unicode point, and
maybe even \O{1,3} for octal digits (if you get my metachar-meaning -- [] and
{,} are meta).	See js/src/jsscan.c.

>+                break;
>+            case PREF_BOOL:
>+                if (strchr("truefals", b))
>+                    *ps->vbcur++ = b;
>+                else {
>+                    *ps->vbcur++ = '\0';
>+                    state = PREF_PARSE_LINE_COMPLETE;
>+                }
>+                break;

Don't need this loose "truefals" membership test if you insist on exact match
at the first opportunity.

>+            case PREF_INT:
>+                if (isdigit(b) || (b == '-'))
>+                    *ps->vbcur++ = b;
>+                else {
>+                    *ps->vbcur++ = '\0';
>+                    state = PREF_PARSE_LINE_COMPLETE;
>+                }

Want one - at most.  JS allows unary +, but who cares?	Still, easy to do. 
Does anyone want to spell int-type pref values using hex or octal?  I trust
not, but RGB colors come to mind.

/be
I meant "PRBool or int poor-man's Boolean", but really, why not include
prtypes.h here and use PRBool?

/be
Another worry: don't you need to handle \r-terminated lines, for old MacOS
compatibility?  Maybe only when migrating profiles?  Maybe not.

/be
> Otherwise b had better b white space, or a /*...*/ inline comment char, right?

Heh. Canonical name for the char variable used by scanners is c, not b. Ultra-nit.

/be
thanks for all the review comments brendan!!

>Do we need separately allocated nb and vb buffers?  Why not isolate the strings
>in place in the pref buffer and NUL-terminate 'em there (overwriting their
>closing quote characters)?

the parser is designed to handle chunked input, so the input buffer might not
contain a full line.  so, i need at least one scratch buffer.  i will
consolidate the two into one.

as for all your other comments, they make good sense.  i was trying to take the
approach of not being strict about the format, but maybe being strict is better
from a safety/security point of view.  ok, new patch coming up....

with respect to escaping/unescaping, prefapi.cpp only recognizes '\r', '\n',
'\\', and '\"' as characters to be escaped.  moreover, if '\uxxxx' appeared in a
pref file, the current code would truncate the high bits since prefapi.cpp
currently calls JS_GetStringBytes to extract the value of a string-type
preference.  i'm tempted to only unescape what we would escape when writing
prefs.js.  alternatively, maybe i should fix prefapi.cpp to escape more stuff. 
i'm not sure that should include unicode, hex, or octal escape sequences.
Blocks: 213938
Severity: normal → enhancement
Status: NEW → ASSIGNED
Target Milestone: Future → mozilla1.6alpha
Attached patch v1.2 patch (obsolete) — Splinter Review
ok, this patch is a major revision over the last one.  i've made the parsing
much stricter per brendan's advice.  i'm actually making PREF_ParseBuf return
PR_FALSE if any syntax error is discovered.  i like this better than just
discarding the rest of the current line since a pref function call might
actually span several lines.  recovering from syntax errors is a rats nest i'd
prefer to avoid.  i have also only implemented support for escape sequences
that we would generate when we write prefs.js.	i know this might cause an
inconsistency between this pref parser and the JS pref parser, but i think it
is highly unlikely that anyone will care (*knock-on-wood*).
Attachment #128923 - Attachment is obsolete: true
Attachment #128923 - Flags: superreview?(brendan)
Attachment #128923 - Flags: review?(alecf)
Attached patch v1.3 patchSplinter Review
added a check to detect an integer value of only "-" or "+" and error out. 
this patch does not support integer values encoded in hex or octal.  again, i
think that is reasonable given that we only generate values in decimal, and
folks can work around this if it surfaces as an issue.
Attachment #129327 - Attachment is obsolete: true
Attachment #129328 - Flags: superreview?(brendan)
Attachment #129328 - Flags: review?(dougt)
the v1.3 patch is checked in on the MINIMO_6_AUG_2003_BRANCH for testing.  it
might be easier to review the code by checking out the branch (only
modules/libpref/src) from CVS.
Comment on attachment 129328 [details] [diff] [review]
v1.3 patch

You changed the capitalization of some of the function names, but didn't fix
the comments:

+ * pref_InitParseState
Attachment #129328 - Flags: review?(dougt) → review+
thanks doug!  i fixed up that comment in my local tree (and on the minimo branch).
this is ready to land once the trunk opens for 1.6 alpha.  brendan gave verbal
sr= yesterday.
Summary: [minimo] Remove JS from prefs backend, store in .properties → [minimo] Remove JS from prefs backend; use lightweight parser instead [was: store in .properties]
Whiteboard: [ready to land]
Comment on attachment 129328 [details] [diff] [review]
v1.3 patch

Yeah, this is money, baby!

/be
Attachment #129328 - Flags: superreview?(brendan) → superreview+
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Nice fix! These things makes Mozilla much better, as a whole.

It could be wise however to open a new bug, to implement:
+  // XXX maybe we should read the file in chunks instead??

Also: may be best to let the parser handle this, by just ignoring (or stopping
at) an EOF:
  #ifdef XP_OS2 /* OS/2 workaround - our system editor adds an EOF character */
      if (readBuf[amtRead - 1] == 0x1A) {
         amtRead--;
      }

  #endif

Also: why not remove the 'PRBool bCallbacks' param of
'PREF_EvaluateConfigScript' (it seems always to be 'PR_TRUE').
+    NS_ASSERTION(bCallbacks, "no support for disabling callbacks");

Furthermore the 'leafName' thing seems also not needed anymore. At least
PREF_EvaluateConfigScript doesn't seem to use it.

Cool. Now all we need is to get rid of the chrome registry....
see bug 219479 for further cleanup...

looks like we got a small Ts perf increase on some of the slower tinderboxen.
luna and comet (neither is all that slow) both show about 2.5-3% improvements in Ts.
Could this be related to bug 219711?
Bug 220095 Navigator->Languages for Web Pages empty, wrong character coding

Regressed with BuildID 20030916, is this related to the checkin mentionend in
comment 32?
hermann: yes, please update to the latest nightly build.  there was a very
serious regression caused by this patch.
This patch seems to have broken the ability for extension authors to provide
default prefs for their own extension.  On install, my extension (download
statusbar) drops its pref file (downbarconfig.js) in "[program
folder]/defaults/pref" - where it used to be picked up and added to prefs.js.

This doesn't work anymore.  Is there a alternate method provided by this new
prefs backend?  
that mechanism is still supposed to work... this only affected the manner of
parsing, not the choice of while files to parse... please file a new bug. More
than likely some extension had some javascript rather than just raw pref
definitions. Can you paste the prefs file that isn't working?
Just filed bug 222023 - with a testcase of a simple "testpref.js" file that only
contains:

pref('aaatestpreference.for.extension', true);
You need to log in before you can comment on or make changes to this bug.