Closed Bug 801209 Opened 12 years ago Closed 7 years ago

Replace NSPR Logging with something better

Categories

(Core :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: valentin, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Attached patch v0.1 patchSplinter Review
A new logging system is needed for tracking events in Necko.
The current logging system cannot be turned on/off while firefox is running and implies the user setting certain environment variables. Non-technical users find this difficult.

The logger should be able to be enabled/disabled following the user's input.
Current logging system may be kept in place, while the new one is running.

This is needed for debugging. For example, the user can enable the logger, visit a buggy website, then dump the events to a file, and attach it to a bug.

The attachment provides a naive and simple example of such a logger.
Comment on attachment 670995 [details] [diff] [review]
v0.1 patch

Review of attachment 670995 [details] [diff] [review]:
-----------------------------------------------------------------

Hmm, there are many things to consider here...

1) We presumably don't want to keep this just for necko usage, we want to replace NSPR logging in general. So it won't be 'NetLog' it'll be Mozilla::Logger or something like that. Not sure where it makes the most sense to live in the tree. As far UI goes, it might make sense to show up in the Web Console (?) as much as in about:net.

2) I'm not sure nsIObserverService is the best way to do this. It's not cross-process, and the overhead might be too high (or not: it might be ok since we're only using it when logging is turned on). I asked on #b2g if we have a better way for cross-process messages (Message manager?) but haven't gotten an answer yet.

3) You need to have printf-style support for LOG msgs.

4) One strength of the current NSPR network logging is that you can ask for just a subset of network events (HTTP, DNS, Socket thread, etc). We should keep that. Right now we do it by #defining LOG differently, so it expands to a different PRLogModuleInfo. We could replace that by having your Log/Event function(s) take two char * (one the topic, one the message) and have LOG expand like so

  #define LOG(msg) NetLog->Log("http", msg);

5) We'll want the ability to timestamp log entries. Internally that can be a mozilla::TimeStamp or whatever, but it should map to something easily human-readable in output. In NSPR you have to pass 'timestamp' in NSPR_LOG_MODULES to get timestamps--we'd be better off always collecting them IMO, perhaps with an option to turn them off when displaying output (since they consume a lot of columns).

6) I think the 'severity level' that NSPR uses (from 1-5, so you can filter output by number) is a failure. I've never used it, and virtually all our calls just use the same priority. It might be possible to do something more useful if we provided a way to pass in a 'subcomponent' string, so that calls to LOG in http code would result in sending a 'network-http-SUBCOMPONENT', and one could then provide a list of strings/regexes to the logging console to select the messages you're interested in. But that's probably too ambitious (would require changing all logging call sites, for one thing), and this bug is already getting large in scope.

::: netwerk/base/src/NetLog.cpp
@@ +10,5 @@
> +void
> +NetLog::Init()
> +{
> +    if (!serv)
> +        serv = new nsObserverService();

You don't want to construct services directly: use

  mObserverService = mozilla::services::GetObserverService();

::: netwerk/base/src/NetLog.h
@@ +1,1 @@
> +/* -*- Mode: C++; tab-width: 4; indent-tabs-mode: nil; c-basic-offset: 4 -*- */

Need both emacs and vi modelines, to preserve harmony :)

@@ +21,5 @@
> +    static bool AddObserver(nsIObserver *, const char *, const bool);
> +    static bool RemoveObserver(nsIObserver *, const char *);
> +    static bool NotifyObservers(nsISupports *, const char *, const char *);
> +    static bool Log(const char *);
> +    static bool Event(const char *, const char *);

Why the separate Log/Event functions? (and NotifyObservers, which seems like it ought to be private).
Component: Networking → General
Summary: Logging events in Necko → Replace NSPR Logging with something better
(In reply to Jason Duell (:jduell) from comment #1)
> 1) We presumably don't want to keep this just for necko usage, we want to
> replace NSPR logging in general. So it won't be 'NetLog' it'll be
> Mozilla::Logger or something like that. Not sure where it makes the most
> sense to live in the tree. As far UI goes, it might make sense to show up in
> the Web Console (?) as much as in about:net.

Yes. We should get devtools UX involved, because it is confusing to have the web consoling logging some stuff and about:net logging different stuff, even if the intended uses/users are different.

> 3) You need to have printf-style support for LOG msgs.

I think we should explicitly NOT have printf-style support for logging, but instead just use iostreams-style. See bug 801749.

> 5) We'll want the ability to timestamp log entries. Internally that can be a
> mozilla::TimeStamp

I've heard that mozilla::TimeStamp is very expensive and should only be used when extreme precision is required. On the other hand, isn't Telemetry using mozilla::TimeStamp all over?

> 6) I think the 'severity level' that NSPR uses (from 1-5, so you can filter
> output by number) is a failure.

I think the level mechanism will more useful once we're exposing more network logging to about:net and/or the web console. Probably only need two levels though: things that get logged to the web console and about:net, and more detailed things that only get logged to about:net and/or are only logged upon request.


> Need both emacs and vi modelines, to preserve harmony :)

https://developer.mozilla.org/en-US/docs/Developer_Guide/Coding_Style#Mode_Line
(In reply to Brian Smith (:bsmith) from comment #3)
> (In reply to Jason Duell (:jduell) from comment #1)
> > 5) We'll want the ability to timestamp log entries. Internally that can be a
> > mozilla::TimeStamp
> 
> I've heard that mozilla::TimeStamp is very expensive and should only be used
> when extreme precision is required. On the other hand, isn't Telemetry using
> mozilla::TimeStamp all over?

Yes, Telemetry is using TimeStamp everywhere.
> I think we should explicitly NOT have printf-style support for logging, 
> but instead just use iostreams-style.

A fine long-term goal, but I think that should be a followup--an initial fix that doesn't need to change every existing LOG call in the tree seems like a logical first step.

This is quickly feature-creeping its way past the scope of the summer intern who's working on about:net.  He may punt and so something specific for the events about:net needs to see without trying to tackle this.

cjones informs me that child processes can still open files in /data/local/tmp, so this isn't needed for B2G.  I've updated our page on HTTP logging so developers know that they need to use that directory for log files:

 https://developer.mozilla.org/en-US/docs/HTTP_Logging#Creating_separate_logs_for_child_processes_(electrolysis-only)
AFAICT, it is NOT safe to write logging to /data/local/tmp or anywhere else readable by child processes on B2G, because the logs may contain sensitive information that we're trying to prevent the child process from having, and because it leads to potential DoS attacks (e.g. eating all disk space).
That's true for release builds.  However, we should never have developer logging enabled in release builds.
A p0wned child process can/will enable logging whenever it wants, production release or no. There are many logging modules that we use FORCE_PR_LOG for (not just Necko, but throughout Gecko).
NSPR logging is by default enabled for many components in our releases.  We can turn it off for B2G if we really want to (i.e. if we somehow ignore FORCE_PR_LOG).  But we're talking about code that only gets hit if you've set up NSPR_LOG_MODULES in your environment (i.e. actively requesting logging), so I really don't think there's much of a security angle here, at least insofar as we permit writing to /data/local/tmp.  If we get rid of that, then we need to replace NSPR logging with some IPC-based logging.

> eating all disk space

Sure.  Not sure it's v1-worthy.

> logs may contain sensitive information that we're trying to prevent the child
> process from having

Hmm, possibly.  But since we're not logging unless users request it, it seems an unlikely attack vector (targeting only users who are trying to send us HTTP logs).
(In reply to Brian Smith (:bsmith) from comment #8)
> A p0wned child process can/will enable logging whenever it wants, production
> release or no. There are many logging modules that we use FORCE_PR_LOG for
> (not just Necko, but throughout Gecko).

App processes can already cause /data to be filled up, even when 100% non-malicious/non-pwned.
erahm, has your logging overhaul effectively resolved this bug? Note that this is a blocking bug for the 'dieprtypesdie' bug, so one of the criteria for answering that question is whether logging still depends on prtypes.h.
Flags: needinfo?(erahm)
(In reply to Nicholas Nethercote [:njn] from comment #11)
> erahm, has your logging overhaul effectively resolved this bug?

I think the requirements from comment 0 have been met, so yes we can close this.

> Note that
> this is a blocking bug for the 'dieprtypesdie' bug, so one of the criteria
> for answering that question is whether logging still depends on prtypes.h.

Internally we still rely on prprf.h [1] which requires prtypes.h. This is due to the non-standard format strings that were used in prlog (and prprf). We could attempt to audit every log call for non-standard format strings, but that might be a rather monumental effort given the amount of ad-hoc log wrappers there are eg:

> #define LOG_D(msg,...) MOZ_LOG(sLogger, Debug, (msg,...))

Additionally we include prlog.h (which includes prtypes.h) in Logging.h for backwards compatibility for calling MOZ_LOG with a |PRLogModuleInfo*|. It's possible that could be removed.

So all that said, we can probably close this but spin off two new blocking bugs for dieprtypesdie.

[1] https://dxr.mozilla.org/mozilla-central/rev/01748a2b1a463f24efd9cd8abad9ccfd76b037b8/nsprpub/pr/include/prprf.h
Flags: needinfo?(erahm)
(In reply to Eric Rahm [:erahm] from comment #12)
> We could attempt to audit every log call for non-standard format strings,
> but that might be a rather monumental effort given the amount of ad-hoc log
> wrappers there are eg:

I don't recall the details, but at least finding the non-standard things might not be too bad using the printf-format checking __attribute__ stuff.
(In reply to Eric Rahm [:erahm] from comment #12)
> (In reply to Nicholas Nethercote [:njn] from comment #11)
> > erahm, has your logging overhaul effectively resolved this bug?
> 
> I think the requirements from comment 0 have been met, so yes we can close
> this.

Lets go ahead and close the bug, I'm just going to mark WFM. Parts were implemented in various bugs over the past few years.

> Internally we still rely on prprf.h [1] which requires prtypes.h. This is
> due to the non-standard format strings that were used in prlog (and prprf).
> We could attempt to audit every log call for non-standard format strings,
> but that might be a rather monumental effort given the amount of ad-hoc log
> wrappers there are

This has been done in bug 1060419.

> 
> > #define LOG_D(msg,...) MOZ_LOG(sLogger, Debug, (msg,...))
> 
> Additionally we include prlog.h (which includes prtypes.h) in Logging.h for
> backwards compatibility for calling MOZ_LOG with a |PRLogModuleInfo*|. It's
> possible that could be removed.
> 
> So all that said, we can probably close this but spin off two new blocking
> bugs for dieprtypesdie.

I filed bug 1353143 for removing prlog.h from mozilla logging, patches are up.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: