Closed Bug 783974 Opened 12 years ago Closed 12 years ago

Log SSL errors to the error console

Categories

(Core :: Security: PSM, defect)

17 Branch
defect
Not set
blocker

Tracking

()

RESOLVED FIXED
mozilla18
Tracking Status
firefox17 --- verified

People

(Reporter: KaiE, Assigned: KaiE)

References

Details

Attachments

(5 files, 4 obsolete files)

Since bug 682329, the Mozilla platform only has SSL related error feedback for connections related to the top level page in the browser window.

We no longer have any error feedback for 
- browser sub-content
- connections made by add-ons
- connections made by other Mozilla appliations such as Thunderbird or XulRunner apps

In order to provide at least a minimal solution, this patch implements logging of SSL protocol errors and SSL certificate errors on the error console.
Attached patch Patch v1 (obsolete) — Splinter Review
Assignee: nobody → kaie
Attachment #653320 - Flags: review?(honzab.moz)
Attached file test addon
This is an add-on that can be installed into firefox, mobile, thunderbird, seamonkey, instantbird.

It will start several SSL connections in the background. With this patch fixed, after a couple of seconds, you should see messages appears on the error console.
Attachment #653320 - Flags: review?(bsmith)
Attachment #653320 - Flags: review?(rrelyea)
Comment on attachment 653320 [details] [diff] [review]
Patch v1

Review of attachment 653320 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with all review comments addressed.

Note that Honza is unavailable for a few weeks.

::: security/manager/ssl/src/SSLServerCertVerification.cpp
@@ +180,5 @@
>    }
>  }
>  
> +static void
> +nsHandleInvalidCertError(TransportSecurityInfo *socketInfo, 

This should be in the unnamed namespace that starts right below it (vs using "static"), to match the newer Mozilla coding style. Also, LogInvalidCertError would be a better name, IMO.

@@ +189,5 @@
> +                         ::mozilla::psm::SSLErrorMessageType errorMessageType,
> +                         nsIX509Cert* ix509)
> +{
> +  nsCOMPtr<nsISSLStatus> status;
> +  socketInfo->GetSSLStatus(getter_AddRefs(status));

The above two lines are dead code that should be removed.

@@ +379,5 @@
>                                  : mErrorCodeMismatch ? mErrorCodeMismatch
>                                  : mErrorCodeExpired  ? mErrorCodeExpired
>                                  : mDefaultErrorCodeToReport;
> +                                
> +  SSLServerCertVerificationResult *result = 

Better to use nsRefPtr<SSLServerCertVerificationResult> here to future-proof against leaks when we modify this code ...

@@ +392,5 @@
> +                           result->mErrorCode,
> +                           result->mErrorMessageType,
> +                           mCert);
> +
> +  return result;

...and return result.forget() here.

::: security/manager/ssl/src/TransportSecurityInfo.cpp
@@ +256,5 @@
>    return *aText != nullptr ? NS_OK : NS_ERROR_OUT_OF_MEMORY;
>  }
>  
> +void
> +TransportSecurityInfo::GetErrorMessage(PRErrorCode errorCode,

Please rename this to GetCertErrorMessageForLogging, because it is confusing to have two methods with the same name that do something similar, but with important differences.

Also, this function is always called with wantsHtml and supressPort443 set to false, so those parameters are not necessary and should be removed.

@@ +257,5 @@
>  }
>  
> +void
> +TransportSecurityInfo::GetErrorMessage(PRErrorCode errorCode,
> +                                       ::mozilla::psm::SSLErrorMessageType errorMessageType,

This line is too long. (You shouldn't need to qualify SSLErrorMessageType here, which will help.)

@@ +293,5 @@
>  //      formatErrorMessage into GetErrorMessage().
>  nsresult
> +TransportSecurityInfo::formatErrorMessage(MutexAutoLock const & proofOfLock, 
> +                                          PRErrorCode errorCode,
> +                                          ::mozilla::psm::SSLErrorMessageType errorMessageType,

Line is too long (should be able to remove qualification to fix it).

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ +540,2 @@
>    socketInfo->SetCanceled(err, PlainErrorMessage);
> +  nsString error_string;

Use camelCase.

@@ +543,5 @@
> +  socketInfo->GetErrorMessage(&u);
> +  if (u) {
> +    error_string = u;
> +    NS_Free(u);
> +  }

All of the above should be written:

nsXPIDLString errorString;
socketInfo->GetErrorMessage(getter_Copies(errorString));
Attachment #653320 - Flags: review?(rrelyea)
Attachment #653320 - Flags: review?(honzab.moz)
Attachment #653320 - Flags: review?(bsmith)
Attachment #653320 - Flags: review+
Comment on attachment 653320 [details] [diff] [review]
Patch v1

Review of attachment 653320 [details] [diff] [review]:
-----------------------------------------------------------------

Kai, I just realized that nsIConsoleService is not the right thing to be using for this, and I verified that on #developers. 

Instead, the errors need to be logged in the web console so that they can be associated with a particular tab/page. And, that requires a reference to the docshell, which means the code doing the logging should be in toolkit, not in PSM.

<bsmith> When should an error be logged to the error console and when should the error be logged to the web console? And, should the same error ever be logged into both consoles?
<gavin> errors should always be logged to the web console, generally
<gavin> (assuming they can be tied to a single browser tab)
<gavin> we would like to get rid of the error console
<gavin> but are being hampered by it being the only place some errors appear

<bsmith> Would it be fair to say that, if the error is specific to a particular browser tab, then it goes in the web console; otherwise (e.g. chrome code), in the error console
<gavin> bsmith: yes
Attachment #653320 - Flags: review+ → review-
Severity: blocker → enhancement
Keywords: regression
Summary: Log SSL errors to the error console → Log SSL errors to the web console (for web content) or error console (for extensions)
Please don't delay this work based on "should be done differently". This is a remedy regression fix that's urgently needed. You can work on smarter solutions later. The fact that people want to eventually get rid of the error console shouldn't stop us to implement the remedy right now. Things can be changed later.
bsmith, I agree with kaie here. I agree that the Web console would be nice to use, if it was easy to do. But it's not - this would be a major code change for relatively little gain. Demanding things like that is a prime way to drive away contributors.

That said, I don't think this bug helps with the original problem at all. The user needs to be notified of the problem, in an error message (inline, popup, whatever), not the tester (error console). The end user will never notice a message on error console or web console, so it doesn't help with the original problem/regression we have.
The Thunderbird maintenance team (Standard8) already said that they want the dialog box that PSM used to use for Thunderbird's UI. So, the UI for Thunderbird is supposed to be the dialog box. Nobody from the Thunderbird product asked for this to be done, AFAICT. And, I think this also addresses Ben's concerns for Thunderbird. I also agree that this is reasonable. I already agreed to help Mark get the dialog box into Thunderbird. It would be great if Kai could help with that (in a way that doesn't negatively affect Firefox). 

Gavin from the Firefox team said that they want the errors reported in the web console and not the error console. I agree with him that this is the most reasonable way of doing this.

By the way, are we sure that these errors are not already getting logged to the web console, as part of the mechanism that logs networking errors to the web console? If not, then we should really fix that more general bug.

(In reply to Ben Bucksch (:BenB) from comment #7)
> That said, I don't think this bug helps with the original problem at all.
> The user needs to be notified of the problem, in an error message (inline,
> popup, whatever), not the tester (error console). The end user will never
> notice a message on error console or web console, so it doesn't help with
> the original problem/regression we have.

We already have a bug for a UX for this. I think you guys are already CC'd on it. The first step is getting the design from the UX team. If you're eager to get that bug fixed then I can try to connect you guys with someone from the UX team, if you're not sure who to contact.
(In reply to Brian Smith (:bsmith) from comment #5)
> <bsmith> Would it be fair to say that, if the error is specific to a
> particular browser tab, then it goes in the web console; otherwise (e.g.
> chrome code), in the error console
> <gavin> bsmith: yes


Brian, that brings you back to the original problem of bug 682329, which you were supposed to fix, but didn't.

It would have been your job in bug 682329, and it would be the job of this new approach, to clearly distinguish between scenarios "in a browser window/tab" and "somewhere else". That old detection didn't work, thereby causing popup at incorrect times, and instead of making it work, you had removed it.

Because we currently don't have a clear method (code) to distinguish scenarios, it seems unlikely that we'll get this fixed in a way as trivially as would be required for an immediate remedy.

By rejecting (r-) my patch based on this argument you are preventing pragmatism and you make life very hard.

I would like to ask "what harm is caused by showing SSL errors on the error console instead of the web console?". My answer to that is "none". Even if you argued it were somewhat harmful, it's much more harmful to not have any logs at all for all the other scenarios.

If you block this attempt to find a mimimal workaround in shared PSM code, you are proving to me that forking PSM is the only way to move forward in a pragmatic way.
Brian, I urge you to not hijack bugs I'm filing and give them a different meaning.

You morphed my general, cross-application solution bug description, into a new bug description that only talks about web browsers, and not mentions the needs of Thunderbird at all.

Your new bug subject only mentions "SSL errors on the web" and "SSL errors in extensions". You don't mention "SSL errors in apps like Thunderbird", which was my primary motivation for filing this bug.

I'm reverting this bug to the old description.
Summary: Log SSL errors to the web console (for web content) or error console (for extensions) → Log SSL errors to the error console
Severity: enhancement → blocker
Bob, could you please approach/r+ this patch, that adds one string?

This adds back a string that got added in the past.

If we have this string back, then people can patch Mozilla to get the error console logging if desired, and even have it translated.

The string was translated in the past, so it won't be lots of work for localizers.
Attachment #654546 - Flags: review?(rrelyea)
Comment on attachment 654546 [details] [diff] [review]
patch - RESTORE ONE OLD STRING ONLY

I've changed my mind. Anyone wanting to apply this patch can simply use the non-localized error string, the error details are non-localized anyway.
Attachment #654546 - Attachment is obsolete: true
Attachment #654546 - Flags: review?(rrelyea)
Attached patch patch v4 (obsolete) — Splinter Review
This patch doesn't require changes to the localizable strings.
Attachment #653320 - Attachment is obsolete: true
I have no intention to work on this bug further.

I understand Mozilla doesn't have an interest to work on it either, so I'm closing it as WONTFIX.

If anyone would like to add this functionality to their Thunderbird 17 ESR branch, they can do so by using the attached patch v4. (btw patch v4 addresses most of the review comments)
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
I was chatting with kaie and he said he's still willing to work on this bug, if:
- he doesn't have to fight for it
- only cosmetic changes are requested (not a complete redesign)
- gavin is fine with the Web Console changes happening later, in another bug.
  gavin stated in email that he's OK with that, I think bsmith as well,
  but would be good to have it on record here in the bug, too.
- either bsmith or rrelyea are giving their r+
- all that happens before Monday, because kaie wants this in for TB 17
  (given that TB is now based on ESR only) and the cutoff seems to be on Monday.

gavin, could you confirm the above here, please?
bsmith, based on that, could you do a review, please?

bsmith, you stated that this patch here conflicts with your planned work to re-add the error dialog box in Thunderbird. For me, such end user error messaging is of utmost importance, and I wouldn't want that jeopardized or held up by the logging. Is that a serious conflict or simple to adapt to?

If we could get the error dialog box in TB 17, that would be most important IMHO. bsmith, what do you suggest? kaie and bsmith, maybe you could even work together on that dialog?
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
(In reply to Ben Bucksch (:BenB) from comment #15)
> - gavin is fine with the Web Console changes happening later, in another bug.
>   gavin stated in email that he's OK with that, I think bsmith as well,
>   but would be good to have it on record here in the bug, too.

Right.

> - either bsmith or rrelyea are giving their r+

I already did a complete review of the original patch. As long as all the review comments are addressed in the new patch, I will r+ it when I am asked to review it.

> - all that happens before Monday, because kaie wants this in for TB 17
>   (given that TB is now based on ESR only) and the cutoff seems to be on
> Monday.

I will review it before Monday if I get the review request today.

But, the deadline for this work isn't Monday. AFAICT, most people that have given an opinion agree we can land such changes in the aurora phase.
I'm glad to hear we have agreement at least for the console logging.

I'll attach a patch shortly, so we can get that landed.

(I'll still continue to work on bug 785426 afterwards.)
Attached patch Patch v5 (obsolete) — Splinter Review
I've addressed all your requests.

Using return result.forget();
directly wasn't possible, so I used an assignment prior to forget().

You suggested to use GetCertErrorMessageForLogging,
I used a shorter, still obvious name GetErrorLogMessage.

You asked to remove two parameters from GetErrorLogMessage. I will require them soon again later, but sure, I'll remove them now (and add them back in bug 785426).
Attachment #654567 - Attachment is obsolete: true
Attachment #655079 - Flags: review?(bsmith)
Attached patch patch v6Splinter Review
Exactly same changes as before, but patch merged to latest mozilla-central

(there were some conflicts in context due to the PRInt32 to int32 change)
Attachment #655079 - Attachment is obsolete: true
Attachment #655079 - Flags: review?(bsmith)
Attachment #655203 - Flags: review?(bsmith)
Blocks: 785426
Comment on attachment 655203 [details] [diff] [review]
patch v6

Review of attachment 655203 [details] [diff] [review]:
-----------------------------------------------------------------

r+ if comments addressed.

::: security/manager/ssl/src/SSLServerCertVerification.cpp
@@ +392,5 @@
> +                      mCert);
> +
> +  SSLServerCertVerificationResult *resultPtr = result;
> +  result.forget();
> +  return resultPtr;

Nit: IIRC, you can avoid this awkwardness by changing the return type of the function to already_AddRefed<SSLServerCertVerificationResult> which is what I should have done when I created it.

::: security/manager/ssl/src/TransportSecurityInfo.cpp
@@ +287,5 @@
>  
>  // XXX: uses nsNSSComponent string bundles off the main thread when called by
>  //      nsNSSSocketInfo::Write(). When we remove the error message from the
>  //      serialization of nsNSSSocketInfo (bug 697781) we can inline
> +//      formatErrorMessage into GetErrorLogMessage().

This comment isn't accurate anymore, because it is called by two callers. Remove the second sentence.

::: security/manager/ssl/src/nsNSSIOLayer.cpp
@@ +530,5 @@
>      
>        bool suppressMessage = false; // obsolete, ignored
>        rv = sel->NotifySSLError(csi, err, hostWithPortString, &suppressMessage);
> +      if (NS_FAILED(rv))
> +        suppressMessage = false;

Remove this. There's no need to check the return result or change suppressMessage here because it is ignored and it doesn't mean "suppressLogging."

@@ +540,3 @@
>    socketInfo->SetCanceled(err, PlainErrorMessage);
> +  nsXPIDLString errorString;
> +  socketInfo->GetErrorMessage(getter_Copies(errorString));

This should be GetErrorLogMessage, right? (Also, note the comment says GetErrorMessage too.)
Attachment #655203 - Flags: review?(bsmith) → review+
Attached patch patch v7 [leaks]Splinter Review
this patch addresses all change requests
If I understand corectly, the aurora-17 merge has already happened?
I propose to uplift this patch to aurora.
Comment on attachment 655583 [details] [diff] [review]
patch v7 [leaks]

requesting approval for aurora 17

Bug caused by (feature/regressing bug #): 682329

User impact if declined: no user feedback for ssl errors in thunderbird (and non-browser-toplevel connections in firefox)

Testing completed (on m-c, etc.): I performed local testing, with the help of the attached test addon (that simplifies opening background connections). Yes, we get error messages on the error console using this patch.

Risk to taking this patch (and alternatives if risky): no risk

String or UUID changes made by this patch: one string added back (same ID and same contents as removed in bug 682329)
Attachment #655583 - Flags: review+
Attachment #655583 - Flags: approval-mozilla-aurora?
> String or UUID changes made by this patch: one string added back (same ID
> and same contents as removed in bug 682329)

FYI, had been removed with attachment 565141 [details] [diff] [review]
https://hg.mozilla.org/mozilla-central/rev/1e937319fe2b
Comment on attachment 655583 [details] [diff] [review]
patch v7 [leaks]

unfortunately this patch has a leak.
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=a6890a3b8f72
Attachment #655583 - Flags: approval-mozilla-aurora?
(In reply to Brian Smith (:bsmith) from comment #20)
> > +
> > +  SSLServerCertVerificationResult *resultPtr = result;
> > +  result.forget();
> > +  return resultPtr;
> 
> Nit: IIRC, you can avoid this awkwardness by changing the return type of the
> function to already_AddRefed<SSLServerCertVerificationResult> which is what
> I should have done when I created it.

I believe this change introduced the leak. I had changed the final return of the function and the function signature. But I missed that there are additional code paths where the function returns simply a new pointer. I think changing all the places that currently do "return new" is an even bigger awkwardness.

This function simply returned a newly allocated pointer, and all existing code paths uses the same style of using "return new". I will undo this reviewer request, run a try build, and if it has zero leaks, I'll reland.
FYI, this was also causing Linux debug xpcshell crashes like the one below. Please verify that this runs successfully on Try before re-landing.

https://tbpl.mozilla.org/php/getParsedLog.php?id=14751672&tree=Mozilla-Inbound

TEST-UNEXPECTED-FAIL | /home/cltbld/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_spdy.js | test failed (with xpcshell return code: 1), see following log:
PROCESS-CRASH | /home/cltbld/talos-slave/test/build/xpcshell/tests/netwerk/test/unit/test_spdy.js | application crashed (minidump found)

Operating system: Linux
                  0.0.0 Linux 2.6.31.5-127.fc12.i686.PAE #1 SMP Sat Nov 7 21:25:57 EST 2009 i686
CPU: x86
     GenuineIntel family 6 model 23 stepping 10
     2 CPUs

Crash reason:  SIGSEGV
Crash address: 0x0

Thread 0 (crashed)
 0  libmozalloc.so!mozalloc_abort [mozalloc_abort.cpp : 23 + 0x0]
    eip = 0x006d0edd   esp = 0xbfc79760   ebp = 0xbfc79778   ebx = 0x006d1118
    esi = 0x00c59844   edi = 0xbfc797a8   eax = 0x0000000a   ecx = 0x00000001
    edx = 0x00c5a32c   efl = 0x00010246
    Found by: given as instruction pointer in context
 1  libxul.so!NS_DebugBreak_P [nsDebugImpl.cpp : 423 + 0x5]
    eip = 0x02008b55   esp = 0xbfc79780   ebp = 0xbfc79bb8   ebx = 0x02d08be4
    esi = 0xbfc7ce48   edi = 0xbfc797a8
    Found by: call frame info
 2  libxul.so!nsRunnable::Release [nsThreadUtils.cpp : 30 + 0x26]
    eip = 0x01fbaaf2   esp = 0xbfc79bc0   ebp = 0xbfc79bf8   ebx = 0x02d08be4
    esi = 0x088a1180   edi = 0x00000000
    Found by: call frame info
 3  libxul.so!nsRefPtr<mozilla::psm::<unnamed>::SSLServerCertVerificationResult>::~nsRefPtr [nsAutoPtr.h : 874 + 0x8]
    eip = 0x01c11e9f   esp = 0xbfc79c00   ebp = 0xbfc79c18   ebx = 0x02d08be4
    esi = 0x088a0750   edi = 0x00000000
    Found by: call frame info
 4  libxul.so!mozilla::psm::::CertErrorRunnable::~CertErrorRunnable [SSLServerCertVerification.cpp : 231 + 0x26]
    eip = 0x01c12462   esp = 0xbfc79c20   ebp = 0xbfc79c38   ebx = 0x02d08be4
    esi = 0x088a0750   edi = 0x00000000
    Found by: call frame info
 5  libxul.so!mozilla::psm::::CertErrorRunnable::~CertErrorRunnable [SSLServerCertVerification.cpp : 231 + 0x8]
    eip = 0x01c12490   esp = 0xbfc79c40   ebp = 0xbfc79c58   ebx = 0x02d08be4
    esi = 0x088a0750   edi = 0x00000000
    Found by: call frame info
 6  libxul.so!nsRunnable::Release [nsThreadUtils.cpp : 30 + 0x3]
    eip = 0x01fbab2b   esp = 0xbfc79c60   ebp = 0xbfc79c88   ebx = 0x02d08be4
    esi = 0x088a0750   edi = 0x00000000
    Found by: call frame info
 7  libxul.so!nsCOMPtr<nsIRunnable>::~nsCOMPtr [nsCOMPtr.h : 492 + 0x8]
    eip = 0x00fd2a92   esp = 0xbfc79c90   ebp = 0xbfc79ca8   ebx = 0x02d08be4
    esi = 0xbfc79cec   edi = 0x00000000
    Found by: call frame info
 8  libxul.so!nsThread::ProcessNextEvent [nsThread.cpp : 604 + 0xe]
    eip = 0x01fff779   esp = 0xbfc79cb0   ebp = 0xbfc79d18   ebx = 0x02d08be4
    esi = 0x08486090   edi = 0x00000000
    Found by: call frame info
 9  libxul.so!NS_ProcessNextEvent_P [nsThreadUtils.cpp : 220 + 0x11]
    eip = 0x01fbad29   esp = 0xbfc79d20   ebp = 0xbfc79d48   ebx = 0x02d08be4
Target Milestone: mozilla17 → ---
Attachment #655583 - Attachment description: patch v7 → patch v7 [leaks]
Attached patch patch v9Splinter Review
This patch changes patch v7 to the simpler function API that returns a pointer.

I did a Linux-only mochitest-only try-run and it worked fine, no leaks:
https://tbpl.mozilla.org/?tree=Try&rev=43760f6be86b

I've just started another full try-run
https://tbpl.mozilla.org/?tree=Try&rev=f867191f0c64

If that succeeds, I'll re-land.

I'm marking this patch as r+, because it keeps the function interface that bsmith had originally written, and because the proposed change turned out to be worse.
Attachment #655737 - Flags: review+
Keywords: late-l10n
https://hg.mozilla.org/mozilla-central/rev/c9e473186b60
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
Comment on attachment 655737 [details] [diff] [review]
patch v9

requesting approval for aurora 17

Bug caused by (feature/regressing bug #): 682329

User impact if declined: no information for ssl protocol errors at all, in thunderbird (and non-browser-toplevel connections in firefox), users unable to understand why thunderbird fails, administrators unable to investigate the cause.

Testing completed (on m-c, etc.): I performed local testing, with the help of the attached test addon (that simplifies opening background connections). Yes, we get error messages on the error console using this patch.

Risk to taking this patch (and alternatives if risky): no risk

String or UUID changes made by this patch: one string added back (same ID and same contents as removed in bug 682329)
Attachment #655737 - Flags: approval-mozilla-aurora?
Though this is re-adding a string that was once there won't this trip l10n alerts, breaking central string-freeze?  I'm inclined to have this ride the trains since there is a string change involved.
I'm disappointed. Please reconsider, as this fix is intended for the Thunderbird ESR 17 cycle.

I had made pressure to people to get this patch reviewed prior to the merge, but Brian in comment 16 (and Mike Hommey in another bug) said, we don't need to hurry, because it can be uplifted to Aurora without a problem.

Should you really reject this patch because of the string, then I propose to approve patch v4 - which uses an embedded english-only string.
I'd prefer the embedded english-only string for the uplift.

And Brian and Mike were wrong, uplifting patches with l10n impact is troubling in general.
I've modified the patch v9 (the one that landed on mozilla-central).

This modified version has the english-only error string embdedded.

No l10n changes.

Feel free to compare v9 and v9-aurora-17, the changes are trivial.
Attachment #658470 - Flags: approval-mozilla-aurora?
Attachment #655737 - Flags: approval-mozilla-aurora?
Attachment #658470 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Keywords: late-l10n
Keywords: verifyme
Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/17.0 Firefox/17.0

Verified on 17 beta 1, Ubuntu 12.04, Mac OS 10.7, Windows 7. Error messages displayed in error console as described in comment 2.
mass remove verifyme requests greater than 4 months old
Keywords: verifyme
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: