Closed Bug 401575 Opened 17 years ago Closed 17 years ago

Support cert overriding from SSL error pages

Categories

(Core Graveyard :: Security: UI, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED
mozilla1.9beta1

People

(Reporter: johnath, Assigned: johnath)

References

()

Details

(Whiteboard: [has patch][has reviews])

Attachments

(2 files, 9 obsolete files)

The changes made in 327181 result in PSM "bad certificate" warnings being presented as error pages with no forward progress instead of the old dialogs, which defaulted to unsafely allowing clickthrough.  Even though that work also introduced a mechanism (bug 387480) for adding permanent exceptions so that users could declaratively establish trust with certs to known sites, the result has still been a fair bit of user confusion and consternation.

On the other hand, PSM has multiple consumers, some dialog, some error page, and it's arguably not appropriate to try to compel them all to accept some kind of override UI in the warning.  Instead, I propose that:

- PSM continue to dump to an error page when available, a dialog when not, and that PSM not include any direct-override UI
- netError.xhtml be modified to include an advanced section which allows current-session override ability (styled off and on so that it doesn't show up for other kinds of errors).  Permanent override ability would still require use of the PSM Add Exception UI that already exists.  This lets people see sites in a heightened state of awareness, but requires them to be very explicit before we allow a permanent override.

The concern from the security side of things is obviously that clickthrough acceptance rates will be high, but I'm not sure we can let that concern trump the pervasive concern that we are blocking large swaths of the internet.

As a reference data point for these discussions, which I wish I could link to directly, but which requires a paid netcraft subscription, current estimates are that there are approximately 700k SSL sites on the net presenting valid, CA-signed certificates, and that there may be twice that number again which present either invalid or self-signed certs.
Flags: blocking1.9?
Doesn't this go right back to training users to habitually override errors?
Isn't the habitual override EXACTLY the thing we're most trying to defeat?

Aren't temporary overrides WORSE than permanent ones, in terms of how 
habit forming they are, due to greater frequency?

If other products are unwilling to enhance their UI, and are willing for 
their users to continue with the habitual overrides, must we also facilitate habitual overrides for all users of FireFox and ThunderBird to accomodate 
those products?

What percentage of Alpha/Beta users have really complained (1%?  0.1%?  0.01%  0.001%?) who weren't satisfied when they learned how to use the exceptions?  

We anticipated, expected, and received a vocal response from a small minority
of users.  We anticipated that we would NOT hear from the people who have no problem with the new code, and so the vocal record would SEEM to be overwhelmingly against the change (See the discussion at the URL above).  

Are we ready to abandon our goal of ending habitual override over the first dozen complaints (all of which were anticipated)?  

As for 700k vs a larger number, the question is: how many average Mom & Pop 
mozilla users encounter an invalid or self-signed cert in (say) a week 
(or even in a year) that is really in front of a valid site?  

If that number is (say) 1%, shall we facilitate habitual override for all 
users for the sake of that 1%?  

Even if the number is (say) 10%, must we facilitate habitual override for 
the other 90% to satisfy that 10%?  At what percentage do we decide that 
we must facilitate habitual overrides to accomodate them? 

Can we find a way to satisfy the minority without facilitating habitual 
override for the majority?  Have we given up on bug 399275?

Elsewhere, in other bugzilla RFEs, many new cert processing features have 
been requested for Mozilla products, features such as AIA cert fetching, 
CRL fetching, cert caching, etc.  Many of those features provide new ways
to detect invalid certs.  Is there ANY point to those features if the users 
are simply going to habitually override errors?  
Those are two examples of common bookmarks which will break entirely. (In reply to comment #1)
> Doesn't this go right back to training users to habitually override errors?
> Isn't the habitual override EXACTLY the thing we're most trying to defeat?

I hear what you're saying Nelson, and I'd like to believe that you trust Johnathan (and me) to be making decisions that are in the best interests of users, and taking into account the viewpoints held by all members of our community, including yourself.

You're making a few assumptions here, though. First and foremost that Johnathan is proposing a 180 degree shift in policy, which is not what I'm hearing him saying. In fact, I'm hearing him say something more like what he wrote on his blog a while back (http://blog.johnath.com/index.php/2007/10/11/todo-break-internet/).

> We anticipated, expected, and received a vocal response from a small minority
> of users.  We anticipated that we would NOT hear from the people who have no
> problem with the new code, and so the vocal record would SEEM to be
> overwhelmingly against the change (See the discussion at the URL above).  

That's right, we did. Expecting a vocal response does not mean, however, that we will automatically ignore what that community has to say. There have been a lot of good counterpoints raised to the foundational assumptions of the current design (namely that the majority of SSL encounters one has are using valid SSL certs), and I think that we've all been incredibly responsible in teasing out what the right things to do are.

> Are we ready to abandon our goal of ending habitual override over the first
> dozen complaints (all of which were anticipated)?  

No, and I think your statement is based on an assertion that providing any UI that allows a user some way forward from an invalid SSL certificate will always result in habitual override.

The fact is that only one such UI has been offered to users (a "Whatever" button, essentially) and now Johnathan is positing another, where a user must read the warning text in order to understand how to proceed.

He's not basing this idea on nothing, either. The Anti-Phishing bubble contains a click-through, but it's not a big friendly button, and it's text is couched in the context of the text of the warning. Testing on that UI showed that the vast majority (all but one) of test subjects took the safe alternative in those scenarios.

So, I don't think the design goal is to allow habitual override; it's to allow informed override.

> As for 700k vs a larger number, the question is: how many average Mom & Pop 
> mozilla users encounter an invalid or self-signed cert in (say) a week 
> (or even in a year) that is really in front of a valid site?  

https://amazon.com
https://gmail.com

Those are both bookmarks that "mom & pop" (as an aside, I really hate that characterization) users had which would be broken with this change. Not just broken, but inexplicably broken.

> Elsewhere, in other bugzilla RFEs, many new cert processing features have 
> been requested for Mozilla products, features such as AIA cert fetching, 
> CRL fetching, cert caching, etc.  Many of those features provide new ways
> to detect invalid certs.  Is there ANY point to those features if the users 
> are simply going to habitually override errors?  

Yes. Key continuity management, for example, gives us extra information which allows us to strengthen or weaken the approach we take to warning users. What Johnathan is proposing in this bug, as I see it, is to act as responsibly as possible given the information that we have about the state of the web, the task that a user is attempting to accomplish, and the security context of a user given the situation.
To break habitual override, you have to do something that requires thought.
Displaying text (that can be ignored) doesn't help.  Increasing from one
click to two or even three doesn't help, if they can all be done without 
thinking about it.  

The exception dialog requires some thought.  The user must input some things.
Requires at least some copy-n-paste.  Requires that the user pays attention.
That's the crucial part to keep in.  

You want to shorten the path to the exception dialog?  I don't object. 
Elsewhere I have called for shortening the path to the cert manager, a LOT.
But don't take away the steps that require the user to do more than click.
Let me suggest a test for "will it become habitual?".  It's simple.
Can they do it with their eyes closed, after a few practice tries?
If so, it's no defense against habitual override.
This is a bit off-topic, but...

(In reply to comment #2)
> > As for 700k vs a larger number, the question is: how many average Mom & Pop 
> > mozilla users encounter an invalid or self-signed cert in (say) a week 
> > (or even in a year) that is really in front of a valid site?  
> 
> https://amazon.com
> https://gmail.com
> 
> Those are both bookmarks that "mom & pop" (as an aside, I really hate that
> characterization) users had which would be broken with this change. Not just
> broken, but inexplicably broken.

Those are the wrong URLs.  The correct ones are:
https://www.amazon.com
https://mail.google.com

In the Amazon example, I have been unable to get to the https://amazon.com page when starting from http://amazon.com. And when you start at http://amazon.com and try to perform any operation that leads to an SSL page, it redirects you to the correct www URL.  In other words, I would not expect people to have this problem. It would be strange for someone to bookmark a page, and then edit it to remove the "www." part. 

I'd like to suggest that we try to be very specific about problems that average users will encounter. For example, if we thought that the problem of users dropping off the "www." part was a common problem, perhaps we should address that issue specifically, maybe with text on the error page or perhaps some other helpful mechanism. 

(In reply to comment #0)
> - PSM continue to dump to an error page when available, a dialog when not, and
> that PSM not include any direct-override UI
> - netError.xhtml be modified to include an advanced section which allows
> current-session override ability (styled off and on so that it doesn't show up
> for other kinds of errors).  Permanent override ability would still require use
> of the PSM Add Exception UI that already exists.  This lets people see sites in
> a heightened state of awareness, but requires them to be very explicit before
> we allow a permanent override.

How about this change to your proposal:
* netError.xhtml gets modified to include, only in the case of trust-related SSL errors, an advanced section that contains a button linking directly to the Execptions dialog box.
* Users who encounter this type of error can click on the button to open the Exeptions window.  They then *type* in the URL of the host in question, and then click View Certificate.  The client fetches the cert in question, showing the list of problems with the certificate.  They can then add the exception.
* The exception is permanent

Users would have to make 3 clicks, and would have to manually type in the hostname of the site they are visiting.  They cannot do this through muscle memory.  Still, administrators will be able to add sites without *too* much trouble, and it becomes more discoverable to them.

Why can't the URL be auto-filled in? 
> Users would have to make 3 clicks, and would have to manually type in the
> hostname of the site they are visiting.  They cannot do this through muscle
> memory.  Still, administrators will be able to add sites without *too* much
> trouble, and it becomes more discoverable to them.
> 

3 clicks plus typing in the host name (very error prone) seems overly complex just to override a cert warning for one session - especially for 100+ hosts.  Also, you say this will be permanent - there are cases where I want it to just persist for the session so I am reminded that the cert is invalid each time I visit the site.  How would we do that?

Seems the easiest way to appease the "small minority" (I think we are under estimating the impact as well) would be to have an about:config pref to enable FF2-type behavior.
(In reply to comment #6)
I proposed pretty much this in bug 398718 comment 45 (including a badly-made mockup), with the exception that I suggested, like Matthew Zeier in comment 7, that the URL should be filled in automatically.

(In reply to comment #8)
> Seems the easiest way to appease the "small minority" (I think we are under
> estimating the impact as well) would be to have an about:config pref to enable
> FF2-type behavior.

That is bug 399275, although there hasn't been too much discussion on that recently.
The only part of the exception dialog that is any improvement (in terms of
reducing habitual override) over "just another click" is the need to fill
in that text box with the URL.  With auto-fill, this dialog becomes just 
another click that you can get past with your eyes closed.  
(In reply to comment #7)
> Why can't the URL be auto-filled in? 

Users who can click-through dialog boxes will do so.  We are embarking on this effort to increase users' safety. We know that users who are in the process of being phished will click on anything that gets them to what they *think* is the correct server.

By requiring users to type the URL, we expect average users being phished to complete the task at a lower rate than following a click-through.

Bob I totally get what your saying but why not let me play with a loaded gun and give be an about:config pref that turns on URL auto-fill?
(In reply to comment #8)
> 3 clicks plus typing in the host name (very error prone) seems overly complex
> just to override a cert warning for one session - especially for 100+ hosts. 
> Also, you say this will be permanent - there are cases where I want it to just
> persist for the session so I am reminded that the cert is invalid each time I
> visit the site.  How would we do that?

You raise an interesting point about permanent vs. temporary exceptions. My assumption had been that admins who wanted to add a host would want that to be permanent.  Also, as Nelson pointed out, the more times people see a dialog the more likely they are to click through it without thinking. 

What sorts of scenarios should we consider where a temporary exception would be needed?

And as you hint, this particular proposal does not address the case where admins have many devices (routers, web servers, etc.) that all suffer from self-signed certs, for example. That issue is discussed in another bug report (which I don't see right now...anyone? anyone?)

> Seems the easiest way to appease the "small minority" (I think we are under
> estimating the impact as well) would be to have an about:config pref to enable
> FF2-type behavior.

The other point is that admins are increasingly the target of spear-phishing attacks. We have customers who tell us about how good those attacks are getting these days, and that a shocking percentage of people fall for them, even people in the IT Security teams! :-(

Bob, I think the exception dialog should offer a choice between setting a
temporary or permanent exception.  As I understand it, the proposal in 
comment 0 was to have a two-click (or three-click) exception addition that bypassed the exception addition dialog.  My point about temporary exceptions being more habit forming that permanent was made in the context of that proposal.  If that proposal is changed to one that requires some thought,
not mere click-throughs, then I have no objection to temporary exceptions.

As someone frequently called upon to diagnose sites with bad certs, I often
want to be able to get past the bad cert, just to see what's behind it, 
without wanting it to become permanent.  I'm quite willing to use the 
cert exception dialog for that.  I do not consider the small amount of 
work it asks of me to be prohibitive.  I would like to be able to get to 
the cert manager (not the exception dialog) in much less work than 
presently required, which is why I filed bug 401663 and bug 401665.
Attached patch First pass (obsolete) — Splinter Review
This patch adds a section to the error pages (visible only from Firefox currently, see implementation notes below) which allows users to click "Overriding security warnings" and get an expanded section.  The expanded section has some warning text, and an "Add Exception" button.  When clicked, the Add Exception button pops up the existing "Add Exception" dialog, pre-populated with the hostname and cert information.  Adding the exception at that point dismisses the dialog and reloads the page.  The exception is permanent.

I'd really like it if we could add temporary exceptions as well/instead, but the exception dialog doesn't support it, and nsICertOverrideService doesn't currently expose a way to do so directly either.  If this is an easily containable change, it would be great, otherwise I'll file a follow-up bug for it.

Obviously no one wants to go back to the bad old days of the old dialogs, security wise but as I've said before, my own feeling is that the current error pages go too far the other direction.

Habituation is a problem but keep in mind that the old dialogs spoke in unclear terms, didn't discuss consequences, and had an unsafe default - hitting enter, or clicking a button marked "OK" was the same as extending trust.  In the current patch (screencap forthcoming) the user has to expand a section on "Overriding Security", click a button named "Add Exception" and then confirm a dialog which talks in reasonably unfriendly tones about what's happening.

This isn't about giving up on better SSL security, it's about trying to find a balance where the user hassle compels site operators to clean up their acts, while users who need to push on through are given a way to do so that is cautionary, but findable.  

Implementation notes:
 - This patch hooks into onLocationChange, and while it only introduces a single regex check to the critical path, we'll want to watch for a Tp hit.  I would *hope* that one (eval-time-compiled) regex check isn't too expensive, but these critical-path functions have bitten us before.

 - The patch has a sort of crazy backwards way of listening for button press.  Our error pages don't have chrome privileges for Very Good Reasons, but invoking cert override dialogs (or directly invoking the exception-adding XPCOM service for that matter) require them.  Instead of trying to find a way to let unprivileged error pages do privileged things, the patch leaves the error pages as-is, and browser is the one that attaches the handler.  Browser can "reach in" and attach a privileged listener to the error pages much more safely than most mechanisms that would "reach out" to privilege from unprivileged space.

- There seems to be a fleeting bug where sometimes hitting "reload" on an SSL error page causes the override link to disappear.  This might just be a styling glitch.
Comment on attachment 286840 [details] [diff] [review]
First pass

In two places:

>+<!ENTITY securityOverride.warningText "
>+<p>It is possible to add an exception for this site but if you don't normally have problems accessing this site, you should not do so.  Because of the security problems, you should not share any sensitive information or passwords with this site.</p>
>+<p><strong>Only continue if you know it is safe to do so.</strong></p>

s/this site but if/this site, but if/

Please drop the double spaces between the sentences. It just reminds me of the old typewriter days (before my time) where you needed the two spaces just to tell the difference between two sentences. That problem does not exist today with variable fonts and the like, so we should not patronize those ancient practices.

"Because of the security problems" sounds really odd. Need something worded better, imho. Maybe add "encountered" or "experienced" or something (e.g., "Because of the encountered security problems" or "Because of the security problems encountered")?
If I understand the patch, I think I support it with the exception of pre-filling.  We should not pre-fill by default.  If power users feel they need pre-filling for their intranet-based web sites and they can live with a hidden pref, then I think I'm on board.


Comment on attachment 286840 [details] [diff] [review]
First pass


First of all, I just want to say that this would be a huge improvement over the current situation, and I think most users who have problems with the current implementation will find this solution the right compromise (and I think Bob Lord's suggestion (comment 18) of not pre-filling by default but with a hidden pref to enable pre-filling strikes the right balance also).

I think bug 399275 should continue to be considered, because if I understand correctly, it would solve the problem of only accepting certificates temporarily, and of being reminded each time about problems with a security certificate.

(In reply to comment #17)
> >+<!ENTITY securityOverride.warningText "
> >+<p>It is possible to add an exception for this site but if you don't normally have problems accessing this site, you should not do so.  Because of the security problems, you should not share any sensitive information or passwords with this site.</p>
> >+<p><strong>Only continue if you know it is safe to do so.</strong></p>
> 
> s/this site but if/this site, but if/

How about 'however' instead of 'but', IMHO it reads better:
"It is possible to add an exception for this site, however, if you don't normally have problems accessing this site, you should not do so."

> "Because of the security problems" sounds really odd. Need something worded
> better, imho. Maybe add "encountered" or "experienced" or something (e.g.,
> "Because of the encountered security problems" or "Because of the security
> problems encountered")?

May I suggest: "Because of the problems with the site's security certificate, you should not share any sensitive information or passwords with this site."
I noticed one more thing after looking at the screenshots (attachment 286843 [details]): The current text talks about 'servers', but the patch refers to 'sites'.
I would like to share a story about how some users learn to click through dialogs that stand between them and the web page they think they want to visit.  

A few months ago I was asked to visit an intranet web site before it went live.  The web site used a self-signed cert rather than using one that chained to the corporate root. I knew the owner of the site, so I called him to let him know that he needed to work with the IT team to get a cert from the corporate CA.  I explained that it was not appropriate for this server to use a self-signed cert, which of course invoked the old warning dialog.

He was surprised, and said that there was no such dialog.  I realized that in the past he had clicked the option to remember the decision, so he never saw the dialog again.  I asked him to create a new FF profile and to try again so he could see what other people in the company were going to see.  Again, he denied seeing any warning box, even after me describing the dialog.

I was quite confused.  I asked him to set up a new profile, and for him not to touch the keyboard until I said to do so. Eventually, of course, he got the warning.  "Oh, THAT!", he said.  "Yeah, I didn't know what you meant.  Whenever I see that dialog, I just click the remember button". I also found out that this was what they told new people coming onto the team. :-(

For him, that dialog was an obstacle.  Not only did he not read the message and take action, it literally was below his consciousness.  (This guy is smart, btw, but not a techie)

I realize that this is just one anecdote, but I've seen enough of this sort of thing to believe that average (and above-average) people will click through dialogs and not have any idea what the risks are. It becomes muscle memory. I have no reason to believe that people are going to suddenly start reading security warnings. 

Just some food for thought.
With pre-fill, this is a 3-click solution.  The net effect of all the massive
changes is that the user must click 3 times instead of one.  
I can tell a similar story to Bob's.  A certain big comporation's IT security
department (whose job it is to enforce the rule that all internal https 
servers must have valid certs) transferred in a new team member from another
corporate IT group.  Her new job was to reproduce reports of invalid certs 
and then follow the corporate procedures for getting it corrected.  Her 
first few weeks on the job, she rejected every report of sites with invalid
certs, claiming she could not reproduce the problem.  First we had to get 
her to use a new profile (she had already premanently rememered the certs 
from these sites in the past), but then she still claimed she could not 
reproduce the problem.  Walking her through, step by step, revealed that she
was so accustomed to clicking past that dialog that she no longer thought of
it as a problem.  It was just "what you do" when you come to a secure site.
This habitual overriding of these warnings was so ingrained, and so difficult for the IT staff to unlearn, that they finally established a formal procedure 
that starts with a new profile when beginning to investigate EVERY report of 
invalid certs.  If they work hard at noticing the dialog, they are capable of
doing so, but in day to day work, they still just blow past it.  

I don't believe that 3 clicks will substantially change that, not even with 
flashing bold red text and dancing bananas.  
It has been mentioned twice in this bug, that temporary exceptions would be helpful.

Johnathan mentioned the backend does not yet offer temporary exceptions.

In comment 19 Andrew presumed that temporary exceptions would have to get done as part of the pref bug.


My point of view is, supporting temporary exceptions are independent of the other problems. 

I could provide a patch that adds the backend support, that adds a new checkbox to the add-exception-dialog, and that has a new column in cert manager to display the lifetime of the exception.

Only permanent exceptions would get saved to disk, and temporary exceptions are for the current session only.
(In reply to comment #15)
> I'd really like it if we could add temporary exceptions as well/instead, but
> the exception dialog doesn't support it, and nsICertOverrideService doesn't
> currently expose a way to do so directly either.  If this is an easily
> containable change, it would be great, otherwise I'll file a follow-up bug for
> it.

Patch attached.
Aren't temporary exceptions the subject of bug 399696?
(In reply to comment #26)
> Aren't temporary exceptions the subject of bug 399696?

Thanks for the pointer, Nelson. I'm open to move the patch over to bug 399696 and have the review discussion there.

I'm also open to include the patch as part of this bug, because Johnathan considers to make use of the temporary feature and modify his patch accordingly. So, if we move my patch over to bug 399696, this bug would probably have to depend on (and wait) for bug 399696.

I'll let Johnathan decide.
(In reply to comment #27)
> (In reply to comment #26)
> > Aren't temporary exceptions the subject of bug 399696?
> 
> Thanks for the pointer, Nelson. I'm open to move the patch over to bug 399696
> and have the review discussion there.

Please move the patch over to that bug. Having multiple patches for completely different things in the same bug will only lead to confusion later. Please separate the two issues into separate bugs.
Attachment #286865 - Attachment description: Patch to support temporary exceptions → Patch to support temporary exceptions [moved to 399696]
Attachment #286865 - Attachment is obsolete: true
Depends on: 399696
Attached patch Fix CSS hackiness (obsolete) — Splinter Review
Fixing the problem of the disappearing link was more complicated than expected - it appears that there's a bug with how CSS styles are applied to xhtml - I will find a reduced test case for another bug.  In the meantime though, since browser/ is the only current implementation and since all the logic happens in browser.js anyhow, I'm adding the button tag in the browser/ override for netError.dtd.  So non-firefox implementations will get different, explanatory text, but no (useless) button.  Obviously other implementations could do the same thing browser does, or something totally different.

I haven't addressed the string changes already suggested in the bug, since beltzner is ui-r'ng the screenshots anyhow - I'll roll up all the changes there together.

I haven't added the pref to toggle pre-populating either.  This is programmatically pretty straightforward, but it's a weird user experience, to have a clearly context-sensitive UI behaving in a not-context-sensitive way.  We could justify that if we thought it was adding security, but I'm not sure it does.  Counting clicks doesn't seem like the right way to assess how easy it is to habituate to an interface.  Does asking users to type in the hostname again add psychological weight, or just labour?  I think the aim is to draw attention, at every step, to the decision they are making and the potential consequences, but I'm not sure that the added typing will do that any moreso than the current proposal.  On the other hand, as I say, it is an easy thing to implement if we decide otherwise.

Kai's patch to allow temp exceptions is a great thing, but I think it's a good idea to track it separately.  Since we're reusing the same UI, this can take advantage of it whenever it lands.

This patch will probably need reviews from browser, docshell, and security peers - there are minor changes in other places (netError CSS in toolkit, netError strings in dom) but that should do it.  Tagging gavin for the browser review first to get things going - the remaining discussion points are minor, code-wise, and shouldn't block reviews, I don't think.
Attachment #286840 - Attachment is obsolete: true
Attachment #286899 - Flags: review?(gavin.sharp)
Attachment #286843 - Flags: ui-review?(beltzner)
(In reply to comment #23)
> I can tell a similar story to Bob's. 
[...snip...]
> reproduce the problem.  Walking her through, step by step, revealed that she
> was so accustomed to clicking past that dialog that she no longer thought of
> it as a problem.  It was just "what you do" when you come to a secure site.
> This habitual overriding of these warnings was so ingrained, and so difficult
[...snip...]
> I don't believe that 3 clicks will substantially change that, not even with 
> flashing bold red text and dancing bananas.  

Thanks for story time, Bob and Nelson. They were informative anecdotes, though not precisely news. We have spoken about this before, at length, in this bug, in other newsgroups, and even face to face. Beyond sending a notarized letter, I'm not sure what else I can possibly do to assure you that Johnathan and I have both heard and appreciated your arguments and feelings on this matter. However, at this time, I'd like to remind you both of Bugzilla Ettiquette (ref:  https://bugzilla.mozilla.org/page.cgi?id=etiquette.html):

"Unless you have something new to contribute, then the bug owner is aware of all the issues, and will make a judgement as to what to do."

Your interests in this matter (avoiding habituation, taking a hard line against misconfigured and self-signed certificates) are well-understood. As, I think, are my interests (providing meaningful and usable options for a user who is looking to complete a task, supporting IT professionals and other weberati who Know What They Are Doing).

The current state of affairs is looking at some UI which attempts to prevent click-through by not presenting that option without the user looking for it. The rationale for this is that habituation is about more than just SSL error dialogs, it's about an entire class of dialogs to which a user is saying "Whatever". Our expectation is that habituation will be lessened through a UI which requires that the user first *expose* the bypass mechanism, and one that does not resemble other "whatever" confirmation dialogs (EULAs, JS alerts, confirmation dialogs) that he encounters throughout his computing day. This lessening will hopefully provide a balance.

I would be absolutely thrilled to hear (or have bugs filed) on alternative suggestions of what we can do. Some ideas I've thought of are:
 - forwarding users to the proper domain when there is a domain mismatch
 - asking users to "name" their self-signed certificates before allowing an exception to be added (to get them thinking about why they have to do that)
 - key continuity management of any fashion
 - dropping all security indicators for self-signed certificates and just passing the traffic through as if it were normal HTTP (yes, this allows for MITM attacks if the user doesn't notice the lack of SSL indicators, though one wonders what advantage that has over just MITMing the user and redirecting https:// traffic to http:// traffic at some malicious host)

Sadly, none of those ideas can be implemented in time for beta, but this bug can and should be in order to keep more people using the beta for testing purposes.

Now, on to my ui-review ...
Comment on attachment 286843 [details]
Screens of error page pre- and post-expansion

Johnathan,

This is mostly right, but a couple of UX thoughts:

 - not gonna deal with the error strings in this ui-r, they're out of scope for this bug
 - instead of only offering "Overriding Security Warnings >>", we should offer a "Close Page" button in the non-expanded UI
 - need less text in the expanded UI
 - should add a background colour to the background UI to call attention to it, make it look bad

Horrible ASCII art:

Error page collapsed (Default)

  .    Secure Connection Failed
 /!\   ----------------------------------------------------------------
 '''   amazon.com uses an invalid security certificate

       The certificate is only valid for www.amazon.com

       (Error code: ssl_error_bad_cert_domain)
       ----------------------------------------------------------------
       * This could be a problem with the server's configuration, or it
         could be someone trying to impersonate the server.

       * If you have connected to this server successfully in the past, 
         the error may be temporary, and you can try again later.

       * You can see and change your current list of servers with known
         security problems in your advanced encryption settings.

        (( Close this page ))

         To ignore this warning, you must _add an exception..._


Error page expnded (by clicking on "add an exception...")


  .    Secure Connection Failed
 /!\   ----------------------------------------------------------------
 '''   amazon.com uses an invalid security certificate

       The certificate is only valid for www.amazon.com

       (Error code: ssl_error_bad_cert_domain)
       ----------------------------------------------------------------
       * This could be a problem with the server's configuration, or it
         could be someone trying to impersonate the server.

       * If you have connected to this server successfully in the past, 
         the error may be temporary, and you can try again later.

       * You can see and change your current list of servers with known
         security problems in your advanced encryption settings.

       .---------------------------------------------------------------.
       | You should not add an exception if you are using an internet  |
       | connection that you do not trust completely, or if you are    |   <-- yellow
       | not used to seeing this warning for this site.                |
       '---------------------------------------------------------------'

       (( Close this page ))  ( Add Exception... )
Attachment #286843 - Flags: ui-review?(beltzner) → ui-review-
What action will happen when one clicks "Close this page"?
The same as "close this tab" or "close window"?
(In reply to comment #33)
> What action will happen when one clicks "Close this page"?
> The same as "close this tab" or "close window"?

That's the idea, yes. That way the "whatever" choice (ie: the big button available) is a more safe option, and reading is required in order to get to the override options.

On IRC, reed mentioned to me ..:
> <@reed> beltzner: I liked having the "you should not share any sensitive 
> information or passwords with this site" part in the warning message.

I agree that this information is valuable, but it's not really that clear. If someone's added an exception, they've decided to trust the site. The next time they browse to the website, they'll still trust it. Should they never again enter their information at https://amazon.com?

Instead, I thought we should make it clear that the safety of their internet *connection* was in question, not the endpoint address that they're trying to hit.
(In reply to comment #34)
> (In reply to comment #33)
> > What action will happen when one clicks "Close this page"?
> > The same as "close this tab" or "close window"?
> 
> That's the idea, yes. That way the "whatever" choice (ie: the big button
> available) is a more safe option, and reading is required in order to get to
> the override options.

How about having the button be "Close this page" if there are no previous items in history (i.e. you typed the address in or opened the address from another tab or window in a new tab or window) and change it to "Go Back" when there are previous items in history, and clicking the button would just take you back to the previous page in history. After all, you might just want to return to the page where you clicked on the link to the site with an invalid certificate.
Since it's unlikely that gavin's review comments or beltzner's ui-review comments are going to impact any of the changes made in security/, can I ask Kai or some other PSM peer to review those portions of the patch?  Basically they just concern:
 - support for passing in a location string for pre-population (whether that happens by default in the particular instance here, per comment 18, is a question for callers moreso than the dialog itself, I'd argue).
 - support for returning a boolean result parameter
Comment on attachment 286899 [details] [diff] [review]
Fix CSS hackiness

Questions:


>Index: security/manager/pki/resources/content/exceptionDialog.js
>@@ -76,16 +76,26 @@ function initExceptionDialog() {
>+  
>+  if(window.arguments[0].location) {

What happens if (windows.arguments[0]) is undefined, because the caller did not pass a parameter?
Will this throw?

In the past I've seen code like
  if (window.arguments[0]
      && location in window.arguments[0]
      && window.arguments[0].location)

But I'm not using JavaScript enough to tell whether that's needed or not.


>+  window.arguments[0].certAdded = false;

I can see, in the browser code you are declaring a little "type" for the param variable.
You're declaring the "location", but you're never declaring the "certAdded" member.
Should you declare it?

I think I would prefer "exceptionAdded" as the variable name, since "certAdded" reminds me "this dialog has a cert loaded from the website", but I think you're testing whether an exception got added. Would improve future readibility.
FWIW, I tried your patch, but when I expand the error page section, then click the "Add Exception" button, nothing happens.
This patch changes the following:

 - From Kai's review, I have changed the name of the outparameter to exceptionAdded, and pre-set it in browser.js to false.  This isn't really necessary since the dialog correctly initializes it during startup, but it doesn't hurt to be declarative about expectations.  I also added a version of Kai's expanded sanity check for window.arguments[0].location - I don't think the 2nd of the 3 checks he suggests is necessary (location in window.arguments[0]) but he's absolutely right that I should confirm that window.arguments[0] is defined before checking a member variable.
- From beltzner's UI review, I've made the string changes (different link text than his original review, but confirmed on IRC), shrunken the expanded section, and formatted it to call out that something important is happening.  New screens to follow.

I haven't added the ((Close this Window)) button yet.  Adding it in a way that's visible all the time, and yet drops down and aligns with the "Add Exception" button on expansion (or disappears and has an identical one take its place) is going to make this already hacky-feeling patch a little moreso.  I wonder if he'd consider taking that as an M10 blocker follow-up?  By then, my hope is that I'll have found a reduced test case for the CSS weirdness I've been seeing, and will be able to use CSS again to manage buttons, at which point it would be comparatively easy to do what he suggests.

I've pinged bz for review on the docshell stuff, but also to get his opinion of the overall approach, and his suggestions for alternatives that don't involve a) being in the onLocationChange path, and b) don't involve playing tricks to avoid weird CSS behaviour.

Gavin is still on point for review of the browser components.  He's also concerned about taking a hacky approach here.  mconnor has argued in favour of taking something that solves the problem and gets beta out the door now, but then seriously-for-reals finding a fix for it before shipping.  I am *all for* finding a better way to do this.

(In reply to comment #38)
> FWIW, I tried your patch, but when I expand the error page section, then click
> the "Add Exception" button, nothing happens.

I haven't seen that behaviour, are there STR?  The only things I can think of right away are that either a) the call to checkCert after opening is blocking, and preventing the dialog from showing, or b) the onLocationChange code is not attaching itself properly for some reason.  Is any exception logged?
Attachment #286899 - Attachment is obsolete: true
Attachment #286991 - Flags: review?(gavin.sharp)
Attachment #286899 - Flags: review?(gavin.sharp)
Attached image Updated screencaps
Attachment #286843 - Attachment is obsolete: true
(In reply to comment #39)
> > FWIW, I tried your patch, but when I expand the error page section, then click
> > the "Add Exception" button, nothing happens.
> 
> I haven't seen that behaviour, are there STR?  

What's STR?


> The only things I can think of
> right away are that either a) the call to checkCert after opening is blocking,
> and preventing the dialog from showing, 

I uncommented the call to checkCert for testing, didn't help.


> or b) the onLocationChange code is not
> attaching itself properly for some reason.

I don't know. I added several
  alert("something") 
calls to initExceptionDialog()
and to this._certOverrideButtonHandler = {
      handleEvent : function(event) {

But I don't get any alert boxes, only silence.


> Is any exception logged?

The error console remains empty.
Nothing on the Linux terminal console either.
(In reply to comment #41)
> > > FWIW, I tried your patch, but when I expand the error page section, then click
> > > the "Add Exception" button, nothing happens.
> > 
> > I haven't seen that behaviour, are there STR?  
> 
> What's STR?


Thanks to Nelson for telling me STR = Steps to Reproduce...

I used an obvious test case:
- https://www.cacert.org
- error page
- click on blue link on error page to expand
- click add exception button
- nothing happens, no alert boxes, nothing on the console

Yes, I have it now.  I suspect the problem is in how the error page is being detected - the current code in onLocationChange does not seem to be firing when the page loads the first time.  Of course, it fires on session restore, on reload, on tab-away-tab-back, so I wasn't seeing it during development.

Gavin, mconnor and I are looking for an alternate way to get things done - we had discussed before just making the error page privileged, which would make these problems go away, but would open a large can of worms in terms of security, since these privileged content pages can be caused to appear via content.
So gavin and I have tried about a dozen different ways to make this work and keep getting close, but not quite there.  We think the right way to do this is to get the docshell to fire a domevent on error pages, to let interested parties (like the browser) know that an error has occurred.  We have code that does this:

       if (mLoadType == LOAD_ERROR_PAGE) {
          printf("here's an error\n");
          // Fire a DOM event notifying interested listeners (like browser UI) that
          // an error has occurred.
          nsCOMPtr<nsIDOMDocument> domDoc;
          nsresult rv = mContentViewer->GetDOMDocument(getter_AddRefs(domDoc));
          if (NS_SUCCEEDED(rv)) { 
              nsCOMPtr<nsIDOMDocumentEvent> docEvent(do_QueryInterface(domDoc));
              nsCOMPtr<nsIDOMEvent> event;
              docEvent->CreateEvent(NS_LITERAL_STRING("Events"),
                                    getter_AddRefs(event));
              if (event) {
                event->InitEvent(NS_LITERAL_STRING("DOMErrorPage"),
                                 PR_TRUE, PR_TRUE);
                
                nsCOMPtr<nsIDOMEventTarget> targ(do_QueryInterface(domDoc));
                PRBool defaultActionEnabled;
                targ->DispatchEvent(event, &defaultActionEnabled);
            }
          }
        }

But we can't seem to find a good place for it to live.  First we tried methods like DisplayLoadError, but obviously that's too early since the channel hasn't had time to load, let alone build a DOM tree.  We tried EndPageLoad but error pages don't make it there.  Putting it in OnStateChange works, in that we can detect the error and fire the event, but since no one has told browser about the change, browser still has the old content document, meaning that it still can't access the error page to interact with it.

Going home now - then I'll log in again and keep churning, but in the meantime, anyone with docshell savvy would be much appreciated.  Error pages load as LOAD_BACKGROUND so they don't fire onload - is that something that is up for changing?  Is there a better way to get what we want, here? 
Summary: Add netError support for session overrides for SSL certificate errors → Support cert overriding from SSL error pages
Status: NEW → ASSIGNED
Flags: blocking1.9? → blocking1.9+
Target Milestone: --- → mozilla1.9 M9
Attached patch Use a command listener instead (obsolete) — Splinter Review
Gavin and I were close, but bz cracked it.  All event bubble up to chrome - not just events like the one we were trying to create, but even things like the command event when the user clicks the button.  This patch is a much cleaner approach:

 - On delayed startup in browser, register an eventListener on gBrowser that watches for "command" events. Most of these we'll ignore, but if they come from an error page, we can respond to them.
 - Now we're no longer in the OnLocationChange critical path, which means no processing work done on pageload at all.  That's much better.
 - We're also safer, since no privileged code is touching the error page at all - the event bubbles up to chrome where it is handled, but nothing in the error page content is touched with chrome privileges.  We didn't think there was an obvious hole with the other way of doing things, but this is still preferable.

Re-tagging Gavin for review, but kai and bz, please feel free to jump the gun and offer review on the sections that touch your components.
Attachment #286991 - Attachment is obsolete: true
Attachment #287044 - Flags: review?(gavin.sharp)
Attachment #286991 - Flags: review?(gavin.sharp)
Attachment #287044 - Flags: review?(kengert)
Attachment #287044 - Flags: review?(bzbarsky)
Attachment #286993 - Flags: ui-review?(beltzner)
Comment on attachment 287044 [details] [diff] [review]
Use a command listener instead


> certErrorMismatch=The certificate is not valid for the name %S.
>-certErrorMismatchSingle=The certificate is only valid for name %S.
>+certErrorMismatchSingle=The certificate is only valid for %S.
> certErrorMismatchMultiple=The certificate is only valid for the following names:

Note that this patch picked up (semi-accidentally, but fortuitously) a drive-by change to this error page string.  No need for a rev since nothing has changed semantically, just removed the wayward "name."
I confirm this patch works.


With this patch, the interaction I see is as this:

- error page
- there is an inviting blue link, let's click it
- click the button
- a dialog comes up with a lot of words, but "luckily" (for the poor user)
  there is a button already enabled to "confirm"


My biggest concern is with the last step. We are showing a new dialog to the user, which has a lot of information, but the user is not required to read it.


Could we omit the pre-loading of the cert?

Could we have the "cert status" section be empty and require the user to discover and use the "view cert" button?


In my opinion, this is not just a "make it one click harder".

It also means, at this state the dialog contains much less information.
The user's attention (hopefully) might be drawn to the bold warning further.

The user can not yet click-through, because the confirm-button is not yet enabled (we enable it once the user clicks get-certificate.)

It also means, the dialog feels like we guide the user, and the user actually notices the two steps going on in the dialog:
- general warning first (banks would not ask you to do this)
- click the button
- get the status
- require to confirm what I see

I think with that change, we still make it easily discoverable to override, but require a bit more thinking.


In case my arguments seem convincing, the following change is sufficient to achieve it. In Johnathan's patch, change:
  checkCert();
to
  gDialog.getButton("extra2").disabled = false;
Comment on attachment 287044 [details] [diff] [review]
Use a command listener instead

Besides my proposal in comment 47, here is another nit:


>Index: security/manager/pki/resources/content/exceptionDialog.js
>===================================================================
>+  
>+  if (window.arguments[0]
>+      && window.arguments[0].location) {
>+    // We were pre-seeded with a location.  Populate the location bar, and check
>+    // the cert
>+    document.getElementById("locationTextBox").value = window.arguments[0].location;


When the "add exception" dialog comes up, the "get cert" button is disabled. We usually have that button enabled, as soon as there is something entered into the location field. Should you always enable the button here?
  gDialog.getButton("extra2").disabled = false;


>+    checkCert();

As said in comment 47, I'd prefer to have this call removed (but that would be a UI review comment, not a code review comment).


r=kengert on the code changes in security/
Attachment #287044 - Flags: review?(kengert) → review+
I'm having a very hard time following this bug. I'm an embedder and I'm currently about to be stuck by this (quite likely in a beta we'd ship before the year ends) will have whatever you guys do here today.

comment 32 has:

       The certificate is only valid for www.amazon.com
Is this going to be:

<p>
       The certificate is only valid for <a href="https://www.amazon.com">www.amazon.com</a>
</p>

If not, why not?

IME the sites I've encountered generally are just bad redirect mazes. Abusing a cert because a company refuses to buy wildcards (which they only need to handle redirects). https://bugs.maemo.org/show_bug.cgi?id=2153

I will ask a simple question: Where and how do you expect to contact embedders so that we can deal with changes like this? I can see three other bug reports with obscure references, and zero references to this bug in newsgroups or on the web. And do note that I don't consider notification via newsgroups, blogs, or wikis valid.
Blocks: 402207
(In reply to comment #48)
> (From update of attachment 287044 [details] [diff] [review])
> Besides my proposal in comment 47, here is another nit:
> 
> >Index: security/manager/pki/resources/content/exceptionDialog.js
> >===================================================================
> >+  
> >+  if (window.arguments[0]
> >+      && window.arguments[0].location) {
> >+    // We were pre-seeded with a location.  Populate the location bar, and check
> >+    // the cert
> >+    document.getElementById("locationTextBox").value = window.arguments[0].location;
> 
> 
> When the "add exception" dialog comes up, the "get cert" button is disabled. We
> usually have that button enabled, as soon as there is something entered into
> the location field. Should you always enable the button here?
>   gDialog.getButton("extra2").disabled = false;
> 
> 
> >+    checkCert();
> 
> As said in comment 47, I'd prefer to have this call removed (but that would be
> a UI review comment, not a code review comment).
> 
> 
> r=kengert on the code changes in security/

I have filed bug 402207 (and nom'd it blocking-firefox3) to address the
manipulation of these two behaviours (pre-populate and pre-fetch).  I think we
do need to have that flexibility either way, in addition to deciding what
default to use.  I take that discussion pretty seriously - and don't want it to
be lost if we land this patch as-is for beta.

Doing this will also require that a second option be passed into the dialog to
address pre-fetch, independent of pre-population.  I would propose that in in
the case where we pre-populate-but-don't-pre-fetch, we should absolutely enable
the button.  When we also pre-fetch, I'm not sure it's helpful to do so since
clicking the button won't really do anything (it will, but the result will be
the same).
(In reply to comment #49)
> comment 32 has:
> 
>        The certificate is only valid for www.amazon.com
> Is this going to be:
> 
> <p>
>        The certificate is only valid for <a
> href="https://www.amazon.com">www.amazon.com</a>
> </p>
> 
> If not, why not?

I've filed (and nom'd blocking) bug 402210 for this, and it's something I'd really like to see too because I agree that it's an obvious improvement, if the issues can be addressed.  I don't think it's in scope for this bug though, which is about a specific piece of override functionality.  Absolutely though, changes which reduce the need to use the override are worth pursuing.

> I will ask a simple question: Where and how do you expect to contact embedders
> so that we can deal with changes like this? I can see three other bug reports
> with obscure references, and zero references to this bug in newsgroups or on
> the web. And do note that I don't consider notification via newsgroups, blogs,
> or wikis valid.

I actually don't know what, if any, communication channels exist with embedders as a group, so I don't have an answer here.  Suggestions?
Comment on attachment 286993 [details]
Updated screencaps

- needs a "Get Me Out Of Here" button to provide a safe, alternative option

- the string for adding an exception should be the same for embedders, just with the additional line about how to do so when no friendly button is provided

- the "Or you can add an exception" text should be a little smaller to remove emphasis

I discussed all these issues with Johnathan, and he's got a new patch coming to address this & the other review comments.
Attachment #286993 - Flags: ui-review?(beltzner) → ui-review-
Attached patch UI changes plus a couple nits (obsolete) — Splinter Review
Mostly little things - introducing the "Get Me Out Of Here" button, adjusting the presentation of the link text, and other changes requested by beltzner.  This patch has ui-r=beltzner.

Also included a check for event.isTrusted suggested by gavin to block synthetic events.
Attachment #287044 - Attachment is obsolete: true
Attachment #287154 - Flags: superreview?(bzbarsky)
Attachment #287154 - Flags: review?(gavin.sharp)
Attachment #287154 - Flags: review?(bzbarsky)
Attachment #287044 - Flags: review?(gavin.sharp)
Attachment #287044 - Flags: review?(bzbarsky)
Comment on attachment 287154 [details] [diff] [review]
UI changes plus a couple nits

Looks good, and works in my tests. Just a couple of nits:

>Index: browser/base/content/browser.js

>+  // Attach a listener to watch for "command" events bubbling up from error
>+  // pages.  This lets us fix bugs like 401575 which require error page UI to
>+  // do privileged things, without letting error pages have any privilege
>+  // themselves.
>+  gBrowser.addEventListener("command", function(event) {
>+    
>+    // Don't trust synthetic events
>+    if(!event.isTrusted)
>+      return;

It turns out this isn't really needed because chrome event listeners only get trusted events by default (you need to pass a magic fourth "true" parameter to addEventListener to get untrusted events). Doesn't really hurt to leave it in, I don't think.

(uber-nit: space after "if" :)

>+        try {
>+          url = prefs.getComplexValue("browser.startup.homepage",
>+                                      Ci.nsIPrefLocalizedString).data;
>+          // If url is a pipe-delimited set of pages, just take the first one.
>+          if (url.indexOf("|") != -1)
>+            url = url.split("|")[0];
>+        } catch(e) {
>+          G_Debug(this, "Couldn't get homepage pref: " + e);
>+        }

This G_Debug won't work here :) Components.util.reportError should be fine.

>+      if (ot == content.document.getElementById('exceptionDialogButton')) {
>+        var params = { location : content.document.location.href,

>+          window.content.location.reload();

>+      else if (ot == content.document.getElementById('getMeOutOfHereButton')) {

>+        content.location = url;

Be consistent with "window.content" vs. just "content" (prefer "content") and "content.location" vs. "content.document.location" (prefer content.location, since the document just gets it off the window anyways). Might also want to assign content.document to a local variable if it's still referred to frequently after those changes are made. 

>+  }, true);

I suppose you should probably use a bubbling listener instead.

>Index: toolkit/themes/pinstripe/global/netError.css
>Index: toolkit/themes/winstripe/global/netError.css

It occurs to me that these have some hard-coded colors that might not play well with non-default native themes, but that's certainly a separate issue if it's an issue at all.

I didn't look too closely at the netError changes.
Attachment #287154 - Flags: review?(gavin.sharp) → review+
Comment on attachment 287154 [details] [diff] [review]
UI changes plus a couple nits


I was looking at the patch and noticed two tiny nits:

+        // If the user  added the exception cert, attempt to reload the page

Double space between user and added

+<!ENTITY securityOverride.getMeOutOfHereButton "Get Me Out Of Here">

On the malware and phishing error pages, the button text is "Get me out of here!". Don't know which one is better, but they should probably be consistent :)
Comment on attachment 287154 [details] [diff] [review]
UI changes plus a couple nits


One more uber-nit (I like that word): Should the Add Exception button have an ellipsis, which I believe is an indication that it will open a dialog?

+<!ENTITY securityOverride.exceptionButtonLabel "Add Exception">

The Add Exception... button on the Servers tab of the Certificate Manager has one.
Attached patch Gavin's nits addressed (obsolete) — Splinter Review
Over to Boris to bring it home.
Attachment #287154 - Attachment is obsolete: true
Attachment #287187 - Flags: superreview?(bzbarsky)
Attachment #287187 - Flags: review?(bzbarsky)
Attachment #287154 - Flags: superreview?(bzbarsky)
Attachment #287154 - Flags: review?(bzbarsky)
Whiteboard: [has patch][need r+sr bz]
Comment on attachment 287187 [details] [diff] [review]
Gavin's nits addressed

>Index: browser/base/content/browser.js
>+      var contentDoc = content.document;

This is wrong.  Try an SSL error page in a subframe, for example.  We should add both that and a toplevel error page to tests.

I think you want |var contentDoc = ot.ownerDocument;| instead.

I would tend to put a function of this size on its own in the file (with a name) so as to not make delayedStartup quite as long and confusing, but if it's OK with gavin it's OK with me.

>Index: docshell/resources/content/netError.xhtml
>+        <!-- Override section - For ssl errors only.  Removed on init for other
>+             error types.  -->
>+        <div id="securityOverrideDiv">
>+          <br/>

Why is that <br> there?  If you need a top margin on the securityOverrideDiv, just give it a top margin in the CSS.  No need for a <br>.

>Index: toolkit/themes/pinstripe/global/netError.css

We need to get a bug filed on the color issues here.  We should fix that for 1.9; as things stand it's an accessibility nightmare.

>Index: toolkit/themes/winstripe/global/netError.css

Same here.

r+sr=bzbarsky with the above issues addressed.
Attachment #287187 - Flags: superreview?(bzbarsky)
Attachment #287187 - Flags: superreview+
Attachment #287187 - Flags: review?(bzbarsky)
Attachment #287187 - Flags: review+
Whiteboard: [has patch][need r+sr bz] → [has patch][need final patch for landing]
Blocks: 402356
Flags: in-testsuite?
Flags: in-litmus?
Attached patch bz's review comments addressed (obsolete) — Splinter Review
(In reply to comment #58)
> (From update of attachment 287187 [details] [diff] [review])
> >Index: browser/base/content/browser.js
> >+      var contentDoc = content.document;
> 
> This is wrong.  Try an SSL error page in a subframe, for example.  We should
> add both that and a toplevel error page to tests.
> 
> I think you want |var contentDoc = ot.ownerDocument;| instead.

You're right, I do.  Changed.  As for tests, my first impulse here was to flag this for Litmus, but Gavin suggested it might work as a browser chrome test too.  I'll flag it both in-litmus? and in-testsuite? and get QA to chime in on testing approach.

> I would tend to put a function of this size on its own in the file (with a
> name) so as to not make delayedStartup quite as long and confusing, but if it's
> OK with gavin it's OK with me.

Gavin and I discussed this earlier when the listener code was smaller, and decided that a four-liner didn't really need its own function.  Since then, though, it's grown substantially, so I've extracted it out into BrowserOnCommand.

> >Index: docshell/resources/content/netError.xhtml
> >+          <br/>
> 
> Why is that <br> there?  If you need a top margin on the securityOverrideDiv,
> just give it a top margin in the CSS.  No need for a <br>.

Yep, you're right.  Padded.

> >Index: toolkit/themes/pinstripe/global/netError.css
> 
> We need to get a bug filed on the color issues here.  We should fix that for
> 1.9; as things stand it's an accessibility nightmare.

Filed bug 402356 on this (and some other obvious cases of the same thing that leapt to mind - error console and notification-bar colours).
Attachment #287187 - Attachment is obsolete: true
Keywords: checkin-needed
Whiteboard: [has patch][need final patch for landing] → [has patch]
Attached patch Rev string IDsSplinter Review
Attachment #287226 - Attachment is obsolete: true
Checking in browser/base/content/browser.js;
/cvsroot/mozilla/browser/base/content/browser.js,v  <--  browser.js
new revision: 1.879; previous revision: 1.878
done
Checking in browser/locales/en-US/chrome/overrides/netError.dtd;
/cvsroot/mozilla/browser/locales/en-US/chrome/overrides/netError.dtd,v  <--  netError.dtd
new revision: 1.12; previous revision: 1.11
done
Checking in camino/embed-replacements/locale/en-US/global/netError.dtd;
/cvsroot/mozilla/camino/embed-replacements/locale/en-US/global/netError.dtd,v  <--  netError.dtd
new revision: 1.8; previous revision: 1.7
done
Checking in docshell/resources/content/netError.xhtml;
/cvsroot/mozilla/docshell/resources/content/netError.xhtml,v  <--  netError.xhtml
new revision: 1.25; previous revision: 1.24
done
Checking in dom/locales/en-US/chrome/netError.dtd;
/cvsroot/mozilla/dom/locales/en-US/chrome/netError.dtd,v  <--  netError.dtd
new revision: 1.12; previous revision: 1.11
done
Checking in security/manager/locales/en-US/chrome/pipnss/pipnss.properties;
/cvsroot/mozilla/security/manager/locales/en-US/chrome/pipnss/pipnss.properties,v  <--  pipnss.properties
new revision: 1.30; previous revision: 1.29
done
Checking in security/manager/pki/resources/content/exceptionDialog.js;
/cvsroot/mozilla/security/manager/pki/resources/content/exceptionDialog.js,v  <--  exceptionDialog.js
new revision: 1.5; previous revision: 1.4
done
Checking in security/manager/ssl/src/nsNSSIOLayer.cpp;
/cvsroot/mozilla/security/manager/ssl/src/nsNSSIOLayer.cpp,v  <--  nsNSSIOLayer.cpp
new revision: 1.140; previous revision: 1.139
done
Checking in toolkit/themes/pinstripe/global/netError.css;
/cvsroot/mozilla/toolkit/themes/pinstripe/global/netError.css,v  <--  netError.css
new revision: 1.7; previous revision: 1.6
done
Checking in toolkit/themes/winstripe/global/netError.css;
/cvsroot/mozilla/toolkit/themes/winstripe/global/netError.css,v  <--  netError.css
new revision: 1.7; previous revision: 1.6
done
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [has patch] → [has patch][has reviews]
(In reply to comment #56)
> One more uber-nit (I like that word): Should the Add Exception button have an
> ellipsis, which I believe is an indication that it will open a dialog?
> 
> +<!ENTITY securityOverride.exceptionButtonLabel "Add Exception">

Filed bug 402369 on this and a couple of other string issues I noticed.
By the way, has someone filed a bug on seamonkey about updating its themes to handle the core changes made to netError.css here?
Blocks: 402420
I don't think it's filed, but it's only one theme anyways (Modern), as the default theme is using *stripe for the toolkit stuff.
See bug 402420, which has a patch.
(In reply to comment #65)
>See bug 402420, which has a patch.
Only because bz CCd me on this bug ;-)
Depends on: 402479
What I don't understand is why Firefox/Minefield users are presented two buttons (getMeOutOfHereButton, exceptionButtonLabel) while SeaMonkey users only get an extra line of text (warningText). Is this intended behavior, overlooked or will it be addressed in a followup bug?
Because this was a Firefox bug, more or less (though filed in Core).  You want to file a separate bug on Seamonkey....
As Bo(In reply to comment #68)
> What I don't understand is why Firefox/Minefield users are presented two
> buttons (getMeOutOfHereButton, exceptionButtonLabel) while SeaMonkey users only
> get an extra line of text (warningText). Is this intended behavior, overlooked
> or will it be addressed in a followup bug?

As Boris says, the changes made here represent the direction we want Firefox to go but since it's obviously (see bug 327181 for more debate) a heated subject, I would not presume to set SeaMonkey's policy here, nor am I familiar with their code.

SeaMonkey may well want to do the exact same thing as us, or something entirely different.  Likewise for Camino, or other browsers built on our platform.  The patch on this bug is pretty small though, I hope that if other browsers want to do something similar, it will be pretty straightforward for them to do so.
SeaMonkey uses a straight dom/netError.dtd and even coming up to equal functionality as Firefox suddenly seems to need us to override as complete file here with identical stuff and make all localizers localize the same strings a third time or so. This sucks majorily.
(In reply to comment #71)
> SeaMonkey uses a straight dom/netError.dtd and even coming up to equal
> functionality as Firefox suddenly seems to need us to override as complete file
> here with identical stuff and make all localizers localize the same strings a
> third time or so. This sucks majorily.

I agree, but so does adding the button into netError for everyone, when many won't have it wired up to do anything.  If there is a way to get the same flexibility of function while providing less pain for gecko implementers, I'd want to know.
What would be neat is if we could hook the notificationbox up to this, but that would need a reliable way of detecting the showing of an SSL certificate error.
Can a page reference multiple DTDs?  e.g. can DTDs include other DTDs or something?

I agree that the DTD setup here is suboptimal (and always has been); I'm just not sure there's a better way to do it.  :(
verified fixed using  Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9a9pre) Gecko/2007110605 Minefield/3.0a9pre also on Linux and Mac 10.4

Adding Exception and "Get me out of here" is working fine on the sites i tested this -> Verified

Also added Litmus Testcases for this new feature.
Status: RESOLVED → VERIFIED
Flags: in-litmus? → in-litmus+
Depends on: https-error-pages
(In reply to comment #74)
> Can a page reference multiple DTDs?  e.g. can DTDs include other DTDs or
> something?

I think they can, but we still would need to do that from the dom/ variant, and that's bad. What Firefox and Camino do is to override the whole netError.dtd file, which is quite suboptimal as well. It would be so nice if we had a solution that would make this easier and doesn't need everyone to ship a complete override of that file.
Well, we could have two core DTDs.  One that contains stuff that's not likely to need overriding, and one that contains the rest.  The former includes the latter.  Apps override the latter, for the most part.
We might want to take this discussion to a new bug and spare all the people who were mostly interested in getting their Add Exception button, but nevertheless.  I think having two files would be a workable solution here, and would likely lighten the translation load on other projects, since they could pick up core translations for most of it, although note:

http://mxr.mozilla.org/mozilla/source/camino/embed-replacements/locale/en-US/global/netError.dtd

Camino overrides basically every string to replace "the browser" with "Camino."  We could fix that if "overrideThis.dtd" had an entity like &brandShortName; which "leaveThisOneAlone.dtd" pulled in and used.

All of this is predicated on there being no particularly better way to add and remove UI from the generic error page.  Maybe we could dynamically source in some js from an app-defined location </handwaving>?  Enh, maybe the actual changes being made are still lightweight enough that DTDs get the job done and we shouldn't over-architect.
bz, jonathan:
That sounds like a good way to go for me. We should try to reduce the amount of stuff we override and duplicate in our products, and either solution in the last post sounds like a way we can go for it.
Who files the bug?
(In reply to comment #39)
> doesn't hurt to be declarative about expectations.  I also added a version of
> Kai's expanded sanity check for window.arguments[0].location - I don't think
> the 2nd of the 3 checks he suggests is necessary (location in
> window.arguments[0]) but he's absolutely right that I should confirm that
> window.arguments[0] is defined before checking a member variable.

By the by, going through Tools->Options->Advanced->Encryption->View Certs->Servers->Add Exception throws window.arguments is undefined onload and on Confirming the exception.
(In reply to comment #81)
> By the by, going through Tools->Options->Advanced->Encryption->View
> Certs->Servers->Add Exception throws window.arguments is undefined onload and
> on Confirming the exception.

Doh! Nevermind - bug 404428 takes care of this.
Blocks: 405289
Blocks: 405514
Blocks: 405862
Depends on: 406655
Blocks: 407369
The decision to permanently record certificate security exceptions has had 
one very desired effect: it had greatly reduced the frequency with which 
users experience cert errors and with which they create exceptions, thereby
reducing the habitual nature of the act.  

But it has also had an unintended consequence.  It had brought to light the
fact that (some) Firefox users encounter FAR MORE invalid certificates with 
duplicate issuer names and serial numbers than we ever before imagined.  

Prior to FF3 making exceptions permanent by default, the problem of duplicate issuer names and serial numbers was thought to be relatively rare.  There were two or three will-known bogus cert generator programs that created them, mostly for web hosting services, and the word got 
out to use different software for cert generation.  

When FF3 began to make these exceptions permanent, we discovered that several different vendors of cheap NAT routers for home use provide software to generate self-signed certs, software that ALWAYS uses the same issuer name and same serial number.  FF3 users who encounter multiple of those routers now run into reused_issuer_and_serial errors, where they did not before with FF2.

The users wonder: why does FF3 report duplicate issuer and serial when FF2 
did not?  The answer is: because FF3 is "permanently" remembering the cert 
exceptions by default, but FF2 did not, so FF3 is discovering duplicates 
where FF2 did not.  The duplicates were always there.  They were merely
undiscovered.  

What can we do about this?

Well, we COULD revisit the decision to make exceptions permanent by default,
but I think that's the wrong solution.  

Here's another idea.  It involves changing the way we handle exception 
creation for duplicate issuer names and serial numbers.

When the user encounters a server cert with the same issuer name and serial number as a cert already stored as a FF3-style exception, (and ONLY for
that case!) it offers to create a new exception.  It displays a warning
telling the user that certificate for this server has changed, and that
this is a VERY serious security error (violates the fundamental tenet of KCM).  (This is EXACTLY the same situation for which SSH balks at allowing the SSH client to connect to the SSH server with dire warnings, and we should be no less dramatic in this situation than is SSH.) Then if the user chooses to proceed, the act of creating this new exception would:
a) delete the previously stored exception, and
b) delete the cert from the cert DB, and 
c) create a new TEMPORARY exception

Related bugs: 
Bug 312732 - this is probably where the main discussion should occur
Bug 399910
Bug 399914 
No longer blocks: 407369
Depends on: 407369
In reply to comment 84, a related idea, much simpler to implement:
"store the exception rule (based on hostname, port and hash), but do not add the server cert to the permanent cert storage"

This would have the side effect that users can not view the full cert when they go to cert manager and look at their stored exceptions, but they'll only know for which hosts/ports an exception exists.
Depends on: 450254
Product: Core → Core Graveyard

Is comment 86 spam? I don't see those URLs referenced in the mentioned comment 2.
Tom, can you please clarify your intention to comment on this very old and resolved issue?

Flags: needinfo?(vseerror)
Flags: needinfo?(vseerror)
Flags: needinfo?(management)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: