1949755 - WISeKey: S/MIME certificate issuance without proper validation

Assignee

Description

10 months ago

Steps to reproduce:

Today at 12:23 UTC we received a notification about a potential issue in our free S/MIME certificate service that could lead to issuance without proper mailbox verification. The notification was not sent through our Certificate Problem Report (CPR) mechanism.
Our team is verifying the problem and, as a precaution, we have disabled new issuance until the matter is clarified.
I'm opening this as a placeholder and will publish an incident report once we have more information.

Ben Wilson

Updated

10 months ago

Assignee: nobody → pfuentes

Status: UNCONFIRMED → ASSIGNED

Type: defect → task

Ever confirmed: true

Whiteboard: [ca-compliance] [smime-misissuance]

Pedro Fuentes

Assignee

Comment 1

10 months ago

We identified and fixed the problem and revoked the four S/MIME certificates affected by the incident.
We will publish the full report early next week.

Pedro Fuentes

Assignee

Comment 2

10 months ago

Incident Report

Summary

On 2025-02-21, WISeKey received an email reporting that an anonymous researcher had found a problem in our web service for Free S/MIME certificates that could be exploited to issue a MAILBOX-VALIDATED certificate for a mailbox other than the one that had been validated. This communication didn't arrive through our CPR mechanism, but it was processed in the same way.

In response to the report, WISeKey immediately stopped the service and confirmed the problem. It was traced to a flaw in the server-side re-validation performed prior to issuance: a bug introduced by a recent update to the code for MAILBOX-VALIDATED Free S/MIME certificates.

The same day, WISeKey deployed a patch to correct the problem. Additional checks and preventive actions were carried out to ensure that it was safe to re-enable issuance.

Impact

WISeKey ran a verification check across the whole certificate corpus to identify affected certificates and found four, which were revoked immediately. Those certificates had been issued by the same researcher while confirming the problem in our service.
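
A corpus check of this kind can be sketched roughly as follows. This is an illustration only, not WISeKey's actual tooling. It assumes each issuance can be joined with its validation log entry, and the `IssuanceRecord` type, its field names, and `find_mismatched` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class IssuanceRecord:
    # Hypothetical record joining an issued certificate with its validation log entry.
    serial: str
    certificate_email: str  # mailbox placed in the certificate
    validated_email: str    # mailbox that actually passed the validation challenge


def find_mismatched(records: Iterable[IssuanceRecord]) -> List[str]:
    """Return serials of certificates whose mailbox differs from the validated one.

    The comparison is exact (after trimming whitespace), so even a
    case-only difference is flagged for manual review.
    """
    return [
        r.serial
        for r in records
        if r.certificate_email.strip() != r.validated_email.strip()
    ]
```

Any serial returned by such a scan would be a candidate for revocation once the mismatch is confirmed against the logs.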

Timeline

All times in CET (GMT+1)

2025-01-10 A new version of the Free Certificate Request Form was deployed in production. The usual initial deployment to the preview environment and the code reviews were carried out but did not detect the problem.
2025-02-21 13:23 We received a notification at our Privacy contact mailbox reporting a potential vulnerability in our Free S/MIME service that allowed certificates to be generated for an address different from the one that had been validated.
2025-02-21 13:57 We acknowledged receipt of the report to the sender and started the investigation. As a precaution, we disabled the issuance of new certificates.
2025-02-21 14:00 - 16:00 The development team reproduced the issue and deployed a patch to fix it.
2025-02-21 17:14 The reporter was informed about the resolution of the issue and the revocation of the affected certificates.
2025-02-21 16:00 - 19:00 We checked the certificate corpus and logs to identify the affected issuances, finding four certificates. All misissued certificates were revoked, and the issuance service was resumed once we had verified that the patch was working as expected.

Root Cause Analysis

Below are the results of the “4 whys” exercise used to identify the root cause. We follow a methodology used by HARICA in recent similar incidents, which we found useful to adopt as a reference:

1. Why did we receive a problem report?
Because mailbox-validated certificates could be issued without proper validation of the email address.
2. Why could mailbox-validated certificates be issued without proper validation of the email address?
Because there was a flaw in the S/MIME workflow.
3. Why was there a flaw in the S/MIME workflow?
Because the new code introduced a bug: it failed to re-validate the information to be included in the certificate when that information had been manipulated by an attacker during the process.
4. Why was the code with this bug deployed in production without being detected?
Because our process for code review and testing was not followed diligently. This showed that we need to set additional controls and strengthen our process to prevent this situation from recurring.

Lessons Learned

What went well

The incident management process was followed, even though the notification didn't come through our official CPR method
WISeKey responded quickly and fixed the bug
The affected certificates were quickly identified and revoked on time

What didn't go well

Code review didn’t catch the bug
QA didn’t catch the bug before deployment to production

Where we got lucky

The bug affected only four mailbox-validated certificates
The bug was detected by a researcher and not by a malicious attacker, which minimized the damage and meant that no real subscribers were affected

Action Items

Action Item	Kind	Due Date
Patch code to fix bug	Correct	Already implemented
Peer review of code related to certificate issuance, to ensure consistency and detect possible new flaws	Prevent	Already executed
Review our development workflow to ensure that peer reviews are systematically performed for all changes and not only for major updates	Prevent	Already executed

We don't foresee other actions related directly to this incident.

Appendix

Details of affected certificates

Serial, SHA-256 fingerprint

1FE7A31AC0CE3FC272CFAC3D6C9B0E2D534F6BAA, 2D6DC34341511A5F0B48EE018A8795FB86B5CFE66F49719758F1173AB8BF8E6F
4F3DD94C5750D3B5C400C374462FC4EA53DF2205, 1EBA771C23C93B5A373E01F09E81562578A0A439779630CC6A04564709D76176
013EC5DBD1CB958C43532531A802D456957A122B, 1F36C419BCE3100D416B76FEF54E95C17CA301AD293000B7C5900529052F3606
39494550266B14925C7AAC50C24907C6EC3E19EC, 85E42DE4AA993F3F9A028F495E5C570B681F5017C0557CAED740C4BCE160D8EC

Based on Incident Reporting Template v. 2.0

Martijn Katerbarg

Comment 3

10 months ago

While the RCA uses 4 whys to identify the root cause as insufficient code review and testing, it doesn’t tell me what exactly the issue was, which makes it hard for other CAs to learn from this incident and investigate if they might have a similar issue.

I understand that additional testing might have discovered this bug. That’s usually the case with bugs: they make it into software and, had just a little more testing been done, might have been discovered. Then again, another x days of testing might also not have led to discovery. That’s something we’ll never know for sure.

Could you share some more technical details around the actual bug itself?

What was the actual vector of attack required to make this issuance possible?

Pedro Fuentes

Assignee

Comment 4

10 months ago

(In reply to Martijn Katerbarg from comment #3)

While the RCA uses 4 whys to identify the root cause as insufficient code review and testing, it doesn’t tell me what exactly the issue was, which makes it hard for other CAs to learn from this incident and investigate if they might have a similar issue.

Hi Martijn,
thanks for your comment!

I would say that either I didn't explain myself properly, or there are different interpretations of the purpose of the RCA. Most likely it is the former ;)

My understanding is that the purpose of the RCA is not to describe the issue (which is done in the Summary section of the report and in point #3 of the RCA), but to identify its cause (so the "why", and not the "what"). This interpretation would be in line with some recent comments in other issues such as this one.

In fact, for us the RC here is not "lack of testing" or "insufficient testing" (which did indeed happen); rather, we identified the RC as a flaw in our testing methodology and code acceptance criteria. In other words, the "lack of testing" was a consequence, not a cause, and this is what we expect to improve with the actions we put in place.

Martijn Katerbarg

Comment 5

9 months ago

I would say that either I didn't explain myself properly, or there are different interpretations of the purpose of the RCA. Most likely it is the former ;)

My understanding is that the purpose of the RCA is not to describe the issue (which is done in the Summary section of the report and in point #3 of the RCA), but to identify its cause (so the "why", and not the "what"). This interpretation would be in line with some recent comments in other issues such as this one.

In fact, for us the RC here is not "lack of testing" or "insufficient testing" (which did indeed happen); rather, we identified the RC as a flaw in our testing methodology and code acceptance criteria. In other words, the "lack of testing" was a consequence, not a cause, and this is what we expect to improve with the actions we put in place.

The Summary section should contain a short description of the nature of the issue, which I think has been done.

The RCA must contain a detailed analysis of the conditions which combined to give rise to the issue.

From your current RCA, the analysis seems to stop at the completed testing and code review, and it makes me wonder if a different, or say additional why should be asked:

Why was an attacker (or was it researcher, as I believe there’s a big difference, seeing how someone reported this), able to manipulate the process?

In this answer, I would like to understand more about the nature of the code-bug / workflow.

The purpose of incident reporting is to help us work together to build a more secure web. Therefore, the incident report should share lessons learned that could be helpful to all CA Owners in building better systems. If other CAs want to learn from this and prevent the same thing from happening, I’m not sure tighter code review should be the lesson learned here, as that could, theoretically, be the answer to many cases of CA misissuance. Rather, I believe sharing the details of the actual bug would help CAs analyze their own code for similar workflows that might be affected by a similar bug.

Pedro Fuentes

Assignee

Comment 6

9 months ago

(In reply to Martijn Katerbarg from comment #5)

From your current RCA, the analysis seems to stop at the completed testing and code review, and it makes me wonder if a different, or say additional why should be asked:

Why was an attacker (or was it researcher, as I believe there’s a big difference, seeing how someone reported this), able to manipulate the process?

Hi Martijn,

Your question is a reformulation of question no. 2 in the RCA, and the answer would be the same: "Because there was a flaw in the S/MIME workflow."

The RCA is still valid and would lead to the same conclusion: the root cause was that "our process for code review and testing was not followed diligently".

Now, I understand that your inquiry is not really about the root cause but about the nature of the flaw itself, and I have no problem satisfying your curiosity and giving you more details.

In this particular case, the web page presented the user with a form containing the values to be inserted in the certificate, which were read-only and informative, alongside fields (such as the email address for notifications) that could be changed before sending the request. The flaw in the server-side processing allowed an attacker who had modified the DOM content of the web page in the browser to send data that was not re-validated. This bug was introduced by a recent update. Exploiting the problem was in fact quite simple and didn't require advanced hacking skills. As disclosed in this incident report (under "Where we got lucky"), the problem was discovered early by a researcher, and the only misissuances were those made while probing the problem.
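
For other CAs reviewing similar workflows, the general defence against this class of flaw can be sketched as below. This is a minimal illustration in Python, not WISeKey's actual code. It assumes the server keeps the validated mailbox in server-side session state, and `ValidatedSession`, `build_certificate_request`, and the form field names are all hypothetical.

```python
from dataclasses import dataclass


class ValidationError(Exception):
    """Raised when submitted data does not match the server-side validated state."""


@dataclass(frozen=True)
class ValidatedSession:
    # Hypothetical server-side record, created when the mailbox challenge succeeded.
    validated_email: str


def build_certificate_request(session: ValidatedSession, form: dict) -> dict:
    """Build the issuance request from server-side state, not from the form.

    Read-only fields in an HTML form can be altered in the browser, so the
    value echoed back by the client is treated only as a tamper signal.
    """
    submitted = form.get("certificate_email")
    if submitted is not None and submitted.strip() != session.validated_email:
        raise ValidationError("submitted mailbox differs from the validated mailbox")

    return {
        # The certificate value always comes from the server-side record.
        "rfc822Name": session.validated_email,
        # Genuinely editable fields may still be taken from the form.
        "notification_email": form.get("notification_email", session.validated_email),
    }
```

The design choice here is that certificate contents are never copied from the request body. The client-supplied copy is only compared against the stored value, so a tampered form is rejected rather than silently honoured.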

As you say, "The purpose of incident reporting is to help us work together to build a more secure web", and for this purpose the objective of the RCA is to understand the reasons (the "Why", and not the "What"). What we think is relevant here is the lesson learned: peer code review is a powerful tool for avoiding bugs that can lead to vulnerabilities. Our process already required it for more significant changes and new functionality, and the "Action Items" address the adoption of such practices as a daily routine, even for small changes.

Pedro Fuentes

Assignee

Comment 7

9 months ago

All actions have been taken, and we propose writing a closure report in one week.
Thanks.

Pedro Fuentes

Assignee

Comment 8

9 months ago

We will then write the closure report in the coming days.

Pedro Fuentes

Assignee

Comment 9

8 months ago

Report Closure Summary

Incident description: On January 10, 2025, a new version of the Free Certificate Request Form was deployed in production. This new version included a bug that broke the server-side re-validation, which allowed a potential attacker to obtain certificates for a mailbox different from the one validated for their account. The problem was detected by an independent researcher, who reported the issue (not using our CPR mechanism). We checked the whole corpus of certificates and found four affected certificates, which were immediately revoked. We also confirmed that the bug was not exploited beyond the tests done to confirm the problem.
Incident Root Cause(s): Our process for code review and testing was not followed diligently, and while processing this incident we determined that it had to be reinforced to prevent such situations from happening again.
Remediation description: We reviewed our development and testing workflow to ensure that peer reviews are systematically performed for all changes and not only for major updates. This is expected not only to help detect problems in the code, but also to support a process that improves performance and code quality.
Commitment summary: WISeKey is committed to maintaining the highest security standards and ensuring compliance with industry best practices. We have already implemented a first set of changes in our development workflow, and we will periodically review it to ensure that our process is aligned with best practices.

All Action Items disclosed in this report have been completed as described, and we request its closure.

Ben Wilson

Updated

8 months ago

Flags: needinfo?(bwilson)

Ben Wilson

Comment 10

8 months ago

I will close this on or about Friday, 11-Apr-2025, unless there are issues or questions to discuss.

Ben Wilson

Updated

8 months ago

Status: ASSIGNED → RESOLVED

Closed: 8 months ago

Flags: needinfo?(bwilson)

Resolution: --- → FIXED