Closed Bug 622889 Opened 14 years ago Closed 14 years ago

Channel event log

Categories

(Cloud Services :: Server: Key Exchange, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: Atoll, Assigned: tarek)

References

Details

(Whiteboard: [qa+])

Attachments

(1 file)

Bug 622685 requires access to certain internal data within the JPAKE server that we can't easily (or in some cases at all) parse from the http access logs with certainty.

So I'd like to request a new log file from the JPAKE server.  General structure would be one line per event with the timestamp, channel, event, and optional useragent fields.

The events we're interested so far are:

* Channel created by client A [ua, ip]
* Channel connected by client B [ua, ip]
* Error reported from client A [ua, ip]
* Error reported from client B [ua, ip]
* Credentials transmitted successfully
* Channel terminated by server (timed out)
* Channel terminated by server (other reasons)

All of the above listed events are associated with a specific JPAKE channel (eg. "a6z3" or similar), which is key for metrics analysis of session results.

I noted above with [ua, ip] where we should additionally log the useragent and IP address associated with the event.

I suggest the following log format, which adapts easily to optional fields now and new fields (required or optional) in the future.  Timestamp is time.time(), floating point okay.

ts="1234567890" channel="a6z3" event="..." ip="..." useragent="..."
This data also satisfies a forensics logging requirement.
This level of details seems a bit overkill - a few remarks:

* ''Channel connected by client B'' what this is needed for ? what do you mean by "connected" - ISTM that the goal of the log is to determine if a Pake session was successful or not. So it seems overkill to log this if ''connected'' means any request by the client.

* ''Channel terminated by server (timed out)'', since the time out is handled by memcached and I don't have any event triggered when it happens, it would require a much more complex infra. Notice that it would be feasible with Redis since it has a pubsub API.

Overall I think we can strip the logs to (with your format):

* A: Channel created by client A [ua, ip]
* B: Credentials transmitted successfully [ua, ipa, ub, ipb]
* C: Channel terminated [ua, ipa, ub, ipb, reason]
  With reason : the log reported by any client or the server 

Because when a client report an error, the direct action is to terminate a channel.

For channels timeout I don't have any other solution than doing some maths :

 A - (B + C) = channels that have timed out
Logging when client B connects permits us to tell whether or not client B *ever* successfully connected, as opposed to when they connect at first and then timeout.  It also gives us an explicit forensic log of both [ua, ipa] and [ub, ipb] in one place.

Agreed that timeouts are impossible to log under memcache regime, maybe someday with redis :)

So, if it won't cost us performance, I would request A2:

* A: Channel created by client A [ua, ipa]
* A2: Client B connected to client A's channel [ub, ipb]
* B: Credentials transmitted successfully [ua, ipa, ub, ipb]
* C: Channel terminated [ua, ipa, ub, ipb, reason]
  With reason : the log reported by any client or the server 

With a glossary for the curious:

* ua, ub - useragent of client A and client B respectively
* ipa, ipb - ip of client A and B respectively
* reason - whatever they said to /report
After a second thought, there's another thing I can't log right now and being 100% sure: the success of the transaction.

Even if I count the number of PUT and GET I cannot know for sure that no timeout occurred and that the client really finished the transaction.

The only way to do it would be to ask for the client that receives the credentials to tell the server by doing a final call to /report for example.

Since knowing if the transaction was successful is really important, I suggest that /report gets called *everytime* a transaction is over, and that we define a specific value for X-KeyExchange-Log in case of a success.

This marker will skip the CEF logging, and issue a "Credentials transmitted successfully" log and if needed will close the channel.

Adding Stefan and Phillip for more feedback on this.
It would indeed be nice to know that the transmitted credentials were received.  I split B into two parts to reflect both sides of the transaction.  As of today, B2 isn't really possible.  If we adjust the client to always POST /report when it receives the credentials successfully, then B2 will end up merging into C, which is fine.

* A: Channel created by client A [ua, ipa]
* A2: Client B connected to client A's channel [ub, ipb]
* B1: Credentials transmitted successfully by client B [ub, ipb]
* B2: Credentials received successfully by client A [ua, ipa]
* C: Channel terminated by server [ua, ipa, ub, ipb, reason]
  With reason : the log reported by any client or the server
Stefan suggests that the process for B2..C is more like B2 "credentials fetched by client A" and then we use C to detect whether the channel was terminated because "credentials processed successfully by client A" or for a variety of other reasons.

B2 reworded to reflect that:

* A: Channel created by client A [ua, ipa]
* A2: Client B connected to client A's channel [ub, ipb]
* B1: Credentials transmitted successfully by client B [ub, ipb]
* B2: Credentials fetched successfully by client A [ua, ipa]
* C: Channel terminated by server [ua, ipa, ub, ipb, reason]
  With reason : the log reported by any client or the server
There is a concern that the current implementation is not able to keep up with detailed logging.  In review, we figured out that the nginx logs already have *nearly* everything we need for metrics and forensics.  C is already logged, and if someday it begins including "all good" reports from the client, yay, until then, not a big deal.  A2, B1, B2 can be interpreted from nginx logs as-is, without further server work.  A is *almost* doable from nginx logs, except that we need some way to log the channel that was created.

Tarek, could you add to /new_channel the following http response header (no quotes):

X-KeyExchange-Channel: abcd

And then we can add $sent_http_x_keyexchange_channel to the nginx logs.
(In reply to comment #7)
>  C is already logged, and if someday it begins including "all good" reports from the client, yay,
> until then, not a big deal

Ok. Just to make sure: C is only logged for failures at this point. 

IOW you cannot make the difference between a timed out channel and a success because the client that receives the credentials does not ack that it has successfully received and treated them.

If we want to do this, we need to make a final /report call stating that the transaction was a success. Theoretically that last call could also fail, but if the client was able to do a full transaction there are good chances that this last request will succeed. So there could be a very tiny error rate on timed-out channels vs successful transactions. And we could also run deeper log analysis to see if the user that added a new device is using it to call our auth server afterwards. But that could be overkill ;)

Stefan, Phillipp, what are you opinions on that last /report call on success ?
I'd be OK to add it on server side, I think it makes sense.

> Tarek, could you add to /new_channel the following http response header (no
> quotes):
> 
> X-KeyExchange-Channel: abcd
> 
> And then we can add $sent_http_x_keyexchange_channel to the nginx logs.
.
Will do
Once this is done, we need to update the wiki and close bug 615886
Attachment #503173 - Flags: review?(jrconlin) → review+
Done in http://hg.mozilla.org/services/server-key-exchange/rev/f3f120109bb8.

Last missing bit is about the final report/ call
Forked the disscussion about ack-ing on successful transactions in Bug 625435
Will appear in the next rpm tag. Richard will have all he needs for Metrics
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Blocks: 624791
Whiteboard: [qa+]
Cleaning up ye olde bugs...
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: