Closed Bug 290446 Opened 19 years ago Closed 16 years ago

Update of remote calendar does not use WebDAV locking (concurrency control)

Categories

(Calendar :: Provider: ICS/WebDAV, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: theo, Unassigned)

References

()

Details

(Keywords: dataloss)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; da-DK; rv:1.7.6) Gecko/20050319
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; da-DK; rv:1.7.6) Gecko/20050319

When updating/inserting a new event in a remote WebDAV calendar, the calendar
file is not locked.
In order to avoid loosing data the concurrency control of WebDAV should be used
(Locking).

Reproducible: Always

Steps to Reproduce:
1. User A starts creating a new event in the remote calendar
2. User B starts creating a new event in the remote calendar
3. Users A and B submits the new event at the same time

Actual Results:  
As there will always be a delay between the two submit requests (leading to a
Get and a Put each) they will be handled sequential.
The actual sequence could/would be:
1. User A - GET test.ics
2. User B - GET test.ics
3. User A - PUT test.ics
4. User B - PUT test.ics

In this case the new event posted by user A is lost.


Expected Results:  
When the get request is sent to WebDAV the calender file should be locked, this
way the sequence could look like this:
1. User A - GET(request lock) test.ics - Lock granted
2. User B - GET(request lock) test.ics - Denied
3. User A - PUT(release lock) test.ics - Lock released
4. User B - GET(request lock) test.ics - Lock granted
5. User B - PUT(release lock) test.ics - Lock released

This would also require that the calendar retried its GET in a second or two,
when the lock is denied.
QA Contact: gurganbl → general
Reassigning all automatically assigned bugs from Mostafa to nobody@m.o

Bugspam filter: TorontoMostafaMove
Assignee: mostafah → nobody
Component: General → Provider: ICS/Webdav
QA Contact: general → ics-provider
No critical bug, but a request for enhancement?
Whiteboard: [qa discussion needed]
There is a possibility of data loss, I'd call it a bug.
I don't see how this can be anything but a critical bug.
Whiteboard: [qa discussion needed] → [qa discussion needed][dataloss]
Confirmed this on Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a2pre) Gecko/20070122 Calendar/0.6a1 using WebDav.  Sunbird no longer silently fails in this case. Now, it throws an error stating that it could not publish the file and places the calendar into readonly mode. However, if the user reloads the calendar they will lose the event that they attempted to create, which failed to publish.  

There is no good way for the user to re-add that event to the calendar, and it is not abundantly clear that they will lose that last event they just added.

Moving to major severity because we do not silently fail anymore and the application still works for many users.  

Related to bug 327933?
Severity: critical → major
Status: UNCONFIRMED → NEW
Ever confirmed: true
Whiteboard: [qa discussion needed][dataloss] → [dataloss]
There are some related issues here with bug 287612, as well as some additional information, although that bug appears to be related to a more general issue about data loss with multiple users.

I cannot see any changes to my WebDAV lock database so it looks like Sunbird never requests a file lock.
Update: Still happens with Sunbird 0.7. This is a catastrophic bug that has been on the books for 2 and a half years.
A major (I would say critical) bug sitting for almost 3 years. For calendar/sunbird to supports WebDAV, this bug must be fixed.
Flags: blocking-calendar0.8?
This won't make 0.8.
Flags: blocking-calendar0.8? → blocking-calendar0.8-
I think this may also be related to bug 329570. I would like to know the reason why none of the developers think this is worth working on, because for many of us, Sunbird is our only long term hope for a shared calendaring system that can compete with Outlook / Exchange.

This issue should block the release of 0.8. If not, why not?
Flags: blocking-calendar0.8- → blocking-calendar0.8+
We are very short before a release candidate. I doubt we will hold the release. Please leave blocking+ to the release drivers! Simon already noted this *wont* make 0.8. The reason is primarily that we have a different focus and not enough resources to take care.
Flags: blocking-calendar0.8+ → blocking-calendar0.8-
Blocks: 329570
Flags: blocking-calendar0.8- → wanted-calendar0.9?
Flags: wanted-calendar0.9? → wanted-calendar0.9+
Flags: blocking-calendar0.9+
Please only nominate (?) bugs, not approve (+) as blocking, unless you're on the Calendar team.
Flags: blocking-calendar0.9+ → blocking-calendar0.9?
Sorry I thought I was nominating it.  As I'm not on the dev team I would have thought Bugzilla would automatically hide options to approve anything.

Thanks for the clarification
It's not clear to me how DAV locking would be superior to etag-checking as we
do now (granted that there's a bug with regard to etag-checking, but that's a
separate issue). Given that we're talking about the PUT of a single file here,
I would think that either LOCK or the etag-check would prevent lost-update
problems - and the etag-check is simpler.
In response to comment #14

I think WebDAV locking is needed because it works at server level, meaning that any client (Mozilla-Calendar or not) will not be able to modify an ICS-file locked by another client. And besides it's not complicated (as opposed to simple) - just request the lock, do the modifications and release the lock. And it the lock is denied, wait a few ms and try again.
(In reply to comment #15)
I doubt that locks (and especially locks over a longer period of time on editing) are good here. The better way to go is to keep the edited file up to date periodically, so we don't run into collisions. Thus I am in favor of the PUT with If-match etag that Bruno proposed: It's equally good w.r.t. data consistency; there's no difference to external locks, because the server could execute the PUT/check atomically.
(In reply to comment #16)
> (In reply to comment #15)
> I doubt that locks (and especially locks over a longer period of time on
> editing) are good here. The better way to go is to keep the edited file up to
> date periodically, so we don't run into collisions. Thus I am in favor of the
> PUT with If-match etag that Bruno proposed: It's equally good w.r.t. data
> consistency; there's no difference to external locks, because the server could
> execute the PUT/check atomically.

You have an incorrect assumption here. The lock is not around the entire editing session, just the update. When the user has edited an event/task and the client wants to write the updated calendar, the sequence goes:

1) LOCK
2) GET and merge differences (etags can be used for this to determine that the version hasn't changed and not actually retrieve the file)
3) PUT
4) UNLOCK

The lock is brief and there is no waiting for the user to finish an edit. It declares to other clients that may want to write that there is an update in progress that should not be interrupted. And a proper server will not allow the update to be interrupted. It's your standard concurrency lock, as opposed to the weird locking various source control systems provide.
Gregory, why doesn't the following process provide the same sort of locking?


GET (If-Not-Match <old etag>, new etag is retrieved, lets assume 12345)
PUT (If-Match 12345)
  - 412 Happens if the file has been changed in the meanwhile
    The proccess should be repeated, to make sure the write happens 
  - 200/207 if the request succeeded and the file hasn't been changed


This way, not as many requests are needed, and the same result is achieved. The lock is indeed around the entire editing session, since the only "edit" we do is to PUT the file. The retrieval is not part of the editing session.

I admit this puts more load onto the client, as opposed to the server locking, but I believe this is called optimistic concurrency. The version can only be edited if the etags match, otherwise the client has to try again.

Especially with slow clients, this will narrow the timeframe the files are locked and others cannot edit it. Of course this means the slow client might take even longer if there are a lot of concurrent edits.
That's often referred to as "optimistic locking," which only works if every client supports it in the same way. The reason for using real locks is that the server takes care of the concurrency for all clients. Remember that the world is not just Mozilla clients.

Yes, it's more requests. It's also correct behavior. Most importantly, it plays nice with other clients and optimistic locking does not.
Why does this not play together with other clients? If a different client uploads another version between the GET and PUT, with or without locking, our PUT fails with a 412 since the etag no longer matches and should then be retried with a more recent version?
Consider this sequence:

1) Moz client does a GET, finds etag 12345
2) non-Moz client does a GET, ignores etag
3) Moz client does a PUT, which succeeds since the etag matches
4) non-Moz client does a PUT, which succeeds because it ignores the etag and just assumes that the version it just got is up to date

The non-Moz client is misbehaving. It does, however, result in data loss. Now consider this sequence:

1) non-Moz client does a GET
2) Moz client does a LOCK
3) non-Moz client does a PUT, which fails because the resource is locked
...

Here the misbehaving client is prevented from overwriting changes by the Moz client, and is notified that an update is in progress. There's nothing to be done about a truly misbehaving client, but the locking mechanism exists to make it just a bit harder to lose data.
(In reply to comment #17)
> 1) LOCK
> 2) GET and merge differences (etags can be used for this to determine that the
> version hasn't changed and not actually retrieve the file)
> 3) PUT
> 4) UNLOCK
Gregory, please elaborate the difference/benefit of the above sequence (performend on the client) to passing the etag with PUT and let the server do the consistency check. I quite don't see the difference really instead that it puts more burden on the client.
(In reply to comment #21)
> 1) non-Moz client does a GET
> 2) Moz client does a LOCK
> 3) non-Moz client does a PUT, which fails because the resource is locked
> ...
This scenario is irrelevant since the period of locking, putting and unlocking is considerably small.
Keywords: dataloss
OS: Windows XP → All
Hardware: PC → All
Whiteboard: [dataloss]
Flags: blocking-calendar0.9?
Is this issue really occurring? Since we have etag checking in place, I cannot imagine how to cause dataloss. Setting wanted-1.0?, calling for opinions...
Flags: wanted-calendar0.9+ → wanted-calendar1.0?
No opinions yet. Reporter, please reopen if you think we really need locks instead of just doing etag checks.
No longer blocks: 329570
Status: NEW → RESOLVED
Closed: 16 years ago
Flags: wanted-calendar1.0?
Resolution: --- → INVALID
(In reply to comment #23 and comment #24)
I don't understand how the scenario can be irrelevant. You might argue that the calender file is only vulnerable for a short time, but **** happens.  I've seen it happen and fail (because there where no WebDav-locking) that's why I reported the issue in the first place.

I don't have the technical insight to say WebDav-locking is superior to etag, or whether etag is sufficient in this case, I hope that someone with a more in-depth technical understanding will make the right choice. I can only point out (as I did in the first section of this comment) that I've experienced the problem (in older versions, probably before etag where used by the calendar component.) and it had catastrophic consequences.
You need to log in before you can comment on or make changes to this bug.