Closed Bug 662636 Opened 13 years ago Closed 12 years ago

Cleaning up the LDAP structure into something slightly saner

Categories

(Cloud Services :: Server: Core, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: telliott, Assigned: telliott)

References

Details

(Whiteboard: [qa-])

Attachments

(1 file)

There's a bunch of cruft in our LDAP accounts. In the long run that's going to hurt us in terms of storage space and processing delay.

This bug is to campaign for 2 (related) changes.

1) Delete backupNode: This was for when we thought we'd need redundancy, and has never been used. We just write default values to it. It can be dropped without any consequence at all.

2) Delete primaryNode and create a new entry called syncNode. primaryNode is structured as a dictionary of strings prefixed by product (i.e. weave:weavenode2". For *every single* sync call, we grab the entire hash, iterate across it, and do a substring match. That's terrible and not for any good reason.

syncNode would be a single string returned by the application, and be much more efficient. In the event that we need to create a new product that has a node, we create a new field and use that. It'll be more efficient in terms of processing and not much worse in terms of storage. It also means there's much less danger that an application writing its new node in tromps over the old values (which would be pretty scary, possibly)

It also means that we don't have to do silly hacks in the sql implementation which, while not a priority, is at least a bit of a win.

Thoughts? Objections?
So I got the full structure from Pete, and it's... goofy

objectclass ( 1.3.6.1.4.1.13769.1.2.11
   NAME 'dataStore'
   DESC 'Object pointing to users data'
   SUP top
   AUXILIARY
       MUST ( account-enabled $ mail-verified )
       MAY  ( primaryNode $  rescueNode $ secondaryNode $ path $ accountStatus $ statusMessage $ uidNumber ) )

account-enabled: we need this, but it should be an int, not a string (currently set to Yes or No!)

mail-verified: a good idea for the future and should be kept, but again, can just be an int or Bool. We (will soon) have a facility for doing codes, so storing one in the ldap is just inefficient

primaryNode: becomes syncNode as described above

rescueNode, secondaryNode, path: all completely unused and can be dropped

accountStatus, statusMessage: also unused. We had thoughts to possibly put user messages in here, but that hasn't gone anywhere. I think we can kill them for now and if we come up with a reason to have them, put them back

uidNumber: obviously essential


It would also be nice to get some naming consistency in here, which has the side effect of allowing us to do this in backwards compatible fashion - we push out the adds, get the code migrated, get all the data migrated, then drop the old fields. so, I would propose:

MUST: uidNumber, status
MAY: mailVerified, syncNode

all are ints except syncNode, which is a string.
I agree with most changes, they sound wise. 

So:

 MUST: uidNumber, status, accountEnabled
 MAY: mailVerified, syncNode

> For *every single* sync call, we grab the entire hash, iterate across it, and > do a substring match. That's terrible and not for any good reason.

you mean for every call on node/weave right ? other sync calls are not doing this since the node is known when we call the storage server.

The LDAP call done when you call a sync API is basically a query to get the userId and the account-enabled flag.

I agree that having a singled-value is an improvement, but the benefit is just on the one node/weave call.
All sync calls should be doing it - at least in the php version, we parse out the node assigned and compare it to the host header:

http://hg.mozilla.org/services/sync-server/file/4df296e0880e/1.1/weave_user/mozilla.php#l107

If we don't do this, a user can claim to be on any random node and bad things happen as a result.
also, why do we need a status and an accountEnabled? Seems to me like we need a single int there.
Assignee: nobody → petef
Attaching the intermediate schema (for backwards-compatibility with the old data). Basically, this just defines the new attributes and adds them to the "MAY" section of the dataStore class.
Assignee: petef → telliott
(In reply to comment #3)
> If we don't do this, a user can claim to be on any random node and bad
> things happen as a result.

Good point. This needs to be added in server-storage see bug 662859 

> also, why do we need a status and an accountEnabled? Seems to me like we need a single int there.

yeah I misread status vs accountStatus
Depends on: 663911
Transitional steps:

1) move to intermediate scheme defined by pete. 

2) start writing to syncNode (Bug 663911)'

3) Copy primaryNode:weave to syncNode for all users

4) start authing against syncNode instead of primaryNode
Instead of adding "status", let's just re-use the existing "accountStatus".  "status" is too generic and is already in use in the mozilla schema (for some corp thing; right now we share the same schema OID space).
Are there backwards compatibility issues for moving it from a string to an int? Will LDAP suddenly start complaining if we make them all 1?
the existing accountStatus is an integer (I believe you're thinking of the existing statusMessage string).
OK, good. That makes total sense, then.
We made these changes and things are happy... until we need to support BID sync and appsync in there for a while. But that's for other bugs.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: