277796 - Profiles.userid should be profiles.id

Reporter

Description

•

20 years ago

In a database full of inconsistencies, this column is in a class of its own. 1) The majority of tables with an 'id' field in the database these days now use simply that as a column name. To wit: classifications.id, components.id, flags.id, flagtypes.id, groups.id, keyworddefs.id, products.id, series_categories.id, whine_events.id, whine_queries.id, whine_schedules.id 2) The remaining tables with 'id' in a column name are: attachments.attach_id, bugs.bug_id, fielddefs.fieldid, profiles.userid, quips.quipid, series.series_id. Note that even in category two, profiles stands out as an anomaly. All others have some variation of 'table.table_id'... not so 'profiles', which uses 'userid' as a column name in complete defiance of even the minimal conventions of the rest of the schema. I propose to make a patch to change it, and to take one step at fixing up the schema to be consistent throughout. This bug is about the discussion of that patch - how it should be done, and even *whether* it should be done. First of all... should it be done. I strongly believe that it should, because I believe that it is a Good Thing for the database to be consistent throughout. It will, however, break existing customizations... and unlike the discussion on bug 155628, I don't think that there's any 'aliasing' that would work to get around that. In my opinion, however the current hodge-podge is a sign of poor guidance and reflects poorly on the tool as a whole. Secondly, if this is going to be a first step in making the database consistent, it needs to be decided what that consistency is going to *be*. I'm ambivalent on whether the overall design should include columns named foo.id or foo.foo_id, but the fact that both exist bothers me. I personally like foo.foo_id better, but not so much that I'd fight for it if the decision was otherwise. Looking at the most recent additions to the database, they are almost all of the type foo.id (except for quips, which somehow snuck past with foo.fooid), and they outnumber the foo.foo_id category by 11 to 6, so it should be less work to change everything (eventually) to foo.id than foo.foo_id. Of course we can all speak up and argue one way or the other, but at the end of the day, if we're going to go ahead with this, then there is going to need to be a Command-Level Decision on How Things Should Be from here on in, because then we won't be making any *new* mistakes. I'm not going to start this today, or even this week. I'd like to get it into 2.22, but even that may be too optomistic. Nonetheless, the time for discussion on the issue is sooner rather than later, so that when the time DOES come that it can be done and checked in, there are no more objections.

Kevin Benton

Comment 1

•

20 years ago

Per DB normalization standards that I've been taught, fields should be named dbname_fieldname. This prevents any collision of field names. So, in this case, it would be named profiles.profiles_id. Better yet, the table should be renamed to profile and then it could be called profile.profile_id.

Kevin Benton

Comment 2

•

20 years ago

Nice comments, Shane - I agree with you wholeheartedly. A tremendous amount of clarity, however, can be gained by normalizing the field names. That way, you know which ID you're comparing in the select. While a little ease in coding is gained by having "id" as the field name, a lot more clarity is gained by having to do a little more typing.

Frédéric Buclin

Comment 3

•

20 years ago

(In reply to comment #1) > case, it would be named profiles.profiles_id. Better yet, the table should be > renamed to profile and then it could be called profile.profile_id. I disagree. Table names are all plural and I think it is fine like that. Keep plural for table names and singular for columns. This renders the code even easier to read IMO (=> profiles.profile_id, bugs.bug_id, quips.quip_id, etc...).

Kevin Benton

Comment 4

•

20 years ago

(In reply to comment #3) > (In reply to comment #1) > > case, it would be named profiles.profiles_id. Better yet, the table should be > > renamed to profile and then it could be called profile.profile_id. > > I disagree. Table names are all plural and I think it is fine like that. Keep > plural for table names and singular for columns. This renders the code even > easier to read IMO (=> profiles.profile_id, bugs.bug_id, quips.quip_id, etc...). Sounds great to me as long as we agree on a standard that is implemented DB-wide.

Christopher Hicks

Comment 5

•

20 years ago

I'd like to weigh in for the camp that wants to see tables.id instead of tables.table_id. The reason for this is several: (1) If you've see a .id column you know several things about it naturally: (a) its a primary key, (b) its almost certainly autogenerated by the database, (c) its not null, etc. A .blah_id column could be a foreign key or a primary key, so it lacks the clarity of a .id column without further context such as the position in the table or the table name. (2) Unless you've aliased table names as part of the query you have "bugs.id = table.bugid" which makes it very clear that the .id is a bug ID. Even with aliases you'd have "b.id = c.bugid" which since you've defined b and c shouldn't be confusing in context. (3) Any tools for manipulating the database can assume that the .id column is a primary key. So basically, it seems to me to be a handy convention that rarely impacts clarity and in fact more clarity is provided when you can tell which column in a link is the PK and which is the FK. Regardless of which direction the consensus ultimately lies, having it be consistant is the most important thing and I applaud Sean's effort to improve the consistancy of the bugzilla database schema.

Shane H. W. Travis

Reporter

Comment 6

•

20 years ago

Explicitly adding a few more people to the cc: list whom I know think/care about this sort of thing. Yeah, I know it's like herding cats to get everyone to agree, but I'm hoping that we can generate *enough* agreement that Dave has a fairly large consensus when he decides on the direction for the future. 'Generating a standard that is implemented DB-wide' is what this bug is all about, ultimately, so if you've got input on DB schema... please feel free to add your two cents worth. Logic and justification is better than just 'I want this' ... but something is better than nothing.

Shane H. W. Travis

Reporter

Comment 7

•

20 years ago

(In reply to comment #5) > I applaud Sean's effort to improve > the consistancy of the bugzilla database schema. Oh sure, I do all the skullsweat, and some guy named Sean gets all the credit. ;-)

Christopher Hicks

Comment 8

•

20 years ago

(In reply to comment #7) > (In reply to comment #5) > > I applaud Sean's effort to improve > > the consistancy of the bugzilla database schema. > > Oh sure, I do all the skullsweat, and some guy named Sean gets all the credit. > > ;-) Its just my sort of luck that I would show my ghastly ineptitude at remembering names in the midst of discussing naming conventions. So I'll applaud Shane for being so poignant in many regards. :)

Dave Miller [:justdave]

Comment 9

•

20 years ago

This is the kind of thing that I would really love to see happen. It's also the kind of thing that I think we should keep on the backburner and get it in the queue when we decide we're ready to call it 3.0. I'm willing to make major breaking changes like this when we do a full version number bump. Such as fixing all of the primary key columns at once to be consistant all the way through isntead of having two different rules (either id or table_id). We have a number of major things coming down the pike soon that'll make it worth doing that (hack-free packaging capability for distros, multiple database support, etc). I have a feeling the cleanest way to make checksetup.pl work with multiple databases is to start from scratch and pretend it's the first ever installation when we do 3.0. We can stash the old checksetup.pl (the database upgrade part) in a subdirectory of utilities somewhere for people to use when they upgrade from a version older than 3.0 with MySQL, stick pgsetup.pl along side it for people upgrading from RH's branch. Both can update from 2.x to 3.0 and then tell them to run the new Makefile.PL or whatever we call it. :)

Shane H. W. Travis

Reporter

Comment 10

•

20 years ago

(In reply to comment #9) > This is the kind of thing that I would really love to see happen. Good to hear. > It's also the kind of thing that > I think we should keep on the backburner and get it in the > queue when we decide we're ready to call it 3.0. Not so good to hear, even though I feel some sympathy for your position and can mostly understand your logic. I feel that the problems with this approach, however, are as follows; #1: Delay. I'm the one bringing this issue up because I'm willing to work on it. I am certain that I'll still be interested in improving Bugzilla tomorrow, and that I'll have time to do so. I'm fairly certain that I'll have the time and interest to do so when the 2.22 branch opens. the further this gets from 'today', however, the hazier this picture gets. Perhaps if I'm not around/willing/able to do it then, someone else will pop up with the same enthusiasm... but perhaps not. #2: Size. One cannot swallow a watermelon whole; it is necessary to take bites. Even in my short time here, I've seen good ideas snuffed out because they didn't do EVERYTHING that needed doing, which is what you seem to be proposing; we can't fix *this* table now because it doesn't fix ALL tables now. It seems to me that fixing just this one table and doing it right is going to be a big enough change already; requiring that whoever do it also take on the burden of fixing every other table before they can proceed seems to be destined to kill the effort... again. #3: Paradox. You say that you're willing to allow sweeping, localization- breaking changes when we uprev to 3.0, but (I'm guessing) there's no reason to uprev to 3.0 unless we have some sweeping, localization-breaking changes that need to go in. Chicken and egg situation. If there is going to be a 3.0, then a decision needs to be made at some point that 2.xx will be *the last* 2.xx version. From there until the next release -- which will be 3.0 -- changes can be as radical and schema-breaking and localization-breaking as it needs to be to GET IT DONE RIGHT; rename the tables, rename the fields, reorder the directories, fix all the variables, etc. That decision will create a lot of work and a pressing need for a lot of attention in a LOT of areas; you've already made some mention of the size of the issue in your previous comment. I would place money down now that it cannot be done on the currently projected six-month release schedule (although I'd be pleased to be proven wrong). If that's really your stand -- that this bug cannot be resolved until 3.0 when every other table is fixed at the same time -- then please set the milestone appropriately. Regardless, I think that it's still worth the discussion on how it should be done on that day (by whoever is still around at the time) so that there is at least *one* less thing that needs to be ironed out and resolved. 3.0 is going to be busy enough that anything that can be decided ahead of time should be.

Max Kanat-Alexander

Comment 11

•

20 years ago

Tables should be named singularly. It makes a lot of logic dealing with perl-generated SQL a LOT easier (see bug 17453, where we already had this discussion, and the patch proves it). table.id is better -- table.table_id will not avoid collisions, it will just make them more confusing. We would have, for example, profile_id in multiple tables. In some places it would be a foreign key, in some places it would be a primary key. We already have a bug_id in multiple tables, I believe. (In reply to comment #9) >I have a feeling the cleanest way to make checksetup.pl work > with multiple databases is to start from scratch and pretend it's the first ever > installation when we do 3.0. Actually, I've actually been looking at it, and making checksetup cross-DB compatible won't be impossible at all. See bug 277617, the blockers on bug 248175, and the discussion starting at bug 237862 comment 55. In any case, I'd rather not throw away 5000 lines of working code. :-)

Christopher Hicks

Comment 12

•

20 years ago

(In reply to comment #9) While I respect the logic at making this a 3.0 type change I disagree. While this change will certainly need some extensive shaking out, waiting to do so in a "this is a big release" sort of cycle seems the worst time to get over the hump of that shaking out. However, given the barnacled mass that checksetup.pl has become dumping it at a 3.0 release would make a tremendous amount of sense. Until a 3.0 cycle, keep accumulating barnacles in checksetup.pl to fix the various things that need to be fixed incrementally. Until the obvious things are fixed and cleaned up its going to be hard to get down to serious discussions about what's wrong or right about the database. The only thing that I would want to be a blocker for progress on database schema cleanup is a mostly complete Perl API that could be used to easily construct a test suite on top of. With a test suite that had adequate coverage these changes could be made and vetted thoroughly before victimizing anyone to actually test it by hand. (This would also make it easy to create a statiscically adjustable random Bug creator for testing purposes.)

Max Kanat-Alexander

Comment 13

•

20 years ago

(In reply to comment #12) > However, given the barnacled mass that checksetup.pl has become dumping it at a > 3.0 release would make a tremendous amount of sense. Actually, I think that checksetup can be refactored quite nicely. But that's probably a discussion for somewhere else than this bug. :-) (But see my above comment.)

Kevin Benton

Comment 14

•

20 years ago

(In reply to comment #13) > (In reply to comment #12) > > However, given the barnacled mass that checksetup.pl has become dumping it at a > > 3.0 release would make a tremendous amount of sense. > > Actually, I think that checksetup can be refactored quite nicely. But that's > probably a discussion for somewhere else than this bug. :-) (But see my above > comment.) I wonder if Dave's idea to rewrite checksetup.pl doesn't make total sense. After all, we're starting at a vastly different rev. and a 2.xx -> 3.0 script can be provided without naming it checksetup.pl. It seems to me that 3.0 ought to have its own (new) checksetup.pl for the simple reason that admins maintaining an existing DB shouldn't need the 2.xx checks/upgrades and people putting 3.0 over top of 2.xx should know that an upgrade program is provided that should only be run once. Heck, we might even want to have checksetup.pl call the external upgrade program if a file like .bugzilla_version_id doesn't exist or if the version isn't right, or if a command line parameter is specified to force a DB check is given. The .buzilla_version_id file should have comments in it not to hand-modify the file. Also, I wonder if functions shouldn't be created for each version upgrade so that readers can see what DB changes were made when, and more importantly, to insure that the changes happen in proper order. It would also make it easier for customizers to see where to put their code. Maybe I'm just rambling, but these are just a few thoughts to help us get there... Back on to the topic at hand, even though it is very convenient to have just .id, if our foreign keys are named tablename_id, what benefit does it serve? I still believe we should name the primary key tablename_id to keep it consistent. Regardless, what ever the group decides, I'm willing to go along with.

Dave Miller [:justdave]

Comment 15

•

20 years ago

Putting it off until 3.0 doesn't mean putting it off forever. The criteria for what I want to be able to call it 3.0 is stuff that's within reach right now. Most of these changes I've been talking about have had a lot of activity on them lately, and are showing signs of actually landing soon. It would be good to take advantage of it. (That, and we've been 2.x for almost 7 years, which is a long time for a software project that gains features as fast as we do.

Olav Vitters

Updated

•

19 years ago

QA Contact: mattyt-bugzilla → default-qa

Frédéric Buclin

Updated

•

15 years ago

Assignee: shane.h.w.travis → general

Max Kanat-Alexander

Updated

•

15 years ago

Severity: normal → enhancement

Summary: Profiles.userid should be either profiles.profile_id or profiles.id → Profiles.userid should be profiles.id

Max Kanat-Alexander

Updated

•

15 years ago

Assignee: general → database

Component: Bugzilla-General → Database

Bugzilla

Profiles.userid should be profiles.id

Categories

(Bugzilla :: Database, enhancement)

Tracking

()

People

(Reporter: shane.h.w.travis, Unassigned)

References

Details

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Comment 12

Comment 13

Comment 14

Comment 15

Updated

Updated

Updated

Updated