Closed Bug 604388 Opened 14 years ago Closed 12 years ago

Filenames used to store data for Old Charts should be based on product IDs, not names (avoid dataloss when renaming products)

Categories

(Bugzilla :: Administration, task)

task
Not set
normal

Tracking

()

RESOLVED FIXED
Bugzilla 5.0

People

(Reporter: Wurblzap, Assigned: Wurblzap)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 6 obsolete files)

Attached patch Patch (obsolete) — Splinter Review
This is a stab towards bug 267240 and does part of it.

When renaming a product, the old charts file name doesn't get renamed. This leads to a stale file staying behind, and an incomplete new data file getting built. The file should be renamed so that it stays in sync with its product.
Attachment #483180 - Flags: review?(LpSolit)
Comment on attachment 483180 [details] [diff] [review]
Patch

The right fix is to include the product ID instead of the product name in the filename. This will be fixed as part of bug 419014.
Attachment #483180 - Flags: review?(LpSolit) → review-
I'm taking this bug as it will be fixed by the sec bug. No offense! :)
Assignee: wurblzap → LpSolit
Depends on: CVE-2010-3764
Whiteboard: [blocker will fix]
Target Milestone: Bugzilla 4.2 → Bugzilla 3.2
Bug 419014 is about graph file names, not about data file names; detaching.
Assignee: LpSolit → wurblzap
No longer depends on: CVE-2010-3764
Whiteboard: [blocker will fix]
Target Milestone: Bugzilla 3.2 → Bugzilla 4.2
Target Milestone: Bugzilla 4.2 → Bugzilla 5.0
Attached patch Patch 2 (obsolete) — Splinter Review
Here's an updated patch, which makes Bugzilla use product IDs instead of product names for data files.

This does not include file renaming in checksetup.pl. The idea is to mention in the release notes that you need to run collectstats.pl --regenerate.
Attachment #483180 - Attachment is obsolete: true
Attachment #625145 - Flags: review?(LpSolit)
Comment on attachment 625145 [details] [diff] [review]
Patch 2

It's just a quick glance, but I already have some comments:


>=== modified file 'collectstats.pl'

>+    my $file = join '/', $dir, $product_id;

Maybe files should be of the form prod_$product_id to be a bit more explicit?


>+    if ($product_id ne 0) {

Write if (!$product) or if ($product_id != 0) as you are comparing integers. I have a preference for !$product.


>+        $and_product = q{ AND products.id = ?};
>         $from_product = q{ INNER JOIN products 
>                           ON bugs.product_id = products.id};

You no longer need to join products as you are now looking at the product ID, not its name.
Attached patch Patch 2.0.1 (obsolete) — Splinter Review
Cleaned up.
I'm not happy about the idea with the prod_ prefix on filenames... Both collectstats.pl and reports.cgi would need to know about it. This would mean we'd need a constant for this in Constants.pm, which seems oversized to me. I think data/mining/ is a good enough prefix.
Attachment #625145 - Attachment is obsolete: true
Attachment #625155 - Flags: review?(LpSolit)
Attachment #625145 - Flags: review?(LpSolit)
Attached patch Patch 2.0.2 (obsolete) — Splinter Review
Sorry, sent the last one off without having run bzr diff again.
Attachment #625155 - Attachment is obsolete: true
Attachment #625161 - Flags: review?(LpSolit)
Attachment #625155 - Flags: review?(LpSolit)
(In reply to Marc Schumann [:Wurblzap] from comment #4)
> This does not include file renaming in checksetup.pl. The idea is to mention
> in the release notes that you need to run collectstats.pl --regenerate.

I think that we should convert existing files to their ID equivalent from checksetup.pl. I suppose most installations do not rename their products once they are created, and it's not acceptable to require hours of recompilation because we are too lazy to rename existing files.

I suggest we do something like this (pseudo-code):

if (my @old_files = grep {$_ !~ /^\d+$/} @all_files) {
    rename_file($_) foreach @old_files;
}

If rename_file() doesn't recognize a file name because the product has been renamed, then a warning should be printed by checksetup.pl asking the administrator to run collectstats.pl --regenerate. Else it should say that all files have been correctly converted. And in the release notes, we would be more explicit by telling admins that they only need to run --regenerate if checksetup.pl asks them to do so.
Keywords: relnote
Attached patch Patch 2.1 (obsolete) — Splinter Review
Ok. This should do it.
Attachment #625161 - Attachment is obsolete: true
Attachment #625161 - Flags: review?(LpSolit)
Attachment #628451 - Flags: review?(LpSolit)
Blocks: 473135
Comment on attachment 628451 [details] [diff] [review]
Patch 2.1

>=== modified file 'Bugzilla/Install/Filesystem.pm'
>--- Bugzilla/Install/Filesystem.pm	2012-04-09 21:22:50 +0000
>+++ Bugzilla/Install/Filesystem.pm	2012-05-30 20:35:11 +0000
>@@ -637,6 +645,65 @@
>+    print "Updating old chart mining file names...";
>+    my @products = Bugzilla::Product->get_all();
>+    unshift(@products, $product_all_products);

I'd like to make this a push() instead of an unshift(), so that we will rename the -All- file last. This lets us enter _update_old_mining_filenames() in the next checksetup.pl run if something went wrong before we took care of the -All- file.
(In reply to Marc Schumann [:Wurblzap] from comment #10)
> I'd like to make this a push() instead of an unshift(), so that we will
> rename the -All- file last. This lets us enter
> _update_old_mining_filenames() in the next checksetup.pl run if something
> went wrong before we took care of the -All- file.

This is a good idea. Do you want to update your patch, or should I review the current patch first?
Attached patch Patch 2.1.1 (obsolete) — Splinter Review
The difference is just the unshift/push replacement.
Attachment #628451 - Attachment is obsolete: true
Attachment #628451 - Flags: review?(LpSolit)
Attachment #631991 - Flags: review?(LpSolit)
Comment on attachment 631991 [details] [diff] [review]
Patch 2.1.1

Sorry for taking so long to review your patch. Your patch looks good (I didn't test it yet), but there are some pieces of code with which I disagree or have problems with.


>=== modified file 'Bugzilla/Install/Filesystem.pm'

>+    # We use a dummy product instance with ID 0 for all products

This wording is confusing and confused me the first time I read this comment. We don't use ID = 0 for all products, we use ID = 0 for a fake -All- product.


>+    my $product_all_products = {id => 0, name => '-All-'};

Nit: this variable name is a bit too long and unclear. Maybe $product_default or $product_all or $product_total.


>+    print "Updating old chart mining file names...";

I think nobody will understand what "mining" is. Maybe drop this word.


>+    my @products = Bugzilla::Product->get_all();
>+    push(@products, $product_all_products);
>+    foreach my $product (@products) {

What do you do for files which have no corresponding product ID (e.g. because they have been renamed)? I think we should display something for them too (without dying).


>+        if (! -e File::Spec->catfile($miningdir, $product->name)) {
>+            push(@conversion_errors,
>+                 { product => $product,
>+                   message => 'The old mining file named "' . $product->name .
>+                              '" doesn\'t exist.' });
>+        }

We shouldn't throw an error in that case. If you run collectstats.pl e.g. weekly or monthly and you created a product meanwhile, it will not have a file in mining/ yet, but this doesn't mean there is something wrong. Not having a file present is not a problem, IMO, and shouldn't throw any error. The job of this method is to convert old filenames to new ones, not to complain about the data integrity. That's the main reason of my r-.


>+        if (-e File::Spec->catfile($miningdir, $product->id)) {
>+            push(@conversion_errors,
>+                 { product => $product,
>+                   message => 'A file named "' . $product->id .
>+                              '" already exists.' });
>+        }

This case is more problematic. Does the file exist because a product name is an integer (which is legal; we have no code to forbid this), or is this an unrelated file? Maybe it makes sense to throw an error in this case. IMO, you should throw the error and stop the conversion process immediately. No need to loop over all products (especially for installations with hundreds of products).



>+            printf "Cannot convert mining file for product %d (%s): %s\n",

As said above, I don't think it makes sense to enumerate all products for which there is a problem as you end your error message with "you need to empty the mining directory" anyway. This means you wouldn't need your @conversion_errors variable at all, and make the code a bit shorter and cleaner.



>=== modified file 'collectstats.pl'

>+# We use a dummy product instance with ID 0 for all products
>+my $product_all_products = {id => 0, name => '-All-'};

Same comment about the confusing comment and variable name as above.



>=== modified file 'reports.cgi'

>+# We use a dummy product instance with ID 0 for all products
>+my $product_all_products = {id => 0};

Same as above.


>+        $product = Bugzilla::Product->new($product_id);
>+        $product && $user->can_see_product($product->name)
>+            || ThrowUserError('invalid_product_id',

I never noticed that reports.cgi had its own error message for products you cannot see. I see no reason to let it display its own error message. We should call "product_access_denied" instead, for consistency.
Attachment #631991 - Flags: review?(LpSolit) → review-
Summary: The old chart file name should be renamed on a product rename → Filenames used to store data for Old Charts should be based on product IDs, not names (avoid dataloss when renaming products)
Attached patch Patch 2.2Splinter Review
Thanks for the review!

> >=== modified file 'Bugzilla/Install/Filesystem.pm'
> 
> >+    # We use a dummy product instance with ID 0 for all products
> 
> This wording is confusing and confused me the first time I read this
> comment. We don't use ID = 0 for all products, we use ID = 0 for a fake
> -All- product.

Reworded in all places in all files.

> >+    my $product_all_products = {id => 0, name => '-All-'};
> 
> Nit: this variable name is a bit too long and unclear. Maybe
> $product_default or $product_all or $product_total.

Using $product_all in all places in all files.

> >+    print "Updating old chart mining file names...";
> 
> I think nobody will understand what "mining" is. Maybe drop this word.

Using "charting data file name" now.

> >+    my @products = Bugzilla::Product->get_all();
> >+    push(@products, $product_all_products);
> >+    foreach my $product (@products) {
> 
> What do you do for files which have no corresponding product ID (e.g.
> because they have been renamed)? I think we should display something for
> them too (without dying).

I don't know, we cannot distinguish these from "free" files. If you don't mind, let's keep this issue out of this bug.

> >+        if (! -e File::Spec->catfile($miningdir, $product->name)) {
> >+            push(@conversion_errors,
> >+                 { product => $product,
> >+                   message => 'The old mining file named "' . $product->name .
> >+                              '" doesn\'t exist.' });
> >+        }
> 
> We shouldn't throw an error in that case. If you run collectstats.pl e.g.

I agree. Removed.

> >+        if (-e File::Spec->catfile($miningdir, $product->id)) {
> >+            push(@conversion_errors,
> >+                 { product => $product,
> >+                   message => 'A file named "' . $product->id .
> >+                              '" already exists.' });
> >+        }
> 
> This case is more problematic. Does the file exist because a product name is
> an integer (which is legal; we have no code to forbid this), or is this an
> unrelated file? Maybe it makes sense to throw an error in this case. IMO,
> you should throw the error and stop the conversion process immediately. No
> need to loop over all products (especially for installations with hundreds
> of products).

The list may get long, sure, but I'd rather see a full report and a long list and be warned before I purge the directory! We're talking about exotic cases here, and the list is really not very likely to get long. And it won't take a long time, either.

I'm keeping this.

> I never noticed that reports.cgi had its own error message for products you
> cannot see. I see no reason to let it display its own error message. We
> should call "product_access_denied" instead, for consistency.

Done, and old message removed.
Attachment #631991 - Attachment is obsolete: true
Attachment #665810 - Flags: review?(LpSolit)
Comment on attachment 665810 [details] [diff] [review]
Patch 2.2

>=== modified file 'reports.cgi'

>+my $product_all = {id => 0};

Also define name => '-All-' as you did in collecstats.pl and Filesystem.pm. This will let you remove the hack below.



>=== modified file 'template/en/default/reports/old-charts.html.tmpl'

>+                <option value="[% product.id FILTER html %]">[% product.name || '-All-' FILTER html %]</option>

With what I suggested above, you can remove || '-All-' as now all product objects have a name.


Your patch is working fine. r=LpSolit with the fix above.
Attachment #665810 - Flags: review?(LpSolit) → review+
I would prefer to take it for trunk only as we are late in the game for 4.4. I hope you don't mind.
Flags: approval+
Target Milestone: Bugzilla 4.4 → Bugzilla 5.0
(In reply to Frédéric Buclin from comment #15)
> With what I suggested above, you can remove || '-All-' as now all product
> objects have a name.

I don't think that's a good idea. I want to be able to localize -All-.
(In reply to Marc Schumann [:Wurblzap] from comment #17)
> I don't think that's a good idea. I want to be able to localize -All-.

You already have

 "title" => "Status Counts for $product_in_title",

hardcoded in reports.cgi. How is that different?
It's not, of course -- but it cannot be, because it's not templatized, but goes straight into Chart::Lines :)
(In reply to Marc Schumann [:Wurblzap] from comment #19)
> It's not, of course -- but it cannot be, because it's not templatized, but
> goes straight into Chart::Lines :)

Let's not fight for such a minor thing. Commit your patch as is. (But no localizer besides you will know it can be localized without breaking everything.)
Thanks! And never underestimate a localizer ;)

Committing to: bzr+ssh://wurblzap%40gmail.com@bzr.mozilla.org/bugzilla/trunk/
modified Bugzilla/Install/Filesystem.pm
modified collectstats.pl
modified reports.cgi
modified template/en/default/global/code-error.html.tmpl
modified template/en/default/global/user-error.html.tmpl
modified template/en/default/reports/old-charts.html.tmpl
Committed revision 8439.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Blocks: 824640
Added to relnotes for 5.0rc1.
Keywords: relnote
Blocks: 1273846
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: