Closed Bug 360231 Opened 18 years ago Closed 10 years ago

importxml.pl ignores the maxattachmentsize and maxlocalattachment parameters when importing attachments

Categories

(Bugzilla :: Bug Import/Export & Moving, defect, P3)

defect

Tracking

()

RESOLVED FIXED
Bugzilla 4.4

People

(Reporter: ycombarnous, Assigned: LpSolit)

Details

Attachments

(1 file, 5 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.0
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.0

Bugzilla version: 2.22.1

How can I tell importxml.pl to import files on local storage instead of DB?

I use the encoding "filename" for the XML, and put all my files to import inside a folder that I give as parameter (--attach_data = ...).

Here is a sample of my xml file:
--------------------------------

<attachment
              isobsolete="0"
              ispatch="0"
              isprivate="0">
            <attachid>302</attachid>
            <date>2005-07-27 09:49:29 CET</date>
            <desc>Screen shots CR 1426</desc>
            <filename>Screen shot.doc</filename>
            <type>application/msword</type>
            <data encoding="filename">CR1426-20050727094924.doc</data>        
          </attachment> 

I tried to customize import.xml, but I fail to insert a null value into the 'thedata' field on Mysql (though I think this is needed so that it knows it is on local storage). I took out the code from attachment.cgi.

Below is my failing patch:
--------------------------

93a94
> use File::Copy;
418a420
> 		$attachment{'filepath'} = $attach_filename; #YC
1143c1145,1146
<         my $att_data = $att->{'data'};
---
>         my $att_data; #YC
> 	trick_taint($att_data);
1146c1149
<         trick_taint($att_data);
---
>         # YC trick_taint($att_data);
1148a1152,1159
> 	
> 	my $attachdir = "d:/bugzilla/bugzilla/data/attachments";
> 	my $hash_yc = ($att_id % 100) + 100;
> 	$hash_yc =~ s/.*(\d\d)$/group.$1/;
> 	mkdir "$attachdir/$hash_yc", 0770;
> 	chmod 0770, "$attachdir/$hash_yc";
> 	copy($att->{'filepath'}, ">$attachdir/$hash_yc/attachment.$att_id");

Reproducible: Always
The rewrite of importxml happened at about the same time as the local storage became an option, which meant I wasn't thinking of it when I did it ;)

What importxml should do is check the maxlocalattachmentsize paramater and respect it. The code in attachment.cgi and Attachment.pm can give you a better idea of how this works. If you want to write a patch, feel free to do so and attach it to this bug and request me as the reviewer. 
Status: UNCONFIRMED → NEW
Ever confirmed: true
Attached patch Patch v1 (against 2.22.1+) (obsolete) — Splinter Review
This should be working, added a "--local_storage|-l" option on the command line.
Attachment #245754 - Flags: review?(ycombarnous)
Attachment #245754 - Flags: review?(ycombarnous) → review?(ghendricks)
Comment on attachment 245754 [details] [diff] [review]
Patch v1 (against 2.22.1+)

please create a valid and readable patch with "diff -uN". There must be at least 3 lines of context.
Attachment #245754 - Flags: review?(ghendricks) → review-
Attached patch Patch v1.1 (against 2.22.1) (obsolete) — Splinter Review
Patch created with cvs.
Attachment #245754 - Attachment is obsolete: true
Attachment #246157 - Flags: review?(LpSolit)
I would prefer a patch against 2.23.3+ as no enhancement bug will be taken on the 2.22 branch. Your patch will be delayed till 3.2 anyway as the code is frozen to prepare Bugzilla 3.0.
Severity: normal → enhancement
Comment on attachment 246157 [details] [diff] [review]
Patch v1.1 (against 2.22.1)

>@@ -79,7 +79,7 @@
> use Bugzilla::Component;
> use Bugzilla::Milestone;
> use Bugzilla::FlagType;
>-use Bugzilla::Config qw(:DEFAULT $datadir);
>+use Bugzilla::Config qw(:DEFAULT $attachdir);
> use Bugzilla::BugMail;
> use Bugzilla::User;
> use Bugzilla::Util;

Config now provides a bz_locations function. Use this instead.


>@@ -1140,12 +1148,30 @@
>             $att->{'desc'}, $att->{'ctype'}, $att->{'ispatch'},
>             $att->{'isprivate'}, $att->{'isobsolete'}, $exporterid);
>         my $att_id   = $dbh->bz_last_key( 'attachments', 'attach_id' );
>-        my $att_data = $att->{'data'};
>+        my $att_data; 
>+	if ($local_storage) {
>+ 	    $att_data = $att_data || '';
>+ 	}
>+ 	else {
>+	    $att_data = $att->{'data'};
>+ 	}

What is the point of this? $att->{'data'} is the data read from the file. It should make no difference whether it is being stored in the database or locally.

>         my $sth = $dbh->prepare("INSERT INTO attach_data (id, thedata) 
>                                  VALUES ($att_id, ?)" );
>         trick_taint($att_data);
>         $sth->bind_param( 1, $att_data, $dbh->BLOB_TYPE );
>         $sth->execute();

So we are storing in both the database and on the filesystem? shouldn't there be an if block otherwise?

As LpSolit said, this really needs to be taken against the tip. 2.22 branch is closed to new features.
Attachment #246157 - Flags: review?(LpSolit) → review-
(In reply to comment #6)
> (From update of attachment 246157 [details] [diff] [review] [edit])
> >@@ -79,7 +79,7 @@
> > use Bugzilla::Component;
> > use Bugzilla::Milestone;
> > use Bugzilla::FlagType;
> >-use Bugzilla::Config qw(:DEFAULT $datadir);
> >+use Bugzilla::Config qw(:DEFAULT $attachdir);
> > use Bugzilla::BugMail;
> > use Bugzilla::User;
> > use Bugzilla::Util;
> 
> Config now provides a bz_locations function. Use this instead.
> 
> 
> >@@ -1140,12 +1148,30 @@
> >+	if ($local_storage) {
> >+ 	    $att_data = $att_data || '';
> >+ 	}
> >+ 	else {
> >+	    $att_data = $att->{'data'};
> >+ 	}
> 
> What is the point of this? $att->{'data'} is the data read from the file. It
> should make no difference whether it is being stored in the database or
> locally.

Ah I think I get it now. Since the database needs a 0 byte attachment in order to know to check the local files.
> 
> >         my $sth = $dbh->prepare("INSERT INTO attach_data (id, thedata) 
> >                                  VALUES ($att_id, ?)" );
> >         trick_taint($att_data);
> >         $sth->bind_param( 1, $att_data, $dbh->BLOB_TYPE );
> >         $sth->execute();
> 
> So we are storing in both the database and on the filesystem? shouldn't there
> be an if block otherwise?
> 

That would explain this as well.

> As LpSolit said, this really needs to be taken against the tip. 2.22 branch is
> closed to new features. 
> 

This is still true however. :)
(In reply to comment #7)
> Ah I think I get it now. Since the database needs a 0 byte attachment in order
> to know to check the local files.

Correct. All I did is taking code from attachment.cgi, miming exactly what it is doing. This is in the "validateData" function:

 if (!$cgi->param('bigfile'))
  {
      # enable 'slurp' mode
      local $/;
      $data = <$fh>;
  }

  $data
    || ($cgi->param('bigfile'))
    || ThrowUserError("zero_length_file");

  return $data || '';

> > >         my $sth = $dbh->prepare("INSERT INTO attach_data (id, thedata) 
> > >                                  VALUES ($att_id, ?)" );
> > >         trick_taint($att_data);
> > >         $sth->bind_param( 1, $att_data, $dbh->BLOB_TYPE );
> > >         $sth->execute();
> > 
> > So we are storing in both the database and on the filesystem? shouldn't there
> > be an if block otherwise?
> > 
> 
> That would explain this as well.

Yep.

>> As LpSolit said, this really needs to be taken against the tip. 2.22 branch is closed to new features. 
> This is still true however. :)

I will check for porting this to HEAD.

However, be reminded that the -attach_data option in importxml.pl (for reading local files instead of base64) appeared in 2.22.1 version, not in 2.22 ;-).
(In reply to comment #8)
> However, be reminded that the -attach_data option in importxml.pl (for reading
> local files instead of base64) appeared in 2.22.1 version, not in 2.22 ;-).
> 
Actually, meant "attach_path" option, not attach_data.
>+=item B<-m>
>+
>+    Store attachment on local filesystem (big file).

Actually, should be +=item B<-l>
Attached patch Patch v1.2 (against 2.23.3+) (obsolete) — Splinter Review
Patch against 2.23.3+ this time.
Should be ok now
Attachment #246157 - Attachment is obsolete: true
Attachment #246899 - Flags: review?(ghendricks)
Comment on attachment 246899 [details] [diff] [review]
Patch v1.2 (against 2.23.3+)

This looks good. I am not sure what the status of the tip is as for being frozen.
Attachment #246899 - Flags: review?(ghendricks) → review+
Flags: approval?
Comment on attachment 246899 [details] [diff] [review]
Patch v1.2 (against 2.23.3+)

>Index: importxml.pl

>+	    if (length($attachment{'data'})>$limit) {

Nit: add whitespaces around ">".


>+        my $att_data;
>+	if ($local_storage) {
>+ 	    $att_data = $att_data || '';

You already know that $att_data is undefined, so this should be: $att_data = '';

>+ 	}
>+ 	else {
>+	    $att_data = $att->{'data'};
>+ 	}

This IF block should be written: my $att_data = $local_storage ? '' : $att->{'data'};


>+	    my $attachdir = bz_locations()->{'attachdir'};
>+            my $hash = ($att_id % 100) + 100;
>+            $hash =~ s/.*(\d\d)$/group.$1/;
>+            mkdir "$attachdir/$hash", 0770;
>+            chmod 0770, "$attachdir/$hash";
>+            open(AH, ">$attachdir/$hash/attachment.$att_id");
>+	    binmode AH;

This part of the code is duplicated; it already exists as is in Attachment.pm. You should refactor this code in a routine such as create_local_attachment() or prepare_local_attachment() or some other explicit name. That's the main reason for my r-.
Attachment #246899 - Flags: review-
Assignee: import-export → ycombarnous
Flags: approval?
Target Milestone: --- → Bugzilla 3.2
Bugzilla 3.2 is now frozen. Only enhancements blocking 3.2 or specifically approved for 3.2 may be checked in to the 3.2 branch. If you would like to nominate your enhancement for Bugzilla 3.2, set the "blocking3.2" flag to "?". Then, either the target milestone will be changed back, or the blocking3.2 flag will be granted, if we will accept this enhancement for Bugzilla 3.2.

This particular bug has not been touched in over eight months, and thus is being retargeted to "---" instead of "Bugzilla 4.0". If you believe this is a mistake, feel free to retarget it to Bugzilla 4.0.
Target Milestone: Bugzilla 3.2 → ---
Priority: -- → P3
Whiteboard: [needs new patch]
Assignee: ycombarnous → import-export
This is actually a bug. importxml.pl bypasses checks done by Bugzilla::Attachment->create as it does it own stuff. Everything is hardcoded in importxml.pl. It's time for it to call new methods such as Bugzilla::Bug->create and Bugzilla::Attachment->create.

(in a separate bug, it's also time to use Bugzilla::Error instead of self-made error messages.)
Severity: enhancement → normal
Summary: Enable importxml to import attachment on local storage (versus DB) → importxml.pl ignores the maxattachmentsize and maxlocalattachment parameters when importing attachments
Whiteboard: [needs new patch]
Assignee: import-export → LpSolit
Status: NEW → ASSIGNED
Target Milestone: --- → Bugzilla 4.4
Attached patch patch, v2 (obsolete) — Splinter Review
The code is working fine except for one thing:

If the attachment is larger than maxattachmentsize, then Bugzilla::Attachment->create will store the attachment locally, as expected. But the problem is that it calls "chmod 0770" and depending on the UID and GID used when running importxml.pl, the directory and the file can be unreadable by the web server (permission denied). Once we run checksetup.pl or contrib/fixperms.pl, the attachment is readable again, but I'm not sure that a good workaround.
Attachment #246899 - Attachment is obsolete: true
Attached patch patch, v2.1 (obsolete) — Splinter Review
Cleaner patch. But the permission issue is still there.
Attachment #8355720 - Attachment is obsolete: true
Attachment #8355720 - Attachment description: bug0-9.diff → patch, v2
Attachment #8355721 - Attachment description: patch, v1.1 → patch, v2.1
<justdave> probably should advise to run it as the webserver user then

So not something to do in 4.4.
Target Milestone: Bugzilla 4.4 → Bugzilla 5.0
I think we can take it for 4.4 on the following grounds:

1) The current situation means if you submit a large attachment it breaks
2) The situation with this fix is that if you submit a large attachment it breaks unless you happen to be running the script as the correct user.

#2 is arguably a better situation, since you can then correct your wrapper scripts or whatever to run as the correct user and have it fixed.
Target Milestone: Bugzilla 5.0 → Bugzilla 4.4
For clarification, the reason NOT to take it is that it potentially required configuration changes on your system, before before it didn't matter what user it ran as as long as it could read the config, and now *if* it needs to write an attachment, it also needs to be able to write to disk.  If the script worked at all before with large attachments, then this is a bad thing, because it would break people who weren't running it as the correct user.  But since it's already breaking anyway, and the fix lets you unbreak it if you use the correct user (where you couldn't unbreak it before), this is okay.
Comment on attachment 8355721 [details] [diff] [review]
patch, v2.1

In that case the patch is ready. :)
Attachment #8355721 - Flags: review?(dkl)
Attached patch patch, v2.2Splinter Review
Forgot to include the isobsolete attribute.
Attachment #8355721 - Attachment is obsolete: true
Attachment #8355721 - Flags: review?(dkl)
Attachment #8355768 - Flags: review?(dkl)
Comment on attachment 8355768 [details] [diff] [review]
patch, v2.2

Review of attachment 8355768 [details] [diff] [review]:
-----------------------------------------------------------------

r=dkl
Attachment #8355768 - Flags: review?(dkl) → review+
Flags: approval?
Flags: approval4.4?
Flags: approval?
Flags: approval4.4?
Flags: approval4.4+
Flags: approval+
Committing to: bzr+ssh://lpsolit%40gmail.com@bzr.mozilla.org/bugzilla/trunk/
modified importxml.pl
Committed revision 8858.

Committing to: bzr+ssh://lpsolit%40gmail.com@bzr.mozilla.org/bugzilla/4.4/
modified importxml.pl
Committed revision 8648.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Added to relnotes for 4.4.2.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: