Closed Bug 292492 Opened 20 years ago Closed 20 years ago

[BUGZILLA] Integrate bzbot's ability to receive and report bugmail into Bugzilla.bm

Categories

(Webtools Graveyard :: Mozbot, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: mkanat, Assigned: mkanat)

Details

Attachments

(3 files, 9 obsolete files)

I'm surprised there isn't already a bug for this, but here it is.

bzbot can receive email from a Bugzilla installation and report the changes it
receives into a channel. Since bzbot is in perl, it should be easy to integrate
this functionality into Bugzilla.bm.

Wolf has already created a bzbot.bm that does it, so I'm going to use that as a
base.
Status: NEW → ASSIGNED
Summary: Integrate bzbot's ability to receive and report bugmail into Bugzilla.bm → [BUGZILLA] Integrate bzbot's ability to receive and report bugmail into Bugzilla.bm
Attached file The new bugmail.pl (obsolete) —
Here's the new bugmail.pl file. It's not totally complete, but it has a much
more module log format than the previous bugmail.pl.

Right now the only thing that I really have to do is fix how it deals with
flags.

And then I have to also implement the special-case messages that bzbot
currently has, for things like flags and status changes (because we have to
report status/resolution as a single change).
Comment on attachment 182318 [details]
The new bugmail.pl

Two questions:

>    # Product
>    my $product = $email->header('X-Bugzilla-Product');
>    unless ($product) {
>        debug_print("X-Bugzilla-Product header not found.");
>        return undef;
>    }
>    $bug_info{'product'} = $product;
>    debug_print("Product '$bug_info{product}' found.");

Could you not do this in one step?

>my @mail_array = <STDIN>;
>my $bug_info = parse_mail(\@mail_array);
>    use Data::Dumper;
>    warn Data::Dumper->Dump([$bug_info]);
>if (defined $bug_info) {
>    my $log_lines = generate_log($bug_info);
>    append_log($log_lines);
>}
>
>debug_print("All done!");
>exit 0;

Is the |use Data::Dumper; warn Data::Dumper->Dump([$bug_info]);| meant to be
there?
Attached patch Patch to Bugzilla.bm, v1 (obsolete) — Splinter Review
The new "Bugmail" mozbot code works like this:

Your MTA delivers a message to bugmail.pl. bugmail.pl creates a file called
bugmail.log in the BotModules directory. Then, every little while (controlled
by the updateDelay var), Bugzilla.bm reads that file and decides whether or not
to output messages.

The format of the file is described above generate_log() in bugmail.pl.

There are a few tricky bits with file permissions:
+ The user running bugmail.pl must be able to write to the BotModules
directory.
+ When bugmail.pl creates the file, it must be readable by mozbot.

I deal with that by making my MTA run bugmail.pl as the same user who is
running mozbot.


This is a patch to Bugzilla.bm to read the bugmail.log file created by
bugmail.pl and output the appropriate messages to the appropriate channels.

In order to get it to output messages, you need to set the
"productReportChannels" var, which is explained in the comments.

The types of messages it will report is controlled mostly by the 'reportFields'
var.

If you'd like to see this code in action, join #fedorabot on irc.freenode.org.
It's currently reporting the default reportFields for Bugzilla, Core, Firefox,
JSS, and the Mozilla Application Suite proudcts on bmo.
Attachment #182468 - Flags: review?(bugzilla)
Attached file bugmail.pl, v1.0 (obsolete) —
And here's the actual bugmail.pl script. It parses a bugmail from <STDIN> using
various sorts of tricks, and writes a log message to bugmail.log. It's fairly
well documented, but it is rather long.

It should be put in the BotModules directory.

I think it may have a few bugs here and there with parsing flags (mostly with
un-wrapping the text in the various columns), but I'm not certain.
Attachment #182318 - Attachment is obsolete: true
Attachment #182470 - Flags: review?(colin.ogilvie)
Comment on attachment 182468 [details] [diff] [review]
Patch to Bugzilla.bm, v1

Oh, I forgot to mention:

The only thing this doesn't currently handle correctly is doing the
CheckForBugs and announcing the bug details when a bug change is announced.
Attached file bugmail.pl, v2 (obsolete) —
OK, Colin found a few bugs in the v1.0 script that we worked out in IRC.

Now, we parse the Flags fields in generate_log in an entirely different and
much better way. This also generates much more sensible log lines for Flags.
Attachment #182470 - Attachment is obsolete: true
Attachment #182488 - Flags: review?(colin.ogilvie)
Attachment #182470 - Flags: review?(colin.ogilvie)
Comment on attachment 182488 [details]
bugmail.pl, v2

Most of the warnings and issues found have been fixed however, one particular
email I have in my collection caused warnings to be generated. Uploading it
shortly, along with the debug output.
Attachment #182488 - Flags: review?(colin.ogilvie) → review-
Attached file Output Log (obsolete) —
The output log for the email.
Attached file Email (obsolete) —
The email that generates the warning... I'm guessing its due to the delimiting
with the | char.
Two quick comments on the .bm patch -- please use the normal log (i.e. the
debug() method) instead of creating a new log, so that when mozbot learns about
rotating logs, etc, it is all handled for this module too; and, please have a
whitelist of incoming mail From:s, not a blacklist. (Not that is makes much
difference since faking From: lines is so easy, but still.)
(In reply to comment #10)
> please use the normal log (i.e. the
> debug() method) instead of creating a new log, so that when mozbot learns 
> about rotating logs, etc, it is all handled for this module too;

  It's not a real log -- it's somewhere where we put lines, and then the file is
immediately deleted as soon as Bugmail.bm reads it. The file itself is created
by bugmail.pl. It's basically the way that we "communicate" the incoming email
to mozbot.

> and, please have a
> whitelist of incoming mail From:s, not a blacklist. (Not that is makes much
> difference since faking From: lines is so easy, but still.)

  I actually don't have either, right now. The bugmail.pl script just determines
if something is a mail from the Subject line, and then if it's not a real
bugmail (or somehow malformed), the script will just skip it.
That'll teach me to skim patches. ;-)

> It's not a real log -- it's somewhere where we put lines, and then the file is
> immediately deleted as soon as Bugmail.bm reads it. The file itself is created
> by bugmail.pl. It's basically the way that we "communicate" the incoming email
> to mozbot.

Ah, interesting. Would it make more sense to have some sort of IPC? In case it
would I just checked in some changes to mozbot.pl that allow you to watch a
socket (or other filehandle) for changes. The way to use it is to first create a
filehandle, e.g. using TCP/IP sockets:

    use IO::Socket;
    $self->{handle} = IO::Socket::INET->new(PeerAddr => $self->{host},
                                            PeerPort => $self->{port},
                                            Proto => 'tcp');

...then to register that socket with mozbot:

    $self->registerDataHandle($event, $self->{handle});

You can then respond to data:

    sub DataAvailable {
        my $self = shift;
        my ($event, $handle, $data, $closed) = @_;
        if (defined $self->{handle} and
            $self->{handle}->connected and
            $handle == $self->{handle}) {
            $self->{pending} .= $data;
            # ... handle data in $self->{pending} if it is complete
        } else {
            $self::SUPER->DataAvailable($event, $handle, $data, $closed);
        }
    }

In your case it might be better to use Unix sockets for this, or even a pipe to
a bugmail.pl subprocess (it can then just output on STDOUT as mail comes in and
you can watch that).

In general I would recommend against using files for this kind of thing. You're
likely to end up with all kinds of race conditions and will have to deal with
error cases like bad permissions, out of disk space, unexpected paths, etc.


> I actually don't have either, right now. The bugmail.pl script just determines
> if something is a mail from the Subject line, and then if it's not a real
> bugmail (or somehow malformed), the script will just skip it.

That's probably fine.
Attached file bugmail.pl, v3 (obsolete) —
OK, here's the third version. I've corrected bugs with reporting new
attachments and new bugs, and I think I've also handled Colin's final error
case, where the script could get confused if a comment looked like the diff
table.
Attachment #182488 - Attachment is obsolete: true
Attachment #182497 - Attachment is obsolete: true
Attachment #182498 - Attachment is obsolete: true
Attachment #182679 - Flags: review?(colin.ogilvie)
(In reply to comment #12)
> Ah, interesting. Would it make more sense to have some sort of IPC? [snip]

  Yeah, ideally it would. In fact, the ideal would be if somehow the MTA could
get an email directly into mozbot, in a reliable fashion. I'd use a unix socket,
but I wouldn't be too sure how to (1) get an MTA to reliably deliver to it, (2)
have both mozbot and the MTA be OK when mozbot isn't running, (3) retrieve the
data in batches.

  I suppose (3) is probably handled in the mozbot code, in some fashion or
another. I just need to make sure that we don't try to parse half of a line, or
only one line when there are three available.

  I could probably do it by having a TCP/IP socket on some defined port, but I'm
not sure how to have an MTA just dump an email to a defined port.

  I mean, it would move a lot of the bugmail.pl code into Bugzilla.bm (making
Bugzilla.bm the Largest File Ever), but that would probably be OK. Maybe I could
even have a BugMail.pm that I could "use" in Bugzilla.bm.

> [snip]
> In your case it might be better to use Unix sockets for this, or even a pipe 
> to a bugmail.pl subprocess (it can then just output on STDOUT as mail comes in 
> and you can watch that).

  Yeah, I think the sockets way would be the best to go. If I had a bugmail.pl
subprocess, it would probably have to be a daemon of its own, and that would
probably get complicated.

> In general I would recommend against using files for this kind of thing. 
> You're likely to end up with all kinds of race conditions and will have to 
> deal with error cases like bad permissions, out of disk space, unexpected 
> paths, etc.

  Yeah. I do use an flock on the file in both Bugzilla.bm and bugmail.pl, but
I've already had difficulties with paths and permssions just while I was testing.
Incidentally, one thing you might not have considered is to make a new module
BugzillaWatcher.bm that just does this bugmail handling, instead of making the
Bugzilla.bm module handle both this and queries.
OK, I am trying to use the new mozbot IPC with a Listen socket, and it's not
working. I suspect, looking at mozbot.pl, that the system isn't really set up
for mozbot itself having a Listen sockets.

When I receive some data on my socket, I get:

2005-05-05 18:22:45 UTC Module Bugzilla received some data
2005-05-05 18:22:45 UTC Module Bugzilla: Got data: [] Closed: [1] Handle:
[IO::Socket::UNIX=GLOB(0x9d704e0)]
2005-05-05 18:22:45 UTC Module Bugzilla: Handling Data.
2005-05-05 18:22:45 UTC Dropping handle...

Is there some hack I could do to get a Listen socket to work? It makes a lot of
sense for mozbot to open a Listen socket for various purposes, since it's a
"daemon" (most of the time).
Attachment #182468 - Attachment is obsolete: true
Attachment #182710 - Flags: review?(ian)
Attachment #182468 - Flags: review?(bugzilla)
Hm, I never tried to set it up as a listener. I wouldn't want to consider mozbot
a stable deamon, it can restart at a moment's notice. I'd encourage you to
consider mozbot to be a client only, connecting to deamons that do the work.
(In reply to comment #17)
> Hm, I never tried to set it up as a listener. 

  It doesn't work because listeners need something like:

while ( my $client = $listener->accept() ) {
    while ( <$client> ) { 
        print; 
    }
}

  Whereas I think mozbot.pl is just trying to do a select on the $listener
itself, which won't work. It's possible there's some way to hack it, and do
continual registerDataHandle calls whenever we get an incoming listener. The
trick, of course, is that accept() is blocking, as far as I know.

> I wouldn't want to consider mozbot
> a stable deamon, it can restart at a moment's notice.

  In my case, that's OK. The bugmail.pl script just discards incoming emails if
mozbot isn't up, and I re-create the socket whenever we re-appear.

> I'd encourage you to
> consider mozbot to be a client only, connecting to deamons that do the work.

  Except then I have to write a daemon, when I already have a known
stable-and-running mozbot.
(In reply to comment #15)
> Incidentally, one thing you might not have considered is to make a new module
> BugzillaWatcher.bm that just does this bugmail handling, instead of making the
> Bugzilla.bm module handle both this and queries.

  Well, I need the getURI functionality of Bugzilla.bm to be called directly
from my code, to display the bug details, unless there's some hackery that I can
do to get the bot to respond to itself.
Fair enough about getURI() (although maybe you could abstract the functionality
into a separate .pm module, but there's no much point I guess).

About the deamon thing -- can't you just make bugmail.pl your deamon? Logically,
you would only want one bugmail.pl but I could see having multiple mozbot.pls
interacting with that one bugmail.pl. No?
(In reply to comment #20)
> About the deamon thing -- can't you just make bugmail.pl your deamon? 

  The problem is: Then how would an MTA get email to it? MTAs usually get email
to processes by piping into a script. I'd have to have one piping script, one
daemon, and then mozbot. It seems kind of silly to me when I already have one
long-running script (mozbot).

> Logically,
> you would only want one bugmail.pl but I could see having multiple mozbot.pls
> interacting with that one bugmail.pl. No?

  I think in the case of multiple mozbots, you would handle that on the MTA
side. You'd either be delivering different email addresses to different scripts,
or multiplexing incoming email to multiple mozbots.
Yeah, I don't really like the idea of having to keep two long-running scripts going.
As for the unix socket stuff, will that work under cygwin-perl? (that's
firebot's current platform) If not, how hard would it be to convert back to
using the file? (I'd guess not that hard.)

As for the permissions issues, I solved them locally by having mozbot /not/
delete the file but just empty it at module startup, therefore you set the
permissions once, and leave it (and since NTFS permissions are arguably more
annoying than Unix. heh.).

Firebot's MTA doesn't like piping to a process, instead, it passes the path to
the stored mail as an argument to bugmail.pl, so firebot has to process and
delete it. heh.
Oh, I see. I didn't realise how bugmail.pl worked (I figured it just polled IMAP
or something -- I don't have local mail delivery here so didn't think of that).

Yeah, what you were saying earlier makes sense to me now. I guess you get to
hack Listener support into mozbot.pl. :-)
(Wolf: you got mozbot to work on windows?! wow.)
Yeah, it works pretty well, Cygwin Perl 5.8.6. Had to use rebase on the cygwin
install to get it to run (dll errors about using the same memory space), but its
fairly stable, definitely not production ready but. heh. :-)
Attached file bugmail.pl, v4
OK, I've decided to go back to files, but make sure that mozbot itself is the
only thing that creates the file. I've also made it a hidden file, to keep
things cleaner.

Sockets had the same permissions problems, also had race conditions (if two
processes provided data at the same time), and would have required intense
hacking of mozbot.

The files are flocked by both bugmail.pl and mozbot, so there should be no race
conditions. I haven't had any race conditions on my testing server in several
days of getting all bmo mail, and I had one almost immediately when using
sockets.

Here's the bugmail.pl file, which I think is ready for prime-time.
Attachment #182679 - Attachment is obsolete: true
Attachment #182803 - Flags: review?(colin.ogilvie)
Attached patch Bugzilla.bm, v3 (obsolete) — Splinter Review
OK, and here's the Bugzilla.bm part of the patch. It now always creates a
.bugmail.log file and just truncates it when it's done reading it.

Theoretically, this also makes the flocks much more sensible, since the file is
always the same file. :-)

I'm asking justdave for review, but anybody's welcome to do it if he doesn't
get to it.

It has one slight bug: The first bug number that it reports the details of (not
the change report, but the FetchBug() thing) will be directed to the admin who
loaded the Bugzilla module. I think that's actually a possible bug with $event
in getURI, but I'm not sure.
Attachment #182710 - Attachment is obsolete: true
Attachment #182804 - Flags: review?(justdave)
Attachment #182679 - Flags: review?(colin.ogilvie)
Attachment #182710 - Flags: review?(ian)
Attached patch Bugzilla.bm, v4Splinter Review
OK, this version actually works. :-)
Attachment #182804 - Attachment is obsolete: true
Attachment #183035 - Flags: review?(colin.ogilvie)
Attachment #182804 - Flags: review?(justdave)
Comment on attachment 182803 [details]
bugmail.pl, v4

Not sure I'm qualified to give a review on this; but Max asked me to so I will.

I tested this on just over 3000 bug mails and it worked every time.
Attachment #182803 - Flags: review?(colin.ogilvie) → review+
Comment on attachment 183035 [details] [diff] [review]
Bugzilla.bm, v4

Not sure I'm qualified to give a review on this; but Max asked me to so I will.

I've tested this on the same 3000 bug mails, and it seems to work correctly
every time. (I just wish I could disable the check for the XML file :))
Attachment #183035 - Flags: review?(colin.ogilvie) → review+
Comment on attachment 183035 [details] [diff] [review]
Bugzilla.bm, v4

OK, Hixie, is it OK with you if I check this in? I know that we have control
over this module, but it's a big change so I wanted to also clear it with you.
Attachment #183035 - Flags: review?(ian)
Sure. Make sure justdave is ok with it.

bugmail.pl should be check in to the same directory as Bugzilla.bm. Ideally, it
should have a name that starts with Bugzilla.bm, like, BugzillaMailHandler.pl.
Also, please write a BugzillaMailHandler.txt file that documents how to set all
this up, what typical problems could be, etc, and check it in the same place.

Thanks!
Comment on attachment 183035 [details] [diff] [review]
Bugzilla.bm, v4

<justdave_> mkanat: I say it's ok as long as someone's reviewed it and it works
:)
Attachment #183035 - Flags: review?(ian)
(did anyone _code_-review it? as opposed to just testing it)
OK, here's the docs that I'm going to check in.
<mkanat> Colin2: Did you review the code on Bugzilla.bm at all?
<Colin2> as best i could, yep

I'm also planning to do some cleanup on Bugzilla.bm sometime soon, so if I
missed anything here (which I don't think I did) I'll catch it then.
Checking in BotModules/Bugzilla.bm;
/cvsroot/mozilla/webtools/mozbot/BotModules/Bugzilla.bm,v  <--  Bugzilla.bm
new revision: 2.10; previous revision: 2.9
done
RCS file: /cvsroot/mozilla/webtools/mozbot/BotModules/BugzillaMailHandler.pl,v
done
Checking in BotModules/BugzillaMailHandler.pl;
/cvsroot/mozilla/webtools/mozbot/BotModules/BugzillaMailHandler.pl,v  <-- 
BugzillaMailHandler.pl
initial revision: 2.1
done
RCS file: /cvsroot/mozilla/webtools/mozbot/BotModules/BugzillaMailHandler.txt,v
done
Checking in BotModules/BugzillaMailHandler.txt;
/cvsroot/mozilla/webtools/mozbot/BotModules/BugzillaMailHandler.txt,v  <-- 
BugzillaMailHandler.txt
initial revision: 2.1
done
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Target Milestone: --- → Mozbot 2.6
Great! Thanks!!
QA Contact: kerz → mozbot
Verified Fixed.
Status: RESOLVED → VERIFIED
Product: Webtools → Webtools Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: