Closed Bug 850830 Opened 11 years ago Closed 10 years ago

Notify app developers when their manifest is broken

Categories

(Marketplace Graveyard :: Validation, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED
2013-09-17

People

(Reporter: eviljeff, Assigned: mat)

References

Details

(Whiteboard: p=2 [qa-])

Attachments

(1 file)

Currently when an app developer breaks their manifest after submission we find out about it during the daily cron job and then flag the app for re-review.  
We should automatically email the developer to give them a chance to fix it themselves in the mean time rather than waiting a few days for the review team to notify them manually.

As well as reducing the amount of time an an app in broken on the Marketplace it should also reduce reviewer time spent typing a response (and then subsequently having to approve it again)
The email should have an opt out, I would think.
Whiteboard: p=2
(In reply to Rob Hudson [:robhudson] from comment #1)
> The email should have an opt out, I would think.

The app delete button?
(In reply to Matt Basta [:basta] from comment #2)
> (In reply to Rob Hudson [:robhudson] from comment #1)
> > The email should have an opt out, I would think.
> 
> The app delete button?

heh, yeah.  The email should say something like 'if you can't fix the manifest right now you can disable the listing here <...> and re-enable later'.
I see this as a P2, it's important. App servers go down, that's how the web works. If the developer is not notified then the app could get kicked back into the review queue and that could take days. If the author is notified then they might be able to restart their server within minutes.
Priority: -- → P2
I wrote a draft of what the email template should be.  I'm not 100% happy with the wording tbh.  We need to get across:
* they should fix the manifest problem quickly
* if there isn't a problem don't worry
* we don't want them to reply to the email or contact us in any case
* we'll contact them after we manually check the app only if we find a problem
(In reply to Lisa Brewster [:adora] from comment #7)
> Here's an edit of eviljeff's draft: 
> https://docs.google.com/a/mozilla.com/document/d/
> 12_JhBoiYT9_vvxG4sbYnpDophQDuXzO95b5LOy7_2M8/edit

I can't access that doc - can you attach it as an attachment to this bug instead?
Note: there are 2 different cases in webapps.tasks._update_manifest that causes an app to end up in the ReReview queue: 
- Failure to fetch the manifest 3 times in a row (1 hour between each attempt)
- Manifest validation failure on a successful fetch

I assume the mail should be sent in both cases, maybe including the failure directly in the mail.

A couple questions:
- If there are multiple developers for an app, should we email all developers ? Only the owner ?
- I assume we aren't going to send the mail multiple times for the same problem. How are we going to store the information to avoid this and how is it going to be reset when the problem is fixed, and by who ? (the developer ? a reviewer ? both ?)
- Do we need some kind of safeguard to prevent us from mailing everyone in case of say, a major network outage ?
Assignee: nobody → mpillard
(In reply to Mathieu Pillard [:mat] from comment #9)
> Note: there are 2 different cases in webapps.tasks._update_manifest that
> causes an app to end up in the ReReview queue: 
> - Failure to fetch the manifest 3 times in a row (1 hour between each
> attempt)
> - Manifest validation failure on a successful fetch
> 
> I assume the mail should be sent in both cases, maybe including the failure
> directly in the mail.

we can include the validation report link - I *think* that includes the failure reason.

> A couple questions:
> - If there are multiple developers for an app, should we email all
> developers ? Only the owner ?

I'd say follow the same logic as during the review.  (all of them?)

> - I assume we aren't going to send the mail multiple times for the same
> problem. How are we going to store the information to avoid this and how is
> it going to be reset when the problem is fixed, and by who ? (the developer
> ? a reviewer ? both ?)

Just don't send an email if its already been flagged for re-review.

> - Do we need some kind of safeguard to prevent us from mailing everyone in
> case of say, a major network outage ?

possibly... any ideas?
Discussion with Andrew on IRC:

Currently, the app enters the re-review queue after the 3rd attempt. But that means we wait *2* hours, not 3 : 2nd attempt is one hour after the first one, and 3rd attempt is one hour after the 2nd one. 

I kinda like the 3 hours delay, so I'm going to implement this:
- Add a 4th attempt
- On 3rd attempt failure, email the developer
- Move re-review add code to the 4th attempt

That leaves the developer one hour "bonus time" to catch the mail before it hits re-review. It's obviously not a very big window, but if the developer catches it and it's an easy fix, it can save us some useless re-reviews. If he catches it later then we'll do an useless re-review, but at least we try avoiding it :)


Everything should be easy to tweak anyway, if we want to change when the mail is sent, when the app enters the re-review queue, how much time to wait between attempts, etc.
Could we email on each failed attempt and include in the email something like "attempt 2 of 4. Next attempt will be in one hour."? That notifies the developer on the first instance of the problem and gives them a timeline. It might be excessive but I'd like to know right away and if my attempt to fix still failed or not.
> Could we email on each failed attempt and include in the email something
> like "attempt 2 of 4. Next attempt will be in one hour."? That notifies the
> developer on the first instance of the problem and gives them a timeline. It
> might be excessive but I'd like to know right away and if my attempt to fix
> still failed or not.

I don't think it's a good idea to rely on email to notify the developer he managed to fix it. This would send many more mails from us, many of them useless for the developer. And if he doesn't receive the next one, he doesn't know if it's because mails are lagging behind or just if the fix worked.

I'd prefer being able to display on the developer info page that something is wrong (or not), with a date. Possibly just displaying when was the last successful fetch + would be enough? That's easy to do with the 'modified' field on AppManifest I think. We could also go even further and display the last fetch status by adding another field somewhere to store that info.(In reply to Rob Hudson [:robhudson] from comment #12)
part of the problem is for hosted apps we don't expose the validation report link for developers apart from on first upload.  You can 'refresh' the manifest via the manage pages but if there was an error the report is only visible via the reviewer review page.
Status: NEW → ASSIGNED
Target Milestone: --- → 2013-09-17
Fixed in https://github.com/mozilla/zamboni/commit/29454099cc4bec8fb0c6d831d5cb7cbe43b4f29f

This is rather tricky to test, so I'm inclined to mark it as [qa-] and test it ourselves, but for the record:
you need to:
a) Submit a valid hosted app, and get it reviewed
b) Change something in the manifest making it invalid
c) Wait for the daily cron to start (not sure it's even launched on -dev)
d) Check that a mail has been sent and the app is now in the re-review queue

And *also*, different case:

a) Submit a valid hosted app, and get it reviewed
b) Change something in the *server* handling the manifest and make it return an error (either an HTTP error, like 404, or simulate a full network outage on this server)
c) Wait for the daily cron to start (not sure it's even launched on -dev)
d) Wait for the 3rd attempt (t+2 hours)
d) Check that a mail has been sent and the app is *not yet* in the re-review queue
e) Wait for the 4th attempt, one hour later
f) Check that no new mail was sent and the app is now in the rereview queue (status unchanged).

Fun times.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: p=2 → p=2 [qa-]
Has this been implemented on prod and tested using whatever process is most appropriate?
It's in prod. I tested this locally only - because of the delays and the need to break your app on purpose it was annoying to test elsewhere so I had to set it to qa- (only verified that it didn't seem to be breaking the process on -dev).
You need to log in before you can comment on or make changes to this bug.