Closed Bug 1135471 Opened 9 years ago Closed 9 years ago

Add-on validation takes forever

Categories

(addons.mozilla.org Graveyard :: Developer Pages, defect)

Type: defect
Priority: Not set
Severity: critical

Tracking

(Not tracked)

Status: RESOLVED FIXED

People

(Reporter: shinglyu, Unassigned)


Description
Add-on validation takes forever.

Steps
* Go to the Manage My Submissions page
* Upload a new version of xpi for my add-on https://addons.mozilla.org/en-US/developers/addon/focusblocker/versions#version-upload

Expected
Validation finishes within a few minutes.

Actual
Validation has been running for half an hour and has not completed.

Note
My internet connection is fine. I tried two of my add-ons, but validation just won't complete.
I can confirm this for add-on https://addons.mozilla.org/en-US/developers/addon/form-history-control
Uploading a new version seems to get stuck at 100% while validating.

The same thing occurs when just running validation at https://addons.mozilla.org/en-US/developers/addon/validate
I can confirm this for every add-on I tried to upload today.
I have been uploading new versions of my add-on for the past few days, but I'm unable to get any to validate right now.
Severity: normal → critical
Component: Add-on Validation → Developer Pages
OS: Linux → All
Hardware: x86_64 → All
Jason, can you check if the celery queues are backed up on prod?
Flags: needinfo?(jthomas)
Same issue here; I eventually see an error message: "error contacting the server".

I can't change icons for my add-on either. When I click 'Save Changes' after uploading a new screenshot, they're never shown. Not sure if it's related (same backend server experiencing issues?).
Bug #1135528 looks like a duplicate, though with the added information:

> P.S. If I enable it using the Inspector and submit, then I get this:
> There was an error with your upload. Please try again.
(In reply to Kris Maglione [:kmag] from comment #4)
> Jason, can you check if the celery queues are backed up on prod?

The celery queues for the devhub and images tasks were backed up on prod. I have restarted the workers. Could you please try again?
Flags: needinfo?(jthomas)
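For reference, "checking the queues" amounts to asking the workers what they are doing. A minimal sketch using Celery's inspection API, assuming a local broker (the prod broker URL and app configuration are placeholders, not the real values):

    # Minimal sketch of inspecting Celery worker state; the broker URL
    # is a placeholder, not the production configuration.
    from celery import Celery

    app = Celery(broker='amqp://guest@localhost//')
    insp = app.control.inspect()

    # Tasks currently executing on each worker.
    print(insp.active())
    # Tasks prefetched by workers but not yet started; a large pile-up
    # here is one symptom of a backed-up queue.
    print(insp.reserved())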
@Jason I was just able to upload a new version of my add-on, and the images are being updated as well. Thanks!
(In reply to bobbyrne01 from comment #9)
Same for me: my update submission went through fine a bit earlier. Thanks.
It works now. Thanks!
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
It's happening again...  Is there a deeper problem here?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
It's working again. Is there a status page or something to look at? Are these workers monitored or otherwise watched?
I also ran into this earlier. It took me several tries to get an update submitted.
Having a status page for the validator would be great.
It's happening again. Is there a deeper problem, or is it simply that the queue doesn't have enough capacity?
The celery queues for the devhub and images tasks were backed up again. This usually occurs because tasks do not complete correctly and stay in the queue, so the celery workers are unable to process new tasks. I've restarted the workers to clear out the queue.

I do see the following Sentry error, which started about 8 days ago; possibly an issue with amo-validator? http://sentry.mktmon.services.phx1.mozilla.com/mkt/addonsmozillaorg/group/14020/
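A backlog like this can also be detected from the broker side. A rough sketch using the RabbitMQ management HTTP API, where the host, credentials, vhost, 'devhub' queue name, and threshold are all placeholders:

    # Rough sketch: read queue depth from the RabbitMQ management API.
    # Host, credentials, vhost ('%2F' encodes '/'), and the 'devhub'
    # queue name are placeholders, not the real prod configuration.
    import requests

    resp = requests.get(
        'http://rabbitmq.example.com:15672/api/queues/%2F/devhub',
        auth=('guest', 'guest'),
    )
    resp.raise_for_status()
    depth = resp.json()['messages']
    if depth > 1000:  # arbitrary threshold for illustration
        print('devhub queue is backing up: %d messages' % depth)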
Hm. It looks like that error was caused by switching to PyPI for the validator but not including those reference XPIs in the package. The error itself should only affect language packs, though.
Filed bug 1138662 for that issue. Still don't know if it's related to this.
The queue was backing up again; restarted the workers. Here's another Sentry error I've been seeing, related to a validator timeout: http://sentry.mktmon.services.phx1.mozilla.com/mkt/addonsmozillaorg/group/14274/
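One common mitigation for tasks that hang and wedge a worker is to bound them with Celery time limits, so a stuck validation is killed instead of blocking the queue. A sketch under that assumption; the task body, helper functions, and limit values are illustrative, not olympia's actual code:

    # Sketch: bound a validation task with Celery time limits so a hung
    # run cannot wedge the worker. Helpers and values are illustrative.
    from celery import Celery
    from celery.exceptions import SoftTimeLimitExceeded

    app = Celery(broker='amqp://guest@localhost//')

    def run_validation(upload_id):
        pass  # stand-in for the real amo-validator call

    def mark_failed(upload_id):
        pass  # stand-in for recording the failure on the upload

    @app.task(soft_time_limit=300, time_limit=330)  # seconds, illustrative
    def validate_addon(upload_id):
        try:
            run_validation(upload_id)
        except SoftTimeLimitExceeded:
            # Record a failure instead of leaving the task stuck forever.
            mark_failed(upload_id)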
It's happening again; I can't upload a new version today (it was fine last night).
Restarted the workers; same errors as mentioned in comments 17 and 21.
Any chance the validator servers could be upgraded to something more powerful to stop this issue from occurring again? An automated restart once a day would probably also help.
It's not a matter of more powerful hardware. The servers are more than capable of handling the load. This is a software bug.

magopian, is there any chance you can look into what's causing this? It's turning into a major problem.
Flags: needinfo?(mathieu)
Pushed https://github.com/mozilla/olympia/compare/2015.02.19...2015.03.05, we'll see how it goes.
Flags: needinfo?(mathieu)
Queues have been looking good since the deploy. Please reopen if you continue to have issues.
Status: REOPENED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Hello, I'm still having the same issue. Yesterday I was able to upload a new add-on; today I'm trying to upload a new version, but it won't get past validation.
Hello, I've been facing this issue for a few hours now. I uploaded an add-on yesterday, but I am not able to submit an update.
Blocks: 1163799
Apparently the queue is hanging again.
Any chance these "workers" can be restarted automatically every day, so we don't run into this problem anymore?
Depends on: 1181748
I'm a little surprised to see this marked RESOLVED FIXED. I had a hang again this morning, and bug 1181748 doesn't apply to me.
It's been a recurring issue. The specific instances reported here have been fixed. Bug 1177865 addresses the general problem and will go live today.
Er, rather, bug 1163799 addresses the general problem and will go live today.
super, thanks for the feedback.
This should be fixed on production now. Please reopen bug 1163799 if you see it again.
Product: addons.mozilla.org → addons.mozilla.org Graveyard
The issue is present again (2016-08-16).
Months later, it is very likely that this is a different issue. In fact, we are currently investigating some database issues.
Our RabbitMQ server was having issues. I've added additional monitoring to catch this in the future.
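For what it's worth, a watchdog along these lines can poll the broker and flag a stalled queue. A sketch using pika, where the host, queue name, threshold, and polling interval are assumptions, not the real monitoring configuration:

    # Sketch of a watchdog that polls RabbitMQ for queue depth and
    # flags a backlog. Host, queue name, threshold, and interval are
    # assumptions, not the real monitoring setup.
    import time

    import pika

    params = pika.ConnectionParameters('rabbitmq.example.com')
    conn = pika.BlockingConnection(params)
    channel = conn.channel()

    while True:
        # passive=True only checks the queue; it will not create it.
        q = channel.queue_declare(queue='devhub', passive=True)
        if q.method.message_count > 1000:
            print('ALERT: devhub queue backed up')  # stand-in for real alerting
        time.sleep(60)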
It looks like this is happening again, I have filed an issue at https://github.com/mozilla/addons/issues/230.