Closed Bug 1593278 Opened 5 years ago Closed 4 years ago

AC snapshots have misconfigured maven-metadata.xml

Categories

(Release Engineering :: Release Automation: Other, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mtabara, Unassigned)

We have a problem with the latest published A-C snapshots. Looks like the jars went out correctly, but for some modules the maven-metadata file wasn't updated: https://snapshots.maven.mozilla.org/?prefix=maven2/org/mozilla/components/browser-state/20.0.0-SNAPSHOT/

The latest published jars are timestamped 20191101.130054, but the metadata still reports 20191031192217 as last updated.
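The mismatch described above can be checked mechanically. This is only a minimal sketch, assuming the jar timestamps use the `yyyyMMdd.HHmmss` form and the metadata's last-updated value uses `yyyyMMddHHmmss` (the two forms seen in this report). The function names are hypothetical and are not part of the maven lambda.

```python
from datetime import datetime


def parse_snapshot_timestamp(value):
    """Parse a snapshot jar timestamp such as '20191101.130054'."""
    return datetime.strptime(value, "%Y%m%d.%H%M%S")


def parse_last_updated(value):
    """Parse a metadata last-updated value such as '20191031192217'."""
    return datetime.strptime(value, "%Y%m%d%H%M%S")


def metadata_is_stale(latest_jar_timestamp, last_updated):
    """Return True when the newest jar is newer than what the metadata reports."""
    jar_time = parse_snapshot_timestamp(latest_jar_timestamp)
    metadata_time = parse_last_updated(last_updated)
    return jar_time > metadata_time


if __name__ == "__main__":
    # Values taken from this bug report.
    print(metadata_is_stale("20191101.130054", "20191031192217"))  # prints True
```

A check like this could run after each snapshot publication to flag modules whose metadata lags behind the uploaded jars.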

This is quite a problem right now as we have incompatible snapshots out and can't update Fenix and R-B.

I suspect it could be related to bug 1589065.

Jeremy provided us with logs; pasting them here until we debug further:

An error occurred (Throttling) when calling the CreateInvalidation operation (reached max retries: 4): Rate exceeded: ClientError
Traceback (most recent call last):
  File "/var/task/metadata.py", line 63, in lambda_handler
    metadata_function=generate_snapshot_listing_metadata
  File "/var/task/metadata.py", line 118, in craft_and_upload_maven_metadata
    bucket_name, folder, METADATA_BASE_FILE_NAME, metadata, content_type='text/xml'
  File "/var/task/metadata.py", line 273, in upload_s3_file
    invalidate_cloudfront(path=key)
  File "/var/task/metadata.py", line 293, in invalidate_cloudfront
    'CallerReference': request_id,
  File "/var/task/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/var/task/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (Throttling) when calling the CreateInvalidation operation (reached max retries: 4): Rate exceeded
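The traceback shows the CreateInvalidation call failing with a Throttling error after the client's own retries were exhausted. One possible mitigation is an extra retry loop with exponential backoff around the invalidation call. The sketch below is written under assumptions and is not the fix that was actually applied to the lambda. `ThrottledError`, `call_with_backoff` and `make_flaky_invalidation` are hypothetical names; real code would catch the botocore `ClientError` and inspect its error code rather than use a stand-in exception.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a ClientError whose error code is 'Throttling'."""


def call_with_backoff(func, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying on ThrottledError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except ThrottledError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # The base delay doubles on each attempt; jitter spreads out
            # retries coming from concurrent invocations.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay))


def make_flaky_invalidation(failures):
    """Build a fake invalidation call that is throttled `failures` times, then succeeds."""
    state = {"calls": 0}

    def invalidate():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ThrottledError("Rate exceeded")
        return "invalidated after %d calls" % state["calls"]

    return invalidate


if __name__ == "__main__":
    # sleep is replaced with a no-op so the demo finishes immediately.
    print(call_with_backoff(make_flaky_invalidation(2), sleep=lambda _: None))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting. Whether extra retries are appropriate here depends on how many invalidations the lambda issues per publication, which this report does not state.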

For the record, re-triggering an AC snapshot worked like a charm so this was intermittent.

To clarify a bit here.

The plan is to get RelEng read-only (RO) access to the maven lambda so that we can debug these things better in the future. We'll dig into this next week, but until then:

  1. If this kind of error occurs again, we should just re-schedule the hook to push another release. The issue seems intermittent.
  2. Sadly, for now mobile team folks can't trigger the hook, as we made some changes in ciadmin while doing taskgraph-ing for AC. The fix is expected on Monday, 4th of November, and is tracked in bug 1593279.

Until then, feel free to ping in #releaseduty-mobile and someone from RelEng can trigger this for you.

Depends on: 1593279

(In reply to Mihai Tabara [:mtabara]⌚️GMT from comment #3)

Plan is to get maven lambda RO access for RelEng so that we can debug these things better in the future. We'll dig into this next week,

Hey :Mihai, do we have an update here? This happened quite a bit last week, causing various problems for us.

Another temporary solution could be to publish snapshots more frequently. We can catch some of these issues with tests before publishing Nightly, but won't capture all of them.

Flags: needinfo?(mtabara)
See Also: → 1600916

(In reply to Christian Sadilek [:csadilek] from comment #4)

Spoke with CloudOps again to bump this in priority. We might get this soon; it is tracked in bug 1600995.

Depends on: 1600995
Flags: needinfo?(mtabara)

This was fixed a while ago in a different bug. Closing this for now; please re-open should it happen again.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED