Closed Bug 1630659 Opened 5 years ago Closed 5 years ago

Assist investigation of HMAC corruption issue

Categories

(Cloud Services Graveyard :: Operations: Sync, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: eolson, Assigned: eolson)

References

Details

Tracking efforts to identify the HMAC issue in the user migration from sync-py to sync-rs.

Type: task → defect
Priority: -- → P1

ni? myself for a test user account on which to try to repro this.

Flags: needinfo?(rfkelly)

I have a test user account with current storage URI:

https://sync-660-us-west-2.sync.services.mozilla.com/1.5/143087941

Please try migrating this account to spanner storage and I'll see if the HMAC issue appears.

Flags: needinfo?(rfkelly)
Blocks: 1631788

Apologies, I just noticed this comment yesterday. We cannot selectively 401 individual users, so I had to reassign this uid to a node we have dedicated to migration in production (815). From there we can test the e2e migration workflow.

Your new userid on node 815 is 143742198. Once it syncs there, I can run the data migration to spanner, reassign the user, and 401 the node.
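(A minimal sketch of that end-to-end workflow, in illustrative Python; every function name here is a hypothetical stand-in for the real migration tooling, and the stubs just print:)

# Hypothetical outline of the e2e migration workflow described above.
# None of these names are the real tooling; the bodies are stubs.

def wait_for_sync(uid, node):
    # In reality: wait for the client to sync to the dedicated node.
    print(f"uid {uid} has synced to node {node}")

def copy_to_spanner(uid):
    # In reality: copy the user's collections to spanner storage.
    print(f"migrated data for uid {uid} to spanner")

def reassign_user(uid):
    # In reality: update the node assignment for this user.
    print(f"reassigned uid {uid} to the spanner node")

def mark_node_401(node):
    # "401 the node": the old node rejects auth so clients fetch a
    # fresh token and discover their new node assignment.
    print(f"node {node} now returns 401")

def migrate(uid, node):
    wait_for_sync(uid, node)
    copy_to_spanner(uid)
    reassign_user(uid)
    mark_node_401(node)

migrate(143742198, 815)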

Re-adding ni for you, Ryan, so you see that last bit in comment #7.

Flags: needinfo?(rfkelly)

Thanks, I can confirm that I've now synced this account to https://sync-815-us-west-2.sync.services.mozilla.com/1.5/143742198/

FWIW, I believe :pjenvey had a pretty good theory about the cause of the HMAC errors over in https://github.com/mozilla-services/services-engineering/issues/45, so I would be happy to just close this bug as resolved. But I'm also happy to give it one more test from my point of view if that's useful.

Flags: needinfo?(rfkelly)

We were scheduled to do prod user migration testing today, so I went ahead and migrated this user as well. The new uid is 143742198.

Thanks. After a fresh sync this profile is now syncing successfully to https://sync-1-us-west1-g.sync.services.mozilla.com/1.5/144105202/

> The new uid is 143742198

I note that I got a different uid from the one you shared above; is this something to be concerned about? The browser did not appear to re-upload any of its data (apart from tabs), which IIUC indicates that things were migrated successfully on the backend.

There were no HMAC errors present during the sync.

Indeed, your new uid should be 144105202. In my haste, I forgot that your new id for the spanner node would not be created until the first sync. The uid I mentioned above was for the client's time on the migration sync-py node.
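(Sketching why that happens, with purely illustrative code: a fresh per-node uid is minted the first time a user requests a token for a node, i.e. on the first sync there, so the spanner-side uid cannot exist before that sync. The seed value below is illustrative only, not the real allocation scheme.)

# Illustrative only; not the actual tokenserver code or values.
import itertools

_uids = itertools.count(143742198)  # illustrative seed
_assignments = {}                   # (user, node) -> storage uid

def uid_for(user, node):
    # A uid is created the first time this user hits this node,
    # which happens when the client first syncs there.
    key = (user, node)
    if key not in _assignments:
        _assignments[key] = next(_uids)
    return _assignments[key]

print(uid_for("test-user", "sync-815"))  # first uid: the migration node
print(uid_for("test-user", "spanner"))   # new node, so a new uid
print(uid_for("test-user", "spanner"))   # stable on later syncs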

I pulled the server logs; these are all the requests:

timestamp                       request                                                                   status  request_length  bytes_sent
2020-04-28 23:11:26.661374 UTC  GET /1.5/144105202/info/collections HTTP/1.1                              200     1140.0          186.0
2020-04-28 23:11:26.868866 UTC  GET /1.5/144105202/info/configuration HTTP/1.1                            200     1143.0          165.0
2020-04-28 23:11:27.100393 UTC  GET /1.5/144105202/storage/meta/global HTTP/1.1                           200     1144.0          577.0
2020-04-28 23:11:27.315571 UTC  GET /1.5/144105202/storage/meta/global HTTP/1.1                           200     1144.0          577.0
2020-04-28 23:11:27.526238 UTC  GET /1.5/144105202/storage/crypto/keys HTTP/1.1                           200     1143.0          402.0
2020-04-28 23:11:27.742528 UTC  GET /1.5/144105202/info/collections HTTP/1.1                              200     1140.0          186.0
2020-04-28 23:11:27.943604 UTC  GET /1.5/144105202/info/configuration HTTP/1.1                            200     1142.0          165.0
2020-04-28 23:11:28.159722 UTC  GET /1.5/144105202/storage/clients?full=1&limit=1000 HTTP/1.1             200     1158.0          540.0
2020-04-28 23:11:28.461233 UTC  POST /1.5/144105202/storage/clients?batch=true&commit=true HTTP/1.1       200     1791.0          65.0
2020-04-28 23:11:28.802279 UTC  POST /1.5/144105202/storage/tabs?batch=true&commit=true HTTP/1.1          200     1906.0          65.0
2020-04-28 23:12:25.167678 UTC  GET /1.5/144105202/info/collections HTTP/1.1                              200     1141.0          207.0
2020-04-28 23:12:25.395196 UTC  GET /1.5/144105202/info/collection_counts HTTP/1.1                        200     1146.0          101.0
2020-04-28 23:12:25.655342 UTC  GET /1.5/144105202/storage/bookmarks?full=1&limit=1000 HTTP/1.1           200     1160.0          4064.0
2020-04-28 23:12:25.657521 UTC  GET /1.5/144105202/storage/clients?full=1&limit=1000 HTTP/1.1             200     1158.0          540.0
2020-04-28 23:12:25.658093 UTC  GET /1.5/144105202/storage/forms?full=1&limit=1000 HTTP/1.1               200     1156.0          308.0
2020-04-28 23:12:25.658534 UTC  GET /1.5/144105202/storage/crypto?full=1&limit=1000 HTTP/1.1              200     1156.0          404.0
2020-04-28 23:12:25.658955 UTC  GET /1.5/144105202/storage/addons?full=1&limit=1000 HTTP/1.1              200     1157.0          478.0
2020-04-28 23:12:25.666260 UTC  GET /1.5/144105202/storage/meta?full=1&limit=1000 HTTP/1.1                200     1155.0          579.0
2020-04-28 23:12:25.672740 UTC  GET /1.5/144105202/storage/tabs?full=1&limit=1000 HTTP/1.1                200     1155.0          668.0
2020-04-28 23:12:25.674269 UTC  GET /1.5/144105202/storage/prefs?full=1&limit=1000 HTTP/1.1               200     1156.0          9763.0
2020-04-28 23:12:25.675369 UTC  GET /1.5/144105202/storage/history?full=1&sort=index&limit=1000 HTTP/1.1  200     1169.0          5592.0

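(For reference, a quick tally of the excerpt above; the request paths are copied verbatim from the log, and only method and path are used:)

# Tally the requests in the log excerpt above.
from collections import Counter

REQUESTS = [
    "GET /1.5/144105202/info/collections",
    "GET /1.5/144105202/info/configuration",
    "GET /1.5/144105202/storage/meta/global",
    "GET /1.5/144105202/storage/meta/global",
    "GET /1.5/144105202/storage/crypto/keys",
    "GET /1.5/144105202/info/collections",
    "GET /1.5/144105202/info/configuration",
    "GET /1.5/144105202/storage/clients?full=1&limit=1000",
    "POST /1.5/144105202/storage/clients?batch=true&commit=true",
    "POST /1.5/144105202/storage/tabs?batch=true&commit=true",
    "GET /1.5/144105202/info/collections",
    "GET /1.5/144105202/info/collection_counts",
    "GET /1.5/144105202/storage/bookmarks?full=1&limit=1000",
    "GET /1.5/144105202/storage/clients?full=1&limit=1000",
    "GET /1.5/144105202/storage/forms?full=1&limit=1000",
    "GET /1.5/144105202/storage/crypto?full=1&limit=1000",
    "GET /1.5/144105202/storage/addons?full=1&limit=1000",
    "GET /1.5/144105202/storage/meta?full=1&limit=1000",
    "GET /1.5/144105202/storage/tabs?full=1&limit=1000",
    "GET /1.5/144105202/storage/prefs?full=1&limit=1000",
    "GET /1.5/144105202/storage/history?full=1&sort=index&limit=1000",
]

methods = Counter(r.split()[0] for r in REQUESTS)
full_reads = sum("full=1" in r for r in REQUESTS)
print(methods)     # Counter({'GET': 19, 'POST': 2})
print(full_reads)  # 10 full-collection reads, 9 of them in the 23:12 burst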
By the number of GETs, I suspect you were using the about-sync add-on?

Indeed I am.

Thanks for your help! Closing this bug now.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard