Closed
Bug 627552
Opened 10 years ago
Closed 10 years ago
Python server at stage-auth doesn't actually sync
Categories
(Cloud Services :: Server: Sync, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: tracy, Assigned: petef)
References
Details
(Keywords: regression)
Attachments
(2 files)
Seen while testing beta9 on Mac and Win 7.
1) Created a new account and added a second client via PAKE.
2) Added some history, tabs, bookmarks, a form entry and a password to Client A, then synced and allowed it to complete.
3) Synced Client B.
Tested results: the logs look OK, but the data added on Client A doesn't appear on Client B.
Expected results: data is synced across clients.
Reporter
Comment 1•10 years ago
Reporter
Updated•10 years ago
Assignee: nobody → tarek
Reporter
Comment 2•10 years ago
This is against stage-auth. I'm blocked from further testing 'til this is sorted out.
Comment 3•10 years ago
Relevant line from the logs:
2011-01-20 16:28:04 Service.Main INFO Testing info/collections: {"tabs":1295562039.733859,"clients":null,"crypto":null,"bookmarks":null,"prefs":null,"history":null}
This indicates that the client is not getting any timestamps for the collections stored in the DB. After investigation, it turns out that part of the DB was corrupted. Richard fixed the DB and the timestamps are now back.
One question remains, though: should the client stop with an error if some timestamps are null during sync? Although this error was caused by a corrupted DB, sync could be stopped in that case. Alternatively, the server could return a 503 if tabs has a timestamp but the other collections have none (which is an impossible state, AFAIK).
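A minimal sketch of the check proposed above, assuming the info/collections body is the flat JSON object shown in the log line. The function names are hypothetical and this is not how the Sync client or server actually handles the case; it only illustrates detecting the "some timestamps set, some null" state.

```python
import json


def split_by_timestamp(info_collections_body):
    """Split collection names into (with_timestamp, without_timestamp)."""
    info = json.loads(info_collections_body)
    with_ts = sorted(name for name, ts in info.items() if ts is not None)
    without_ts = sorted(name for name, ts in info.items() if ts is None)
    return with_ts, without_ts


def looks_inconsistent(info_collections_body):
    """True when some collections have a timestamp and others are null."""
    with_ts, without_ts = split_by_timestamp(info_collections_body)
    return bool(with_ts) and bool(without_ts)


# Payload copied from the log line in this comment.
body = ('{"tabs":1295562039.733859,"clients":null,"crypto":null,'
        '"bookmarks":null,"prefs":null,"history":null}')

if looks_inconsistent(body):
    # A client could abort the sync here; a server could answer 503 instead.
    print("null timestamps for:", split_by_timestamp(body)[1])
```

Whether the right reaction is a client-side abort or a server-side 503 is exactly the open question in this comment; the sketch only covers the detection step.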
Comment 4•10 years ago
Tracy: I am going to sync several clients this morning and check that everything works fine now
Comment 5•10 years ago
Tracy: Repaired all InnoDB and MyISAM tables on weave-stage-db01 in all databases. The smoke test should work again. Could you confirm during your next smoke test pass, whenever that is?
Reporter
Comment 6•10 years ago
Sync is working on stage-auth. I'll resolve this bug as fixed. Tarek, can you file a bug on how to handle the timestamp issue?
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 7•10 years ago
A bug was filed already for this: bug 627671
Comment 8•10 years ago
Sorry, the last comment was an unrelated bug. I am reopening the bug because Tracy had the issue again.
Reporter
Comment 10•10 years ago
https://stage-node02.services.mozilla.com/1.0/7ohn6wlh6dz4otolncc4dkykm6zpzlwu is where I am currently seeing this.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 11•10 years ago
Richard: it seems that another DB is corrupted. How could we prevent or detect DB corruption?
Comment 12•10 years ago
Can we detect in the logs whether the mysql servers are getting restarted often ("restarted mysql" in the error log)? This seems to be one possible cause of corrupted DBs. If that is not the case, can we activate the general logger, if it is not already active, and try to find out when the DB gets corrupted, so we can build a test case to reproduce the issue?
Comment 13•10 years ago
The staging database is not being restarted. What you're seeing in the logs is evidence of mysql worker threads crashing. A clearer description of what you're seeing in the error log would be "disconnected from mysql server, reconnecting". The general logger is active, but truncates after 6 hours (there is a bug open to extend this time). Based on recent experience, there are helpful stack traces any time the python code loses its connection to mysql, so we may be able to tell which specific query was affected.
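One way to answer the "how often does this happen" question is to count matching log lines per hour. The sketch below is only an illustration: the search pattern and the sample lines are hypothetical (the phrases quoted in this thread are descriptions, not confirmed literal log text), and it assumes each line starts with a `YYYY-MM-DD HH:MM:SS` timestamp.

```python
import re
from collections import Counter

# Assumed line prefix: "YYYY-MM-DD HH:MM:SS ..."; captures date and hour.
HOUR_PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}):")


def count_events_per_hour(lines, pattern):
    """Count lines matching `pattern`, grouped by their date-and-hour prefix."""
    matcher = re.compile(pattern, re.IGNORECASE)
    counts = Counter()
    for line in lines:
        if matcher.search(line):
            prefix = HOUR_PREFIX.match(line)
            counts[prefix.group(1) if prefix else "unknown"] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "2011-02-23 09:24:38 disconnected from mysql server, reconnecting",
    "2011-02-23 09:40:00 query ok",
    "2011-02-23 10:01:12 Disconnected from MySQL server, reconnecting",
]

for hour, n in sorted(count_events_per_hour(sample, r"disconnected from mysql").items()):
    print(hour, n)
```

A sudden rise in the hourly count would point at the time window to inspect in the general log before it is truncated.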
Comment 14•10 years ago
Stage still seems out of order. I get a 500 here: https://stage-auth.services.mozilla.com/user/1.0/k7ndfwezspuuuiwlvqzyrwuz6yzwwq6i/node/weave
What's the current status of stage? Do we use PHP for reg/sreg or not? Do we have enough nodes provisioned in the available_nodes table?
Reporter
Comment 15•10 years ago
I'm waiting on this to make another complete testing pass against the python server.
Assignee
Comment 16•10 years ago
Found two problems:
* available_nodes was 0; upped to 100 for each active node (and adding monitoring for this)
* gunicorn-syncstorage was not running
I can sign up a new user in stage & sync successfully.
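The monitoring mentioned in the first item could be as simple as flagging nodes whose remaining capacity falls below a floor. This is a sketch under stated assumptions: the thread does not show the schema of available_nodes, so the input here is just a mapping of node name to remaining count, and the node names and threshold are hypothetical.

```python
def nodes_needing_capacity(available_by_node, minimum=1):
    """Return the names of nodes whose available count is below `minimum`."""
    return sorted(
        node for node, available in available_by_node.items()
        if available < minimum
    )


# Hypothetical snapshot of per-node availability.
snapshot = {"node-a": 100, "node-b": 0, "node-c": 7}

exhausted = nodes_needing_capacity(snapshot)            # nothing left at all
running_low = nodes_needing_capacity(snapshot, minimum=10)

if exhausted:
    print("no capacity left on:", ", ".join(exhausted))
if running_low:
    print("below threshold on:", ", ".join(running_low))
```

Alerting on a threshold above zero gives some warning before new signups start failing, rather than only after the count has already reached 0.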
Assignee: tarek → petef
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Comment 17•10 years ago
As a result of this work, is Python reg/sreg up and running in staging, or is it PHP?
Comment 18•10 years ago
We need PHP in stage, so that we have an environment similar to what is going to be launched first. reg/sreg in Python will be introduced in production in a second phase, to minimize the overall risk.
Reporter
Comment 19•10 years ago
Pete, I just set up a new account against http://stage-auth.services.mozilla.com. It still won't sync. Seeing the same in logs as initially reported.
2011-02-23 09:24:38 Net.Resource DEBUG GET fail 500 https://stage-auth.services.mozilla.com/user/1.0/p4stuhhw7gvybs3b37fsr5nh2ck7z5fy/node/weave
2011-02-23 09:24:38 Service.Main DEBUG Exception: Unexpected response code: 500 No traceback available
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 20•10 years ago
What's installed in stage? I am under the impression that nothing has changed yet (see 632816) -- if so, doing QA on it for now is a waste of time.
We need:
- Python sync server
- PHP reg / sreg server
Assignee
Comment 21•10 years ago
(In reply to comment #19)
> It still won't sync. Seeing the same in logs as initially reported.
available_nodes dropped back to 0 again on adm1. I'll re-bump to a couple thousand I guess.
Tarek, I'm working on the other bug and getting php reg/sreg deployed in stage.
Comment 22•10 years ago
(In reply to comment #21)
> (In reply to comment #19)
> > It still won't sync. Seeing the same in logs as initially reported.
>
> available_nodes dropped back to 0 again on adm1. I'll re-bump to a couple
> thousand I guess.
Do the admin scripts run in stage? I am thinking about the one that cleans that table to increment the available nodes with the daily deleted account numbers.
I am saying this because Hudson generates several hundred users per day (and deletes them), so you will hit the problem again.
>
> Tarek, I'm working on the other bug and getting php reg/sreg deployed in stage.
Cool, thanks!
Assignee
Comment 23•10 years ago
(In reply to comment #22)
> Does the admin scripts run in stage ? I am thinking about the one that cleans
> that table to increment the available nodes w/ the daily deleted account
> numbers.
>
> I am saying this because Hudson generates several hundreds users per day (and
> deletes them) so you will hit the problem again.
AFAICT, no. I'll talk to atoll about that today.
php reg/sreg deployed, and tarek ran a functional test through Hudson and everything passed (we had to temporarily disable captcha to get it to pass, which is expected). Also deployed latest syncstorage from the other bug.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED