Closed Bug 823054 Opened 10 years ago Closed 8 years ago

DatabaseError: (1205, 'Lock wait timeout exceeded; try restarting transaction')

Categories

(addons.mozilla.org Graveyard :: Code Quality, defect, P3)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: tarek, Unassigned)

References

Details

(Keywords: perf)

Seen on high load (> 500 RPS) on stage. 


DatabaseError: (1205, 'Lock wait timeout exceeded; try restarting transaction')

Stacktrace (most recent call last):

  File "django/core/handlers/base.py", line 111, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "amo/decorators.py", line 160, in wrapper
    return f(*args, **kw)
  File "amo/decorators.py", line 152, in wrapper
    return f(*args, **kw)
  File "amo/decorators.py", line 33, in wrapper
    return func(request, *args, **kw)
  File "mkt/submit/decorators.py", line 37, in wrapper
    return f(request, *args, **kw)
  File "mkt/submit/decorators.py", line 21, in wrapper
    return f(request, *args, **kw)
  File "django/db/transaction.py", line 209, in inner
    return func(*args, **kwargs)
  File "mkt/submit/views.py", line 92, in manifest
    is_packaged=form.is_packaged())
  File "addons/models.py", line 430, in from_upload
    addon = Addon(**dict((k, v) for k, v in data.items() if k in fields))
  File "addons/models.py", line 323, in __init__
    super(Addon, self).__init__(*args, **kw)
  File "amo/models.py", line 226, in __init__
    super(OnChangeMixin, self).__init__(*args, **kw)
  File "django/db/models/base.py", line 355, in __init__
    setattr(self, field.name, rel_obj)
  File "translations/fields.py", line 125, in __set__
    value = self.translation_from_dict(instance, lang, value)
  File "translations/fields.py", line 175, in translation_from_dict
    trans = self.translation_from_string(instance, locale, string)
  File "translations/fields.py", line 158, in translation_from_string
    translation = self.model.new(string, lang)
  File "translations/models.py", line 84, in new
    SET id=LAST_INSERT_ID(id + @@global.auto_increment_increment)""")
  File "django/db/backends/mysql/base.py", line 114, in execute
    return self.cursor.execute(query, args)
  File "MySQLdb/cursors.py", line 173, in execute
    self.errorhandler(self, exc, value)
  File "MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
What is the value for "innodb_lock_wait_timeout" ?

* it looks like raising it a bit could fix the issue.
* we also need to look at the longest requests on high load to see what request takes so long
Severity: normal → critical
Priority: -- → P1
Related: Bug 756242
Ok it seems that this 'magic' autoinc does not scale well after 500 RPS. 

Sheeri, Kumar, 

Can I get more background on this feature ?
(In reply to Tarek Ziadé (:tarek) from comment #1)
> What is the value for "innodb_lock_wait_timeout" ?
> 
> * it looks like raising it a bit could fix the issue.
> * we also need to look at the longest requests on high load to see what
> request takes so long

Just to clarify this is happening on our stage cluster during load testing.

addons1.stage.db.phx1.mozilla.com

mysql> show variables like '%innodb_lock_wait%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| innodb_lock_wait_timeout | 50    |
+--------------------------+-------+
1 row in set (0.00 sec)

50 seconds seems pretty high already. We can try raising it to see if it helps under high load but I will have defer to @sheeri.
Well, that's what load testing things on stage is for, right?

Let's see if we increase the timeout, if that's the right variable to tweak, and we could also see what's taking so long by looking at the slow query log....grab me on IRC and we can do this.
Looking at the slow query logs, we keep today and yesterday, so starting  121219  4:45:08 here's what we have:
[root@addons2.stage.db.phx1 mysql]# grep Query.time mysql-slow.log.* | cut -f2 -d: | cut -f1 -d\. | sort -n | tail
 7
 9
 9
 10
 11
 12
 34
 35
 61
 62

So there are some 60 second queries...this is on addons2.stage, one of the slaves...Note that the thing they have most in common is they are examining LOTS of rows. (see Rows_examined) As opposed to, say, Lock_time.....the queries themselves are taking a while.

here's the 4 queries that are over 20 seconds:
# User@Host: addons_dev[addons_dev] @  [10.8.70.202]
# Query_time: 35.398121  Lock_time: 0.000036 Rows_sent: 1  Rows_examined: 13729345
SET timestamp=1355985339;
SELECT SUM(`download_counts`.`count`) AS `sum` FROM `download_counts` WHERE `download_counts`.`date` < '2012-12-20';

# Time: 121219 21:35:39
# User@Host: addons_dev[addons_dev] @  [10.8.70.202]
# Query_time: 34.801077  Lock_time: 0.000054 Rows_sent: 24338  Rows_examined: 13738531
use marketplace_altdev_allizom_org;
SET timestamp=1355981739;
SELECT
               addon_id, AVG(count), SUM(count)
           FROM download_counts
           USE KEY (`addon_and_count`)
           GROUP BY addon_id
           ORDER BY addon_id;
# Time: 121219 23:46:06
# User@Host: addons_stage[addons_stage] @  [10.8.70.202]
# Query_time: 61.873735  Lock_time: 0.000048 Rows_sent: 921027  Rows_examined: 7666361
use addons_allizom_org;
SET timestamp=1355989566;
SELECT `stats_addons_collections_counts`.`addon_id`, `stats_addons_collections_counts`.`collection_id`, SUM(`stats_addons_collections_counts`.`count`) AS `sum` FROM `stats_addons_collections_counts` GROUP BY `stats_addons_collections_counts`.`addon_id`, `stats_addons_collections_counts`.`collection_id` ORDER BY NULL;



# User@Host: addons_dev[addons_dev] @  [10.8.70.202]
# Query_time: 62.015101  Lock_time: 0.000047 Rows_sent: 919481  Rows_examined: 7644040
use addons_dev_allizom_org_new;
SET timestamp=1355989566;
SELECT `stats_addons_collections_counts`.`addon_id`, `stats_addons_collections_counts`.`collection_id`, SUM(`stats_addons_collections_counts`.`count`) AS `sum` FROM `stats_addons_collections_counts` GROUP BY `stats_addons_collections_counts`.`addon_id`, `stats_addons_collections_counts`.`collection_id` ORDER BY NULL;
All stats queries ugh. I'm trying to kill these at least by moving stats off of zamboni and into another project.
Severity: critical → normal
Priority: P1 → --
@wil: all our load test work is blocked by this because we can't do even small loads without locking mysql and putting the app in a broken state - how can we express it in term of priority and severity ?
any time we see USE KEY in a query we might as well kill that query cold. There is no way mysql can optimize for that. Chances are the key is wrong.
kumar++ for comment 9. I agree.
Blocks: 825280
bug 825280 is for fixing slow queries which sounds like a great idea
Whiteboard: [waiting on monolith]
Monolith is alive, we can start thinking about this
Keywords: perf
Priority: -- → P3
Whiteboard: [waiting on monolith]
Meaning, thinking about going through the code and rework the queries that use "USE KEY" or any other index hints? Or something else?
(In reply to Sheeri Cabral [:sheeri] from comment #13)
> Meaning, thinking about going through the code and rework the queries that
> use "USE KEY" or any other index hints? Or something else?

Monolith replaces all of those.
It sounds like this is a Monolith thing now, and may be fixed. Is this good to close?
Most of those look like they are AMO and we haven't done any work getting those stats out of AMO and into Monolith. It's in the plans for the future but the priority was to get Marketplace on Monolith first.

I don't think we should close this until AMO uses Monolith.
No longer blocks: tarako-marketplace
Blocks: 992365
No longer blocks: 992365
Component: General → Code Quality
Product: Marketplace → addons.mozilla.org
Version: 1.0 → unspecified
Thanks for filing this.  Due to resource constraints we are closing bugs which we won't realistically be able to fix.  If you have a patch that applies to this bug please reopen.

For more info see http://micropipes.com/blog/2014/09/24/the-great-add-on-bug-triage/
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Product: addons.mozilla.org → addons.mozilla.org Graveyard
You need to log in before you can comment on or make changes to this bug.