Sentry: ConcurrentModificationException
Categories
(Data Platform and Tools :: Glean: SDK, defect, P1)
Tracking
(Not tracked)
People
(Reporter: mdroettboom, Assigned: mdroettboom)
Details
Attachments
(3 files)
java.util.ConcurrentModificationException: null
at java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:757)
at java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:780)
at mozilla.telemetry.glean.GleanInternalAPI$initialize$1.invokeSuspend(Glean.kt:16)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:2)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:19)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:457)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:301)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)
at java.lang.Thread.run(Thread.java:764)
https://sentry.prod.mozaws.net/operations/fenix/issues/7519801/
Assignee | ||
Comment 1•4 years ago
|
||
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Comment 2•4 years ago
|
||
Assignee | ||
Updated•4 years ago
|
Comment 3•4 years ago
|
||
Hi Michael, I was going through the crashes on Nightly, and I'm seeing this signature continue to show up, even in buildids from 5/13.
https://crash-stats.mozilla.org/report/index/1aae6d08-110e-4ad1-b0e4-2c86c0200515
(See "More Reports" and sort by build id for other reports.)
For our 5.0 release, this was a top 3 crasher from Google Play Store (although that was before your fix landed). I don't have as good visibility into Nightly, so I just wanted to raise it in case those reports are helpful.
Feel free to close again if there's other reasons this could be showing up! We'll also know more when we release 5.1 next week.
Assignee | ||
Comment 4•4 years ago
|
||
Thanks @liuche. This has been tricky, because I still haven't been able to reproduce locally -- there's some combination of things folks are hitting the wild here. I will have a renewed look -- I agree that this is high priority.
Assignee | ||
Updated•4 years ago
|
Updated•4 years ago
|
Assignee | ||
Comment 5•4 years ago
|
||
This Sentry link for Nightly shows the problem persisting with Glean 29.1.0 (which contains the first "fix"): https://sentry.prod.mozaws.net/operations/fenix-nightly/issues/7960114
Assignee | ||
Comment 6•4 years ago
|
||
Update: I was able to write a unit test that registers pings while Glean is being initialized, but this still doesn't lead to a ConcurrentModificationException
:
@Test
fun `Initializing while registering pings isn't a race condition`() {
Glean.testDestroyGleanHandle()
Dispatchers.API.setTaskQueueing(true)
Dispatchers.API.setTestingMode(false)
val stringMetric = StringMetricType(
disabled = false,
category = "telemetry",
lifetime = Lifetime.Application,
name = "string_metric",
sendInPings = listOf("store1")
)
stringMetric.set("foo")
for (i in 1..1000) {
val ping = PingType<NoReasonCodes>(
name = "race-condition-ping${i}",
includeClientId = true,
sendIfEmpty = false,
reasonCodes = listOf()
)
Glean.registerPingType(ping)
}
val config = Configuration()
Glean.initialize(context, true, config)
val ping = PingType<NoReasonCodes>(
name = "race-condition-ping",
includeClientId = true,
sendIfEmpty = false,
reasonCodes = listOf()
)
while (Dispatchers.API.taskQueue.size > 0) {
Glean.registerPingType(ping)
}
}
Assignee | ||
Comment 7•4 years ago
|
||
This unit test (very similar to the one above, but registering pings from another coroutine thread rather than the main thread) reproduces the bug more than 50% of the time. Woohoo. Progress... Now to experiment with various mitigations.
@Test
fun `Initializing while registering pings isn't a race condition`() {
Glean.testDestroyGleanHandle()
Dispatchers.API.setTaskQueueing(true)
Dispatchers.API.setTestingMode(false)
val stringMetric = StringMetricType(
disabled = false,
category = "telemetry",
lifetime = Lifetime.Application,
name = "string_metric",
sendInPings = listOf("store1")
)
stringMetric.set("foo")
for (i in 1..1000) {
val ping = PingType<NoReasonCodes>(
name = "race-condition-ping${i}",
includeClientId = true,
sendIfEmpty = false,
reasonCodes = listOf()
)
Glean.registerPingType(ping)
}
val config = Configuration()
Glean.initialize(context, true, config)
GlobalScope.launch {
val ping = PingType<NoReasonCodes>(
name = "race-condition-ping",
includeClientId = true,
sendIfEmpty = false,
reasonCodes = listOf()
)
while (Dispatchers.API.taskQueue.size > 0) {
Glean.registerPingType(ping)
}
}
}
}
Comment 8•4 years ago
|
||
Comment 9•4 years ago
|
||
Assignee | ||
Comment 10•4 years ago
|
||
Not seeing any reports from Fenix Nightly with Glean 29.1.1 (with the second fix), so closing this. Feel free to reopen if we see them again.
Assignee | ||
Updated•4 years ago
|
Comment hidden (collapsed) |
Description
•