4.86 - 9.53% Heap Unclassified (linux64, osx-10-10, windows10-64, windows10-64-pgo-qr) regression on push 5514aae0e34e81b39a88447094a34e13c0d74aac
Categories
(Core :: Security: PSM, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox66 | --- | unaffected |
firefox67 | --- | unaffected |
firefox68 | --- | fixed |
People
(Reporter: igoldan, Assigned: myk)
References
(Regression)
Details
(Keywords: perf, regression)
Attachments
(1 file)
We have detected an awsy regression from push:
As author of one of the patches included in that push, we need your help to address this regression.
Regressions:
10% Heap Unclassified windows10-64 pgo stylo 44,913,154.82 -> 49,192,859.81
7% Heap Unclassified osx-10-10 opt stylo 74,598,810.00 -> 79,551,722.98
6% Heap Unclassified windows10-64-pgo-qr opt stylo 54,957,674.79 -> 58,183,361.15
5% Heap Unclassified linux64 pgo stylo 79,531,437.71 -> 83,393,662.70
You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=20020
On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the jobs in a pushlog format.
To learn more about the regressing test(s), please see: https://wiki.mozilla.org/AWSY/Tests
Reporter | ||
Updated•6 years ago
|
Reporter | ||
Updated•6 years ago
|
Comment 1•6 years ago
|
||
It seems to me that almost all of the heap usage here will be in rkv. Myk, does anything jump out as being an obvious cause of this?
Comment 2•6 years ago
|
||
massif is telling me that merely instantiating an Rkv allocates about 3MB, which appears to account for the majority of these increases:
99.99% (3,151,258B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->66.54% (2,097,152B) 0x133E75: mdb_env_open (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->66.54% (2,097,152B) 0x10E8F6: lmdb::environment::EnvironmentBuilder::open_with_permissions (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->66.54% (2,097,152B) 0x10E6EE: lmdb::environment::EnvironmentBuilder::open (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->66.54% (2,097,152B) 0x10DC4D: rkv::env::Rkv::from_env (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->66.54% (2,097,152B) 0x10DB5D: rkv::env::Rkv::new (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->66.54% (2,097,152B) 0x10D809: scratch::main (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->66.54% (2,097,152B) 0x10D571: std::rt::lang_start::{{closure}} (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->66.54% (2,097,152B) 0x11E981: std::panicking::try::do_call (rt.rs:59)
| ->66.54% (2,097,152B) 0x12C158: __rust_maybe_catch_panic (lib.rs:102)
| ->66.54% (2,097,152B) 0x11F302: std::rt::lang_start_internal (panicking.rs:289)
| ->66.54% (2,097,152B) 0x10DA10: main (in /home/keeler/src/rust-scratch/target/release/scratch)
|
->33.27% (1,048,584B) 0x112731: mdb_midl_alloc (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->33.27% (1,048,584B) 0x133E53: mdb_env_open (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->33.27% (1,048,584B) 0x10E8F6: lmdb::environment::EnvironmentBuilder::open_with_permissions (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->33.27% (1,048,584B) 0x10E6EE: lmdb::environment::EnvironmentBuilder::open (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->33.27% (1,048,584B) 0x10DC4D: rkv::env::Rkv::from_env (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->33.27% (1,048,584B) 0x10DB5D: rkv::env::Rkv::new (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->33.27% (1,048,584B) 0x10D809: scratch::main (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->33.27% (1,048,584B) 0x10D571: std::rt::lang_start::{{closure}} (in /home/keeler/src/rust-scratch/target/release/scratch)
| ->33.27% (1,048,584B) 0x11E981: std::panicking::try::do_call (rt.rs:59)
| ->33.27% (1,048,584B) 0x12C158: __rust_maybe_catch_panic (lib.rs:102)
| ->33.27% (1,048,584B) 0x11F302: std::rt::lang_start_internal (panicking.rs:289)
| ->33.27% (1,048,584B) 0x10DA10: main (in /home/keeler/src/rust-scratch/target/release/scratch)
|
->00.18% (5,522B) in 1+ places, all below ms_print's threshold (01.00%)
(this is from a little scratch program that basically just calls Rkv::new(Path::new("/tmp/"))
)
Assignee | ||
Comment 3•6 years ago
•
|
||
DMD shows several allocations on a local macOS build:
Unreported {
1 block in heap block record 2 of 6,020
2,097,152 bytes (2,097,152 requested / 0 slop)
1.81% of the heap (7.33% cumulative)
5.51% of unreported (22.27% cumulative)
Allocated at {
#01: replace_calloc(unsigned long, unsigned long) (DMD.cpp:1144, in libmozglue.dylib)
#02: mdb_env_open (in XUL) + 228
#03: lmdb::environment::EnvironmentBuilder::open_with_permissions::h6871600ac7cbdcc3 (environment.rs:277, in XUL)
#04: lmdb::environment::EnvironmentBuilder::open::h281e464919916b16 (environment.rs:249, in XUL)
#05: rkv::env::Rkv::from_env::h0ac73aabfded456d (env.rs:72, in XUL)
#06: rkv::env::Rkv::with_capacity::h319e356b0d46680f (env.rs:89, in XUL)
#07: rkv::env::Rkv::new::h2ea2cc226a47ecc7 (env.rs:62, in XUL)
#08: cert_storage::SecurityState::new::h4c70ddb503b0fed0 (lib.rs:78, in XUL)
}
}
Unreported {
1 block in heap block record 5 of 6,020
1,052,672 bytes (1,048,584 requested / 4,088 slop)
0.91% of the heap (11.48% cumulative)
2.77% of unreported (34.88% cumulative)
Allocated at {
#01: replace_malloc(unsigned long) (DMD.cpp:1128, in libmozglue.dylib)
#02: mdb_midl_alloc (midl.c:107, in XUL)
#03: mdb_env_open (in XUL) + 190
#04: lmdb::environment::EnvironmentBuilder::open_with_permissions::h6871600ac7cbdcc3 (environment.rs:277, in XUL)
#05: lmdb::environment::EnvironmentBuilder::open::h281e464919916b16 (environment.rs:249, in XUL)
#06: rkv::env::Rkv::from_env::h0ac73aabfded456d (env.rs:72, in XUL)
#07: rkv::env::Rkv::with_capacity::h319e356b0d46680f (env.rs:89, in XUL)
#08: rkv::env::Rkv::new::h2ea2cc226a47ecc7 (env.rs:62, in XUL)
}
}
Unreported {
1 block in heap block record 633 of 6,020
4,096 bytes (4,096 requested / 0 slop)
0.00% of the heap (30.44% cumulative)
0.01% of unreported (92.49% cumulative)
Allocated at {
#01: replace_calloc(unsigned long, unsigned long) (DMD.cpp:1144, in libmozglue.dylib)
#02: mdb_env_open (in XUL) + 2165
#03: lmdb::environment::EnvironmentBuilder::open_with_permissions::h6871600ac7cbdcc3 (environment.rs:277, in XUL)
#04: lmdb::environment::EnvironmentBuilder::open::h281e464919916b16 (environment.rs:249, in XUL)
#05: rkv::env::Rkv::from_env::h0ac73aabfded456d (env.rs:72, in XUL)
#06: rkv::env::Rkv::with_capacity::h319e356b0d46680f (env.rs:89, in XUL)
#07: rkv::env::Rkv::new::h2ea2cc226a47ecc7 (env.rs:62, in XUL)
#08: cert_storage::SecurityState::new::h4c70ddb503b0fed0 (lib.rs:78, in XUL)
}
}
These don't account for all of the regression seen in CI, but then I didn't run the tp5 test locally, I only started up the browser; so there may be more to uncover.
I don't know why the mdb_env_open
stack frames don't have line numbers, but I think the first two (i.e. the significant ones) are https://searchfox.org/mozilla-central/rev/4eaf7c637a303e0f7adcf1ae45064b2d4bef9eb0/third_party/rust/lmdb-rkv-sys/lmdb/libraries/liblmdb/mdb.c#4972 and https://searchfox.org/mozilla-central/rev/4eaf7c637a303e0f7adcf1ae45064b2d4bef9eb0/third_party/rust/lmdb-rkv-sys/lmdb/libraries/liblmdb/mdb.c#4971.
Those are intentional, and not leaks, but significant nevertheless. At the very least, they suggest that we should aggressively group stores into a small number of environments as we add rkv consumers (to avoid paying this cost for each new consumer). And perhaps there's more we can do (close the environment when we aren't using it, configure it to be read-only when we only expect to read from it, configure LMDB to reduce these allocations, etc.).
I also don't know how we would go about reporting on heap allocations in Rust crates (or rather, in this case, in C libraries within Rust crates) to move these allocations from unclassified to explicit.
Eric, do you have a suggestion for that?
Assignee | ||
Comment 4•6 years ago
•
|
||
I pushed a prospective LMDB configuration change to tryserver to see how it would impact heap allocations:
Assignee | ||
Comment 5•6 years ago
|
||
Here's a comparison (builds still in progress) for another potential strategy for reducing these allocations:
Assignee | ||
Comment 6•6 years ago
|
||
The new rkv-based cert_storage database caused a Heap Unclassified regression because of memory that LMDB reserves when opening a database in read-write mode. Since cert_storage usage is read-heavy, this change claws back that regression by opening it in read-only mode except when changes are being made.
Assignee | ||
Updated•6 years ago
|
Pushed by myk@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3561cdc13806 reopen security_state env as read-only when not writing r=keeler
Comment 8•6 years ago
|
||
bugherder |
Reporter | ||
Updated•6 years ago
|
Assignee | ||
Comment 9•6 years ago
|
||
(In reply to Myk Melez [:myk] [@mykmelez] from comment #3)
I also don't know how we would go about reporting on heap allocations in Rust crates (or rather, in this case, in C libraries within Rust crates) to move these allocations from unclassified to explicit.
Eric, do you have a suggestion for that?
I chatted with Eric, who gave me some suggestions, which I've noted in bug 1542928.
Updated•6 years ago
|
Updated•3 years ago
|
Description
•