Bug 1538540 Comment 8 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

I haven't found any smoking gun yet. However, here is a summary of my spelunking so far:

- Something is missing from the TextureCache's `texture_cache_map`
- This means one of two possibilities:
  - The item has not been added yet, and yet we render with a frame that holds the key. This should not be possible, as in order to be added to the frame the item must have been queued to be added to `texture_cache_map` via `TextureCacheAllocationKind::Alloc`
  - The item has been removed, and we render with a frame that has not been updated. This is likely what's going on, and it's what was going on in bug 1538710. However, that was specific to multiple documents, where one document's frame build would trigger a clearing of cached resources, and the other document's frame would not be built, leading to a stale frame when it came time to render
- Progressing from the assumption that we are removing the relevant `texture_cache_map` item, triggering a render, and yet not building a frame, we need to examine how we remove items from `texture_cache_map`.

- The item is actually removed [here](https://searchfox.org/mozilla-central/rev/d33d470140ce3f9426af523eaa8ecfa83476c806/gfx/wr/webrender/src/renderer.rs#3315)

- To hit this, we must go through [`TextureUpdateList::push_free`](https://searchfox.org/mozilla-central/rev/d33d470140ce3f9426af523eaa8ecfa83476c806/gfx/wr/webrender/src/internal_types.rs#249)
  - This is called from [`TextureCache::free` for a standalone entry](ee3905439acbf81e9c829ece0b46d09d2fa26c5c), and from [`TextureCache::clear_shared`](https://searchfox.org/mozilla-central/rev/d33d470140ce3f9426af523eaa8ecfa83476c806/gfx/wr/webrender/src/internal_types.rs#249)

    - `TextureCache::free` is called from three places:
      - `TextureCache::expire_old_entries` - this is ruled out as it is only called *during* a frame build
      - `TextureCache::upsert_entry` - this is ruled out for the same reason
      - **`TextureCache::clear_all`** - this can be called independently outside of a frame build

    - `TextureCache::clear_shared` is called from two places:
      - `TextureCache::maybe_reclaim_shared_memory` - this was the cause of bug 1538710, in which it was moved to `TextureCache::before_frames`, where it forces us to build a frame by setting `require_frame_build` to `true`
      - **`TextureCache::clear_all`** - (as stated above) this can be called independently outside of a frame build

- This leaves us with the conclusion that either we've missed something, or we have to call `TextureCache::clear_all` *without* doing a frame build, but *still* trigger a render.

- So, going down that rabbit hole, there's two places where we can trigger `TextureCache::clear_all`
  - Through [`NotifyWebRenderContextPurge`](https://searchfox.org/mozilla-central/rev/ee3905439acbf81e9c829ece0b46d09d2fa26c5c/gfx/webrender_bindings/RendererOGL.cpp#169), which occurs immediately after a render call, meaning we would have to render _again_ in order to hit it.
  - Through [`NotifyMemoryPressure`](https://searchfox.org/mozilla-central/rev/ee3905439acbf81e9c829ece0b46d09d2fa26c5c/gfx/webrender_bindings/WebRenderAPI.cpp#522), which seems to be triggerable any time(?)

- So, assuming `NotifyMemoryPressure` is the culprit, because it's a possible way of clearing items from `texture_cache_map` which isn't directly tied with a frame build, the question becomes whether we can then trigger a render while continuing to avoid the frame build.

- Before the fixes for bug 1538710, this was relatively easy - if we called `TextureCache::clear_all` from a memory pressure event, we could have the main document with `frame_is_valid` equal to `true`, causing `render_frame` to be true and `build_frame` to be false inside `RenderBackend::update_document`, which would lead us to call `wr_notifier_new_frame_ready`, without having built an actual frame (?). After bug 1538710, it seems trickier, though there is this strange comment referencing what seems to be a similar problem in the handling of `ResultMsg::UpdateResources`, though that code seems to be dead. I think my next step is to do some archaeology on that code and see if anything pops out. However, I still don't have any hunches as to how the document splitting patches would have affected any of this.
I haven't found any smoking gun yet. However, here is a summary of my spelunking so far:

- Something is missing from the TextureCache's `texture_cache_map`
- This means one of two possibilities:
  - The item has not been added yet, and yet we render with a frame that holds the key. This should not be possible, as in order to be added to the frame the item must have been queued to be added to `texture_cache_map` via `TextureCacheAllocationKind::Alloc`
  - The item has been removed, and we render with a frame that has not been updated. This is likely what's going on, and it's what was going on in bug 1538710. However, that was specific to multiple documents, where one document's frame build would trigger a clearing of cached resources, and the other document's frame would not be built, leading to a stale frame when it came time to render
- Progressing from the assumption that we are removing the relevant `texture_cache_map` item, triggering a render, and yet not building a frame, we need to examine how we remove items from `texture_cache_map`.

- The item is actually removed [here](https://searchfox.org/mozilla-central/rev/d33d470140ce3f9426af523eaa8ecfa83476c806/gfx/wr/webrender/src/renderer.rs#3315)

- To hit this, we must go through [`TextureUpdateList::push_free`](https://searchfox.org/mozilla-central/rev/d33d470140ce3f9426af523eaa8ecfa83476c806/gfx/wr/webrender/src/internal_types.rs#249)
  - This is called from [`TextureCache::free` for a standalone entry](ee3905439acbf81e9c829ece0b46d09d2fa26c5c), and from [`TextureCache::clear_shared`](https://searchfox.org/mozilla-central/rev/d33d470140ce3f9426af523eaa8ecfa83476c806/gfx/wr/webrender/src/internal_types.rs#249)

    - `TextureCache::free` is called from three places:
      - `TextureCache::expire_old_entries` - this is ruled out as it is only called *during* a frame build
      - `TextureCache::upsert_entry` - this is ruled out for the same reason
      - **`TextureCache::clear_all`** - this can be called independently outside of a frame build

    - `TextureCache::clear_shared` is called from two places:
      - `TextureCache::maybe_reclaim_shared_memory` - this was the cause of bug 1538710, in which it was moved to `TextureCache::before_frames`, where it forces us to build a frame by setting `require_frame_build` to `true`
      - **`TextureCache::clear_all`** - (as stated above) this can be called independently outside of a frame build

- This leaves us with the conclusion that either we've missed something, or we have to call `TextureCache::clear_all` *without* doing a frame build, but *still* trigger a render.

- So, going down that rabbit hole, there's two places where we can trigger `TextureCache::clear_all`
  - Through [`NotifyWebRenderContextPurge`](https://searchfox.org/mozilla-central/rev/ee3905439acbf81e9c829ece0b46d09d2fa26c5c/gfx/webrender_bindings/RendererOGL.cpp#169), which occurs immediately after a render call, meaning we would have to render _again_ in order to hit it.
  - Through [`NotifyMemoryPressure`](https://searchfox.org/mozilla-central/rev/ee3905439acbf81e9c829ece0b46d09d2fa26c5c/gfx/webrender_bindings/WebRenderAPI.cpp#522), which seems to be triggerable any time(?)

- So, assuming `NotifyMemoryPressure` is the culprit, because it's a possible way of clearing items from `texture_cache_map` which isn't directly tied with a frame build, the question becomes whether we can then trigger a render while continuing to avoid the frame build.

- Before the fixes for bug 1538710, this was relatively easy - if we called `TextureCache::clear_all` from a memory pressure event, we could have the main document with `frame_is_valid` equal to `true`, causing `render_frame` to be true and `build_frame` to be false inside `RenderBackend::update_document`, which would lead us to call `wr_notifier_new_frame_ready`, without having built an actual frame (?). After bug 1538710, it seems trickier, though there is this strange comment referencing what seems to be a similar problem in the handling of `ResultMsg::UpdateResources`, though that code seems to be dead. I think my next step is to do some archaeology on that code and see if anything pops out. _EDIT: ResultMsg::UpdateResources is indeed used, and seems to be used correctly. I just missed it because it only shows up in searchfox for text searches_ However, I still don't have any hunches as to how the document splitting patches would have affected any of this.
I haven't found any smoking gun yet. However, here is a summary of my spelunking so far:

- Something is missing from the TextureCache's `texture_cache_map`
- This means one of two possibilities:
  - The item has not been added yet, and yet we render with a frame that holds the key. This should not be possible, as in order to be added to the frame the item must have been queued to be added to `texture_cache_map` via `TextureCacheAllocationKind::Alloc`
  - The item has been removed, and we render with a frame that has not been updated. This is likely what's going on, and it's what was going on in bug 1538710. However, that was specific to multiple documents, where one document's frame build would trigger a clearing of cached resources, and the other document's frame would not be built, leading to a stale frame when it came time to render
- Progressing from the assumption that we are removing the relevant `texture_cache_map` item, triggering a render, and yet not building a frame, we need to examine how we remove items from `texture_cache_map`.

- The item is actually removed [here](https://searchfox.org/mozilla-central/rev/d33d470140ce3f9426af523eaa8ecfa83476c806/gfx/wr/webrender/src/renderer.rs#3315)

- To hit this, we must go through [`TextureUpdateList::push_free`](https://searchfox.org/mozilla-central/rev/d33d470140ce3f9426af523eaa8ecfa83476c806/gfx/wr/webrender/src/internal_types.rs#249)
  - This is called from [`TextureCache::free` for a standalone entry](ee3905439acbf81e9c829ece0b46d09d2fa26c5c), and from [`TextureCache::clear_shared`](https://searchfox.org/mozilla-central/rev/d33d470140ce3f9426af523eaa8ecfa83476c806/gfx/wr/webrender/src/internal_types.rs#249)

    - `TextureCache::free` is called from three places:
      - `TextureCache::expire_old_entries` - this is ruled out as it is only called *during* a frame build
      - `TextureCache::upsert_entry` - this is ruled out for the same reason
      - **`TextureCache::clear_all`** - this can be called independently outside of a frame build

    - `TextureCache::clear_shared` is called from two places:
      - `TextureCache::maybe_reclaim_shared_memory` - this was the cause of bug 1538710, in which it was moved to `TextureCache::before_frames`, where it forces us to build a frame by setting `require_frame_build` to `true`
      - **`TextureCache::clear_all`** - (as stated above) this can be called independently outside of a frame build

- This leaves us with the conclusion that either we've missed something, or we have to call `TextureCache::clear_all` *without* doing a frame build, but *still* trigger a render.

- So, going down that rabbit hole, there's two places where we can trigger `TextureCache::clear_all`
  - Through [`NotifyWebRenderContextPurge`](https://searchfox.org/mozilla-central/rev/ee3905439acbf81e9c829ece0b46d09d2fa26c5c/gfx/webrender_bindings/RendererOGL.cpp#169), which occurs immediately after a render call, meaning we would have to render _again_ in order to hit it.
  - Through [`NotifyMemoryPressure`](https://searchfox.org/mozilla-central/rev/ee3905439acbf81e9c829ece0b46d09d2fa26c5c/gfx/webrender_bindings/WebRenderAPI.cpp#522), which seems to be triggerable any time(?)

- So, assuming `NotifyMemoryPressure` is the culprit, because it's a possible way of clearing items from `texture_cache_map` which isn't directly tied with a frame build, the question becomes whether we can then trigger a render while continuing to avoid the frame build.

- Before the fixes for bug 1538710, this was relatively easy - if we called `TextureCache::clear_all` from a memory pressure event, we could have the main document with `frame_is_valid` equal to `true`, causing `render_frame` to be true and `build_frame` to be false inside `RenderBackend::update_document`, which would lead us to call `wr_notifier_new_frame_ready`, without having built an actual frame (?). After bug 1538710, it seems trickier, though there is this strange comment referencing what seems to be a similar problem in the handling of `ResultMsg::UpdateResources`, though that code seems to be dead. _EDIT: ResultMsg::UpdateResources is indeed used, and seems to be used correctly. I just missed it because it only shows up in searchfox for text searches_. I think my next step is to do some archaeology on that code and see if anything pops out. However, I still don't have any hunches as to how the document splitting patches would have affected any of this.

Back to Bug 1538540 Comment 8