Closed Bug 1072501 Opened 10 years ago Closed 9 years ago

We probably don't use DIBTextureClient with e10s

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla46

Tracking Flags:

Tracking

Status

e10s

m8+

---

firefox45

---

wontfix

firefox46

---

fixed

People

(Reporter: jrmuizel, Assigned: bas.schouten)

References

Details

Attachments

(3 files, 1 obsolete file)

Crash stack for the "click on a crash in about:crashes" workflow 9 years ago Milan Sreckovic [:milan] (needinfo for best results) 10.97 KB, text/plain		Details
Manage the mapped data 9 years ago Milan Sreckovic [:milan] (needinfo for best results) 6.76 KB, patch	bas.schouten : feedback-	Details \| Diff \| Splinter Review
Unmap file mapping on source surface destruction. 9 years ago Bas Schouten (:bas.schouten) 3.24 KB, patch	milan : review+	Details \| Diff \| Splinter Review
Unmap file mapping on source surface destruction and make UpdateFromSurface work for ShmemDIBTextureData. 9 years ago Bas Schouten (:bas.schouten) 9.27 KB, patch	jrmuizel : review+	Details \| Diff \| Splinter Review

Jeff Muizelaar [:jrmuizel]

Reporter

Description

•

10 years ago

I expect this will break subpixel AA

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 1

•

10 years ago

subpixel when not using direct2d

tracking-e10s: --- → ?

Jeff Muizelaar [:jrmuizel]

Reporter

Updated

•

10 years ago

OS: Mac OS X → Windows XP

Brad Lassey [:blassey] (use needinfo?)

Updated

•

10 years ago

tracking-e10s: ? → +

Jim Mathies [:jimm]

Comment 2

•

10 years ago

Jeff, is this a serious e10s related bug? Nobody is sure what this is about.

Flags: needinfo?(jmuizelaar)

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 3

•

10 years ago

We create a DIBTexture client here only when in the same process: http://dxr.mozilla.org/mozilla-central/source/gfx/layers/client/TextureClient.cpp#296 This is needed for proper subpixel-aa text and maybe performance. This can probably be fixed by creating a DIBSection around shared memory like we do for plugins.

Flags: needinfo?(jmuizelaar)

Milan Sreckovic [:milan] (needinfo for best results)

Comment 4

•

10 years ago

It would help us if E10S team can figure out which of your milestones needs this so that we can schedule it correctly.

Jim Mathies [:jimm]

Comment 5

•

10 years ago

(In reply to Milan Sreckovic [:milan] from comment #4) > It would help us if E10S team can figure out which of your milestones needs > this so that we can schedule it correctly. It's up to the graphics team, can we live without whatever this is? In triage yesterday we just gave it a plus, implying this is considered post-enable on nightly cleanup fodder.

Milan Sreckovic [:milan] (needinfo for best results)

Comment 6

•

10 years ago

Thanks, that helps, wasn't sure what that plus meant.

cmtalbert

Comment 7

•

10 years ago

Hey Jeff, Can you help me understand if I'm reading this bit of code correctly? It looks like we might have E10S issues on Windows XP sytems where we are using either total software rendering with no direct x support at all, and rendering something with BGR color formats, which is the line of code you reference, and that we might have issues with D3D9 on XP on E10S (the previous if statement right above the one you reference). I see other places where aAllocator->IsSameProcess is used in this method, but the usage in the XP ifdef concerns me most because trying to get real-world preliminary testing in that area is going to be tough. Also, what is a real world case where we would use BGR color space? I can't exactly wrap my head around when we tend to encounter surfaces of that type.

Flags: needinfo?(jmuizelaar)

Jeff Muizelaar [:jrmuizel]

Reporter

Comment 8

•

10 years ago

(In reply to Clint Talbert ( :ctalbert ) from comment #7) > Hey Jeff, > > Can you help me understand if I'm reading this bit of code correctly? It > looks like we might have E10S issues on Windows XP sytems where we are using > either total software rendering with no direct x support at all, and > rendering something with BGR color formats, which is the line of code you > reference, and that we might have issues with D3D9 on XP on E10S (the > previous if statement right above the one you reference). I see other > places where aAllocator->IsSameProcess is used in this method, but the usage > in the XP ifdef concerns me most because trying to get real-world > preliminary testing in that area is going to be tough. > > Also, what is a real world case where we would use BGR color space? I can't > exactly wrap my head around when we tend to encounter surfaces of that type. This is unrelated to Windows XP. The problems should show up anytime we're not using Direct2D. I'm not sure the other cases matter.

Flags: needinfo?(jmuizelaar)

Milan Sreckovic [:milan] (needinfo for best results)

Comment 9

•

9 years ago

Has anything changed on this front, as in what we think we need to do and what we have done?

Flags: needinfo?(jmuizelaar)

Flags: needinfo?(bas)

Bas Schouten (:bas.schouten)

Assignee

Comment 10

•

9 years ago

(In reply to Milan Sreckovic [:milan] from comment #9) > Has anything changed on this front, as in what we think we need to do and > what we have done? No, the basic fix for this is 'in the tree', but when we enable it we see mysterious crashes occur in the wild inside cairo. I have not heard of any way to reproduce them, we haven't attempted to re-enable it and figure out more data from crash reports - we could.

Flags: needinfo?(bas)

Milan Sreckovic [:milan] (needinfo for best results)

Comment 11

•

9 years ago

Is it expected that end up in the "UNSUPPORTED" section of cairo_win32_surface_composite?

Milan Sreckovic [:milan] (needinfo for best results)

Comment 12

•

9 years ago

(In reply to Milan Sreckovic [:milan] from comment #11) > Is it expected that end up in the "UNSUPPORTED" section of > cairo_win32_surface_composite? Hmm, probably is, although I still don't like the "unsupported" part.

Milan Sreckovic [:milan] (needinfo for best results)

Comment 13

•

9 years ago

Read access violation on the surface in the pattern argument to DrawTargetCairo::DrawPattern. Not null, but "bad". Don't know if it's a UAF or some uninitialized value.

Milan Sreckovic [:milan] (needinfo for best results)

Comment 14

•

9 years ago

Attached file Crash stack for the "click on a crash in about:crashes" workflow — Details

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

9 years ago

Flags: needinfo?(jmuizelaar) → needinfo?(bas)

Milan Sreckovic [:milan] (needinfo for best results)

Comment 15

•

9 years ago

Attached patch Manage the mapped data — Details — Splinter Review

(In reply to Milan Sreckovic [:milan] from comment #13) > Read access violation on the surface in the pattern argument to > DrawTargetCairo::DrawPattern. Not null, but "bad". Don't know if it's a > UAF or some uninitialized value. Neither - we use the memory after we've unmapped it; it'd be surprising if that worked :) TextureHostFileMapping::UpdatedInternal unmaps it, after giving it to texture source to hold, and eventually passing it outside through BindTextureSource. I don't know what the design had in mind here - for us to keep the data mapped inside of TextureHostFileMapping or to make a copy in the bind method, or something else? Anyway, something like this works, but I don't know if the data that gets outside of this class lives after the class goes away.

Attachment #8691064 - Flags: feedback?(bas)

Bas Schouten (:bas.schouten)

Assignee

Comment 16

•

9 years ago

Comment on attachment 8691064 [details] [diff] [review] Manage the mapped data Review of attachment 8691064 [details] [diff] [review]: ----------------------------------------------------------------- Hrm, so this is interesting. I wrote this code, the intention was to minimize address space usage by unmapping the file when we no longer need it (i.e. when we're done with the surface), if the texture source apparently can grab a reference internally, I believe we should store userdata on the surface and have that unmap the file when the surface is destroyed. This would maintain the minimizing address space usage in the parent process but at the same time keep the memory around for as long as the DataSourceSurface is. This patch would obviously do the trick, but f- since I'd like to know if that approach gets us the best of both worlds. Nice catch.

Attachment #8691064 - Flags: feedback?(bas) → feedback-

Bas Schouten (:bas.schouten)

Assignee

Updated

•

9 years ago

Flags: needinfo?(bas)

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

9 years ago

Assignee: nobody → bas

Jim Mathies [:jimm]

Updated

•

9 years ago

tracking-e10s: + → m8+

Bas Schouten (:bas.schouten)

Assignee

Comment 18

•

9 years ago

Attached patch Unmap file mapping on source surface destruction. (obsolete) — Details — Splinter Review

Attachment #8691533 - Flags: review?(milan)

Milan Sreckovic [:milan] (needinfo for best results)

Comment 19

•

9 years ago

Comment on attachment 8691533 [details] [diff] [review] Unmap file mapping on source surface destruction. Review of attachment 8691533 [details] [diff] [review]: ----------------------------------------------------------------- This fixes the crash I was seeing (with suitable rebasing to central.) Offline conversation established that we won't have the same map given to multiple surfaces. I am assuming that the TextureHostFileMapping will not outlive the surface that uses the mapped data, including during the shutdown? ::: gfx/layers/TextureDIB.cpp @@ +236,2 @@ > > ::UnmapViewOfFile(fileMapping); I know this isn't part of the change, but shouldn't this be ::UnmapViewOfFile(data) instead?

Attachment #8691533 - Flags: review?(milan) → review+

Bas Schouten (:bas.schouten)

Assignee

Comment 20

•

9 years ago

(In reply to Milan Sreckovic [:milan] from comment #19) > Comment on attachment 8691533 [details] [diff] [review] > Unmap file mapping on source surface destruction. > > Review of attachment 8691533 [details] [diff] [review]: > ----------------------------------------------------------------- > > This fixes the crash I was seeing (with suitable rebasing to central.) > Offline conversation established that we won't have the same map given to > multiple surfaces. > > I am assuming that the TextureHostFileMapping will not outlive the surface > that uses the mapped data, including during the shutdown? If it would we'd see issues in other texture clients I believe. I'm also not sure if you close the handle but not unmap the memory actually goes away but it shouldn't be relevant. > ::: gfx/layers/TextureDIB.cpp > @@ +236,2 @@ > > > > ::UnmapViewOfFile(fileMapping); > > I know this isn't part of the change, but shouldn't this be > ::UnmapViewOfFile(data) instead? Very good catch. Absolutely true. Fixed locally, I will land this patch after Nical lands his queue, I don't want to bitrot him.

Pulsebot

Comment 21

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/b336ed16226d

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 22

•

9 years ago

I had to back this out for various t-e10s failures on Windows: https://hg.mozilla.org/integration/mozilla-inbound/rev/8c813f2aeb7a https://treeherder.mozilla.org/logviewer.html#?job_id=18093143&repo=mozilla-inbound

Flags: needinfo?(bas)

Joel Maher ( :jmaher ) (UTC -8)

Comment 23

•

9 years ago

as a note, this caused a 10% win for tps, the backout saw a regression- looking forward to this landing again!

Bas Schouten (:bas.schouten)

Assignee

Comment 24

•

9 years ago

(In reply to Joel Maher (:jmaher) from comment #23) > as a note, this caused a 10% win for tps, the backout saw a regression- > looking forward to this landing again! I honestly don't see any way whatsoever that this test failure would be related to my patch so it might be a little tricky :).

Flags: needinfo?(bas)

Bas Schouten (:bas.schouten)

Assignee

Comment 25

•

9 years ago

(In reply to Bas Schouten (:bas.schouten) from comment #24) > (In reply to Joel Maher (:jmaher) from comment #23) > > as a note, this caused a 10% win for tps, the backout saw a regression- > > looking forward to this landing again! > > I honestly don't see any way whatsoever that this test failure would be > related to my patch so it might be a little tricky :). I was reading the log poorly, I think I understand the problem better now.

Pulsebot

Comment 26

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/a52fb149b9ee

Phil Ringnalda (:philor)

Comment 27

•

9 years ago

Backed out in https://treeherder.mozilla.org/logviewer.html#?job_id=18156523&repo=mozilla-inbound - it was a little hard to spot since you landed on top of other gfx talos e10s bustage, but https://treeherder.mozilla.org/logviewer.html#?job_id=18160878&repo=mozilla-inbound#L6491 is a pretty good example.

Pulsebot

Comment 28

•

9 years ago

Backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/d49a71d3b20a

Phil Ringnalda (:philor)

Comment 29

•

9 years ago

Yeah, that was what I meant to say.

Bas Schouten (:bas.schouten)

Assignee

Comment 30

•

9 years ago

Attached patch Unmap file mapping on source surface destruction and make UpdateFromSurface work for ShmemDIBTextureData. — Details — Splinter Review

This fixes the issue that was causing the reftest failure.

Attachment #8691533 - Attachment is obsolete: true

Attachment #8694992 - Flags: review?(milan)

Bas Schouten (:bas.schouten)

Assignee

Updated

•

9 years ago

Attachment #8694992 - Flags: review?(milan) → review?(jmuizelaar)

Jeff Muizelaar [:jrmuizel]

Reporter

Updated

•

9 years ago

Attachment #8694992 - Flags: review?(jmuizelaar) → review+

Pulsebot

Comment 31

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/975af6bc574f

Phil Ringnalda (:philor)

Comment 32

•

9 years ago

That's slightly better, in that it fooled me into thinking it was just normal talos-e10s bustage for a little while, but it's still busted. Tree's closed.

Phil Ringnalda (:philor)

Comment 33

•

9 years ago

Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/d11cceb2f732 - Win32-only, t-e10s-only, g1 and s and tp, perhaps https://treeherder.mozilla.org/logviewer.html#?job_id=18418348&repo=mozilla-inbound#L2445 is a clue as to what's failing before the unhelpful "TalosError: timeout".

Phil Ringnalda (:philor)

Comment 34

•

9 years ago

Also busted in Win32 Marionette, with one of two crashes, https://treeherder.mozilla.org/logviewer.html#?job_id=18429747&repo=mozilla-inbound or https://treeherder.mozilla.org/logviewer.html#?job_id=18429748&repo=mozilla-inbound

Milan Sreckovic [:milan] (needinfo for best results)

Comment 35

•

9 years ago

Bas, next time we land this, what do you think about having a preference that lets us "turn it off" - we'd still have it be on by default, but I'd like the ability to turn it off in case we need to investigate possible fallback. (gfx.layers.dib.disable or some such, false by default.)

Bas Schouten (:bas.schouten)

Assignee

Comment 36

•

9 years ago

(In reply to Milan Sreckovic [:milan] from comment #35) > Bas, next time we land this, what do you think about having a preference > that lets us "turn it off" - we'd still have it be on by default, but I'd > like the ability to turn it off in case we need to investigate possible > fallback. (gfx.layers.dib.disable or some such, false by default.) Yeah, once I get it to pass the tests I'll add that.

Pulsebot

Comment 37

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/ccc320a19060

Carsten Book [:Tomcat]

Comment 38

•

9 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/ccc320a19060

Status: NEW → RESOLVED

Closed: 9 years ago

status-firefox46: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla46

Joel Maher ( :jmaher ) (UTC -8)

Updated

•

9 years ago

Depends on: 1234926

Brad Lassey [:blassey] (use needinfo?)

Comment 39

•

9 years ago

Milan, should we uplift this to 45 for our e10s beta experiment?

Flags: needinfo?(milan)

Brad Lassey [:blassey] (use needinfo?)

Updated

•

9 years ago

Whiteboard: [e10s-45-uplift]

Milan Sreckovic [:milan] (needinfo for best results)

Comment 40

•

9 years ago

I would prefer to let it ride the trains; I'm not convinced all that we need is contained in this bug, and this is a minor visual glitch that I can't see really screwing up the e10s beta experiment.

Flags: needinfo?(milan)

Brad Lassey [:blassey] (use needinfo?)

Updated

•

9 years ago

Whiteboard: [e10s-45-uplift]

Ryan VanderMeulen [:RyanVM]

Updated

•

9 years ago

status-firefox45: --- → wontfix

You need to log in before you can comment on or make changes to this bug.