Closed Bug 1123598 Opened 5 years ago Closed 5 years ago
.0][Messages][MMS]The MMS content display obscure about 2 seconds when view a newspaper MMS .
212.40 KB, image/png
2.40 KB, application/x-java-archive
1.56 MB, application/zip
7.64 KB, text/plain
2.28 KB, patch
|Details | Diff | Splinter Review|
Created an attachment (id=1077419) PIC DEFECT DESCRIPTION: ->The MMS content display obscure about 2 seconds when view a newspaper MMS. REPRODUCING PROCEDURES: ->MS received a newspaper MMS. ->Enter"Messages"to view the MMS ->The part of content display obscure about 2 seconds。(ko) note：Fire E1.3 is ok。 EXPECTED BEHAVIOUR: ->Should display clear. ASSOCIATE SPECIFICATION: TEST PLAN REFERENCE: TOOLS AND PLATFORMS USED: USER IMPACT: REPRODUCING RATE:100% For FT PR, Please list reference mobile's behavior:
Maybe caused by APZ. When close APZ, it will not blur when open a very long text mms.
This is caused by the "low precision painting"; basically we use this mode to make it display faster when we have less CPU available. I don't know why it's happening here but it's not unexpected. Hey kats, maybe you can give some more information than I can ?
It should only be happening while scrolling so if it's happening just loading an MMS I would say that it's unexpected. What version of B2G and what device is this happening on? Is it possible to get an MMS database dump that will allow me to reproduce the issue? There's not much to go on in the bug description here.
Note that we use scrollIntoView when we load MMS. Could this trigger this? NI tianm to get a database dump.
Use this data, you can reproduce this bug.
Mozilla Build ID: 20141019000201 Reproduce it on FireE2.0.
Close the "Async Pan/Zoom" and restart the phone, this phenomenon will disappear. The UI will not blur when you open a mms.
Hey Oleg, can you please look whether you reproduce on Flame ? Take care that this is on the v2.0 version.
(In reply to Julien Wajsberg [:julienw] from comment #9) > Hey Oleg, can you please look whether you reproduce on Flame ? > Take care that this is on the v2.0 version. Yes, I see this problem on flame & pvt 2.0 build as well (4 MMS with one image and lots of text), sometimes this artifact doesn't even go away until I do anything that causes repaint. At the first look it seems that it happens when we call scrollIntoView to show the latest message and the same append MMS with parsed image so that scroll automatically tries to go up. And yes, I doesn't see this issue if APZC is disabled. Hey :kats, looks like you can help with this? Thanks!
Flags: needinfo?(azasypkin) → needinfo?(bugmail.mozilla)
Can you tell me how to properly inject the provided database into the messages app? I tried both pushing it to /data/local/storage/persistent/1038+f+app+++sms.gaiamobile.org and overwriting the contents of /data/local/storage/persistent/1010+f+app+++sms.gaiamobile.org (which got created when I started the messages app). I also tried throwing the sqlite file into /data/local/storage/persistent/chrome/idb/. None of these approaches resulted in the message appearing when I started the messages app.
Flags: needinfo?(bugmail.mozilla) → needinfo?(azasypkin)
(In reply to Kartikaya Gupta (email:firstname.lastname@example.org) from comment #11) > Can you tell me how to properly inject the provided database into the > messages app? I tried both pushing it to > /data/local/storage/persistent/1038+f+app+++sms.gaiamobile.org and > overwriting the contents of > /data/local/storage/persistent/1010+f+app+++sms.gaiamobile.org (which got > created when I started the messages app). I also tried throwing the sqlite > file into /data/local/storage/persistent/chrome/idb/. None of these > approaches resulted in the message appearing when I started the messages app. I haven't tried it, I've just created MMS from scratch. But you can use my profile (you'll have to unzip it) and profile restore script from b2g-flash-tool folder, eg. './backup_restore_profile.sh -r -p ./mozilla-profile-test/'. Just backed up and restored on another device - works fine.
Thanks, with that I can reproduce on the latest 2.0 and 2.1 pvtbuilds. Doesn't happen on 2.2 or 3.0, though, so it's likely been fixed already. I'm not sure offhand when this was fixed, so requesting a fix window on 2.2.
[Blocking Requested - why for this release]:
blocking-b2g: --- → 2.1?
(I don't know if we can still have 2.0 blockers so I'm asking 2.1 only)
b2g-inbound fixed window: Last Broken Environmental Variables: Device: Flame BuildID: 20140914230807 Gaia: 2f5a1aff6b39f6e8e03e3a69c05d57d5b0987409 Gecko: 470ad786cb72 Version: 35.0a1 (2.2 Master) Firmware: V18D-1 User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0 First Working Environmental Variables: Device: Flame BuildID: 20140914234307 Gaia: 855be6ade407c26e0596e7306a44deebc3f60933 Gecko: f27ff178807d Version: 35.0a1 (2.2 Master) Firmware: V18D-1 User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0 First Working Gaia & Last Broken Gecko - issue does NOT repro Gaia: 855be6ade407c26e0596e7306a44deebc3f60933 Gecko: 470ad786cb72 First Working Gecko & Last Broken Gaia - issue DOES repro Gaia: 2f5a1aff6b39f6e8e03e3a69c05d57d5b0987409 Gecko: f27ff178807d Gaia pushlog: https://github.com/mozilla-b2g/gaia/compare/2f5a1aff6b39f6e8e03e3a69c05d57d5b0987409...855be6ade407c26e0596e7306a44deebc3f60933 Fixed in central 2.2 by patches to Bug 983172.
QA Whiteboard: [QAnalyst-Triage?]
Steve, can you take a look at this please? Looks like the work done on Bug 983172 may be the culprit here.
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker) → needinfo?(schung)
Please disregard Comment 17. Steve, looks like this was fixed in central 2.2. Can you look into uplifting that fix for bug 983172 to 2.1?
(In reply to KTucker [:KTucker] from comment #18) > Please disregard Comment 17. Steve, looks like this was fixed in central > 2.2. Can you look into uplifting that fix for bug 983172 to 2.1? I'll request 2.1 approval in bug 983172. It did help to reduce CPU and memory consumption while opening/rendering images. So this patch might be able to avoid low precision painting issue. Just wondering if it's still possible to reproduce when there're lots of images in the thread.
Hey Kats, looks like some changes we made in the SMS application in bug 983172 made the problem disappear. Still this means the underlying Gecko issue is likely not fixed. And the patch in bug 983172 is quite scary and had several regressions, so if we can find a simpler Gecko patch I'd be more at ease, especially that the partner will likely want to uplift to v2.0. In the patch in bug 983172 we made the following changes: * we're using blob url instead of data url * we're using mozsamplesize to display these blob-based images. Thank you for your help!
triage: we think it's a very bad user experience, and would like to block. there was a fix, but per comment 20 there's risk needs to be considered. As both 2.1/2.0 are CC, we're forwarding the final call to RM.
(In reply to Julien Wajsberg [:julienw] from comment #20) > Hey Kats, > > looks like some changes we made in the SMS application in bug 983172 made > the problem disappear. Still this means the underlying Gecko issue is likely > not fixed. And the patch in bug 983172 is quite scary and had several > regressions, so if we can find a simpler Gecko patch I'd be more at ease, > especially that the partner will likely want to uplift to v2.0. Yeah that's fair. If this is taken as blocking for 2.0 or 2.1 then I can try to debug it directly on those branches. I can't guarantee finding a simple enough fix, but I can look into it. I'll wait for :bajaj to decide if it's worth doing this or not.
(In reply to Kartikaya Gupta (email:email@example.com) from comment #22) > (In reply to Julien Wajsberg [:julienw] from comment #20) > > Hey Kats, > > > > looks like some changes we made in the SMS application in bug 983172 made > > the problem disappear. Still this means the underlying Gecko issue is likely > > not fixed. And the patch in bug 983172 is quite scary and had several > > regressions, so if we can find a simpler Gecko patch I'd be more at ease, > > especially that the partner will likely want to uplift to v2.0. > > Yeah that's fair. If this is taken as blocking for 2.0 or 2.1 then I can try > to debug it directly on those branches. I can't guarantee finding a simple > enough fix, but I can look into it. I'll wait for :bajaj to decide if it's > worth doing this or not. I am not sure of uplifting any patch to 2.0 at this point given we are not sure of the risk of uplift of the patch or have a solution here. Also, most of the 2.0 builds are out there. I'll investigate to find if this is worth uplifting to 2.0 once we have a patch. For 2.1, I think this is definitely worth fixing. But before jumping into blocking this, I had QA try to reproduce this issue on by simulating the same testcase with MMS with lots of tests and they are not able to repro it. Although when the profile attached in this bug was used its pretty easy to reproduce and makes you feel like a bug we would not to ship with given the test is blurred for 2 secs. Also note it it happens for 2 seconds everytime you open the message thread. Do we know whats so unique about the profile and if that could be a common usecase?
I don't know about this profile but what I know for sure is that I feel like this happens to me when I simply browse the web. What _could_ happen (Kats tell me if I'm crazy) is that with the code in v2.0 and v2.1 we're doing quite expensive operations to resize the images. We changed this code in v2.2 (that's bug 983172) and we're not resizing the images anymore, or more exactly we don't do this in the Gaia code anymore and let Gecko do this in a way more efficient way. So because we're doing expensive operations, maybe we are really running out of available CPU resources, and as a result we trigger the low resolution painting. If this is what's happening, then I don't see a lot of solutions: * actually uplift bug 983172 * try to adjust some thread priorities * do a specific patch that would delay resizing images for hidden messages If this is what's happening, then I'd uplift bug 983172 (+ its regression fix), because this is the safest.
(In reply to Julien Wajsberg [:julienw] from comment #24) > So because we're doing expensive operations, maybe we are really running out > of available CPU resources, and as a result we trigger the low resolution > painting. I don't think this is likely. The low-res painting doesn't use available CPU resources as a condition on whether or not to trigger. I think more likely is that it's a race condition or similar in the low-precision painting code, and it happens to be reliably reproducible using this profile and the code as it was in 2.1. Changing the code worked around the race in 2.2. I'll do a 2.1 build and see if I can track down the problem here. If I'm still able to repro the problem on a local build it shouldn't be hard to figure out what's going on. Fixing it might fix other cases where this randomly appears as well.
Assignee: nobody → bugmail.mozilla
Per offline discussion, I'm here to post my findings on how often we can encounter this in real life usage. I was unable to manually create SMS's that could reproduce the issue as the attached profile on comment 12. I mimicked its message style and created long messages with text and images and was unable to reproduce the issue. There seems to be something unique about the provided profile's MMS composition where it would have enough text & image and would not have too much contents so it would be able to scroll to the end of the message (if the thread is too long then it won't scroll to the end upon opening) and would display the blurriness. I then tried to find newspaper websites that would re-create the article in a message and was unsuccessful as well. Most of the newspaper websites only provide a link to the article, and not the whole content of the article. And most of those websites provide sharing via very specific social media websites such as facebook or twitter. I did find one that allows me to send an email with the whole article but the Flame device doesn't know how to open the message. Finally I tried to create a long email with an image and sent the email as an MMS to the Flame device. In this way I was able to reproduce the issue, but the issue seems less severe than the provided profile at comment 12. I experienced blurriness of messages thread for about a second, as opposed to using the profile it creates a permanent blurriness until there's movement on the page. I did encounter this short blurriness of 1 to 2 seconds when finding the window and I regarded it as reproducing the issue, since that's what the original bug is about. My conclusion is this is not a very common use scenario in the US (sending news articles via MMS), although it seems to be common in other areas. And when manually reproducing the issue it seems less severe because it recovers pretty fast. Unless the user is particularly looking for it, they won't see it as a big issue. It is worth fixing but not worth risking breaking other stuff in fixing it.
triage: non-blocking. please ask for approval and let the sherif deicde based on risk
blocking-b2g: 2.1? → backlog
I was able to reproduce this on 2.1; tiling log attached. The problematic bit is where it aborts the high-res paint because it thinks we're about to checkerboard. The painted rect is (x=0.000000, y=1535.983276, w=320.000000, h=462.016724) and the showing rect is (x=0.000000, y=1483.000000, w=320.000000, h=515.000000), and yet painted.Contains(showing) returns false at . If you do the check manually you'll see that y+h for the painted rect is exactly equal to the y+h for the showing rect, but presumably the log is printing truncated values and the the floating-point values actually stored in memory don't add up quite so neatly.  http://mxr.mozilla.org/mozilla-b2g34_v2_1/source/gfx/layers/client/TiledContentClient.cpp?rev=9148cd599e9f#274
Oh wait I'm wrong. The top of the painted rect is lower than the top of the showing rect, so that's why the check is false. There might /also/ be a rounding error but there might not be. I'll keep digging.
Ok, so the problem seems to happen because the critical displayport is not updated after the content scrollTo's to the bottom of the messages screen. The scroll offset is initially (0,0) so the displayport margin is all on the "bottom" side, and then we jump to the bottom of the content, scroll offset (0, 1549.67), and the margins never get updated. The top end of the margin gets rounded up to the nearest tile which is 13.67 pixels away but that's not enough to cover the "danger zone" area. This results in AboutToCheckerboard to detect a checkerboarding condition. While reproducing this on 2.1, I noticed it wouldn't happen the first few times I opened the messages app. The reason it didn't happen on those instances is because the scroll offset would first go from (0,0) to (0, 337.667) and then from there it would go to (0, 1549.67). The intermediate jump allows the margin to shift to the "top" side enough that it covers the danger zone and there's no problem. Basically if I'm understanding this correctly the fix is to make sure that we recompute margins in the APZ and push them to layout any time there's a scroll offset change that we get from layout. There's a comment in the code at  which implies this always happens, but if you look inside the CancelAnimation code it only happens if we are in an overscrolled state.  http://mxr.mozilla.org/mozilla-central/source/gfx/layers/apz/src/AsyncPanZoomController.cpp?rev=992050dc1e5f#2803
I think the root cause of this bug (i.e. APZC doesn't recompute margins if web content does a scrollTo) still exists in master, although I wasn't able to construct a test case that reproduces this problem. There's enough state in the high/low-precision drawing code that it manages to take a different good path out of there with everything I attempted. Nonetheless I can see that the margins don't get recomputed and that's obviously bad because then we'll checkerboard sooner than we should, even if we don't get stuck in a perma-low-res state. I'll put up a patch to fix this.
Comment on attachment 8557112 [details] [diff] [review] Ensure we do a repaint after scroll position changes Review of attachment 8557112 [details] [diff] [review]: ----------------------------------------------------------------- Is it necessary to repaint in this case, or would it be sufficient to just compute new displayport margins and apply them to the next repaint (triggered by something else)? If so, would it be useful to have a mechanism to separate the two (computing displayport and repainting)?
Attachment #8557112 - Flags: review?(botond) → review+
For the scenario this bug was filed for it is necessary to send those margins to layout right away, otherwise we get stuck with low-precision rendered content visible to the user.
Component: Gaia::SMS → Panning and Zooming
Product: Firefox OS → Core
(In reply to Kartikaya Gupta (email:firstname.lastname@example.org) from comment #34) > For the scenario this bug was filed for it is necessary to send those > margins to layout right away, otherwise we get stuck with low-precision > rendered content visible to the user. Does that mean that there was already a repaint (namely, the one for which we are processing the layers transaction) with a suboptimal displayport? If so, that makes me wonder whether we'd be better off doing the redistribute-displayport-to-the-other-direction on the layout side (in GetDisplayPortFromMarginsData) instead, so we paint the displayport we want to begin with.
(In reply to Botond Ballo [:botond] from comment #36) > Does that mean that there was already a repaint (namely, the one for which > we are processing the layers transaction) with a suboptimal displayport? Yes > If > so, that makes me wonder whether we'd be better off doing the > redistribute-displayport-to-the-other-direction on the layout side (in > GetDisplayPortFromMarginsData) instead, so we paint the displayport we want > to begin with. Yeah, this might make sense. I used to think that computing the displayport was really the APZ's job but I think that's a holdover from my Fennec-inspired worldview where we had complicated heuristics to position the displayport based on velocity and so on. Given bugs like this one and the stuff dvander's been running into it might make more sense to move more or all of the displayport computation logic into the main-thread side of things.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Hey kats, do you think it makes sense to try uplifting this to v2.1 and v2.2 ?
Comment on attachment 8557112 [details] [diff] [review] Ensure we do a repaint after scroll position changes NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings. [Approval Request Comment] Bug caused by (feature/regressing bug #): APZ User impact if declined: in some rare cases we end up in a state with blurry content visible. This may happen for example after a scrollTo by web content (in some cases only) Testing completed: locally on 2.1 and master builds Risk to taking this patch (and alternatives if risky): low risk, patch is small String or UUID changes made by this patch: none
Attachment #8557112 - Flags: approval-mozilla-b2g37?
Attachment #8557112 - Flags: approval-mozilla-b2g37+
Attachment #8557112 - Flags: approval-mozilla-b2g34?
Attachment #8557112 - Flags: approval-mozilla-b2g34+
Attachment #8557112 - Flags: approval-mozilla-b2g32?
Attachment #8557112 - Flags: approval-mozilla-b2g32-
You need to log in before you can comment on or make changes to this bug.