Closed Bug 796673 Opened 12 years ago Closed 12 years ago

[email/activesync] Download HTML bodies when available

Categories

(Firefox OS Graveyard :: Gaia::E-Mail, defect, P1)

Tracking

(blocking-basecamp:+)

VERIFIED FIXED
B2G C1 (to 19nov)

People

(Reporter: ghtobz, Assigned: squib)

Details

(Keywords: feature, Whiteboard: [label:email][label:feature])

[GitHub issue by mozsquib on 2012-09-18T16:54:18Z, https://github.com/mozilla-b2g/gaia/issues/4868]
Our ActiveSync support currently only fetches plain-text messages. We should fetch the HTML parts when they're available. This is, unfortunately, a bit more complicated than it should be. Some discussion has already happened over at mozilla-b2g/gaia-email-libs-and-more#18.
[GitHub comment by mozsquib on 2012-09-20T23:43:21Z]
One complication is that embedded images will work differently from IMAP. It appears that embedded attachments are flagged with `AirSyncBase:IsInline`. Here's an example in full:

```
<Attachment>
  <DisplayName>galaxy.jpg</DisplayName>
  <FileReference>fee80a43-0360-11e2-9388-00237de417b2:1</FileReference>
  <Method>1</Method>
  <EstimatedDataSize>322905</EstimatedDataSize>
  <ContentId>part1.01050208.00060103@gmail.com</ContentId>
  <IsInline>1</IsInline>
</Attachment>
```

@asutherland: it seems that when we have a content ID for IMAP, we store that in the name field. Should we be doing that here, too? Or perhaps we really want a separate `contentId` attribute for our attachments.
[GitHub comment by asutherland on 2012-09-21T00:04:45Z]
@mozsquib Yes, let's add a `contentId` and change the existing usage. We do have separate lists for attachments and related parts, so it's okay for name to have different semantics in each. But since the UX wireframes call for being able to save those inline parts to disk, I do think we should try to preserve the filename when it's available, so that we can save the file with the name it came with. I was trying to do the shape normalization thing, so maybe let's just put null for `contentId` on actual attachments, even though it's redundant? (Or it could just be the last field, so that although the shape will be different, it won't have an entirely different series of shape transitions.)
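A minimal sketch of the shape being discussed, assuming the `<Attachment>` XML has already been parsed into plain objects keyed by element name. The function name `splitAttachments` and the output field names (`name`, `fileReference`, `sizeEstimate`, `contentId`) are illustrative assumptions, not the actual gaia-email-libs-and-more representation.

```javascript
// Hypothetical normalizer: splits parsed ActiveSync attachments into real
// attachments and inline "related parts". Every entry has the same fields in
// the same order, with contentId last and null for real attachments.
function splitAttachments(parsedAttachments) {
  const attachments = [];
  const relatedParts = [];
  for (const att of parsedAttachments) {
    const isInline = att.IsInline === '1';
    const entry = {
      // Keep the original filename so inline parts can be saved to disk.
      name: att.DisplayName || null,
      fileReference: att.FileReference,
      sizeEstimate: parseInt(att.EstimatedDataSize, 10) || 0,
      contentId: isInline && att.ContentId ? att.ContentId : null,
    };
    (isInline ? relatedParts : attachments).push(entry);
  }
  return { attachments, relatedParts };
}

const example = splitAttachments([
  {
    DisplayName: 'galaxy.jpg',
    FileReference: 'fee80a43-0360-11e2-9388-00237de417b2:1',
    Method: '1',
    EstimatedDataSize: '322905',
    ContentId: 'part1.01050208.00060103@gmail.com',
    IsInline: '1',
  },
  // Made-up regular attachment, for contrast only.
  {
    DisplayName: 'report.pdf',
    FileReference: 'hypothetical-ref:2',
    Method: '1',
    EstimatedDataSize: '1024',
  },
]);
```

Here the inline image lands in `relatedParts` with both its filename and its content ID, while the regular attachment lands in `attachments` with `contentId` set to null.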
Component: Gaia → Gaia::E-Mail
asuth, my plan here is to download the headers first and figure out the preferred type of the message (plain text or HTML). Then I'll download the appropriate body part. Is there a way in the database to store the headers when we don't have the body yet?
Yes, although I would lean towards just not adding either of them until you have both for simplicity, and we could do the split thing in a follow-up.  Details:

The header and body are added via separate calls.  You could definitely do:

- Add header to DB (without snippet? or preliminary snippet)
- Fetch body
- Update header with body (probably requires an update to the UI code, since we assume the snippet exists from the word "go")
- Add body to DB

The main background enhancement that you would need is to deal with the possibility that the user manages to click on the header when there is no body yet.  Probably the right API for now would be for getBody to just record, in a map, the desire to be notified when the body is added, and have the body addition trigger the callback.  In the future, if we decoupled body retrieval, getBody could actually trigger the retrieval.  But that bit is definitely way too much work for now.
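A rough sketch of that getBody idea, using an in-memory store as a stand-in for the real database layer. The class name `MessageStore`, its method signatures, and the `snippet` handling are all assumptions made for illustration; only the "map of waiters flushed when the body is added" mechanism comes from the comment above.

```javascript
// Hypothetical in-memory store; the real storage API is not shown in this bug.
class MessageStore {
  constructor() {
    this.headers = new Map();              // message id -> header object
    this.bodies = new Map();               // message id -> body object
    this.pendingBodyCallbacks = new Map(); // message id -> array of callbacks
  }

  // Step 1: the header goes in first, possibly without a snippet.
  addHeader(id, header) {
    this.headers.set(id, header);
  }

  // Calls back right away if the body is already stored; otherwise remembers
  // the callback until addBody() is called for this message.
  getBody(id, callback) {
    if (this.bodies.has(id)) {
      callback(this.bodies.get(id));
      return;
    }
    const waiting = this.pendingBodyCallbacks.get(id) || [];
    waiting.push(callback);
    this.pendingBodyCallbacks.set(id, waiting);
  }

  // Later steps: store the body, fill in the header's snippet, and notify
  // anyone who asked for the body before it existed.
  addBody(id, body, snippet) {
    this.bodies.set(id, body);
    const header = this.headers.get(id);
    if (header) {
      header.snippet = snippet;
    }
    const waiting = this.pendingBodyCallbacks.get(id) || [];
    this.pendingBodyCallbacks.delete(id);
    for (const callback of waiting) {
      callback(body);
    }
  }
}
```

With this shape, a click on a header whose body has not arrived simply parks its callback; nothing in getBody starts a download, which matches the "too much work for now" caveat.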
Assignee: nobody → squibblyflabbetydoo
More adventures in Gmail compatibility:

1) While it's difficult to figure out if a message is plain text or HTML in Hotmail, it appears to be outright impossible for Gmail.

2) Gmail likes to munge the HTML it gives you so that it matches the HTML that goes into the Gmail web interface. Most notably, this means that image URLs are relative to the Gmail website's URL (https://mail.google.com/mail/u/0/).

I'm not sure what the right way forward is. Maybe we should just give up and always fetch HTML mail...
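To make point 2 concrete: a relative image URL in Gmail's HTML only resolves correctly against the Gmail web interface's base URL. The sketch below uses the standard `URL` constructor to show what a browser would do with such a reference. The helper name `resolveImageSrc` and the relative path in the usage note are made up, and this is not necessarily how the problem ends up being handled.

```javascript
// Base URL of the Gmail web interface, as quoted above.
const GMAIL_BASE = 'https://mail.google.com/mail/u/0/';

// Resolves an image src the way a browser would if the HTML were served from
// baseUrl. Absolute URLs, including cid: references, come back unchanged.
function resolveImageSrc(src, baseUrl) {
  return new URL(src, baseUrl).href;
}
```

For example, `resolveImageSrc('images/example.png', GMAIL_BASE)` yields `https://mail.google.com/mail/u/0/images/example.png`, which is only meaningful on Gmail's own site, whereas a `cid:` reference passes through untouched.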
One more thing: ActiveSync technically *requires* that it give me the original format[1]. Do we think it's at all worthwhile to complain at Microsoft and Google to make Hotmail/Gmail obey the spec? That would make my life a whole lot easier.

[1] http://msdn.microsoft.com/en-us/library/ee218276%28v=exchg.80%29.aspx
(In reply to Jim Porter (:squib) from comment #6)
> 1) While it's difficult to figure out if a message is plain text or HTML in
> Hotmail, it appears to be outright impossible for Gmail.

Can you elaborate on this?  What does hotmail do that we can use that gmail does not do?
 
> 2) Gmail likes to munge the HTML it gives you so that it matches the HTML
> that goes into the Gmail web interface. Most notably, this means that image
> URLs are relative to the Gmail website's URL
> (https://mail.google.com/mail/u/0/).

Freaky, but I'm confused.  Are you saying:

A) It re-writes embedded images so that instead of using cid's with attachments they become external images hosted by google?

B) It re-writes external images to go through google somehow?

C) It pre-sanitizes the messages so that javascript is stripped or nulled out, and all of those many many many div's that gmail likes to wrap HTML e-mail are pre-created?
 
> I'm not sure what the right way forward is. Maybe we should just give up and
> always fetch HTML mail...

This doesn't sound like the end of the world.
(In reply to Andrew Sutherland (:asuth) from comment #8)
> (In reply to Jim Porter (:squib) from comment #6)
> > 1) While it's difficult to figure out if a message is plain text or HTML in
> > Hotmail, it appears to be outright impossible for Gmail.
> 
> Can you elaborate on this?  What does hotmail do that we can use that gmail
> does not do?

Hotmail will return the native type of the message (plain text or HTML) if I ask real nice, but there appears to be no way to do this for Gmail. See https://github.com/mozilla-b2g/gaia-email-libs-and-more/issues/18 for some info on how Hotmail works.

It's extra-frustrating because both Hotmail and Gmail are in violation of the ActiveSync spec, as I mentioned in comment 7 (it's possible they're just conforming to an old version, but of course, Microsoft doesn't give me version history of their specs...)

> > 2) Gmail likes to munge the HTML it gives you so that it matches the HTML
> > that goes into the Gmail web interface. Most notably, this means that image
> > URLs are relative to the Gmail website's URL
> > (https://mail.google.com/mail/u/0/).
> 
> Freaky, but I'm confused.  Are you saying:
> 
> A) It re-writes embedded images so that instead of using cid's with
> attachments they become external images hosted by google?

Yes.

> B) It re-writes external images to go through google somehow?

Maybe, but I don't think so. I haven't tried this yet.

> C) It pre-sanitizes the messages so that javascript is stripped or nulled
> out, and all of those many many many div's that gmail likes to wrap HTML
> e-mail are pre-created?

Yes (though I'm not sure how much gets sanitized).
Right, I remembered the issue 18 thing; it sounds like you're saying gmail does not provide the disambiguating element and if we ask for the body part as one or the other, it will just coerce it for us.

Embedded images becoming external images isn't the end of the world as long as it's a bearer URL and we don't need to authenticate for that HTTP request.  I think it'll be pretty obvious if that's a problem in the app, so no need to research that much.

Let's just ask for HTML always for gmail for now.  It looks like all of the providers on my top-brazil domains list that use google apps for domains also use the recommended gmail DNS MX entries, so I think just special-casing on the google domains may be sufficient.  Other heuristics, based on what you said, would seem to be: check whether the disambiguating entry is missing and then just ask for HTML, or use protocol version 12.x as a proxy for using HTML.
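The heuristics above could be combined roughly as follows. This is only a sketch: the function names, the `info` object, and its `nativeBodyType` field (standing in for the "disambiguating entry") are assumptions, and the MX hosts are supplied by the caller rather than looked up. The numeric values are the AirSyncBase body types for plain text (1) and HTML (2).

```javascript
// AirSyncBase body type values used in ActiveSync body preferences.
const BODY_TYPE_PLAIN = 1;
const BODY_TYPE_HTML = 2;

// Hypothetical check: true if any MX host is under google.com or
// googlemail.com, as with the recommended Google Apps MX entries.
function usesGoogleMx(mxHosts) {
  return mxHosts.some((host) =>
    /(^|\.)(google|googlemail)\.com\.?$/i.test(host));
}

// Picks which body type to request for a message.
//   info.mxHosts         - MX hostnames for the account's domain (may be empty)
//   info.protocolVersion - negotiated ActiveSync version string, e.g. '12.1'
//   info.nativeBodyType  - the server-reported native type, if it sent one
function chooseBodyType(info) {
  if (usesGoogleMx(info.mxHosts || [])) {
    return BODY_TYPE_HTML; // special-case Google domains
  }
  if (/^12\./.test(info.protocolVersion || '')) {
    return BODY_TYPE_HTML; // 12.x used as a proxy for "ask for HTML"
  }
  if (info.nativeBodyType == null) {
    return BODY_TYPE_HTML; // no disambiguating entry: just ask for HTML
  }
  return info.nativeBodyType === BODY_TYPE_HTML
    ? BODY_TYPE_HTML
    : BODY_TYPE_PLAIN;
}
```

Only a server that reports a native type, speaks something other than 12.x, and is not on Google's MX hosts ever gets a plain-text request; everything ambiguous falls back to HTML.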
Priority: -- → P2
Priority: P2 → --
Status: NEW → ASSIGNED
Priority: -- → P1
Just a quick update for people following along: I have this working perfectly for Hotmail, but Gmail does some fairly rude things to cid: URLs, which makes it fairly difficult to get images to show up in Gmail.

I think I have a workaround that should fix this, but I'm also going to send a message to the Gmail team to see if I can convince them to stop munging the HTML.
Keywords: feature
We're marking this bug with the C1 milestone since it follows the criteria of "unfinished feature work" (see https://etherpad.mozilla.org/b2g-convergence-schedule).

If this work is not finished by Nov 19, this bug will need an exception and will be called out at the upcoming Exec Review.
Target Milestone: --- → B2G C1 (to 19nov)
This is fixed in:
  https://github.com/mozilla-b2g/gaia/pull/6222
  https://github.com/mozilla-b2g/gaia-email-libs-and-more/pull/67
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Verified fixed on the 11/27 daily build.  HTML emails load nicely.
Status: RESOLVED → VERIFIED
Unagi Build ID:20130103070201 - fix verified
in-moztrap+ - this is covered by moztrap test case https://moztrap.mozilla.org/manage/cases/?filter-id=2915#caseversion-id-38552
Flags: in-moztrap+