Closed Bug 851963 Opened 11 years ago Closed 10 years ago

[email] iOS mail client attached images may be treated as embedded images that cannot be viewed because there is no referencing text/html part

Categories

(Firefox OS Graveyard :: Gaia::E-Mail, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(tracking-b2g:backlog, b2g-v2.0 affected, b2g-v2.1 fixed)

RESOLVED FIXED
2.0 S4 (20june)
tracking-b2g backlog
Tracking Status
b2g-v2.0 --- affected
b2g-v2.1 --- fixed

People

(Reporter: asuth, Assigned: awissmann)

References

Details

(Whiteboard: [priority])

Attachments

(1 file)

While discussing image attachments with :lightsofapollo, the issue of iOS attached images was broached.  I checked an e-mail I had received from an iPad that had no real body text (the image was described in the subject) other than the iPad signature line.

The message was multipart/mixed with parts like so:
- text/plain
- image/jpeg:
Content-Type: image/jpeg;
	name=photo.JPG
Content-Disposition: inline;
	filename=photo.JPG
Content-Transfer-Encoding: base64
- text/plain (the signature)

This is a (historically speaking) unusual message structure, although arguably far preferable than including a text/html part just to make the image visible.

Our current heuristics in imapchew.js will see the explicit content-disposition: inline and decide that it is a related part.

The simple/obvious fix for this problem is to notice that we have no content-id for the message and clobber the image back into being an attachment.
Depends on: 858663
No longer depends on: 858663
> Anthony Hughes, Mozilla QA (:ashughes) 2013-04-05 10:52:02 PDT
> Depends on: 858663
> Anthony Hughes, Mozilla QA (:ashughes) 2013-04-05 10:53:08 PDT
> No longer depends on: 858663

Sorry, wrong bug number.
blocking-b2g: --- → backlog
Whiteboard: [priority]
Attachment #8434325 - Flags: review?(bugmail)
Comment on attachment 8434325 [details] [review]
My proposed fix, along with a test case

Apologies about the delay on the review; the bootcamp week has been busier than I was expecting and I didn't want to rush this one since there can be subtleties with this.  (Also, kudos on digging in so quickly!)

Since it's not 100% clear to me what the right couI'm gonna do this as a sorta chart thing:rse of action is in all of these cases, I'm going to do a chart thing to help explore the space:

disposition | id? | filename? | mimetype => conditional that triggers (!! if changed) | our resulting disposition and any fixup (!! if changed) | is this even sane?

== Inline Disposition

- inline | yes id | no filename | text/* => 1 | inline | sane-weird
- inline | yes id | no filename | image/* => 1 | inline | sane
- inline | yes id | no filename | non-text, non-image => 1 | inline, fixup to attachment below | very weird but happens

(same results with a filename, sanity varies)
- inline | yes id | yes filename | text/* => 1 | inline | sane-ish
- inline | yes id | yes filename | image/* => 1 | inline | sane
- inline | yes id | yes filename | non-text/* => 1 | inline, fixup to attachment below | not quite as weird but happens

- inline | no id | no filename | text/* => 4 !! was 1 | inline | pretty weird, should just be a body part
- inline | no id | no filename | image/* => 3!! was 1 | attachment because it's non-text !! was inline before but not usefully | arguably super-weird variant on iOS case, could believe Exchange has done it but haven't seen it
- inline | no id | no filename | non-text, non-image => 3!! was 1 | attachment, same as before except previously it was due to the fix-up clause below | insane

- inline | no id | yes filename | text/* => 3!! was 1 | attachment !! was inline | weird but could happen for automated logs; this is arguably an improvement
- inline | no id | yes filename | image/* => 3!! was 1 | attachment !! was inline | iOS/this bug, super happy to have fixed.
- inline | no id | yes filename | non-text, non-image => 3!! was 1 | attachment !! was inline | less weird than the no filename cases, plausibly Exchange might do


== Attachment Disposition

This needs less breakout, especially since there's less debate about intent and our fix-ups just migrates things to be an attachment from non-attachment.

- attachment | no id | no filename | text => 4 !! was 1 | inline !! was attachment | it happens, same situation as next
- attachment | no id | no filename | non-text => 3 !! was 1 | attachment | it happens, same situation as above
- attachment | no id | yes filename | * => 3 !! was 1 | attachment | very usual

- attachment | yes id | no filename | text => 2 !! was 1 | inline !! was attachment | insane, plausible from exchange/outlook
- attachment | yes id | no filename | non-text => 2 !! was 1 | inline !! was attachment | insane, plausible from exchange/outlook
- attachment | yes id | yes filename => 2 !! was 1 | inline !! was attachment | sane-weird, plausible from exchange/outlook

== No disposition given

There is no change to this logic.


==== Conclusion

So, that chart took a lot more work and is a lot more confusing than I expected.  I think my take-away is that we want our conditions to be more like this (noting that the existing code's oddness is my fault):
1) If there's an explicit disposition (case-insensitive)...
1a) of inline:
1a1) If there's an id: inline.  Our fix-up for types we can't handle below this saves us from weirdness.
1a2) No id: attachment
1 [review]b) of attachment: attachment
2 [review]) (no specified disposition, AKA usually a body part), basically the same as clauses 3/4 still
2a) has a filename or is non-text: attachment
2 [review]b) else: inline

And we'd still have our fix-up.

My brain finds this easier to digest since it maps to a relatively straightforward flow chart type of thing.

Assuming this makes sense to you too, please do another rev of the patch, and please add {} braces to the new/revised code since we're trying to clean up the code base as we go.  Feel free to add any additional MIME structures that jump out at you as needing coverage.
Attachment #8434325 - Flags: review?(bugmail)
> So, that chart took a lot more work and is a lot more confusing than I
> expected.  I think my take-away is that we want our conditions to be more
> like this (noting that the existing code's oddness is my fault):
> 1) If there's an explicit disposition (case-insensitive)...
> 1a) of inline:
> 1a1) If there's an id: inline.  Our fix-up for types we can't handle below
> this saves us from weirdness.
> 1a2) No id: attachment
> 1 [review]b) of attachment: attachment
> 2 [review]) (no specified disposition, AKA usually a body part), basically
> the same as clauses 3/4 still
> 2a) has a filename or is non-text: attachment
> 2 [review]b) else: inline
> 
> And we'd still have our fix-up.
> 
> My brain finds this easier to digest since it maps to a relatively
> straightforward flow chart type of thing.
> 
> Assuming this makes sense to you too, please do another rev of the patch,
> and please add {} braces to the new/revised code since we're trying to clean
> up the code base as we go.  Feel free to add any additional MIME structures
> that jump out at you as needing coverage.

I've modified the commit in the original pull request to reflect these comments
Attachment #8434325 - Flags: review?(bugmail)
See Also: → 1024685
Doh. So with the change, our test_mail_html "embedded and remote images" test fails because although we provide a content id we do not provide a content disposition in the test.  This ends up triggering the:
  } else if (filename || partInfo.type !== 'text') {
    disposition = 'attachment';
  }
clause since the type is not text.

Looking at http://tools.ietf.org/html/rfc2387 I then see that section 4 says that:

   With Multipart/Related however, the application processing the
   compound object determines the presentation style for all the
   contained parts.  In that context the Content-Disposition header
   information is redundant or even misleading.  Hence, User Agents that
   understand Multipart/Related shall ignore the disposition type within
   a Multipart/Related body part.

I also see that a multipart/related HTML email sent using Thunderbird 14 that was clearly copied and pasted from an MS Word (v14?) document generated a MIME structure where the images had mime headers on the images like so:

  Content-Type: image/jpeg
  Content-Transfer-Encoding: base64
  Content-ID: <part1.02000808.08090909@mozilla.com>

In contrast, Thunderbird-produced HTML prefers to make things with an explicit disposition and filename that look like:

  Content-Type: image/jpeg;
   name="mozillians.jpg"
  Content-Transfer-Encoding: base64
  Content-ID: <part1.08000001.08070403@mozilla.com>
  Content-Disposition: inline;
   filename="mozillians.jpg"

As discussed at other times, the most optimal solution for the final determination is to check whether the HTML body part includes references to the given content-id or not.  However, we don't have the HTML body at the time we're processing this, so that sanitizer would have to perform a secondary fix-up pass if these heuristics get it wrong.  I've filed bug 1024685 as a viable long-term solution to this.

So we need a change to cover the 1st case (and unit test) case.  I think I'm also realizing it's wrong of us to not be aware of what our parent type is in chewLeaf.  Arguably there should be a second arg to it like parentMultipartType that can be null if there is no enclosing container and would be 'related'/'alternative'/'mixed' otherwise.

So, I now propose that we revise the logic for the non-disposition case to look like this instead:

    // Inline image attachments that belong to a multipart/related may lack a
    // disposition but have a content-id.
    // XXX Ensure 100% correctness in the future by fixing up mis-guesses
    // during sanitization as part of https://bugzil.la/1024685
    } else if (parentMultipartType === 'related' && partInfo.id &&
               partInfo.type === 'image') {
      disposition = 'inline';
    // If there is no explicit disposition, all non-text parts or ones with a
    // filename are made attachments.
    } else if (filename || partInfo.type !== 'text') {
      disposition = 'attachment';
    } else {
      disposition = 'inline';
    }

I would also propose that we also modify the newly failing test case slightly to make relImage_2 have an explicit inline disposition just to get slightly more branch coverage out of that case.

My apologies about not realizing this glitch before.  This is the really ugly part of email.

Also NI'ing :jcranmer since he is arguably one of the most experienced people when it comes to crazy MIME document structures/etc. and may be able to use his extensive MIME structure corpus to better answer how good my proposal is.

It's okay to wait to update your patch for feedback from :jcranmer to reduce turnover.
Flags: needinfo?(Pidgeot18)
Comment on attachment 8434325 [details] [review]
My proposed fix, along with a test case

Thanks for addressing the lint nits already.  Please re-request review once we've heard from :jcranmer and you've fixed up the patch and you get a green test run locally or the travis failures aren't in test_mail_html/MIME related things.
Attachment #8434325 - Flags: review?(bugmail)
Attachment #8434325 - Flags: review?(bugmail)
Attachment #8434325 - Flags: review?(Pidgeot18)
Comment on attachment 8434325 [details] [review]
My proposed fix, along with a test case

Great work on this and sorry about all the iterations.  I see from :jcranmer's IRC nick that he's likely to be in transit.  And his answers may result in a more complex follow-up with more test coverage and/or involve importing some code from jsmime, so let's land this as-is.

I'll do the merging and landing magic for this after dinner.
Attachment #8434325 - Flags: review?(bugmail)
Attachment #8434325 - Flags: review?(Pidgeot18)
Attachment #8434325 - Flags: review+
> I'll do the merging and landing magic for this after dinner.

Awesome, thanks!
(In reply to Andrew Sutherland (:asuth) from comment #7)
> Also NI'ing :jcranmer since he is arguably one of the most experienced
> people when it comes to crazy MIME document structures/etc. and may be able
> to use his extensive MIME structure corpus to better answer how good my
> proposal is.

So I don't actually have a MIME structure corpus. The "corpus" you're probably referring to is my NNTP archive, which is insufficient to ask any questions about text/html or multipart/related. Most of my other knowledge comes from looking at bug reports and other programs to see what interoperability exists.

> As discussed at other times, the most optimal solution for the final
> determination is to check whether the HTML body part includes references to
> the given content-id or not.  However, we don't have the HTML body at the
> time we're processing this, so that sanitizer would have to perform a
> secondary fix-up pass if these heuristics get it wrong.  I've filed bug
> 1024685 as a viable long-term solution to this.

"References" is a bit of a vague term. There's apparently a phishing scam that sends a phishing page attached to an email via an <a href="cid:...."> link, judging from a recent bug.

>     // Inline image attachments that belong to a multipart/related may lack a
>     // disposition but have a content-id.
>     // XXX Ensure 100% correctness in the future by fixing up mis-guesses
>     // during sanitization as part of https://bugzil.la/1024685
>     } else if (parentMultipartType === 'related' && partInfo.id &&
>                partInfo.type === 'image') {

video/* and audio/* are theoretically interesting here, although I don't know how many people are sending <audio> or <video> links in email or even how many clients beyond Thunderbird support those in HTML email. /me glances askew at htemail working group. application/octet-stream throws a wrench in here as well, of course, and there may be a few other interesting MIME types that get shoved into an application/.

My gut reaction is that all direct children of multipart/related ought to default to inline-used files, but, honestly, that's largely based on an assumption that email clients are sane in how they emit messages, which experience has tended to indicate is a fallacious assumption.
Flags: needinfo?(Pidgeot18)
Blocks: 1038836
blocking-b2g: backlog → ---
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: