Closed Bug 674619 Opened 14 years ago Closed 14 years ago

cmyk image with adobe marker not converted properly

Categories

(Core :: Graphics, defect)

defect
Not set
normal

Tracking

()

RESOLVED INCOMPLETE
mozilla10

People

(Reporter: sbarman, Assigned: yury)

Details

(Whiteboard: [pdfjs])

Attachments

(8 files, 9 obsolete files)

26.46 KB, image/jpeg
Details
271.00 KB, image/png
Details
26.45 KB, image/jpeg
Details
138.25 KB, image/png
Details
5.57 KB, text/x-c++src
Details
26.47 KB, image/jpeg
Details
19.32 KB, patch
joe
: checkin-
Details | Diff | Splinter Review
11.13 KB, patch
joe
: checkin+
Details | Diff | Splinter Review
I am extracting an image out of a pdf and using the following code: var img = new Image(); img.src = 'data:image/jpeg;base64,' + window.btoa(bytesToString(bytesArr)); to display it onto the screen (using a Canvas element), although the same problem happens when displaying the extracted image just using firefox. img_original.jpg contains the extracted image data (directly from the pdf). img_noadobemarker.jpg contains is the same as img_original except for the extra adobe marker removed. The adobe marker signifies that the decoder should perform a YCCK-to-CMYK image conversion. img_correct.png is a properly decoded version of the jpeg (as done by poppler). I have tried decoding img_original.jpg with firefox on ubuntu, windows and mac and all display a wrong image. I have also tried decoding it with chrome on ubuntu and get the same results. When I display img_noadobemarker.jpg in firefox, I get an image with a green tint. But when I display it in eog, I get the picture in img_noadobemarker.png. I believe this is a colorspace issue, with one of more of the color transforms not being applied. Attached is the source code which poppler uses to decode the jpeg. They use libjpeg to do most of the hard work.
Attached image Correctly decoded image
Try with the following build: https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/b56girard@gmail.com-661b4dd715a4/ - Recommended: Create a new profile - Set the property: |gfx.color_management.force_v4| - Restart firefox
Actually this image does not have a color profile, so the above step wont help. Moving to Graphics.
Component: GFX: Color Management → Graphics
QA Contact: color-management → thebes
I found the root of the bug in the following file. http://mxr.mozilla.org/mozilla-central/source/modules/libpr0n/decoders/nsJPEGDecoder.cpp#1172 libjpegturbo is used to grunt of the decoding. But nsJPEGDecoder assumes that the output of libjpegturbo is in inverted CMYK and does a conversion from inverted CMYK -> RGB. I edited the build so that the conversion was from CMYK -> RGB, instead, and the image displayed properly. So I tried the modified build and it seems for the CMYK JPEGs I found online, they do decode to an inverted CMYK form. But it seems like JPEGs with the proprietary Adobe marker (which is implemented by libjpeg) decode to CMYK and do not need the inversion.
(In reply to comment #7) > I found the root of the bug in the following file. > > http://mxr.mozilla.org/mozilla-central/source/modules/libpr0n/decoders/ > nsJPEGDecoder.cpp#1172 > > libjpegturbo is used to grunt of the decoding. But nsJPEGDecoder assumes > that the output of libjpegturbo is in inverted CMYK and does a conversion > from inverted CMYK -> RGB. I edited the build so that the conversion was > from CMYK -> RGB, instead, and the image displayed properly. > > So I tried the modified build and it seems for the CMYK JPEGs I found > online, they do decode to an inverted CMYK form. But it seems like JPEGs > with the proprietary Adobe marker (which is implemented by libjpeg) decode > to CMYK and do not need the inversion. Are you saying that the behaviour of libjpegturbo is different from libjpeg?
No. I should have said that both libjpeg and libjpegturbo implement the proprietary Adobe marker. The problem is that nsJPEGDecoder assumes that if "mInfo.out_color_space == JCS_CMYK" then the output format is really inverted CMYK.
Strange thing, the images: ./modules/libpr0n/test/reftest/jpeg/jpg-cmyk-1.jpg ./modules/libpr0n/test/reftest/jpeg/jpg-cmyk-2.jpg when browsed via firefox look like: ./modules/libpr0n/test/reftest/jpeg/jpg-cmyk-1.png ./modules/libpr0n/test/reftest/jpeg/jpg-cmyk-2.png, and when browsed via OS they look different. Both of those jpegs have adobe's markers. With attachment 548849 [details] it's vice versa.
That's consistent with the https://bugzilla.mozilla.org/show_bug.cgi?id=44781#c70 statements. Difference with the attachment 548849 [details] and (e.g.) http://mxr.mozilla.org/mozilla-central/source/modules/libpr0n/test/reftest/jpeg/jpg-cmyk-1.jpg is the presence of the JFIF header. The JFIF header is present for jpg-cmyk-1.jpg; and version for this header is 0x0101 (current JFIF spec is 0x0102). Adobe markers are present in both files.
Found the following rules for the Adobe markers (from http://download.oracle.com/javase/6/docs/api/javax/imageio/metadata/doc-files/jpeg_metadata.html): If an Adobe APP14 marker segment is present, the colorspace is determined by consulting the transform flag. The transform flag takes one of three values: 2 - The image is encoded as YCCK (implicitly converted from CMYK on encoding). 1 - The image is encoded as YCbCr (implicitly converted from RGB on encoding). 0 - Unknown. 3-channel images are assumed to be RGB, 4-channel images are assumed to be CMYK. Looks like only transform=0 is implemented. transform=1 is default JPEG/JFIF mode. The transform=2 is missing.
Assignee: nobody → async.processingjs
OS: Linux → All
Hardware: x86 → All
Version: unspecified → Trunk
Whiteboard: [pdfjs]
The libjpeg-turbo outputs the CMYK as inverse values from 255 to 0. The ycck_cmyk_convert function outputs the CMYK values in normal range from 0..255. The patch forces ycck_cmyk_convert and cmyk_ycck_convert operate in the inverse values range. Not sure how to proceed here since it looks like upstream bug. Sent the same patch to https://sourceforge.net/tracker/?func=detail&aid=3403800&group_id=303195&atid=1278160 .
I really don't think that the problem is with libjpeg-turbo. The problem lies in libpr0n, which inverts whatever libjpeg-turbo returns if the colorspace is YCCK or CMYK. In particular, look function static void cmyk_convert_rgb(JSAMPROW row, JDIMENSION width) in ./modules/libpr0n/decoders. This function explicitly inverts the image, and if the inversion is commented out then the test image is displayed correctly (although other images on the web do not). As far as I can tell, libjpeg-turbo functions similar to libjpeg which is what poppler uses to render jpegs. I believe libpr0n is the real offender and what the patch does is invert the image twice.
I thought that too. The libpr0n relies on the libjpeg to output the CMYK color space , which is inverted for Adobe images transform, no other programs outputs CMYK. Look at http://mxr.mozilla.org/mozilla-central/source/jpeg/jdcolor.c#390. If libjpeg-turbo team will disagree, we can extend the linpr0n to request and process YCCK color space and perform YCCK->CMYK conversion there.
Comment on attachment 558091 [details] [diff] [review] Fixes the libjpeg-turbo YCCK->CMYK logic Can you fix the tabs/formatting? I think we just ask a module owner for review and land this while the upstream fix is being applied.
Attachment #558091 - Flags: review?(joe)
Attachment #558091 - Attachment is obsolete: true
Attachment #558091 - Flags: review?(joe)
Attachment #558096 - Flags: review?(joe)
I still think you are inverting the components at the wrong place. It seems like you are inverting when YCCK -> CMYK (by saying that R=C, G=M, and B=Y, which doesn't make any sense to me). Anyways, my feeling is that this will break if the pdf has a CMYK image which isn't encoded using YCCK (which I don't think we currently have), since this code would not get called and the image would then be inverted by libpr0n.
Comment on attachment 558096 [details] [diff] [review] Fixes the libjpeg-turbo YCCK->CMYK logic (v2) I am going to admit I have no clue what you guys are talking about, but we should achieve consensus and have the two of you agree before we push this to joe for review. Deal? :)
Attachment #558096 - Flags: review?(joe)
Yeah, thats fine. So I think we both agree that the image is getting inverted somewhere it shouldnt be inverted. If I understand correctly, Yury thinks its in libjpeg-turbo while I think its in libpr0n. Now my reasons for believing its in libpr0n is that: -the inversion code in there is only applied to cmyk images (commenting the code out makes the image appear correctly) -poppler uses libjpeg without doing any transforms to the output. The way poppler works is that it reads the adobe marker if presents and sets the colorspace on that. There is also an option to override the colorspace from the call code. -I assume libjpeg-turbo implements the conversions as libjpeg (in that they are interchangeable) From what I can tell of the patch, it wants to correct the inversion when the YCCK values are transformed to CMYK values inside of libjpeg-turbo. So I think this approach won't work because: -This won't apply to CMYK images which are not encoded. -If you change libjpeg-turbo, presumable libjpeg should change which would affect poppler and other applications. I'd be convinced if it could be shown that the patch fixes CMYK images in pdfs (for both jpegs using the YCCK transform and not using it) and leaves intact those CMYK jpegs that are online.
(In reply to Shaon Barman from comment #20) > I'd be convinced if it could be shown that the patch fixes CMYK images in > pdfs (for both jpegs using the YCCK transform and not using it) and leaves > intact those CMYK jpegs that are online. After applying the patch, the http://mxr.mozilla.org/mozilla-central/source/modules/libpr0n/test/reftest/jpeg/jpg-cmyk-1.jpg image is not affected (this image has no YCCK transform), however it's CMYK image. The all CMYK jpegs belong to the Adobe products AFAIK. In our case, libjpeg-turbo reads Adobe markers and makes right call about transform (YCCK or CMYK). libpr0n requests image in CMYK color space, for YCCK the libjpeg-turbo returns normal CMYK, but for CMYK libjpeg-turbo returns inverted CMYK. Which makes it weird to users of the library to check Adobe marker again to invert the values. Why inverted CMYK? I'm taking in account that Adobe intended to return inverted CMYK when no transform applied (transform=0), CMYK is subtractive color model and it's way easier to do |iC * iK| instead of |1 - (C * (1 - K) + K)|.
Response from upstream: "This code is not unique to libjpeg-turbo. libjpeg does the same thing. Thus, I cannot apply this patch without breaking compatibility with libjpeg. Whether or not it's correct for libjpeg to output inverted CMYK, that is what it has been doing since the 1990's, so libjpeg-turbo cannot diverge from that. I have read the comments on Mozilla.org as well. What I will say is: if this is a unique problem with libjpeg-turbo that does not occur with the unaccelerated libjpeg, then it's our problem. Otherwise, it's Mozilla's problem." Looking for workaround.
That one affects only libpr0n. This is a workaround that is based on the 11-year old libjpeg behavior: (In reply to Yury (:yury) from comment #21) > In our case, libjpeg-turbo reads Adobe markers and makes right call about > transform (YCCK or CMYK). libpr0n requests image in CMYK color space, for > YCCK the libjpeg-turbo returns normal CMYK, but for CMYK libjpeg-turbo > returns inverted CMYK. The patch shall not effect existing images in the reftests.
Attachment #558096 - Attachment is obsolete: true
Attachment #558113 - Flags: feedback?(sbarman)
From http://www.jpegcameras.com/libjpeg/libjpeg-3.html: "... it appears that Adobe Photoshop writes inverted data in CMYK JPEG files: 0 represents 100% ink coverage, rather than 0% ink as you'd expect. This is arguably a bug in Photoshop, but if you need to work with Photoshop CMYK files, you will have to deal with it in your application. We cannot "fix" this in the library by inverting the data during the CMYK<=>YCCK transform, because that would break other applications, notably Ghostscript. Photoshop versions prior to 3.0 write EPS files containing JPEG-encoded CMYK data in the same inverted-YCCK representation used in bare JPEG files, but the surrounding PostScript code performs an inversion using the PS image operator. I am told that Photoshop 3.0 will write uninverted YCCK in EPS/JPEG files, and will omit the PS-level inversion. (But the data polarity used in bare JPEG files will not change in 3.0.) In either case, the JPEG library must not invert the data itself, or else Ghostscript would read these EPS files incorrectly."
A couple of quick questions (although I admit I am a bit confused at this point, so correct if I am wrong). Also, I am assuming this whole inversion is because adobe photoshop writes cmyk images with their components inverted. 1)Do the inverted photoshop images contain the Adobe APP14 marker segment? 2)Images embedded in pdfs are not inverted. This patch seems to only not invert images encoded in YCCK colorspace (which is not the same behaviour at the poppler code). Won't embedded jpeg images encoded in CMYK be inverted still?
So far I found out that: 1. Almost every article refers CMYK JPEG images as Adobe JPEGs (never saw non-Adobe CMYK images); 2. Those images where saved by older Photoshop's and those images have the marker, and that marker has color transform code (0 or 2); 3. The CMYK images may have additional transform to YCCK, which indicated by the color transform code = 2, otherwise it's CMYK; 4. for color transform = 0, libjpeg outputs decoded CMYK data as is without changes, but Adobe intended to use that data as inverted CMYK; 5. for color transform = 2, libjpeg outputs the CMYK data calculated based on YCCK values (see Adobe's tech. note #5116), which produces normal CMYK; The image you attached has color transform code = 2, the existing images in the Mozilla's reftest color transform code = 0. (In reply to Shaon Barman from comment #25) > 1)Do the inverted photoshop images contain the Adobe APP14 marker segment? Yes. > 2)Images embedded in pdfs are not inverted. This patch seems to only not > invert images encoded in YCCK colorspace (which is not the same behaviour at > the poppler code). Won't embedded jpeg images encoded in CMYK be inverted > still? I only have the PDFs that have YCCK encoded (with non-inverted CMYK). I don't know if poppler handles CMYK images w/o YCCK transform properly.
Just got a hold of the Photoshop for a second: the image in the attachment was opened as inverted one (as in current Firefox). That means that the attachment 548849 [details] is invalid Adobe JPEG file or test. (In reply to Yury (:yury) from comment #24) > Photoshop versions prior to 3.0 write EPS files containing JPEG-encoded CMYK > data in the same inverted-YCCK representation used in bare JPEG files, but > the surrounding PostScript code performs an inversion using the PS image > operator. I am told that Photoshop 3.0 will write uninverted YCCK in > EPS/JPEG files, and will omit the PS-level inversion. (But the data polarity > used in bare JPEG files will not change in 3.0.) The file above was extracted from the PDF, so it's not a "bare" JPEG. From a user point of view, the standalone image must be displayed as it displayed by the application/editor that created it. I recommend to WONTFIX this ticket, unless we want to benefit pdf.js by assuming nobody will ever use JPEG YCCK encoded image.
This patch allows FF to use normal CMYK data when 'EMBED' marker is found. This marker is not yet introduced application data segment that will have the following data: APP12 'EMBED\x00' (in hex: FF EC 00 08 45 4D 42 45 44 00) The pdf.js (or similar projects) can insert this information to indicate that JPEG resource originated from the embedded data.
Attachment #559697 - Flags: feedback?(sbarman)
(In reply to Yury (:yury) from comment #11) > That's consistent with the > https://bugzilla.mozilla.org/show_bug.cgi?id=44781#c70 statements. Forward-looking comments ftw! :) Anyway, I do hope the final patch for this bug includes new reftests for the kinds of images we're not currently handling properly.
Embedded CMYK and YCCK images are encoded using normal CMYK values (vs inverted in regular files).
Attachment #559697 - Attachment is obsolete: true
Attachment #559697 - Flags: feedback?(sbarman)
Attachment #563624 - Flags: feedback?(sbarman)
The reporter is unavailable for feedback or slow to respond. Pushing for review.
Attachment #558113 - Attachment is obsolete: true
Attachment #559947 - Attachment is obsolete: true
Attachment #563624 - Attachment is obsolete: true
Attachment #558113 - Flags: feedback?(sbarman)
Attachment #563624 - Flags: feedback?(sbarman)
Attachment #567361 - Flags: review?(joe)
(In reply to Yury (:yury) from comment #33) > Created attachment 567361 [details] [diff] [review] [diff] [details] [review] > Changing the libpr0n for normal CMYK colors when EMBED marker found > > The reporter is unavailable for feedback or slow to respond. Pushing for > review. The changes look good. I think added the extra marker should fix this issue for firefox, but whats the story with compatibility with other browsers? Alos, will there be any other projects besides pdf.js which uses this tag?
We can file a webkit bug and see what they think. I definitely think we should take this patch. Joe is probably good for reviewing this.
Comment on attachment 567361 [details] [diff] [review] Changing the libpr0n for normal CMYK colors when EMBED marker found Review of attachment 567361 [details] [diff] [review]: ----------------------------------------------------------------- r=me with nits fixed ::: modules/libpr0n/decoders/nsJPEGDecoder.cpp @@ +100,5 @@ > > return profile; > } > > +#define JPEG_EMBED_MARKER (JPEG_APP0 + 12) Is this also defined in tech note 5116? (I can't find that tech note on the internet.) @@ +107,5 @@ > + * Introducing the EMBED marker for the Adobe's CMYK images. There is > + * an issue when the image with YCCK components is saved as a standalone one > + * the CMYK value are inverted, however when the image in saved within > + * the file (e.g. PDF or EPS) it contains normal values (as described in > + * the vendor's technote #5116). A tweak of the text here: Check for the EMBED marker for Adobe's CMYK images. When Adobe products save an image with YCCK components standalone, the CMYK values are inverted; however, when an image is embedded within a PDF or EPS, it contains normal values. This is described in Adobe tech note #5116. @@ +109,5 @@ > + * the CMYK value are inverted, however when the image in saved within > + * the file (e.g. PDF or EPS) it contains normal values (as described in > + * the vendor's technote #5116). > + */ > +static bool IsEmbedMarkerPresent(struct jpeg_decompress_struct *info) The indentation in this function looks sort of weird. In this file, it should be two spaces. @@ +112,5 @@ > + */ > +static bool IsEmbedMarkerPresent(struct jpeg_decompress_struct *info) > +{ > + jpeg_saved_marker_ptr marker; > + // Looking for the APP12 marker that has only 'EMBED\x00' in its data. Is using \x00 more canonical in JPEG contexts? If not, just use \0. @@ +656,5 @@ > if (mInfo.out_color_space == JCS_CMYK) { > /* Convert from CMYK to RGB */ > /* We cannot convert directly to Cairo, as the CMSRGBTransform may wants to do a RGB transform... */ > /* Would be better to have platform CMSenabled transformation from CMYK to (A)RGB... */ > + /* When EMDEB marker is present, the conversion produces the normal (not inverted) CMYK values. */ Typo: EMDEB instead of EMBED. @@ +1208,2 @@ > /* > + * Input is CMYK stored as 4 bytes per pixel. Leave this as saying (Inverted) CMYK, and add a similar comment above cmyk_convert_rgb.
Attachment #567361 - Flags: review?(joe) → review+
(In reply to Joe Drew (:JOEDREW!) from comment #36) > > +#define JPEG_EMBED_MARKER (JPEG_APP0 + 12) > > Is this also defined in tech note 5116? (I can't find that tech note on the > internet.) The EMBED marker or its ID is not defined in the tech note. Based on comment 24 quote, the Photoshop does not save JPEG data according tech note #5116, but Acrobat does. Here, in this bug, we are introducing 'EMBED' marker that helps to differentiate those two formats. Web applications or page scripts may easily add that marker to the data to take advantage of the "right" format. We probably need to add documentation or wiki about that. Does this change things?
The decoder was moved from the /modules/libpr0n to /images
Attachment #567361 - Attachment is obsolete: true
Attachment #567643 - Attachment is obsolete: true
Please could you tweak your hgrc to automatically add author info (guide here: http://blog.bonardo.net/2010/06/22/so-youre-about-to-use-checkin-needed) + add a commit message when attaching patches, since it makes pushing half a dozen checkin-neededs a lot easier. Thanks :-)
Status: NEW → ASSIGNED
Flags: in-testsuite+
Keywords: checkin-needed
Target Milestone: --- → mozilla10
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Crap. I wasn't around to see comment 37. I think, rather than creating an EMBED tag, we should create a tag that says the CMYK values are inverted from the norm. I'm going to back this out from mozilla-central (and aurora) and have us come up with a better solution.
Attached patch backoutSplinter Review
This patch will back out something that isn't ready for Aurora yet. We'll re-land on m-c once it's ready.
Attachment #572875 - Flags: approval-mozilla-aurora?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Can we get a concrete actionable suggestion what the tag should be called?
I'd be interested in what DRC says. But something like NormalCMYK would be a good start.
Comment on attachment 572875 [details] [diff] [review] backout Approved but it seems like this was already backed out (?).
Attachment #572875 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Hasn't been backed out of mozilla-aurora yet, but has been backed out of mozilla-central. This is just waiting on the mozilla-aurora migration to be completed so it can be pushed.
Attachment #572201 - Flags: checkin-
Attachment #572875 - Flags: checkin+
CCing DRC per comment 48
Please reopen this bug once we've figured out what we want to do here.
Status: REOPENED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: