Closed Bug 486918 Opened 16 years ago Closed 12 years ago

Add an option to use high-quality downscaling for images

Categories

(Core :: Graphics: ImageLib, defect)

Type: defect
Priority: Not set
Severity: major

Tracking

Status: VERIFIED FIXED
Target Milestone: mozilla18

Tracking Status
firefox18 --- disabled

People

(Reporter: Virtual, Assigned: joe)

References

(Depends on 5 open bugs, Blocks 1 open bug)

Details

(Keywords: feature, nightly-community, Whiteboard: [parity-WebKit][parity-Chrome][parity-Safari])

Attachments

(13 files, 17 obsolete files)

1.02 KB, text/html
1.77 MB, image/png
106.76 KB, image/png
1.42 MB, image/png
272 bytes, text/html
22.48 KB, patch (justin.lebar+bug: review+)
26.07 KB, patch (justin.lebar+bug: review+)
772 bytes, text/html
464.79 KB, image/jpeg
6.00 KB, patch (justin.lebar+bug: review+)
75.52 KB, patch (jrmuizel: review+)
22.50 KB, patch (joe: review+)
28.56 KB, patch (jrmuizel: review+)
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-PL; rv:1.9.2a1pre) Gecko/20090404 Minefield/3.6a1pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-PL; rv:1.9.2a1pre) Gecko/20090404 Minefield/3.6a1pre

Awful image quality when downscaling...

Modern browsers should use Lanczos, or at least bicubic, resampling for scaling images...

Not nearest-neighbour or bilinear, which produce poor quality...

Reproducible: Always

Steps to Reproduce:
1. Go to http://bj0rn.animeblogger.net/
2. Look at some of the images on the main page
3. The scaling looks terrible
Actual Results:  
The downscaled images look awful...

Expected Results:  
I want to see the same high-quality image downscaling as in Safari 4 Public Beta 528.16, Internet Explorer 8 or Google Chrome 2.0.171.0 (although Chrome only resamples images once they are fully downloaded, which looks bad on slow connections...)

In Mozilla Firefox 3.6a1pre (and even 3.0.8) the page looks like this:
http://img6.imageshack.us/img6/1299/fx36a1pre.png

Opera 10.00.1355 Alpha has this bug too, but it looks a bit better: some lines are antialiased, though not well enough to hide the artifacts...
http://img17.imageshack.us/img17/9898/opera10.png

In Safari 4 Public Beta 528.16, Internet Explorer 8 and Google Chrome 2.0.171.0 the site looks the way it should; the lines are antialiased...
http://img510.imageshack.us/img510/702/safari.png
http://img12.imageshack.us/img12/6109/ie8d.png
http://img27.imageshack.us/img27/7941/chromeq.png
Component: General → GFX: Thebes
Product: Firefox → Core
QA Contact: general → thebes
Just a warning for other people: These images might not be safe for work
See bug 372462, although there hasn't been a lot of movement since Firefox 3.0 switched to bilinear for downscaling.
The good news for you is that this is on the post-Firefox 3.5 to-do list: https://wiki.mozilla.org/Platform/2009-Q2-Goals#GFX
@ Jo Hermans - I don't think so...
It looks like nearest-neighbour scaling, or no scaling filter at all...

@ Robert Longson - good to hear...
but it would be nice if the Firefox team added Lanczos, or at least bicubic, resampling for images...
I believe Jeff has an in-progress patch or two implementing a box filter in pixman/cairo.
Status: UNCONFIRMED → NEW
Ever confirmed: true
@ Joe Drew - But this doesn't only concern the box filter... Normal zooming out of images is affected too...

IMO we should implement Lanczos or at least a bicubic method for scaling images...
And then zooming out and in will be perfect for ages... ;)
(In reply to comment #6)
> IMO we should implement Lanczos or at least a bicubic method for scaling
> images...

Patches accepted :)

The relevant file is in pixman, in pixman-transformed.c. If you need help, going on irc.freenode.net channel #cairo would be a good first step.
tracking-fennec: --- → ?
Flags: wanted1.9.2?
Flags: wanted1.9.1?
Flags: wanted1.8.1.x?
Flags: wanted-fennec1.0?
Flags: blocking1.9.2?
Flags: blocking1.9.1?
Flags: blocking1.9.0.9?
Flags: blocking1.9.0.10?
Flags: blocking1.8.1.next?
Please don't play with flags.
Flags: wanted1.9.2?
Flags: wanted1.9.1?
Flags: wanted1.8.1.x?
Flags: wanted-fennec1.0?
Flags: blocking1.9.2?
Flags: blocking1.9.1?
Flags: blocking1.9.0.9?
Flags: blocking1.9.0.10?
Flags: blocking1.8.1.next?
tracking-fennec: ? → ---
Blocks: 372462
IE 8 truly has great image downscaling quality compared to Firefox. I hope the quality of Firefox's downscaling algorithm improves significantly in subsequent versions. I was truly surprised to see how bad it is currently.
Yep...

To my eye the best downscaling quality is in Chromium, then IE8 and Safari 4 with only very minor differences from IE8... in this group you can probably only see a difference by zooming in with e.g. XnView or IrfanView with no scaling filter...
The second group is Opera alone, and the last, unfortunately, is Mozilla Firefox...

IMO we should implement Lanczos resampling for image scaling and we will beat all the other browsers :D
Jeff is working on great downscaling, including a Lanczos filter, but it will not land in 1.9.2.
Flags: wanted1.9.2?
Attached file testcase
(In reply to comment #11)
> Jeff is working on great downscaling, including a Lanczos filter, but it will
> not land in 1.9.2.

Then any chance of getting this landed in 1.9.3, maybe? :)
And how is the work going?
blocking2.0: --- → ?
Other things have come up; this probably will not make it for 1.9.3 either.
blocking2.0: ? → -
(In reply to comment #15)
> Other things have come up; this probably will not make it for 1.9.3 either.

That's very, very unfortunate. The image quality in Firefox is very poor.
See for example this comparison between FF 3.6.3 and Chrome 5 under Linux (I've added the attachment).
Have you noticed how horrible the lettering on "The Annotated Alice" cover is in Firefox?
In Chrome it's much better!
(In reply to comment #15)
> Other things have come up; this probably will not make it for 1.9.3 either.

Implementing Lanczos scaling would IMO take about 1-3 days of hard work (at least it did in other programs like FastStone Image Viewer),
and it's not that time-consuming, but the final result would be awesome and the best compared to the other browsers.

So can we reconsider this as wanted for Mozilla Firefox 4.0?
blocking2.0: - → ?
(In reply to comment #18)
> (In reply to comment #15)
> > Other things have come up; this probably will not make it for 1.9.3 either.
> 
> Implementing Lanczos scaling would IMO take about 1-3 days of hard work (at
> least it did in other programs like FastStone Image Viewer),
> and it's not that time-consuming, but the final result would be awesome and
> the best compared to the other browsers.

Implementing the scaling isn't the hard part since I've already written a Lanczos scaler. The hard part is deciding how/if the scaled images should be cached, if the scaler should be part of cairo, and what to do about the negative performance impact especially compared to doing scaling on the GPU with Direct2D.
I'm trying to be quite strict with what we block on. Blocking means 'we will not release Firefox (whatever-version) without this fix,' and while I'd love to see this fixed, I can't in good conscience block the release on it.
blocking2.0: ? → -
(may I suggest a "wanted" for status1.9.3?)
(In reply to comment #20)
Yep, but I said that I wanted it only as WANTED (though I can't set it like that) :P

(In reply to comment #19)
As far as I remember, FastPictureViewer supports GPU acceleration for displaying images ;p
But how much time would it take to fully do this? Two weeks, a month, or more?
Summary: Awful image quality in downscaling → Awful image quality in image scaling
Ubuntu Linux Firefox image scaling is terrible on my laptop. It's fine with Chromium. I'm using 2 browsers because of this bug. Here is a good bug report from Ubuntu Launchpad:

"FFe: Pixellated Images in Firefox/Opera due to incorrect EXTEND_PAD implementation in several video drivers."


[Problem]
Upscaled images in Firefox (and Opera) look pixelated when zoomed, edges appear jagged.

[Discussion]
This is because firefox is using nearest-neighbor interpolation for upscaling. It would look better if bilinear filtering were used by Cairo, which requires EXTEND_PAD. However, EXTEND_PAD is not implemented very well in several video drivers, and Cairo is unable to distinguish drivers that have good implementations from ones with bad ones, so it is currently using a client-side fall back which is deemed too slow by the firefox developers.

Solving this requires updating each video driver to either implement EXTEND_PAD correctly or at least stop advertising it can do it when it really can't. Once this is done, cairo's client-side workaround can be removed and firefox can be updated to use EXTEND_PAD.

The proposed fixes are available (for jaunty and karmic) in the firefox-smooth-scaling PPA:
https://launchpad.net/~firefox-smooth-scaling/+archive/ppa
This is exasperating. DO IT. JUST DO IT. Do you need numerical recipes or what? And DO IT NOW, today, instead of all this bla bla bla. Are you professionals or what?
Please read
 
  https://bugzilla.mozilla.org/page.cgi?id=etiquette.html

especially points 1.1, 1.2 and 1.3
You must appreciate that Firefox will only survive if it continues to deliver excellence. 1 year and 7 months of NOT FIXING a problem for which I could write an implementation in 1 day or less. I simply do not get it. Hence the tone. It seems that at times this is the only way to make people reflect. You decide if Firefox should survive or not. It's up to you!
(In reply to comment #28)
>I could write an implementation in 1 day or less.

please do...
(In reply to comment #25)
> [Problem]
> Upscaled images in Firefox (and Opera) look pixelated when zoomed, edges appear
> jagged.

That has already been fixed in bug 422179. This bug is a separate but related one, about downscaling, not upscaling.
I have disabled the Bugzilla account of suharizen@gmail.com, pending a change in attitude.

Gerv
Are there any plans to implement this in a future release after Firefox 4?
Because the downscaling quality is the worst compared to all other browsers...
It appears that FF4b8 got some improvements; CTRL-+ enlarged images don't look completely ugly any more.
Yep, but downscaling still lacks quality, as you can see in the testcase.
Zoom quality could also be improved by using Lanczos or bicubic resampling (like Chromium does).
Whiteboard: [wanted2.1?] → [wanted2.1?][parity-Chrome][parity-Safari]
Also [parity-IE], for what it's worth.
Only partially, because IE doesn't support background-size, judging from the testcase. But it still has superior quality compared to Firefox.
Whiteboard: [wanted2.1?][parity-Chrome][parity-Safari] → [wanted2.1?][parity-WebKit][parity-Chrome][parity-Safari]
Sorry for the duplicate. I also hope Mozilla fixes this "bug"; 3 years is too much time for a browser like Firefox.
Sorry again.
Too bad this has still not been fixed in Firefox 4b12 :(
Check the result list here; the video previews are downscaled and look horrible in FF4b12: http://online.nolife-tv.com/index.php

Actually I noticed this because I came back from Chrome to test FF4b11. Chrome doesn't have this issue.
>Too bad this has still not been fixed in Firefox 4b12 :(
Of course not; the bug status would be "fixed" instead of "new" if this bug had been fixed.
Whiteboard: [wanted2.1?][parity-WebKit][parity-Chrome][parity-Safari] → [parity-WebKit][parity-Chrome][parity-Safari]
@ Jeff - what's the progress on your Lanczos scaler?

Now that we have hardware acceleration, we could use it to speed up scaling, e.g. like the scaler shaders in MPC-HC.
Any idea how to work around this bug? I downscaled images for my shop and in Firefox they look awful; they don't have enough quality.

If someone has a clue how to work around this, please let me know.
(In reply to comment #41)
> @ Jeff - what's the progress on your Lanczos scaler?

The scaler has been done for a long time. It's the integration work that is hard and needs doing.
Thanks for the info! I appreciate it, because with this patch we will have awesome quality.
Do you have any plans for when this could be finalized, e.g. Firefox 8/9, so early 2012? Or do you guys have other work to do for now?

@ hugo.quinta - use pre-scaled thumbnails
(In reply to comment #44)
> Thanks for the info! I appreciate it, because with this patch we will have
> awesome quality.
> Do you have any plans for when this could be finalized, e.g. Firefox 8/9, so
> early 2012? Or do you guys have other work to do for now?
> 
> @ hugo.quinta - use pre-scaled thumbnails

There's currently nothing planned for this, as we have other higher-priority things to work on.
I would really love to see this issue resolved.

If you want super-fast, hardware-accelerated downscaling with an arbitrary kernel size, you could try summed area tables. DirectX 10+ hardware can calculate the SATs very efficiently with the compute shader, and even without the GPU it would be feasible on a separate thread. Once you have the SATs it is *constant time* to filter, so zooming in and out and resizing the image dynamically would impose only a constant-time re-filtering cost.

But, I'm a video game developer so I'm not sure where the speed vs. quality tradeoff lies with web browsers. Anyway, just an idea :)
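A rough untested sketch of the SAT idea for a single 8-bit channel (illustration only, nothing browser-specific): build the table once, and afterwards any axis-aligned box average costs four lookups, regardless of the box size.

#include <cstdint>
#include <vector>

// Summed area table for one 8-bit channel. sum[y][x] holds the sum of all
// pixels above and to the left of (x, y); row 0 and column 0 stay zero.
struct SummedAreaTable {
    int w, h;
    std::vector<uint64_t> sum;

    SummedAreaTable(const uint8_t* src, int width, int height)
        : w(width), h(height), sum((width + 1) * (height + 1), 0)
    {
        for (int y = 1; y <= h; ++y)
            for (int x = 1; x <= w; ++x)
                sum[y * (w + 1) + x] = src[(y - 1) * w + (x - 1)]
                                     + sum[(y - 1) * (w + 1) + x]
                                     + sum[y * (w + 1) + (x - 1)]
                                     - sum[(y - 1) * (w + 1) + (x - 1)];
    }

    // Average over the box [x0, x1) x [y0, y1), in constant time.
    uint8_t BoxAverage(int x0, int y0, int x1, int y1) const
    {
        uint64_t total = sum[y1 * (w + 1) + x1] - sum[y0 * (w + 1) + x1]
                       - sum[y1 * (w + 1) + x0] + sum[y0 * (w + 1) + x0];
        return (uint8_t)(total / (uint64_t)((x1 - x0) * (y1 - y0)));
    }
};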
I've noticed the image quality as of about the 7a1 nightly is *drastically* improved, going from completely illegible text to on par with the default Android browser. What image filter is being used now?
On Android, we now use bilinear scaling on devices that support NEON.
@CoJaBo check out my examples in bug 667850 - Nightly 7a1 is still nowhere near acceptable.
@Roman My comment was regarding the Mobile version of Firefox 7a1, running on the Motorola Droid X, on which the image matches your reference bilinear scaled image.
The 7a1 Desktop version (running on Linux) shows a modest improvement when zooming in, but zooming out is still identical to 3.6 (i.e., abysmal).
(In reply to comment #49)
> @CoJaBo check out my examples in bug 667850 - Nightly 7a1 is still nowhere
> near acceptable.

You are complaining about downscaling quality with a scaling factor of 2x or more, where a bilinear filter alone is not good enough without some additional tricks. This problem can actually be solved in pixman:
    http://comments.gmane.org/gmane.comp.lib.cairo/20111
    http://comments.gmane.org/gmane.comp.graphics.pixman/326

And if special care is taken, I think it should even be fast enough for scaling images in real time.
(In reply to comment #11)
> Jeff is working on great downscaling, including a Lanczos filter, but it
> will not land in 1.9.2.

Is there any update on this?
BUMP! I want to know the answer to this question too.
don't do that
https://bugzilla.mozilla.org/page.cgi?id=etiquette.html
I've been a big fan of Firefox over the years, since version 1.0. I have seen everyone around me switch from Firefox to Chrome. But bugs like this one, which is now three years old and still not really resolved, leave me with anger, a very great hatred. Solve this !!!!!!!
(In reply to a7711486 from comment #57)
> I've been a big fan of Firefox over the years, since version 1.0. I have
> seen everyone around me switch from Firefox to Chrome. But bugs like this
> one, which is now three years old and still not really resolved, leave me
> with anger, a very great hatred. Solve this !!!!!!!

This is neither a discussion forum nor a newsgroup.
Please avoid this kind of comment and refer to https://bugzilla.mozilla.org/page.cgi?id=etiquette.html
(In reply to RNicoletto from comment #58)
> (In reply to a7711486 from comment #57)
> > I've been a big fan of Firefox over the years, since version 1.0. I have
> > seen everyone around me switch from Firefox to Chrome. But bugs like this
> > one, which is now three years old and still not really resolved, leave me
> > with anger, a very great hatred. Solve this !!!!!!!
> 
> This is neither a discussion forum nor a newsgroup.
> Please avoid this kind of comment and refer to
> https://bugzilla.mozilla.org/page.cgi?id=etiquette.html

I don't give a damn. I've passed on the message.
I'm not sure if I should report a new bug; however, this problem also applies when upscaling an image. It looks terrible in Firefox (massive blurring) but fantastic in Chrome. The resizing algorithm is obviously bad.

It has been 3 years. Any news?
Hi

As a web designer it's a little hard to sell adaptive/scaling solutions when image downscaling produces blurry images in FF.

Now that there is a definite movement, with many popular HTML/CSS frameworks built to downscale images when the user's viewport is smaller than the maximum resolution, does this make it a more pressing issue and push it higher up the bug-fix list?

Cheers, 
Shaun
Any updates on this?

Resampling looks fine on OSX.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #19)
> Implementing the scaling isn't the hard part since I've already written 
> a Lanczos scaler. The hard part is deciding how/if the scaled images should be
> cached, if the scaler should be part of cairo
Would marking those scaled images with a shorter cache-expiry time than regular ones help?
(In reply to Jeff Muizelaar [:jrmuizel] from comment #19)
> cached, if the scaler should be part of cairo, and what to do about the
> negative performance impact especially compared to doing scaling on the GPU
> with Direct2D.

A bit late to the party, but just a reminder that bicubic and Lanczos can easily be done on the GPU; in fact several video players do this.

I don't know if Direct2D would make that easy, but it's a great opportunity for Azure.
Yesterday I was testing the image downscaling quality of different browsers and ended up wondering why Firefox 8 was achieving such bad results. In the end I found this bug report. Today I updated my Firefox (Ubuntu 11.10) to 9.0.1, took a look at my image scaling testcase and found that the new Firefox version is doing a better job now. It doesn't achieve the quality of Chrome or IE 9, but I would no longer call it 'awful'. Since downscaled images in Firefox 9.0.1 on an OS X machine were already looking good (as Ricardo Tomasi noticed in comment 62), the quality improvement in the Ubuntu version was not unexpected.

What somewhat confuses me is that this bug report is still marked as 'new' and that I cannot find a related issue in the list of fixed bugs in the release notes for Firefox 9.0 or Firefox 9.0.1 (see e.g. http://www.mozilla.org/en-US/firefox/9.0.1/releasenotes/buglist.html).

Can anyone confirm that Firefox (Linux, Windows) has, at least partially, made some improvements to image scaling? Can anyone explain how this happened?
@Ivan Ruiz Gallego

Nope. I cannot confirm this improvement. I compared (one more time) the image quality between Firefox 9.0.1 and Chrome. Unfortunately the Firefox result is still awful.
It's really a shame...
QImage->transformed(scaleMatrix, Qt::SmoothTransformation) does the right thing.
None of cairo's filters scale with the required quality.

Chromium does fast (bad) scaling on the first iteration, and good-quality scaling when the zoom is finished. Can we do the same, even with a hacky implementation?
A quick fix could probably be done along the following path:
1) while zooming (Ctrl+wheel) we scale with cairo
2) when zooming is finished, we go through the downscaled images (layers) and pre-downscale the layer content with an extra function, in such a way that cairo would not need to scale anything after that.
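A rough untested sketch of the debounce part of this idea in plain C++ (no Gecko specifics, names made up):

#include <chrono>

// Track when the zoom last changed; draw with the cheap cairo filter every
// time, and only start the expensive pre-downscale once the scale has been
// stable for ~500 ms (the value here is arbitrary).
struct ZoomDebounce {
    std::chrono::steady_clock::time_point lastChange = std::chrono::steady_clock::now();
    double lastScale = 1.0;

    // Called on every draw; returns true when it is time to kick off the
    // high-quality pre-downscale for the current scale.
    bool ReadyForHighQuality(double scale)
    {
        auto now = std::chrono::steady_clock::now();
        if (scale != lastScale) {
            lastScale = scale;
            lastChange = now;   // still zooming, keep using the fast path
            return false;
        }
        return now - lastChange >= std::chrono::milliseconds(500);
    }
};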
I don't have a big issue with doing two separate types of scaling, especially if we can make it the same cross-platform. For example, Chrome apparently does bilinear first, and then Lanczos later.
There is a guy who wants to pick up bug 517294 comment 15; he wants some help getting started.
Reddit question on the topic:

http://www.reddit.com/r/firefox/comments/p6kda/is_there_an_extension_or_workaround_for_firefoxs/
Fix this Mozilla.
Rather than commenting with numerous "me too"s, consider using the "vote for this bug" feature, that's what it's there for; or better yet, find someone with the time/motivation to work on implementing a patch.
(IMVHO, this ought to be made a blocker - Firefox is almost IE6-esque in how far it has fallen behind in this area)
(In reply to CoJaBo from comment #73)
> Rather than commenting with numerous "me too"s, consider using the "vote for
> this bug" feature, that's what it's there for; or better yet, find someone
> with the time/motivation to work on implementing a patch.
> (IMVHO, this ought to be made a blocker - Firefox is almost IE6-esque in how
> far it has fallen behind in this area)

I did vote on it.
Then a comment is unnecessary - it just adds clutter to the discussion without furthering progress on the issue. This is not going to be fixed until someone actually steps up to take it. I would take a crack at it myself, but I do not have much experience working on the source, and do not currently have the free time to get up to speed with it.
Perhaps check out the forums or IRC rooms (#firefox on mozilla.org) if you want to discuss the necessity of fixing it, and perhaps find someone willing to commit to working on it.
I'd just like to comment that this also affects the quality of thumbnails on the new tab page, and it's very noticeable and jarring.


I don't know the details, but reading about this bug around the net, people are saying that even the bilinear filtering Mozilla itself says it uses is broken and looks more like nearest neighbour.

https://developer.mozilla.org/En/CSS/Image-rendering

I'm no expert at this, but this effect should be much more pronounced on Fennec, where the screen is already very small compared to desktops and it's not unusual to downscale an image several times larger than the screen, which would make it a bigger mess than on regular monitors.

From my limited testing, Firefox seems to upscale images much better than Chrome, so maybe you can take it from there?
Attached image Upscaling comparison
> I don't know the details, but reading about this bug around the net, people
> are saying that even the bilinear filtering Mozilla itself says it uses is
> broken and looks more like nearest neighbour.

For magnification factors under 0.5, the 2x2 bilinear filter kernel will fail to sample all the source picture's pixels, producing very jarring aliasing.

Chrome uses a 4x4 bicubic filter IIRC, and on top of that does a pre-minification step where the picture is downsized to half its size as many times as necessary to keep it within the good working range of the resizer. This improves both performance and quality.
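A rough untested sketch of that pre-minification step for a single grayscale channel (illustration only, not Chrome's actual code):

#include <cstdint>
#include <vector>

// 2x2 box-average halving of one grayscale channel (odd trailing rows and
// columns are simply dropped in this sketch).
static std::vector<uint8_t> HalveGray(const std::vector<uint8_t>& src, int& w, int& h)
{
    int nw = w / 2, nh = h / 2;
    std::vector<uint8_t> dst(nw * nh);
    for (int y = 0; y < nh; ++y)
        for (int x = 0; x < nw; ++x) {
            int s = src[(2 * y) * w + 2 * x]     + src[(2 * y) * w + 2 * x + 1]
                  + src[(2 * y + 1) * w + 2 * x] + src[(2 * y + 1) * w + 2 * x + 1];
            dst[y * nw + x] = (uint8_t)((s + 2) / 4);   // rounded average
        }
    w = nw;
    h = nh;
    return dst;
}

// Keep halving while the image is still at least twice the target size; the
// final resample to the exact target size (bilinear/bicubic) is left out here.
static std::vector<uint8_t> PreMinify(std::vector<uint8_t> img, int& w, int& h,
                                      int targetW, int targetH)
{
    while (w >= 2 * targetW && h >= 2 * targetH)
        img = HalveGray(img, w, h);
    return img;
}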

> From my limited testing, Firefox seems to upscale images much better than
> Chrome

Not really. It's blurrier and therefore it might hide artifacts in low-quality pictures better, but bicubic is certainly better for high quality ones. See attachment; Chrome is on the right.

Also it's worth noting Chrome switches to bicubic only when an image is quiescent for more than 0.5 seconds or so. During animations, progressive loading, and the like, it uses bilinear to keep performance high, getting the best of both worlds. (Before they implemented this, their performance in the Flying Images test drive benchmark used to be terrible.)
(In reply to A, T from comment #77)
> I'd just like to comment that this also affects the quality of thumbnails in
> the new tab page, and it's very noticeable and jarring. 

bug 723050
(In reply to Oleg Romashin (:romaxa) from comment #68)
> A quick fix could probably be done along the following path:
> 1) while zooming (Ctrl+wheel) we scale with cairo
> 2) when zooming is finished, we go through the downscaled images (layers) and
> pre-downscale the layer content with an extra function, in such a way that
> cairo would not need to scale anything after that.

OK, I have an initial patch for this idea. So far it's Qt-only and doesn't work for animated images. Please check it out. How does it match your expectations?
Attachment #594858 - Flags: feedback?(romaxa)
Comment on attachment 594858 [details] [diff] [review]
patch WIP1: pre-downscale with Qt::SmoothTransformation

 			imgFrame.cpp \
+                        SmoothScaling.cpp \
 			imgLoader.cpp    \

It would be nice to actually add the SmoothScaling.cpp file to the patch.

The 0.5 should be defined as a named constant somewhere, and the same goes for the 500 ms.
Attachment #595091 - Flags: feedback?(romaxa)
Attachment #594858 - Attachment is obsolete: true
Attachment #594858 - Flags: feedback?(romaxa)
Comment on attachment 595091 [details] [diff] [review]
patch WIP2: pre-downscale with Qt::SmoothTransformation

I'm OK with this solution because it implements exactly what I was proposing.
I think joe should give feedback on it.
Attachment #595091 - Flags: feedback?(romaxa) → feedback?(joe)
Comment on attachment 595091 [details] [diff] [review]
patch WIP2: pre-downscale with Qt::SmoothTransformation

This looks reasonable, but I would prefer we have a single worker (like we do for discards, and like we will for decodes once bug 715308 lands) that does some small amount of work, say 5 ms, and then reposts itself to the event queue.

It's not pre-downscaling, though, right? It's post-downscaling?

Since you aren't doing this on animated images, there's no reason to keep track of frame numbers; you can just use frame 0.

You could use nsAutoPtr in a lot of places, rather than raw pointers, to avoid manually deleting.
Attachment #595091 - Flags: feedback?(joe) → feedback+
OK, here is a platform/toolkit-independent version.
Attachment #597224 - Attachment description: patch WIP3: pre-downscale with Qt::SmoothTransformation → patch WIP3: pre-downscale with SmoothTransformation
This might sound silly, but I am not able to find documentation on how to apply a patch in hg; hg import just fails.
Could someone give me a hint?
Your best bet is to use patch -p1 < /path/to/patchfile. You could also look up mercurial queues.
Thank you so much.
I downloaded the try build, and I have to say it is a lot better, but here are some questions:

I have a userscript that can pop up an image and resize it. When the image is downscaled on the page, the resampler does a good job, but after I pop the image up and then resize it, it won't be anti-aliased again, which Chrome does.
Now I understand why: if there are Image instances with the same src but different sizes, the anti-aliasing stops working.
Here is another catch:
1) open a page with scaled images, then switch to another tab,
2) wait until the image is discarded,
3) switch back to the tab; the image will be shown at its natural size.
Forgot to mention: all the observations I posted above are on Win7 64-bit with the 32-bit try build.
With a careful naked-eye comparison, Chrome still does better, but Firefox is really close now.

This is awesome, man. Hope to see this in Nightly soon.
After testing the try builds, I can say that this is a splendid job; the image quality is greatly improved.

There are a few issues I spotted that #92 may have missed:
When resizing the browser window, each time it was resized either up or down, the smooth-scaling filter needed to be re-applied, and several times the image brightness went up for whatever reason; the brightness went back to normal after resizing the window again.

In addition, the browser crashed 2 times out of roughly 16 window resizes.

Trunk 20120214, a completely fresh profile.
I am also seeing images increase in brightness while resizing in the try build. It doesn't seem to happen to all images, or at least the increase isn't noticeable on all images, mainly on ones that were downscaled by a fair amount.

Also, when I loaded a large image and downscaled it, if I went to another tab and then came back to the image, it would be full size but only a chunk of the image would be there; resizing seemed to fix that issue as well, just like it fixed the brightness bug.
Is the current patch supposed to work for upscaling? I've just tried an image upscaled using a userscript; the upscaled images are exactly the same as in Firefox 10.0.1 - still awful. Using Ctrl+mouse scroll gives the same result. I used the try win32 builds, btw.
Attached image Triggering the brightness bug (obsolete)
Assignee: nobody → tanya.meshkova
(In reply to Joe Drew (:JOEDREW!) from comment #84)

> This looks reasonable, but I would prefer we have a single worker (like we
> do for discards, and like we will for decodes once bug 715308 lands) that
> does some small amount of work, say 5 ms, and then reposts itself to the
> event queue.

Hey Joe, is it the case that it only makes sense to use a worker instead of timers if we can scale portions of the image data?
Regarding the scaling method itself, it looks like SmoothScaling is not good enough. I'm going to check out the Chromium one.
(In reply to Tatiana Meshkova (:tatiana) [at ELC 15.02-17.02.2012] from comment #98)
> (In reply to Joe Drew (:JOEDREW!) from comment #84)
> 
> > This looks reasonable, but I would prefer we have a single worker (like we
> > do for discards, and like we will for decodes once bug 715308 lands) that
> > does some small amount of work, say 5 ms, and then reposts itself to the
> > event queue.
> 
> Hey Joe, is it the case that it only makes sense to use a worker instead of
> timers if we can scale portions of the image data?

I'm really looking to avoid event storms. It's OK (at least initially) if we can only scale full images at a time; we can still measure how long that takes and return to the event loop after we've done N ms of work.
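A rough untested sketch of that pattern in plain C++ (not the actual DecodeWorker code; names made up):

#include <chrono>
#include <deque>
#include <functional>

// A worker holding a queue of requests: it runs requests until its time
// budget is used up and tells the caller whether it needs to be re-posted
// to the event loop.
struct ChunkedWorker {
    std::deque<std::function<void()>> mRequests;

    // Returns true if there is still work left (i.e. re-post this worker).
    bool RunSome(std::chrono::milliseconds aBudget)
    {
        auto start = std::chrono::steady_clock::now();
        while (!mRequests.empty()) {
            mRequests.front()();
            mRequests.pop_front();
            if (std::chrono::steady_clock::now() - start >= aBudget)
                break;
        }
        return !mRequests.empty();
    }
};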
IIUC a single image scale operation may take much more than 5 ms (20/40/50, depending on the image); with the attached testcase it is about 50 ms.
In that case, this high-quality scaling should be run on its own worker thread, never blocking the main thread. That's a lot more work, but it's probably necessary to land this.
Here is the first version with the worker running in its own thread.
Attachment #595091 - Attachment is obsolete: true
Attachment #597224 - Attachment is obsolete: true
Exciting! I'll take a look in the morning & give you feedback.
After some testing with the new builds here (specifically the Windows 32-bit and Linux 32-bit ones):

https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tanya.meshkova@gmail.com-dcd46e0d7e2a/

As #94, #95 and #97 said, the brightness bug still persists, though only when hardware acceleration is enabled on Windows. Since I was testing the Linux build through a VM, hardware acceleration was disabled there and I didn't encounter the brightness bug.
Turning on HWA and resizing the window horizontally over the image triggers the bug really quickly, for some reason much faster than resizing the window vertically.

No restart was required for enabling HWA.
Comment on attachment 601481 [details] [diff] [review]
patch WIP4: Pre-downscale with Imlib smooth scale. ScaleWorker in its own thread.

Review of attachment 601481 [details] [diff] [review]:
-----------------------------------------------------------------

This looks awesome, and I'm very excited!

Be sure to add comments above non-obvious things, like locking and unlocking the imgFrame, as to why you're doing it where you're doing it etc.

Why the ImageLogging.h include location changes?

::: image/src/RasterImage.cpp
@@ +182,5 @@
>  namespace image {
>  
>  /* static */ nsRefPtr<RasterImage::DecodeWorker> RasterImage::DecodeWorker::sSingleton;
> +/* static */ nsRefPtr<RasterImage::ScaleWorker> RasterImage::ScaleWorker::sSingleton;
> +static nsIThread* sScaleWorkerThread = nsnull;

This should be a nsCOMPtr, so ClearOnShutdown assigning null to it frees it. (This will create fake leak reports in our debug tests.)

@@ +2601,5 @@
> +    }
> +
> +    mRequestsMutex.Unlock();
> +    if (request->done) {
> +      NS_DispatchToMainThread(this, NS_DISPATCH_SYNC);

It is sort of ugly that this function does two different things depending on what thread it's called on. I'd prefer the refresh be a separate event with a separate function.

Also, why do we do this synchronously? We could just fire off "refresh" events asynchronously and let Gecko get to them when it gets to them. That'd let us continue scaling on this thread if necessary.

@@ +2607,5 @@
> +
> +    return NS_OK;
> +  }
> +  else {
> +    ScaleRequest* request = mDrawRequests.popFirst();

This will need to be locked if we don't call the refresh event synchronously.

@@ +2723,5 @@
> +          || scale.height < PRE_DOWNSCALE_MIN_FACTOR))) {
> +    ScaleWorker::Singleton()->mRequestsMutex.Lock();
> +  }
> +  if (mScaleRequest.srcDataLocked && mScaleRequest.srcFrame) {
> +    mScaleRequest.srcFrame->UnlockImageData();

I understand why this unlock and re-lock needs to be done from the main thread, but I'm not enthusiastic about it being inside the Draw() method. I'd like it much more if

1) The "queue a scale event for this frame" code was in a separate function, and
2) The "choose which frame to draw" code was in yet another function.
The patch WIP4 implementation has performance issues, i.e. it does a lot of useless work.
E.g. with the Flying Images test, more than 80% of the pre-scaled imgFrames are useless, and FPS drops to 15.
I guess we have to resurrect the original timer for requesting scales.
Attachment #601481 - Attachment is obsolete: true
Attachment #603064 - Flags: feedback?(joe)
(In reply to Joe Drew (:JOEDREW!) from comment #84)

> It's not pre-downscaling, though, right? It's post-downscaling?
Do you mean from the user's point of view? Well, from cairo's point of view it's PRE-downscaling, right?
(In reply to Joe Drew (:JOEDREW!) from comment #108)
 
> Why the ImageLogging.h include location changes?
I've included Mutex.h and now there is a problem with the include order:
-------8<-------------
ImageLogging.h:46:2: error: #error "Must #include ImageLogging.h before before any IPDL-generated files or other files that #include prlog.h."
-------8<-------------

I believe the WIP5 patch fixes the rest.
btw, where can I find Jeff's Lanczos scaler code?
http://lists.cairographics.org/archives/cairo/2009-June/017495.html seems to be where he submitted it as an RFC to the Cairo mailing list. Not sure about any later revisions.
I can try to get my code up someplace this week.
Just a reminder: if there are two images with the same src but different scales, this patch won't work.
We shouldn't optimize for that.
(In reply to dindog from comment #116)
> Created attachment 603167 [details]
> same src with two different scale
> 
> Just a reminder: if there are two images with the same src but different
> scales, this patch won't work.

Yes, currently the behavior is the same as if the user repeatedly pressed Ctrl+/- on a single <img>: we redraw infinitely. The solution is to cache more than one scaled image frame for each RasterImage.
So this same-source case is one more reason to implement some cache/discard mechanism.
Suggestions? Objections?
Let's keep it simple initially; we can enhance later.
> Let's keep it simple initially; we can enhance later.

Yep, but we still need to solve the problem of recursive/infinite repainting for the first version,
because with the current version we have:
1) Frame A is drawn with Scale1 -> create pre-scaled image1 and send an invalidate
2) Frame A is drawn with Scale2 -> create pre-scaled image2, discard image1 and send an invalidate
3) The invalidate from 2) causes an invalidate for Frame A with Scale1 -> Draw -> go to 1)
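One possible way to break that loop, as an untested sketch (names made up, not from the patch): remember which scale the cached pre-scaled frame was produced for, and skip the scale request and invalidate when nothing changed.

struct ScaleState {
    double cachedScale = 0.0;     // scale of the pre-scaled frame we already have
    bool   scaleInProgress = false;

    // Called from Draw(); returns true only when a new pre-scale is worth queuing.
    bool ShouldRequestScale(double requestedScale)
    {
        if (scaleInProgress || requestedScale == cachedScale)
            return false;         // just draw the cached (or original) frame
        scaleInProgress = true;   // caller queues the scale, then updates cachedScale
        return true;
    }
};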
Here is my latest try:
https://tbpl.mozilla.org/?tree=Try&rev=12b5d934c9b3
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tanya.meshkova@gmail.com-12b5d934c9b3

I've imported the Lanczos scaler from http://lists.cairographics.org/archives/cairo/2009-June/017495.html for testing purposes.

Any idea why the build fails on Windows?
Jeff, do you have a newer version?
Attachment #603064 - Attachment is obsolete: true
Attachment #607313 - Flags: feedback?(jmuizelaar)
Attachment #603064 - Flags: feedback?(joe)
I believe this part is ready for review.
Attachment #607317 - Flags: review?(joe)
(In reply to Tatiana Meshkova (:tatiana) from comment #122)
> Here is my latest try:
> https://tbpl.mozilla.org/?tree=Try&rev=12b5d934c9b3
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tanya.
> meshkova@gmail.com-12b5d934c9b3
> 
> I've imported the Lanczos scaler from
> http://lists.cairographics.org/archives/cairo/2009-June/017495.html for
> testing purposes.
> 
> Any idea why the build fails on Windows?
> Jeff, do you have a newer version?

Looks like a problem with assert() on Windows.
Comment on attachment 607317 [details] [diff] [review]
Part 2: Pre-downscale with downscale_lanczos_filter. ScaleWorker in its own thread. RequestScale on timer shot.

Review of attachment 607317 [details] [diff] [review]:
-----------------------------------------------------------------

I think you forgot to hg add the lanczos code
> 
> I think you forgot to hg add the lanczos code
There are two parts; the lanczos code is attached to Jeff for review ;)
whoops!
Comment on attachment 607317 [details] [diff] [review]
Part 2: Pre-downscale with downscale_lanczos_filter. ScaleWorker in its own thread. RequestScale on timer shot.

Review of attachment 607317 [details] [diff] [review]:
-----------------------------------------------------------------

Looking good! Some minor (and some major-ish) changes needed, but we'll be able to get this in, I think.

Good work!

::: image/src/RasterImage.cpp
@@ +85,5 @@
> +  if (aScaleFactors.width <= 0 || aScaleFactors.height <= 0)
> +    return false;
> +
> +  if (aSrcFrame->GetFormat() != gfxASurface::ImageFormatARGB32
> +      && aSrcFrame->GetFormat() != gfxASurface::ImageFormatRGB24)

&& on previous line, please

@@ +90,5 @@
> +    return false;
> +
> +  nsIntRect srcRect = aSrcFrame->GetRect();
> +  PRUint32 dstWidth = int(srcRect.width * aScaleFactors.width + 0.9999);
> +  PRUint32 dstHeight = int(srcRect.height * aScaleFactors.height + 0.9999);

I think you're trying to round up here - can you use NSToIntRoundUp (or whatever you're trying to do - alternatives are located in nsCoord.h) instead?

@@ +97,5 @@
> +                                gfxASurface::ImageFormatARGB32);
> +  if (!NS_FAILED(rv)) {
> +    PRUint8* srcData;
> +    PRUint32 srcDataLength;
> +    aSrcFrame->GetImageData(&srcData, &srcDataLength);

Can you add a comment here (and an assertion that srcData is non-null) that says aSrcFrame is owned by, and locked/unlocked, on the main thread?

@@ +109,5 @@
> +      (aSrcFrame->GetFormat() == gfxASurface::ImageFormatARGB32)
> +      ? PIXMAN_a8r8g8b8 : PIXMAN_x8r8g8b8;
> +    pixman_image_t* tmp = pixman_image_create_bits(format,
> +                            srcRect.width, srcRect.height,
> +                            (uint32_t*)srcData, aSrcFrame->GetImageBytesPerRow());

Can you line the arguments up vertically, like

foo(asdf,
    asdf,
    asdf)

please

@@ +111,5 @@
> +    pixman_image_t* tmp = pixman_image_create_bits(format,
> +                            srcRect.width, srcRect.height,
> +                            (uint32_t*)srcData, aSrcFrame->GetImageBytesPerRow());
> +    downscale_lanczos_image(tmp, dstWidth, dstHeight, 0, 0, dstWidth, dstHeight,
> +                    (uint32_t*)dstData, aDstFrame->GetImageBytesPerRow());

also here.

@@ +2629,5 @@
> +
> +nsresult
> +RasterImage::ScaleWorker::Run()
> +{
> +  ScaleWorker::Singleton()->mRequestsMutex.Lock();

Can you use mozilla::BaseAutoLock instead of bare calls to Lock()? Please do this throughout.

@@ +2651,5 @@
> +  ScaleRequest* request = &aImg->mScaleRequest;
> +  if (request->isInList())
> +    return;
> +
> +  mScaleRequests.insertBack(request);

Does this not need to be locked?

@@ +2678,5 @@
> +{
> +    ScaleWorker::Singleton()->mRequestsMutex.Lock();
> +    ScaleRequest* request = mDrawRequests.popFirst();
> +    if (request) {
> +      // The request is finished by ScaleWorker. No need to keep data locked.

"ScaleWorker is finished with this request, so we can unlock the data now."

@@ +2688,5 @@
> +        nsIntRect frameRect = request->srcFrame->GetRect();
> +        observer->FrameChanged(nsnull, request->image, &frameRect);
> +      }
> +    }
> +    ScaleWorker::Singleton()->mRequestsMutex.Unlock();

This Unlock could/should probably be above the if block, right?

@@ +2717,5 @@
> +    request->remove();
> +    request->UnlockSourceData();
> +  }
> +  // We have to check if request is finished before dropping the destination
> +  // frame. Otherwise we may be writing to the dest while performing scaling.

What happens if we are performing scaling when Stop is requested? Will dstFrame ever be nulled out?

@@ +2720,5 @@
> +  // We have to check if request is finished before dropping the destination
> +  // frame. Otherwise we may be writing to the dest while performing scaling.
> +  if (request->done) {
> +    request->done = false;
> +    request->dstFrame = 0;

nsnull, please

@@ +2751,5 @@
> +          //
> +          && mLockCount == 1
> +          && !mAnim && mDecoded
> +          && (aScale.width < PRE_DOWNSCALE_MIN_FACTOR
> +              || aScale.height < PRE_DOWNSCALE_MIN_FACTOR);

|| on previous line please

@@ +2760,5 @@
> +                             gfxContext *aContext,
> +                             gfxPattern::GraphicsFilter aFilter,
> +                             const gfxMatrix &aUserSpaceToImageSpace,
> +                             const gfxRect &aFill,
> +                             const nsIntRect &aSubimage)

vertically align parameters here please

::: image/src/RasterImage.h
@@ +499,5 @@
> +  {
> +    ScaleRequest(RasterImage* aImage)
> +      : image(aImage)
> +      , srcFrame(0)
> +      , dstFrame(0)

nsnull for both of these please
Attachment #607317 - Flags: review?(joe) → review-
Attachment #607313 - Attachment is obsolete: true
Attachment #609636 - Flags: review?(joe)
Attachment #607313 - Flags: feedback?(jmuizelaar)
Attachment #609636 - Attachment description: Bug 486918. Part 1: Latest filtering code. Source https://github.com/jrmuizel/pixman-scaler → Part 1: Latest filtering code. Source https://github.com/jrmuizel/pixman-scaler
Attachment #607317 - Attachment is obsolete: true
Attachment #609638 - Flags: review?(joe)
The Windows build is fine now: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tanya.meshkova@gmail.com-6410126ed1f5/
Comment on attachment 609636 [details] [diff] [review]
Part 1: Latest filtering code. Source https://github.com/jrmuizel/pixman-scaler

Review of attachment 609636 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not qualified to review this, unfortunately. I'd say Jeff should review it, except that he wrote it!

Maybe Siarhei would be able to? Or Tim?
Attachment #609636 - Flags: review?(tterribe)
Attachment #609636 - Flags: review?(siarhei.siamashka)
Attachment #609636 - Flags: review?(joe)
@@ +2717,5 @@
> +    request->remove();
> +    request->UnlockSourceData();
> +  }
> +  // We have to check if request is finished before dropping the destination
> +  // frame. Otherwise we may be writing to the dest while performing scaling.

What happens if we are performing scaling when Stop is requested? Will dstFrame ever be nulled out?


This still needs to be answered.
Comment on attachment 609638 [details] [diff] [review]
Part 2: Pre-downscale with downscale_lanczos_filter. ScaleWorker in its own thread. RequestScale on timer shot. (v2)

Review of attachment 609638 [details] [diff] [review]:
-----------------------------------------------------------------

Excellent!

::: image/src/RasterImage.cpp
@@ +2744,5 @@
> +                      gfxSize aScale)
> +{
> +  return (aFilter == gfxPattern::FILTER_GOOD)
> +          && !mAnim && mDecoded
> +          && (aScale.width < PRE_DOWNSCALE_MIN_FACTOR ||

&& on previous line, please.
Attachment #609638 - Flags: review?(joe) → review+
(In reply to Joe Drew (:JOEDREW!) from comment #134)
> @@ +2717,5 @@
> > +    request->remove();
> > +    request->UnlockSourceData();
> > +  }
> > +  // We have to check if request is finished before dropping the destination
> > +  // frame. Otherwise we may be writing to the dest while performing scaling.
> 
> What happens if we are performing scaling when Stop is requested? Will
> dstFrame ever be nulled out?
> 
> 
> This still needs to be answered.

I've fixed this in v2 by calling Stop again once scaling is finished:

+nsresult
+RasterImage::DrawWorker::Run()
+{
+  ScaleRequest* request;
+  {
+    MutexAutoLock lock(ScaleWorker::Singleton()->mRequestsMutex);
+    request = mDrawRequests.popFirst();
+  }
+  if (request) {
+    // ScaleWorker is finished with this request, so we can unlock the data now.
+    request->UnlockSourceData();
+    // We have to reset dstFrame if request was stopped while ScaleWorker was scaling.
+    if (request->stopped) {
+      ScaleRequest::Stop(request->image);
+    }

The request is done at this point, so Stop will null out dstFrame.
Nits fixed. Moving the r+ forward from the previous patch.
Attachment #609638 - Attachment is obsolete: true
Attachment #609868 - Flags: review+
Switching between tabs starts the high-quality scaling again; is that planned behavior?
(In reply to SpeciesX from comment #138)
> Switching between tabs starts the high-quality scaling again; is that planned
> behavior?

Yes, at least for now. The scaled frame is dropped when the image lock count reaches 0.
Comment on attachment 609636 [details] [diff] [review]
Part 1: Latest filtering code. Source https://github.com/jrmuizel/pixman-scaler

Review of attachment 609636 [details] [diff] [review]:
-----------------------------------------------------------------

r- in its current form. There are a bunch of issues, some major, though most relatively minor.

::: gfx/cairo/libpixman/src/pixman-rescale-box.c
@@ +39,5 @@
> +
> +/* we work in fixed point where 1. == 1 << 24 */
> +#define FIXED_SHIFT 24
> +
> +#if 0

If we're not using this block of code, it should be removed.

@@ +154,5 @@
> +               contribution_b = (1 - contribution_a)
> +                              = (1 - contribution_a_next)
> +    */
> +
> +    /* box size = ceil(src_width/dest_width) */

What is this? The actual box size you use is fractional.

@@ +282,5 @@
> +
> +       pixel * 1/num == pixel * dest_length / src_length
> +    */
> +    /* the average contribution of each source pixel */
> +    int ratio = ((1 << 24)*(long long int)dest_length)/src_length;

"long long", not "long long int" for consistency with other code in this module.

What guarantees that this doesn't underflow? That will cause infinite loops in downscale_box_filter() (eventually leading to a segfault).

Also, these (1 << 24)'s should all be (1 << FIXED_SHIFT).

@@ +283,5 @@
> +       pixel * 1/num == pixel * dest_length / src_length
> +    */
> +    /* the average contribution of each source pixel */
> +    int ratio = ((1 << 24)*(long long int)dest_length)/src_length;
> +    /* because ((1 << 24)*(long long int)dest_length) won't always be divisible by src_length

And here.

@@ +286,5 @@
> +    int ratio = ((1 << 24)*(long long int)dest_length)/src_length;
> +    /* because ((1 << 24)*(long long int)dest_length) won't always be divisible by src_length
> +     * we'll need someplace to put the other bits.
> +     *
> +     * We want to ensure a + n*ratio < 1<<24

And here.

@@ +288,5 @@
> +     * we'll need someplace to put the other bits.
> +     *
> +     * We want to ensure a + n*ratio < 1<<24
> +     *
> +     * 1<<24

This part of the comment should just be dropped entirely. I don't see what it's saying.

@@ +297,5 @@
> +    /* for each destination pixel compute the coverage of the left most pixel included in the box */
> +    /* I have a proof of this, which this margin is too narrow to contain */
> +    for (i=0; i<dest_length; i++)
> +    {
> +        float left_side = i*scale;

scale and the input and output of floor() and ceil() are all doubles. I don't think there's any reason to truncate to float here. It just makes more work for the CPU.

@@ +303,5 @@
> +        float right_fract = right_side - floor (right_side);
> +        float left_fract = ceil (left_side) - left_side;
> +        int overage;
> +        /* find out how many source pixels will be used to fill the box */
> +        int count = floor (right_side) - ceil (left_side);

You should have an explicit cast to int for clarity.

@@ +324,5 @@
> +        if (left_fract == 0.)
> +            count--;
> +
> +        /* compute how much the right-most pixel contributes */
> +        overage = ratio*(right_fract);

Remove the useless parentheses.

You should also have an explicit cast to int for clarity.

@@ +377,5 @@
> +    pixel_coverage_y = compute_coverage (y_coverage, orig_height, scaled_height);
> +
> +    assert (width + start_column <= scaled_width);
> +
> +    /* skip the rows at the beginning */

This is exactly the same code as at line 164, just with different variable names. Make a helper function.

::: gfx/cairo/libpixman/src/pixman-rescale-integer-box.c
@@ +71,5 @@
> +
> +/* downsample_row_box_filter is a Digital Differential Analyzer (like Bresenham's line algorithm)
> + * 'e' is the current fraction or error
> + */
> +void downsample_row_box_filter(

This should be static, just like the version in pixman-rescale-box.c.

@@ +115,5 @@
> +
> +#ifdef AVOID_DIVIDE
> +        div = div_inv[count & 1];
> +
> +        a = (a * div + 0x10000) >> 24;

What is the 0x10000 for? You're shifting by 24, not 17, so this clearly isn't rounding properly (though div is always under-estimated, so it's not clear that doing naive rounding is the best idea anyway). It wouldn't be too hard to just make this division-by-multiplication stuff exact, instead of just an approximation.

@@ +133,5 @@
> +        e += src_dx;
> +    }
> +}
> +
> +void downsample_columns_box_filter(

This should be static, just like the version in pixman-rescale-box.c.

@@ +161,5 @@
> +            column_src += stride;
> +        }
> +
> +#ifdef AVOID_DIVIDE
> +        a = (a * div + 0x10000) >> 24;

See corresponding notes in downsample_row_box_filter().

@@ +177,5 @@
> +        src++;
> +    }
> +}
> +
> +#define ROUND

This #define doesn't appear to be used for anything. Remove it.

@@ +205,5 @@
> +    dest_dx = scaled_width;
> +    src_dx = orig_width;
> +    ex = (src_dx*2) - dest_dx;
> +    dx2x = dest_dx*2;
> +    src_dx *= 2;

This setup code and variable names choices are all pretty confusing. You've got things like dx2, which is really an increment of a vertical coordinate (recommend dest_dy2), ex vs. e (recommend ey instead of e), dx2 vs. dx2x (recommend dest_dx2 instead of dx2x), dx2 vs. src_dy (both are doubled, but only dx2 has the "2" suffix: recommend src_dy2 instead of src_dy), dx2x vs. src_dx (again, recommend src_dx2 instead of src_dx). Also, the code for x and y is very different, though it appears to do the exact same things. The addition of the variable dest_dx, which is set to exactly the same value as scaled_width and then never used outside of this initialization (and for which there is no corresponding dest_dy), is particularly great.

@@ +211,5 @@
> +    /* Compute the src_offset and associated error value:
> +     * There are two ways to do this: */
> +    /* 1. Directly
> +    int nsrc_offset = (src_dx*startColumn + dest_dx)/(dest_dx*2);
> +    long nex = (src_dx*startColumn + dest_dx)%(dest_dx*2) - dest_dx*2 + src_dx; */

I realize this code is commented out, but long is almost certainly the wrong type here.

::: gfx/cairo/libpixman/src/pixman-rescale-lanczos.c
@@ +73,5 @@
> +/* contains the filter for each destination pixel */
> +struct filter {
> +    int count; /* filter size */
> +    int16_t *values; /* filter coefficients */
> +    int offset; /* filter start offset */

Re-ordering these last two fields will save 8 bytes per struct entry on a 64-bit system.

@@ +85,5 @@
> +        return 0;
> +    return a;
> +}
> +
> +/* from ffmpeg */

What's the licensing associated with this code? Is it MPL-compatible? (to my knowledge, FFmpeg in general is not)

@@ +94,5 @@
> +    else
> +        return a;
> +}
> +
> +#if 0

If we're not using this block of code, it should be removed.

@@ +253,5 @@
> + *   lanczos(x) = sinc(x) * sinc(x / filter_size);
> + * where
> + *   sinc(x) = sin(pi*x) / (pi*x);
> + */
> +float eval_lanczos(int filter_size, float x)

This function should be static.

@@ +263,5 @@
> +    return 1.0f;  /* Special case the discontinuity at the origin. */
> +  else {
> +      float xpi = x * (M_PI);
> +      return (sin(xpi) / xpi) *  /* sinc(x) */
> +          sin(xpi / filter_size) / (xpi / filter_size);  /* sinc(x/filter_size) */

sin(xpi) * sin(xpi / filter_size) * filter_size / (xpi * xpi) drops you from four divisions to two. Floating-point semantics mean the compiler isn't allowed to do this for you.
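i.e. something like this (untested sketch; the window and origin handling is guessed from the surrounding context):

#include <math.h>   /* for sin() and M_PI */

static float
eval_lanczos (int filter_size, float x)
{
    if (x <= -filter_size || x >= filter_size)
        return 0.0f;   /* outside the filter support */
    if (x == 0.0f)
        return 1.0f;   /* special case the discontinuity at the origin */
    {
        float xpi = x * (float)M_PI;
        /* same value as sinc(x) * sinc(x / filter_size), but only two divisions */
        return (float)(sin (xpi) * sin (xpi / filter_size) * filter_size / (xpi * xpi));
    }
}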

@@ +278,5 @@
> +   - chromium truncates the filter
> +   - opera pads
> +*/
> +
> +int floor_int(float a)

This function should be static.

@@ +283,5 @@
> +{
> +    return floor(a);
> +}
> +
> +int ceil_int(float a)

This function should be static.

@@ +288,5 @@
> +{
> +    return ceil(a);
> +}
> +
> +int float_to_fixed(float a)

This function should be static.

@@ +306,5 @@
> +    /* filter width */
> +    int i;
> +    struct filter *filter = malloc (dest_subset_size * sizeof(struct filter));
> +    int max_filter_size = ceil_int (src_support * 2) - floor_int (src_support * -2) + 1;
> +    float *filter_values = malloc (max_filter_size * sizeof(float)); /* this is a good candidate for stack allocation */

AFAIK, malloc() is still fallible. This allocation needs to be checked, as does the filter allocation above.

@@ +316,5 @@
> +     * that the range covered by the filter won't necessarily cover any source
> +     * pixel boundaries. Therefore, we use these clamped values (max of 1) for
> +     * some computations.
> +     */
> +    float clamped_scale = MIN (1., scale);

This code doesn't support magnification, and what this comment describes is the wrong way to do it anyway (the proper way is to interpolate the source image at the locations of the destination samples using the Lanczos filter applied to the source image at its original scale). Remove this variable, the comment, and replace all usages of it with scale.

@@ +351,5 @@
> +             * in source space. */
> +            float src_filter_pos = cur_filter_pixel - src_pixel;
> +
> +            /* Since the filter really exists in dest space, map it there. */
> +            float dest_filter_pos = src_filter_pos * clamped_scale;

This is going to introduce a shift in the resampled output. I think what you want instead is (src_filter_pos + 0.5F)*scale - 0.5F

@@ +368,5 @@
> +
> +        {
> +            int16_t leftovers;
> +            /* XXX: eventually we should avoid doing malloc here and just malloc a big
> +             * chunk of max_filter_size * sizeof(int16_t) */

More accurately, dest_subset_size * max_filter_size * sizeof(int16_t). But please fix this TODO. It's not hard. In fact, because you need to check this allocation, it makes cleanup much easier.

@@ +373,5 @@
> +            int16_t *fixed_filter_values = malloc (filter_size * sizeof(int16_t));
> +
> +            /* the filter must be normalized so that we don't affect the brightness of
> +             * the image. Convert to normalized fixed point */
> +            /* XXX: It might be better if we didn't have to do this in a separate pass */

I doubt it. Intel chips at least favor relatively small loops like this.

@@ +374,5 @@
> +
> +            /* the filter must be normalized so that we don't affect the brightness of
> +             * the image. Convert to normalized fixed point */
> +            /* XXX: It might be better if we didn't have to do this in a separate pass */
> +            int16_t fixed_sum = 0; /* XXX: should we use a regular int here? */

If you were off by more than a factor of two, you'd have more problems than the integer overflow this caused (specifically, leftovers below would be larger than the filter coefficient it was correcting).

@@ +387,5 @@
> +             * we add back in to avoid affecting the brightness of the image. We
> +             * arbitrarily add this to the center of the filter array (this won't always
> +             * be the center of the filter function since it could get clipped on the
> +             * edges, but it doesn't matter enough to worry about that case). */
> +            leftovers = float_to_fixed(1.0f) - fixed_sum;

float_to_fixed() won't get inlined here unless it's made static above.

@@ +419,5 @@
> +    /* XXX: this duplicates code in compute_lanczos_filter */
> +    int dest_support = LANCZOS_LOBES;
> +    float src_support = dest_support / ((float)scaled_height/orig_height);
> +
> +    int max_filter_size = ceil_int (src_support * 2) - floor_int (src_support * -2) + 1;

floor_int(src_support * -2) is exactly equivalent to -ceil_int(src_support * 2), so you can reduce this to 2 * ceil_int(src_support * 2) + 1

@@ +423,5 @@
> +    int max_filter_size = ceil_int (src_support * 2) - floor_int (src_support * -2) + 1;
> +
> +    /* we use a ring buffer to store the downsampled rows because we want to
> +     * reuse the downsampled rows across different destination pixels */
> +    int ring_buf_size = max_filter_size;

If you rounded this up to the next power of two, you'd waste a small amount of memory (much of which jemalloc is likely already wasting for you) but eliminate a bunch of modulus operations. Not sure whether that's worth the increased cache footprint, but I expect in the common case it is (for small downscalings max_filter_size is small and for large ones scaled_width is small).
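
A minimal sketch of that change (next_pow2 is a hypothetical helper, not in the patch); with a power-of-two size, the modulus in the ring-buffer indexing becomes a mask:

    static int
    next_pow2 (int v)
    {
        int p = 1;
        while (p < v)
            p <<= 1;
        return p;
    }

    int ring_buf_size = next_pow2 (max_filter_size);
    int ring_buf_mask = ring_buf_size - 1;
    /* was: ring_buf + width * (src_index % ring_buf_size) */
    uint32_t *row = ring_buf + width * (src_index & ring_buf_mask);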

@@ +435,5 @@
> +
> +    x_filter = compute_lanczos_filter (start_column, width,  orig_width,  (float)scaled_width/orig_width);
> +    y_filter = compute_lanczos_filter (start_row,    height, orig_height, (float)scaled_height/orig_height);
> +
> +    if (!ring_buf || !scanline || !x_filter || !y_filter)

If only some of these allocations fail, you'll leak.
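
One way to avoid the leak, assuming all four pointers start out NULL (or are assigned before this check) and are plain malloc'd blocks; if compute_lanczos_filter allocates more than one block, the filters would need their own teardown instead of free(). The -1 is a hypothetical error value:

    if (!ring_buf || !scanline || !x_filter || !y_filter)
    {
        free (ring_buf);   /* free(NULL) is a no-op */
        free (scanline);
        free (x_filter);
        free (y_filter);
        return -1;
    }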

@@ +446,5 @@
> +        int src1_index, src1_size;
> +        uint32_t *src1;
> +        assert(filter_offset >= 0);
> +
> +        /* read and downsample the rows needed to downsample the next column */

Technically inaccurate. Perhaps "read and downsample the rows needed to apply the next column filter".

@@ +447,5 @@
> +        uint32_t *src1;
> +        assert(filter_offset >= 0);
> +
> +        /* read and downsample the rows needed to downsample the next column */
> +        while (src_index < filter_offset + filter_size)

src_index is initialized to 0, but when start_column is non-zero, y_filter[0].offset is likely to be non-zero as well, meaning you'll downsample a bunch of rows you don't have to. src_index should be initialized to y_filter[0].offset.

@@ +450,5 @@
> +        /* read and downsample the rows needed to downsample the next column */
> +        while (src_index < filter_offset + filter_size)
> +        {
> +            fetch_scanline (closure, src_index, &scanline);
> +            /* downsample_row_convolve_sse2 (width, scanline, ring_buf + width * (src_index % ring_buf_size), x_filter, orig_width); */

Remove this commented-out code.

::: gfx/cairo/libpixman/src/pixman-rescale-mult-old.c
@@ +1,1 @@
> +/* -*- Mode: c; c-basic-offset: 4; tab-width: 8; indent-tabs-mode: t; -*- */

This entire file should be removed. You're not using it.

::: gfx/cairo/libpixman/src/pixman-rescale.h
@@ +22,5 @@
> +
> +int downscale_integer_box_filter(
> +        const struct rescale_fetcher *fetcher, void *closure,
> +        unsigned orig_width, unsigned orig_height,
> +		signed scaledWidth, signed scaled_height,

s/scaledWidth/scaled_width/

Also, why "signed" as the type, rather than just "int"? The code certainly does not handle these being negative.

@@ +23,5 @@
> +int downscale_integer_box_filter(
> +        const struct rescale_fetcher *fetcher, void *closure,
> +        unsigned orig_width, unsigned orig_height,
> +		signed scaledWidth, signed scaled_height,
> +		uint16_t start_column, uint16_t start_row,

And why are these uint16_t's, when downscale_image takes ints, and certainly does nothing to protect against the conversion overflowing?

::: gfx/cairo/libpixman/src/pixman.h
@@ +779,5 @@
>  						      const pixman_fixed_t         *filter_params,
>  						      int                           n_filter_params);
>  void		pixman_image_set_source_clipping     (pixman_image_t		   *image,
>  						      pixman_bool_t                 source_clipping);
> +pixman_bool_t	pixman_image_has_source_clipping     (pixman_image_t		   *image);

This function isn't implemented anywhere. Do not add a prototype for it.
Attachment #609636 - Flags: review?(tterribe) → review-
An additional note on downscale_image():

From a theoretical perspective, the "correct" way to downscale is to apply a low-pass filter to the source, and then sample the source using an anti-aliasing filter (like Lanczos) applied to the _source_ image at the offsets of the corresponding pixels in the destination. What this code is doing is integrating over the source image using a Lanczos kernel applied to the destination. It's not immediately obvious to me that these two are equivalent. Though obviously the large filter kernel produced by the latter with large downscaling factors has some low-pass characteristics, the mismatch between the two approaches seems like it would be greatest with small amounts of downscaling.

Another approach would be to downscale with a simple box filter until you get as close as possible to the target size (without going below it), and then apply a Lanczos kernel to the source to get to the final target. The result would retain much more HF energy, though also be fairly aliasy. People tend to prefer "sharp" images (most photo sharing sites, for example, apply a sharpening filter after downscaling), so some aliasing is probably okay. The only risk is the introduction of Moiré patterns.

I talked this over with Monty, who spent a lot more time than me thinking about this when he re-did all of GIMP's scaling filters a few years back, but it's still not obvious which would be better. So proceeding with the approach in the current patches is fine, but it might be worth investigating if someone gets bored.
Jeff should probably take a look at his patch eventually and address derf's comments.
Blocks: 739734
(In reply to Joe Drew (:JOEDREW!) from comment #133)
> Maybe Siarhei will be able? Or Tim?

Timothy B. Terriberry saved the day and I guess no review is needed from me anymore :) Which is a relief, because I was anticipating a tough challenge balancing between complaining about the problems in the patch and remaining diplomatic. This could be easily interpreted as bikeshedding or being biased.
Tested this patch on the Fennec XUL IPC version, and I don't see quality scaling applied there.
(In reply to Oleg Romashin (:romaxa) from comment #144)
> Tested this patch on the Fennec XUL IPC version, and I don't see quality
> scaling applied there

Heh, so that's because of pref("content.image.allow_locking", false);
I use mLockCount to count images with the same src. Any other way to do it?
Here is the Lanczos scaler moved out of pixman. Dropped the other filters. Fixed minor issues.

Jeff, could you please-please-please fix/answer the rest, so that we can finally land this.

Here are the rest of derf's comments:

@@ +85,5 @@
> +        return 0;
> +    return a;
> +}
> +
> +/* from ffmpeg */

What's the licensing associated with this code? Is it MPL-compatible? (to my knowledge, FFmpeg in general is not)

- Not sure if there is still something left from ffmpeg.

@@ +368,5 @@
> +
> +        {
> +            int16_t leftovers;
> +            /* XXX: eventually we should avoid doing malloc here and just malloc a big
> +             * chunk of max_filter_size * sizeof(int16_t) */

More accurately, dest_subset_size * max_filter_size * sizeof(int16_t). But please fix this TODO. It's not hard. In fact, because you need to check this allocation, it makes cleanup much easier.

@@ +374,5 @@
> +
> +            /* the filter must be normalized so that we don't affect the brightness of
> +             * the image. Convert to normalized fixed point */
> +            /* XXX: It might be better if we didn't have to do this in a separate pass */
> +            int16_t fixed_sum = 0; /* XXX: should we use a regular int here? */

If you were off by more than a factor of two, you'd have more problems than the integer overflow this caused (specifically, leftovers below would be larger than the filter coefficient it was correcting).

@@ +423,5 @@
> +    int max_filter_size = ceil_int (src_support * 2) - floor_int (src_support * -2) + 1;
> +
> +    /* we use a ring buffer to store the downsampled rows because we want to
> +     * reuse the downsampled rows across different destination pixels */
> +    int ring_buf_size = max_filter_size;

If you rounded this up to the next power of two, you'd waste a small amount of memory (much of which jemalloc is likely already wasting for you) but eliminate a bunch of modulus operations. Not sure whether that's worth the increased cache footprint, but I expect in the common case it is (for small downscalings max_filter_size is small and for large ones scaled_width is small).
Attachment #609636 - Attachment is obsolete: true
Attachment #614472 - Flags: feedback?(jmuizelaar)
Attachment #609636 - Flags: review?(siarhei.siamashka)
Joe, could you look at part 2 again, please?
Now it handles 16 bpp, which can be seen on Android native builds. The current Lanczos scaler is not compatible with 16 bpp, so I have to convert the buffer first.
Attachment #609868 - Attachment is obsolete: true
Attachment #614487 - Flags: review?(joe)
Comment on attachment 614472 [details] [diff] [review]
Part 1: Latest Lanczos filtering code. WIP

FWIW, I'm not going to be able to get to this for at least a week.
Comment on attachment 614487 [details] [diff] [review]
Part 2: Pre-downscale with downscale_lanczos_filter. ScaleWorker in its own thread. RequestScale on timer shot. (v3)

Review of attachment 614487 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good! Sorry for the very long delay in review; I have been very busy :(

::: image/src/RasterImage.cpp
@@ +74,5 @@
>  #include "mozilla/ClearOnShutdown.h"
>  
> +extern "C" {
> +#define PIXMAN_DONT_DEFINE_STDINT
> +#include "pixman-rescale.h"

It'd be nice if pixman-rescale defined itself as extern "C" I guess, but if that's not what pixman usually does there's not much we can do.
Attachment #614487 - Flags: review?(joe) → review+
Hi, I asked Nicolas Robidoux <nicolas.robidoux@gmail.com> (creator of the Robidoux filter in ImageMagick) for some comments, and here they are:

First thought: Is it really necessary to use fixed point arithmetic on
current chips? Because of ARM?

Other thing: Although I have other cats to whip right now, I could
definitely make

+/* Evaluates the Lanczos filter of the given filter size window for the given
+ * position.
+ *
+ * |filter_size| is the width of the filter (the "window"), outside of which
+ * the value of the function is 0. Inside of the window, the value is the
+ * normalized sinc function:
+ *   lanczos(x) = sinc(x) * sinc(x / filter_size);
+ * where
+ *   sinc(x) = sin(pi*x) / (pi*x);
+ */
+static float eval_lanczos(int filter_size, float x)
+{
+  if (x <= -filter_size || x >= filter_size)
+    return 0.0f;  /* Outside of the window. */
+  if (x > -FLT_EPSILON &&
+      x < FLT_EPSILON)
+    return 1.0f;  /* Special case the discontinuity at the origin. */
+  else {
+      float xpi = x * (M_PI);
+      return sin(xpi) * sin(xpi / filter_size) * filter_size / (xpi * xpi);  /* sinc(x) * sinc(x/filter_size) */
+  }
+}

much faster.

If this is only used with Lanczos 2, for example, there is an
algebraic simplification that gets rid of one trig call. (From memory,
there are also simplifications for other numbers of lobes, but no
"universal" simplification.)

I also have fast approximations of such filters, a primitive version
of which is described in the Masters thesis of my student Chantal
Racette (on arXiv) and used in ImageMagick, but to set those up, I
need time.

Box filtering to the nearest largest size and then finishing off with
an interpolator is what vipsthumbnail does. It's a reasonable approach
but basically you'll get the lack of detail and the artifacts
associated with box filtering (esp. when you are downsampling by a
factor of 2, in which case all you're doing is box filtering!).
Sharpening at the end overcomes some of the issues.

Off the bat, I would not recommend first filtering and then box
filtering, but it's also a reasonable approach.

Will keep an eye. If and when the code becomes more legible, I may
take the time to figure out exactly what it does, and try to improve
on that.

It looks like it's more or less doing "a" right thing. But I'd need
more patience than I have now to make sure all the t's are crossed.
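
For reference, a minimal sketch (not part of the patch) of one reading of the Lanczos-2 simplification mentioned above: using sin(pi*x) = 2*sin(pi*x/2)*cos(pi*x/2), both trig evaluations share the argument pi*x/2, so a single sincos-style evaluation can cover them. Assumes <math.h> and <float.h>:

    static float eval_lanczos2 (float x)
    {
        if (x <= -2.0f || x >= 2.0f)
            return 0.0f;  /* outside the window */
        if (x > -FLT_EPSILON && x < FLT_EPSILON)
            return 1.0f;  /* discontinuity at the origin */
        {
            float h = x * (float) (M_PI / 2.0);
            float s = sinf (h);
            float c = cosf (h);
            /* sinc(x) * sinc(x/2) == 4 * sin^2(pi*x/2) * cos(pi*x/2) / (pi*x)^2 */
            return 4.0f * s * s * c / ((float) (M_PI * M_PI) * x * x);
        }
    }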
(In reply to Bruno George Moraes from comment #150)
> Hi, i asked Nicolas Robidoux <nicolas.robidoux@gmail.com> creator of
> robidoux filter from imagemagick) for some comments and here they are:
> 
> First thought: Is it really necessary to use fixed point arithmetic on
> current chips? Because of ARM?
No, that's because fixed point is faster. I'll just leave these two quotes here:
"FP math is ALWAYS slower than INT math, unless your have a CPU with ridiculously big FP registers.
also FP math leads to precision problems over time... unless you use a ridiculously high FP precision...
that said, INT math is a way better solution. "
"Exactly -- that's the real power of integer SIMD -- the fact that you can get more throughput by using smaller data types.

Pixels are only 8-bit, and transform intermediates (as well as DCT coefficients) are only 16-bit."
(In reply to Phoenix from comment #151)
> ... that's because fixed point is faster...
Has this wisdom been checked recently with careful benchmarking that takes into account the added complexity of using fixed point to do things which are already complex in floating point?

I stand to be corrected, but I believe "fixed point is faster" is not as true as it used to be. For example, "LUTs are faster" is past its due date: careful "on the fly" computation can, and does, beat LUTs in some situations. (See the link below.)

Note that I'm not afraid of fixed point. https://github.com/jcupitt/libvips/blob/master/libvips/resample/interpolate.c contains a really fast method of computing bilinear weights using integer arithmetic, carefully benchmarked by John Cupitt against alternatives.

My hunch, which is a hunch, no more or no less, is that Lanczos downsampling is not an application in which fixed point is value added, even in 8-bit. For one thing, because you are downsampling, the slow cast back from float or double to uchar or ushort does not happen so often compared to the weighted averaging of several values using Lanczos weights. The more you downsample, the less this cast matters: there are few output pixels, so few casts.

I may be wrong, and I don't know much about cell phone chips, but this is the "wisdom" I have to offer: If you have trouble getting Lanczos downsampling off the ground, start with a version that goes through float, and only bit twiddle once you've cleaned up and optimized the float version.

(Upsampling is a slightly different story: the float to integer casts matter more in this context because there are more output pixels than input pixels.)

(Apologies for pontificating without actually offering code. If this does not get resolved, I'll try to find a block of time in the next few months to be more actively helpful.)
(In reply to Phoenix from comment #151)
W.r.t. to one of your quotes: Between my own hypothetical "through floating point" Lanczos downsampling code and SIMD fixed point code crafted by Dark Shikari, I'd put my money on Dark Shikari.
There were minor typos in my emails to Bruno George Moraes. Instead of fixing them, let me explain how I would go about programming this. Nothing too fancy, except perhaps in the computation of the weights.

Note that I'd need to be paired with someone who understands how this interfaces the calling program, and also someone who could let me know what operations (vector unit?) I can take advantage of.

I don't believe that it is necessary to prefilter or postfilter when downsampling, unless you use a really blurry scheme like box filtering as the heavy lifter. With Lanczos downsampling, just do like it's done in ImageMagick: Exactly, or approximately, find the weighted sum, with Lanczos weights that come from "stretching" the Lanczos kernel in each direction by a factor proportional to the ratio of the number of input pixels to the number of output pixels, at every output pixel location.

Use the "align corners" image size convention. (Used by ImageMagick.) It's the best one for downsizing, because pixel locations "creep in". (I suggest that the "align centers" image size convention be used when upsizing, but this is not what this thread is about.)

Because the Lanczos filter is a tensor product, the way to do this is filter each input row in the horizontal direction, and then keep that row around just long enough to compute the horizontal output rows that depend on it. Each of these output rows is computed, from top to bottom, by filtering vertically the horizontally filtered input rows that fall within the height of the "stretched" Lanczos support. Given the height of the input and output images, it is easy to compute an upper bound on the number of filtered rows that need to be available to compute an output row. Careful index tracking allows one to reuse filtered rows which are in the overlap of consecutive output rows without moving them within memory. My understanding is that Jeff Muizelaar did just that (not 100% sure).

When filtering vertically, the averaging weights are the same across filtered columns. Consequently, they can be computed one output row at a time and stored in a short array (the length of which depends on the ratio of the height of the input and output, with an upper bound easy to figure out).

When filtering horizontally, the averaging weights are the same across filtered rows. Although this requires a non-trivial amount of memory, I suggest that you store, for each horizontal output location, the index of the leftmost horizontal input location which falls within the stretched Lanczos window width, the number of pixels within the stretched window width (you could figure out the largest possible number, and actually only store the deviation from this number, which would allow you to use uchars), as well as the corresponding weights, as floats. If you are downsampling by a factor of r using Lanczos n-lobes, for example, this requires you to store about r*(2*n)*(output width) uchars, ushorts and floats. You can halve this requirement by exploiting left-right symmetry. You can reduce this requirement by (output width) by exploiting the fact that the weights sum to 1. Note that since r*(output width) = (input width) approximately, this is essentially storing (n-1)(input width) uchars, ushorts and floats, which in my opinion is reasonable. If this is too much storage, optimize the computation of the weights and (re)compute them on the fly. I can definitely help with that.

What abyss policy you use is not that important when downsampling with the "align corners" image geometry convention. I suggest that you do like ImageMagick resize, namely you compute weights for actual pixels and normalize (sometimes this is explained as making the abyss (transparent) black and not including the weights of abyss pixels in the normalizing denominator).

W.r.t. to the horizontal and vertical coefficient arrays, filtering is just multiplying pixel values within the relevant index ranges and adding them up.

What the above overall approach does is compute output rows from top to bottom and left to right, using input rows read from top to bottom and left to right as data.
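
The separable scheme above can be sketched as follows (a simplified illustration, not the patch: one grayscale float channel, a pixel-center mapping, per-output-pixel weight renormalization over in-image taps, and a full horizontally filtered intermediate kept in memory instead of the ring of rows described above):

    #include <math.h>
    #include <stdlib.h>

    #define LOBES 3

    static double lanczos (double x)
    {
        if (x == 0.0)
            return 1.0;
        if (x <= -LOBES || x >= LOBES)
            return 0.0;
        double px = M_PI * x;
        return LOBES * sin (px) * sin (px / LOBES) / (px * px);
    }

    /* Resample one line of n samples (stride apart) down to m contiguous
     * samples, stretching the kernel by n/m and renormalizing the weights
     * of the taps that fall inside the image. */
    static void resample_line (const float *src, int n, int stride,
                               float *dst, int m)
    {
        double scale = (double) n / m;               /* > 1 when downscaling */
        double support = LOBES * scale;
        for (int i = 0; i < m; i++) {
            double center = (i + 0.5) * scale - 0.5; /* align pixel centers */
            int lo = (int) ceil (center - support);
            int hi = (int) floor (center + support);
            if (lo < 0) lo = 0;
            if (hi > n - 1) hi = n - 1;
            double sum = 0.0, wsum = 0.0;
            for (int k = lo; k <= hi; k++) {
                double w = lanczos ((k - center) / scale);
                sum += w * src[k * stride];
                wsum += w;
            }
            dst[i] = (float) (wsum != 0.0 ? sum / wsum : 0.0);
        }
    }

    /* Separable downscale: filter every row, then every column of the
     * horizontally filtered intermediate. */
    static int downscale (const float *src, int sw, int sh,
                          float *dst, int dw, int dh)
    {
        float *tmp = malloc ((size_t) dw * sh * sizeof (float));
        float *col = malloc ((size_t) dh * sizeof (float));
        if (!tmp || !col) {
            free (tmp);
            free (col);
            return -1;
        }
        for (int y = 0; y < sh; y++)
            resample_line (src + (size_t) y * sw, sw, 1,
                           tmp + (size_t) y * dw, dw);
        for (int x = 0; x < dw; x++) {
            resample_line (tmp + x, sh, dw, col, dh);
            for (int y = 0; y < dh; y++)
                dst[(size_t) y * dw + x] = col[y];
        }
        free (col);
        free (tmp);
        return 0;
    }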
One small thing to remember: If input_width/output_width is not exactly equal to input_height/output_height, and you want to preserve aspect ratio exactly, you need to adjust both the horizontal and vertical "Lanczos window stretching" (take the largest one, for example), and do the geometry computation (which defines the correspondence between pixel locations within the input and output images) from the dead center of the input and output images. (As a simplification, I assume input image has implicitly been "cropped" to the ROI.)
(In reply to Nicolas Robidoux from comment #155)
> exactly, you need to adjust both the horizontal and vertical "Lanczos window
> stretching" (take the largest one, for example), and do the geometry

I'm confused. Why isn't the horizontal and vertical window completely independent here?
(In reply to Timothy B. Terriberry (:derf) from comment #156)
> I'm confused. Why isn't the horizontal and vertical window completely independent here?
Example:
Input image is 1000x512 and you decide to resize so largest dimension is 300.
1000/300 = 3.33..., so you use this as the window rescaling factor for the horizontal filtering.
>>IF<< you want to preserve the aspect ratio exactly, you need to use the exact same window rescaling factor in the vertical direction. Unfortunately, 512/3.33... = 153.6, which is not an integer number of pixels. To minimize the impact of the abyss, the best choice is to round down (you could also round). So, the height of the output image will be 153. Unfortunately, 512/153 is not exactly 3.33...
Moral of the story: If you want to preserve aspect ratio, meaning that a square in the input image will be a square in the output image (not only approximately, but exactly, ignoring the artifacts introduced by resampling, which of course can be considerable if the square is small), then the Lanczos window rescaling must be the same in both directions, and then there will be resize operations for which the window rescaling will not match the ratio of input and output dimensions in at least one direction.
The only way to avoid this is to be satisfied with approximate preservation of the aspect ratio. (Important: I mean the aspect ratio of "objects within the image". This type of aspect ratio is not affected by cropping, for example.)
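A small numeric sketch of the sizing rule in that example (in_w, in_h and target are hypothetical names, not from the patch):

    float stretch = (float) in_w / target;     /* 1000 / 300 = 3.33...  */
    int   out_w   = target;                    /* 300                   */
    int   out_h   = (int) (in_h / stretch);    /* 512 / 3.33... -> 153  */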
(Last time I use ">>". Messes up the text alignment.)
(In reply to Nicolas Robidoux from comment #157)
> Moral of the story: If you want to preserve aspect ratio, meaning that a
> square in the input image will be a square in the output image (not only
> approximately, but exactly, ignoring the artifacts introduced by resampling,
> which of course can be considerable is the square is small), then the

Okay, I understand what you're saying now. The current API specifies an exact destination width and height, in units of an integer number of pixels. So I don't think it can tell if you want to preserve the aspect ratio exactly or not. If you think that's truly important, it would probably require significant refactoring to get this kind of information down to this level (from all the way up in the CSS/style rules, which is quite far removed from this code). Let's do the simple thing for now. I think the artifacts from our current resampler are a much larger problem.

(In reply to Nicolas Robidoux from comment #154)
> Note that I'd need to be paired with someone who understands how this
> interfaces the calling program, and also someone who could let me know what
> operations (vector unit?) I can take advantage of.

Whatever you use, you'll need to use run-time CPU detection and have fallbacks. There's detection code in xpcom/glue/SSE.h and xpcom/glue/arm.h. Keep in mind if the plan is to contribute these changes back to pixman (and I don't know that it is, maybe Jeff has an opinion here), pixman currently (as of the version in our tree) doesn't have detection support for anything newer than SSE2, and doesn't have separate detection of ARM EDSP (i.e., v5E) instructions, though it does include detection routines for the ARMv6 media instructions and NEON (either of which will only be present if EDSP instructions are also). In addition, our detection routines are all C++, while pixman is written in pure C.

But let's get somewhere with reasonable C code before we worry too much about SIMD.
Things are easier if all you want is to "align" the corners of the output image with the corners of the input image, even if this means a different shrinking/stretching vertically than horizontally.
As long as the aspect ratios, as full images (really, the ROIs), of the input and output images are close enough, nobody should notice input image squares not being quite square in the output image.
Not something worth propagating major API changes in a web browser, I would guess.

-----

I put looking carefully at this code on my to do list, and I go back to Masters supervision. ETA? Don't hold your breath.
(In reply to Nicolas Robidoux from comment #152)
> (In reply to Phoenix from comment #151)
> > ... that's because fixed point is faster...
> Has this wisdom been checked recently with careful benchmarking that takes
> into account the added complexity of using fixed point to do things which
> are already complex in floating point?
That statement comes from the simple fact that integer units in desktop CPUs are usually faster (when their precision is sufficient). After a quick look I found data for Intel Core 2; things don't change much for more modern Intel and AMD chips (a quick and dirty translation from Russian):
"Latency for most integer math is 1 cycle, and the execution rate is 3 operations per clock across the units connected to the first three ports. Latency for floating-point math is 3 cycles for ADD, 4 for SSE MUL, and 5 for SSE2 or x87 MUL."
Could someone please remove me from the CC list? I signed up via BrowserID, and my email address isn't appearing in the CC list; it also won't let me uncheck "Add me to CC list".

Thanks.
RE:

int max_filter_size = ceil_int (src_support * 2) - floor_int (src_support * -2) + 1;

The maximum number of locations with non-zero weights is

int max_filter_size = (int) ( 2 * src_support ) + 1;

under the assumption that

src_support = LANCZOS_LOBES / scale

and

scale = (float) N / (float) n

with N the number of output pixels, and n the number of input pixels, in the relevant direction.

In other words, it appears to me that the value computed for max_filter_size is too large by about a factor of 2.

-----

The above formula is overly safe. It does not explicitly take into account the fact that the Lanczos kernel vanishes at the extremity of its support. The following does take this fact into account, at the cost of a function call:

int max_filter_size = ceil_int ( 2 * src_support );

using Jeff's ceil_int function.
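
A worked example of the difference, assuming LANCZOS_LOBES == 3 and a downscale to 3/10 of the original size (scale = 0.3), so src_support = 3 / 0.3 = 10:

    /* old:   ceil_int (10 * 2) - floor_int (10 * -2) + 1 = 20 + 20 + 1 = 41 */
    /* safe:  (int) (2 * 10) + 1                          = 21              */
    /* tight: ceil_int (2 * 10)                           = 20              */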
Correction: You CAN preserve the "intrinsic aspect ratio" without changing the API. All you have to do is use the same window scaling in both directions, cropping the output (usually very slightly: subpixel distance) in one of the two directions when there is a mismatch, and making sure to do geometry relative to the image centers (ROIs, really) to keep things aligned.
floor_int and ceil_int are unneeded: All they do is apply the floating point function floor or ceil to a float input, and cast the result to int. This should be automatic if you put the result in an int.
(I believe that there are slight differences w.r.t. the specifics of how it's done depending on whether you are using C++ or C.)
Correction: In the horizontal direction, you are better off using the maximum number of locations with nonzero weights across the board, which obviates the need to have the uchar array used to store how many such locations there are for each output horizontal sampling position. Pad the coefficient array with zeros for unneeded locations.
In the vertical direction, we compute this information on the fly, so we can be "tight".
(In reply to Timothy B. Terriberry (:derf) from comment #140)...
> > +            /* Since the filter really exists in dest space, map it there. */
> > +            float dest_filter_pos = src_filter_pos * clamped_scale;
> 
> This is going to introduce a shift in the resampled output. I think what you
> want instead is (src_filter_pos + 0.5F)*scale - 0.5F

I agree with Timothy: With the "align corners" image geometry convention, and associating with each pixel position its index, starting at 0 and up to width/height - 1 (C style, as is pretty standard), Timothy's formula is the correct one, and the other formula will shift the image.
As written in comment 27 already: 
Do read https://bugzilla.mozilla.org/page.cgi?id=etiquette.html before adding comments that don't help anybody.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #148)
> Comment on attachment 614472 [details] [diff] [review]
> Part 1: Latest Lanczos filtering code. WIP
> 
> FWIW, I'm not going to be able to get to this for at least a week.

Gentle ping.
Can we get some feedback from Jeff Muizelaar [:jrmuizel] on part 1, please? ;)
(In reply to Virtual_ManPL from comment #172)
> (In reply to Jeff Muizelaar [:jrmuizel] from comment #148)
> > Comment on attachment 614472 [details] [diff] [review]
> > Part 1: Latest Lanczos filtering code. WIP
> > 
> > FWIW, I'm not going to be able to get to this for at least a week.
> 
> Gentle ping.
> Can we get some feedback Jeff Muizelaar [:jrmuizel] on part 1, please? ;)

I will try to look at this again soon. Please don't hesitate to remind me on IRC
Now we use fluid-width designs, and thus fluid images, and Firefox scales them horribly. Also the offset is silly: if you give a fluid-width image a background colour, it becomes visible around the edges of the image.
I'm offering $100 to the persons who get a good fix for this deployed to
Firefox stable before the 4-year milestone.  If other users would like to chip in, please let me know privately.

I emailed Jeff.  I'm a coder, maybe I can help fix it and keep (some of) my money.  Now to download the mozilla-central hg bundle.  Ouch, that's 500MB, and how many days to compile?

I also wrote a great long humorous diatribe about this bug, but decided not to post it here.  I'd like to keep my bugzilla account, please.

Well since you're all so curious and I'm posting anyway, maybe I will include a few key words and phrases.

HTML5, SVG, WebGL, 1970s, MS Paint, better, sed -i, PIXMAN_FILTER_BEST, about:config, Donald Knuth, hell-bendingly awesome, off-the-shelf, 3 years,
Go Mozilla!!!

If you're extra-curious and would like to read the full rant, I can send it to you.

"Status: NEW"  bawahahahaha
(In reply to sam from comment #178)
> I emailed Jeff.  I'm a coder, maybe I can help fix it and keep (some of) my
> money.  Now to download the mozilla-central hg bundle.  Ouch, that's 500MB,
> and how many days to compile?

If you want help getting started, don't forget to join irc.mozilla.org, channel #introduction.
Nice to know, for the "I-don't-remember-how-many"th time, that there is someone who would like to pick this up...
Just curious, what happened to Tatiana Meshkova's (:tatiana) fix in comment 147? I used the try build, and it seemed fine to me; why do you want to throw away the finished work and start from scratch again?
>Just curious, what happened to Tatiana Meshkova's (:tatiana) fix in comment 147?

The first patch (on which the second one builds) has a "WIP" tag and didn't request a review (only feedback), so apparently she thought it needed further development before it could land.
(In reply to Gian-Carlo Pascutto (:gcp) from comment #182)

> The first patch (on which the second one builds) has a "WIP" tag and didn't
> request a review (only feedback), so apparently she thought it needed
> further development before it could land.

Right. Part 1 needs to be either tuned (see comment 146) or replaced by something better.
Maybe Siarhei could work on it?
Assignee: tanya.meshkova → siarhei.siamashka
@Siarhei: I remember you had some serious concerns about Part 1 code. Could you elaborate? Thanks.
Isn't there anyone responsible for this embarrassing display of utter incompetence?
This absolutely basic issue/feature remains broken after more than three years now!
It has already become an in-joke on several technology-related forums and puts every single person involved in Firefox development in a very, very bad light.

Thank you, but I know this rant from a user perspective doesn't help and is sent to a lot of people.
Then please just stay in all your tech forums and add your comments there. Bugzilla is not a forum and your comments don't help anybody. Read https://bugzilla.mozilla.org/page.cgi?id=etiquette.html before commenting here again. Or try to be constructive by providing patches instead of talking. Thanks.
In order to fix this bug, it needs bug 517294, bug 524468 and bug 580469 fixed. Please be patient. While I acknowledge that a bug of this magnitude should see more urgency, Mozilla has limited resources, and occasionally bugs like this remain unpatched as a result. If you or anyone you know would like to attempt to fix this bug and/or any of its dependencies, please feel free to say so here and someone will come along to offer you some assistance. However, evangelism of a bug is unnecessary. Devs continue to watch this bug but are inclined to stop if the bug is spammed.
So in order to speed up the landing of this bug, it probably makes sense to remove the dependency on Part 1. Instead, we should probably just use the code from:

http://src.chromium.org/viewvc/chrome/trunk/src/skia/ext/image_operations.cc?view=markup
and
http://src.chromium.org/viewvc/chrome/trunk/src/skia/ext/convolver.cc?view=markup

This is well-tested and shipped code. It also has SSE2 code, which my code does not.

It forces a Skia dependency, but I don't think that will be too bad. We can remove that dependency in a follow-up.
Agreed!
If it speeds things up, works well, has good performance, and is based on the Lanczos method, why not!