78104 - [RFE]block images by directory or by regexp/pattern

Reporter

Description

•

23 years ago

It would be really cool if, in addition to being able to block all images
originating from a given server, Mozilla could block all images from URLs
matching a given pattern. On many sites, ad images originate from the same
server as the non-ad images, so when the user tries to block the ad images, he
or she ends up blocking the non-ad images as well. It would be nice if Mozilla,
like iCab, could be set to block images originating from, for example, all URLs
containing the pattern "/Ads/", thereby allowing the user to block only the ad
images on the image server. This feature could also be useful if the ad banners
are using Akamai to serve their images because, although the originating server
is an Akamai server, the URL still often contains a pattern matching the name of
the ad provider. It would be even cooler, though not absolutely necessary, if
this matching supported regular expressions.

Stephen Walker

Comment 1

•

23 years ago

*** Bug 78105 has been marked as a duplicate of this bug. ***

Keyser Sose

Comment 2

•

23 years ago

Marking NEW.

Assignee: mstoltz → pavlov

Status: UNCONFIRMED → NEW

Component: Security: General → ImageLib

Ever confirmed: true

OS: Mac System 9.x → All

QA Contact: ckritzer → tpreston

Hardware: Macintosh → All

Summary: Pattern-matching based image blocking → [RFE] Pattern-matching based image blocking

Mitchell Stoltz (not reading bugmail)

Comment 3

•

23 years ago

I agree that this would be a cool feature and it can probably be worked into
nsIContentPolicy, which we will hopefully be using for configurable content
filtering.

Stuart Parmenter

Comment 4

•

23 years ago

Is there a bug for making nsIContentPolicy actually work anywhere?  There is
code in the imageframe that is #if 0'd out since I had no way of testing it...
but it should allow image blocking to work.

Target Milestone: --- → Future

Martyr

Comment 5

•

23 years ago

What about having this read an external file, say like those used by
Junkbusters? Might save on the UI work if it was a parse/load/match issue as
opposed to that plus maintaining the associated files and UI support. An "import
block file" or something, maybe?

Mitchell Stoltz (not reading bugmail)

Comment 6

•

23 years ago

I would love to add that feature. It's on my list. I'll take this bug.

Assignee: pavlov → mstoltz

u12624

Comment 7

•

23 years ago

While it'd be cool if one could block a domain (or a directory of it), certain
machines under it should still be allowed to display images. 

Block "*.akamai.net" (99% ****), but allow "a1964.g.akamai.net" (the 5th wave)
and "a772.g.akamai.net" (apple) to show.

Jeremy M. Dolan

Comment 8

•

23 years ago

This should be implemented with regexp matching, preferably.

BLOCK ^ad\.
BLOCK /ads/
ALLOW foo.com/ads/

Cookie blocking/allowing also needs a regexp hookup. I think there's a bug for
that somewhere. Adding this to junkbuster feature tracker. No chance of this
pre-1.0?
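The three example rules above could be evaluated roughly as follows. This is only a sketch using std::regex in standalone C++, not Mozilla code. The names Rule, ShouldBlock and ExampleRules are made up for illustration, and two things are assumptions because the comment does not specify them: that a matching ALLOW rule overrides any matching BLOCK rule, and that patterns are matched against the URL without its scheme (so that ^ad\. anchors at the host name).

```cpp
#include <regex>
#include <string>
#include <vector>

// One line of the hypothetical rule list: BLOCK or ALLOW plus a pattern.
struct Rule {
    bool allow;          // true for ALLOW, false for BLOCK
    std::regex pattern;  // matched anywhere in the target string
};

// Returns true if the image at `target` (URL without scheme) should be blocked.
// Assumption: any matching ALLOW rule overrides every matching BLOCK rule.
bool ShouldBlock(const std::vector<Rule>& rules, const std::string& target) {
    bool blocked = false;
    for (const Rule& rule : rules) {
        if (!std::regex_search(target, rule.pattern)) continue;
        if (rule.allow) return false;
        blocked = true;
    }
    return blocked;
}

// The three rules from the comment, with the patterns kept as written.
std::vector<Rule> ExampleRules() {
    return {
        {false, std::regex("^ad\\.")},
        {false, std::regex("/ads/")},
        {true, std::regex("foo.com/ads/")},
    };
}
```

With these rules, ad.example.com/x.gif and bar.com/ads/banner.gif would be blocked, while foo.com/ads/banner.gif would still load.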

Blocks: 91783

Keywords: helpwanted

Marcus Pallinger

Comment 9

•

23 years ago

Does this approach have to be specific to blocking images, or could it also be
used to block any http request matching a given expression?

Jeremy M. Dolan

Updated

•

23 years ago

Blocks: 91785

Andy Lyttle

Comment 10

•

22 years ago

There are plenty of URLs for JavaScript code that loads ad banners; if you block
the URL of the JavaScript, then you block the ad banners completely (much
cleaner than blocking just the images).  If a list of URLs to block is
maintained, this lets you keep JavaScript on but block (much of) the stuff you
don't want.

Benoit

Comment 11

•

22 years ago

A very nice implementation would be something like AdShield for IE.
See http://www.adshield.org/guide/Guide.htm#suppress for screenshots.

Ari

Reporter

Comment 12

•

22 years ago

Right now, image and cookie permissions are stored in the same file in the
profile directory -- cookperm.txt. According to nsPermissions.cpp, which reads
from and writes to that file, the current format of that file is:

host \t permission \t permission ... \n

When an image or cookie is blocked, the host name is extracted from the URL and
entries in the file are updated. Instead of creating a new file that contains
image blocking patterns, I propose that we modify this format to be:

pattern \t permission \t permission ... \n

where 'pattern' is matched against the entire URL, not just the host name.

Under this new scheme, we may even be able to support entries of the old form by
simply treating the 'host' string as a pattern to be found in the URL. If we did
it this way, we would retain backward compatibility with everyone's cookperm.txt
and we would be able to put off changing the current cookie/image permissions UI.

This idea seems to require some very doable changes to nsPermissions.cpp. The
questions that I have for knowledgeable Mozilla people are:

1. What should the format of the patterns be? UNIX-like regexps? Shell-like
completion (e.g. *.foo.com)? This affects whether we can have backward
compatibility with old entries.
2. Can we use pattern matching code from anywhere else in the tree? I wouldn't
want to reinvent the wheel. I know js has regular expression processing.
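To make the proposed line format concrete, here is a minimal sketch of splitting one "pattern \t permission \t permission ..." line into its fields. It is standalone C++ rather than the actual nsPermissions.cpp code. PermissionEntry and ParsePermissionLine are hypothetical names, and the permission tokens are kept as opaque strings because their encoding is not described here.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One parsed line of the permissions file (hypothetical representation).
struct PermissionEntry {
    std::string pattern;                   // a plain host name in the old format
    std::vector<std::string> permissions;  // opaque tokens, not interpreted here
};

// Splits a tab-separated line into a pattern and its permission tokens.
// Returns false if the line has no pattern or no permission token.
bool ParsePermissionLine(const std::string& line, PermissionEntry* out) {
    assert(out != nullptr);
    std::istringstream fields(line);
    PermissionEntry entry;
    if (!std::getline(fields, entry.pattern, '\t') || entry.pattern.empty()) {
        return false;
    }
    std::string field;
    while (std::getline(fields, field, '\t')) {
        if (!field.empty()) entry.permissions.push_back(field);
    }
    if (entry.permissions.empty()) return false;
    *out = entry;
    return true;
}
```

Because only the meaning of the first field changes, an old-style host line would parse exactly as before, which is the backward compatibility argument made above.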

Mark Hammond [:markh] [:mhammond]

Comment 13

•

22 years ago

How about we use JavaScript regular expressions, and just use the JavaScript
engine directly? [A quick look shows extensions\cookies\makefile.win still lists
the JS engine as a .LIB, but nothing currently uses it in this tree]

One possibility would be to enclose regular expressions in "/" characters.  This
would help denote them as "new style patterns" rather than "old style host
names", and would also help reinforce they are regular expressions to the casual
viewer of the .txt file (as JS source code wraps regexs in "/")

Thus, you may expect to see:

/\S*:\/\/ads\..*/

which would block all protocols from all hosts starting with "ads.".  Of course,
the UI would implement a nicer layer on this that would provide a simple way to
partially specify the host, as per previous comments in this bug.

Boris Zbarsky [:bzbarsky]

Comment 14

•

22 years ago

*** Bug 136575 has been marked as a duplicate of this bug. ***

David Scheiderich

Comment 15

•

22 years ago

I've done some looking around at the source, and plan to tackle this. I'm planning
on using the JavaScript RegExp for handling URLs. I'm also thinking about allowing
simple UNIX-like wildcards (*.foo.com), which would easily be translated into
regexps.

Dave

Comment 16

•

22 years ago

I am glad this is getting addressed.  I think the solution discussed here is a
great idea.  However, I humbly request that you include, in whatever solution
you adopt, a fix for the issue discussed in Bug 136091 (Port number in image
source disables image blocking)?  Hopefully it will be solved naturally by your
enhancement.  This is a problem for those of us who are interested in blocking
images from servers mounted on ports other than 80, or from any server when the
link in the HTML source includes any port number at all.
In fact, I have just discovered that one can circumvent Moz's image blocking
entirely by simply attaching the :80 to the host-name portion of the URL (sort
of expected from what I already knew about incorrect image blocking for URLs
including any port number).
Maybe you could also look into how this solution might (or might not)
incorporate solutions to Bug 64068/Bug 133114 (images load even though all
images are blocked or certain server is blocked).  Also, be careful of Bug
83047/Bug 140172, because USEMAP attributes seem to sometimes break the image
blocking functionality.
I hope I know what I'm talking about.  Cheers.

Alfonso Martinez

Comment 17

•

22 years ago

*** Bug 141562 has been marked as a duplicate of this bug. ***

Alex Koralewski

Comment 18

•

22 years ago

i completely and totally am for this feature.  what are the chances this feature
will make it into rc3 or the final 1.0?  also i think not only images should be
blocked.. it seems the advertising media is moving to flash and right clicking
on flash applets brings up a macromedia context menu :-(

Keyser Sose

Comment 19

•

22 years ago

The chance of it being in 1.0 is zero: no one has coded anything and it's not a
feature anyone is working on at the moment. Sorry, but if you can code C++ you
can help us out.

Sagie Maoz

Comment 20

•

22 years ago

Following #18, an option to determine what objects to block would be nice (by
tabs: <img>, <object>, <script> etc.)

David Scheiderich

Comment 21

•

22 years ago

I have started working on this, and got it fundamentally working with images and
cookies. Someone pointed out (#16) that port numbers in URLs defeat the current
system, so I have to look into that as well. Once I get it working down low, I
was going to attempt to figure out an improved UI to allow support for regexes,
as right now all you can do is edit the text file. I should be able to get to
this now, as I'm done with classes.

Warner Young

Comment 22

•

22 years ago

Blocking any type of object is bug 94035.

Jamie Katz

Comment 23

•

22 years ago

Attached image: icab image blocking pref

iCab (MacOS web browser) has a powerful pref for this...

SineSwiper

Comment 24

•

22 years ago

That screenshot looks really nice!  Might be a good model to follow.  I think
regular expressions might be better than the simple grep-like model they have,
but it might be a good default mode for those who don't know regexps.

Jonas Jørgensen

Updated

•

22 years ago

Blocks: useragent

Krishna E. Bera

Comment 25

•

22 years ago

Please do not make this feature dependent on Javascript being on. (Why isn't it
called ECMAscript in the browser UI - isn't Mozilla's implementation compliant
with the standard?)  Doing that would be remedying an annoyance (i.e. ad banners,
slow loading) with an abomination (i.e. stupid browser tricks, security holes
galore).  I am not trying to start a religious war, only to add some qualifiers
to my vote for this bug.

David Scheiderich

Comment 26

•

22 years ago

Attached patch: Patch for regexp based blocking of images and cookies. (obsolete)

This is my initial implementation of regexp based blocking. It is using the
JavaScript engine for regexp support. Currently I have not focused on any speed
or other optimizations (storing compiled regexps with the host, so it doesn't
need to be done each time...). I have used it under Linux; I hope it will work on
other platforms. To make actual use of the regexps, the cookperm.txt file must
be edited by hand (located in the profile directory).

The standard host match is still there, and is the only available option from
within the GUI. A sample may look like this (the spaces should be tabs in the
file!):

x10.com 0F 1F
*.doubleclick.net 0F 1F
/images.*.slashdot.org\/banner/ 1F

The first is normal host blocking. The second one will block any host that is
part of the doubleclick.net domain. The last is a regexp to block ad banners
served from slashdot's own image servers.

Most modifications occurred in nsPermissions.cpp. In nsImgManager I modified the
code to send the image URL, instead of just the host.

Jeremy M. Dolan

Comment 27

•

22 years ago

Comment on attachment 88351
Patch for regexp based blocking of images and cookies.

Updating MIME type, as this patch is gzipped.

So these are just wildcards, not regexps? Regexps might be preferable... UI can
always let users enter them as wildcards and convert * to .* and ? to .
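The wildcard-to-regexp conversion suggested here ("*" to ".*" and "?" to ".") might look like the sketch below. It uses std::regex rather than the JavaScript engine, the function names GlobToRegex and GlobMatches are invented for this example, and escaping the remaining regexp metacharacters is an addition the comment does not mention (without it, the dots in a host name would match any character).

```cpp
#include <regex>
#include <string>

// Translates a shell-style wildcard into regexp source:
// '*' becomes ".*", '?' becomes ".", and other regexp metacharacters are
// escaped so that they match literally (an assumption beyond the comment).
std::string GlobToRegex(const std::string& glob) {
    static const std::string kMeta = "\\^$.|+()[]{}";
    std::string out;
    for (char c : glob) {
        if (c == '*') {
            out += ".*";
        } else if (c == '?') {
            out += '.';
        } else {
            if (kMeta.find(c) != std::string::npos) out += '\\';
            out += c;
        }
    }
    return out;
}

// Matches the whole host name against the wildcard, ignoring case.
bool GlobMatches(const std::string& glob, const std::string& host) {
    const std::regex re(GlobToRegex(glob), std::regex::icase);
    return std::regex_match(host, re);
}
```

Note that under this translation "*.doubleclick.net" matches ad.doubleclick.net but not the bare doubleclick.net, since the pattern requires a dot before "doubleclick".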

Jeremy M. Dolan

Comment 28

•

22 years ago

Bah, can't change a patch's MIME type to application/g-zip. You'll have to
download it and manually unzip.

David Scheiderich

Comment 29

•

22 years ago

Sorry about the gzip. It does support regexps, as the third line is an example
of it. For the code to recognize it as such, it must have a leading and trailing
'/' ... just like normal JavaScript or vim. If it does not, then it will be
treated as a normal hostname, or a wildcard if it has '*' in it.
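The classification rule described in this comment can be sketched as a small function. This is an illustration only, not the code from the patch, and EntryKind and Classify are hypothetical names.

```cpp
#include <string>

// The three kinds of cookperm.txt entry described above.
enum class EntryKind { Host, Wildcard, RegExp };

// An entry with a leading and trailing '/' is a regexp; otherwise an entry
// containing '*' is a wildcard; anything else is a plain host name.
EntryKind Classify(const std::string& entry) {
    if (entry.size() >= 2 && entry.front() == '/' && entry.back() == '/') {
        return EntryKind::RegExp;
    }
    if (entry.find('*') != std::string::npos) {
        return EntryKind::Wildcard;
    }
    return EntryKind::Host;
}
```

The regexp test has to come first, because a regexp such as the slashdot example also contains a '*'.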

Daniel Wang

Comment 30

•

22 years ago

I believe the feature should be designed (and marketed) toward network
admins, parents, etc. and should not be marketed as an ad-blocking
feature (it is too easy for advertisers to circumvent this, and we should be
careful about the image of mozilla.org).

What we could do is separate the feature into a backend (read regexes
from a .js file) and a front end (accessible from Preferences, using the
simpler *? expressions) (see bug 94797).

SineSwiper

Comment 31

•

22 years ago

Bah!  Screw the corporates.  They can take away ad-blockers from Tivo, but they
can't touch open-source software.  Put it in the front.  Besides, with enough
people using JunkBuster, Window Washer, and other anti-ad programs, it seems to
me that it's what people want anyway.  It's good if network admins and
parents like it too, but it seems to me that ad-blocking is the primary function
here.

Tim

Comment 32

•

22 years ago

Attached patch: Non-gzipped unified diff (obsolete)

Here's the same patch, but as a unified diff with context, capable of being
applied from the mozilla directory (with -p1) or its parent (-p0).  I've
changed none of the code whatsoever.

Tim

Comment 33

•

22 years ago

Regarding the patch (attachment 88351, aka attachment 88849): It would be
worthwhile to only call into the JS for globs and regexes, assuming it's
expensive to do so. Here's a small patch (applies only after original patch) to
only call the javascript comparison function on special entries (I have over
1000 records in cookperm.txt, all but two being old-style entries; thus,
visiting hosts late in the list results in massive numbers of calls per
image/cookie).
I left out the spacing changes in the else block to make the patch smaller.

--- nsPermissions.cpp~	Sun Jun 23 08:27:36 2002
+++ nsPermissions.cpp	Sun Jun 23 08:31:04 2002
@@ -294,11 +294,25 @@
       /* Try using some JavaScript here now.... */ 
       //fprintf( stderr, "\t%s\n", hostStruct->host );
 
+      /* Use the javascript as a last resort (only if glob or RE) */
+      if( ( *(hostStruct->host) != '/' )
+          && ( PL_strchr( hostStruct->host, '*' ) == NULL ) ) {
+
+        PRInt32 hostlen = PL_strchr(hostStruct->host, '/') - hostStruct->host;
+        if( !PL_strncasecmp( hostname, hostStruct->host, hostlen - 1 ) ) {
+          ret = JS_TRUE;
+          rval = STRING_TO_JSVAL( "true" );
+        } else {
+          continue;
+        }
+      } else {
+
       argv[1] = STRING_TO_JSVAL( 
           JS_NewStringCopyN(jscx,hostStruct->host, PL_strlen(hostStruct->host ) ) );
       ret = JS_CallFunction( jscx, glob, jsCompareFunc, 2, argv, &rval );
       /*fprintf ( stderr, "Called compare, ret = %d, rval = %d\n",
           JSVAL_TO_BOOLEAN( ret ), JSVAL_TO_BOOLEAN( rval ) );*/
+      }
 
       if ( ret == JS_TRUE && (JSVAL_TO_BOOLEAN(rval)==JS_TRUE) ) {
         /* search for type in the permission list for this host */

David Scheiderich

Comment 34

•

22 years ago

The one thing I had thought about to speed it up was to 'compile' the regexp when
loading the file and then save it within the hoststruct. The main problem I can
see with this is large memory usage... especially if you have 1000 items.
However, when it came to comparing the items, it could easily be done by
checking if the member of the struct was null. If it is, then do a basic host
match, otherwise call a one-line JS function.
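A rough sketch of this first idea, compiling once at load time and using a null check to pick the matching path, is shown below. It substitutes std::regex for the JS engine, and HostEntry, MakeEntry and EntryMatches are illustrative names rather than the structures in nsPermissions.cpp.

```cpp
#include <algorithm>
#include <cctype>
#include <memory>
#include <regex>
#include <string>

// An entry keeps its text plus an optional precompiled regexp.
struct HostEntry {
    std::string host;
    std::unique_ptr<std::regex> compiled;  // null for plain host entries
};

// Compiles the regexp once, when the entry is loaded from the file.
HostEntry MakeEntry(const std::string& text) {
    HostEntry entry;
    entry.host = text;
    if (text.size() >= 2 && text.front() == '/' && text.back() == '/') {
        entry.compiled.reset(new std::regex(text.substr(1, text.size() - 2)));
    }
    return entry;
}

bool EqualsIgnoreCase(const std::string& a, const std::string& b) {
    return a.size() == b.size() &&
           std::equal(a.begin(), a.end(), b.begin(), [](char x, char y) {
               return std::tolower(static_cast<unsigned char>(x)) ==
                      std::tolower(static_cast<unsigned char>(y));
           });
}

// Null member: basic host match. Otherwise: search the URL with the regexp.
bool EntryMatches(const HostEntry& entry, const std::string& host,
                  const std::string& url) {
    if (!entry.compiled) return EqualsIgnoreCase(host, entry.host);
    return std::regex_search(url, *entry.compiled);
}
```

The memory concern raised above applies here too: every regexp entry holds a compiled object for as long as the list is loaded.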

Another possible solution would be to keep a second list of actively used
hoststructs, and then compare from there first, moving on to the main list if
a match wasn't found. This would be beneficial when browsing a handful of sites
constantly .... sporadic web-browsing wouldn't be much better off. One could
take this idea to an extreme and make it act like a multi-level feedback queue...

A third option would be to use the JS_RegExp function calls. I can't find them in
the online docs for SpiderMonkey, but they are defined within jsapi.h (line
1611). Then instead of having a JS function, just call the C API directly. This
could be extended with the first idea, saving the compiled regexp.

I think I will tinker with the third option, as it should be the easiest to
implement. As for the politics, I think it should be in the frontend. Simple
host-name blocking already is. I think the wildcards and regexps should be
modifiable from within the preference view. If someone wants to use it as a
censoring tool ... I don't particularly want to be involved. I think censoring is
anti-opensource.

SineSwiper

Comment 35

•

22 years ago

Maybe it could borrow figures from the new URL sorting engine for auto-complete.
 I'm pretty sure it has frequency figures for each URL now.  (I used to have a
bug #, but it was complete, so I removed my vote.)  The higher URLs get faster
regexp matching, and the rarer ones get slower ones.

However, will this catch blocked URLs as well?  For something like
".+\/ads\/.+", it's going to be hit quite a few times, and all of them blocked.

Tim

Comment 36

•

22 years ago

I've done some hacking on the patch.  An important detail that needs to be
addressed is that with inexact host representations, the same URL can match
multiple times.  Currently, my thought is to leave it matching the first entry
hit.  To avoid problems, I've come up with the following recommendations (would
go in AddHost in nsPermissions.cpp):
Preferred order would be from least to most general:
  - Explicit hosts first, sorted alphabetically;
  - Globbed hosts next, sorted alphabetically;
  - Regular expressions last, sorted in descending order of length
      (longest first).

The last point needs addressing, because I don't know of an easy (i.e., fast)
way to calculate a generality index for a regex.  This works for me and allows
my /./ rule to come last.

Note that having a "/./ [01]F" rule at the end of the list allows a whitelist
effect (any URL not matched by an earlier rule always matches this rule).

I will try to make a patch with just the sorting change, but it'll take me some
time to clean out the extra cruft I'm messing with.
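The recommended ordering can be written as a comparison function. The sketch below is standalone C++, not the AddHost change itself. KindRank, EntryLess and SortEntries are made-up names, "alphabetically" is taken to mean plain byte-wise string order, and breaking ties between regexps of equal length alphabetically is an added assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// 0 = explicit host, 1 = globbed host, 2 = regular expression.
int KindRank(const std::string& entry) {
    if (entry.size() >= 2 && entry.front() == '/' && entry.back() == '/') return 2;
    if (entry.find('*') != std::string::npos) return 1;
    return 0;
}

// Orders entries from least to most general as recommended above:
// hosts, then globs (both alphabetical), then regexps, longest first.
bool EntryLess(const std::string& a, const std::string& b) {
    const int rank_a = KindRank(a);
    const int rank_b = KindRank(b);
    if (rank_a != rank_b) return rank_a < rank_b;
    if (rank_a == 2 && a.size() != b.size()) return a.size() > b.size();
    return a < b;  // byte-wise order; also the tie-break for equal-length regexps
}

void SortEntries(std::vector<std::string>* entries) {
    assert(entries != nullptr);
    std::sort(entries->begin(), entries->end(), EntryLess);
}
```

Under this ordering a catch-all /./ rule is the shortest possible regexp and therefore always sorts last, which is what makes the whitelist effect described above work with first-match semantics.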

Tim

Comment 37

•

22 years ago

Attached patch: Sort entries properly (includes original patch) (obsolete)

This patch incorporates the original patch because I'm too lazy to hang on to
an original-patch copy of the affected sources.

The main change it contains is the aforementioned sorting logic.

Attachment #88849 - Attachment is obsolete: true

Kelley Cook

Updated

•

22 years ago

Blocks: 33576

David Scheiderich

Comment 38

•

22 years ago

Attached patch: Patch for regexp based blocking of images and cookies

OK, re-did it using a C++ function for most of the logic, directly calling JS
RegExp functions. (Not listed in the JS API docs, however they are in jsapi.h.)
This also includes the sorting patch from Tim. This needs to be applied to the
original file (unpatched).

Attachment #88351 - Attachment is obsolete: true

Tim

Comment 39

•

22 years ago

For future reference, (cvs -z3) diff -u is the preferred way for patches to be
submitted.  Regardless, nice work. =)

Tim

Comment 40

•

22 years ago

Attached patch: Previous patch as unified diff

As before, I haven't changed anything in the patch--just the format.

Attachment #88966 - Attachment is obsolete: true

Tim

Comment 41

•

22 years ago

-> Image Blocking

Assignee: mstoltz → morse

Component: ImageLib → Image Blocking

QA Contact: tpreston → tever

Tim

Comment 42

•

22 years ago

After discussion in #mozilla, I'm ripping host-based matching apart from the
other types and into a hashtable-based implementation.  That should speed things
up considerably for lists with a lot of hosts (and probably should land
independently of regular expression blocking; if anyone here wants to spin off a
bug dedicated to that change, please go ahead).  However, I'm currently trying
to do a couple other things as well, so I can't easily produce a patch isolating
that change.
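The split described here, exact hosts in a hash table with the slower pattern entries kept in a separate list, could be sketched as follows. This uses the C++ standard library rather than Mozilla's own hashtable classes, and BlockList with its methods is a hypothetical interface made up for illustration.

```cpp
#include <algorithm>
#include <cctype>
#include <regex>
#include <string>
#include <unordered_set>
#include <vector>

std::string ToLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

class BlockList {
public:
    void InsertHost(const std::string& host) { hosts_.insert(ToLower(host)); }
    void AddPattern(const std::string& regex_source) {
        patterns_.emplace_back(regex_source);
    }

    // Exact hosts cost one hash lookup no matter how many are stored;
    // only the (usually few) pattern entries are scanned linearly.
    bool IsBlocked(const std::string& host, const std::string& url) const {
        if (hosts_.count(ToLower(host)) != 0) return true;
        for (const std::regex& pattern : patterns_) {
            if (std::regex_search(url, pattern)) return true;
        }
        return false;
    }

private:
    std::unordered_set<std::string> hosts_;
    std::vector<std::regex> patterns_;
};
```

For a list like the one mentioned earlier, with over 1000 host entries and two special ones, each check would then cost one hash lookup plus at most two regexp searches.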

Alfonso Martinez

Comment 43

•

22 years ago

*** Bug 130685 has been marked as a duplicate of this bug. ***

Dimitrios

Comment 44

•

22 years ago

*** Bug 156280 has been marked as a duplicate of this bug. ***

Matthias Versen [:Matti]

Comment 45

•

22 years ago

*** Bug 159648 has been marked as a duplicate of this bug. ***

Daniel Wang

Updated

•

22 years ago

Summary: [RFE] Pattern-matching based image blocking → [RFE] Pattern-matching based (url-based) image blocking

Jo Hermans

Comment 46

•

22 years ago

*** Bug 165805 has been marked as a duplicate of this bug. ***

Olav Vitters

Comment 47

•

22 years ago

*** Bug 166037 has been marked as a duplicate of this bug. ***

Christopher Wanko

Comment 48

•

22 years ago

so... what's the final status?  does it work for anyone?  i tried it and regexp
doesn't work in Mozilla 1.1.  do i need to build it my own self?

Matthias Versen [:Matti]

Comment 49

•

22 years ago

*** Bug 167630 has been marked as a duplicate of this bug. ***

gabriel

Updated

•

22 years ago

Blocks: 52168

gabriel

Comment 50

•

22 years ago

I have added bug 52168 (Provide UI for regexp cookie blocking) as being
dependent on this one.

Alfonso Martinez

Comment 51

•

22 years ago

*** Bug 172403 has been marked as a duplicate of this bug. ***

Boris 'pi' Piwinger

Comment 52

•

22 years ago

*** Bug 172373 has been marked as a duplicate of this bug. ***

timeless

Updated

•

22 years ago

Blocks: 69758

timeless

Updated

•

22 years ago

Summary: [RFE] Pattern-matching based (url-based) image blocking → Pattern-matching based (url-based) image blocking

Hugo Haas

Comment 53

•

22 years ago

*** Bug 175592 has been marked as a duplicate of this bug. ***

Warner Young

Comment 54

•

22 years ago

*** Bug 175572 has been marked as a duplicate of this bug. ***

dunham

Comment 55

•

22 years ago

I don't think that last patch works for the simple host-based matching.  It
looks like it tries to match the entire url against the hostname.  I had to
change:

  } else {
    // Simple host-based matching
    return !PL_strcasecmp( url, hoststruct->host );	  
 }

to:

  } else {
    // Simple host-based matching
    nsCAutoString str(url);
    PRInt32 pos = str.FindChar('/', 0);
    if (pos > 0)
      str.Cut(pos, str.Length());
    
    return !PL_strcasecmp( str.get(), hoststruct->host );	  
  }
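The same idea, reducing the URL to its host before the literal comparison, can be shown in standalone form. The sketch below is not the patched code: HostOnly is a hypothetical helper, it also drops an optional scheme and a port number (the port case is the problem from Bug 136091 raised in comment 16), and it does not handle user info or IPv6 literals.

```cpp
#include <string>

// Returns the host part of "[scheme://]host[:port][/path]".
std::string HostOnly(const std::string& url) {
    std::string::size_type start = 0;
    const std::string::size_type scheme = url.find("://");
    // Treat "://" as a scheme separator only if no '/' comes before it,
    // so a URL embedded in a query string is not mistaken for the scheme.
    if (scheme != std::string::npos && url.find('/') == scheme + 1) {
        start = scheme + 3;
    }
    // The host ends at the first ':' (port) or '/' (path), whichever is first.
    const std::string::size_type end = url.find_first_of("/:", start);
    return url.substr(start, end == std::string::npos ? end : end - start);
}
```

Comparing the result of such a helper against the stored host name would treat ads.example.com and ads.example.com:80 as the same host.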

Jeremy M. Dolan

Comment 56

•

22 years ago

Is there any perf issue preventing this from being checked in?

Summary: Pattern-matching based (url-based) image blocking → Pattern-matching (regexp) based (url-based) image blocking

Daniel Wang

Updated

•

22 years ago

Blocks: 147866

Jesse Ruderman

Updated

•

22 years ago

Summary: Pattern-matching (regexp) based (url-based) image blocking → block images by directory or by regexp/pattern

Corrado Berti

Updated

•

22 years ago

Blocks: majorbugs

Rob Cline

Comment 57

•

22 years ago

I don't see any comments here regarding the 2nd half of what is shown in the
iCab example someone posted. That is, the ability to filter out by "object link"
or "link target" as it is sometimes called.

So for example, if an image was part of a hyperlink, you could block all images
that link to "*/signup.asp" or "?referrer=", which are typical links for many
ads (I'm sure you can think of others too.)

In the patch mentioned here (which I can't really try right now), is this part
being addressed?

Rob

David Scheiderich

Comment 58

•

22 years ago

At the moment I do not see why this couldn't be checked in. I have added the
modification for literal matching, and I will try to roll another patch (busy
with finals ...). I have not heard anything about the hash-table impl (see
comment 42).

I am not sure what the best way to pursue filtering based on links as opposed to
source URL is. To pull it off, I think that there would need to be changes in
nsImages, to provide not only the source, but a possible link URL. The easy part
is handling it in the checking routines; all that is needed is another perm type.

Recently I have dabbled with creating some form of a simple UI for this (a menu
option pops up a dialog). I am also wondering if it would be good to create a
pref to enable/disable 'advanced image (etc.) blocking'?

Christopher Wanko

Comment 59

•

22 years ago

what does the owner have to say?

Christian :Biesinger (don't email me, ping me on IRC)

Comment 60

•

22 years ago

well you should ask someone to review, then

that's probably morse because he owns cookies.

Preston Crow

Comment 61

•

22 years ago

This feature has been available since Netscape 2.0 via automatic proxy
configuration.  See http://www.schooner.com/~loverso/no-ads/ for information on
how to configure this.  I believe that this provides a complete solution without
any need for code changes.

Ervin Németh

Comment 62

•

22 years ago

Preston, you are right.  But I have the following comments:

1. With automatic proxy configuration you will have to use a blackhole proxy
server.  You suggest a simple way with inetd and a shell script.  For me, this
resulted in the following:

inetd[10430]: 3421/tcp server failing (looping or being flooded), service
terminated for 10 min

The *CORRECT WAY* would be if Mozilla would not even try to make a connection if
an image is an ad.


2. Ads are on webpages so the author gets money for his work.  IMHO Mozilla
should treat ads in a way so authors don't starve but the ads occupy as few
resources in the browser as necessary.


3. As previously mentioned in this bug there are other attributes on images
which could help determining its state for blocking: the object link.

Another decision rule could be whether the image is hosted in another domain
(there is already an option for this but it is not flexible enough).

R.K.Aa.

Comment 63

•

22 years ago

*** Bug 187940 has been marked as a duplicate of this bug. ***

Mitchell Stoltz (not reading bugmail)

Comment 64

•

22 years ago

Mass reassigning of Image manager bugs to mstoltz@netscape.com, and futuring.
Most of these bugs are enhancement requests or are otherwise low priority at
this time.

Assignee: morse → mstoltz

Krishna E. Bera

Comment 65

•

22 years ago

is there a way to change the priority on this other than voting for it?
ad image blocking is one of the few reasons i'm using Mozilla and not some
lightweight browser.

Rob

Comment 66

•

22 years ago

You could always submit enough dupes for it to get onto the mostfreq list.

Krishna E. Bera

Comment 67

•

22 years ago

i don't really want to subvert bugzilla for a feature that might be important to
me and not many others.
i think there are four separate user needs that the mozilla community needs to
address.  once policies have been set or a way for users to state their
preferences has been created, features that claim to meet these needs in various
ways will iron themselves out.

1. user control of bandwidth they use.

2. need to minimize page load time, e.g. by making some content optional

3. discouraging providers of unwanted content (advertising, objectionable stuff,
etc), e.g. by not providing hits to their server

4. user control (show/block/alter) of presentation of downloaded content.

Mitchell Stoltz (not reading bugmail)

Comment 68

•

22 years ago

Krishna, I appreciate your concern and I agree that this would be "a good thing
to have." Unfortunately, no one at Netscape has the time to work on this right
now. The best way to get this addressed will probably be for you to find someone
interested in implementing it, maybe by asking around on the newsgroups. If
someone comes up with a patch, I'll make sure it gets reviewed.

Henrik Aasted Sorensen

Comment 69

•

22 years ago

While it's not a patch for the core code of Mozilla, an extension
(http://adblock.mozdev.org) has been made, which contains a lot of the
functionality requested in this bug report. At the moment it merely hides ads,
but I expect that when bug #162044 is resolved, it will be possible to perform
true blocking.

Quinn Yost (mythdraug)

Comment 70

•

22 years ago

Mitchell: Is the patch attached here so rotted as to be invalid now?
http://bugzilla.mozilla.org/attachment.cgi?id=89643&action=view

Mitchell Stoltz (not reading bugmail)

Comment 71

•

22 years ago

It can probably be updated with a minimum of work.

Paul Rubin

Comment 72

•

22 years ago

It should be possible to block not just images, but everything that can be
transcluded from the original HTML (that includes CSS files, JavaScript brought
in through the src attribute (as in "<script src=whatever.js>"), and so forth).
You'd just get the HTML file the same way Lynx does.

One motivation for this is looking at a page through its Google cache link when
the page's actual host server is down.  Currently, if you click the cache link,
sometimes the HTML file is retrieved from the Google cache and then the browser
hangs for a long time trying to retrieve a CSS file that the HTML file asks for.
Another reason for using the Google cache might be if you want to look at a page
without leaving an HTTP hit in the actual host's server log.  You can turn off
images, but there's still all these other things the page can try to load.  So
there should be a way to say, "don't load ANYTHING!".

Max Waterman

Comment 73

•

22 years ago

I would especially like to block flash ads, for the same reason I want to block
some images - as mentioned in #18.

Max.

Doug

Comment 74

•

22 years ago

The AdBlock plugin is, indeed, nice.  But AdShield for IE is even nicer -
blocking Flash too!  Here's hoping for something just as good in Mozilla
(actually, Phoenix :-)

Matthias Versen [:Matti]

Comment 75

•

22 years ago

*** Bug 194529 has been marked as a duplicate of this bug. ***

benc

Updated

•

22 years ago

QA Contact: tever → nobody

Sébastien Delahaye

Comment 76

•

21 years ago

*** Bug 203625 has been marked as a duplicate of this bug. ***

Matthias Versen [:Matti]

Comment 77

•

21 years ago

*** Bug 210114 has been marked as a duplicate of this bug. ***

Mitchell Stoltz (not reading bugmail)

Updated

•

21 years ago

Status: NEW → ASSIGNED

Summary: block images by directory or by regexp/pattern → [RFE]block images by directory or by regexp/pattern

Mitchell Stoltz (not reading bugmail)

Comment 78

•

21 years ago

*** Bug 90634 has been marked as a duplicate of this bug. ***

Mitchell Stoltz (not reading bugmail)

Comment 79

•

21 years ago

I recommend this be moved to mozdev - it's a good idea, but it's way more than
most people need. The number of users who could benefit from this does not
justify the added bloat and complexity in the core browser. This would make a
fine add-on. Tim, still interested in working on this?

Olivier Cahagne

Comment 80

•

21 years ago

*** Bug 211250 has been marked as a duplicate of this bug. ***

SineSwiper

Comment 81

•

21 years ago

Could somebody implement this on a base level, like wildcard (* and ?) support
for the servers only?  I can understand if putting this in for the directory
level would be a challenge.

*looks up*  Oh, what's the status of the above patches?

> it's a good idea, but it's way more than most people need. The number of users 
> who could benefit from this does not justify the added bloat and complexity in 
> the core browser.

Votes = 100, Dupes = 19, CCs = 62.  The number of users have spoken...

Patrick Barnes

Comment 82

•

21 years ago

I think that this does belong in the core, not as an addon (at mozdev or
anywhere else).  There are a large number of people who know and like regular
expressions.  They are showing up in many end-user programs.  The success of
programs that use regular expressions should prove that there is a place for them.

Patrick Barnes

Comment 83

•

21 years ago

...but the ability to switch between regular expressions and standard wildcards
could be valuable.

Derek

Comment 84

•

21 years ago

cow

David Scheiderich

Comment 85

•

21 years ago

The status of the above patch is that it's not working. The last time I worked on
this (November 2002), there had been changes to NSPR or something
which affected this code. It was minor to fix - but I never re-released a patch
as I was playing around with some other enhancements.

The current implementation supports 3 modes. (1) The current strict textual
match. (2) A simple wildcard via * and ?. (3) RegExp via the javascript engine.
Truthfully, (2) is implemented as a regexp under the hood as well.
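
The three modes above can be sketched roughly as follows. This is only an illustration in Python, not the patch's code (which, as noted, goes through the JavaScript engine); `wildcard_to_regex` and `matches` are hypothetical names, and anchoring the wildcard to the whole string is an assumption.

```python
import re

def wildcard_to_regex(pattern: str) -> str:
    """Translate a * / ? wildcard into an anchored regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any run of characters
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return "^" + "".join(parts) + "$"

def matches(mode: str, pattern: str, target: str) -> bool:
    """Return True if target matches pattern under the given mode."""
    if mode == "exact":      # (1) strict textual match
        return pattern == target
    if mode == "wildcard":   # (2) wildcard, run as a regexp under the hood
        return re.search(wildcard_to_regex(pattern), target) is not None
    if mode == "regexp":     # (3) full regular expression
        return re.search(pattern, target) is not None
    raise ValueError("unknown mode: " + mode)
```

Since mode (2) is just a translation step in front of mode (3), only one matching engine is needed.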

The discussion was moving towards how to speed up the whole process in a
situation with, say, 1000 entries. Tim suggested hashing for host-based
matching. Ideas for regexp? I don't really know any. My suggestion was to lazy
load entries from the file and only keep so many around (like paging in an
OS). I also thought of pre-compiling the regexps, but then they need to be stored
in memory - which could be expensive.
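
A minimal sketch of two of the ideas above: a hash lookup for host-based entries, and a bounded cache so that only recently used regexps stay compiled in memory. Python is used for illustration only; `BLOCKED_HOSTS`, `compiled`, `is_blocked` and the cache size of 64 are assumptions, not anything taken from the patch.

```python
import re
from functools import lru_cache

# Hypothetical host entries; a hash set gives constant-time lookups on average.
BLOCKED_HOSTS = frozenset({"ads.example.com", "banners.example.net"})

@lru_cache(maxsize=64)
def compiled(pattern: str) -> re.Pattern:
    """Compile a regexp on first use; least recently used ones are evicted."""
    return re.compile(pattern)

def is_blocked(host: str, url: str, patterns) -> bool:
    """Check the cheap host hash first, then fall back to the regexp entries."""
    if host in BLOCKED_HOSTS:
        return True
    return any(compiled(p).search(url) for p in patterns)
```

The cache bound trades memory for recompilation cost, which is the tension described above; the right size would have to be measured.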

The UI for this "bug" was a different issue as well - a different bug, bug 52168. I
played with an idea - but never got too far. In doing so, I figured the best bet
was to have a config option to enable or disable regexp blocking. This would
turn off the additional features in the UI for those who do not care or understand.

Personally I have not followed through with this because for the last six months I
was busy with my senior project and other classes. Now that I am done, and am
waiting to start school in a few months, I have some free time to work on this
again.

SineSwiper

Comment 86

•

21 years ago

> The discussion was moving towards how to speed up the hole processes in a
> situation with say 1000 entries.

If this is implemented, you won't need 1000 entries, unless you're trying to
recreate the Great Firewall of China, and that would demand a separate daemon
process, anyway.

kmike

Comment 87

•

21 years ago

> If this is implemented, you won't need 1000 entries, unless you're trying to 
> recreate the Great Firewall of China, and that would demand a seperate daemon 
> process, anyway.

Not so. We're talking about ad blocking, and the number of ad servers on the Internet is
quickly approaching infinity :)
One of the most popular ad server lists is more than 1000 entries long:
http://pgl.yoyo.org/adservers/
Another list (in the form of a proxy autoconfig .pac file) is nearly 500 entries (hard
to say exactly, too many comments inside):
http://www.schooner.com/~loverso/no-ads/
Both use wildcards extensively; the second list also uses regexps.

BTW, Mozilla's performance with .pac ad blocking is really horrible. It locks up
for several seconds on every URL request. As .pac files use the JS engine, that
might also apply to modes (2) and (3) proposed by David.

Luis Miguel Lagoa Baptista Ferro

Comment 88

•

21 years ago

Running the JavaScript engine isn't very nice for something that needs to run
"quickly"... Why not do a "byte-code" compilation of sorts, to cut down what the
browser has to run to process the rules while still allowing the nicer userland
language for writing them...
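
One possible reading of this idea, sketched in Python for illustration: translate the user-friendly wildcard rules once, at load time, into a single pre-compiled matcher, so each request costs one match call instead of one per rule. `compile_rules` is a hypothetical name, and whether this is fast enough for large lists would need measuring.

```python
import re

def compile_rules(rules):
    """Compile a list of * / ? wildcard rules into one anchored regexp."""
    if not rules:
        return re.compile(r"(?!)")  # empty rule list: never matches anything
    alternatives = []
    for rule in rules:
        translated = "".join(
            ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
            for ch in rule
        )
        alternatives.append("(?:" + translated + ")")
    # One alternation over all rules, compiled a single time.
    return re.compile("^(?:" + "|".join(alternatives) + ")$")

# Usage: compile at startup, then reuse the matcher for every request.
matcher = compile_rules(["ad?.example.com", "*.adnet.example"])
```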

Tim

Comment 89

•

21 years ago

I'm still interested in this, Mitchell. I'll be helping David out however I can.
My preference would be to have this in mozilla/extensions, since it's definitely
something that embedders wouldn't want to drag along, but it seems desirable
enough to be appropriate for the default build.

To address some of the comments made, I don't expect this to become an all-out
generic filtering mechanism (like mail/news's filters), so that should be a
separate RFE (and more likely, a mozdev project, IMO). Obviously, it's David's
call. As for people who want block lists of 1000 or more, I don't personally
intend to cater to that crowd first (again, I'm not David, and I don't speak for
him). That is, performance in the insanely-large-list category will not be my
personal main concern (initially, anyway).

Bill Mason

Comment 90

•

21 years ago

*** Bug 214869 has been marked as a duplicate of this bug. ***

Steve

Comment 91

•

21 years ago

I have been getting spam with a gif image like

http://xuphekuwisudohuti.miracleproductline.com/healthnews/

with a serial number at hotmail. So they know when I look at the email. I have
started blocking images from the server, but the guy varies the first part
before the domain. So the blocking doesn't work.  

http://[VARIES].miracleproductline.com

The Manage Image Permissions dialog only allows removing the whole address, not
editing it down to just the server address. So I am stuck. Being able to block
this would be great:
miracleproductline.com

Though he seems to have a plethora of domains, it'd be a start.

Sander

Comment 92

•

21 years ago

== OT spam ==
Re: comment 91 - go to http://miracleproductline.com and choose tools -> image
manager -> block images from this site. Image blocking nowadays blocks the
domain _and_ all its subdomains. This was fixed by bug 176950.
Manually adding domains is bug 33467, but for now you could also just edit
cookperm.txt directly. Adding the following line
miracleproductline.com 1F
to cookperm.txt would do the same.
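
The behaviour described above (a blocked domain also covers all its subdomains) can be sketched as a walk over the host's suffixes. This is an illustration in Python only; it does not reproduce Mozilla's permission code and does not parse cookperm.txt, and `host_is_blocked` is a hypothetical name.

```python
def host_is_blocked(host: str, blocked) -> bool:
    """True if host, or any parent domain of host, is in the blocked set."""
    labels = host.lower().rstrip(".").split(".")
    # Try "a.b.example.com", then "b.example.com", then "example.com", ...
    return any(".".join(labels[i:]) in blocked for i in range(len(labels)))

blocked = {"miracleproductline.com"}
```

Matching on whole labels means `notmiracleproductline.com` is not caught by an entry for `miracleproductline.com`, which a plain string suffix test would get wrong.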

Peter Lowe

Comment 93

•

21 years ago

The list at http://pgl.yoyo.org/adservers/ is available in lots of different 
formats, including as a PAC file:

http://pgl.yoyo.org/adservers/serverlist.php?
hostformat=proxyautoconfig&showintro=1

and as a Mozilla cookie permissions file:

http://pgl.yoyo.org/adservers/serverlist.php?hostformat=cookperm&showintro=1

More information is on the main page.

Bill Mason

Comment 94

•

21 years ago

*** Bug 218202 has been marked as a duplicate of this bug. ***

Michiel van Leeuwen (email: mvl+moz@)

Comment 95

•

21 years ago

*** Bug 221535 has been marked as a duplicate of this bug. ***

Jo Hermans

Comment 96

•

21 years ago

*** Bug 226501 has been marked as a duplicate of this bug. ***

Bryan Roseberry

Comment 97

•

21 years ago

*** Bug 228592 has been marked as a duplicate of this bug. ***

Michiel van Leeuwen (email: mvl+moz@)

Comment 98

•

20 years ago

*** Bug 217115 has been marked as a duplicate of this bug. ***

Andrew Hagen

Comment 99

•

20 years ago

I am a little confused.

Could someone please clarify whether this bug includes cookie whitelisting, as
called for in duplicate bug 217115, or does this bug just cover cookie (and
image) blacklisting? 

TIA.

benc

Updated

•

20 years ago

No longer blocks: 52168

Patrick

Comment 100

•

20 years ago

*** Bug 176539 has been marked as a duplicate of this bug. ***

SineSwiper

Comment 101

•

20 years ago

I think for now, it just includes blacklisting.  If whitelisting is easy enough
to include, so be it, but it's not a requirement.  This is mostly for catching
the word "ad" in a directory or server name, such as:

ad(s|server)?\..+
[\w\.\-]+\/ads?\/

A better list of examples could be found by just looking at a Junkbuster regexp
blocker file.
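
As a quick check of the two example patterns, here is a sketch in Python. It assumes the patterns are searched for anywhere in a URL with the scheme stripped, which the comment does not specify; `is_ad` and the sample URLs are made up for illustration.

```python
import re

SERVER_RULE = re.compile(r"ad(s|server)?\..+")   # "ad", "ads" or "adserver" before a dot
PATH_RULE = re.compile(r"[\w\.\-]+\/ads?\/")     # an "/ad/" or "/ads/" directory

def is_ad(url: str) -> bool:
    """True if either example pattern is found anywhere in the URL."""
    return bool(SERVER_RULE.search(url) or PATH_RULE.search(url))

# A stricter variant (an assumption, not from the comment): require a word
# boundary before "ad" so that names merely ending in "ad" are not caught.
STRICT_SERVER_RULE = re.compile(r"\bad(s|server)?\..+")
```

Searched unanchored, the first pattern also matches `download.example.com`, because "download." ends in "ad."; the word-boundary variant avoids that particular false positive.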

Daniel Veditz [:dveditz]

Updated

•

20 years ago

Blocks: 110363

Stefan Borggraefe

Comment 102

•

20 years ago

*** Bug 264219 has been marked as a duplicate of this bug. ***

Gérard Talbot

Updated

•

20 years ago

Blocks: 273416

Sergey Sokoloff

Comment 103

•

20 years ago

Why should AdBlock be hardcoded into Mozilla core? Keeping them independent
increases AdBlock's versatility (it's updated independently, etc.) and keeps
Mozilla more lightweight.

Jerry Baker

Updated

•

19 years ago

No longer blocks: majorbugs

Jaime Mitchell (use bugmail@jaimem.org.uk for email)

Comment 104

•

19 years ago

*** Bug 126635 has been marked as a duplicate of this bug. ***

:Gavin Sharp [email: gavin@gavinsharp.com]

Updated

•

18 years ago

Status: ASSIGNED → NEW

QA Contact: nobody

(mostly gone) XtC4UaLL [:xtc4uall]

Comment 105

•

18 years ago

https://bugzilla.mozilla.org/show_bug.cgi?id=273416

u235898

Comment 106

•

18 years ago

Please note that allowing wildcards must also cover (or throttle) cookie changes, renewals and so on. Otherwise you get all these nagging screens if you're using FF 2.0 with "ask me for cookie".

thx
~Marcel

Daniel Veditz [:dveditz]

Updated

•

18 years ago

Assignee: security-bugs → nobody

Phil Ringnalda (:philor)

Updated

•

15 years ago

QA Contact: image-blocking

Ian Neal

Updated

•

14 years ago

No longer blocks: 273416

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

icab image blocking pref, 22 years ago, Jamie Katz, 12.50 KB, image/png
Patch for regexp based blocking of images and cookies., 22 years ago, David Scheiderich, 2.51 KB, patch
Non-gzipped unified diff, 22 years ago, Tim, 9.77 KB, patch
Sort entries properly (includes original patch), 22 years ago, Tim, 11.35 KB, patch
Patch for regexp based blocking of images and cookies, 22 years ago, David Scheiderich, 9.57 KB, patch
Previous patch as unified diff, 22 years ago, Tim, 13.46 KB, patch