
[Feature Request] HTML Filter with Regular Expressions support




HTML: Parser
17 years ago
14 years ago


(Reporter: Marek Wawoczny, Assigned: harishd)







17 years ago

I would like to see an HTML filter (plug-in) with regular expression support.
With this I could delete unwanted HTML code before the page is displayed.

How it should work:
User requests URL -> browser gets document -> HTML filter -> browser starts to
parse the HTML page

For example:
Some page has a refresh meta tag. I want to get rid of this tag.

Regular expression to match the meta tag:
*http-equiv=("|)refresh*content=("|) [#2:10000](*url=("|)([^"' ]+)\1|)*

If it matches, the tag should be replaced by
<center><font size=1><a href=\1 >[Refresh]</a></font></center>

This will show a [Refresh] link, so I can decide when, or whether, the page
should refresh by clicking it.

This HTML filter could be used to block JavaScript, pop-ups and images, to
prevent user tracking (web bugs, referrer stripping), to fix the most common
bugs in HTML code, etc.
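
A minimal sketch of the meta refresh filter described above, written in Python with a standard regular expression rather than the pattern syntax used in the report. The names META_REFRESH and replace_meta_refresh are hypothetical, and the handling of a tag without a URL (an empty href, which points back at the current page) is an assumption, not something the report specifies.

```python
import re
from html import escape

# Matches <meta http-equiv=refresh content="N; url=TARGET">, quotes optional.
# Assumes http-equiv appears before content, as in the pattern from the report.
META_REFRESH = re.compile(
    r"""
    <meta\b[^>]*?                                 # start of the meta tag
    http-equiv\s*=\s*["']?refresh["']?            # http-equiv=refresh
    [^>]*?
    content\s*=\s*["']?\s*\d+\s*                  # delay in seconds
    (?:;\s*url\s*=\s*["']?(?P<url>[^"'\s>]+))?    # optional target URL
    [^>]*>                                        # rest of the tag
    """,
    re.IGNORECASE | re.VERBOSE,
)


def replace_meta_refresh(html: str) -> str:
    """Replace each meta refresh tag with a [Refresh] link the user can click."""

    def to_link(match: re.Match) -> str:
        # Assumption: with no URL in the tag, use an empty href (current page).
        url = match.group("url") or ""
        href = escape(url, quote=True)
        return f'<center><font size=1><a href="{href}">[Refresh]</a></font></center>'

    return META_REFRESH.sub(to_link, html)
```

Running the raw document text through replace_meta_refresh before it reaches the parser corresponds to the "HTML filter" step in the flow above. A regular expression like this only covers the common spellings of the tag; it is not a full HTML tokenizer.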

Comment 1

17 years ago
How about just the ability to pipe HTML pages through an external program before
rendering?  This should be far easier to code into Mozilla and be far more
flexible than built-in Mozilla substitutions.


Comment 2

17 years ago
I think it is better to build it into Mozilla, because not everyone wants to use
an external program, and not every external program is freeware. I currently use
such a program, Proxomitron, to filter HTML via a local proxy. If you are
thinking about piping HTML to an external application, you can already do that
with any browser without modification, since almost all HTML filters use the
browser's proxy settings. If the filter were built into Mozilla, you could turn
it on or off with a single click, for example from the browser's navigation bar,
and configuration would be easy too. No more clicking through two or more
applications to turn on a feature. I'm also thinking about making it easy to add
new routines. For example, you load a page and it isn't filtered the way you
want, so you right-click the element you don't want to see next time and choose
"Add to unwanted stuff" from the menu. Easy. Maybe it would be even easier for
non-professional or new users. Also, don't forget professional users: they can
set up regular expression patterns to suit their needs. But this is only a wish;
the most important thing is to make Mozilla fast and the best browser in the
world :)))

Comment 3

17 years ago
Assignee: asa → harishd
Component: Browser-General → Parser
QA Contact: doronr → bsharma

Comment 4

17 years ago
You might try Protozilla; I really can't imagine someone implementing this as
part of the trunk.
Confirming enhancement request.
Ever confirmed: true
OS: Windows 2000 → All
Hardware: PC → All

Comment 6

16 years ago
This bug has been marked "future" because the original Netscape engineer working
on this is over-burdened. If you feel this is an error, that you or another
known resource will be working on this bug, or if it blocks your work in some
way, please attach your concern to the bug for reconsideration.

Target Milestone: --- → Future


16 years ago
QA Contact: bsharma → moied

Comment 7

16 years ago
I think that regex support is not needed. Instead it should be easy to maintain
a list of unwanted URLs (just like "Block Images from Server", but also for
JavaScript pop-ups, HTML pages and banners).

Comment 8

16 years ago
>I think, that a regex support is not needed. Instead it should be easy to maintain
>a list of unwanted URLs (just as the "Block Images from Server", but also for
>javascript Pop-ups, HTML-pages and Banners).

And what about the example that I provided in the bug report? The block feature is only a partial solution; imagine what you can do with regexps. You can change any string (for example metadata, tags, fragments of code) with a regexp.

Comment 9

16 years ago
Changing the HTML source of a page could be a copyright violation, but "not loading" parts of a Web page (banners, pop-ups, Flash, ...) doesn't violate any copyrights. Besides, it's much easier to use for those who don't want to learn a new regex syntax.

Comment 10

16 years ago
Hmmmm... So you are saying that all HTML filter applications and proxy servers (some of which add content to viewed pages) are copyright violations? I don't think so: you use them for viewing, and you are not distributing modified pages in any way. By the way, why should a newbie have to use regexps? I'm thinking of predefined regexp rules, plus the ability to add new ones (for advanced users).

Comment 11

16 years ago
*** Bug 152017 has been marked as a duplicate of this bug. ***

Comment 12

15 years ago
*** Bug 159818 has been marked as a duplicate of this bug. ***

Comment 13

15 years ago
If you want to control whether a single page loads, just use auto-proxy
configuration.  You supply a JavaScript function that is given a URL and returns
the proxy to use for that URL.  You can have the function use whatever logic it
likes to decide, for example, whether to use a proxy or get the URL natively. 
This can be used with a fake proxy to block ads (essentially integrating
JunkBuster into the browser).

As for modifying the HTML before sending it to the parser, it would be nice to
have a similar model:  You can supply a function that takes a URL as a
parameter, and it tells you what, if any, function to use to preprocess the
HTML.  This should be done in as generic a manner as possible so as to maximize

Comment 14

14 years ago
I think that the filter functions of Mozilla (spam e-mail, pop-ups, cookies) need
a consistent, regex-enabled UI. "Allow only the already existing cookies" or
"Disallow any cookies except for new ones from domain xy" are simple rules that
are not possible yet.

All filter modules should use the same database, for example the core code of
an open-source database like MySQL. This way we can stop the bloat. Have a look
at the files in the profile folder and at the execution time of various events;
people are already beginning to laugh!

Comment 15

14 years ago
cookies, popup windows, e-mail spam, passwords and other form auto-completion,
and now the image manager. text files with *.s, *.tbl and *.w. 

>All filter modules should use the same data base, for example the core code of
>an open-source data base like mySQL. 

Maybe I am wrong, but I do not like featuritis. 
This is not going to happen in the parser.  If someone wants to write an
extension to do this, please feel free to file a bug on the hooks you think you
need (they would be hooks into content dispatch and necko, probably, not the
Last Resolved: 14 years ago
Resolution: --- → WONTFIX
Doing this with regexp would be silly anyway, IMHO. We have a DOM...