86504 - [Feature Request] HTML Filter with Regular Expressions support

Reporter

Description

•

24 years ago

Hello! I would like to see an HTML filter (plug-in) with regular expression support. With it I could delete unwanted HTML code before the page is displayed.

How it should work: user URL request -> browser gets document -> HTML filter -> browser starts to parse the HTML page.

For example: some page has a refresh meta tag and I want to get rid of it. Regular expression to match the meta tag:

*http-equiv=("|)refresh*content=("|) [#2:10000](*url=("|)([^"' ]+)\1|)*

If matched, it should be replaced by:

<center><font size=1><a href=\1 >[Refresh]</a></font></center>

This will show a [Refresh] link, so I can decide when, or whether, the page should refresh by clicking it. Such an HTML filter could be used to block JavaScript, pop-ups, and images; to prevent user tracking (web bugs, stripping the referrer); to fix the most common bugs in HTML code; and so on.
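To make the request above concrete, here is a minimal sketch of the meta-refresh rule written with Python's `re` module instead of the filter syntax quoted in the report. The pattern, the function name `replace_meta_refresh`, and the sample page are all assumptions made for illustration; a real filter would have to cope with attribute order, missing URLs, and other variations that this pattern ignores.

```python
import re

# Hypothetical pattern: matches <meta http-equiv="refresh" content="N; url=...">
# with the attributes in exactly this order. Group 1 captures the target URL.
META_REFRESH = re.compile(
    r"""<meta\s+http-equiv=["']?refresh["']?\s+content=["']?\s*\d+\s*;\s*url=([^"'\s>]+)["']?\s*/?>""",
    re.IGNORECASE,
)


def replace_meta_refresh(html: str) -> str:
    """Replace each matching meta refresh tag with a manual [Refresh] link."""
    return META_REFRESH.sub(
        r'<center><font size=1><a href="\1">[Refresh]</a></font></center>',
        html,
    )


# Sample input, invented for this sketch.
page = '<head><meta http-equiv="refresh" content="5; url=http://example.com/next"></head>'
filtered = replace_meta_refresh(page)
print(filtered)
```

The filter would run on the raw document text after download and before parsing, which is the step the report calls "HTML filter".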

kevin arnold

Comment 1

•

24 years ago

How about just the ability to pipe HTML pages through an external program before rendering? This should be far easier to code into Mozilla and far more flexible than built-in Mozilla substitutions.

Marek Wawoczny

Reporter

Comment 2

•

24 years ago

I think it is better to build it into Mozilla, because not everyone wants to use an external program, and not every external program is freeware. Right now I use such a program, Proxomitron, to filter HTML via a local proxy. If you are thinking about piping HTML to an external app, you can already do that with any browser without modifications: almost all HTML filters use the browser's proxy settings.

If it were built into Mozilla, you could turn it on or off with a single click, for example on the browser's navigation bar, and configuration would be easy too. No more clicking through two or more applications to turn on some feature.

I'm also thinking about easily adding new routines. For example, you load some page and it isn't filtered the way you wish, so you right-click the element you don't want to see next time and choose "Add to unwanted stuff" in the menu. Easy. Maybe it would be even easier for non-professional and new users. And don't forget professional users: they can set up regular expression patterns to suit their demands.

But this is only a wish; the most important thing is to make Mozilla fast and the best browser in the world :)))

Asa Dotzler [:asa]

Comment 3

•

24 years ago

->parser?

Assignee: asa → harishd

Component: Browser-General → Parser

QA Contact: doronr → bsharma

timeless

Comment 4

•

24 years ago

You might try Protozilla; I really can't imagine someone implementing this as part of the trunk.

Boris Zbarsky [:bzbarsky]

Comment 5

•

24 years ago

confirming enhancement request

Status: UNCONFIRMED → NEW

Ever confirmed: true

OS: Windows 2000 → All

Hardware: PC → All

harishd

Assignee

Comment 6

•

24 years ago

This bug has been marked "future" because the original Netscape engineer working on this is over-burdened. If you feel this is an error, that you or another known resource will be working on this bug, or if it blocks your work in some way, please attach your concern to the bug for reconsideration.

Target Milestone: --- → Future

Moied

Updated

•

24 years ago

QA Contact: bsharma → moied

Raphael Wegmann

Comment 7

•

23 years ago

I think that regex support is not needed. Instead, it should be easy to maintain a list of unwanted URLs (just like "Block Images from Server", but also for JavaScript pop-ups, HTML pages, and banners).

Marek Wawoczny

Reporter

Comment 8

•

23 years ago

>I think, that a regex support is not needed. Instead it should be easy to maintain
>a list of unwanted URLs (just as the "Block Images from Server", but also for
>javascript Pop-ups, HTML-pages and Banners).

And what about the example that I provided in the bug report? The block feature is only a partial solution; imagine what you could do with regexps... You could change any string (for example meta data, tags, fragments of code).

Raphael Wegmann

Comment 9

•

23 years ago

Changing the HTML source of a page could be a copyright violation, but "not loading" parts of a Web page (banners, pop-ups, Flash, ...) doesn't violate any copyrights. Besides, it's much easier to use for those who don't want to learn a new regex syntax.

Marek Wawoczny

Reporter

Comment 10

•

23 years ago

Hmmmm... So you are trying to say that all HTML filter applications and proxy servers (some of which add stuff to viewed pages) are copyright violations? I don't think so: you use them for viewing, and you are not distributing the modified pages in any way. By the way, why should a newbie have to use regexps? I'm thinking about predefined regexp rules, plus the ability to add new ones (for advanced users).

R.K.Aa.

Comment 11

•

23 years ago

*** Bug 152017 has been marked as a duplicate of this bug. ***

Alfonso Martinez

Comment 12

•

23 years ago

*** Bug 159818 has been marked as a duplicate of this bug. ***

Preston Crow

Comment 13

•

23 years ago

If you want to control whether a single page loads, just use auto-proxy configuration. You supply a javascript function that is given a URL and returns the proxy to use for that URL. You can have the function use whatever logic it likes to decide, for example, whether to use a proxy or get the URL natively. This can be used with a fake proxy to block ads (essentially integrating JunkBuster into the browser). As for modifying the HTML before sending it to the parser, it would be nice to have a similar model: You can supply a function that takes a URL as a parameter, and it tells you what, if any, function to use to preprocess the HTML. This should be done in as generic a manner as possible so as to maximize flexibility.
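The model proposed in this comment, a function that maps a URL to the preprocessing function (if any) to run on the HTML, can be sketched roughly as follows. This is only an illustration: `find_preprocessor_for_url`, `strip_scripts`, and the host `ads.example.com` are hypothetical names chosen here by analogy with auto-proxy configuration, not an existing Mozilla hook.

```python
import re
from typing import Callable, Optional
from urllib.parse import urlparse

# A preprocessor takes the raw HTML text and returns the filtered text.
Preprocessor = Callable[[str], str]


def strip_scripts(html: str) -> str:
    """Hypothetical preprocessor: remove <script>...</script> blocks."""
    return re.sub(
        r"<script\b.*?</script\s*>", "", html, flags=re.IGNORECASE | re.DOTALL
    )


def find_preprocessor_for_url(url: str) -> Optional[Preprocessor]:
    """Return the preprocessor to apply for this URL, or None for no filtering."""
    host = urlparse(url).hostname or ""
    # Assumed rule for the sketch: only pages from this host get filtered.
    if host == "ads.example.com" or host.endswith(".ads.example.com"):
        return strip_scripts
    return None


def preprocess(url: str, html: str) -> str:
    """Run the selected preprocessor, if any, before the HTML reaches the parser."""
    handler = find_preprocessor_for_url(url)
    return handler(html) if handler else html


sample = "<p>a</p><script>alert(1)</script>"
print(preprocess("http://ads.example.com/x", sample))
print(preprocess("http://example.org/", sample))
```

Keeping the URL decision separate from the filters themselves is what makes the model generic: new filters can be added without touching the dispatch logic.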

Ulrich Grassberger

Comment 14

•

22 years ago

I think that the filter functions of Mozilla (spam e-mail, popup, cookies) need a consistent, regex-enabled UI. "Allow only the already existing cookies" or "Disallow any cookies except for new ones from domain xy", simple rules which are not possible yet. All filter modules should use the same data base, for example the core code of an open-source data base like mySQL. This way we can stop the bloat. Have a look at the files in the profile folder and at execution time of various events - people are already beginning to laugh!

Ulrich Grassberger

Comment 15

•

22 years ago

cookies, popup windows, e-mail spam, passwords and other form auto-completion, and now the image manager. text files with *.s, *.tbl and *.w.

>All filter modules should use the same data base, for example the core code of an open-source data base like mySQL.

Maybe I am wrong, but I do not like featuritis.

Boris Zbarsky [:bzbarsky]

Comment 16

•

21 years ago

This is not going to happen in the parser. If someone wants to write an extension to do this, please feel free to file a bug on the hooks you think you need (they would be hooks into content dispatch and necko, probably, not the parser).

Status: NEW → RESOLVED

Closed: 21 years ago

Resolution: --- → WONTFIX

Hixie (not reading bugmail)

Comment 17

•

21 years ago

Doing this with regexp would be silly anyway, IMHO. We have a DOM...
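As a rough illustration of working on parsed structure rather than on raw text, the sketch below drops meta refresh tags by inspecting parsed tag names and attributes. Python's standard library has no HTML DOM, so `html.parser` and its event stream stand in for one here; the class name `MetaRefreshFilter` and the sample page are assumptions made for the example, and this is not how a browser extension would actually be written.

```python
from html.parser import HTMLParser


class MetaRefreshFilter(HTMLParser):
    """Re-emit a document, skipping <meta http-equiv="refresh"> tags."""

    def __init__(self):
        # Keep character references as-is so they can be re-emitted unchanged.
        super().__init__(convert_charrefs=False)
        self.out = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "meta" and (attr_map.get("http-equiv") or "").lower() == "refresh":
            return  # drop the refresh tag entirely
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags: emit the original text once, with no extra end tag.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def drop_meta_refresh(html: str) -> str:
    parser = MetaRefreshFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Sample input, invented for this sketch.
doc = (
    '<head><meta http-equiv="Refresh" content="5; url=http://example.com/"></head>'
    "<body><p>hi &amp; bye</p><br/></body>"
)
print(drop_meta_refresh(doc))
```

Because the decision is made on the parsed tag name and attributes, quoting style, attribute order, and letter case do not matter, which is the main weakness of a purely textual regular expression.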

Bugzilla


Categories

(Core :: DOM: HTML Parser, enhancement)


People

(Reporter: m.wawoczny, Assigned: harishd)


Security

(public)
