Closed Bug 86504 Opened 24 years ago Closed 21 years ago

[Feature Request] HTML Filter with Regular Expressions support

Categories

(Core :: DOM: HTML Parser, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED WONTFIX
Future

People

(Reporter: m.wawoczny, Assigned: harishd)

References

Details

Hello! I would like to see an HTML filter (plug-in) with regular expression support. With it I could delete unwanted HTML code before the page is displayed.
How it should work: user URL request -> browser gets document -> HTML filter -> browser starts to parse the HTML page.
For example: some page has a refresh meta tag, and I want to get rid of this tag. Regular expression to match the meta tag:
*http-equiv=("|)refresh*content=("|) [#2:10000](*url=("|)([^"' ]+)\1|)*
If matched, it should be replaced by:
<center><font size=1><a href=\1 >[Refresh]</a></font></center>
This will show a [Refresh] link, so I can decide when, or if, the page should refresh by clicking this link. This HTML filter could be used to block JavaScript, pop-ups and images, to prevent user tracking (web bugs, deleting referrer links), to fix the most common bugs in HTML code, etc.
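For illustration only, the meta-refresh rewrite described in the report could be approximated with Python's `re` module. This is a sketch under assumptions: the pattern is a hand-written approximation and not a translation of the expression in the report, and the names `META_REFRESH` and `replace_meta_refresh` are hypothetical.

```python
import html
import re

# Hypothetical pattern: matches <meta http-equiv="refresh" content="N; url=TARGET">
# with optional quotes. It assumes http-equiv appears before content.
META_REFRESH = re.compile(
    r"""<meta\b[^>]*?http-equiv\s*=\s*["']?refresh["']?[^>]*?"""
    r"""content\s*=\s*["']?\s*\d+\s*(?:;\s*url\s*=\s*["']?([^"'\s>]+))?[^>]*>""",
    re.IGNORECASE,
)


def replace_meta_refresh(page: str) -> str:
    """Replace each refresh meta tag with a manual [Refresh] link."""

    def to_link(match: re.Match) -> str:
        target = match.group(1)
        if target is None:
            # Plain timed reload without a target URL: drop the tag.
            return ""
        href = html.escape(target, quote=True)
        return f'<center><font size=1><a href="{href}">[Refresh]</a></font></center>'

    return META_REFRESH.sub(to_link, page)


if __name__ == "__main__":
    sample = '<meta http-equiv="refresh" content="5; url=http://example.com/next">'
    print(replace_meta_refresh(sample))
```

The replacement is built in a callback rather than with a `\1` template so that the captured URL can be escaped before it is written into the `href` attribute.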
How about just the ability to pipe HTML pages through an external program before rendering. This should be far easier to code into Mozilla and be far more flexible than built in Mozilla substitutions.
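As a rough sketch of the pipe idea, assuming the page is available as a string before rendering, the external program could be fed the HTML on stdin and its stdout used in place of the original. The function name `filter_through_program` and the fall-back-to-unfiltered behaviour are assumptions made for this example, not anything Mozilla provides.

```python
import subprocess
import sys


def filter_through_program(page: str, command: list[str], timeout: float = 10.0) -> str:
    """Pipe the page through an external command; return it unchanged on failure."""
    try:
        result = subprocess.run(
            command,
            input=page,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,  # a non-zero exit status counts as a failed filter
        )
    except (OSError, subprocess.SubprocessError):
        # Missing program, timeout, or error exit: show the unfiltered page.
        return page
    return result.stdout


if __name__ == "__main__":
    # Stand-in for a real external filter: a one-line script that strips <blink>.
    strip_blink = [
        sys.executable,
        "-c",
        "import sys; sys.stdout.write(sys.stdin.read()"
        ".replace('<blink>', '').replace('</blink>', ''))",
    ]
    print(filter_through_program("<p><blink>hi</blink></p>", strip_blink))
```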
I think it is better to build it into Mozilla, because not everyone wants to use an external program, and not every external program is freeware. Right now I'm using such a program, Proxomitron, to filter HTML via a local proxy. So if you are thinking about piping HTML to an external app, you can already do that with any browser without modifications: almost all HTML filters use the browser's proxy settings. If it were built into Mozilla, you could turn it on or off with a single click, for example on the browser's navigation bar, and configuration would be easy too. No more clicking through two or more applications to turn on some feature. I'm also thinking about an easy way to add new routines. For example, you load some page and it isn't filtered as you wish, so you right-click the element you don't want to see next time and choose "Add to unwanted stuff" from the menu. Easy. Maybe it would be even easier for non-professional or new users. And don't forget about professional users: they can set up regular expression patterns to suit their demands. But this is only a wish; the most important thing is to make Mozilla fast and the best browser in the world :)))
->parser?
Assignee: asa → harishd
Component: Browser-General → Parser
QA Contact: doronr → bsharma
You might try Protozilla. I really can't imagine someone implementing this as part of the trunk.
confirming enhancement request
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Windows 2000 → All
Hardware: PC → All
This bug has been marked "future" because the original Netscape engineer working on this is over-burdened. If you feel this is an error, that you or another known resource will be working on this bug, or if it blocks your work in some way -- please attach your concern to the bug for reconsideration.
Target Milestone: --- → Future
QA Contact: bsharma → moied
I think that regex support is not needed. Instead, it should be easy to maintain a list of unwanted URLs (just like "Block Images from Server", but also for JavaScript pop-ups, HTML pages and banners).
>I think, that a regex support is not needed. Instead it should be easy to maintain
>a list of unwanted URLs (just as the "Block Images from Server", but also for
>javascript Pop-ups, HTML-pages and Banners).
And what about the example that I provided in the bug report? The block feature is only a partial solution; imagine what you can do with regexp... You can change any string (for example meta data, tags, fragments of code) with regexp.
Changing the HTML source of a page could be a copyright violation, but "not loading" parts of a Web page (banners, pop-ups, Flash, ...) doesn't violate any copyrights. Besides, it's much easier to use for those who don't want to learn a new regex syntax.
Hmmmm... So you are trying to say that all HTML filter applications and proxy servers (some of which add stuff to viewed pages) are copyright violations? I don't think so: you use them for viewing, and you are not distributing modified pages in any way. By the way, why should a newbie have to use regexp? I'm thinking about predefined regexp rules, plus the ability to add new ones (for advanced users).
*** Bug 152017 has been marked as a duplicate of this bug. ***
*** Bug 159818 has been marked as a duplicate of this bug. ***
If you want to control whether a single page loads, just use auto-proxy configuration. You supply a javascript function that is given a URL and returns the proxy to use for that URL. You can have the function use whatever logic it likes to decide, for example, whether to use a proxy or get the URL natively. This can be used with a fake proxy to block ads (essentially integrating JunkBuster into the browser). As for modifying the HTML before sending it to the parser, it would be nice to have a similar model: You can supply a function that takes a URL as a parameter, and it tells you what, if any, function to use to preprocess the HTML. This should be done in as generic a manner as possible so as to maximize flexibility.
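The proposed model, a function that takes a URL and returns the preprocessing function to use (if any), can be sketched as follows. This is only an illustration of the dispatch idea: the names `find_preprocessor_for_url`, `strip_blink` and `preprocess`, and the example hosts, are hypothetical and not part of any Mozilla API.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit

# A preprocessor takes the raw HTML and returns the HTML to hand to the parser.
Preprocessor = Callable[[str], str]


def strip_blink(page: str) -> str:
    """Example preprocessor: remove <blink> tags but keep their content."""
    return page.replace("<blink>", "").replace("</blink>", "")


def drop_everything(page: str) -> str:
    """Example preprocessor: blank the document (e.g. for an ad server)."""
    return ""


def find_preprocessor_for_url(url: str) -> Optional[Preprocessor]:
    """User-supplied hook: pick a preprocessor for this URL, or None for no filtering."""
    host = urlsplit(url).hostname or ""
    if host == "ads.example.com" or host.endswith(".ads.example.com"):
        return drop_everything
    if host == "example.org" or host.endswith(".example.org"):
        return strip_blink
    return None


def preprocess(url: str, page: str) -> str:
    """What the browser side would do before parsing, under this sketch."""
    chosen = find_preprocessor_for_url(url)
    return page if chosen is None else chosen(page)
```

Keeping the URL-to-function decision separate from the filters themselves mirrors the auto-proxy design: the hook stays small, and any logic at all can sit behind the returned function.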
I think that the filter functions of Mozilla (spam e-mail, popup, cookies) need a consistent, regex-enabled UI. "Allow only the already existing cookies" or "Disallow any cookies except for new ones from domain xy" are simple rules which are not possible yet. All filter modules should use the same database, for example the core code of an open-source database like MySQL. This way we can stop the bloat. Have a look at the files in the profile folder and at the execution time of various events - people are already beginning to laugh!
cookies, popup windows, e-mail spam, passwords and other form auto-completion, and now the image manager. text files with *.s, *.tbl and *.w.
>All filter modules should use the same data base, for example the core code of an open-source data base like mySQL.
Maybe I am wrong, but I do not like featuritis.
This is not going to happen in the parser. If someone wants to write an extension to do this, please feel free to file a bug on the hooks you think you need (they would be hooks into content dispatch and necko, probably, not the parser).
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → WONTFIX
Doing this with regexp would be silly anyway, IMHO. We have a DOM...
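To make the DOM point concrete, here is a minimal sketch that removes refresh meta tags by walking a parsed tree instead of matching text. It uses Python's `xml.dom.minidom`, which assumes well-formed XHTML input (a browser's HTML DOM is far more forgiving); the function name `remove_meta_refresh` is hypothetical.

```python
from xml.dom import minidom


def remove_meta_refresh(xhtml: str) -> str:
    """Remove <meta http-equiv="refresh"> elements from a well-formed XHTML string."""
    document = minidom.parseString(xhtml)
    # Copy the node list first, since nodes are removed while looping.
    for meta in list(document.getElementsByTagName("meta")):
        if meta.getAttribute("http-equiv").lower() == "refresh":
            meta.parentNode.removeChild(meta)
    return document.documentElement.toxml()


if __name__ == "__main__":
    page = (
        "<html><head>"
        '<meta http-equiv="refresh" content="5; url=http://example.com/next"/>'
        "</head><body><p>hi</p></body></html>"
    )
    print(remove_meta_refresh(page))
```

Because the check runs on a parsed attribute value, quoting style and attribute order in the source no longer matter, which is the main weakness of the text-matching approach.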