Last Comment Bug 356355 - numeric domain normalization only happens on enchash table values
: numeric domain normalization only happens on enchash table values
Status: VERIFIED FIXED
: fixed1.8.1.1
Product: Toolkit
Classification: Components
Component: Safe Browsing (show other bugs)
: Trunk
: All All
: -- normal (vote)
: ---
Assigned To: Tony Chang (Google)
:
Mentors:
: 358966 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2006-10-11 16:59 PDT by Tony Chang (Google)
Modified: 2014-05-27 12:25 PDT (History)
9 users (show)
dveditz: blocking1.8.1.1+
gavin.sharp: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
v1: canonicalize urls before lookup in -url and -domain tables (13.87 KB, patch)
2006-10-11 17:05 PDT, Tony Chang (Google)
mmc.bugzilla: review+
Details | Diff | Splinter Review
patch for ff2 branch (13.94 KB, patch)
2006-10-31 15:38 PST, Tony Chang (Google)
mmc.bugzilla: review+
Details | Diff | Splinter Review
v2: additional fixes (6.93 KB, patch)
2006-11-07 17:56 PST, Tony Chang (Google)
mmc.bugzilla: review+
Details | Diff | Splinter Review
v3: clean up line > 80 char (6.95 KB, patch)
2006-11-14 14:12 PST, Tony Chang (Google)
no flags Details | Diff | Splinter Review
v2: patch for ff2 branch (16.03 KB, patch)
2006-11-14 15:32 PST, Tony Chang (Google)
mmc.bugzilla: review+
jaymoz: approval1.8.1.1-
Details | Diff | Splinter Review
v4: check both old and new canonical form (3.06 KB, patch)
2006-11-21 17:58 PST, Tony Chang (Google)
mmc.bugzilla: review+
Details | Diff | Splinter Review
v3: cumulative patch for branch (17.56 KB, patch)
2006-11-22 15:13 PST, Tony Chang (Google)
mmc.bugzilla: review+
dveditz: approval1.8.1.1+
Details | Diff | Splinter Review
unittests (31.68 KB, patch)
2006-12-05 15:33 PST, Tony Chang (Google)
no flags Details | Diff | Splinter Review

Comment 1 Tony Chang (Google) 2006-10-11 17:05:09 PDT
Created attachment 242005 [details] [diff] [review]
v1: canonicalize urls before lookup in -url and -domain tables

To test, make sure http://24.172.137.213/http.www.paypal.com/cgi-bin/webscrcmd=_login-run/                        is in urlclassifier2.sqlite and try loading the variations mentioned above.
Comment 2 [:mmc] Monica Chew (no longer reading bugmail) 2006-10-12 14:15:19 PDT
I know I already looked at this, but I just realized that the canonicalization only applies to enchash tables.  Don't we want it for non-enchash tables as well?
Comment 3 [:mmc] Monica Chew (no longer reading bugmail) 2006-10-12 14:17:47 PDT
Comment on attachment 242005 [details] [diff] [review]
v1: canonicalize urls before lookup in -url and -domain tables

Nevermind, I see in trtable.js the enchash canonicalization is called for all tables.  Sorry for the confusion.
Comment 4 Tony Chang (Google) 2006-10-12 16:58:11 PDT
Checking in enchash-decrypter.js;
/cvsroot/mozilla/toolkit/components/url-classifier/content/enchash-decrypter.js,v  <--  enchash-decrypter.js
new revision: 1.5; previous revision: 1.4
done
Checking in trtable.js;
/cvsroot/mozilla/toolkit/components/url-classifier/content/trtable.js,v  <--  trtable.js
new revision: 1.7; previous revision: 1.6
done
Checking in url-canonicalizer.js;
/cvsroot/mozilla/toolkit/components/url-classifier/content/url-canonicalizer.js,v  <--  url-canonicalizer.js
new revision: 1.4; previous revision: 1.3

On trunk.
Comment 5 Mike Beltzner [:beltzner, not reading bugmail] 2006-10-31 12:39:43 PST
Tony, can we also get a branch patch worked up for 1.8.1.1?
Comment 6 Tony Chang (Google) 2006-10-31 15:38:34 PST
Created attachment 244261 [details] [diff] [review]
patch for ff2 branch

Same as the other patch except one line was different in enchash-decrypter.js on branch/trunk.  After this patch is applied, the two files will be the same.
Comment 7 [:mmc] Monica Chew (no longer reading bugmail) 2006-10-31 15:52:02 PST
Comment on attachment 242005 [details] [diff] [review]
v1: canonicalize urls before lookup in -url and -domain tables

Looks fine.
Comment 8 Tony Chang (Google) 2006-10-31 16:59:04 PST
*** Bug 358966 has been marked as a duplicate of this bug. ***
Comment 9 [:mmc] Monica Chew (no longer reading bugmail) 2006-10-31 17:12:41 PST
Comment on attachment 244261 [details] [diff] [review]
patch for ff2 branch

Looks fine.
Comment 10 Tony Chang (Google) 2006-11-07 17:56:41 PST
Created attachment 244971 [details] [diff] [review]
v2: additional fixes

Fixes 2 additional issues:
- a regression caused by the v1 patch where canonicalizing urls results in only using the first 5 dots of the hostname
- a bug in enchash lookups where we were comparing the non-normalized url against the regular expressions instead of the normalized url

I'll request a review from Monica once she's back from vacation (Nov 13).
Comment 11 Tony Chang (Google) 2006-11-14 11:17:21 PST
Comment on attachment 244971 [details] [diff] [review]
v2: additional fixes

The only difference from toolbar code is that some unittests are missing (trtable.js doesn't have a unittest block).

I'll try to add the unittests back in in a different patch in a mozilla friendly way.
Comment 12 [:mmc] Monica Chew (no longer reading bugmail) 2006-11-14 11:26:58 PST
Comment on attachment 244971 [details] [diff] [review]
v2: additional fixes

Looks fine.
>===================================================================
>--- toolkit/components/url-classifier/content/enchash-decrypter.js	old
>+++ toolkit/components/url-classifier/content/enchash-decrypter.js	new
>+    testing["http://www.barclays.co.uk.brccontrol.assruspede.org.bz/detailsconfirm"] = "co.uk.brccontrol.assruspede.org.bz";
over 80 chars (I think)
Comment 13 Tony Chang (Google) 2006-11-14 14:12:50 PST
Created attachment 245608 [details] [diff] [review]
v3: clean up line > 80 char

Same as v2 with line wrapping.
Comment 14 Tony Chang (Google) 2006-11-14 14:13:16 PST
v3: on trunk

Checking in enchash-decrypter.js;
/cvsroot/mozilla/toolkit/components/url-classifier/content/enchash-decrypter.js,v  <--  enchash-decrypter.js
new revision: 1.6; previous revision: 1.5
done
Checking in trtable.js;
/cvsroot/mozilla/toolkit/components/url-classifier/content/trtable.js,v  <--  trtable.js
new revision: 1.8; previous revision: 1.7
done
Comment 15 Tony Chang (Google) 2006-11-14 15:32:23 PST
Created attachment 245614 [details] [diff] [review]
v2: patch for ff2 branch

Combine the two patches (v1 and v3) into a single patch for the ff 2 branch.
Comment 16 [:mmc] Monica Chew (no longer reading bugmail) 2006-11-14 16:14:13 PST
Comment on attachment 245614 [details] [diff] [review]
v2: patch for ff2 branch

Looks fine.
Comment 17 Daniel Veditz [:dveditz] 2006-11-20 12:19:59 PST
Does this patch also normalize the URLs coming from the server? In the urlclassifier database there are phishing urls of every conceivable obfuscation, I'd hate to add this check and start missing all the non-normalized ones.

qawanted: Need to verify that on trunk before we take this patch on the branch, because that result would be worse than the original problem if not handled.
Comment 18 Tony Chang (Google) 2006-11-20 13:33:20 PST
(In reply to comment #17)
> Does this patch also normalize the URLs coming from the server? In the
> urlclassifier database there are phishing urls of every conceivable
> obfuscation, I'd hate to add this check and start missing all the
> non-normalized ones.

This doesn't normalize the URLs from the server, but we're going to be normalizing all of them real soon now (we're still testing to make sure we're not losing anything in the process).

If we wanted to normalize URLs from the server, it would be more complicated since the update happens on a background thread but nsIIOService (to make the nsIURI objects) needs to run on the main thread.
Comment 19 Daniel Veditz [:dveditz] 2006-11-21 13:28:35 PST
Define "Real Soon Now"... this sounds like the kind of thing where client and server need to update around the same time, or we're leaving some big gaps. Or the server could start sending both forms for non-normalized ones for a while... better to have a temporarily bloated list than holes in coverage. Especially since people won't upgrade to the fixed version instantaneously. The bulk update the first few days, but there's a tail of people who apparently have update problems or insufficient rights.

We're hoping to ship 2.0.0.1 mid December. Will the server be changed over by then?
Comment 20 Tony Chang (Google) 2006-11-21 13:33:09 PST
(In reply to comment #19)
> Define "Real Soon Now"... this sounds like the kind of thing where client and
> server need to update around the same time, or we're leaving some big gaps. Or
> the server could start sending both forms for non-normalized ones for a
> while... better to have a temporarily bloated list than holes in coverage.
> Especially since people won't upgrade to the fixed version instantaneously. The
> bulk update the first few days, but there's a tail of people who apparently
> have update problems or insufficient rights.
> 
> We're hoping to ship 2.0.0.1 mid December. Will the server be changed over by
> then?

After some more discussion, it seems to make more sense for the client to try both (check the table for the normalized and the non-normalized url).  That way we don't need to try to sync the client with the server.  The servers will make the switch some time after 2.0.0.1 ships.

I'll have an updated patch for this later today.
Comment 21 Jay Patel [:jay] 2006-11-21 15:27:39 PST
Comment on attachment 245614 [details] [diff] [review]
v2: patch for ff2 branch

Will wait for Tony's new patch, please request approval for that one.
Comment 22 Tony Chang (Google) 2006-11-21 17:58:42 PST
Created attachment 246235 [details] [diff] [review]
v4: check both old and new canonical form

Additional patch for trunk:
Check both the URL both using the original canonicalized form and the new form in local database lookups.
Comment 23 Tony Chang (Google) 2006-11-22 15:13:03 PST
Created attachment 246351 [details] [diff] [review]
v3: cumulative patch for branch
Comment 24 [:mmc] Monica Chew (no longer reading bugmail) 2006-11-22 15:26:55 PST
Comment on attachment 246351 [details] [diff] [review]
v3: cumulative patch for branch

Looks fine.
Comment 25 [:mmc] Monica Chew (no longer reading bugmail) 2006-11-22 16:30:46 PST
Comment on attachment 246235 [details] [diff] [review]
v4: check both old and new canonical form

LG
Comment 26 Tony Chang (Google) 2006-11-22 16:37:49 PST
patch v4: check both old and new canonical form
on trunk

Checking in trtable.js;
/cvsroot/mozilla/toolkit/components/url-classifier/content/trtable.js,v  <--  trtable.js
new revision: 1.9; previous revision: 1.8
Comment 27 Daniel Veditz [:dveditz] 2006-11-27 16:55:14 PST
Comment on attachment 246351 [details] [diff] [review]
v3: cumulative patch for branch

approved for 1.8 branch, a=dveditz for drivers
Comment 28 Tony Chang (Google) 2006-11-27 17:24:34 PST
on branch
Comment 29 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2006-11-28 08:11:42 PST
This patch landed in the toolkit without a unit test. Tree rules for the toolkit now require unit tests, please prepare a test for this bug.
Comment 30 Tony Chang (Google) 2006-11-28 19:49:10 PST
(In reply to comment #29)
> This patch landed in the toolkit without a unit test. Tree rules for the
> toolkit now require unit tests, please prepare a test for this bug.

How do I go about testing chrome JS?  There are a bunch of tests[1], they're just not automated.

https://bugzilla.mozilla.org/attachment.cgi?id=246351&action=diff#toolkit/components/url-classifier/content/enchash-decrypter.js_sec3
Comment 31 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2006-11-29 06:29:05 PST
I'm pretty sure we could use mochitest for this. http://developer.mozilla.org/en/docs/Mochitest_FAQ and ask sayrer ;-)
Comment 32 Chris Cooper [:coop] 2006-12-04 17:02:59 PST
Trying to verify this fix, but the browser doesn't seem to want to load any hex, octal, or decimal IP addresses.

For ref: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.1pre) Gecko/20061201 BonEcho/2.0.0.1pre

The listed examples long since fell off the net, so I went to phishtank.com and grabbed a new dotted quad phishing address for testing. I ended up using:

http://210.21.119.127/sbcbconfirm/

...which the anti-phishing correctly catches. However, when I try to load both:

http://0xd2.0x15.0x77.0x7f/sbcbconfirm/
http://0322.025.0167.0177/sbcbconfirm/

...the anti-phishing _does_ pop up the web forgery warning, but the pages themselves do not actually load.

http://3524622207/sbcbconfirm/

...doesn't even load, redirecting instead to http://www.3524622207.com/sbcbconfirm/

Are these numeric domains supposed to load?
Comment 33 Tony Chang (Google) 2006-12-04 18:13:58 PST
(In reply to comment #32)
> However, when I try to load both:
> 
> http://0xd2.0x15.0x77.0x7f/sbcbconfirm/
> http://0322.025.0167.0177/sbcbconfirm/
> 
> ...the anti-phishing _does_ pop up the web forgery warning, but the pages
> themselves do not actually load.
> 
> http://3524622207/sbcbconfirm/
> 
> ...doesn't even load, redirecting instead to
> http://www.3524622207.com/sbcbconfirm/
> 
> Are these numeric domains supposed to load?

Hmm, it looks like hex/dec domains only load on Windows.  Not sure why they get renamed in linux/OSX.
Comment 34 Tony Chang (Google) 2006-12-05 15:33:21 PST
Created attachment 247594 [details] [diff] [review]
unittests

Patch to move unittests to mochitest framework.
Comment 35 Tony Chang (Google) 2006-12-05 15:36:40 PST
Tests on trunk.

Checking in testing/mochitest/tests/index.html;
/cvsroot/mozilla/testing/mochitest/tests/index.html,v  <--  index.html
new revision: 1.47; previous revision: 1.46
done
RCS file: /cvsroot/mozilla/testing/mochitest/tests/test_bug356355.xhtml,v
done
Checking in testing/mochitest/tests/test_bug356355.xhtml;
/cvsroot/mozilla/testing/mochitest/tests/test_bug356355.xhtml,v  <--  test_bug356355.xhtml
initial revision: 1.1
done
Checking in toolkit/components/url-classifier/content/enchash-decrypter.js;
/cvsroot/mozilla/toolkit/components/url-classifier/content/enchash-decrypter.js,v  <--  enchash-decrypter.js
new revision: 1.7; previous revision: 1.6
done
Checking in toolkit/components/url-classifier/content/trtable.js;
/cvsroot/mozilla/toolkit/components/url-classifier/content/trtable.js,v  <--  trtable.js
new revision: 1.12; previous revision: 1.11
done
Checking in toolkit/components/url-classifier/content/url-canonicalizer.js;
/cvsroot/mozilla/toolkit/components/url-classifier/content/url-canonicalizer.js,v  <--  url-canonicalizer.js
new revision: 1.5; previous revision: 1.4
done
Comment 36 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2009-07-27 13:20:34 PDT
I'm guessing QAWANTED is no longer needed on this bug?
Comment 37 Aakash Desai [:aakashd] 2009-07-28 08:34:41 PDT
It's not needed anymore as the qawanted flag was set back by comment #17 before the patch landed on branch. Also, I'm verifying this due to the number of unit tests in mochitest that have been there since 2006 per comment #35.

Note You need to log in before you can comment on or make changes to this bug.