Closed
Bug 1169366
Opened 9 years ago
Closed 9 years ago
Update data file with new allowed frecent sites list
Categories
(Firefox :: New Tab Page, defect)
Firefox
New Tab Page
Tracking
()
People
(Reporter: Mardak, Assigned: emtwo)
References
Details
(Whiteboard: .001)
Attachments
(7 files, 7 obsolete files)
78.39 KB,
patch
|
Mardak
:
review+
|
Details | Diff | Splinter Review |
51.71 KB,
application/json
|
Details | |
118.03 KB,
application/json
|
Details | |
29.79 KB,
image/png
|
Details | |
29.47 KB,
image/png
|
Details | |
75.09 KB,
patch
|
Sylvestre
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
4.37 KB,
patch
|
Sylvestre
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
We have ~60 new frecent sites lists to add for suggested tiles launch bug 1140185.
We'll also tweak some existing categories, e.g., adding techcrunch to technology.
This is currently waiting on finalization of the list and measuring the privacy aspects of each grouping.
Updated•9 years ago
|
Iteration: --- → 41.2 - Jun 8
Comment 1•9 years ago
|
||
MaxP values can be found here: https://docs.google.com/spreadsheets/d/1aM7FZwA3ounZVybS4xfhmTbKoHTRx_BiePVTLUFMnD4/edit#gid=0
Comment 2•9 years ago
|
||
Attachment #8615539 -
Attachment is obsolete: true
Reporter | ||
Comment 3•9 years ago
|
||
Do we have a user facing name that can go with each of these? Firefox will fall back to showing the hardcoded name if the server doesn't provide an explanation text, e.g.,
Suggested for Business/ Finance_Investing News visitors
Comment 4•9 years ago
|
||
If we're using "A/n [category] site suggestion" the max character is 24 if we want to keep it at one line. One line looks better, but we should account for two for localization.
If we're using "Suggested for [category] visitors" the max character for one line is 22.
Jterry/Ashonnard, can you come up with user facing category names? We should only 1 level names. i.e. not the full "Business/ Finance_Investing News" but "Investing News"
Flags: needinfo?(jterry)
Flags: needinfo?(ashonnard)
Reporter | ||
Comment 5•9 years ago
|
||
Where are you getting those numbers from? The font isn't fixed width, so "mmmmm" might not fit but "iiiii" could fit even though they have the same number of characters.
"Suggested for (x) visitors" is already 22 characters without adding in the category!!
What is the max length we should be working for? This will help us cut down the category names.
Flags: needinfo?(jterry)
Flags: needinfo?(edilee)
Flags: needinfo?(ashonnard)
Reporter | ||
Comment 7•9 years ago
|
||
I believe kghim was referring to the category name itself being 22 characters as he's already accounted for the other text on the line.
Although he retested with using the widest character (W) and could only fit ~10 and the thinnest character fit ~30.
Flags: needinfo?(edilee)
Comment 8•9 years ago
|
||
(In reply to ashonnard from comment #6)
> "Suggested for (x) visitors" is already 22 characters without adding in the
> category!!
>
> What is the max length we should be working for? This will help us cut down
> the category names.
Ashonnard, the character would just be for the category name. I think we can expand this to 30 characters because of the thin letter (iiii) and wide letter (WWWW) combination.
As mruttley indicated on email, we should have category name that we keep for hierarchical purposes and a display name which will have a 30 character limit.
Comment 9•9 years ago
|
||
All maxP values apart from Mozilla are under 0.7 now, the highest being 0.67.
More data can be found here: https://docs.google.com/spreadsheets/d/1aM7FZwA3ounZVybS4xfhmTbKoHTRx_BiePVTLUFMnD4/edit#gid=0
I will attach the new cats.json file. If you'd like to work on the provisional names for the adgroups, I can create a new feature in Bucketerer?
Comment 10•9 years ago
|
||
Updated•9 years ago
|
Attachment #8615540 -
Attachment is obsolete: true
Comment 11•9 years ago
|
||
(In reply to Kevin Ghim from comment #8)
> (In reply to ashonnard from comment #6)
> > "Suggested for (x) visitors" is already 22 characters without adding in the
> > category!!
> >
> > What is the max length we should be working for? This will help us cut down
> > the category names.
>
> Ashonnard, the character would just be for the category name. I think we can
> expand this to 30 characters because of the thin letter (iiii) and wide
> letter (WWWW) combination.
>
> As mruttley indicated on email, we should have category name that we keep
> for hierarchical purposes and a display name which will have a 30 character
> limit.
kevin, going out on a limb here...
Do we have the flexibility to end it with any type of collective noun? For example, "Suggested for literature fans" or "Suggested for finance interests". Ending each bucket with visitors is a little hard for top lining buckets, such as "Suggested for literature visitors" or "Suggested for finance visitors"
Hoping we can use other collective nouns to make it sound more broad and less robotic.
Obviously this phrase was discussed for legal so let me know if i am far off with this request.
-a
Flags: needinfo?(kghim)
Comment 12•9 years ago
|
||
I could create a feature which would let Bucketerer users edit:
- The official category taxonomy name e.g. Arts & Entertainment / Literature
- The short name: Literature
- The phrase: Literature fans
Comment 13•9 years ago
|
||
(In reply to ashonnard from comment #11)
> (In reply to Kevin Ghim from comment #8)
> > (In reply to ashonnard from comment #6)
> > > "Suggested for (x) visitors" is already 22 characters without adding in the
> > > category!!
> > >
> > > What is the max length we should be working for? This will help us cut down
> > > the category names.
> >
> > Ashonnard, the character would just be for the category name. I think we can
> > expand this to 30 characters because of the thin letter (iiii) and wide
> > letter (WWWW) combination.
> >
> > As mruttley indicated on email, we should have category name that we keep
> > for hierarchical purposes and a display name which will have a 30 character
> > limit.
>
> kevin, going out on a limb here...
> Do we have the flexibility to end it with any type of collective noun? For
> example, "Suggested for literature fans" or "Suggested for finance
> interests". Ending each bucket with visitors is a little hard for top lining
> buckets, such as "Suggested for literature visitors" or "Suggested for
> finance visitors"
>
> Hoping we can use other collective nouns to make it sound more broad and
> less robotic.
> Obviously this phrase was discussed for legal so let me know if i am far off
> with this request.
> -a
Annelise, good suggestion, but until the end of June, we'll have to stick with "Suggested for [category] visitors" because that explanation text is hardcoded on the client. Once 40 goes into beta, we can send in the explanation text which will be "A/n [category] site suggestion" Does that have a better flow?
Mruttley, that would be great.
Flags: needinfo?(kghim)
Reporter | ||
Comment 14•9 years ago
|
||
We can already send the custom explanation from the server. It's just that only 40 users will see the explanation. So we should use the custom explanation now if we have it.
Comment 15•9 years ago
|
||
(In reply to Kevin Ghim from comment #13)
> (In reply to ashonnard from comment #11)
> > (In reply to Kevin Ghim from comment #8)
> > > (In reply to ashonnard from comment #6)
> > > > "Suggested for (x) visitors" is already 22 characters without adding in the
> > > > category!!
> > > >
> > > > What is the max length we should be working for? This will help us cut down
> > > > the category names.
> > >
> > > Ashonnard, the character would just be for the category name. I think we can
> > > expand this to 30 characters because of the thin letter (iiii) and wide
> > > letter (WWWW) combination.
> > >
> > > As mruttley indicated on email, we should have category name that we keep
> > > for hierarchical purposes and a display name which will have a 30 character
> > > limit.
> >
> > kevin, going out on a limb here...
> > Do we have the flexibility to end it with any type of collective noun? For
> > example, "Suggested for literature fans" or "Suggested for finance
> > interests". Ending each bucket with visitors is a little hard for top lining
> > buckets, such as "Suggested for literature visitors" or "Suggested for
> > finance visitors"
> >
> > Hoping we can use other collective nouns to make it sound more broad and
> > less robotic.
> > Obviously this phrase was discussed for legal so let me know if i am far off
> > with this request.
> > -a
>
> Annelise, good suggestion, but until the end of June, we'll have to stick
> with "Suggested for [category] visitors" because that explanation text is
> hardcoded on the client. Once 40 goes into beta, we can send in the
> explanation text which will be "A/n [category] site suggestion" Does that
> have a better flow?
>
> Mruttley, that would be great.
Kevin, would love to think about alternative flow. In the meantime, i will update the categories to fit within "Suggested for (category) visitors"
Reporter | ||
Updated•9 years ago
|
Iteration: 41.2 - Jun 8 → 41.3 - Jun 29
Comment 16•9 years ago
|
||
:ashonnard - I've created a new bug to integrate better descriptors here: https://bugzilla.mozilla.org/show_bug.cgi?id=1173569
Comment 17•9 years ago
|
||
All maxp values are OK, see: https://docs.google.com/spreadsheets/d/1aM7FZwA3ounZVybS4xfhmTbKoHTRx_BiePVTLUFMnD4/edit#gid=0
Attachment #8616216 -
Attachment is obsolete: true
Reporter | ||
Comment 18•9 years ago
|
||
untested. unit tests probably need to be updated
Reporter | ||
Comment 19•9 years ago
|
||
Just to save this somewhere..
=" [ '" & JOIN(",", SORT(TRANSPOSE(SPLIT(INDIRECT("adgroups!B" & ROUND(ROW()/2)), "#")))) & "',"
=" '" & REGEXREPLACE(INDIRECT("adgroups!C" & ROUND(ROW()/2)), "(^An? | site suggestion$)", "") & "' ],"
Reporter | ||
Comment 20•9 years ago
|
||
We need a new test and fix code to allow for the names and descriptions with &
XML Parsing Error: not well-formed
Location:
Line Number 1, Column 197:
<div xmlns="http://www.w3.org/1999/xhtml"><div xmlns="http://www.w3.org/1999/xhtml"><span xmlns="http://www.w3.org/1999/xhtml"><div class='newtab-suggested-bounds'> Suggested for <strong> events & tickets </strong> visitors </div>
Tested with https://people.mozilla.org/~elee/suggested.tickets.json
Reporter | ||
Comment 21•9 years ago
|
||
It looks like we'll need to escape 5 characters &<>'" and perhaps at minimal these two: &<
Assignee: nobody → msamuel
Comment 22•9 years ago
|
||
An easier solution is changing "&" to "and", as we still fall below the character count.
Is that possible?
Flags: needinfo?(kghim)
Flags: needinfo?(edilee)
Reporter | ||
Comment 23•9 years ago
|
||
We could just use "and" then we could split off a separate bug to handle the & and < causing xml entity issues. Alternatively we could just hardcode & as & for now.
Flags: needinfo?(edilee)
Comment 24•9 years ago
|
||
(In reply to Ed Lee :Mardak from comment #23)
> We could just use "and" then we could split off a separate bug to handle the
> & and < causing xml entity issues. Alternatively we could just hardcode & as
> & for now.
Whichever you think is easier. "and" seems like a quick fix while "&" is being worked out.
also, i found the copy paste error from yesterday. "charity" set off the alignment, as it wasn't in the previous list. Please confirm the following:
Entertainment_General -----> "An entertainment site suggestion" -----> "entertainment"
Environment_General -----> "An environment site suggestion" -----> "environment"
Flags: needinfo?(edilee)
Reporter | ||
Comment 25•9 years ago
|
||
With these copy/paste issues, it's probably safest to get the data from the source with Bucketerer export. It's unclear if the list of sites changed from one sheet to another or if the list in Bucketerer is different now.
Flags: needinfo?(edilee)
Comment 26•9 years ago
|
||
(In reply to Ed Lee :Mardak from comment #25)
> With these copy/paste issues, it's probably safest to get the data from the
> source with Bucketerer export. It's unclear if the list of sites changed
> from one sheet to another or if the list in Bucketerer is different now.
Updated in Bucketer-er.
http://172.29.2.113:5010/
Do you need a new export?
Flags: needinfo?(edilee)
Assignee | ||
Comment 27•9 years ago
|
||
Attachment #8623346 -
Attachment is obsolete: true
Attachment #8623370 -
Attachment is obsolete: true
Attachment #8623891 -
Flags: review?(edilee)
Reporter | ||
Comment 28•9 years ago
|
||
Comment on attachment 8623891 [details] [diff] [review]
v1: Bucket updates & escaping html characters
>+++ b/browser/base/content/test/newtab/browser_newtab_enhanced.js
>+ suggestedLink.explanation = "Testing junk explanation &<>\"'";
>+ ok(suggested.indexOf("Testing junk explanation &<>\"'") > -1, "Junk test");
Any idea why " and ' don't end up as " and '? I noticed it wasn't entirely necessary to escape the quotes anyway.
>+++ b/browser/modules/DirectoryLinksProvider.jsm
>+ _escapeChars: function(text) {
nit: _escapeChars(text) {
>+ let charMap = {
>+ '&': '&',
>+ '<': '<',
>+ '>': '>',
>+ '"': '"',
>+ "'": '''
>+ };
>+
>+ return text.replace(/[&<>"']/g, function(character) {
>+ return charMap[character];
nit: replace(.., char => charMap[char])
>+ [ 'askmen.com,boredomtherapy.com,buzzfeed.com,complex.com,dailymotion.com,elitedaily.com,gawker.com,howstuffworks.com,instagram.com,madamenoire.com,polygon.com,ranker.com,rollingstone.com,ted.com,theblaze.com,thechive.com,thecrux.com,thedailybeast.com,thoughtcatalog.com,uproxx.com,upworthy.com,zergnet.com',
>+ 'environment' ],
As Annelise noted, there was some copy/paste error from some part of the chain. askmen is not quite about the environment. ;)
'entertainment'
Assuming bucketerer has the source of truth, we should get a direct export. Ideally in the [[sorted_sites_string, short_desc_string], ...] format
Flags: needinfo?(edilee)
Attachment #8623891 -
Flags: review?(edilee) → review+
Reporter | ||
Comment 29•9 years ago
|
||
(In reply to Ed Lee :Mardak from comment #28)
> >+ [ 'askmen.com,boredomtherapy.com,buzzfeed.com,complex.com,dailymotion.com,elitedaily.com,gawker.com,howstuffworks.com,instagram.com,madamenoire.com,polygon.com,ranker.com,rollingstone.com,ted.com,theblaze.com,thechive.com,thecrux.com,thedailybeast.com,thoughtcatalog.com,uproxx.com,upworthy.com,zergnet.com',
> >+ 'environment' ],
> 'entertainment'
To be clear, let's land with the data file we have now except change this askmen.com environment -> entertainment. If there are changes from bucketerer, we'll file a new bug and separately update the data file again.
Assignee | ||
Comment 30•9 years ago
|
||
(In reply to Ed Lee :Mardak from comment #28)
> Comment on attachment 8623891 [details] [diff] [review]
> v1: Bucket updates & escaping html characters
>
> >+++ b/browser/base/content/test/newtab/browser_newtab_enhanced.js
> >+ suggestedLink.explanation = "Testing junk explanation &<>\"'";
> >+ ok(suggested.indexOf("Testing junk explanation &<>\"'") > -1, "Junk test");
> Any idea why " and ' don't end up as " and '? I noticed it wasn't
> entirely necessary to escape the quotes anyway.
I was wondering that myself but I'm not entirely sure. It might have something to do with encodeURIComponent().
Comment 31•9 years ago
|
||
Attachment #8623232 -
Attachment is obsolete: true
Comment 32•9 years ago
|
||
(In reply to Matthew Ruttley [:mruttley] from comment #31)
> Created attachment 8623925 [details]
> JSON dump of buckets, accurate as of 2015-06-17 6.30pm
Please use the most updated JSON (comment 31)
Reporter | ||
Comment 33•9 years ago
|
||
Comment on attachment 8623925 [details]
JSON dump of buckets, accurate as of 2015-06-17 6.30pm
Some issues with this file:
> "A pet site suggestion "
trailing space
> "toywiz.com "
> "name": "Retail_Toys"
trailing space for toywiz
> "An events and tickets site suggestion"
Annelise said this should be & not "and"
> "progressive.com",
> "progressive.com ",
> "name": "Insurance_General"
Duplicate progressive with one having trailing space
> "alibaba.com",
> "alibaba.com ",
> "name": "Retail_General Shopping"
Same issue trailing space on duplicate
> "geeks.com ",
> "outletpc.com ",
> "outpost.com ",
> "name": "Technology_Retail"
Several with extra space
Reporter | ||
Comment 34•9 years ago
|
||
JSON.parse(require("fs").readFileSync("adgroups.json","utf8")).map(function(a){return[a.sites.map(function(v){return v.trim()}).filter(function(v,p,s){return s.indexOf(v)==p}).sort()+"",a.descriptors[0].replace(/^\s*An?\s*|\s*(site )?suggestion\s*$/g, "")]})
Comment 35•9 years ago
|
||
Attachment #8623925 -
Attachment is obsolete: true
Reporter | ||
Comment 36•9 years ago
|
||
Comment on attachment 8623941 [details]
Latest data as of 2015-06-17 7.01pm, no dupes, no trailing spaces
This data file is indeed clean and the conversion script:
JSON.parse(require("fs").readFileSync("adgroups.json","utf8")).map(function(a){return[a.sites+"",a.descriptors[0].replace(/^An? | (site )?suggestion$/g,"")]})
Looks like bucketerer still has Retail_Tickets as "An events and tickets site suggestion" instead of ".. events & tickets .."
Assignee | ||
Comment 37•9 years ago
|
||
Comment 38•9 years ago
|
||
(In reply to Ed Lee :Mardak from comment #36)
> Comment on attachment 8623941 [details]
> Latest data as of 2015-06-17 7.01pm, no dupes, no trailing spaces
>
> This data file is indeed clean and the conversion script:
>
> JSON.parse(require("fs").readFileSync("adgroups.json","utf8")).
> map(function(a){return[a.sites+"",a.descriptors[0].replace(/^An? | (site
> )?suggestion$/g,"")]})
>
> Looks like bucketerer still has Retail_Tickets as "An events and
> tickets site suggestion" instead of ".. events & tickets .."
it should be ".. events & tickets .."
I updated it in Bucket-er.
Please confirm you can update it to ".. events & tickets .." wherever applicable
Comment 39•9 years ago
|
||
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Firefox 41
Updated•9 years ago
|
Flags: needinfo?(kghim)
Reporter | ||
Comment 40•9 years ago
|
||
screenshot after visiting enough sites and ticketmaster.com with about:config browser.newtabpage.directory.source set to
https://people.mozilla.org/~elee/suggested.tickets.json
Reporter | ||
Comment 41•9 years ago
|
||
Same as previous attachment except source as
https://people.mozilla.org/~elee/suggested.tickets.explain.json
which has an extra
"explanation": "An %1$S site suggestion",
Assignee | ||
Comment 42•9 years ago
|
||
Comment on attachment 8623891 [details] [diff] [review]
v1: Bucket updates & escaping html characters
Approval Request Comment
[Feature/regressing bug #]: Feature: add more suggested tiles targeting buckets
[User impact if declined]: Some suggestions may be targeted inaccurately or have missing potential targets
[Describe test coverage new/current, TreeHerder]: Tested manually on Nightly and locally on Aurora. Mochitests adjusted/added where appropriate.
[Risks and why]: Low risk - just adding more suggestion buckets and a function to escape characters in their labels.
[String/UUID change made/needed]: N/A
Attachment #8623891 -
Flags: approval-mozilla-aurora?
Comment 43•9 years ago
|
||
Could you split the change into two ? the function and the updated list of suggestion
or is that too much work?
Assignee | ||
Comment 44•9 years ago
|
||
Sure. Does that help with approval?
Assignee | ||
Comment 45•9 years ago
|
||
See comment 42
Attachment #8624744 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 46•9 years ago
|
||
See comment 42
Attachment #8624747 -
Flags: approval-mozilla-aurora?
Assignee | ||
Updated•9 years ago
|
Attachment #8623891 -
Flags: approval-mozilla-aurora?
Updated•9 years ago
|
Attachment #8624747 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 47•9 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/74a45b6b2654
https://hg.mozilla.org/releases/mozilla-aurora/rev/1a8c7c8716c4
status-firefox40:
--- → fixed
Comment 48•9 years ago
|
||
Comment on attachment 8624744 [details] [diff] [review]
[aurora] Part 1: Bucket updates
I forgot to update it.
Attachment #8624744 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in
before you can comment on or make changes to this bug.
Description
•