Closed Bug 495491 Opened 15 years ago Closed 15 years ago

Phase 1 of the editor communication spec

Categories

(addons.mozilla.org Graveyard :: Admin/Editor Tools, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: clouserw, Assigned: smccammon)

References

Details

(Whiteboard: [patch])

Attachments

(4 files, 3 obsolete files)

Implement Phase 1 of the internal editor communication spec:

http://docs.google.com/Doc?docid=dcm3h3vx_0cwsxmzc5&hl=en
The spec seems pretty clear to me.

If we're looking to trim phase I down, the only thing I could see would be cutting thread subscription and email notification. However, those items would not be complex to implement for the value that they provide.
Thanks Scott. I agree that those two features would be awesome. If we have the time to put it in that would be ideal but I know you guys are slammed. I'll let you make the decision and support you either way on my end.
Assignee: nobody → smccammon
This is public version of the Google doc: http://docs.google.com/View?id=dcm3h3vx_0cwsxmzc5
Attached image proposed schema (obsolete) —
If we were running Cake1.2, I would go with a schema that could be used with core Tree behavior (http://book.cakephp.org/view/91/Tree). However, a simple parent/child comment structure should suffice given the relatively small tree sizes expected.

The proposed ReviewThreads table/model doesn't contain much data, but I see it containing most of the logic related to querying and building comment trees.
Attached image proposed schema, v2
Less normalization, more timestamps.
Attachment #383787 - Attachment is obsolete: true
Comment on attachment 383956 [details]
proposed schema, v2

- I think a unique key on user_id and comment_id  could replace id in reviewthread_subscriptions
- The tables should conform to cake naming standards and our own weirdness.  Something like "users_versioncomments" and "versioncomments" (the second would look better as versions_comments but that would look like a join table...sigh...)
- It's just a typo but created/modified should be datetime everywhere.
Rey, what type of toolbar did you have in mind for the comment form?

A full-blown html wysiwg editor such as TinyMCE is most flexible, but somewhat of an inelegant sledge hammer.

Something simpler based on markdown could be cleaner. WMD (http://wmd-editor.com/demo) is pretty slick though not yet officially open-source.

Other good options?
Wil & chatted and thought it wasn't necessary to go full-tilt WYSIWYG. Perhaps something like this: http://markitup.jaysalvat.com/home/ which is jQuery-based and provides a lot of flexibility. It's also MIT/GPL.

The phase 2 spec shows the markdown options I'm looking for:

http://docs.google.com/Doc?id=dcm3h3vx_9gdn2bwhg&hl=en
Attached patch Vanilla commentsSplinter Review
I'm still working on markdown and markitup integration, but to get things rolling here is a fully-functional patch with plain-text comments.
Attachment #384816 - Flags: review?(clouserw)
The bulk of this patch is from two libraries: PHPMarkdown and Markitup.

The new MarkdownComponent should be especially scrutinized as it is generating html and stripping out non-whitelisted tags and attributes. While we can probably trust the editors, this is generic enough that it could very likely be used on more public pages later.
Attachment #385228 - Flags: review?(clouserw)
Comment on attachment 384816 [details] [diff] [review]
Vanilla comments

R+ because it works.  Suggestions/fixes:

- no need to turn off foreign_key_checks in your xxx-*sql

- You've got your email address hardcoded in /controllers/components/email.php

- in models/versioncomment.php's subscribe() and unsubscribe(), can you use "REPLACE INTO" to avoid the double queries?

- Can we add "[AMO]" to the front of the subject and "; $addonName Review ($addonId)" to the end?  (open to other suggestions too)

- I think expanding/collapsing is supposed to be per-thread, not per message.  So collapsing a thread would mean all sub-messages would be hidden.  I might be wrong on this though, Rey?

- I think there should be a short blurb under "Editor Comments" that explains exactly what they are and (most importantly) who can see them and who gets emailed.  Something like:  "These comments can only be seen by editors and administrators"

- For the record, we run out of memory on this somewhere between 2000 and 3000 comments on a version.  I think that's fine for now.  I didn't see anywhere else that was loading these comments needlessly.
Attachment #384816 - Flags: review?(clouserw) → review+
Yes collapsing is per thread.
Regarding the blurb below the "Editor Comments" label, I don't think that's necessary as the only people that will be able to see this are editors or admins.
Thanks for the suggestions, Wil.

This patch addresses them all, and also adds comment counts to the history table (which I completely overlooked in the spec).

Note: will probably cause a couple of collisions on strings if applied after the Markdown patch.
Attachment #385303 - Flags: review?(clouserw)
r28610 for vanilla patch and r28611 for its migration
Comment on attachment 385303 [details] [diff] [review]
Fixes to Vanilla Comments

More than a few hundred replies in a thread makes firefox start to throw up "stop script" warnings.  I don't think it's worth being worried about yet.
Attachment #385303 - Flags: review?(clouserw) → review+
r28901 for vanilla fixes.

If the thread expand/collapse performance becomes an issue, I think using show()/hide() rather than slideDown()/slideUp() on replies would take care of things without losing much visual feedback from the animation.
Blocks: 501581
Hey Scott, I'm sorry but I'm not going to have time to review that patch in 5.0.7.  We'll hit it early 5.0.8, and I don't think it's blocking anything else.  Thanks.
Whiteboard: [patch]
Target Milestone: 5.0.7 → 5.0.8
Attached patch Markdown + Markitup v2 (obsolete) — Splinter Review
The previous patch had some serious bit-rot, so here is a fresh one for review (and so I can generate some new patches for phase2 bug that depend on this one)
Attachment #385228 - Attachment is obsolete: true
Attachment #388773 - Flags: review?(clouserw)
Attachment #385228 - Flags: review?(clouserw)
Blocks: 495492
I've never used markdown so it may just be because it's different, but I'm pretty sure I don't like it. :)  Requiring a tab indent for a code block?  Who thought that was a good idea (particularly because tab jumps form elements)?  Highlighting multiple lines of text and trying to apply a style (list, code, quote) only applies it to the first line.  Markdown gets a solid meh.  However, it does give Rey his buttons.

So, comments on the patch.  The code looked good.  There were a ton of lines where you removed the third argument to ___() - I'd rather you not do that as it makes merging into the .po later harder (just for future reference).  Additionally you're hardcoding the list of acceptable tags in the help.  Seems like we could just print BASIC_HTML_TAGS out.

And comments on the idea.  I'm wary about allowing HTML.  Using some kind of markup makes me happier, but apparently I can still use normal HTML as well with this library?  That means somewhere deep in this pile of code they are doing some kind of parsing to figure out what I meant and trying to avoid disallowed stuff?  That makes me nervous.

Additionally, allowing <a> still raises security issues (bug 343573).  I don't see a way around that.

So, from here:

1) Is there a way to only allow a non-html markup and only allow the tags in the  spec?

2) Rey, do we need <a> in this go-round?  I don't know of a way to do that securely without some major changes on AMO (setting up a new domain kind of changes).
(In reply to comment #20)

> There were a ton of lines where you removed the third argument to ___() -
> I'd rather you not do that as it makes merging into the .po later harder
> (just for future reference).

Those string changes and a merged .po were part of the v1 patch. By the time v2 was updated to apply cleanly, the .po had already picked up the strings in trunk so I left out unnecessary conflicts. At any rate, when strings *are* merged would you like fall-back values to remain or be removed from code?

> That means somewhere deep in this pile of code they are doing some kind of
> parsing to figure out what I meant and trying to avoid disallowed stuff?
> That makes me nervous.

I agree and certainly don't trust a regex-based parser to be bulletproof which is why the component applies strip_tags() to the transformed markdown. However, strip_tags() doesn't touch attributes of white-listed tags, so for the final pass I resorted back to regexs for attribute sanitation. I'd love a better solution.

> 1) Is there a way to only allow a non-html markup and only allow the tags in
> the spec?

We could temporarily encode code blocks, run the markdown text through strip_tags() _without_ a whitelist, decode the code blocks, then send everything through markdown to generate its benign html. The result should include markup and attributes only created by markdown. Code block parsing and transformation becomes the key point of trust.

Limiting the tags that markdown produces could be done with an extra white-listed strip_tags() call at the end, or by customizing the markdown parser (not very fun).
(In reply to comment #21)
> (In reply to comment #20)
> 
> > There were a ton of lines where you removed the third argument to ___() -
> > I'd rather you not do that as it makes merging into the .po later harder
> > (just for future reference).
> 
> Those string changes and a merged .po were part of the v1 patch. By the time v2
> was updated to apply cleanly, the .po had already picked up the strings in
> trunk so I left out unnecessary conflicts. At any rate, when strings *are*
> merged would you like fall-back values to remain or be removed from code?

If they are in the en-US .po already, I don't care one way or the other.  Whatever is easiest for you.


> > That means somewhere deep in this pile of code they are doing some kind of
> > parsing to figure out what I meant and trying to avoid disallowed stuff?
> > That makes me nervous.
> 
> I agree and certainly don't trust a regex-based parser to be bulletproof which
> is why the component applies strip_tags() to the transformed markdown. However,
> strip_tags() doesn't touch attributes of white-listed tags, so for the final
> pass I resorted back to regexs for attribute sanitation. I'd love a better
> solution.

I don't think users need to enter attributes at all.

> 
> > 1) Is there a way to only allow a non-html markup and only allow the tags in
> > the spec?
> 
> We could temporarily encode code blocks, run the markdown text through
> strip_tags() _without_ a whitelist, decode the code blocks, then send
> everything through markdown to generate its benign html. The result should
> include markup and attributes only created by markdown. Code block parsing and
> transformation becomes the key point of trust.
> 
> Limiting the tags that markdown produces could be done with an extra
> white-listed strip_tags() call at the end, or by customizing the markdown
> parser (not very fun).

I'd rather not strip tags, since that's throwing out content.  Can we sanitize the whole string (which, I don't think will mess with markdown markup) and then run it through markdown to add the html tags?  Then we push that to the view and anything that wasn't in markdown's format is turned into entities and will just show up.  What do you think?
(In reply to comment #20)
> Requiring a tab indent for a code block?  Who
> thought that was a good idea (particularly because tab jumps form elements)? 

FWIW, "To produce a code block in Markdown, simply indent every line of the block by at least 4 spaces or 1 tab." from http://daringfireball.net/projects/markdown/syntax#precode .
(In reply to comment #22)

> I'd rather not strip tags, since that's throwing out content.  Can we sanitize
> the whole string (which, I don't think will mess with markdown markup) and then
> run it through markdown to add the html tags?  Then we push that to the view
> and anything that wasn't in markdown's format is turned into entities and will
> just show up.  What do you think?

I think that might work. Markdown does entityize code block content, so that would either need to be temporarily encoded to avoid double entities, or removing markdown's entity transformation. The phase2 patch already extends the markdown parser in order to support fenced code blocks and syntax highlighting, so that would be an obvious place to change behavior. I'll try it out later this morning.
(In reply to comment #20)
> 2) Rey, do we need <a> in this go-round?  I don't know of a way to do that
> securely without some major changes on AMO (setting up a new domain kind of
> changes).

If it's a major change then we can bypass it. Will they still be able to enter a text-based URL (non-hyperlinked) into the comment?
(In reply to comment #25)
> (In reply to comment #20)
> > 2) Rey, do we need <a> in this go-round?  I don't know of a way to do that
> > securely without some major changes on AMO (setting up a new domain kind of
> > changes).
> 
> If it's a major change then we can bypass it. Will they still be able to enter
> a text-based URL (non-hyperlinked) into the comment?

Yep
Go for it.
Attachment #388773 - Flags: review?(clouserw) → review-
As suggested, markdown text is now sanitized before processing so things should be much safer. Compared to the v2 patch, a new AmoMarkdownParser class extends the vendor class to:

 - tighten down allowed syntax
 - replace sucky space-prefixed codeblock syntax with a "fenced" syntax
 - work with sanitized markdown (the > character was used for blockquotes)
Attachment #388773 - Attachment is obsolete: true
Attachment #390288 - Flags: review?(clouserw)
Comment on attachment 390288 [details] [diff] [review]
Markdown + Markitup v3

Feels like we are including a ton of code for some simple replacement, but maybe we'll expand things in the future.

I still don't like the markdown syntax though.  Even simple stuff doesn't react the way I would expect.  E.g.
>a
>a

Is put all on one line because I didn't add extra spaces to the end.  Weak sauce.

Anyway, r+.
Attachment #390288 - Flags: review?(clouserw) → review+
markdown + markitup added in r30354.

Wil and I concluded that a webroot/vendors area for Markitup (and other tightly integrated JS libraries) would be best for organization. The committed revision reflects this deviation from the patch.
Status: NEW → RESOLVED
Closed: 15 years ago
Keywords: push-needed
Resolution: --- → FIXED
Product: addons.mozilla.org → addons.mozilla.org Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: