Open Bug 442747 Opened 16 years ago Updated 13 years ago

RDF buglist output is unhelpful and inconsistent

Categories

(Bugzilla :: Query/Bug List, defect)

Version: 3.0.4
Type: defect
Priority: Not set
Severity: minor


People

(Reporter: paulc, Unassigned)

Details

I was working on a module for the future QMO website, trying to aggregate bugs to generate various stats (# of bugs filed per day, qawanted bugs, and more).

This may be broken down into several bugs. The bug is mostly to help others in the future, as I've mostly found workarounds for any problems described below. The main concern here is with the inconsistencies in XML formatting. It would have helped me, and it would help others in the future if we make it standards compliant and follow good practices. I'm no expert on XML and XSLT practices, so please pardon me if I am horribly wrong in any of these.

1. Bugzilla outputs lists of bugs in XML format (add parameter ctype=rdf to URL)
The XML formatting is not quite consistent with standards and the naming/namespacing isn't particularly great.
For example, try: 
https://bugzilla.mozilla.org/buglist.cgi?ctype=rdf&bug_status=UNCONFIRMED%2CNEW%2CASSIGNED%2CREOPENED%2CRESOLVED&chfield=[Bug%20creation]&chfieldfrom=-24h&field-1-0-0=bug_status&field-1-1-0=product&product=Firefox&query_format=advanced&remaction=&type-1-0-0=any
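
To make the example URL easier to read, here is a minimal sketch that splits it into its query parameters using only the Python standard library. The function name `explain_query` is hypothetical; the parameter names and values come from the URL above.

```python
from urllib.parse import urlsplit, parse_qs

EXAMPLE_URL = (
    "https://bugzilla.mozilla.org/buglist.cgi?ctype=rdf"
    "&bug_status=UNCONFIRMED%2CNEW%2CASSIGNED%2CREOPENED%2CRESOLVED"
    "&chfield=[Bug%20creation]&chfieldfrom=-24h"
    "&field-1-0-0=bug_status&field-1-1-0=product&product=Firefox"
    "&query_format=advanced&remaction=&type-1-0-0=any"
)


def explain_query(url):
    """Return the decoded query parameters of a buglist URL as a dict of lists."""
    query = urlsplit(url).query
    # keep_blank_values=True so that empty parameters such as "remaction=" survive.
    return parse_qs(query, keep_blank_values=True)


if __name__ == "__main__":
    for name, values in explain_query(EXAMPLE_URL).items():
        print(f"{name} = {values}")
```

Decoded this way, the query reads as: RDF output (`ctype=rdf`), Firefox bugs in any of the listed statuses, where the "[Bug creation]" field changed within the last 24 hours.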

Issues:
* XML tags:
- severity is called "bug_severity", status is called "bug_status", but other tags have no "bug" prefix
- every bug is included in a <bz:bug> tag, which is included in a <li> tag. Why include <li> that contains only one element?
- similarly (but the other way around), <Seq> is included in a <bz:bugs> tag, again only one element in another.
- and the third time, <RDF> includes <bz:result> which includes everything.

* Values/content:
- changeddate and opendate have THREE different formats. This may be useful for readability on Bugzilla lists (html), but if one wishes to parse this XML and output it nicely (which is why we offer it in the first place, I would suspect), it's hard to deal with all this inconsistency. The three formats available now are:
a) time (hh:mm:ss), if bug filed today
b) day(three letter abbreviation) + time (hh:mm, no seconds!), if bug  filed within the past 7 days
c) date (with no time), if bug filed before that
- some fields are capitalized (e.g. bug status, resolution), while others aren't
- some empty fields display dashes, 2 or 3. e.g. target_milestone displays 3 "---" and priority displays 2 "--"
- if field is set to value "all", first letter is capitalized "All"
- some <short_desc> (summary) tags display the content on a new line:
<bz:short_desc>
content...
</bz:short_desc>
While others display it on a single line:
<bz:short_desc>content...</bz:short_desc>
- for a single bug, the parameter to display it in XML format is ctype=xml; for a list of bugs, it's ctype=rdf
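
As an illustration of the workaround the three date formats above require, here is a rough sketch of a normalizer. The function name is hypothetical, and the exact layout of format (c) is not stated in this report, so the sketch assumes an ISO-style "YYYY-MM-DD" date. It also assumes English three-letter weekday names for format (b).

```python
import re
from datetime import datetime, timedelta

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def normalize_bug_date(value, now):
    """Return a full datetime for any of the three buglist date formats."""
    value = value.strip()

    # (a) "hh:mm:ss": the bug was filed or changed today.
    match = re.fullmatch(r"(\d{1,2}):(\d{2}):(\d{2})", value)
    if match:
        hour, minute, second = map(int, match.groups())
        return now.replace(hour=hour, minute=minute, second=second, microsecond=0)

    # (b) "Ddd hh:mm": within the past 7 days; seconds are not available.
    match = re.fullmatch(r"([A-Za-z]{3}) (\d{1,2}):(\d{2})", value)
    if match:
        target = WEEKDAYS.index(match.group(1).title())
        # Assumption: the same weekday name as today means one week ago,
        # because timestamps from today would use format (a).
        days_back = (now.weekday() - target) % 7 or 7
        day = now - timedelta(days=days_back)
        return day.replace(hour=int(match.group(2)), minute=int(match.group(3)),
                           second=0, microsecond=0)

    # (c) Date only. Assumption: "YYYY-MM-DD"; adjust for your installation.
    return datetime.strptime(value, "%Y-%m-%d")
```

The reference time `now` is passed in explicitly because formats (a) and (b) are only meaningful relative to the moment the list was generated.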

Suggested solutions:
- remove the "bug_" prefix from severity and status. it's inside a bz:bug tag so it's clear what each is referring to
- remove <li>, <Seq>, <RDF>
- remove the namespacing "bz:" or add a namespace stylesheet
- everything is prefixed with bz:. Is there a namespace for this XML? If there is one, it should be referenced in the output.
- make changeddate and opendate display a full and standard date-time format, such as m/d/y hh:mm:ss
- lowercase all XML tags and output values for resolution and status
- make all empty values/unassigned values display consistently, either "none", "empty" or something.
- values for "all" (like OS, Platform), should be lowercase
- <bz:short_desc> displays consistently on the same line like all other fields
- add another format, say ctype=xml, with the fixes.
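
Until the output changes, a consumer can apply the value-related suggestions on its own side. The sketch below is one possible way to do that: the helper name and the choice of which fields to lowercase are assumptions made for illustration, not anything Bugzilla provides.

```python
# Placeholders that this report observed for "empty" fields.
EMPTY_MARKERS = {"", "--", "---"}

# Assumption: fields whose values a consumer may want lowercased.
LOWERCASE_FIELDS = {"bug_status", "resolution", "op_sys", "rep_platform"}


def normalize_value(field, value):
    """Map empty placeholders to None and lowercase selected field values."""
    value = value.strip()
    if value in EMPTY_MARKERS:
        return None
    if field in LOWERCASE_FIELDS:
        return value.lower()
    return value
```

Returning `None` for every empty marker means downstream code needs a single check instead of one per dash count.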

2. Documentation on query parameters, output format, url explanation.
Is this already done? It'd be really helpful. I am willing to take this assignment and do as much of it as I can. If so, where should I put the doc?
Mainly, it would help to have documentation on the parameters to put in the query to get certain things. For example, for changed bugs, you can put in "chfieldfrom=start_date&chfieldto=end_date" to get the bugs changed between start_date and end_date.
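
A small sketch of building such a date-range query follows. It assumes that chfieldfrom marks the start of the range and chfieldto the end, which is consistent with the chfieldfrom=-24h in the example URL of this report. The function name and the example.org host are placeholders.

```python
from urllib.parse import urlencode


def changed_between_url(buglist_url, start_date, end_date, **extra):
    """Build a buglist URL for bugs changed between start_date and end_date."""
    params = {
        "ctype": "rdf",             # ask for the RDF buglist output
        "chfieldfrom": start_date,  # assumed: start of the change window
        "chfieldto": end_date,      # assumed: end of the change window
    }
    params.update(extra)            # e.g. product="Firefox"
    return buglist_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(changed_between_url(
        "https://bugzilla.example.org/buglist.cgi",
        "2008-06-01", "2008-06-30", product="Firefox"))
```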
OS: Mac OS X → All
Hardware: PC → All
Version: unspecified → 3.0.4
gerv might have some thoughts on any backwards compatibility problems that changes like this might have on other applications or reports built on bugzilla.

Sam has started work on some dashboard(s) that help to surface important bug work, and this might help or have an impact on that work.
Oh yeah, I forgot: For feedback, suggestions, and if you wanna see what I've done so far, check out:
https://intranet.mozilla.org/Paul_Craciunoiu
(chris, I added your duplicates stats idea)
Feel free to edit the page and add your suggestions, with [yourname] at the end so I know who to ask for details if I don't get it :)

Sam, maybe I can help you out with what you'll be doing, or we can coordinate, so that we don't do the same work twice.
Some time ago I was working on a central start page for Bugzilla users that could present the data the user is really looking for (especially if they're a dev):

http://landfill.bugzilla.org/gandui

plus revision of search mockup:

http://landfill.bugzilla.org/gandui/query.cgi

just adding here because maybe you'll find some elements interesting for your work
myk also had a number of features in the old 'buzulla' extension that might be interesting to surface in the qa companion, or web content status pages that help to surface important bugs, and manipulate search results.

you can have a look at old bugzula stuff here, but you might need an older version of the suite or firefox to try it out.
Thanks a bunch Zbigniew!
Question: for your second link, what are the queries for the "Changed" tab that display bugs commented, confirmed or resolved between A and B? The other ones for changed and opened I already saw on Bugzilla.

Also, if you have any documentation or easy way to see how you correlated the query with the fields there, it'd be helpful!
on https://intranet.mozilla.org/Paul_Craciunoiu there are some ideas on bug filters/queries as in: 

#  Blocking Bugs filed by the community [marcia]
# Blocking Bugs filed by the QAE team [marcia] 

One problem here is that our definition of the bug markings that mean "blocking" seems to change from project to project, or even within various cycles and stages of progress. Getting some standard reports that are really useful might help us make the set of 'blocking markings' we use more consistent across projects and development lifecycles, or we need to make that part of the system flexible enough that a project admin could adjust the queries and reports on the fly...

Chris, should there have been a link in your comment? Where can I find the extension? bugzula is it?
Paul: it's a mockup, sorry. I did not work on implementation.
Please focus on *one* feature/enhancement only and fix it or this bug will be closed as INVALID. Having such bugs is completely useless for us. It's not the right place to have a discussion on all the various improvements we could have in Bugzilla. If bugs related to a specific RFE already exist, then comment in these bugs directly.
(In reply to comment #7)
> Chris, should there have been a link in your comment? Where can I find the
> extension? bugzula is it?

It's actually Bugxula, and the code is available at http://bugxula.mozdev.org/source.html .
Thanks for posting those mocks. I don't recall anyone mentioning them and since we're hoping to redo the homepage and search, I'll keep this posting in mind!
I noticed that if you go to the list of bugs and click on the button called "XML", you get output that looks much better and is better formatted than what you get by just adding the "ctype=rdf" parameter to the URL.
Would it be possible at least to set that same format to be displayed when querying from the URL with ctype? Either rdf or xml would be fine and help me quite a bit...
> Please focus on *one* feature/enhancement only and fix it or this bug will be
closed as INVALID.

we may spin several bugs off this and make this a tracking bug.  hold off on marking this invalid, there is actually a lot of useful discussion going on in the bug right now that will help to coordinate a lot of parallel thinking and work that is going on.

Marking this as meta bug (how to mark something as a "tracking" bug)
Keywords: meta
(In reply to comment #0)
> Issues:
> * XML tags:
> - severity is called "bug_severity", status is called "bug_status", but other
> tags have no "bug" prefix

These names are directly from the database. Yes, they are historically slightly crufty but there's a big advantage that they are unambiguous and the same everywhere. You shouldn't be directly displaying these strings anyway, because it will make your output un-localizable.

> - every bug is included in a <bz:bug> tag, which is included in a <li> tag. Why
> include <li> that contains only one element?

I suspect that this is what the RDF standard requires for a <Seq>. RDF stands for Resource Description Framework, which is understood by several programs. We didn't just invent all this ourselves. But even if it's not required by RDF, why is it harmful? What would be the gain in changing it and breaking compatibility?

> - similarly (but the other way around), <Seq> is included in a <bz:bugs> tag,
> again only one element in another.

This is because we are combining tags from two namespaces - the RDF namespace and our own.

> - and the third time, <RDF> includes <bz:result> which includes everything.

And again.

> * Values/content:
> - changeddate and opendate have THREE different formats. 

Yes, this needs fixing. I believe there's a bug on it, because it also makes client-side sorting in HTML buglists difficult.

> - some fields are capitalized (e.g. bug status, resolution), while others
> aren't

That's just the values in the database (which are the defaults). People are used to it. What's the problem?

> - some empty fields display dashes, 2 or 3. e.g. target_milestone displays 3
> "---" and priority displays 2 "--"

Feel free to fix that.

> - if field is set to value "all", first letter is capitalized "All"

I'm not sure what you mean here. "All" is the text value in the database.

> - some <short_desc> (summary) tags display the content on a new line:
> <bz:short_desc>
> content...
> </bz:short_desc>
> While others display it on a single line:
> <bz:short_desc>content...</bz:short_desc>

Your XML parser should have no problem with that. You are using an XML parser, right?
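
As a minimal sketch of that point: with a real XML parser, neither the wrapper elements nor the line breaks inside <bz:short_desc> need special handling. The sample document below is hand-written to mirror the structure described in this bug, and the bz namespace URI in it is a placeholder, so the code matches on local tag names only.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:bz="http://example.org/bz-placeholder#">
  <bz:result>
    <bz:bugs>
      <Seq>
        <li>
          <bz:bug>
            <bz:id>1</bz:id>
            <bz:short_desc>
first summary
            </bz:short_desc>
          </bz:bug>
        </li>
        <li>
          <bz:bug>
            <bz:id>2</bz:id>
            <bz:short_desc>second summary</bz:short_desc>
          </bz:bug>
        </li>
      </Seq>
    </bz:bugs>
  </bz:result>
</RDF>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_buglist(xml_text):
    """Return one dict per <bug> element, whatever wrappers surround it."""
    root = ET.fromstring(xml_text.strip())
    bugs = []
    for elem in root.iter():
        if local_name(elem.tag) == "bug":
            # Stripping whitespace makes both short_desc layouts equivalent.
            bugs.append({local_name(child.tag): (child.text or "").strip()
                         for child in elem})
    return bugs
```

Because `root.iter()` walks the whole tree, the <li>, <Seq>, and <bz:result> nesting has no effect on the result.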

> - for a single bug, the parameter to display it in XML format is ctype=xml; for
> a list of bugs, it's ctype=rdf

That's because the RDF buglists are RDF :-) The XML bugs are our own schema.

> - everything is prefixed with bz:. Is there a namespace for this XML? If there
> is one, it should be referenced in the output.

Nope, because I don't think there was ever a proper schema for our additions.

> - lowercase all XML tags and output values for resolution and status

You mean, change the data so you aren't returning what's in the database?

> - values for "all" (like OS, Platform), should be lowercase

Why?

> 2. Documentation on query parameters, output format, url explanation.
> Is this already done? It'd be really helpful. 

It's called the code :-) Documentation has a tendency to get out of sync with reality unless it's as close to the code as possible. I would suggest adding any documentation to the relevant template files.

> Mainly, it would help to have a documentation on the parameters to put in the
> query to get certain things. For example, for changed bugs, you can put in
> "chfieldfrom=start_date&chfieldto=end_date" to get the bugs changed between
> start_date and end_date.

That's documented by the fact that you can do it from the query page...

Gerv
(In reply to comment #2)
> Oh yeah, I forgot: For feedback, suggestions, and if you wanna see what I've
> done so far, check out:
> https://intranet.mozilla.org/Paul_Craciunoiu
> (chris, I added your duplicates stats idea)
> Feel free to edit the page and add your suggestions, with [yourname] at the end
> so I know who to ask for details if I don't get it :)

Paul,

You have a Gmail email address but access to intranet.mozilla.org, so I am guessing you are a contractor of some sort. You may not be aware that the Mozilla project consists of a large community of people from all around the world, and various different companies or none. intranet.mozilla.org is accessible only to employees/contractors of the Mozilla Foundation and Corporation, and so is not a suitable spot for public development work. Most project participants can't see what you write there, let alone add comments.

Feel free to create a page on the public wiki.mozilla.org for your work :-)

Gerv
(In reply to comment #12)
> I noticed that if you go to the list of bugs and click on the button called
> "XML" you get an output that looks much better and formatted as opposed to when
> you just add the "ctype=rdf" parameter in the URL.
> Would it be possible at least to set that same format to be displayed when
> querying from the URL with ctype?

If you view the source (or, on a small buglist, use the "frmget" bookmarklet so you can see the URL generated when you click the button), you will see that you just need to call show_bug.cgi with multiple "id=XXX" parameters and a "ctype=xml". This is not the same as the RDF buglists, because it includes all fields of the bug except for attachment data, whereas the buglists only include the fields in the columns you have requested for your search.
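
The request described above can be sketched as follows: one "ctype=xml" parameter plus one "id" parameter per bug. The function name and the example.org host are placeholders; only show_bug.cgi, ctype, and id come from this comment.

```python
from urllib.parse import urlencode


def show_bug_xml_url(base_url, bug_ids):
    """Build a show_bug.cgi URL with ctype=xml and one id parameter per bug."""
    params = [("ctype", "xml")] + [("id", str(bug_id)) for bug_id in bug_ids]
    return f"{base_url}/show_bug.cgi?" + urlencode(params)


if __name__ == "__main__":
    print(show_bug_xml_url("https://bugzilla.example.org", [1, 2, 3]))
```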

I would also recommend that you run your ideas past the mozilla.dev.apps.bugzilla newsgroup on news.mozilla.org, where the Bugzilla hackers hang out. They may well have useful input to give.

Gerv
(In reply to comment #15)
> > * XML tags:
> > - severity is called "bug_severity", status is called "bug_status", but other
> > tags have no "bug" prefix
> 
> These names are directly from the database. Yes, they are historically slightly
> crufty but there's a big advantage that they are unambiguous and the same
> everywhere. You shouldn't be directly displaying these strings anyway, because
> it will make your output un-localizable.
It's not about displaying them, it's about parsing them for a key=>value kinda thing, the tag name being the key. I have two options: either keep the same inconsistency or do some if's. Either way it's a hassle. It's not a big deal if this is hard to change in the output.
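
For what it's worth, the "if's" can be reduced to a single lookup table. This is only a sketch of the consumer-side workaround; the alias table and function name are my own and nothing here changes what Bugzilla emits.

```python
# Assumption: the only two keys that carry the "bug_" prefix.
KEY_ALIASES = {
    "bug_severity": "severity",
    "bug_status": "status",
}


def rename_keys(bug):
    """Return a copy of a tag-name => value dict with prefixed keys renamed."""
    return {KEY_ALIASES.get(key, key): value for key, value in bug.items()}
```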
> > - every bug is included in a <bz:bug> tag, which is included in a <li> tag. Why
> > include <li> that contains only one element?
> 
> I suspect that this is what the RDF standard requires for a <Seq>. RDF stands
> for Resource Description Framework, which is understood by several programs. We
> didn't just invent all this ourselves. But even if it's not required by RDF,
> why is it harmful? What would be the gain in changing it and breaking
> compatibility?
This is yours 100%, since I don't know enough about XML/RDF to comment on it. So you may disregard the three double-tags issues for the two namespaces.
> > - similarly (but the other way around), <Seq> is included in a <bz:bugs> tag,
> > again only one element in another.
> 
> This is because we are combining tags from two namespaces - the RDF namespace
> and our own.
In that case, is there a link I can use to the Bugzilla namespace? Or just know for the future?
> > - and the third time, <RDF> includes <bz:result> which includes everything.
> 
> And again.
> 
> > * Values/content:
> > - changeddate and opendate have THREE different formats. 
> 
> Yes, this needs fixing. I believe there's a bug on it, because it also makes
> client-side sorting in HTML buglists difficult.
Thank you :-)
> 
> > - some fields are capitalized (e.g. bug status, resolution), while others
> > aren't
> 
> That's just the values in the database (which are the defaults). People are
> used to it. What's the problem?
Inconsistency...? It's not a *big* problem, I was just hoping to help make Bugzilla more consistent.
> > - some empty fields display dashes, 2 or 3. e.g. target_milestone displays 3
> > "---" and priority displays 2 "--"
> 
> Feel free to fix that.
Which I will by using some if's... that would otherwise be unnecessary.
> > - if field is set to value "all", first letter is capitalized "All"
> 
> I'm not sure what you mean here. "All" is the text value in the database.
> 
> > - some <short_desc> (summary) tags display the content on a new line:
> > <bz:short_desc>
> > content...
> > </bz:short_desc>
> > While others display it on a single line:
> > <bz:short_desc>content...</bz:short_desc>
> 
> Your XML parser should have no problem with that. You are using an XML parser,
> right?
Again, just helping with consistency. I have workarounds for all of the issues in here, I was mostly trying to make future challenges like this easier for others.
> > - for a single bug, the parameter to display it in XML format is ctype=xml; for
> > a list of bugs, it's ctype=rdf
> 
> That's because the RDF buglists are RDF :-) The XML bugs are our own schema.
Okay...
> > - everything is prefixed with bz:. Is there a namespace for this XML? If there
> > is one, it should be referenced in the output.
> 
> Nope, because I don't think there was ever a proper schema for our additions.
*sigh*
> > - lowercase all XML tags and output values for resolution and status
> 
> You mean, change the data so you aren't returning what's in the database?
It was a bad idea to store some as upper case and some as lower case in the first place. Not that it's too complicated to lowercase them, but, again, inconsistency.
> > - values for "all" (like OS, Platform), should be lowercase
> 
> Why?
> 
> > 2. Documentation on query parameters, output format, url explanation.
> > Is this already done? It'd be really helpful. 
> 
> It's called the code :-) Documentation has a tendency to get out of sync with
> reality unless it's as close to the code as possible. I would suggest adding
> any documentation to the relevant template files.
> 
> > Mainly, it would help to have a documentation on the parameters to put in the
> > query to get certain things. For example, for changed bugs, you can put in
> > "chfieldfrom=start_date&chfieldto=end_date" to get the bugs changed between
> > start_date and end_date.
> 
> That's documented by the fact that you can do it from the query page...
True :-)
> Gerv
> 

Thanks for the input, Gerv.
(In reply to comment #17)
> (In reply to comment #12)
I'll poke around, thanks for the suggestions! Any help is appreciated.
(In reply to comment #16)
> (In reply to comment #2)
> > Oh yeah, I forgot: For feedback, suggestions, and if you wanna see what I've
> > done so far, check out:
> > https://intranet.mozilla.org/Paul_Craciunoiu
> > (chris, I added your duplicates stats idea)
> > Feel free to edit the page and add your suggestions, with [yourname] at the end
> > so I know who to ask for details if I don't get it :)
> 
> Paul,
> 
> You have a Gmail email address but access to intranet.mozilla.org, so I am
> guessing you are a contractor of some sort. You may not be aware that the
> Mozilla project consists of a large community of people from all around the
> world, and various different companies or none. intranet.mozilla.org is
> accessible only to employees/contractors of the Mozilla Foundation and
> Corporation, and so is not a suitable spot for public development work. Most
> project participants can't see what you write there, let alone add comments.
> 
> Feel free to create a page on the public wiki.mozilla.org for your work :-)
> 
> Gerv
> 
Yeah, I'll be moving that to a public spot. It was really in the drafting stage for myself when it started. Good point!
Hey Paul. Please do break this down into several bugs and close this one--it's very difficult to read or track one bug with many different issues in it.

If you want a stable API, you should use the XMLRPC API, although it doesn't support searching yet, so I suppose that doesn't help much in this case.

If you want better XML output, you should search for just bug_ids (use columnlist=) and use show_bug.cgi?ctype=xml&id=1,2,3,4,5--that's what I do when I'm integrating with Bugzilla.
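
A rough sketch of the second step, assuming the bug IDs have already been collected from a search: it builds show_bug.cgi URLs with comma-separated IDs, split into batches so that no single URL grows too long. The batch size, function name, and example.org host are illustrative assumptions.

```python
from urllib.parse import urlencode


def batched_show_bug_urls(base_url, bug_ids, batch_size=100):
    """Build show_bug.cgi?ctype=xml URLs with comma-separated ids, in batches."""
    urls = []
    for start in range(0, len(bug_ids), batch_size):
        batch = bug_ids[start:start + batch_size]
        query = urlencode(
            {"ctype": "xml", "id": ",".join(str(bug_id) for bug_id in batch)},
            safe=",",  # keep the commas readable instead of encoding them as %2C
        )
        urls.append(f"{base_url}/show_bug.cgi?{query}")
    return urls


if __name__ == "__main__":
    for url in batched_show_bug_urls("https://bugzilla.example.org",
                                     [1, 2, 3, 4, 5], batch_size=2):
        print(url)
```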

Summary: Broad Bugzilla enhancements → RDF buglist output is unhelpful and inconsistent
Assignee: ui → query-and-buglist
Severity: normal → minor
Component: User Interface → Query/Bug List
Keywords: meta
Hey Max. Thanks, I'll look into the API. Do you have a quick way for posting only the bug_ids you're interested in to show_bug.cgi?

I've opened a thread on the mozilla.dev.apps.bugzilla newsgroup here:
http://groups.google.com/group/mozilla.dev.apps.bugzilla/browse_thread/thread/a1794cbccdd3f6f2#
... and made the content (previously on intranet) publicly available at:
http://wiki.mozilla.org/QA/Community/QMO_Bugzilla
Severity: minor → normal
Component: Query/Bug List → User Interface
Severity: normal → minor
Component: User Interface → Query/Bug List
(In reply to comment #18)
> I have two options: either keep the same
> inconsistency or do some if's. Either way it's a hassle. 

Keep the same "inconsistency" (i.e. consistency with the database).

Gerv


(In reply to comment #15)
> > * Values/content:
> > - changeddate and opendate have THREE different formats. 
> 
> Yes, this needs fixing. I believe there's a bug on it, because it also makes
> client-side sorting in HTML buglists difficult.
> 
Is there really a bug for this particular issue?  I cannot find it.