Proposed Keyword: dev-notes
Description: Use this keyword to indicate that the bug is purely a note by the developer to keep track of a task, such as cleaning up a piece of code.
Reasons for the keyword to be added in Bugzilla:
1. A number of bugs opened by developers consist purely of notes to themselves, to keep track of a task they need to complete.
2. It will help to standardize a process where developers can add the keyword themselves, and it will be applicable to every other component too.
3. It would be easier for users to query a component and, if needed, eliminate such bugs from the bug list.
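As a rough illustration of reason 3, here is a minimal sketch of dropping keyword-tagged bugs from a bug list. It works on plain in-memory records rather than on Bugzilla itself; the function name `exclude_keyword`, the record layout, and the sample bugs are all hypothetical and chosen only for the example.

```python
from typing import Dict, Iterable, List


def exclude_keyword(bugs: Iterable[Dict], keyword: str) -> List[Dict]:
    """Return only the bugs that do NOT carry the given keyword."""
    return [bug for bug in bugs if keyword not in bug.get("keywords", [])]


# Hypothetical bug records; a real list would come from a Bugzilla query.
bugs = [
    {"id": 1, "summary": "Crash when loading page", "keywords": ["crash"]},
    {"id": 2, "summary": "Clean up unused helper", "keywords": ["dev-notes"]},
    {"id": 3, "summary": "Table layout is wrong", "keywords": []},
]

remaining = exclude_keyword(bugs, "dev-notes")
print([bug["id"] for bug in remaining])  # prints [1, 3]
```

The same filter would apply unchanged to whichever keyword name is finally chosen.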
I think the intention is to indicate the bug has a "code level" change that cannot be verified without looking at the source code. Maybe a better keyword would be "whitebox_change"? I think this would be a useful keyword, since I frequently have to put in notations after closing this type of bug indicating it can only be verified by looking at the source code.
How about? keyword: whitebox Description: Indicates that the bug can only be verified by inspecting the code.
We need this keyword mostly for the following purposes:
Minimization (or creating reduced testcases): The minimization or testcase reduction process doesn't apply to white-box bugs, as the bug is already reduced to the code level. Testcases could still be created by someone who has the necessary understanding of the code. I think QA should not spend time trying to create such testcases. Optionally, developers may want to attach a testcase that exercises the behaviour implemented in the code.
Verification: As described above by Kevin, the bug can only be verified by inspecting the code -- unless a testcase is attached.
My recommendation for the description would be something like: Indicates that the bug does not require a reduced testcase to reproduce, and the bug may be verified by inspecting the code.
Is everyone happy with this?... Asa, is this OK with mozilla.org?
No additional comments in a week, so we'll assume everyone is happy with it. Asa, please go ahead and set up the following keyword/description asap: Keyword: whitebox Description: Indicates that the bug does not require a reduced testcase to reproduce, and it may be verified by inspecting the code.
asa's gone for about a week, i can now make these changes if he signs off, but...
Would Lisa be able to make this change in Asa's absence?
If Timeless can make the change, I prefer that he do it (as QA contact on this bug). Has the mozilla community agreed to this change? I don't see the approval anywhere. Timeless - can you get a sense from the Mozilla community in Asa's absence? We do need this keyword to help Kevin's team better prioritize the bugs that need reduced testcases.
sorry, while asa gave me the power he didn't authorize me to do that. i've cc'd some people for feedback.
The name and description of the keyword seem more specific than the use for which it is intended. I think we used to have a keyword that seems more appropriate for this use than what is proposed here: "donttest".
I personally have no objection. Of course, I'm not one to maintain keywords... Based on the explicit definition given, I don't see how it relates to documentation, and I can't offer much of an opinion either way. It would relate if we expanded "code" to include documentation markup... I'm thinking about this, and I don't have an answer either way. See bug 157668 for a slightly related issue -- I recommended we have some sort of keyword specifically for tracking documentation issues. Having this sort of keyword in addition to a docs notes keyword can't hurt documentation one bit. unofficial r=ajv (I can't officially r= anything yet ;) )
I'm not happy with the keyword whitebox. I understand the value of having a flag but I think whitebox doesn't convey the meaning this flag should. I think I agree with dbaron that donttest would be a better choice.
Maybe I'm just not with the lingo, but 'whitebox' doesn't mean anything to me. I'm not even exactly sure that I understand what the keyword's purpose is. Am I right in thinking that it's for changes that only affect the code, and not the functionality, of the product and therefore can only be verified by looking at the said code and not with test cases? If so, I think something like 'qacodeonly' would be a better keyword than 'whitebox' or 'donttest' (which I'm worried that some people could infer as 'dontcare').
"donttest" is a bit misleading since all of these bug fixes need testing of some sort even if it's by a developer or QA who has access to source code to view that the code level change has been made. How about: "codeverify" "codelevel"
Re comment 15: I have no idea why Netscape QA puts so much emphasis on pro-forma verification (and tries to enforce this emphasis on mozilla.org). Good testing involves much more than verification that certain narrowly-defined bugs (often defined by a single testcase) are fixed, and yet it seems that the only thing mozilla.org requires of its QA contacts is this single, often low-productivity, activity. Testing should concentrate on those activities that are most likely to improve the quality of the product. This sometimes includes verification, but usually verification in a much broader sense than it seems to be practiced by most current QA contacts. This sometimes involves running existing testcases and searching for regressions. It sometimes involves writing new testcases. Verification does have a purpose. It is useful to ensure that the bug, as described, has been fixed, rather than fixed in a single testcase that shows one instance of that description. This requires writing additional testcases that fit the description of the bug. Good verification also involves considering *broader* definitions of the bug. Bugs within the broader definition might be present even after the fix, in which case they should be filed. Bugs within the broader definition might also have been fixed by the fix checked in for the more narrowly defined bug. Such other bugs might even exist in other bug reports in Bugzilla, in which case those other bugs should be resolved, since a good QA contact would be somewhat knowledgeable about the bugs in his or her component and thus aware of their existence. (This is not the only reason a good QA contact should be aware of the bugs in the component. It is also useful to know which areas are generally buggy, and need more new testcases, or which areas are not being tested, and thus need more new testcases.) Good verification may also involve regression testing of areas related to the fix.
(After all, that should be the point of requiring trunk verification before something is checked in to a stable branch, but I don't think I've ever seen news of regressions caused by the fix for a bug, but outside of the description of the bug, noted on a bug *as part of* the branch verification process.) However, I get the sense that Netscape QA uses (and wants mozilla.org to use) number of bugs verified as a measure of productivity of testers. This is a very bad measure, since it encourages wasting time on verification that could have resulted in significantly larger improvements to the quality of the codebase if spent on some other activity (such as writing new testcases, an activity I think many QA contacts spend very little time on). (Note that I tend to think of the tasks of QA as those done in document-related (as opposed to network-related) standards compliance QA, which involves lots of testcases. However, I think there are probably strong analogies in many other areas.) If Netscape QA really thinks it's important to verify that certain patches attached to bugs have been checked in (i.e., that such activity is likely to catch errors that will lead to an improvement in quality), we should write a *script* to verify this (and note the checkin in the attachment status) in all cases whenever a bug is resolved or a branch resolution keyword is added, and QA contacts can stop wasting their time on activity that can be done by a script. What does this rant have to do with this bug? I claim that merely checking that a specific code change has been made is very unlikely to improve the quality of the product. (Checking that it has been made on a certain branch might sometimes be an exception, but rarely. Good verification (which includes things such as regression testing in related areas), as I described above, would likely be useful in more cases, but I don't see that being done by more than a handful of QA contacts.) 
If you're not going to do *useful* testing, there's no point doing testing at all, so I think "donttest" is an appropriate keyword. I don't think we need keywords whose names and descriptions are a description of bad testing procedure. If a bug hasn't been verified in a useful way, there is no need for it to change to the VERIFIED state.
Please note that Netscape does not use the number of bugs verified as a productivity metric in any performance reviews. Please do not infer this if you don't have direct knowledge of how we measure an employee's productivity in QA. However, I get paid, and get asked by the upper management who pay me, to make sure that we look at all the fixes that come into our product (Netscape), which is based off the Mozilla codebase. Currently, the only way I have to track that is to know if there are bugs which are not in the "verified" state (or keyword, if a branch is involved). I really don't care what this keyword is called and I really do not appreciate a lecture on QA or bug verifications. I was only offering a friendly opinion to perhaps help move this bug enhancement along. This enhancement was filed as a brainstorm with some of the layout development owners to help them triage and manage the amount of incoming bugs. And, if you do have some time, I'd be happy to share with you all of the regression testing and scheduled testing that we do on the product. Sometimes, views of Netscape QA are based on what the public can see (which is normally bug activity), but we do a lot more than that. If you are working at Netscape, I'd be happy to share with you our last month's status reports so you can see that we don't spend time on useless verifications. We have too much on our plate for that.
I think I can judge what things QA does that lead to improvements in the quality of the products created from the mozilla.org codebase, since improvements in the quality of the products are (almost entirely, modulo changes to documentation and to release schedules) caused by changes to the code, and all the changes to the code are public and have public bugs describing them. Based on following bugs (and who files them, and who helps in them), I can get a good sense of what improvements to the code happened as a result of or with the assistance of QA activity. And please don't kid me about the use of verification as a metric. Just because you don't use it in performance reviews doesn't mean you don't care about it too much. I've seen it many times, such as: bug 176140 comment 2 http://www.mozilla.org/status/2000-11-08.html
David is right on all counts, and I say this from first-hand experience as a member of Netscape's standards compliance QA team for over a year.
it would seem that either a "whitebox" or a "donttest" keyword would be used to exclude bugs, rather than identify them. the inverse, i suppose, would be a "testme" or "testable" keyword. either way, i don't see the actual benefit of creating this distinction, other than using a "testme" keyword for triage or qa.
I guess I'm the father of verification metrics as a way to understand and guide the project in what testing activity needs to be done at any point in the development or milestone release cycle, so all the arrows directed at using verification metrics should be directed at me. (Note I said "project"; verification was never intended as a metric to measure individual contributor performance.) Here is the thinking. For any given milestone somewhere around, say, 700 bug fixes or features are checked in. We have all observed that some pct. of bug fixes and feature work results in unintended side effects and problems. I submit that it is useful to understand what this pct. is, as the number can help to predict the chances of regression in the endgame when you're trying to decide which kinds of changes to take, how many changes you can take, and how many cycles of changes you can take with any expectation of "bouncing off and staying at zarro boogs" for a milestone. The only way to get at the "pct. of regression" is to attempt verification of all the changes for any milestone and reopen bugs where regressions are found. The only way to know if you have attempted verification of 700 or so changes is to track how many bugs have been verified and reopened. Updates to Bugzilla have broken my old reporting tools, but classically we would see somewhere around 20%-30% of all checkins result in some level of regression or unintended side effects. The reports also help us to understand the progress over time in attempting verification on the 700 or so bugs each milestone. If we neared the end of the milestone and we had "200 bugs verified" I submit that we really knew a lot less about the quality of the builds than if we had "600 bugs verified". Applying the unverified bug number and regression rate metrics we can roughly predict how many regressions remained undiscovered in the code base from the changes made during the milestone.
From this we can figure out if we can live with that level of risk, or if we need to extend the testing and ramp-down period to accommodate more testing and regression fixing to meet the needs of a given milestone. So that is the theory behind using verification metrics to help the project. We have used the theory several times to help roughly predict dates for major milestones like mozilla 1.0 and some of the other recent netscape release milestones. Can application of the theory create distortion if the depth and breadth of verification is uneven? Sure. But I think that is a different problem, and that gets us back to what I think the intention of this bug is. I think if we revisit the lifecycle of possible verification outcomes we can identify some states that will help us to spot communication and verification problems that are limiting our ability to get additional depth and breadth coverage where we could really use it. So here is a crack at some of those states. A bug fix or feature gets checked in. The QA contact approaches the bug,
- tries to get an understanding of the changes,
- tries to figure out a strategy and set of tools that would do the best job of confirming the existence of the fix,
- tries to figure out a strategy for turning up possible unintended side effects.
At this point the QA contact might be left in one of many states.
- They might, or might not, understand the changes based on information in the bug
- They might, or might not, have a good plan for confirming the fix
- They might, or might not, have a good plan for extended testing to check for side effects...
- They might, or might not, have an opinion about how much time or effort should be expended on verifying the bug fix.
In addition to these states, they might understand the bug and have good plans and ideas on how to proceed with the testing, but they also might need additional help from others to execute these tests, or they might be OK with executing the test plan on their own. Having keywords or additional verification states can help us to understand how the verifications are proceeding, fill in gaps where communication is missing, and generally understand the verification process so we can get better at it. Here are some samples of verification states that might be interesting from a project perspective. Some sample keywords are also suggested for each.
- Doing queries on the bug fixes that need additional information to develop a test plan would be useful to make sure communication gaps are filled. (need-testing-info)
- Querying the bug fixes that only received constrained testing around the fix due to time or resource constraints could tell us information that would help us to redirect folks to get additional depth or breadth coverage on key fixes. (need-limited-verf-help, limited-verf)
- Doing a query that tells us how many bugs we were able to complete high levels of depth and breadth coverage on can help us to understand how many bug fixes are getting the attention that we all want. (need-depth-test-help, depth-tested)
- Doing a query on the bugs where the QA contact made a decision reflecting their opinion that further testing of the bug has low value helps to set up project-wide review of these bugs to make sure the right testing decisions are being made. (donttest, whitebox, no-testing-needed, just-verf-the-patch-is-checked-in-mon, ...)
So here is what a future report might look like that would help us to manage the testing process for a milestone, make sure testing efforts are directed at the most important testing activities, and give us a way to better understand and improve how well we can test the changes going into any milestone...
700 - bugs fixed
  200 - unverified (not looked at by QA contacts)
  500 - some level of verification or examination
     60 - need testing info
    190 - limited-verf-completed
     20 - need-limited-verf-help
    110 - depth-tested-completed
     20 - need-depth-testing-help
    100 - donttest, dont-spend-more-time-on-this-dude, check-to-verf-patch-on-branch
I'd love these kinds of metrics. With them we could mount efforts on key bugs that need more testing help. We could quickly mark bugs that don't deserve the additional time and mark bugs that are only getting a limited amount of testing, but we would also put in checks communicating and auditing these decisions.
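The estimate described in the comment above (unverified fixes multiplied by the observed regression rate) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `MilestoneReport`, its field names, and the `breakdown` dictionary are hypothetical; the figures are the sample numbers from the report and the 20%-30% regression rate quoted in the comment, not measured data.

```python
from dataclasses import dataclass


@dataclass
class MilestoneReport:
    fixed: int       # bug fixes and features checked in for the milestone
    unverified: int  # fixes not yet looked at by a QA contact

    def expected_undiscovered(self, regression_rate: float) -> float:
        """Rough count of regressions still hiding in unverified fixes."""
        if not 0.0 <= regression_rate <= 1.0:
            raise ValueError("regression_rate must be between 0 and 1")
        return self.unverified * regression_rate


# Sample figures taken from the report above.
report = MilestoneReport(fixed=700, unverified=200)
breakdown = {
    "need-testing-info": 60,
    "limited-verf-completed": 190,
    "need-limited-verf-help": 20,
    "depth-tested-completed": 110,
    "need-depth-testing-help": 20,
    "donttest": 100,
}

# The examined categories should account for every fix that was looked at.
assert sum(breakdown.values()) == report.fixed - report.unverified

low = report.expected_undiscovered(0.20)
high = report.expected_undiscovered(0.30)
print(f"roughly {low:.0f} to {high:.0f} regressions may remain undiscovered")
```

With 200 unverified fixes this gives a range of about 40 to 60 undiscovered regressions. The sketch assumes the historical regression rate applies evenly to the unverified fixes, which the comment itself concedes may not hold if verification depth is uneven.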
The thing is, the overwhelming majority of the time, bugs are not reopened by their QA contact. They are reopened by other engineers or by triagers when finding newly filed regressions.
Anyway, this is getting away from the issue at hand, which is whether we need a keyword to label bugs for which it is not really appropriate to create a testcase or steps to reproduce. So the question is, do we really need such a keyword? Personally, as a layout QA contact, I don't think I need this keyword, but I do think that it is annoying to be QA contact on bugs like "rename nsIThis to nsIThat" -- I have no intention of verifying them by looking at lxr, and I'll be checking they haven't caused regressions simply by running my usual tests, not by specifically looking for regressions from such a bug. I am thinking of simply reassigning QA to email@example.com in cases such as that. What I usually say in keyword request bugs is that the people who want the keyword should try using a status whiteboard indicator first, and if that turns out to be really useful, then we can add a keyword.
No further comments, so those who are still interested in this should try using a status whiteboard indicator first, and if that turns out to be really useful, then we can add a keyword. WONTFIX for now.