Closed Bug 172191 Opened 22 years ago Closed 22 years ago

Add keyword : whitebox

Categories

(bugzilla.mozilla.org :: Administration, task)

Platform: x86
OS: All
Type: task
Priority: Not set
Severity: normal

Tracking


VERIFIED WONTFIX

People

(Reporter: madhur, Assigned: ian)

Details

Proposed Keyword: dev-notes

Description:
Use this keyword to highlight that the bug is purely a note by the developer to
keep track of a task, such as cleaning up a piece of code.

Reasons for the keyword to be added in Bugzilla:
1. A number of bugs opened by developers consist purely of notes to themselves,
to keep track of a task they need to complete.
2. It will help standardize a process where developers can add the keyword
themselves, and it will be applicable to every other component too.
3. It would make it easier for users to query a component and, if needed,
eliminate such bugs from the bug list (a sketch of such a query appears below).
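
As a rough sketch of what such a query might look like (the product and
component values are just placeholders, and this assumes the stock buglist.cgi
keywords/keywords_type parameters):

  https://bugzilla.mozilla.org/buglist.cgi?product=Browser&component=Layout&keywords=dev-notes&keywords_type=nowords
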
I think the intention is to indicate the bug has a "code level" change that
cannot be verified without looking at the source code. Maybe a better keyword
would be "whitebox_change"? I think this would be a useful keyword, since I
frequently have to add a note after closing this type of bug indicating that it
can only be verified by looking at the source code.
How about this:

keyword: whitebox
Description: Indicates that the bug can only be verified by inspecting the code.
Summary: Add keyword : dev-notes → Add keyword : whitebox
We need this keyword mostly for the following purposes:

Minimization (or creating reduced testcases):  The minimization or testcase 
reduction process doesn't apply to white-box bugs, as the bug is already reduced 
to the code level.  Testcases could still be created by someone who has the 
necessary understanding of the code.  I think QA should not spend time trying to 
create such testcases.  Optionally, developers may want to attach a testcase 
that exercises the behaviour implemented in the code. 

Verification:  As described above by Kevin, the bug can only be verified by 
inspecting the code -- unless a testcase is attached.

My recommendation for the description would be something like:

  Indicates that the bug does not require a reduced testcase to reproduce, and 
the bug may be verified by inspecting the code.
Is everyone happy with this?  Asa, is this OK with mozilla.org?
No additional comments in a week, so we'll assume everyone is happy with it.  
Asa, please go ahead and set up the following keyword/description asap:

Keyword:  whitebox
Description:  Indicates that the bug does not require a reduced testcase to 
reproduce, and it may be verified by inspecting the code.
Asa's gone for about a week. I can now make these changes if he signs off, but...
Status: NEW → ASSIGNED
Would Lisa be able to make this change in Asa's absence?
If Timeless can make the change, I prefer that he do it (as QA contact on this
bug).  

Has the mozilla community agreed to this change?  I don't see the approval anywhere.

Timeless - can you get a sense from the Mozilla community in Asa's absence?  We
do need this keyword to help Kevin's team better prioritize the bugs that need
reduced testcases.
Sorry, while Asa gave me the power, he didn't authorize me to do that. I've cc'd
some people for feedback.
The name and description of the keyword seem more specific than the use for
which it is intended.  I think we used to have a keyword that seems more
appropriate for this use than what is proposed here:  "donttest".
donttest was removed by Asa in bug 91335 (mpt also requested its removal in
bug 87526).
I personally have no objection.  Of course, I'm not one to maintain 
keywords...  Based on the explicit definition given, I don't see how it relates 
to documentation, and I can't offer much of an opinion either way.

It would relate if we expanded "code" to include documentation markup...

I'm thinking about this, and I don't have an answer either way.  See bug 157668 
for a slightly related issue -- I recommended we have some sort of keyword 
specifically for tracking documentation issues.  Having this sort of keyword in 
addition to a docs notes keyword can't hurt documentation one bit.

unofficial r=ajv (I can't officially r= anything yet ;) )  
I'm not happy with the keyword whitebox. I understand the value of having a flag
but I think whitebox doesn't convey the meaning this flag should. I think I
agree with dbaron that donttest would be a better choice. 
Maybe I'm just not up on the lingo, but 'whitebox' doesn't mean anything to me.
I'm not even exactly sure that I understand what the keyword's purpose is.

Am I right in thinking that it's for changes that only affect the code, and not
the functionality, of the product, and can therefore only be verified by looking
at said code and not with test cases?

If so, I think something like 'qacodeonly' would be a better keyword than
'whitebox' or 'donttest' (which I'm worried that some people could infer as
'dontcare').
"donttest" is a bit misleading since all of these bug fixes need testing of some
sort even if it's by a developer or QA who has access to source code to view
that the code level change has been made.

How about:  
"codeverify"
"codelevel"
Re comment 15:  I have no idea why Netscape QA puts so much emphasis on
pro-forma verification (and tries to enforce this emphasis on mozilla.org). 
Good testing involves much more than verification that certain narrowly-defined
bugs (often defined by a single testcase) are fixed, and yet it seems that the
only thing mozilla.org requires of its QA contacts is this single, often
low-productivity, activity.

Testing should concentrate on those activities that are most likely to improve
the quality of the product.  This sometimes includes verification, but usually
verification in a much broader sense than it seems to be practiced by most
current QA contacts.  This sometimes involves running existing testcases and
searching for regressions.  It sometimes involves writing new testcases.

Verification does have a purpose.  It is useful to ensure that the bug, as
described, has been fixed, rather than fixed in a single testcase that shows one
instance of that description.  This requires writing additional testcases that
fit the description of the bug.  Good verification also involves considering
*broader* definitions of the bug.  Bugs within the broader definition might be
present even after the fix, in which case they should be filed.  Bugs within the
broader definition might also have been fixed by the fix checked in for the more
narrowly defined bug.  Such other bugs might even exist in other bug reports in
Bugzilla, in which case those other bugs should be resolved, since a good QA
contact would be somewhat knowledgeable about the bugs in his or her component
and thus aware of their existence.  (This is not the only reason a good QA
contact should be aware of the bugs in the component.  It is also useful to know
which areas are generally buggy, and need more new testcases, or which areas are
not being tested, and thus need more new testcases.)  Good verification may also
involve regression testing of areas related to the fix.  (After all, that should
be the point of requiring trunk verification before something is checked in to a
stable branch, but I don't think I've ever seen news of regressions caused by
the fix for a bug, but outside of the description of the bug, noted on a bug *as
part of* the branch verification process.)

However, I get the sense that Netscape QA uses (and wants mozilla.org to use)
number of bugs verified as a measure of productivity of testers.  This is a very
bad measure, since it encourages wasting time on verification that could have
resulted in significantly larger improvements to the quality of the codebase if
spent on some other activity (such as writing new testcases, an activity I think
many QA contacts spend very little time on).  (Note that I tend to think of the
tasks of QA as those done in document-related (as opposed to network-related)
standards compliance QA, which involves lots of testcases.  However, I think
there are probably strong analogies in many other areas.)

If Netscape QA really thinks it's important to verify that certain patches
attached to bugs have been checked in (i.e., that such activity is likely to
catch errors that will lead to an improvement in quality), we should write a
*script* to verify this (and note the checkin in the attachment status) in all
cases whenever a bug is resolved or a branch resolution keyword is added, and QA
contacts can stop wasting their time on activity that can be done by a script.
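
A minimal sketch of what such a script might look like (the input files, the
"bug NNNNN" checkin-comment convention, and the overall approach are assumptions
for illustration, not an existing mozilla.org tool):

  # Rough sketch only: cross-reference resolved bug numbers against checkin
  # log messages.  The input files and "bug NNNNN" convention are assumptions.
  import re
  import sys

  def bugs_mentioned_in_log(log_path):
      """Collect bug numbers mentioned in checkin comments like 'bug 172191'."""
      mentioned = set()
      with open(log_path) as log:
          for line in log:
              for match in re.finditer(r"\bbug\s*#?(\d+)", line, re.IGNORECASE):
                  mentioned.add(int(match.group(1)))
      return mentioned

  def main(resolved_path, log_path):
      # resolved_path: one resolved-FIXED bug number per line (hypothetical export)
      with open(resolved_path) as f:
          resolved = {int(line.strip()) for line in f if line.strip()}
      checked_in = bugs_mentioned_in_log(log_path)
      for bug in sorted(resolved - checked_in):
          print("bug %d is RESOLVED FIXED but no checkin mentions it" % bug)

  if __name__ == "__main__":
      main(sys.argv[1], sys.argv[2])
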


What does this rant have to do with this bug?  I claim that merely checking that
a specific code change has been made is very unlikely to improve the quality of
the product.  (Checking that it has been made on a certain branch might
sometimes be an exception, but rarely.  Good verification (which includes things
such as regression testing in related areas), as I described above, would likely
be useful in more cases, but I don't see that being done by more than a handful
of QA contacts.)

If you're not going to do *useful* testing, there's no point doing testing at
all, so I think "donttest" is an appropriate keyword.  I don't think we need
keywords whose names and descriptions are a description of bad testing
procedure.  If a bug hasn't been verified in a useful way, there is no need for
it to change to the VERIFIED state.
Please note that Netscape does not use the number of bugs verified as a
productivity metric in any performance reviews.  Please do not infer this if you
don't have direct knowledge of how we measure an employee's productivity in QA.

However, I get paid, and get asked by the upper management who pays me, to make
sure that we look at all the fixes that come into our product (Netscape), which
is based on the Mozilla codebase.  Currently, the only way I have to track that
is to know whether there are bugs which are not in the "verified" state (or
keyword, if a branch is involved).

I really don't care what this keyword is called, and I really do not appreciate a
lecture on QA or bug verifications.  I was only offering a friendly opinion to
perhaps help move this enhancement along.  This enhancement was filed as a
brainstorm with some of the layout development owners to help them triage and
manage the volume of incoming bugs.

And, if you do have some time, I'd be happy to share with you all of the
regression testing and scheduled testing that we do on the product.  Sometimes,
views into Netscape QA are based on what the public can see (which is normally
bug activity), but we do a lot more than that.  If you are working at Netscape,
I'd be happy to share with you our last month's status reports so you can see
that we don't spend time on useless verifications.  We have too much on our
plate for that.
I think I can judge what things QA does that lead to improvements in the quality
of the products created from the mozilla.org codebase, since improvements in the
quality of the products are (almost entirely, modulo changes to documentation
and to release schedules) caused by changes to the code, and all the changes to
the code are public and have public bugs describing them.  Based on following
bugs (and who files them, and who helps in them), I can get a good sense of what
improvements to the code happened as a result of or with the assistance of QA
activity.

And please don't kid me about the use of verification as a metric.  Just because
you don't use it in performance reviews doesn't mean you don't care about it too
much.  I've seen it many times, such as:
bug 176140 comment 2
http://www.mozilla.org/status/2000-11-08.html
David is right on all counts, and I say this from first-hand experience as a
member of Netscape's standards compliance QA team for over a year.
bug 65558 (enhancement, asa@mozilla.org, RESOLVED WONTFIX, mozilla.org): Bugzilla codelevel keyword
Status: ASSIGNED → NEW
It would seem that either a "whitebox" or a "donttest" keyword would be used to
exclude bugs, rather than identify them.  The inverse, I suppose, would be a
"testme" or "testable" keyword.  Either way, I don't see the actual benefit of
creating this distinction, other than using a "testme" keyword for triage or QA.
I guess I'm the father of verification metrics as a way to understand and guide
the project in what testing activity needs to be done at any point in the
development or milestone release cycle, so all the arrows directed at using
verification metrics should be directed at me.  (Note I said "project";
verification was never intended as a metric to measure individual contributor
performance.)

Here is the thinking.

For any given milestone, somewhere around, say, 700 bug fixes or features are
checked in.  We have all observed that some percentage of bug fixes and feature
work results in unintended side effects and problems.  I submit that it is useful
to understand what this percentage is, as the number can help to predict the
chances of regression in the endgame, when you're trying to decide which kinds
of changes to take, how many changes you can take, and how many cycles of
changes you can take with any expectation of "bouncing off and staying at zarro
boogs" for a milestone.

The only way to get at the "pct. of regression" is to attempt verification of all
the changes for any milestone and reopen bugs where regressions are found.  The
only way to know if you have attempted verification of 700 or so changes is to
track how many bugs have been verified and reopened.

Updates to Bugzilla have broken my old reporting tools, but classically we would
see somewhere around 20%-30% of all checkins result in some level of regression
or unintended side effects.  The reports also help us to understand the progress
over time in attempting verification on the 700 or so bugs each milestone.  If
we neared the end of the milestone and we had "200 bugs verified", I submit that
we really knew a lot less about the quality of the builds than if we had "600
bugs verified".  Applying the unverified bug number and the regression rate
metric, we can roughly predict how many regressions remain undiscovered in the
codebase from the changes made during the milestone.  From this we can figure
out if we can live with that level of risk, or if we need to extend the testing
and ramp-down period to accommodate more testing and regression fixing to meet
the needs of a given milestone.
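
To make that arithmetic concrete (the 25% rate below is just an assumed
mid-point of the 20%-30% historical range):

  700 fixes, 600 verified  ->  100 unverified x 0.25 ≈  25 regressions likely still hidden
  700 fixes, 200 verified  ->  500 unverified x 0.25 ≈ 125 regressions likely still hidden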

So that is the theory behind using verification metrics to help the project.  We
have used the theory several times to help roughly predict dates for major
milestones like Mozilla 1.0 and some of the other recent Netscape release
milestones.  Can application of the theory create distortion if the depth and
breadth of verification is uneven?  Sure.  But I think that is a different
problem, and that gets us back to what I think the intention of this bug is.

I think if we revisit the lifecycle of possible verification outcomes, we can
identify some states that will help us to spot communication and verification
problems that are limiting our ability to get additional depth and breadth of
coverage where we could really use it.

So here is a crack at some of those states.

A bug fix or feature gets checked in.

The QA contact approaches the bug,
- tries to get an understanding of the changes, 
- tries to figure out a strategy and set of tools that would do the 
   best job of confirming the existence of the fix, 
- tries to figure out a strategy for turning up possible unintended side effects.

At this point the QA contact might be left in one of many 
states.  
-They might, or might not, understand the changes based on information in the bug
-They might, or might not, have a good plan for confirming the fix
-They might, or might not, have a good plan for extended testing to check for
side effects...
-They might, or might not, have an opinion about how much time or effort should
be expended on verifying the bug fix.

In addition to these states, they might understand the bug, have good plans and
ideas on how to proceed with the testing, but they also might have a need to get
additional help from others to execute these tests, or they might be ok with
execution of the test plan on their own....

Having keywords or additional verification states can help us to understand how
the verifications are proceeding, fill in gaps where communication is missing,
and generally understand the verification process so we can get better at it.
Here are some samples of verification states that might be interesting from a
project perspective.  Some sample keywords are also suggested for each.

Doing queries on the bug fixes that need additional information to develop a test
plan would be useful to make sure communication gaps are filled. (need-testing-info)

Querying the bug fixes that only received constrained testing around the fix, due
to time or resource constraints, could tell us information that would help us to
redirect folks to get additional depth or breadth coverage on key fixes.
(need-limited-verf-help, limited-verf)

Doing a query that tells us how many bugs we were able to complete high levels
of depth and breadth coverage on can help us to understand how many bug fixes
are getting the attention that we all want. (need-depth-test-help, depth-tested)

Doing a query on the bugs where the QA contact made a decision reflecting their
opinion that further testing of the bug has low value helps to set up
project-wide review of these bugs, to make sure the right testing decisions are
being made. (donttest, whitebox, no-testing-needed,
just-verf-the-patch-is-checked-in-mon)


So here is what a future report might look like that would help us to manage the
testing process for a milestone, make sure testing efforts are directed at the
most important testing activities, and give us a way to better understand and
improve how well we can test the changes going into any milestone...

700 - bugs fixed
   200 - unverified (not looked at by QA contacts)
   500 - some level of verification or examination
        60 - need-testing-info
       190 - limited-verf-completed
        20 - need-limited-verf-help
       110 - depth-tested-completed
        20 - need-depth-testing-help
       100 - donttest, dont-spend-more-time-on-this-dude, 
             check-to-verf-patch-on-branch
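
A report along those lines could be generated by a small script; here is a rough
sketch in Python, assuming a hypothetical CSV export of fixed bugs with a
"keywords" column (the column names and keyword spellings are illustrative, not
an existing report tool):

  # Rough sketch: tally verification-state keywords from a hypothetical CSV
  # export with columns bug_id, status, keywords.  Not an existing report tool.
  import csv
  import sys
  from collections import Counter

  STATE_KEYWORDS = [
      "need-testing-info",
      "limited-verf", "need-limited-verf-help",
      "depth-tested", "need-depth-test-help",
      "donttest", "whitebox",
  ]

  def tally(csv_path):
      counts = Counter()
      total = 0
      with open(csv_path) as f:
          for row in csv.DictReader(f):
              total += 1
              keywords = {k.strip() for k in (row.get("keywords") or "").split(",")}
              for state in STATE_KEYWORDS:
                  if state in keywords:
                      counts[state] += 1
      print("%d - bugs fixed" % total)
      for state in STATE_KEYWORDS:
          print("  %4d - %s" % (counts[state], state))

  if __name__ == "__main__":
      tally(sys.argv[1])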

I'd love to have these kinds of metrics.  With them we could mount efforts on key
bugs that need more testing help.  We could quickly mark bugs that don't deserve
the additional time and mark bugs that are only getting a limited amount of
testing, while also putting in checks for communicating and auditing these
decisions.
The thing is, the overwhelming majority of the time, bugs are not reopened by
their QA contact. They are reopened by other engineers or by triagers when
finding newly filed regressions.
->me
Assignee: asa → ian
Anyway, this is getting away from the issue at hand, which is whether we need a
keyword to label bugs for which it is not really appropriate to create a
testcase or steps to reproduce.

So the question is, do we really need such a keyword?

Personally, as a layout QA contact, I don't think I need this keyword, but I do
think that it is annoying to be QA contact on bugs like "rename nsIThis to
nsIThat" -- I have no intention of verifying them by looking at lxr, and I'll be
checking they haven't caused regressions simply by running my usual tests, not
specifically when looking for regressions from such a bug. I am thinking of
simply reassigning QA to nobody@mozilla.org in cases such as that.

What I usually say in keyword request bugs is that the people who want the
keyword should try using a status whiteboard indicator first, and if that turns
out to be really useful, then we can add a keyword.
No further comments, so those who are still interested in this should try using a 
status whiteboard indicator first, and if that turns out to be really useful, then
we can add a keyword.

WONTFIX for now.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → WONTFIX
vrfy wont
Status: RESOLVED → VERIFIED
Component: Bugzilla: Keywords & Components → Administration
Product: mozilla.org → bugzilla.mozilla.org