Open Bug 1479358 Opened 6 years ago Updated 1 month ago

TEST Bug for Automated Intermittent-Failure Classification Tool

Categories

(Testing :: General, enhancement, P5)

Tracking

(Not tracked)

UNCONFIRMED

People

(Reporter: moritz.eck, Unassigned)

Details

Attachments

(1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36

Steps to reproduce:

This is not a bug or a failure.
We are using this bug to test an automated classification tool that will help categorize intermittent-failures before they are fixed.
Specifically, this bug is used to test the tool's automated comment posting. Please leave it open for the next 2-3 weeks.


Actual results:

This is not a bug or a failure.
We are using this bug to test an automated classification tool that will help categorize intermittent-failures before they are fixed.
Specifically, this bug is used to test the tool's automated comment posting. Please leave it open for the next 2-3 weeks.


Expected results:

This is not a bug or a failure.
We are using this bug to test an automated classification tool that will help categorize intermittent-failures before they are fixed.
Specifically, this bug is used to test the tool's automated comment posting. Please leave it open for the next 2-3 weeks.
If we do not comply with Mozilla or Bugzilla rules by opening this bug, please let us know! 
Is there a test environment for Bugzilla, or what other steps could we take to test our automated classification tool?
We will close this bug again in 2-3 weeks. Thanks for your help!
I hope this is ok for you. We work together with Marco Castellucio. You can also ask him if you'd like further information on this project. Thanks!
Flags: needinfo?(gbrown)
This is a comment from the automated intermittent-failure classification tool!
Introduction Text - Explaining the Project
This is going to be the main text for the classification...
Concluding Statement - Asking for the feedback
__Dear Reader__,
        We are a team from the University of Zurich, Switzerland and Mozilla and are building an automated intermittent-failure classification tool. The tool shall help fix intermittent-failures by predicting the type of failure (category) of not-yet-fixed intermittent-failures on Bugzilla. To evaluate and improve our classifier, we require your feedback on the classification result.

        __Classification Result:__
        This intermittent-failure is most likely affected by a:
__Assertion Failure (test vs. reference value)__: This includes test failures in which a (pre-)defined value was not met or the value the test produced lies outside of the range of acceptable values. This often occurs when running reference tests (e.g. comparing images, audio, etc.). Also, this includes wrong assertion statements only leading to intermittent-failures and not permanent failures.
__Feedback:__ 
        If you have been involved in fixing this intermittent-failure, can you please comment whether: 
        - you believe the classification result is correct?
        - the classification was useful for your analysis, the bug fixing process or something else in the process.

        If you found the classification useful, please let us know how. If you found the classification result not useful, please let us know how we could improve the classifier or the result for it to be useful for the bug fixing process. 
        
        Thank you very much for your help!
Dear Reader,
    We are a team from the University of Zurich, Switzerland and Mozilla and are building an automated intermittent-failure classification tool. The tool shall help fix intermittent-failures by predicting the type of failure (category) of not-yet-fixed intermittent-failures on Bugzilla. To evaluate and improve our classifier, we require your feedback on the classification result.
    
    Classification Result:
    This intermittent-failure is most likely affected by a:
Assertion Failure (test vs. reference value): This includes test failures in which a (pre-)defined value was not met or the value the test produced lies outside of the range of acceptable values. This often occurs when running reference tests (e.g. comparing images, audio, etc.). Also, this includes wrong assertion statements only leading to intermittent-failures and not permanent failures.
Feedback:
    If you have been involved in fixing this intermittent-failure, can you please comment whether: 
        - you believe the classification result is correct?
        - the classification was useful for your analysis, the bug fixing process or something else in the process.
    
    If you found the classification useful, please let us know how. 
    If you found the classification result not useful, please let us know how we could improve the classifier or the result for it to be useful for the bug fixing process. Thank you very much for your help!
Dear Reader,
We are a team from the University of Zurich, Switzerland and Mozilla and are building an automated intermittent-failure classification tool. The tool shall help fix intermittent-failures by predicting the type of failure (category) of not-yet-fixed intermittent-failures on Bugzilla. To evaluate and improve our classifier, we require your feedback on the classification result.
    
Classification Result:
This intermittent-failure is most likely affected by a:
	Assertion Failure (test vs. reference value): This includes test failures in which a (pre-)defined value was not met or the value the test produced lies outside of the range of acceptable values. This often occurs when running reference tests (e.g. comparing images, audio, etc.). Also, this includes wrong assertion statements only leading to intermittent-failures and not permanent failures.
Feedback:
If you have been involved in fixing this intermittent-failure, can you please comment whether: 
    - you believe the classification result is correct?
    - the classification was useful for your analysis, the bug fixing process or something else in the process.

If you found the classification useful, please let us know how. 
If you found the classification result not useful, please let us know how we could improve the classifier or the result for it to be useful for the bug fixing process. Thank you very much for your help!
Dear Reader,
We are a team from the University of Zurich, Switzerland and Mozilla and are building an automated intermittent-failure classification tool. The tool shall help fix intermittent-failures by predicting the type of failure (category) of not-yet-fixed intermittent-failures on Bugzilla. To evaluate and improve our classifier, we require your feedback on the classification result.
    
Classification Result:
This intermittent-failure is most likely affected by a:
	"Assertion Failure (test vs. reference value): This includes test failures in which a (pre-)defined value was not met or the value the test produced lies outside of the range of acceptable values. This often occurs when running reference tests (e.g. comparing images, audio, etc.). Also, this includes wrong assertion statements only leading to intermittent-failures and not permanent failures."

Feedback:
If you have been involved in fixing this intermittent-failure, can you please comment whether: 
    - you believe the classification result is correct?
    - the classification was useful for your analysis, the bug fixing process or something else in the process.

If you found the classification useful, please let us know how. 
If you found the classification result not useful, please let us know how we could improve the classifier or the result for it to be useful for the bug fixing process. Thank you very much for your help!
Dear Reader,
We are a team from the University of Zurich, Switzerland and Mozilla and are building an automated intermittent-failure classification tool. The tool shall help fix intermittent-failures by predicting the type of failure and thus providing a starting point for the bug fixing process. To evaluate and improve our classifier, we require your feedback on the classification result.
    
Classification Result:
This intermittent-failure is most likely affected by a:
	Assertion Failure (test vs. reference value): This includes test failures in which a (pre-)defined value was not met or the value the test produced lies outside of the range of acceptable values. This often occurs when running reference tests (e.g. comparing images, audio, etc.). Also, this includes wrong assertion statements only leading to intermittent-failures and not permanent failures.

Feedback:
If you have been involved in fixing this intermittent-failure, please comment whether: 
    - you believe the classification result is correct?
    - the classification was useful for your analysis, the bug fixing process itself or something else in the process.
        - If you found the classification useful, please let us know how. 
        - If not, please let us know how we could improve the classifier or the result for it to be useful for the bug fixing process. 

Thank you very much for your help!
Dear Reader,
We are a team from the University of Zurich, Switzerland and Mozilla and are building an automated intermittent-failure classification tool. The tool shall help fix intermittent-failures by predicting the type of failure and thus providing a starting point for the bug fixing process. To evaluate and improve our classifier, we require your feedback on the classification result.
    
Classification Result - this intermittent-failure is most likely affected by a:
	Assertion Failure (test vs. reference value): This includes test failures in which a (pre-)defined value was not met or the value the test produced lies outside of the range of acceptable values. This often occurs when running reference tests (e.g. comparing images, audio, etc.). Also, this includes wrong assertion statements only leading to intermittent- and not permanent-failures.

Feedback:
If you have been involved in fixing this intermittent-failure, please comment whether: 
    - you believe the classification result is correct?
        - if possible, please explain why or why not.
        
    - the classification was useful for your analysis, the bug fixing process itself or something else in the process.
        - If you found the classification useful, please let us know what exactly was useful.
        - If not, please let us know what would improve its usefulness for you.

Thank you very much for your help!
Dear Reader,
We are a team from the University of Zurich, Switzerland and Mozilla and are building an automated intermittent-failure classification tool. The tool shall help fix intermittent-failures by predicting the type of failure and thus providing a starting point for the bug fixing process. To evaluate and improve our classifier, we require your feedback on the classification result.
    
Classification Result. This intermittent-failure is most likely affected by a:
Assertion Failure (test vs. reference value): This includes test failures in which a (pre-)defined value was not met or the value the test produced lies outside of the range of acceptable values. This often occurs when running reference tests (e.g. comparing images, audio, etc.). Also, this includes wrong assertion statements only leading to intermittent- and not permanent-failures.

Feedback: Please comment whether: 
- you believe the classification result is correct?
    - if possible, please explain why or why not.

- the classification was useful for your analysis, the bug fixing process itself or something else in the process.
    - If you found the classification useful, please let us know what exactly was useful.
    - If not, please let us know what would improve its usefulness for you.

Thank you very much for your help!
Dear Reader,

We are a joint research team from the University of Zurich, Switzerland* and Mozilla** and are building a tool to automatically recognize the type of an intermittent-failure, such as the one here.

By automatically recognizing the type of failure, the tool should provide a starting point for the bug fixing process. To evaluate and improve the tool, we kindly ask for your feedback.
    
Tool Result - this intermittent-failure is most likely affected by a:
Assertion Failure (test vs. reference value): This includes test failures in which a (pre-)defined value was not met or the value the test produced lies outside of the range of acceptable values. This often occurs when running reference tests (e.g. comparing images, audio, etc.). Also, this includes wrong assertion statements only leading to intermittent- and not permanent-failures.

Feedback: 
1. Did the tool correctly recognize the type of intermittent-failure? (please also explain why or why not)
2. Did the information from the tool help your analysis, the bug fixing process, or anything in the process? (please also let us know how the tool was useful and/or what would improve the tool's usefulness for you)

Thank you very much for your help! 
*Moritz, Fabio and Alberto (http://www.ifi.uzh.ch/en/zest/team.html)
**Marco Castellucio (https://marco-c.github.io/)
Dear Reader,

We are a joint research team from the University of Zurich, Switzerland* and Mozilla** and are building a tool to automatically recognize the type of an intermittent-failure, such as the one here.

By automatically recognizing the type of failure, the tool should provide a starting point for the bug fixing process. To evaluate and improve the tool, we kindly ask for your feedback.
    
Tool Result - this intermittent-failure is most likely affected by a:
Concurrency Issue: This includes thread management issues (different threads or their outcomes depending on an implicit ordering), race conditions and deadlocks and issues related to an asynchronous wait failure (e.g. a process trying to access an external resource or continuing before the external resource is available).

Feedback: 
1. Did the tool correctly recognize the type of intermittent-failure? (please also explain why or why not)
2. Did the information from the tool help your analysis, the bug fixing process, or anything in the process? (please also let us know how the tool was useful and/or what would improve the tool's usefulness for you)

Thank you very much for your help! 
*Moritz, Fabio and Alberto (http://www.ifi.uzh.ch/en/zest/team.html)
**Marco Castellucio (https://marco-c.github.io/)
Dear Reader,

We are a joint research team from the University of Zurich, Switzerland* and Mozilla** and are building a tool to automatically recognize the type of an intermittent-failure, such as the one here.

By automatically recognizing the type of failure, the tool should provide a starting point for the bug fixing process. To evaluate and improve the tool, we kindly ask for your feedback.
    
Tool Result - this intermittent-failure is most likely affected by a:
Test Order Dependency: This includes situations where one or more tests have not cleaned up after themselves or require a certain state to be created/set before starting to run. It also covers situations where a test is started while another one is still running and both access or require the same state. Please check whether there are clean-up or start-up dependencies between the test cases.

Feedback: 
1. Did the tool correctly recognize the type of intermittent-failure? (please also explain why or why not)
2. Did the information from the tool help your analysis, the bug fixing process, or anything in the process? (please also let us know how the tool was useful and/or what would improve the tool's usefulness for you)

Thank you very much for your help! 
*Moritz, Fabio and Alberto (http://www.ifi.uzh.ch/en/zest/team.html)
**Marco Castellucio (https://marco-c.github.io/)
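As a rough illustration of the "Test Order Dependency" category described above, here is a minimal sketch. It is not taken from the classification tool or from any Mozilla test; `CACHE`, the two checks, and `run_in_order` are hypothetical names made up for this example. The second check only passes when no earlier check has left shared state behind, so the outcome depends on execution order unless the state is reset between runs.

```python
# Hypothetical shared state that two checks both touch.
CACHE = {}


def writes_cache():
    # Leaves CACHE dirty because it never cleans up after itself.
    CACHE["user"] = "alice"
    assert CACHE["user"] == "alice"


def expects_empty_cache():
    # Implicitly requires that nothing has populated CACHE before it runs.
    assert CACHE == {}


def run_in_order(checks, reset=None):
    """Run checks in the given order and return a pass/fail flag for each.

    If `reset` is given, it is called before every check to restore a clean state.
    """
    results = []
    for check in checks:
        if reset is not None:
            reset()
        try:
            check()
            results.append(True)
        except AssertionError:
            results.append(False)
    return results
```

Running the two checks in one order passes, in the other order fails, and passing `reset=CACHE.clear` removes the dependency in both orders.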
Dear Reader,

We are a joint research team from the University of Zurich, Switzerland* and Mozilla** and are building a tool to automatically recognize the type of an intermittent-failure, such as the one here.

By automatically recognizing the type of failure, the tool should provide a starting point for the bug fixing process. To evaluate and improve the tool, we kindly ask for your feedback.
    
Tool Result - this intermittent-failure is most likely affected by a:
Resource Leak: This includes tests which have memory leaks, garbage collection issues and other memory allocation / pointer (de-)referencing issues. E.g. a test fails because it holds a pointer to a resource which has been garbage collected or the system crashes because it ran out of memory.

Feedback: 
1. Did the tool correctly recognize the type of intermittent-failure? (please also explain why or why not)
2. Did the information from the tool help your analysis, the bug fixing process, or anything in the process? (please also let us know how the tool was useful and/or what would improve the tool's usefulness for you)

Thank you very much for your help! 
*Moritz, Fabio and Alberto (http://www.ifi.uzh.ch/en/zest/team.html)
**Marco Castellucio (https://marco-c.github.io/)
Dear Reader,

We are a joint research team from the University of Zurich, Switzerland* and Mozilla** and are building a tool to automatically recognize the type of an intermittent-failure, such as the one here.

By automatically recognizing the type of failure, the tool should provide a starting point for the bug fixing process. To evaluate and improve the tool, we kindly ask for your feedback.
    
Tool Result - this intermittent-failure is most likely affected by a:
Assertion Failure (test vs. reference value): This includes test failures in which a (pre-)defined value was not met or the value the test produced lies outside of the range of acceptable values. This often occurs when running reference tests (e.g. comparing images, audio, etc.). Also, this includes wrong assertion statements only leading to intermittent- and not permanent-failures.

Feedback: 
1. Did the tool correctly recognize the type of intermittent-failure? (please also explain why or why not)
2. Did the information from the tool help your analysis, the bug fixing process, or anything in the process? (please also let us know how the tool was useful and/or what would improve the tool's usefulness for you)

Thank you very much for your help! 
*Moritz, Fabio and Alberto (http://www.ifi.uzh.ch/en/zest/team.html)
**Marco Castellucio (https://marco-c.github.io/)
Dear Reader,

We are a joint research team from the University of Zurich, Switzerland* and Mozilla** and are building a tool to automatically recognize the type of an intermittent-failure, such as the one here.

By automatically recognizing the type of failure, the tool should provide a starting point for the bug fixing process. To evaluate and improve the tool, we kindly ask for your feedback.
    
Tool Result - this intermittent-failure is most likely affected by a:
Test Timeout: This includes test failures in which a single test case or the whole test suite timed out. This might be due to an unresponsive or slow target system, a test suite which has grown too large and therefore times out intermittently, or a timeout threshold which was set too low.

Feedback: 
1. Did the tool correctly recognize the type of intermittent-failure? (please also explain why or why not)
2. Did the information from the tool help your analysis, the bug fixing process, or anything in the process? (please also let us know how the tool was useful and/or what would improve the tool's usefulness for you)

Thank you very much for your help! 
*Moritz, Fabio and Alberto (http://www.ifi.uzh.ch/en/zest/team.html)
**Marco Castellucio (https://marco-c.github.io/)
Dear Reader,

We are a joint research team from the University of Zurich, Switzerland* and Mozilla** and are building a tool to automatically recognize the type of an intermittent-failure, such as the one here.

By automatically recognizing the type of failure, the tool should provide a starting point for the bug fixing process. To evaluate and improve the tool, we kindly ask for your feedback.
    
Tool Result - this intermittent-failure is most likely affected by a:
Floating-Point- and Time (Zone) Precision: This includes test failures where the floating-point precision was different between threads, resources or different to a (pre-)defined value (e.g. more or less accurate than expected). Also, this category includes failures which occurred due to time precision (e.g. a test taking the time zone into account whereas the reference value doesn’t).

Feedback: 
1. Did the tool correctly recognize the type of intermittent-failure? (please also explain why or why not)
2. Did the information from the tool help your analysis, the bug fixing process, or anything in the process? (please also let us know how the tool was useful and/or what would improve the tool's usefulness for you)

Thank you very much for your help! 
*Moritz, Fabio and Alberto (http://www.ifi.uzh.ch/en/zest/team.html)
**Marco Castellucio (https://marco-c.github.io/)
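As a rough illustration of the floating-point part of this category, here is a minimal sketch. It is not taken from the classification tool; `naive_equal` and `tolerant_equal` are hypothetical helper names, and the tolerance values are assumptions chosen for the example. It contrasts an exact comparison against a reference value with a tolerance-based comparison using the standard library's `math.isclose`.

```python
import math


def naive_equal(actual, expected):
    # Exact comparison: sensitive to rounding differences in the last bits.
    return actual == expected


def tolerant_equal(actual, expected, rel_tol=1e-9, abs_tol=1e-12):
    # Tolerance-based comparison: accepts values within a small error band.
    return math.isclose(actual, expected, rel_tol=rel_tol, abs_tol=abs_tol)


# 0.1 + 0.2 is not exactly representable as 0.3 in binary floating point.
total = 0.1 + 0.2
```

With these helpers, `naive_equal(total, 0.3)` is false while `tolerant_equal(total, 0.3)` is true; a test asserting exact equality on computed floats can therefore fail whenever the rounding differs.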
Dear Reader,

We are a joint research team from the University of Zurich, Switzerland* and Mozilla** and are building a tool to automatically recognize the type of an intermittent-failure, such as the one here.

By automatically recognizing the type of failure, the tool should provide a starting point for the bug fixing process. To evaluate and improve the tool, we kindly ask for your feedback.
    
Tool Result - this intermittent-failure is most likely affected by a:
Platform / Operating System Failure: This includes test failures which occur intermittently only on specific platforms (e.g. only in --verify mode, debug builds only, etc.) or operating systems (including operating system emulator issues). The failure can be caused by missing dependencies (e.g. a dependency that is not installed or has a different version) or by specific requirements not being met (e.g. insufficient memory, RAM, etc. or not being able to run on a specific operating system). Also, the failure can be caused by Taskcluster or the task instance executing the tests.

Feedback: 
1. Did the tool correctly recognize the type of intermittent-failure? (please also explain why or why not)
2. Did the information from the tool help your analysis, the bug fixing process, or anything in the process? (please also let us know how the tool was useful and/or what would improve the tool's usefulness for you)

Thank you very much for your help! 
*Moritz, Fabio and Alberto (http://www.ifi.uzh.ch/en/zest/team.html)
**Marco Castellucio (https://marco-c.github.io/)
(In reply to meck93 from comment #1)
> If we do not comply with Mozilla or Bugzilla rules by opening this bug,
> please let us know! 
> Is there a test environment for Bugzilla or what other steps could we take
> to test our automated classification tool.
> We will close this bug again in 2-3 weeks. Thanks for your help!

There are also bugzilla.allizom.org and bugzilla-dev.allizom.org; ask in #bmo on IRC if you want to use those.

I don't mind having a bug like this in Testing::General for a while, if that's convenient. (It would be nice if you set the Importance field to something other than "--": that will take it off of my triage list.) Good luck with your project!
Flags: needinfo?(gbrown)
**test**: this is
__test__:: tis
Severity: normal → enhancement
Priority: -- → P5
(In reply to Geoff Brown [:gbrown] from comment #20)
> I don't mind having a bug like this in Testing::General for a while, if
> that's convenient. (It would be nice if you set the Importance field to
> something other than "--": that will take it off of my triage list.) Good
> luck with your project!

[:gbrown]: Cool, thanks! Is P5 and trivial / enhancement ok?
Yes, that's fine.
We are trying to build a tool to automatically classify intermittent failures, which would provide a starting point for fixing the bug, reducing the manual work for the developers.
We are collecting some feedback on the results, to see if they’re good enough and where we need to improve.
    
For this bug, the tool says that the intermittent failure is most likely a:
Concurrency Issue: This includes tests in which failures occur due to thread management issues (different threads or their outcomes depending on an implicit ordering), race conditions and/or deadlocks, and issues related to asynchronous waits (e.g. a process trying to access an external resource or continuing before the external resource is available).


Once you’re done investigating and/or fixing the bug, could you tell me:
- Did the tool correctly recognize the type of intermittent failure?
- Did the information from the tool help your analysis, the bug fixing process, or anything in the process? (please also let us know how the tool was useful and/or what would improve the tool's usefulness for you)
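
As a rough illustration of the asynchronous-wait case in the "Concurrency Issue" category, here is a minimal sketch. It is not taken from the classification tool or from any Mozilla test; `load_resource`, `read_without_waiting`, `read_after_waiting`, and `run_fixed` are hypothetical names, and the value 42 is a stand-in for an external resource. The racy reader may run before the worker thread has written the value, while the fixed reader blocks on a `threading.Event` until the worker signals that the resource is available.

```python
import threading


def load_resource(store, ready):
    # Hypothetical stand-in for a slow external resource becoming available.
    store["value"] = 42
    ready.set()


def read_without_waiting(store):
    # Racy: returns None if called before the loader has written the value.
    return store.get("value")


def read_after_waiting(store, ready, timeout=5.0):
    # Deterministic: block until the loader signals completion (or time out).
    if not ready.wait(timeout):
        raise TimeoutError("resource was not ready in time")
    return store["value"]


def run_fixed():
    # Start the loader in a background thread and read only after it signals.
    store, ready = {}, threading.Event()
    worker = threading.Thread(target=load_resource, args=(store, ready))
    worker.start()
    value = read_after_waiting(store, ready)
    worker.join()
    return value
```

Whether `read_without_waiting` sees the value depends on thread scheduling, which is the kind of implicit ordering that makes a test fail only intermittently.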
Severity: normal → S3
Attachment #9386043 - Attachment is obsolete: true