Closed Bug 848577 Opened 12 years ago Closed 12 years ago

Create a Mozmill Endurance test to catch regular expression memory spike (bug 837845)

Categories

(Mozilla QA Graveyard :: Mozmill Tests, defect, P2)

defect

Tracking

(Not tracked)

VERIFIED INVALID

People

(Reporter: u279076, Assigned: AndreeaMatei)

References

Details

(Whiteboard: [mentor=davehunt][endurance] s=130318 u=new c=endurance p=1)

In Firefox 19 we introduced a regression resulting in a massive memory leak involving regular expressions. This memory leak has subsequently been fixed. Steps to Reproduce: 1. Load https://bugzilla.mozilla.org/attachment.cgi?id=709849 2. Copy 200KB+ of text and paste it into the textbox 3. Click the Test button Result: Memory usage will grow into multiple GB As I said, this has now been fixed via bug 837845 but it would be great to have an automated testcase for this regression.
Depends on: 837845
Dave, so I assume we only need iterations here and that entities are not used? I will put this as P2 so we can address it most likely already next week.
Priority: -- → P2
Whiteboard: [endurance]
Yeah, entities doesn't make much sense here.
Whiteboard: [endurance] → [mentor=davehunt][endurance]
Whiteboard: [mentor=davehunt][endurance] → [endurance][mentor=davehunt][lang=js]
Whiteboard: [endurance][mentor=davehunt][lang=js] → [endurance] s=130318 u=new c=endurance p=1
Assignee: nobody → andreea.matei
Status: NEW → ASSIGNED
Dave, should we make use of the page in step 1 (the attachment from bug 709849) and move it to mozqa.com, for this test? To type 200kb+ text I assume we can put some text in a while and after clicking test button, we should make sure the "Done" popup comes up quickly enough and that about:memory is at under 1GB? Or even better, I could test manually on some machines how much we grow when we access the page multiple times and make an average estimate. How does this sound?
(In reply to Andreea Matei [:AndreeaMatei] from comment #3) > Dave, should we make use of the page in step 1 (the attachment from bug > 709849) and move it to mozqa.com, for this test? Yes, we'll need a page in testdata for this. > To type 200kb+ text I assume we can put some text in a while and after > clicking test button, we should make sure the "Done" popup comes up quickly > enough and that about:memory is at under 1GB? > Or even better, I could test manually on some machines how much we grow when > we access the page multiple times and make an average estimate. > How does this sound? We don't need to do any assertions in the endurance test on memory or speed, that will be evident in the checkpoint results. The steps should probably be: 1. Open test URL 2. Enter 200Kb of text 3. Click 'Test' 4. Click 'Done'
Whiteboard: [endurance] s=130318 u=new c=endurance p=1 → [mentor=davehunt][endurance] s=130318 u=new c=endurance p=1
Depends on: 862293
Umm...why is this appropriate for a Mozmill test? This sounds more appropriate to be implemented in TBPL CI, not QA CI, given that the changes are purely in the JS engine codebase. Unless there's a requirement for external UI-based interaction to create automation here that can't be done with frameworks such as mochitest.
No requirement for UI based interaction, but the endurance tests allow us to repeat an interaction whilst measuring various metrics. Apologies if I'm forgetting something, but I don't think we currently have such a framework for mochitest.
(In reply to Dave Hunt (:davehunt) from comment #6) > No requirement for UI based interaction, but the endurance tests allow us to > repeat an interaction whilst measuring various metrics. Apologies if I'm > forgetting something, but I don't think we currently have such a framework > for mochitest. Ah okay. I'll centralize the discussion to this bug, as you stated in the other bug. See my response though in https://bugzilla.mozilla.org/show_bug.cgi?id=837845#c31. If the reason for the spike is a memory leak, then I think a simple crashtest that models the attached test case could easily solve this problem to add automation coverage here. I don't know as much about Mozmill endurance tests, but I'd imagine the value of where that could be value of where they come in is for such cases as: - UI interaction is required - Measuring accumulation of memory that is *not* due to leaks, but more for cases where memory consumption is abnormally high on multiple runs (intense memory ops) Let me know what you think.
Just to add more info - the reasoning for the crashtest is the fact that crashtests that run in TBPL will fail if there's a leak during the test.
A leak is not equal with high memory usage. So please don't compare those two different things. Our endurance tests get run over a longer period of time, which we cannot do for any in-tree test. If you want to have this discussed please start a new thread on the appropriate list, but I don't think it worth continuing here.
(In reply to Henrik Skupin (:whimboo) from comment #9) > A leak is not equal with high memory usage. So please don't compare those > two different things. Our endurance tests get run over a longer period of > time, which we cannot do for any in-tree test. If you want to have this > discussed please start a new thread on the appropriate list, but I don't > think it worth continuing here. I think you missed my point entirely here. I already said that they are different. And I disagree this isn't appropriate to discuss here, given that I'm questioning the validity of the bug. The comments I posed are talking about what framework is appropriate for bug 837845, which is talking about a leak overtime leading to a spike. If the leak is the root cause, then I would generally think a Mozmill endurance is overkill and holds less value than building a TBPL-based test, given that failures that happen in TBPL result in an immediate backout of a patch. So choosing the right framework is really important to consider, so I would caution throwing that analysis out. Putting needinfo on Clint for his input here as well. I still haven't received an analysis response that shows a demonstration of why a Mozmill endurance test is the *right* framework of choice for bug 837845. My initial analysis of that bug actually argues in favor of a crashtest as the more important test to get in the tree.
Flags: needinfo?(ctalbert)
I may be wrong, but I don't believe the intention here is to verify bug 837845. I would agree that an endurance test would probably not be appropriate for that as it would appear that the issue was immediately apparent. The advantage of an endurance test would be that we can repeat the steps several times, allowing us to detect if there is a memory leak or degradation of performance over time in this area of functionality. I would agree that using Mozmill might be overkill in that user interaction is not required (as I mentioned in bug 837845 comment 26), but if the goal is to identify performance issues over time then I don't think we currently have another suitable framework.
(In reply to Dave Hunt (:davehunt) from comment #11) > I may be wrong, but I don't believe the intention here is to verify bug > 837845. I would agree that an endurance test would probably not be > appropriate for that as it would appear that the issue was immediately > apparent. Fair enough. I agree on that. Probably would be worthwhile to close the leak up with a crashtest though still. > > The advantage of an endurance test would be that we can repeat the steps > several times, allowing us to detect if there is a memory leak or > degradation of performance over time in this area of functionality. I would > agree that using Mozmill might be overkill in that user interaction is not > required (as I mentioned in bug 837845 comment 26), but if the goal is to > identify performance issues over time then I don't think we currently have > another suitable framework. In the case of picking up a leak on multiple runs, I think you could still pull that off in a crashtest through using local storage and reloads potentially to get this running in TBPL. Although a dedicated framework that controls this without using reloads and local storage might be better, so long as the test fails upon any detection of a leak, like what's seen in TBPL. In the case of memory degradation over time in the case of this bug, so long as expectations are set of what "degradation" means and being able to fail fast when the definition of "degradation" is hit, then yeah, you could add a test here. I do see a general conclusion though from this bug that there's definitely value to consider getting endurance tests that are deemed "reliable" running in TBPL (e.g. the automation in this bug would likely be okay TBPL). But that's a separate discussion outside the scope of this bug.
Flags: needinfo?(ctalbert)
As said earlier this is not a leak but a memory spike which is seen for a given amount of time when loading the mentioned testcase. After a while the memory usage goes back to normal and any reserved memory is getting released. So none of our in-tree testsuites will be able to catch this. And yes, this includes crashtests and mochitests. Those tests can only detect leaks during shutdown and report those appropriately. That's all. Also I think instead of TBPL you meant Buildbot which is the underlying continuous integration system. TBPL is simply a frontend to show the results. On buildbot we also run talos but that's a framework for performance tests. It doesn't check for leaks. So endurance tests really seem to be the only option here. But based on it there may be another option. Given that areweslimyet.com runs tests based on our endurance tests, they might be able to cover that. Sadly those tests are only get run in Nightly and no other builds. So that's a huge benefit for our Mozmill based endurance tests, given that we run those for each supported branch immediately when the builds are out.
Summary: Create a Mozmill Endurance test to catch regular expression memory leak (bug 837845) → Create a Mozmill Endurance test to catch regular expression memory spike (bug 837845)
New news - Terrence btw disagrees a test is warranted here, given that the regressing code was removed. See https://bugzilla.mozilla.org/show_bug.cgi?id=837845#c34, which has qa-testsuite minused now. Closing on that basis.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → INVALID
Thanks for following up on this Jason. I agree, we don't need tests for code which no longer exists.
Status: RESOLVED → VERIFIED
Product: Mozilla QA → Mozilla QA Graveyard
You need to log in before you can comment on or make changes to this bug.