Closed
Bug 551617
(HomelessReftests)
Opened 15 years ago
Closed 3 years ago
Tool to detect test files not used by reftest.list manifest
Categories
(Testing :: General, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: murali, Unassigned)
References
(Depends on 4 open bugs)
Details
Attachments
(4 files, 5 obsolete files)
Please find enclosed a script that scans a given directory for all reftest.list files and then identifies if there are any files that are in the 'reftest' directories but neither called by other files nor used in any manifest files.
Those files could be newly added test cases which were not updated in the reftest.list files or some thing else.
How to use the script :
The script takes two arguments.
First one is the path of your Firefox code base.
Second one is the string 'reftest.list'
Usage :: python mchecker.py /path/to/your/src/directory reftest.list
It generates two text files for now .
found.txt :: Which shows what all the files that are used as part of other files or present in manifest lists.
notfound.txt :: which shows all files that are not called by any other files or referenced in reftest.list files.
I have yet to add the usage and commandline arguments validation code which I will complete soon.
In the mean time, can you please comment on the logic and utility
Updated•15 years ago
|
Summary: Please review this code → Tool to detect test files not used by reftest.list manifest
Updated•15 years ago
|
Attachment #431775 -
Attachment mime type: text/x-python → text/plain
Updated•15 years ago
|
Attachment #431775 -
Flags: feedback?(ctalbert)
Updated•15 years ago
|
Attachment #431775 -
Flags: feedback?(jmaher)
Comment 1•15 years ago
|
||
I will leave out style and polish comments for now.
If the intention is to identify files that are on the filesystem, but not in a manifest file, then I don't understand why the first for loop is for each file in thereftestfileslist (which that variable name must get changed!)
I would expect to see something like:
reftestFiles = parseManifest(rootManifest) #re use/port reftest.js parser
existingFiles = parseFilesystem(rootDir) #some kind of os.walk
for each f in existingFiles:
if reftestFiles.find(f) == None:
print "HOMELESS TEST %(s) needs a loving manifest" % f
keep in mind the .find() is made up, but can consist of a regex match on the leaf name or some simple function that searches leaf name and ensures singularity based on path name.
Another thing which I don't understand is all_files. I don't understand the argument list as the second argument is a pattern, but also a .list file? I am not sure if this is intended to be a recursive function, but never called recursively.
Comment 2•15 years ago
|
||
you might want to check out dmandelin's check manifest in js/src/tests/jstests.py
It seems like a script like this would have a lot of false positives due to support files (images, subframes, style sheets, scripts, etc.); I'm not sure if it's too many to be useful. It might be possible to get a much more accurate list by comparing the list of files that are served when all tests are run over HTTP (can the server take logs?) (once bug 544385 lands) with the list of files packaged by the test packaging.
Reporter | ||
Comment 4•15 years ago
|
||
Dbaron,
I need more clarity on your comments. Can you please elaborate!
Here is my understanding on how the reftests directory is organized.
There are three classes of files in the reftest.
Super-class: The reftest.list files which call other lists via 'include' or they call test cases in them.
First-class: These are the actual test cases that are referenced by Super-class files
Second-class: These are support files which are not directly referenced by Super-class files but are used in the First-class files.
So, my script is doing three passes.
First-pass : Scan the full ROOT directory structure and get the list of reftest.list files
Second-pass : In each such directory where a reftest.list file resides, get the list of all other files in that directory and see if they are included in any of the Super-class files.
Third-pass : If a file is not included by any of the Super-class files, then check if that file is being included in any other files in its own directory or any of sub-directories in its own directory.
In that process I am generating two files ... as you see ...
The found.txt is a useless artifact of my multi-pass scanning.
The notfound.txt is what matters. it lists all those files that are not included in any files or lists.
I agree that my algorithm is not Big(O) optimized. But you seem to think that the output is erroneous.
It could be possible that I totally screwed up in ways that I have not yet realized. Can you please elaborate more on your comments.
Reporter | ||
Comment 5•15 years ago
|
||
(In reply to comment #2)
> you might want to check out dmandelin's check manifest in
> js/src/tests/jstests.py
Thanks BD .. I am on it
Reporter | ||
Comment 6•15 years ago
|
||
(In reply to comment #5)
> (In reply to comment #2)
> > you might want to check out dmandelin's check manifest in
> > js/src/tests/jstests.py
>
> Thanks BD .. I am on it
Oops .. I meant BC
(In reply to comment #4)
> Third-pass : If a file is not included by any of the Super-class files, then
> check if that file is being included in any other files in its own directory or
> any of sub-directories in its own directory.
I didn't realize you were doing that.
(That said, does the way you do this count file C as found if a test A in the reftest.list references file B which in turn references file C?)
Reporter | ||
Comment 8•15 years ago
|
||
(In reply to comment #2)
> you might want to check out dmandelin's check manifest in
> js/src/tests/jstests.py
Hi BC, I have checked the script and here are my findings :
The logic used in dmandelin's 'def' is very similar to what I have done ( or vice-versa ).
However, there are some simplification-factors available in jsreftest that are ( unfortunately ) unavailable with reftest.
Looks like there is a single manifest file in dmandelin's check and it is assumed that all files that have a '.js' extension should be in that or else missing.
In my case, we have a graph of manifest files. Also the structure is slightly complicated as explained in Comment #4.
I am going through multiple passes and pretty much applying similar logic as that of the "def check_manifest(test_list)"
Again, since I am new to the Python language .. I may have totally missed the mark here. Please correct me if you find that I am wrong.
Reporter | ||
Comment 9•15 years ago
|
||
> (That said, does the way you do this count file C as found if a test A in the
> reftest.list references file B which in turn references file C?)
Yes I do. as long as any file in the ROOT path can be traced through any number of hoops to any reftest.list I mark it as 'found'
Comment 10•15 years ago
|
||
I think dbaron was asking if a file includes another file, but it is not in the manifest. For example:
in the manifest file there is just:
http://mxr.mozilla.org/mozilla-central/source/layout/reftests/font-face/reftest.list#53
but in the file:
http://mxr.mozilla.org/mozilla-central/source/layout/reftests/font-face/cross-iframe-1.html?force=1#28
it references the *-inner.html files.
Thinking more about an earlier comment dbaron made regarding http server logging would be a good way to do this. I assume we could turn on verbose logging on a webserver and write a quick log parser to determine what files were loaded.
Reporter | ||
Comment 11•15 years ago
|
||
> (That said, does the way you do this count file C as found if a test A in the
> reftest.list references file B which in turn references file C?)
Yes I do. as long as any file in the ROOT path can be traced through any number of hoops to any reftest.list
Reporter | ||
Comment 12•15 years ago
|
||
> in the manifest file there is just:
> http://mxr.mozilla.org/mozilla-central/source/layout/reftests/font-face/reftest.list#53
>
> but in the file:
> http://mxr.mozilla.org/mozilla-central/source/layout/reftests/font-face/cross-iframe-1.html?force=1#28
>
> it references the *-inner.html files.
I am not sure about the point you are making.
My logic fails only if there is a cyclic reference like A refers to B and B refers to A and both are not in any reftest.list. In such a case, instead of reporting that both are missing from reftest.list I mark them as found.
If we have such a case, we need to examine that.
Reporter | ||
Updated•15 years ago
|
Assignee: nobody → mnandigama
(In reply to comment #10)
> Thinking more about an earlier comment dbaron made regarding http server
> logging would be a good way to do this. I assume we could turn on verbose
> logging on a webserver and write a quick log parser to determine what files
> were loaded.
The http server logging does have one area it will miss: files that exist only to be the "wrong" file, i.e., to make sure we don't load a certain file in some case.
Comment 14•15 years ago
|
||
Most of the output seems to be real problems.
False positives:
* passinner.png is used to test relative URLs. You can ignore-list it.
* LICENSE, README, readme.txt can be ignore-listed.
* Please ignore-list ".DS_Store". Mac OS X litters these all over.
* I think you want to ignore any file whose name ends with ^headers^.
Comment 15•15 years ago
|
||
For the rest of the output, I'll file bugs (or report false positives here) as appropriate.
Updated•15 years ago
|
Alias: HomelessReftests
Reporter | ||
Comment 16•15 years ago
|
||
I have incorporated all comments from jmaher and Jessey. Please review and approve this script.
Attachment #431775 -
Attachment is obsolete: true
Attachment #431775 -
Flags: feedback?(jmaher)
Attachment #431775 -
Flags: feedback?(ctalbert)
Reporter | ||
Updated•15 years ago
|
Attachment #432175 -
Flags: review?(jmaher)
Attachment #432175 -
Flags: feedback?(dbaron)
Updated•15 years ago
|
Attachment #432175 -
Attachment mime type: text/x-python → text/plain
Reporter | ||
Comment 17•15 years ago
|
||
(In reply to comment #1)
Big guy !!
I will Big(O) optimize my script using lists and only two scans across the directories as you suggested.
Please stay tuned for the 3rd review some time next week.
For now, this script gives what you want !! the result ...
> I will leave out style and polish comments for now.
>
> If the intention is to identify files that are on the filesystem, but not in a
> manifest file, then I don't understand why the first for loop is for each file
> in thereftestfileslist (which that variable name must get changed!)
>
> I would expect to see something like:
>
> reftestFiles = parseManifest(rootManifest) #re use/port reftest.js parser
> existingFiles = parseFilesystem(rootDir) #some kind of os.walk
> for each f in existingFiles:
> if reftestFiles.find(f) == None:
> print "HOMELESS TEST %(s) needs a loving manifest" % f
>
> keep in mind the .find() is made up, but can consist of a regex match on the
> leaf name or some simple function that searches leaf name and ensures
> singularity based on path name.
>
> Another thing which I don't understand is all_files. I don't understand the
> argument list as the second argument is a pattern, but also a .list file? I am
> not sure if this is intended to be a recursive function, but never called
> recursively.
Comment on attachment 432175 [details]
version 2 of the script incorporating all review comments
Not clear what feedback is being requested here; I don't think there's any need for me to review this. I'd only note that:
># The Initial Developer of the Original Code is
># Murali Krishna Nandigama
If you're a Mozilla Corp. employee, this should be the Mozilla Foundation.
># Portions created by the Initial Developer are Copyright (C) 2010-11
># the Initial Developer. All Rights Reserved.
Should be copyright 2010.
>def main():
>
> rootDirectory = sys.argv[1]
> manifestFileName = sys.argv[2]
>if __name__ == '__main__':
> main()
This doesn't seem to serve much useful modularity purpose. Perhaps it would be more useful if it looked something like:
def find_unlisted_files(rootDirectory, manifestFileName):
...
if __name == '__main__':
find_unlisted_files(sys.argv[1], sys.argv[2])
Attachment #432175 -
Flags: feedback?(dbaron)
Comment 19•15 years ago
|
||
Comment on attachment 432175 [details]
version 2 of the script incorporating all review comments
>
>def main():
>
> rootDirectory = sys.argv[1]
> manifestFileName = sys.argv[2]
>
this was mentioned by dbaron that you should have a different function name other than main.
I would keep main which verifies the input parameters and errors out if they are invalid while providing help. Then call your main function with your validated input parameters.
> theRefTestFilesList = list(find_all_files(rootDirectory,manifestFileName))
>
what about calling this 'reftestFiles'?
> for file in theRefTestFilesList:
> print file
> # Get the base directory of the reftest.list file and scan all files.
> dirpath= os.path.dirname(file)
we don't use file anymore inside this loop, just dirpath. if we are concerned about dirpath, make a shorter list to loop on.
infile=open(myfile,"r")
text=infile.read()
infile.close()
index = text.find(eachFile)
can you make a function here like 'findTextInFile(text, filename)' which returns an index or None?
Lets get this cleaned up with the couple of comments you have on this version and give it another round of reviews.
Attachment #432175 -
Flags: review?(jmaher) → review-
Comment 20•15 years ago
|
||
Please make it use a top-level manifest (e.g. central/layout/reftests/reftest.list) rather than scanning an entire source tree for files named reftest.list.
Reporter | ||
Comment 21•15 years ago
|
||
ok tough guys ;)
I am implementing the suggestions.
Comment 22•15 years ago
|
||
I'll submit a big patch to try-server and then mozilla-central to fix the obvious test problems. For the four that confuse me, I'll file bugs or email developers.
Attachment #431996 -
Attachment is obsolete: true
Comment 23•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/eee895e24866 fixes a bunch of the tool's warnings.
A few of the tests turned out to have been omitted from manifests because they failed. I filed bugs on those.
Comment 24•15 years ago
|
||
All the problems found with reftests have been fixed or filed as bugs blocking this one. Murali, lemme know when you have a new version of the tool and I'll run it for crashtests too.
Reporter | ||
Comment 25•15 years ago
|
||
This version has made the changes recommended by reviewers. Now the script takes only one argument. the fully qualified reftest.list ot crashtest.list file name.
Starting from the file, the list of list files is build and searching is done.
I have not implemented jmaher's suggestion of creating a def for opening file and returning the search index. That , I felt was too much of modularizing for the sake of doing it. I can live with a little bit of disorder :)
Usage : python generic-mchecker.py /usr/home/src/layout/reftests/reftest.list
Attachment #432175 -
Attachment is obsolete: true
Reporter | ||
Updated•15 years ago
|
Attachment #434050 -
Attachment mime type: text/x-python → text/plain
Reporter | ||
Comment 26•15 years ago
|
||
Crashtest not found list
Reporter | ||
Comment 27•15 years ago
|
||
reftest needs home list
CAUTION: my codebase on which I ran these queries is not up to date.
Script to run :
python generic-mchecker.py /home/mnandigama/src/testing/crashtest/crashtests.list
python generic-mchecker.py /home/mnandigama/src/layout/reftests/reftest.list
Comment 28•15 years ago
|
||
(In reply to comment #27)
> Created an attachment (id=434063) [details]
> reftest needs home list
>
> reftest needs home list
>
> CAUTION: my codebase on which I ran these queries is not up to date.
>
Murali, please do not confuse things by running the script on an out-of-date build tree and attaching the output. It'd be much more helpful to run the tool with Jesse's patches which correct all the true errors and then it should find only the four tests that have been removed from manifests in order to be disabled.
Reporter | ||
Comment 29•15 years ago
|
||
Point taken ..Thanks
Reporter | ||
Comment 30•15 years ago
|
||
This list is latest and all tests found in this list are not bein gused by any manifest file
Attachment #434062 -
Attachment is obsolete: true
Reporter | ||
Comment 31•15 years ago
|
||
This list contains 26 tests. This is a reduction from previous 45 count. Even if there are open bugs filed on these 26 files, these keep appearing in the missing list as long as they are not fixed.
Attachment #434063 -
Attachment is obsolete: true
Reporter | ||
Comment 32•15 years ago
|
||
Comment on attachment 434623 [details]
Latest reftests needhome tests list
There seems a strange repeat happening on one test file /home/mnandigama/src/layout/reftests/reftest-sanity/prefix-suffix.html which appeared 4 times. I will investigate the root-cause
Comment 33•15 years ago
|
||
I fixed all the crashtests listed in comment 30's attachment.
Comment 34•3 years ago
|
||
The bug assignee didn't login in Bugzilla in the last 7 months.
:ahal, could you have a look please?
For more information, please visit auto_nag documentation.
Assignee: murali.nandigama → nobody
Flags: needinfo?(ahal)
Comment 35•3 years ago
|
||
No activity for 12 years (!), closing.
Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(ahal)
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•