Closed Bug 1068923 Opened 10 years ago Closed 10 years ago

The manifest parser should provide some way to uniquely identify tests

Categories

(Testing :: Mozbase, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla36

People

(Reporter: chmanchester, Assigned: chmanchester)

References

Details

Attachments

(1 file, 5 obsolete files)

This is for the benefit of structure logging, discussion in bug 1033126 comment 32 and below. bug 920849 and bug 901990 seem to have the history of the dupe-manifest situation.

My proposal is to have an identifier field in the test object that will be the path, as usual, unless "dupe-manifest" is present, in which case the file name of the manifest (and maybe part of the path) is put at the beginning. Mozlog's formatters already know how to log iterable test ids, so this could be a tuple or something.

I'm assuming this is the best compromise on getting something reliably unique from manifestparser based test manifests and logging what other tools expect (and I'm assuming that the uniqueness property isn't likely to change). I'm open to suggestions.
This is what I mean. Pretty special. I don't think this is a crazy thing to ask of the manifests, so given how many of our tests are on these sort of manifests let's decide what we actually want and can depend on here.
Attachment #8491074 - Flags: feedback?(ted)
Assignee: nobody → cmanchester
Status: NEW → ASSIGNED
(In reply to Chris Manchester [:chmanchester] from comment #1)
> Created attachment 8491074 [details] [diff] [review]
> Implement an 'id' field for test objects provided by the manifestparser
> 
> This is what I mean. Pretty special. I don't think this is a crazy thing to
> ask of the manifests, so given how many of our tests are on these sort of
> manifests let's decide what we actually want and can depend on here.

I thought about this for a minute and maybe it is a crazy thing to ask of the manifests. We will, however, need to include the top level manifest in the test object if manifest name + test path is the reliable way to get uniqueness.
(In reply to Chris Manchester [:chmanchester] from comment #2)
> I thought about this for a minute and maybe it is a crazy thing to ask of
> the manifests. We will, however, need to include the top level manifest in
> the test object if manifest name + test path is the reliable way to get
> uniqueness.

How is playing bug 1023790 in here? That's something which will land soon.
Summary: The manifest parser should provide a unique identifier per test in test objects → The manifest parser should provide some way to uniquely identify tests
(In reply to Henrik Skupin (:whimboo) from comment #3)
> (In reply to Chris Manchester [:chmanchester] from comment #2)
> > I thought about this for a minute and maybe it is a crazy thing to ask of
> > the manifests. We will, however, need to include the top level manifest in
> > the test object if manifest name + test path is the reliable way to get
> > uniqueness.
> 
> How is playing bug 1023790 in here? That's something which will land soon.

Not sure. Looks like that could be used to achieve the same thing, but would require modifying the manifests themselves. Maybe I'm missing something though.
Comment on attachment 8491074 [details] [diff] [review]
Implement an 'id' field for test objects provided by the manifestparser

This does what I want locally but not running from tests.zip. Back to the drawing board.
Attachment #8491074 - Flags: feedback?(ted)
This at least provides unique test names (https://tbpl.mozilla.org/php/getParsedLog.php?id=48400117&tree=Try&full=1), but it occurs to me a name like all-test-dirs.list could easily be subject to change.
Attachment #8491074 - Attachment is obsolete: true
This doesn't work as soon as a duplicated manifest includes another manifest. The trail of including manifests is unique, but wow maybe that's overkill given it's for one directory. I'm thinking I'll ping gps when he's back, but I wonder if we could just copy these particular test files and rename them something like <path>-unpack.js during a build step. I don't really know how that would work so maybe it makes things complicated elsewhere, but would help us keep our story straight for identifying tests.
I guess this is a dupe of bug 957380, but as acknowledged in that bug the approach there doesn't conform to the output expected by some parsers.
:gps, as it looks like you've looked into this before, do you have any feedback on either this patch or what I've suggested in comment 7 ?
Flags: needinfo?(gps)
I am kinda philosophically opposed to renaming tests that go into the test archive to have a manifest component because this adds confusion as to where a test came from. The counterargument is that not many tests exist in multiple manifests, so the cognitive burden should only be felt by few.

But there is another problem: identifying which tests to run. That is bug 969524. And I have a proposal there that I think will work here. And I would strongly prefer for there to be symmetry between both solutions. That proposal is to introduce an alternative syntax for uniquely identifying a test:

  <path-to-manifest>:<test name or file>

e.g. toolkit/mozapps/extensions/test/xpcshell/xpcshell-unpack.ini:test_upgrade.js

(the last component there is "xpcshell-unpack.ini:test_upgrade.js" instead of "test_upgrade.js" as it would be reported today)
Flags: needinfo?(gps)
The new syntax looks good, but the issue with using it for logging is the havoc it might wreak on out of band log parsers (I don't want to break them unnecesarily until we're ready with all their relevant producers emitting structured logs). I think I've determined that the bug suggester in tbpl really wants to find a test file name after the last forward slash before the second pipe but I didn't check treeherder and maybe we can update this anyway if there's ample agreement that this is how we should identify tests.

My main question here is how to get a comprehensible idea of uniquely identifying a test in the harness from what we get out of the manifest. The answer seems to be a combination of the test path and the path of the including manifest. Is there an advantage to putting this in the mozbuild manifest emitter vs. the manifest parser itself?
Ok, I checked and treeherder is a pretty straightforward rewrite of the tbpl code on this subject:

https://github.com/mozilla/treeherder-service/blob/e35938277af77166d813b4321b9eb036d910a1c4/treeherder/log_parser/utils.py#L40
Here's my proposal for getting the information we need to uniquely identify a test out of the manifests. This is agnostic regarding the way we ultimately log tests or specify them from the commandline, but we should be sure that it will be useful for those purposes.
Attachment #8501464 - Flags: feedback?(gps)
Attachment #8491766 - Attachment is obsolete: true
Comment on attachment 8501464 [details] [diff] [review]
Provide the including manifest name in case of a dupe-manifest to uniquely identify tests.

Review of attachment 8501464 [details] [diff] [review]:
-----------------------------------------------------------------

This gets the job done, but I'm going to bikeshed on the implementation.

First, I think the whole "dupe-manifest" thing in .ini files is a giant hack. It is a work around to explicitly add metadata that the manifest parser wasn't exposing at the time. (IIRC we were lazy and didn't want to update the manifest parser.) This patch takes it full circle by incorporating the hack into the manifest parser itself.

Downstream consumers essentially want to know from which manifest a test file came from. If there are multiple manifests utilizing a test file, it needs a way to identify that. They can do this be tracking the "primary" manifest a test came from. IIRC, we couldn't do this initially when dupe-manifest was rolled out: the results of manifest parsing would only identify the actual file an entry came from (not the entry point).

I thus believe a better approach is to add a "primary-manifest" key or some such to the test entries emitted from the parser. As long as downstream consumers can differentiate between "a test was defined in this manifest file" vs "a test *context* came from this other manifest file," we should be fine.

Does that make sense? (Keep in mind I haven't been following manifest parser changes intently, so I might be missing recent context.)
Attachment #8501464 - Flags: feedback?(gps) → feedback-
(In reply to Gregory Szorc [:gps] from comment #14)
> Comment on attachment 8501464 [details] [diff] [review]
> Provide the including manifest name in case of a dupe-manifest to uniquely
> identify tests.
> 
> Review of attachment 8501464 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This gets the job done, but I'm going to bikeshed on the implementation.
> 
> First, I think the whole "dupe-manifest" thing in .ini files is a giant
> hack. It is a work around to explicitly add metadata that the manifest
> parser wasn't exposing at the time. (IIRC we were lazy and didn't want to
> update the manifest parser.) This patch takes it full circle by
> incorporating the hack into the manifest parser itself.
> 
> Downstream consumers essentially want to know from which manifest a test
> file came from. If there are multiple manifests utilizing a test file, it
> needs a way to identify that. They can do this be tracking the "primary"
> manifest a test came from. IIRC, we couldn't do this initially when
> dupe-manifest was rolled out: the results of manifest parsing would only
> identify the actual file an entry came from (not the entry point).
> 
> I thus believe a better approach is to add a "primary-manifest" key or some
> such to the test entries emitted from the parser. As long as downstream
> consumers can differentiate between "a test was defined in this manifest
> file" vs "a test *context* came from this other manifest file," we should be
> fine.

Figuring out which manifest is primary in the presence of arbitrary include chains seems complicated. Right now we can get away with assuming the including manifest is primary, but like I mentioned in comment 7 this goes wrong when a dupe manifest itself includes a manifest. I'm inclined to rely on this assumption for the moment and emit call the key "including-manifest" to avoid confusion.

> 
> Does that make sense? (Keep in mind I haven't been following manifest parser
> changes intently, so I might be missing recent context.)

As I understand it this means getting rid of the "dupe-manifest" key in .ini files altogether in favor of just getting the metadata we need out of the manifest parser. This sounds good to me, I'll write a patch.
The mozbuild front end consumes the relevant manifests individually, so the dupe-manifest property can't be determined from the information available to the manifest parser. I need to move on for the moment.
Ok, I looked at this again and I think this is about as good as we can do. This is a lot like the last patch except that I added a new field to test objects "including-manifest" to make it clearer what's going on.

The thing meant by the 'dupe-manifest' key in manifests can be determined by the manifest consumer, but this requires seeing two manifests and figuring out they have a common child manifest. This can be done in the manifestparser pretty easily, but only if the entry point to parsing is at the common parent of the manifests including the dupe or above. The mozbuild frontend reads individual manifests at the level of the sibling manifests that include a common child, so this additional context would need to be tracked in mozbuild, and would require that every test object in a child manifest include the name of its parent so that when a duplicate is detected we can end up including the relevant information.

I'm not familiar with mozbuild at all so I might be missing something, but this seems like this is the right tradeoff, that adding more manifest parsing logic to mozbuild would be the worse than keeping the redundant key in the manifests.
Attachment #8505597 - Flags: feedback?(gps)
Attachment #8501464 - Attachment is obsolete: true
Comment on attachment 8505597 [details] [diff] [review]
Provide the including manifest name in case of a dupe-manifest to uniquely identify tests.

Review of attachment 8505597 [details] [diff] [review]:
-----------------------------------------------------------------

I think this will work. I'd consider making "parentmanifest" always present if it is defined. Why make it conditional on dupe-manifest?
Attachment #8505597 - Flags: feedback?(gps) → feedback+
The path of the including manifest is included in test objects whenever possible to aide identification of tests in cases the same test file is included by multiple manifests.
Attachment #8507926 - Flags: review?(ahalberstadt)
Attachment #8505597 - Attachment is obsolete: true
Comment on attachment 8507926 [details] [diff] [review]
Provide the including manifest name to uniquely identify tests in case of a dupe manifest.

Review of attachment 8507926 [details] [diff] [review]:
-----------------------------------------------------------------

Hm, this kind of sucks because like you mentioned in several other comments it is still possible to get non-unique manifest chains with intermediate including. I'm going to r+ this for now because this has gone through many feedback cycles and is blocking you.. but I kind of feel like this bug is solving the problem in the wrong way.

To me, the manifest include chain isn't what gives uniqueness to the tests in [1]. IMO the fact that the manifest chain is unique in this case is a side effect brought on by the fact that manifestparser doesn't support multiple test names in the same file. What actually gives uniqueness to the tests in [1] is the fact that they are called with different head files. So in my ideal world, it should be possible to write a single manifest that has the following in it:

[test_foo.js]
head = head1.js

[test_foo.js]
head = head2.js

In this world, the unique key is the test object itself. I think breaking manifestparser away from the assumption that the test paths are unique is a good thing to do and would be my preferred way of solving this. That being said, I understand this would be a big scary change with lots of unknown implications. So for the sake of progress I'll give this an r+, but it might be worth revisiting this later.

[1] http://dxr.mozilla.org/mozilla-central/source/toolkit/mozapps/extensions/test/xpcshell/xpcshell-shared.ini

::: testing/mozbase/manifestparser/manifestparser/manifestparser.py
@@ +538,5 @@
> +                # If a test was included by a parent manifest we may need to
> +                # indicate that in the test object for the sake of identifying
> +                # a test, particularly in the case a test file is included by
> +                # multiple manifests.
> +                test['including-manifest'] = parentmanifest

"including-manifest" is kind of confusing to me. Would "ancestor-manifest" work instead?
Attachment #8507926 - Flags: review?(ahalberstadt) → review+
Attachment #8507926 - Attachment is obsolete: true
Comment on attachment 8509757 [details] [diff] [review]
Provide the including manifest name to uniquely identify tests in case of a dupe manifest.

r=ahal
Attachment #8509757 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/0ce325e61d11
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: