Closed
Bug 1270213
Opened 8 years ago
Closed 8 years ago
In telemetry JSONStore, use docID for file names rather than unique ID argument
Categories
(Firefox for Android Graveyard :: General, defect)
Tracking
(firefox49 fixed)
RESOLVED
FIXED
Firefox 49
Tracking | Status | |
---|---|---|
firefox49 | --- | fixed |
People
(Reporter: mcomella, Assigned: mcomella)
References
Details
Attachments
(7 files)
58 bytes,
text/x-review-board-request
|
sebastian
:
review+
|
Details |
58 bytes,
text/x-review-board-request
|
sebastian
:
review+
|
Details |
58 bytes,
text/x-review-board-request
|
sebastian
:
review+
|
Details |
58 bytes,
text/x-review-board-request
|
sebastian
:
review+
|
Details |
58 bytes,
text/x-review-board-request
|
sebastian
:
review+
|
Details |
58 bytes,
text/x-review-board-request
|
sebastian
:
review+
|
Details |
58 bytes,
text/x-review-board-request
|
sebastian
:
review+
|
Details |
We'll lose an data in the old format when we switch to the new system (unless we keep dead code around) so we should fix this before it leaves Nightly.
Assignee | ||
Comment 1•8 years ago
|
||
Relevant info: (In reply to Michael Comella (:mcomella) from bug 1243585 comment #92) > (In reply to Mark Finkle (:mfinkle) from comment #91) > > (In reply to Michael Comella (:mcomella) from comment #45) > > > (In reply to Mark Finkle (:mfinkle) from comment #44) > > > > As long as we are not encoding data into the filename for extraction, > > > > I'm fine with the plan. > > > > > > For my edification, why is that? > > > > Because encoding bits of useful data into the filename is a bad pattern. > > Why is it a bad pattern? I see a series of trade-offs: > - Error-prone > + Data is accessible without having to read the file (useful for formats > that require you to read the whole file or risk complexity, like JSON) > + Prevents duplication of data (e.g. if you need to modify it, but see > also: error-prone) > > To be explicit about what I'm doing, I'm saving a unique ID for the pings in > the filename. This ID is not extracted to be added back to the ping, but is > used by the store to identify specific pings, e.g. to remove them when an > upload is completed, or (since we assume the IDs are strictly increasing) > when pruning we can remove the oldest pings. > > > Encoding data that is not even required outside the ping (like the sequence > > number) doesn't even make sense. > > The sequence number is required for the core ping and I copy it to be the > unique ID (rather than creating another arbitrary ID). The sequence number > may not exist in all pings, which is why the unique ID is duplicated & > separated. > > > I could see using the docId as the full filename. It's a UUID and would > > never conflict with different ping types. The docId is actually needed in > > the URL generation too, so using it as the filename has the benefit not > > needing to open the ping to get any data. > > I like this because it doesn't rely on the ping-creator specifying an ID > that is potentially incorrect – non-unique or non-increasing. We can > generate the docID ourselves under the hood and use it as the file name. > We'll have to rely on file modified times (creation times are not accessible > in Java 7) for pruning, but since we write the files once, this should be > fine. > > > In fact, desktop/gecko uses docId as the archive filename. Did you take a > > look at how desktop/gecko manages archiving and resending pings? > > I have not – I'll look into it.
Assignee | ||
Comment 2•8 years ago
|
||
Desktop code: 11:22 <mfinkle> http://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/ 11:23 <mfinkle> http://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetryArchive.jsm 11:23 <mfinkle> but probably need to look at the call sites
Assignee | ||
Comment 3•8 years ago
|
||
11:34 <mfinkle> mcomella: TelemetrySend.jsm has some stuff too 11:34 <mfinkle> they have limits on how many archived pings to save 11:34 <mfinkle> for space i guess 11:34 <mfinkle> the android limit is lower than desktop, for example 12:07 <gfritzsche> mcomella: starting point is here: https://dxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetrySend.jsm#691 [ed: in submitPing function]
Assignee | ||
Comment 4•8 years ago
|
||
From a quick skim: it looks like desktop saves pings into files with the ID as their name. Tangent: they don't appear to do batch operations on the pings – they retrieve/remove with specific IDs.
Assignee | ||
Comment 5•8 years ago
|
||
These will be used by the Store in the upcoming patches. Review commit: https://reviewboard.mozilla.org/r/51187/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/51187/
Attachment #8749847 -
Flags: review?(s.kaspari)
Attachment #8749848 -
Flags: review?(s.kaspari)
Attachment #8749849 -
Flags: review?(s.kaspari)
Attachment #8749850 -
Flags: review?(s.kaspari)
Attachment #8749851 -
Flags: review?(s.kaspari)
Attachment #8749852 -
Flags: review?(s.kaspari)
Attachment #8749853 -
Flags: review?(s.kaspari)
Assignee | ||
Comment 6•8 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/51189/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/51189/
Assignee | ||
Comment 7•8 years ago
|
||
I don't actually use isUUID util, but rather the static regex constants. I'm not sure if I should remove it because: 1) I use it to test the regex constants/patterns are correct 2) It seems like someone might find it useful at some point Review commit: https://reviewboard.mozilla.org/r/51191/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/51191/
Assignee | ||
Comment 8•8 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/51193/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/51193/
Assignee | ||
Comment 9•8 years ago
|
||
Future patches will change the remaining code - This is not yet expected to compile. Review commit: https://reviewboard.mozilla.org/r/51195/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/51195/
Assignee | ||
Comment 10•8 years ago
|
||
This is not yet expected to compile. Review commit: https://reviewboard.mozilla.org/r/51197/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/51197/
Assignee | ||
Comment 11•8 years ago
|
||
The code is now expected to compile. Review commit: https://reviewboard.mozilla.org/r/51199/diff/#index_header See other reviews: https://reviewboard.mozilla.org/r/51199/
Comment 12•8 years ago
|
||
Comment on attachment 8749847 [details] MozReview Request: Bug 1270213 - Move generic file filters to FileUtil. r=sebastian https://reviewboard.mozilla.org/r/51187/#review48065
Attachment #8749847 -
Flags: review?(s.kaspari) → review+
Updated•8 years ago
|
Attachment #8749848 -
Flags: review?(s.kaspari) → review+
Comment 13•8 years ago
|
||
Comment on attachment 8749848 [details] MozReview Request: Bug 1270213 - Add FileLastModifiedComparator. r=sebastian https://reviewboard.mozilla.org/r/51189/#review48067
Updated•8 years ago
|
Attachment #8749849 -
Flags: review?(s.kaspari)
Comment 14•8 years ago
|
||
Comment on attachment 8749849 [details] MozReview Request: Bug 1270213 - Add UUIDUtil.isUUID. r=sebastian https://reviewboard.mozilla.org/r/51191/#review48069 ::: mobile/android/base/java/org/mozilla/gecko/util/UUIDUtil.java:26 (Diff revision 1) > + > + /** > + * Return if the given string is a UUID. A caller can use `UUID.fromString` and catch the exception > + * for this same behavior, but exceptions cause a performance hit and shouldn't be used for control flow. > + */ > + public static boolean isUUID(@NonNull final String UUID) { Mh. If this is actually never used then I'd remove it. If you only use the regex/pattern then why not test them in the test? The current design allows me to change the implementation of isUUID() and the tests pass. However your regex/pattern won't have any coverage anymore. ::: mobile/android/tests/background/junit4/src/org/mozilla/gecko/util/TestUUIDUtil.java:23 (Diff revision 1) > + public void testIsUUID() throws Exception { > + for (int i = 0; i < 3; ++i) { > + final String uuid = UUID.randomUUID().toString(); > + assertTrue("Generated UUID, " + uuid + ", is a UUID", UUIDUtil.isUUID(uuid)); > + } > + > + final String uuid = UUID.randomUUID().toString(); > + for (final String str : new String[] { "not-a-uuid", uuid.substring(0, uuid.length() - 1), uuid + "lol" }) { > + assertFalse("Given UUID, " + str + ", is not a valid UUID", UUIDUtil.isUUID(str)); > + } Somehow this looks more like a test of java.util.UUID.randomUUID() than isUUID(). While I like this saniy check, I think you should at least test this against some static values in isolation. Otherwise your test just asserts that isUUID() returns true for whatever randomUUID() returns.
Comment 15•8 years ago
|
||
Comment on attachment 8749849 [details] MozReview Request: Bug 1270213 - Add UUIDUtil.isUUID. r=sebastian https://reviewboard.mozilla.org/r/51191/#review48071
Attachment #8749849 -
Flags: review+
Comment 16•8 years ago
|
||
Comment on attachment 8749850 [details] MozReview Request: Bug 1270213 - Factor lockAndReadJSONFromFile out from getAllPings. r=sebastian r=sebastian https://reviewboard.mozilla.org/r/51193/#review48073
Attachment #8749850 -
Flags: review?(s.kaspari) → review+
Updated•8 years ago
|
Attachment #8749851 -
Flags: review?(s.kaspari) → review+
Comment 17•8 years ago
|
||
Comment on attachment 8749851 [details] MozReview Request: Bug 1270213 - Update TelemetryPing & builders to use docID. r=sebastian https://reviewboard.mozilla.org/r/51195/#review48075 ::: mobile/android/base/java/org/mozilla/gecko/telemetry/pingbuilders/TelemetryCorePingBuilder.java:55 (Diff revision 1) > private static final String PROFILE_CREATION_DATE = "profileDate"; > - public static final String SEQ = "seq"; > + private static final String SEQ = "seq"; > private static final String VERSION_ATTR = "v"; > > - public TelemetryCorePingBuilder(final Context context, final int sequenceNumber) { > - super(sequenceNumber); > + public TelemetryCorePingBuilder(final Context context) { > + super(); Nit: You do not need to explicitly call super: http://docs.oracle.com/javase/tutorial/java/IandI/super.html "If a constructor does not explicitly invoke a superclass constructor, the Java compiler automatically inserts a call to the no-argument constructor of the superclass. [..]"
Comment 18•8 years ago
|
||
Comment on attachment 8749852 [details] MozReview Request: Bug 1270213 - Update TelemetryUploadService to use doc ID. r=sebastian https://reviewboard.mozilla.org/r/51197/#review48077
Attachment #8749852 -
Flags: review?(s.kaspari) → review+
Comment 19•8 years ago
|
||
https://reviewboard.mozilla.org/r/51197/#review48079 > This is not yet expected to compile. Btw. if those patches are not going to be squashed then I wonder if this will affect bisecting. There will be a bunch of revisions that do not compile and can't be tested.
Comment 20•8 years ago
|
||
Comment on attachment 8749853 [details] MozReview Request: Bug 1270213 - Move PingStore* to doc ID filenames. r=sebastian https://reviewboard.mozilla.org/r/51199/#review48081 ::: mobile/android/base/java/org/mozilla/gecko/telemetry/stores/TelemetryJSONFilePingStore.java:104 (Diff revision 1) > - final SortedSet<Integer> ids = getIDsFromFileList(files); > - removeFilesWithSmallestIDs(ids, files.length - MAX_PING_COUNT); > - } > + final SortedSet<File> sortedFiles = new TreeSet<>(new FileLastModifiedComparator()); > + sortedFiles.addAll(Arrays.asList(files)); > + deleteSmallestFiles(sortedFiles, files.length - MAX_PING_COUNT); Can this be affected by manually or automatically (ntpd) updating the time? Does the filesystem use a constant timezone or can this be affected by travelling?
Attachment #8749853 -
Flags: review?(s.kaspari) → review+
Assignee | ||
Comment 21•8 years ago
|
||
https://reviewboard.mozilla.org/r/51199/#review48081 > Can this be affected by manually or automatically (ntpd) updating the time? > > Does the filesystem use a constant timezone or can this be affected by travelling? These are good questions. It appears it all [depends on the filesystem](http://ask-leo.com/why_do_file_timestamps_compare_differently_every_time_change.html), which is inconsistent across Android devices. That being said, the data from one day isn't any more or less significant than any other day so it's okay if we lose newer data in favor of uploading old data. However, our sequence numbers will become slightly less useful to explain outages (e.g. if we don't receive 3, 5, 6, it's not necessarily "they had no connection for 3, regained for 4, and lost for 5-6" but instead, "they could have pruned 3, 5, 6 from changed timestamps and uploaded 4 when ready"). We should ask Georg if this is significant.
Assignee | ||
Comment 22•8 years ago
|
||
Georg, these changesets simplify our ping store's implementation but use the file's timestamp to identify which pings to delete first when we store too many. How the timestamps change from clock skew as well as traveling varies depending on filesystem and thus the timestamps can't be relied upon. Practically speaking, newer data isn't any more useful than older data (assuming we're uploading data from within a few days time) so this sounds fine. However, the sequence numbers can no longer be relied upon to consistently show outages because they can be removed in any arbitrary order – is this okay? For what it's worth, we'll still have the (untrustworthy) ping creation date, (untrustworthy) ping submission date, and the server received date to help with forensics.
Flags: needinfo?(gfritzsche)
Assignee | ||
Comment 23•8 years ago
|
||
To be explicit, I'm blocked on comment 22 before I can upload the reuploader changes (bug 1243585 & see comment 0).
Comment 24•8 years ago
|
||
Sorry, i'm a bit busy on a work-week. This shouldn't matter much and seems like a rare edge-case. I'd recommend sending pings in a defined order (desktop does "oldest to newest"), but in case of skew or quota-exceeding its fine to break this.
Flags: needinfo?(gfritzsche)
Assignee | ||
Comment 25•8 years ago
|
||
https://hg.mozilla.org/integration/fx-team/rev/55e56996c6cc693fe7dbd81581d8607def3c4280 Bug 1270213 - Move generic file filters to FileUtil. r=sebastian https://hg.mozilla.org/integration/fx-team/rev/a7df3ff13165f25439acbfdfd581cb4e9379ef8d Bug 1270213 - Add FileLastModifiedComparator. r=sebastian https://hg.mozilla.org/integration/fx-team/rev/a318ef39ebfe3ab894705769466a898de289c002 Bug 1270213 - Add UUIDUtil.isUUID. r=sebastian https://hg.mozilla.org/integration/fx-team/rev/4020777ed10953906891787a0e088a444199c557 Bug 1270213 - Factor lockAndReadJSONFromFile out from getAllPings. r=sebastian r=sebastian https://hg.mozilla.org/integration/fx-team/rev/e9d292545018d2abff9e38d753829cf6323369b4 Bug 1270213 - Update TelemetryPing & builders to use docID. r=sebastian https://hg.mozilla.org/integration/fx-team/rev/51f9b0fc547c3c2e41e576b42ec06f62d9b28a02 Bug 1270213 - Update TelemetryUploadService to use doc ID. r=sebastian https://hg.mozilla.org/integration/fx-team/rev/bae636aae9eb500c36985c93fe34003e94b017f5 Bug 1270213 - Move PingStore* to doc ID filenames. r=sebastian https://hg.mozilla.org/integration/fx-team/rev/a97eec37ec2d4ded719ad996d75c431b7043fa5a Bug 1270213 - Add comment explaining pruning behavior. r=me
Comment 26•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/55e56996c6cc https://hg.mozilla.org/mozilla-central/rev/a7df3ff13165 https://hg.mozilla.org/mozilla-central/rev/a318ef39ebfe https://hg.mozilla.org/mozilla-central/rev/4020777ed109 https://hg.mozilla.org/mozilla-central/rev/e9d292545018 https://hg.mozilla.org/mozilla-central/rev/51f9b0fc547c https://hg.mozilla.org/mozilla-central/rev/bae636aae9eb https://hg.mozilla.org/mozilla-central/rev/a97eec37ec2d
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox49:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 49
Updated•3 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•