Split log files to prevent them from growing too large

RESOLVED FIXED in Instantbird 42

Status

Instantbird
Other
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: nhnt11, Assigned: nhnt11)

Tracking

(Depends on: 1 bug, Blocks: 3 bugs)

trunk
Instantbird 42
Dependency tree / graph

Details

Attachments

(2 attachments, 14 obsolete attachments)

1.73 KB, patch
florian
: review+
Details | Diff | Splinter Review
33.40 KB, patch
aleth
: review+
Details | Diff | Splinter Review
(Assignee)

Description

3 years ago
Currently log files correspond to single "sessions" (conversation opened till conversation closed). This can range from a few minute to a few days if the conversation isn't closed. This is bad for logs because DailyLogEnumerator splits logs into days based on the filename of the log - resulting in a long session of multiple days being shown under a single date in the log viewer. The LogWriter for a conversation should be closed, and a new one opened, at midnight.
Is this midnight local time? Midnight UTC? Some other arbitrary midnight? :P

I'm not positive I love this idea, if you're a person who is generally awake at "midnight" then all your sessions will be split across two logs. (And with the current viewer you'd have to switch days to see the rest of the conversation.)

Comment 2

3 years ago
(In reply to Patrick Cloke [:clokep] from comment #1)
> Is this midnight local time? Midnight UTC? Some other arbitrary midnight? :P
> 
> I'm not positive I love this idea, if you're a person who is generally awake
> at "midnight" then all your sessions will be split across two logs. (And
> with the current viewer you'd have to switch days to see the rest of the
> conversation.)

I doubt anyone loves this idea, but to stop logfiles growing to arbitrary size, potentially spanning days and weeks, midnight as a cutoff is as good as any other. Your example would suggest 4am or something which is just as arbitrary ;)

Hopefully infinite scroll will take care of the "switch days" problem (i.e. you could just keep scrolling into the next day/session).

Is the timezone problem really a problem worth spending time on for this?

Alternative proposals welcome.
(In reply to aleth [:aleth] from comment #2)
> (In reply to Patrick Cloke [:clokep] from comment #1)
> > Is this midnight local time? Midnight UTC? Some other arbitrary midnight? :P
> > 
> > I'm not positive I love this idea, if you're a person who is generally awake
> > at "midnight" then all your sessions will be split across two logs. (And
> > with the current viewer you'd have to switch days to see the rest of the
> > conversation.)
> 
> I doubt anyone loves this idea, but to stop logfiles growing to arbitrary
> size, potentially spanning days and weeks, midnight as a cutoff is as good
> as any other. Your example would suggest 4am or something which is just as
> arbitrary ;)
I'm not really implying use 4am, I'm implying this might not be the best way to do this.

> Hopefully infinite scroll will take care of the "switch days" problem (i.e.
> you could just keep scrolling into the next day/session).
Sure, but currently we can't do that. There's also weird issues where if you switch timezones and such you still want all the dates to be correct.

> Is the timezone problem really a problem worth spending time on for this?
It isn't a problem, it's a question.

> Alternative proposals welcome.
We discussed an alternative on IRC [1]:
22:03:51 <flo-retina> I think we should split logs when there's more than some amount of time of inactivity (20 minutes? an hour?)
22:04:22 <flo-retina> and then, if the conversation has been ongoing for a long time, we should split at midnight
22:04:36 <flo-retina> that will avoid splitting private conversations happening near midnight

[1] http://log.bezut.info/instantbird/140614/#m353
(Assignee)

Comment 4

3 years ago
Created attachment 8446849 [details] [diff] [review]
Patch v1

This starts a new log file in the following three scenarios:
- If it has been 20 minutes since the last message.
- If at midnight, it's been longer than 3 hours since we started the file.
- After every 1000 messages.

I've also added a test for this.
Assignee: nobody → nhnt11
Status: NEW → ASSIGNED
Attachment #8446849 - Flags: review?(florian)
Attachment #8446849 - Flags: review?(aleth)
(Assignee)

Updated

3 years ago
Summary: Start a new log file at midnight → Split log files to prevent them from growing too large

Comment 5

3 years ago
(In reply to Nihanth Subramanya [:nhnt11] from comment #4)
> This starts a new log file in the following three scenarios:
> - If it has been 20 minutes since the last message.
> - If at midnight, it's been longer than 3 hours since we started the file.
> - After every 1000 messages.
> 
> I've also added a test for this.

There are some UI implications to these changes too. We currently add a session start ruler for every new log file. But with these changes, what the user may think of as a "session" no longer corresponds to log files -- there can be more log files than sessions.

It seems we need an additional flag in the log file header which indicates whether a given file is to be taken as a new session (when absent, the default would be true).

The problem is "session" isn't really well-defined at the moment: you can have disconnects/reconnects as well as huge gaps of inactivity within a session at the moment. We should clarify the expected behaviour. It seems it should correspond to the user *actively* opening or closing a conversation.

So I don't think any of the three cases you implement above qualify as "new sessions", but I'm not sure about this.

Comment 6

3 years ago
(In reply to aleth [:aleth] from comment #5)
> The problem is "session" isn't really well-defined at the moment: you can
> have disconnects/reconnects as well as huge gaps of inactivity within a
> session at the moment. We should clarify the expected behaviour. It seems it
> should correspond to the user *actively* opening or closing a conversation.

Btw, if we went with this definition, we should probably ensure we start a new log file after the user /parts a channel, for example.
(Assignee)

Comment 7

3 years ago
Created attachment 8448343 [details] [diff] [review]
Patch v2

Adds a noNewSession flag to the header of split log files, which maintains current behavior with sessions.
Updates the test to accommodate this.
Attachment #8446849 - Attachment is obsolete: true
Attachment #8446849 - Flags: review?(florian)
Attachment #8446849 - Flags: review?(aleth)
Attachment #8448343 - Flags: review?(aleth)

Comment 8

3 years ago
Comment on attachment 8448343 [details] [diff] [review]
Patch v2

Review of attachment 8448343 [details] [diff] [review]:
-----------------------------------------------------------------

How does this interact with the gloda indexing?

::: chat/components/src/logger.js
@@ +114,5 @@
>                        "logs", aAccount.protocol.normalizedName,
>                        encodeName(aAccount.normalizedName));
>  }
>  
> +function getLogFilePathForConversation(aConv, aFormat, aStartTime) {

Shouldn't aStartTime be optional here? (i.e. using aConv.startDate if it's absent)

Does anything outside logger.js call this?

@@ +125,4 @@
>  }
>  
> +function getNewLogFileName(aFormat, aStartTime) {
> +  let date = aStartTime ? new Date(aStartTime) : new Date();

Is this ever called without a start time? Do we want this to be possible (What date is used then?)?

@@ +163,3 @@
>    format: "txt",
>    encoder: new TextEncoder(),
> +  _startNewFile: function cl_initLogFile(aStartTime, aNoNewSession) {

If you give this function a name, make it something like the method name ;)

@@ +170,5 @@
> +    // file name. To avoid this, increment the new start time by a second until
> +    // it's greater than the current one. This is ugly, but should rarely be needed.
> +    while (this._startTime >= aStartTime)
> +      aStartTime += 1000;
> +    this._startTime = this._lastMessageTime = aStartTime;

Do we really want to overwrite the lastmessagetime here? why?

@@ +186,5 @@
> +        isChat: this._conv.isChat,
> +        normalizedName: this._conv.normalizedName
> +      };
> +      if (aNoNewSession)
> +        header.noNewSession = true;

Should we call this continuingSession (or similar) instead to avoid the negative?

::: chat/components/src/test/test_logger.js
@@ +220,5 @@
>  
>  // Tests the global function getLogFilePathForConversation in logger.js.
>  let test_getLogFilePathForConversation = function* () {
> +  let path = gLogger.getLogFilePathForConversation(dummyConv, "format",
> +                                                   dummyConv.startDate / 1000);

yeah, looks like that last parameter wants to be optional.

@@ +466,5 @@
> +  logWriter._startTime = new Date(logWriter._startTime).setHours(20, 0, 0, 0);
> +  let nearlyMidnight = new Date(logWriter._startTime).setHours(23, 50, 0, 0);
> +  oldPath = logWriter.path;
> +  logWriter._lastMessageTime = nearlyMidnight;
> +  message.time = new Date(nearlyMidnight).setHours(24, 1, 0, 0) / 1000;

setHours(24)? Does that do what you want (i.e also change the day)?

@@ +473,5 @@
> +  notEqual(oldPath, logWriter.path);
> +
> +  // Ensure a new file is created after 1000 messages.
> +  oldPath = logWriter.path;
> +  logWriter._messageCount = 1000;

I would rather actually write 1000 messages here.

@@ +482,5 @@
> +  // with the same time, ensure that the start time of the next log file is greater
> +  // than the previous one, and that a new path is being used.
> +  let oldStartTime = logWriter._startTime;
> +  oldPath = logWriter.path;
> +  logWriter._messageCount = 1000;

Here it makes sense to do this as you don't want to spend any time writing messages for this test.
Attachment #8448343 - Flags: review?(aleth) → review-
(Assignee)

Comment 9

3 years ago
Created attachment 8449792 [details] [diff] [review]
Patch v3

(In reply to aleth [:aleth] from comment #8)
> Comment on attachment 8448343 [details] [diff] [review]
> Patch v2
> 
> Review of attachment 8448343 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> How does this interact with the gloda indexing?
Shouldn't affect gloda indexing, since the log file naming scheme and so on are the same. The only thing that's changing is that some log files may have a noNewSession flag, which is only used by logger.js anyway.

> ::: chat/components/src/logger.js
> @@ +114,5 @@
> >                        "logs", aAccount.protocol.normalizedName,
> >                        encodeName(aAccount.normalizedName));
> >  }
> >  
> > +function getLogFilePathForConversation(aConv, aFormat, aStartTime) {
> 
> Shouldn't aStartTime be optional here? (i.e. using aConv.startDate if it's
> absent)
I've made it optional now.

> Does anything outside logger.js call this?
Only tests, but I guess this is what you were asking.

> @@ +125,4 @@
> >  }
> >  
> > +function getNewLogFileName(aFormat, aStartTime) {
> > +  let date = aStartTime ? new Date(aStartTime) : new Date();
> 
> Is this ever called without a start time? Do we want this to be possible
> (What date is used then?)?
It's called from the system log writers, and for those we just use the current time.

> @@ +163,3 @@
> >    format: "txt",
> >    encoder: new TextEncoder(),
> > +  _startNewFile: function cl_initLogFile(aStartTime, aNoNewSession) {
> 
> If you give this function a name, make it something like the method name ;)
Sorry about this renaming derp.

> @@ +170,5 @@
> > +    // file name. To avoid this, increment the new start time by a second until
> > +    // it's greater than the current one. This is ugly, but should rarely be needed.
> > +    while (this._startTime >= aStartTime)
> > +      aStartTime += 1000;
> > +    this._startTime = this._lastMessageTime = aStartTime;
> 
> Do we really want to overwrite the lastmessagetime here? why?
We'd need to set it if it wasn't already set, i.e. for the first log file. I think just always setting it here works and is easier to read than an extra null check before setting it (or using it in logMessage).

> @@ +186,5 @@
> > +        isChat: this._conv.isChat,
> > +        normalizedName: this._conv.normalizedName
> > +      };
> > +      if (aNoNewSession)
> > +        header.noNewSession = true;
> 
> Should we call this continuingSession (or similar) instead to avoid the
> negative?
I don't like "continuingSession" but if you have a strong opinion on this I'll change it.

> @@ +466,5 @@
> > +  logWriter._startTime = new Date(logWriter._startTime).setHours(20, 0, 0, 0);
> > +  let nearlyMidnight = new Date(logWriter._startTime).setHours(23, 50, 0, 0);
> > +  oldPath = logWriter.path;
> > +  logWriter._lastMessageTime = nearlyMidnight;
> > +  message.time = new Date(nearlyMidnight).setHours(24, 1, 0, 0) / 1000;
> 
> setHours(24)? Does that do what you want (i.e also change the day)?
Yes. Calling setHours with a parameter outside the range advances the next parameter - so setHours(24) is equivalent to advancing the date && setHours(0).

> @@ +473,5 @@
> > +  notEqual(oldPath, logWriter.path);
> > +
> > +  // Ensure a new file is created after 1000 messages.
> > +  oldPath = logWriter.path;
> > +  logWriter._messageCount = 1000;
> 
> I would rather actually write 1000 messages here.
I feel like it's unnecessary I/O, but I guess it tests to make sure _messageCount gets incremented so okay. Though fwiw _messageCount is really an internal parameter...
Attachment #8448343 - Attachment is obsolete: true
Attachment #8449792 - Flags: review?(aleth)
(Assignee)

Comment 10

3 years ago
Created attachment 8449801 [details] [diff] [review]
Patch v3 for real

Forgot to qref, sorry.
Attachment #8449792 - Attachment is obsolete: true
Attachment #8449792 - Flags: review?(aleth)
Attachment #8449801 - Flags: review?(aleth)
Comment on attachment 8449801 [details] [diff] [review]
Patch v3 for real

Review of attachment 8449801 [details] [diff] [review]:
-----------------------------------------------------------------

Looks promising.

I think you need to add an exception somewhere in the code to ignore the timestamp of delayed messages, otherwise the beginning of a twitter timeline or XMPP MUC could be split to plenty of small log files. (And then I'm afraid you also need to fix twitter.js to correctly set that boolean for all messages that are fetched when reconnecting, and that were posted before the time when the account got connected :-().

::: chat/components/src/logger.js
@@ +165,3 @@
>    format: "txt",
>    encoder: new TextEncoder(),
> +  startNewFile: function cl_startNewFile(aStartTime, aNoNewSession) {

What's the meaning of 'cl' here? Is it a 'ConversationLog' that we renamed to 'LogWriter' in a previous patch, and forgot to change these names to 'lw'?

@@ +171,5 @@
> +    // we will create another file, using the same start time - and so the same
> +    // file name. To avoid this, increment the new start time by a second until
> +    // it's greater than the current one. This is ugly, but should rarely be needed.
> +    while (this._startTime >= aStartTime)
> +      aStartTime += 1000;

Errr... how many times are you going to iterate through this code if for some reason I change my computer's clock a year back?

Shouldn't this just be:
aStartTime = Math.max(aStartTime, this._startTime + 1000);

@@ +188,5 @@
> +        isChat: this._conv.isChat,
> +        normalizedName: this._conv.normalizedName
> +      };
> +      if (aNoNewSession)
> +        header.noNewSession = true;

Please use a name without negative. There are plenty of other things that this file isn't (it's not a flying pig either, right?), let's describe what it is.

Suggestion: 'continuedSession' (or just 'continued', but that may not be descriptive enough)

@@ +230,5 @@
>    logMessage: function cl_logMessage(aMessage) {
> +    // We start a new log file in the following cases:
> +    // - If it has been 20 minutes since the last message.
> +    // - If at midnight, it's been longer than 3 hours since we started the file.
> +    // - After every 1000 messages.

Each of these items should have a const defined right after it.

@@ +231,5 @@
> +    // We start a new log file in the following cases:
> +    // - If it has been 20 minutes since the last message.
> +    // - If at midnight, it's been longer than 3 hours since we started the file.
> +    // - After every 1000 messages.
> +    const inactivityLimit = 20 * 60 * 1000;

Our coding style for consts is either |const kNameOfConst = ...| or |const NAME_OF_CONST = ...|.

@@ +232,5 @@
> +    // - If it has been 20 minutes since the last message.
> +    // - If at midnight, it's been longer than 3 hours since we started the file.
> +    // - After every 1000 messages.
> +    const inactivityLimit = 20 * 60 * 1000;
> +    const midnightTimeLimit = 3 * 60 * 60 * 1000;

I wonder if 'midnight' (ie. '0') shouldn't also be a const. And what's named midnightTimeLimit could be named kDayOverlapLimit or something that describes what it actually means.

@@ +239,5 @@
> +    let messageMidnight = new Date(messageTime).setHours(0, 0, 0, 0);
> +
> +    if (messageTime - this._lastMessageTime > inactivityLimit ||
> +        messageMidnight - this._startTime > midnightTimeLimit ||
> +        ++this._messageCount > 1000)

make this 1000 a const.

@@ +549,5 @@
>          continue;
>        }
> +
> +      if (!data.noNewSession)
> +        messages.push(sessionMsg);

It seems a be strange that we are initializing the sessionMsg object in cases where we are not using it. But I would need more context to see if there's actually a good reason to do so.
Attachment #8449801 - Flags: feedback+
(Assignee)

Comment 12

3 years ago
Created attachment 8450638 [details] [diff] [review]
Patch v4

Address review comments.
A new session is now created after a period of inactivity (which is now 30 minutes instead of 20).
A session message is now added for the first log file even if it has the continuedSession flag set.
Requesting f? because I haven't modified the twitter code to set the delayed flag yet.
Attachment #8449801 - Attachment is obsolete: true
Attachment #8449801 - Flags: review?(aleth)
Attachment #8450638 - Flags: feedback?(florian)
(Assignee)

Comment 13

3 years ago
Created attachment 8450640 [details] [diff] [review]
Patch v4.1

Fixed a comment mistake in the test, and check the header for the continuedSession flag in all three cases.
Attachment #8450638 - Attachment is obsolete: true
Attachment #8450638 - Flags: feedback?(florian)
Attachment #8450640 - Flags: feedback?(florian)
(In reply to Nihanth Subramanya [:nhnt11] from comment #12)

> A session message is now added for the first log file even if it has the
> continuedSession flag set.

I don't understand the meaning of this sentence. More specifically, how can the "first log" be a continued session?

Comment 15

3 years ago
I realize there has been a lot of discussion about this already, but:

Since continued sessions should be invisible to the user (no session start ruler), why not start a new log at midnight rather than three hours later? This makes the correct display of the logs for a particular day easy.

The only drawback is temporary (assuming the gsoc project is a success): while infinite scroll doesn't exist, it's not possible to scroll up/down past day boundaries in the log viewer, so you can't view the complete session all at once. This is the only potential argument for keeping the midnight + 3 hours version now and switching to the precise midnight version later that I can think of.
(Assignee)

Comment 16

3 years ago
(In reply to aleth [:aleth] from comment #15)
> ruler), why not start a new log at midnight rather than three hours later?

We start a new log file for the first message after midnight in the current patch, but only if the log was started before 21:00. The splitting itself happens at midnight, not 3:00.

Comment 17

3 years ago
(In reply to Nihanth Subramanya [:nhnt11] from comment #16)
> We start a new log file for the first message after midnight in the current
> patch, but only if the log was started before 21:00. The splitting itself
> happens at midnight, not 3:00.

OK, I misunderstood that part. My proposal would still be to drop the "only if..." condition, either now or when infinite scroll is ready.
(Assignee)

Comment 18

3 years ago
Created attachment 8451940 [details] [diff] [review]
Set delayed flag on older tweets
Attachment #8451940 - Flags: review?(florian)
(Assignee)

Comment 19

3 years ago
Comment on attachment 8450640 [details] [diff] [review]
Patch v4.1

Setting r? now that the twitter patch is up.
Attachment #8450640 - Flags: review?(florian)
Attachment #8450640 - Flags: review?(aleth)
Attachment #8450640 - Flags: feedback?(florian)

Comment 20

3 years ago
Comment on attachment 8450640 [details] [diff] [review]
Patch v4.1

Review of attachment 8450640 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with those minor changes!

::: chat/components/src/test/test_logger.js
@@ +444,5 @@
> +    yield gLogger.gFilePromises.get(logWriter.path);
> +  });
> +
> +  yield logMessage(message);
> +  message.time += 35 * 60; // 35 minutes later.

This and the following should really use the constants from logger.js, but I'm not going to r- for that, as long as you don't mind manually tweaking the test if we decide to tweak these later ;)

@@ +475,5 @@
> +
> +  // Ensure a new file is created after 1000 messages.
> +  oldPath = logWriter.path;
> +  for (let i = 0; i < 1000; ++i)
> +    yield logMessage(message);

Don't yield here.
Attachment #8450640 - Flags: review?(aleth) → review+
(Assignee)

Comment 21

3 years ago
Created attachment 8452438 [details] [diff] [review]
Patch v5

Make the limit constants properties of LogWriter.prototype and use them in the test. Updated comments.
Yield only on the last of the 1001 messages instead of all of them.
Attachment #8450640 - Attachment is obsolete: true
Attachment #8450640 - Flags: review?(florian)
Attachment #8452438 - Flags: review?(florian)
(Assignee)

Updated

3 years ago
Blocks: 955014
Attachment #8451940 - Flags: review?(florian) → review+
Comment on attachment 8452438 [details] [diff] [review]
Patch v5

Review of attachment 8452438 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay.
Attachment #8452438 - Flags: review?(florian) → review+

Updated

3 years ago
Keywords: checkin-needed

Comment 23

3 years ago
Please push this to try to give it a test run on all OS.
Keywords: checkin-needed
OS: Mac OS X → All
Hardware: x86 → All

Comment 24

3 years ago
Maybe use try on comm-aurora if c-c is busted.
(Assignee)

Comment 25

3 years ago
Created attachment 8474049 [details] [diff] [review]
Patch v6

Update to make gloda indexing work with split log files. This necessitated modifying getLogPathForConversation to return all the paths of a log writer rather than just the current one.
Attachment #8452438 - Attachment is obsolete: true
Attachment #8474049 - Flags: review?(florian)
Comment on attachment 8474049 [details] [diff] [review]
Patch v6

Review of attachment 8474049 [details] [diff] [review]:
-----------------------------------------------------------------

::: mail/components/im/modules/index_im.js
@@ +319,5 @@
> +      if (!this._knownConversations[convId].logFiles) {
> +        let logFiles =
> +          yield Services.logs.getLogPathsForConversation(aConversation);
> +        if (!logFiles.length) {
> +          // Log files doesn't exist yet, nothing to do!

"files doesn't" is bad grammar. Maybe:
No log files exist yet, nothing to do!

@@ +343,5 @@
>        }
>  
>        let conv = this._knownConversations[convId];
>        let cache = conv.convObj;
> +      for (let logFile of conv.logFiles) {

I don't fully remember this code, so I can't say for sure what should be done here, but when reading this I'm not entirely convinced it will do the right thing.

It seems you are triggering a separate indexing job for each file. Does this mean that in TB search results each split file will appear as a separate conversation?

Are the "continuedSession" flags going to be ignored?
Flags: needinfo?(nhnt11)
(Assignee)

Comment 27

3 years ago
I finally looked at this again, and this is my observation:
Yes, TB search results will see each file as a separate conv, but why does this matter? Search results show only a brief "preview" of the conversation with context around the matched message anyway. When you click on the result to open the conversation, the log is shown grouped properly.
I'm not 100% sure about this and am building TB to test this out.
Flags: needinfo?(nhnt11)
(Assignee)

Updated

3 years ago
Depends on: 1084633

Updated

3 years ago
Blocks: 1099067
(Assignee)

Comment 28

3 years ago
Comment on attachment 8474049 [details] [diff] [review]
Patch v6

I'll re-request review after bug 1103647 lands.
Attachment #8474049 - Flags: review?(florian)
(Assignee)

Comment 29

2 years ago
Created attachment 8627090 [details] [diff] [review]
Patch v7

This rebases the index_im changes onto the current revision. I stand by what I said earlier that the way gloda indexes split log files with this patch does not affect how Thunderbird displays them. Faceted search results will search by file, but they show just a teeny bit of context anyway. When a result is clicked, the log is constructed out of all the files for the given day anyway, so all split logs will be read.

This does not change anything else from the previous patch (which was r+).
Attachment #8474049 - Attachment is obsolete: true
Attachment #8627090 - Flags: review?(aleth)
(Assignee)

Comment 30

2 years ago
Created attachment 8627107 [details] [diff] [review]
Patch v8

OK, turns out the previous patch broke Tb search after all. This one fixes it. The glodaConv value (which stores the id of the db entry) wasn't getting changed for new files, so the new log content was being indexed against the first log file's entry's id. This patch fixes that.

Also, the way the message count was being incremented, the second split would happen after kMessageCountLimit+1. This fixes that and adds a test for it.
Attachment #8627090 - Attachment is obsolete: true
Attachment #8627090 - Flags: review?(aleth)
Attachment #8627107 - Flags: review?(aleth)
(Assignee)

Comment 31

2 years ago
Created attachment 8627109 [details] [diff] [review]
Interdiff with last r+ patch

This is an interdiff between Patch v8 and Patch v5, except for index_im.js, which needs a full review.

Comment 32

2 years ago
Comment on attachment 8627107 [details] [diff] [review]
Patch v8

Review of attachment 8627107 [details] [diff] [review]:
-----------------------------------------------------------------

::: chat/components/src/logger.js
@@ +756,3 @@
>      // regardless of whether these operations succeeded.
> +    for (let path of paths)
> +      yield gFilePromises.get(path);

Does this actually always work? Or do you need Promise.all?

What are we guaranteeing to the caller, here?

::: mail/components/im/modules/index_im.js
@@ +431,5 @@
> +          yield this._indexingJobPromise;
> +        this._indexingJobPromise = new Promise(aResolve => {
> +          this._indexingJobCallbacks.set(convId, aResolve);
> +        });
> +  

Trailing whitespace.

@@ +583,5 @@
>        // chat were not in fact indexed before that session was shut down.
>        let id = yield this._getIdFromPath(relativePath);
>        if (id)
>          glodaConv.id = id;
> +      glodaConv.path = aLogPath;

This seems wrong to me. GlodaIMConversation objects already store the path.
Attachment #8627107 - Flags: review?(aleth) → review-
(Assignee)

Comment 33

2 years ago
Created attachment 8627440 [details] [diff] [review]
Patch v9

Checks the relative path of the glodaConv instead of setting the path.

(In reply to aleth [:aleth] from comment #32)
> Comment on attachment 8627107 [details] [diff] [review]
> Patch v8
> 
> Review of attachment 8627107 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: chat/components/src/logger.js
> @@ +756,3 @@
> >      // regardless of whether these operations succeeded.
> > +    for (let path of paths)
> > +      yield gFilePromises.get(path);
> 
> Does this actually always work? Or do you need Promise.all?
Why wouldn't it always work?
If we use Promise.all it'd be something like
yield Promise.all(paths.map(p => gFilePromises.get(p));
but the present code waits for all of them just fine.
> What are we guaranteeing to the caller, here?
We are guaranteeing that any queued messages are written (or failed to be written) to the log file before returning the paths. i.e., logger.js is presently done with those files.
Attachment #8627107 - Attachment is obsolete: true
Attachment #8627109 - Attachment is obsolete: true
Attachment #8627440 - Flags: review?(aleth)
(Assignee)

Updated

2 years ago
Blocks: 1175706

Comment 34

2 years ago
(In reply to Nihanth Subramanya [:nhnt11] from comment #33)
> Why wouldn't it always work?
> If we use Promise.all it'd be something like
> yield Promise.all(paths.map(p => gFilePromises.get(p));
> but the present code waits for all of them just fine.
> > What are we guaranteeing to the caller, here?
> We are guaranteeing that any queued messages are written (or failed to be
> written) to the log file before returning the paths. i.e., logger.js is
> presently done with those files.

OK, that's fine here. (In principle you could imagine something queuing a write on file 1 again while wer're waiting for the promise of file 2)

Comment 35

2 years ago
Comment on attachment 8627440 [details] [diff] [review]
Patch v9

Review of attachment 8627440 [details] [diff] [review]:
-----------------------------------------------------------------

::: mail/components/im/modules/index_im.js
@@ +419,2 @@
>  
> +      for (let logFile of logFiles) {

Is there a simple way we can avoid having to loop through all of them? (Using the previous value(s) of conv.logFile?) Or don't we care about the overhead (for people who never restart their TB)?
(Assignee)

Comment 36

2 years ago
Created attachment 8628053 [details] [diff] [review]
Patch v10
Attachment #8628053 - Flags: review?(aleth)
(Assignee)

Comment 37

2 years ago
The loop you mentioned is smarter now and only indexes files that need to be.
This is possible because imLogger yields on file operations before returning the file paths, so if multiple new files get filled up, it's guaranteed that we only need to index them once (they won't be written to again).
(Assignee)

Updated

2 years ago
Attachment #8627440 - Attachment is obsolete: true
Attachment #8627440 - Flags: review?(aleth)

Comment 38

2 years ago
Comment on attachment 8628053 [details] [diff] [review]
Patch v10

Review of attachment 8628053 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks! Looking forward to having this in nightlies.

::: mail/components/im/modules/index_im.js
@@ +421,5 @@
> +      // When new log files are started, we want to finish indexing the previous
> +      // one as well as index the new ones. The index of the previous one is
> +      // conv.logFiles.length - 1, so we slice from there.
> +      let currentLogFiles = conv.logFiles.length > 1 ?
> +                            logFiles.slice(conv.logFiles.length - 1) :

nit: slice(-2) would be easier to read.
Attachment #8628053 - Flags: review?(aleth) → review+

Comment 39

2 years ago
Comment on attachment 8628053 [details] [diff] [review]
Patch v10

Review of attachment 8628053 [details] [diff] [review]:
-----------------------------------------------------------------

Took another look, now I've had coffee.

::: mail/components/im/modules/index_im.js
@@ +420,5 @@
> +      // The last log file in the array is the one currently being written to.
> +      // When new log files are started, we want to finish indexing the previous
> +      // one as well as index the new ones. The index of the previous one is
> +      // conv.logFiles.length - 1, so we slice from there.
> +      let currentLogFiles = conv.logFiles.length > 1 ?

Would it be better to use (roughly speaking)
logFiles.slice(-1) if logfiles.length == conv.logfiles.length
logFiles.slice(-2) otherwise
Or is that risky for some reason?

Is the content of conv.logFiles actually ever used? If not, you could just store the length.
Attachment #8628053 - Flags: review+

Comment 40

2 years ago
This is off-topic, but infinite scroll sounds undesirable. Someone who IMs a lot over many days, currently only has to click the mouse once to go from the first day of a conversation to the last day. Scrolling through 24 hours worth of stuff to accomplish the same thing would be a regression.
(Assignee)

Comment 41

2 years ago
Created attachment 8638738 [details] [diff] [review]
Patch v11

slice(-2) won't give us all new log files if there are multiple new ones.
Attachment #8628053 - Attachment is obsolete: true
Attachment #8638738 - Flags: review?(aleth)

Comment 42

2 years ago
Comment on attachment 8638738 [details] [diff] [review]
Patch v11

Review of attachment 8638738 [details] [diff] [review]:
-----------------------------------------------------------------

::: mail/components/im/modules/index_im.js
@@ +352,5 @@
>      if (!(convId in this._knownConversations)) {
>        this._knownConversations[convId] = {
>          id: convId,
>          scheduledIndex: null,
> +        logFiles: null,

Is this still used? Shouldn't it be logFileCount now?

@@ +376,5 @@
>      if (!(convId in this._knownConversations)) {
>        this._knownConversations[convId] = {
>          id: convId,
>          scheduledIndex: null,
> +        logFiles: null,

Same here.

@@ +395,2 @@
>  
> +      if (conv.logFileCount == undefined) {

=== ?
Attachment #8638738 - Flags: review?(aleth) → review-
(Assignee)

Comment 43

2 years ago
Created attachment 8638841 [details] [diff] [review]
Patch v11.1

Addressed review comments.

I'm keeping it as ==, because sometimes it matches both undefined and null (but not 0, which is why I didn't use |!logFileCount|).
Attachment #8638738 - Attachment is obsolete: true
Attachment #8638841 - Flags: review?(aleth)

Updated

2 years ago
Attachment #8638841 - Flags: review?(aleth) → review+

Comment 44

2 years ago
url:        https://hg.mozilla.org/comm-central/rev/578db539d403a3d08ce628b31d2a2963ddf7b766
changeset:  578db539d403a3d08ce628b31d2a2963ddf7b766
user:       Nihanth Subramanya <nhnt11@gmail.com>
date:       Tue Jul 08 03:20:02 2014 +0530
description:
Bug 1025522 - Split log files to prevent them from growing too large - ensure Twitter correctly sets the delayed flag on older tweets. r=florian

url:        https://hg.mozilla.org/comm-central/rev/afc53c36ca1f5a08aa37b9320b3e5bbacae764e4
changeset:  afc53c36ca1f5a08aa37b9320b3e5bbacae764e4
user:       Nihanth Subramanya <nhnt11@gmail.com>
date:       Fri Jun 27 04:46:45 2014 +0530
description:
Bug 1025522 - Split log files to prevent them from growing too large. r=aleth, florian a=aleth CLOSED TREE
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 1.6
Updating milestone from 1.6 to 42.
Target Milestone: 1.6 → Instantbird 42

Updated

2 years ago
Depends on: 1191303
(Assignee)

Updated

2 years ago
Depends on: 1191957
You need to log in before you can comment on or make changes to this bug.