Closed Bug 552143 Opened 14 years ago Closed 14 years ago

Console logging storage mechanism

Categories

(DevTools :: General, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 568629

People

(Reporter: ddahl, Assigned: ddahl)

References

Details

Attachments

(2 files)

We need to figure out a good way to store, truncate and archive (optionally) the logging data flowing through the console
Created ConsoleStorage.jsm:

http://hg.mozilla.org/users/ddahl_mozilla.com/heads-up-display/file/42b38ea8143e/toolkit/components/console/hudservice/ConsoleStorage.jsm

Still iterating - using a generator to iterate what may be at times a very large dataset. Need to work out the truncation and paging yet.
The storage object holds all of the logged data well, we need to focus on truncation after a set time, perhaps 5 minutes. Currently the data is logged into an object and indexed in an array. this may have been a bad design move for the truncation operation.

A. We need to be able to quickly truncate each log.

B. We need to be able to quickly find a log object in order to append additional data to it.

Truncation is important for keeping the memory footprint small, but we need to be able to update each log entry with additional asynchronous operations. (mostly http transactions)

perhaps we need to index everything in an object and keep all of the data in an array. The object will hold the array's index and match it to the object's sequence Id. we may also need multiple indexes to keep track of timestamp --> sequence id, etc...
While working on this I wrote down some notes, here they come.

The current ConsoleStorage situation:

Entries are recorded in the consoleDisplays object, and they are grouped by 
hudId. That's consoleDisplays[hudId]. Each member of the consoleDisplays object 
is also an object of the form: consoleDisplays[hudId][entryId].

When an entry is recorded it is added to consoleDisplays[hudId][entryId].  
Additionally, the entryId is pushed in the displayIndexes[hudId] array. For each 
hudId there's an array in the displayIndexes object.

A globalStorageIndex object is updated to hold each entryId, associated 
to the hudId.

Each recorded entry is wrapped into a ConsoleEntry object.

To recap:
- recordEntry(hudId, entry) does:
  - assign an ID to the entry and wrap it into a ConsoleEntry object.
  - globalStorageIndex[entryId] = hudId.
  - consoleDisplays[hudId][entryId] = entry.
  - displayIndexes[hudId].push(entryId).
  - return entry.
- getEntry(entryId) does:
  - hudId = globalStorageIndex[entryId].
  - return consoleDisplays[hudId][entryId].
- updateEntry(entryId, entry) does:
  - hudId = globalStorageIndex(entryId).
  - consoleDisplays[hudId][entryId] = entry.

(Obviously, the above is a bit simplified, but that's the main idea. Also, 
 updateEntry() is not implemented yet.)

Advantages of the current storage:
- It is fairly straight-forward to record, update and retrieve entries.
- It is straight-forward/easy and quick to retrieve entries for a specific 
hudId.

Disadvantages:
- It is not trivial to truncate the "global" storage based on the number of 
entries, nor based on the oldest entries, say 5 minutes or older entries. This 
is because entries are put into per-HUD storage objects, and walking the objects 
would be "boring"/slow/not optimal.

I would also note that some properties and methods are unused or unimplemented (mainly storage truncation, entry updates). The test code includes some usage beyond that of the main code.
Continuing with my notes. Here is a proposal I have in mind.

Let's have the following:
- globalEntries object. This holds all entries: key names are entryIds, and 
values are the entry objects.

- globalIndex array. This holds each entry ID in the order it was pushed.

- hudIndex object. This holds arrays for each hud being logged. Key names are 
the hudIds and values are arrays holding entryIds.

How it would work:
- recordEntry(hudId, entry):
  - assign entryId to entry.
  - globalEntries[entryId] = entry.
  - globalIndex.push(entryId).
  - hudIndex[hudId].push(entryId).
  - return entryId.
- getEntry(entryId):
  - return globalEntries[entryId].
- updateEntry(entryId, entry):
  - globalEntries[entryId] = entry.

Advantages:
- It keeps the easiness and quickness of retrieving entries based on entryId, 
  easy entry updates, and easy recording. It also makes it easy to retrieve 
  entries associated to a specific hudId.
- It is now easier to truncate globalEntries based on the number of elements - 
we can use globalIndex.length. When we decide it's time to cut N elements, we 
can delete them from the globalEntries object, then we can splice the 
globalIndex array.
- To perform storage truncation based on a time limit, we can walk the array 
from the oldest (first) element 0, and see at which index we have to stop - 
where the time limit is reached. We delete each entry from the globalEntries and 
at the end we can splice the globalIndex array.

Problems:
Once entries are truncated from storage, the hudIndex array would still hold 
obsolete entryIds. To clean hudIndex we would have to check entry.hudId during 
truncation, then use that hudId to access hudIndexes[hudId] to splice the array.  
That would certainly not be nice.

Do we need entry indexing for each hudId? At the moment, I see that this 
possibility of retrieving entries for a given hudId is not needed.

How truncation could be performed:
When recordEntry() is invoked we could also invoke truncateEntries() which:

- Check if (globalIndex.length > somePreferenceNrMaxEntries) { delete the 
  entries that go over the limit }.
- Check if the oldest entry has a timestamp indicating a date older than 
allowed, say 5 minutes (given by some preference). If yes, then the entry is 
deleted. This is repeated until no stale entry is found.

If invoking truncateEntries() whenever recordEntry() is called seems like 
overkill, we can use an interval to automatically invoke the storage truncation 
every few seconds (about 30 seconds should be fine?).

How deletion should be implemented? We could use splice() on the globalIndex 
array every time a deletion needs to be performed, but this can get really slow 
when there are many entries, because they need to be reindexed (arr[n] becomes 
arr[n-1] and so on).

To avoid such slowdowns we could keep track of the number of "deleted" entries.  
After a certain number of such deleted entries, we can perform the actual 
delete. However, the downside would be that it will use more memory, until the 
stale entries are cleaned-up.

Random thoughts:

- One idea would be to have the ConsoleStorage run in a separate worker thread.  
This would be beneficial for the overall operation of the HUD console tool. The 
idea I have in mind is that messages that are logged must not contain DOM nodes 
like they do now. We should only store JavaScript objects/arrays/primitives. The 
HUD GUI code should be responsible for reading and displaying these log objects.  
Currently, this cannot be implemented because the log objects point to actual 
DOM nodes.

- We need to reuse the message object properties as much as possible, to create 
as little data overhead as possible. For example, we should reuse the 
message.hudId, timestamp and even ID. Why do we need to generate a new ID?  
Aren't message IDs unique globally, irrespective of HUD?

Regarding entry IDs, if the network logging code needs to update previous log 
entries, now (and in this proposal) it has to store the ID generated by the 
console storage recordEntry() method. We could ensure that each new message has 
a unique ID that is reused by the console storage, that we can later use for 
updating entries, and so on.

- For archiving log entries we could write them to a file when from-memory 
truncation is performed.

- We could use a single array to store all the entries - no need for a separate 
object to hold them. However, it would become much harder/slower to retrieve 
entries for a given HUD, or for a given entry ID.

If we only choose to use a single object to store entries - without an 
additional array, then we don't have the notion of order (oldest or newest).  
Maybe a linked-list approach would work? I'll think more about this idea. I 
*might* come with another proposal along these lines, if it proves to be
better stuff.

Any feedback is welcome!
This is a work in progress patch, attaching for feedback mainly.

In this patch I have working code for the proposal I wrote down in my previous comment.

- The truncateEntries() is arbitrarily invoked for every 10 entries added with 
recordEntry(). That needs to be fixed. The code is there because ...

- ... the truncateEntries() method should be invoked by a global timer, every 30 seconds or so, to clean the storage. At the moment, I do not know which 
chromeWindow to use. Should I use the first chromeWindow for which 
windowInitializer() is invoked?

- The truncateEntries() method currently performs delete this.globalEntries[id] 
for each obsolete entry. Should this be performed then, or the entries should be 
deleted only when they are removed from globalIndex as well? Is deleting from 
objects slow?

- At the moment I did not add any indexing of entries per HUD - as 
explained in the proposal (hudIndex). Is this needed? I saw it's only 
used by test code.

- The cs.maxEntries and cs.oldestEntryAllowed properties are hard-coded now, 
  since they are subject to change. Should they both become preferences?

Obviously, I still left some obsolete code in there, without removing it. Most 
notably: ConsoleEntry, ConsoleStorage.displayStore(), testCreateDisplay(), and 
testIteration().

Over the next day(s) I would like to prepare more changes for the storage code.  
I would like to go along the lines of a single unique ID for each message 
object, and indexing of entries per HUD. This would bring back the notion of 
createDisplay()/removeDisplay(). Perhaps I will also try a different way of 
storing the entries.

Any feedback is very much welcome. Thanks!

Note: this patch depends on the changes in bug 568657.
I had another idea of how to store the entries. Here's another proposal:

- cs.entries is an object that holds all entries. They need no separate array for indexing. Sorting is achieved with a linked-list approach. Each entry has two properties: entry.previousEntry and entry.nextEntry. These two hold the IDs of the previous and, respectively, next entry.

- cs.firstEntry holds the ID of the oldest entry.

- cs.lastEntry holds the ID of the newest entry.

- cs.hudIndex is an object that indexes the IDs of entries, per HUD.

This implementation includes the cs.createDisplay(), cs.removeDisplay() and the cs.displayStore() methods, together with their associated test code.

I like this implementation better than the previous one. Please let me know the one you prefer, and if you have ideas to further improve the implementation you choose.

Thank you!


Note: this patch depends on bug 568657. Also, the problem with the timer-based invocation of truncateEntries() remains valid - waiting your feedback.
I hope to look at you rpatch soon, in the meantime, we should also think about how we want to enable live searching of the storage when a user enters a string into the filter box
mihai:

Please set a feedback flag for me on this, I will try to get to it this week.
Attachment #456488 - Flags: feedback?(ddahl)
Comment on attachment 456629 [details] [diff] [review]
second proposal patch - wip

Please check both proposals. I personally prefer the second proposal code, but I believe you might be interested in using something like "best of both worlds".

The patch you choose will need a complete rebase to the latest code from mozilla-central. Both are based on your hg repo.
Attachment #456629 - Flags: feedback?(ddahl)
Comment on attachment 456488 [details] [diff] [review]
initial proposal patch - wip

I think we should put this on hold until bug 568629 is worked out
Attachment #456488 - Flags: feedback?(ddahl)
Comment on attachment 456629 [details] [diff] [review]
second proposal patch - wip

I think we should put this on hold until bug 568629 is worked out
Attachment #456629 - Flags: feedback?(ddahl)
The new console service will take care of storing the data and truncating it, etc. so there is no need to track this piece separately.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: