Closed Bug 1363482 Opened 7 years ago Closed 7 years ago

Pre-load files needed during startup on a background thread

Categories

(Core :: General, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla57
Performance Impact high
Tracking Status
firefox57 --- fixed

People

(Reporter: kmag, Assigned: kmag)

References

(Depends on 1 open bug, Blocks 3 open bugs)

Details

(Whiteboard: [triaged])

Attachments

(10 files)

59 bytes, text/x-review-board-request
erahm
: review+
Details
59 bytes, text/x-review-board-request
erahm
: review+
Details
59 bytes, text/x-review-board-request
erahm
: review+
Details
59 bytes, text/x-review-board-request
erahm
: review+
Details
59 bytes, text/x-review-board-request
erahm
: review+
Details
59 bytes, text/x-review-board-request
bholley
: review+
Details
59 bytes, text/x-review-board-request
chutten
: review+
Details
59 bytes, text/x-review-board-request
erahm
: review+
Details
59 bytes, text/x-review-board-request
heycam
: review+
Details
59 bytes, text/x-review-board-request
rhelmer
: review+
Details
This ties into the script preloading from bug 1359653, and in particular makes use of the time the main thread is blocked preparing for off-thread parsing to begin loading files from omnijar on a background thread.

I have an experimental implementation, and so far it's showing good results, so I'm going to work on making it production ready.
Blocks: webext-perf
Attachment #8867437 - Flags: review?(rvitillo) → review?(chutten)
Comment on attachment 8867431 [details]
Bug 1363482: Part 1 - Fix FileLocation(nsIZipArchive*) and add a GetBaseZip() getter.

https://reviewboard.mozilla.org/r/138976/#review142640

::: commit-message-23dc7:2
(Diff revision 1)
> +Bug 1363482: Part 1 - Fix FileLocation(nsIZipArchive*) and add a GetBaseZip() getter. r?erahm
> +

Can you add more detail in the body of the commit message? It's not clear what you're fixing or why we need to expose `mBaseZip`.
Attachment #8867431 - Flags: review?(erahm) → review-
(In reply to Eric Rahm [:erahm] from comment #11)
> Comment on attachment 8867431 [details]
> Bug 1363482: Part 1 - Fix FileLocation(nsIZipArchive*) and add a
> GetBaseZip() getter.
> 
> https://reviewboard.mozilla.org/r/138976/#review142640
> 
> ::: commit-message-23dc7:2
> (Diff revision 1)
> > +Bug 1363482: Part 1 - Fix FileLocation(nsIZipArchive*) and add a GetBaseZip() getter. r?erahm
> > +
> 
> Can you add more detail in the body of the commit message? It's not clear
> what you're fixing or why we need to expose `mBaseZip`.

Sure, but in short:

The FileLocation constructor for zip archives was defined in the header but not actually implemented, so I got a linker error when I tried to use it.

I need to expose mBaseZip so I can compare it against the Omnijar archives. There are getters for similar properties, so it seemed to make sense to have one for that, too.
Comment on attachment 8867436 [details]
Bug 1363482: Part 6 - Add Cu.readFile helper to access the file preloader from JS.

https://reviewboard.mozilla.org/r/138986/#review142648

r- at least without better documentation that answers my questions without having to look at the other patches in this bug.

::: js/xpconnect/idl/xpccomponents.idl:698
(Diff revision 1)
> +    /*
> +     * Reads the given file and returns its contents. If called during early
> +     * startup, the file will be pre-read on a background thread during profile
> +     * startup so its contents will be available the next time they're read.
> +     */

I don't get it. What is returned if we're called during early startup? Do we throw? Should this be two functions, preloadFile and readPreloadedFile, instead?

I haven't looked at the rest of this bug, but at first glance this patch looks like a sync IO API. Shouldn't there be promises involved?
Attachment #8867436 - Flags: review?(bobbyholley) → review-
Before I go too far reviewing this, can you put together numbers on what kind of win we're looking at here? Also can you give me a high level overview of what the patches are attempting to do?
Flags: needinfo?(kmaglione+bmo)
Comment on attachment 8867436 [details]
Bug 1363482: Part 6 - Add Cu.readFile helper to access the file preloader from JS.

https://reviewboard.mozilla.org/r/138986/#review142648

> I don't get it. What is returned if we're called during early startup? Do we throw? Should this be two functions, preloadFile and readPreloadedFile, instead?
> 
> I haven't looked at the rest of this bug, but at first glance this patch looks like a sync IO API. Shouldn't there be promises involved?

It returns the contents of the file synchronously, whether called during early startup or not. If called during early startup, the file is pre-loaded during the next session automatically. If called after, the file is read and forgotten.

Ideally, yes, promises would be involved, but this is meant to deal with the cases where async IO is currently infeasible (at the moment, that's just xulStore.json, for JS callers). The file is read asynchronously, and is usually ready by the time readFile is called. If it's not, we block waiting for the read to finish, but still come out ahead of where we would with pure blocking IO.
(In reply to Kris Maglione [:kmag] (busy; behind on reviews) from comment #15)
> Comment on attachment 8867436 [details]
> Bug 1363482: Part 6 - Add Cu.readFile helper to access the file preloader
> from JS.
> 
> https://reviewboard.mozilla.org/r/138986/#review142648
> 
> > I don't get it. What is returned if we're called during early startup? Do we throw? Should this be two functions, preloadFile and readPreloadedFile, instead?
> > 
> > I haven't looked at the rest of this bug, but at first glance this patch looks like a sync IO API. Shouldn't there be promises involved?
> 
> It returns the contents of the file synchronously, whether called during
> early startup or not. If called during early startup, the file is pre-loaded
> during the next session automatically.

Ah, during the next _session_ - that was the part that was unclear from the comment.

> If called after, the file is read and
> forgotten.
> 
> Ideally, yes, promises would be involved, but this is meant to deal with the
> cases where async IO is currently infeasible (at the moment, that's just
> xulStore.json, for JS callers). The file is read asynchronously, and is
> usually ready by the time readFile is called. If it's not, we block waiting
> for the read to finish, but still come out ahead of where we would with pure
> blocking IO.

I don't understand. It's just one file? Why don't we just preload that file explicitly rather than using this adaptive approach?

If this is all documented in a comment in one of the patches, feel free to point me to it.
(In reply to Eric Rahm [:erahm] from comment #14)
> Before I go too far reviewing this, can you put together numbers on what
> kind of win we're looking at here? Also can you give me a high level
> overview of what the patches are attempting to do?

I need to do some more profiling before I can give you good numbers. For an unpacked build, the numbers are pretty significant. For a packed build, it's more complicated, and depends on system load. A typical try run only shows about a 5ms improvement on Windows. But for a cold start, or on a system under load, I'd expect that to be closer to tens or hundreds of milliseconds, but need to do more work profiling.

There are some readahead issues that confound that, though. In particular, we're currently specifying OS_READAHEAD on Windows for all jar archives:

http://searchfox.org/mozilla-central/rev/484d2b7f51b7aed035147bbb4a565061659d9278/modules/libjar/nsZipArchive.cpp#181-183

which as far as I can tell is a pretty big mistake. Most jar archives that we open are not sequential, and omni.ja only has a small block of sequential data that we already handle specially. I've done some try runs that suggest that disabling that significantly improves startup performance on Windows talos runs, but I need to do some more testing to see how it interacts with these patches.
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #16)
> (In reply to Kris Maglione [:kmag] (busy; behind on reviews) from comment #15)
> > If called after, the file is read and
> > forgotten.
> > 
> > Ideally, yes, promises would be involved, but this is meant to deal with the
> > cases where async IO is currently infeasible (at the moment, that's just
> > xulStore.json, for JS callers). The file is read asynchronously, and is
> > usually ready by the time readFile is called. If it's not, we block waiting
> > for the read to finish, but still come out ahead of where we would with pure
> > blocking IO.
> 
> I don't understand. It's just one file? Why don't we just preload that file
> explicitly rather than using this adaptive approach?
> 
> If this is all documented in a comment in one of the patches, feel free to
> point me to it.

That's the only file I'm currently using this for from JS, but there are a few dozen others that are being loaded this way from native code.

I chose the adaptive approach because I'd already laid the groundwork for it in bug 1359653, so it was much simpler to implement this way than manually pre-loading all of the individual files. It also makes it much easier to maintain, and gives us a reliably ordered list of what's actually being loaded during startup, rather than us having to make guesses ahead of time.

And it helps to have a complete list in a single place during early startup for integrating with the script pre-loader, since it takes advantage of the main thread being blocked to begin loading omnijar archive entries off-thread.
Flags: needinfo?(kmaglione+bmo)
Priority: -- → P1
Whiteboard: [qf] → [qf][triaged]
Kris, can you still give a high level overview of the changes?
Flags: needinfo?(kmaglione+bmo)
(In reply to Eric Rahm [:erahm] from comment #19)
> Kris, can you still give a high level overview of the changes?

Sorry, I thought I'd done that in the commit message, but I guess not. So, in short narrative form:

There are a lot of places where we currently need to synchronously load resources, either from plain files, or files within zip archives. Ideally, most of those would be switched to use asynchronous IO APIs, but several of them are needed at certain points during startup, and need to block the startup sequence at that point until the resources are loaded, anyway.

In the case of resources loaded from omnijar, this usually isn't a major problem, since we use readahead for most files loaded early from omnijar, but we still pay a bit for decompression on the main thread. For files loaded from disk, the story is worse, and we often wind up waiting tens to hundreds of milliseconds for IO during cold start or under load.

These patches build on the work from the script preloader, and provide a synchronous API to read local files and omnijar entries, and store a list of entries needed during early startup in a given session. In the next session, those files are loaded in a background thread, in the order they were first needed in the last session, during early startup, and their contents are stored until they're first needed. In practice, since we currently have several hundred milliseconds from profile startup to the first window being shown, they are almost always ready by the time they're needed.

For plain files, the loading is fairly simple, and the contents are simply read and stored. For omnijar files, there are thread safety issues for off-thread reads. Most of the reading and decompression can safely happen off-thread, but the setup needs to happen either on the main thread, or while the main thread is blocked. These patches take advantage of the fact that the main thread is blocked, without needing access to omnijar, during the script preloader's off-main-thread compile setup to do that initialization work, and then perform the rest of the read and decompression operations asynchronously.
Flags: needinfo?(kmaglione+bmo)
Comment on attachment 8867432 [details]
Bug 1363482: Part 2 - Allow pre-loading file and JAR entry contents off-thread during startup.

https://reviewboard.mozilla.org/r/138978/#review142642

Just a bunch of questions, design-wise I'd prefer not to use strings as data buffers.

::: js/xpconnect/loader/ScriptPreloader.cpp:389
(Diff revision 1)
>  
>      if (!XRE_IsParentProcess()) {
>          return Ok();
>      }
>  
> +    // Note: Code on the main thread *must not access Omnijar in any way* until

Do we have a way of enforcing that?

::: js/xpconnect/loader/URLPreloader.h:39
(Diff revision 1)
> +
> +using namespace mozilla::loader;
> +
> +class ScriptPreloader;
> +
> +class URLPreloader final : public nsIObserver

Can you add a class comment?

::: js/xpconnect/loader/URLPreloader.h:72
(Diff revision 1)
> +
> +    static Result<const nsCString, nsresult> ReadURI(nsIURI* uri, ReadType readType = Forget);
> +
> +    static Result<const nsCString, nsresult> ReadFile(nsIFile* file, ReadType readType = Forget);
> +
> +    static Result<const nsCString, nsresult> ReadFile(const nsCString& path, ReadType readType = Forget);

nit: `const nsACString&`

::: js/xpconnect/loader/URLPreloader.h:99
(Diff revision 1)
> +    virtual ~URLPreloader();
> +
> +    Result<Ok, nsresult> WriteCache();
> +
> +    // Begins reading files off-thread, and ensures that initialization has
> +    // completed before leaving the current scope. The caller *must* ensure that

Again, is there a way to enforce this?

::: js/xpconnect/loader/URLPreloader.h:136
(Diff revision 1)
> +            TypeAppJar,
> +            TypeGREJar,
> +            TypeFile,
> +        };
> +
> +        CacheKey() = default;

Does a default constructor even make sense here?

::: js/xpconnect/loader/URLPreloader.h:194
(Diff revision 1)
> +            return Omnijar::GetReader(OmnijarType());
> +        }
> +
> +        Result<FileLocation, nsresult> ToFileLocation();
> +
> +        EntryType mType = TypeFile;

I think we should just stick with explicit constructor and remove this default value.

::: js/xpconnect/loader/URLPreloader.h:209
(Diff revision 1)
> +                          , public LinkedListElement<URLEntry>
> +    {
> +        URLEntry(const CacheKey& key)
> +            : CacheKey(key)
> +        {
> +            mData.SetIsVoid(true);

What's up with this?

::: js/xpconnect/loader/URLPreloader.h:216
(Diff revision 1)
> +
> +        URLEntry(nsIFile* file)
> +        {
> +            nsCString path;
> +            MOZ_ALWAYS_SUCCEEDS(file->GetNativePath(path));
> +            *this = CacheKey(TypeFile, path);

This is pretty sketchy, can you add a `CacheKey(nsIFile)` ctor instead?

::: js/xpconnect/loader/URLPreloader.h:232
(Diff revision 1)
> +              return a->mReadTime == b->mReadTime;
> +            }
> +
> +            bool LessThan(const URLEntry* a, const URLEntry* b) const
> +            {
> +                return a->mReadTime < b->mReadTime;

TimeStamp's comparision operator asserts if it hasn't been set yet right? Do we need to check for that?

::: js/xpconnect/loader/URLPreloader.h:248
(Diff revision 1)
> +        }
> +
> +        Result<const nsCString, nsresult> Read();
> +        static Result<const nsCString, nsresult> ReadLocation(FileLocation& location);
> +
> +        size_t SizeOfIncludingThis(mozilla::MallocSizeOf mallocSizeOf)

nit: This can be `const`

::: js/xpconnect/loader/URLPreloader.h:267
(Diff revision 1)
> +        TimeStamp mReadTime{};
> +
> +        nsresult mResultCode = NS_OK;
> +    };
> +
> +    // Resolves the given URI to a CacheKey, if the URI is cacheable.



::: js/xpconnect/loader/URLPreloader.cpp:27
(Diff revision 1)
> +#include "nsDebug.h"
> +#include "nsIFile.h"
> +#include "nsIFileURL.h"
> +#include "nsIObserverService.h"
> +
> +#define Comment MacComment

Please add a comment for this stuff, I'm assuming it was unifed bustage? Another option would be to just not build this unified which might be cleaner.

::: js/xpconnect/loader/URLPreloader.cpp:74
(Diff revision 1)
> +    MOZ_COLLECT_REPORT(
> +        "explicit/url-preloader/other", KIND_HEAP, UNITS_BYTES,
> +        ShallowSizeOfIncludingThis(MallocSizeOf),
> +        "Memory used by the URL preloader service itself.");
> +
> +    for (auto elem : IterHash(mCachedURLs)) {

nit: I think `const auto` should work.

::: js/xpconnect/loader/URLPreloader.cpp:216
(Diff revision 1)
> +URLPreloader::WriteCache()
> +{
> +    MOZ_ASSERT(!NS_IsMainThread());
> +
> +    nsCOMPtr<nsIFile> cacheFile;
> +    MOZ_TRY_VAR(cacheFile, GetCacheFile(NS_LITERAL_STRING("-new.bin")));

In theory we could open a temp file with `nsDumpUtils::OpenTempFile` which would avoid the need to check if it exists. I don't feel too strongly about this though.

::: js/xpconnect/loader/URLPreloader.cpp:332
(Diff revision 1)
> +            mReaderInitialized = true;
> +            mal.NotifyAll();
> +            return;
> +        }
> +
> +        int count = 0;

Can you make this `numZipEntries` or something more descriptive?

::: js/xpconnect/loader/URLPreloader.cpp:361
(Diff revision 1)
> +                continue;
> +            }
> +
> +            size_t size = item->RealSize();
> +
> +            entry->mData.SetLength(size);

I'm not super stoked about use a string for the buffer, nsTArray is probably better.

::: js/xpconnect/loader/URLPreloader.cpp:364
(Diff revision 1)
> +            size_t size = item->RealSize();
> +
> +            entry->mData.SetLength(size);
> +            auto data = entry->mData.BeginWriting();
> +
> +            cursors.infallibleEmplaceBack(item, zip, reinterpret_cast<uint8_t*>(data),

I'm a little concerned about the overall memory overhead here. Do you have measurements?

::: js/xpconnect/loader/URLPreloader.cpp:380
(Diff revision 1)
> +    uint32_t i = 0;
> +    for (auto entry : pendingURLs) {
> +        // If there is any other error code, the entry has already failed at
> +        // this point, so don't bother trying to read it again.
> +        if (entry->mResultCode != NS_ERROR_NOT_INITIALIZED) {
> +            continue;

Why don't we notify here?

::: js/xpconnect/loader/URLPreloader.cpp:396
(Diff revision 1)
> +            auto& cursor = cursors[i++];
> +
> +            uint32_t len;
> +            cursor.Copy(&len);
> +            if (len != entry->mData.Length()) {
> +                rv = NS_ERROR_FAILURE;

Lets truncate the data as well.

::: js/xpconnect/loader/URLPreloader.cpp:401
(Diff revision 1)
> +                rv = NS_ERROR_FAILURE;
> +            }
> +        }
> +
> +        MonitorAutoLock mal(mMonitor);
> +        entry->mResultCode = rv;

Why does this need to be protected by the lock?

::: js/xpconnect/loader/URLPreloader.cpp:406
(Diff revision 1)
> +        entry->mResultCode = rv;
> +        mal.NotifyAll();
> +    }
> +
> +    // We're done reading pending entries, so clear the list.
> +    while (auto* elem = pendingURLs.getFirst()) {

You can just call `clear` right?

::: js/xpconnect/loader/URLPreloader.cpp:437
(Diff revision 1)
> +        URLEntry entry(key);
> +
> +        return entry.Read();
> +    }
> +
> +    auto entry = mCachedURLs.LookupOrAdd(key, key);

Should a lock be held here?

::: js/xpconnect/loader/URLPreloader.cpp:457
(Diff revision 1)
> +
> +/* static */ Result<const nsCString, nsresult>
> +URLPreloader::Read(CacheKey& key, ReadType readType)
> +{
> +    if (!sInitialized) {
> +        return URLEntry(key).Read();

Can you add a comment about what's going on here?

::: js/xpconnect/loader/URLPreloader.cpp:474
(Diff revision 1)
> +
> +    return GetSingleton().ReadURIInternal(uri, readType);
> +}
> +
> +Result<const nsCString, nsresult>
> +URLPreloader::ReadFileInternal(nsIFile* file, ReadType readType)

This appears to be unused

::: js/xpconnect/loader/URLPreloader.cpp:499
(Diff revision 1)
> +}
> +
> +/* static */ Result<const nsCString, nsresult>
> +URLPreloader::ReadFile(const nsCString& path, ReadType readType)
> +{
> +    nsCOMPtr<nsIFile> file = new nsLocalFile();

This should be `RefPtr` right?

::: js/xpconnect/loader/URLPreloader.cpp:501
(Diff revision 1)
> +/* static */ Result<const nsCString, nsresult>
> +URLPreloader::ReadFile(const nsCString& path, ReadType readType)
> +{
> +    nsCOMPtr<nsIFile> file = new nsLocalFile();
> +    NS_TRY(file->InitWithNativePath(path));
> +    return ReadFile(file, readType);

The amount of indirection here is a bit odd, you create a new nsIFile with the path just to extract the path out again in the chained ReadFile call.

::: js/xpconnect/loader/URLPreloader.cpp:508
(Diff revision 1)
> +
> +/* static */ Result<const nsCString, nsresult>
> +URLPreloader::Read(FileLocation& location, ReadType readType)
> +{
> +    if (location.IsZip()) {
> +        if (location.GetBaseZip()) {

What's a case where there's no base zip? I'm not even sure GetBaseZip is a thing.

::: js/xpconnect/loader/URLPreloader.cpp:521
(Diff revision 1)
> +    nsCOMPtr<nsIFile> file = location.GetBaseFile();
> +    return ReadFile(file, readType);
> +}
> +
> +/* static */ Result<const nsCString, nsresult>
> +URLPreloader::ReadZip(nsZipArchive* zip, const char* path, ReadType readType)

This could probably take an nsACString.

::: js/xpconnect/loader/URLPreloader.cpp:592
(Diff revision 1)
> +        NS_TRY(file->GetNativePath(path));
> +
> +        return CacheKey(CacheKey::TypeFile, path);
> +    }
> +
> +    // Not a file or Omnijar URI, so currently unsupported.

Perhaps we should log this so that we know there's a type we should add?

::: js/xpconnect/loader/URLPreloader.cpp:615
(Diff revision 1)
> +        NS_TRY(file->InitWithNativePath(mPath));
> +        return FileLocation(file);
> +    }
> +
> +    RefPtr<nsZipArchive> zip = Archive();
> +    return FileLocation(zip, mPath.get());

So FileLocation's copy ctor is actually pretty heavy, I think if we want to do something like this we should add a move ctor or just `new` it.

::: js/xpconnect/loader/URLPreloader.cpp:641
(Diff revision 1)
> +
> +    nsCString result;
> +    result.SetLength(size);
> +    NS_TRY(data.Copy(result.BeginWriting(), size));
> +
> +    return Move(result);

nsCString doesn't actually have a move constructor, using a nsTArray might be a better option. In general I'd prefer if all of these `Read` functions returned an nsTArray (or vector or RangedPtr or whatever).

::: js/xpconnect/loader/URLPreloader.cpp:661
(Diff revision 1)
> +        while (mResultCode == NS_ERROR_NOT_INITIALIZED) {
> +            mal.Wait();
> +        }
> +    }
> +
> +    if (mResultCode == NS_OK && mData.IsVoid()) {

`IsEmpty` is probably fine, I don't think we should be caching empty files at all.

::: js/xpconnect/loader/URLPreloader.cpp:690
(Diff revision 1)
> +URLPreloader::Observe(nsISupports* subject, const char* topic, const char16_t* data)
> +{
> +    nsCOMPtr<nsIObserverService> obs = services::GetObserverService();
> +    if (!strcmp(topic, DELAYED_STARTUP_TOPIC)) {
> +        obs->RemoveObserver(this, DELAYED_STARTUP_TOPIC);
> +        mStartupFinished = true;

We should flush the cache at this point right?
Attachment #8867432 - Flags: review?(erahm) → review-
Comment on attachment 8867432 [details]
Bug 1363482: Part 2 - Allow pre-loading file and JAR entry contents off-thread during startup.

https://reviewboard.mozilla.org/r/138978/#review142642

> Do we have a way of enforcing that?

I couldn't think of any particularly good way, aside from adding a mutex and a bunch of assertions to nsZipArchive. It didn't seem worth it, given the relatively small amount of code this guard protects.

> Does a default constructor even make sense here?

Ideally, no, but I wound up needing default constructed instances that I could use with copy assignment, and the alternatives weren't any better.

> What's up with this?

String values start out as empty strings, but I wanted to be able to tell the difference between an empty file and one that hadn't been read yet. Void string vs. empty string is a good way to do that, hence always setting the initial value to void.

> TimeStamp's comparision operator asserts if it hasn't been set yet right? Do we need to check for that?

We only put entries that have actually been read during this session into the arrays that we sort, so if either of the read times are empty, we'd want a failed assertion either way.

> Please add a comment for this stuff, I'm assuming it was unifed bustage? Another option would be to just not build this unified which might be cleaner.

Yeah. Sorry, I thought I'd added a comment for that.

We need to be able to create nsLocalFile instances before XPCOM is initialized, which means we need to import nsLocalFile.h directly. Unfortunately, nsLocalFile.h imports carbon headers, which define serveral symbols with the same names that other files in this module import from the JS namespace.

I did consider building this non-unified, but it uses a lot of the same templates as the ScriptPreloader does, so I'd prefer to keep them in the same compilation unit to avoid bloat as much as possible.

> In theory we could open a temp file with `nsDumpUtils::OpenTempFile` which would avoid the need to check if it exists. I don't feel too strongly about this though.

I don't think that will work. In order for the move-into-place semantics to work correctly, the two files need to be on the same partition, which is often not the case for temp directories and profiles. I'd also rather that we not create a bunch of useless temp files if the write fails, so I'd rather remove the file from the last session if it somehow got left around.

> I'm not super stoked about use a string for the buffer, nsTArray is probably better.

I have my reservations about it too. But most of the consumers need either nsCStrings or input streams, which we can transparently construct from nsCStrings. nsTArrays don't allow releasing their storage, so if we use them here, we're going to wind up either having to make copies, or construct dependent strings that we have to carefully manage the lifetimes of (particularly for the input stream case). All of which I'd rather avoid if at all possible.

Strings also have the benefit of shared storage and reference counting, so I think we're much less likely to run into memory issues using them than using arrays.

> I'm a little concerned about the overall memory overhead here. Do you have measurements?

No, I was hoping to get measurements from the memory reporters, but all of the data is freed by by the time startup finishes. I could disable the data freeing after the first read and check again, but I do know that it's a fairly small amount of data, most of which ends up being freed quickly enough that we shouldn't wind up with significantly higher peak heap size with this than we do without it.

> Why don't we notify here?

Because we already notified at the end of the block above, so anything waiting on the result code for this would have already been notified.

> Why does this need to be protected by the lock?

It probably doesn't.

> You can just call `clear` right?

No. Only AutoCleanLinkedList has a clear method, and it deletes all of its elements when it removes them. We need the elements to stay alive, though, so we can't use it.

> Should a lock be held here?

No. The only time this hash is touched off-main-thread is while AutoBeginReading is held, during which time this should never be called (I can add some assertions for that). After that, everything that happens off-thread uses a separate linked list, and doesn't touch the hash.

> The amount of indirection here is a bit odd, you create a new nsIFile with the path just to extract the path out again in the chained ReadFile call.

Umm... yeah, these two methods are completely backwards.

> What's a case where there's no base zip? I'm not even sure GetBaseZip is a thing.

There are two ways to construct a FileLocation for a zip archive. One is to pass it the archive, and the other is to pass it a zip file, and have it open the archive when someone attempts to read it.

We only care about the first case, where the FileLocation is constructed using one of the Omnijar archives, which is what happens for built-in chrome.manifest files in packed builds.

> Perhaps we should log this so that we know there's a type we should add?

Well, I know one type that we should (arguably) add is files in extension XPIs, which means that logging here would generate a lot of noise. I may handle that case as a follow-up, but that means a lot of tricky interactions with the jar: cache to avoid opening those archives multiple times, so if we decide it's worth doing, I think it's worth doing in a separate bug.

> nsCString doesn't actually have a move constructor, using a nsTArray might be a better option. In general I'd prefer if all of these `Read` functions returned an nsTArray (or vector or RangedPtr or whatever).

I know, but it its copy constructor is relatively cheap, and handles this well enough with reference counting. I could remove the Move(), but if a slightly-more-efficient Move constructor is added at some point, than this should begin moving rather than copying without any further changes.

> `IsEmpty` is probably fine, I don't think we should be caching empty files at all.

Well, the thing is that if we don't cache the fact that the file is empty, we'll wind up reading it again synchronously when it's first accessed. It probably doesn't happen much in practice, but I'd rather handle it cleanly if it does.

> We should flush the cache at this point right?

Ideally. Sort of. The data for any files that have been read by this point will have already been freed. Any that haven't been read by this point, we could free, but it might still be needed soon after if there are timing quirks.

The actual entries, we need to keep around until we've written the new file list, either way. After that point, we can drop the empty entries. It may make sens to just drop all entires, empty or not, once we've written the file list, though. There probably isn't any reason to worry about unused entries being used after that point.
Comment on attachment 8867440 [details]
Bug 1363482: Part 10 - Preload addonStartup.json off-thread during startup.

https://reviewboard.mozilla.org/r/138994/#review143298

::: toolkit/mozapps/extensions/AddonManagerStartup.cpp:488
(Diff revision 1)
> +  } else if (res.unwrapErr() != NS_ERROR_FILE_NOT_FOUND) {
> +    return res.unwrapErr();
> +  }
>  
>    if (data.IsEmpty() || !ParseJSON(cx, data, locations)) {
>      return NS_OK;

Would it be helpful to log here, with the reason (file not found, no data loaded, could not parse?)
Attachment #8867440 - Flags: review?(rhelmer) → review+
Comment on attachment 8867440 [details]
Bug 1363482: Part 10 - Preload addonStartup.json off-thread during startup.

https://reviewboard.mozilla.org/r/138994/#review143298

> Would it be helpful to log here, with the reason (file not found, no data loaded, could not parse?)

For a read error, we'd throw above. For file not found, we just return null, and don't treat it as an error.

If there's a parse error, ParseJSON sets a pending exception on the JS context, and the function throws even though we return OK. We report that to the console on the JS side. I'm pretty sure I wrote a comment about that at some point...
Comment on attachment 8867436 [details]
Bug 1363482: Part 6 - Add Cu.readFile helper to access the file preloader from JS.

https://reviewboard.mozilla.org/r/138986/#review143384

Ok, I'll defer the overall design discussion to others.

Please rename this API to readFileSynchronously and significantly expand the API comment to answer all the questions I had in this bug. r=me with that.
Attachment #8867436 - Flags: review- → review+
Whiteboard: [qf][triaged] → [qf:p1][triaged]
Comment on attachment 8867439 [details]
Bug 1363482: Part 9 - Preload system stylesheets off-thread during startup.

https://reviewboard.mozilla.org/r/138992/#review144432

::: layout/style/Loader.cpp:1520
(Diff revision 1)
>      else {
>        // either we are loading something inside a document, in which case
>        // we should always have a requestingNode, or we are loading something
>        // outside a document, in which case the loadingPrincipal and the
>        // triggeringPrincipal should always be the systemPrincipal.
> +      auto result = URLPreloader::ReadURI(aLoadData->mURI);

Please write the type out explicitly; URLPreloader::ReadURI isn't really well known enough that I can tell what it returns.  Also add a comment saying why we're calling URLPreloader::ReadURI.

Does the string just contain the raw bytes for the file we've read out of the omnijar (i.e., it hasn't done any decoding into UTF-8 or anything)?  Comments in URLPreloader.h don't say.  If so, this is fine, otherwise I wonder how it would interact with a @charset in the style sheet.  r=me if so.
Attachment #8867439 - Flags: review?(cam) → review+
Comment on attachment 8867437 [details]
Bug 1363482: Part 7 - Preload XULStore.json off-thread during startup.

https://reviewboard.mozilla.org/r/138988/#review145558

Seems legit to me.
Attachment #8867437 - Flags: review?(chutten) → review+
Comment on attachment 8867433 [details]
Bug 1363482: Part 3 - Preload string bundles off-thread during startup.

https://reviewboard.mozilla.org/r/138980/#review150928

r=me
Attachment #8867433 - Flags: review?(erahm) → review+
Comment on attachment 8867434 [details]
Bug 1363482: Part 4 - Preload INI files off-thread during startup.

https://reviewboard.mozilla.org/r/138982/#review150930

r=me assuming my comments are addressed / explained.

::: commit-message-16c64:1
(Diff revision 1)
> +Bug 1363482: Part 4 - Preloade INI files off-thread during startup. r?erahm

nit: ye olde englishe

::: xpcom/base/nsINIParser.cpp:13
(Diff revision 1)
>  #include "nsCRTGlue.h"
>  #include "nsError.h"
>  #include "nsIFile.h"
>  #include "nsINIParser.h"
>  #include "mozilla/FileUtils.h" // AutoFILE
> +#include "mozilla/URLPreloader.h" // AutoFILE

Nit: comment seems incorrect?

::: xpcom/base/nsINIParser.cpp:63
(Diff revision 1)
>  };
>  
>  nsresult
>  nsINIParser::Init(nsIFile* aFile)
>  {
> -  /* open the file. Don't use OpenANSIFileDesc, because you mustn't
> +  auto result = URLPreloader::ReadFile(aFile);

I'm assuming `ReadFile` does the same windows voodoo? Also is it reading in binary mode like this wants?

::: xpcom/base/nsINIParser.cpp
(Diff revision 1)
>      // files are Utf-8 anyway.  Just skip the BOM and process as usual.
> -    buffer = &mFileContents[3];
> -  }
> -
> -#ifdef XP_WIN
> +    mFileContents.Append(aStr);
> +    buffer = mFileContents.BeginWriting() + 3;
> +  } else {
> +    if (StringHead(aStr, 2) == "\xFF\xFE") {
> -  if (flen >= 2 &&

This used to be windows only, is there a reason you've changed that?

::: xpcom/base/nsINIParser.cpp:106
(Diff revision 1)
> -      return NS_ERROR_FAILURE;
> -    }
>  
> -    UniquePtr<char[]> utf8Buffer(new char[flen]);
> -    if (WideCharToMultiByte(CP_UTF8, 0, reinterpret_cast<LPWSTR>(buffer), -1,
> -                            utf8Buffer.get(), flen, nullptr, nullptr) == 0) {
> +      AppendUTF16toUTF8(Substring(str, 1), mFileContents);
> +    } else {
> +      mFileContents.Append(aStr);

I *think* `Append` is going to do a full copy whereas just assigning might be able to share the buffer.

::: xpcom/base/nsINIParser.cpp:109
(Diff revision 1)
> -    if (WideCharToMultiByte(CP_UTF8, 0, reinterpret_cast<LPWSTR>(buffer), -1,
> -                            utf8Buffer.get(), flen, nullptr, nullptr) == 0) {
> +    } else {
> +      mFileContents.Append(aStr);
> -      return NS_ERROR_FAILURE;
>      }
> -    mFileContents = Move(utf8Buffer);
> -    buffer = mFileContents.get();
> +
> +    buffer = mFileContents.BeginWriting();

It would be nice if we could do a read-only buffer and avoid having to make a copy. That would mean switching over from NS_strtok to `Tokenizer` or somethign like that. That's a bit out of scope for what you're doing, can you just file a follow up for that?
Attachment #8867434 - Flags: review?(erahm) → review+
Comment on attachment 8867435 [details]
Bug 1363482: Part 5 - Preload component manifests off-thread during startup.

https://reviewboard.mozilla.org/r/138984/#review150962

r=me

::: xpcom/components/nsComponentManager.cpp:562
(Diff revision 1)
> -  uint32_t len;
> -  FileLocation::Data data;
> -  UniquePtr<char[]> buf;
> -  nsresult rv = aFile.GetData(data);
> -  if (NS_SUCCEEDED(rv)) {
> -    rv = data.GetSize(&len);
> +
> +  auto result = URLPreloader::Read(aFile);
> +  if (result.isOk()) {
> +    nsCString buf(result.unwrap());
> +
> +    ParseManifest(aType, aFile, buf.BeginWriting(), aChromeOnly, aXPTOnly);

Again it would be nice if we could avoid doing a copy here but it's no worse than the old code so meh.
Attachment #8867435 - Flags: review?(erahm) → review+
Comment on attachment 8867434 [details]
Bug 1363482: Part 4 - Preload INI files off-thread during startup.

https://reviewboard.mozilla.org/r/138982/#review150930

> Nit: comment seems incorrect?

Oops...

> I'm assuming `ReadFile` does the same windows voodoo? Also is it reading in binary mode like this wants?

NSPR does, yes. The Windows special casing was only necessary for stdio.

> This used to be windows only, is there a reason you've changed that?

There didn't seem to be a good reason for it, other than the previous use of the win32 API for conversion. UTF-16 with BOMs is typically only used on Windows, but if we see a UTF-16 BOM at the start of a file anywhere else, I don't see what we gain from not treating it as UTF-16.

> I *think* `Append` is going to do a full copy whereas just assigning might be able to share the buffer.

I think you're right, but we need to call BeginWriting in order to tokenize, so we wind up doing a full copy either way.

> It would be nice if we could do a read-only buffer and avoid having to make a copy. That would mean switching over from NS_strtok to `Tokenizer` or somethign like that. That's a bit out of scope for what you're doing, can you just file a follow up for that?

I'm inclined to agree, but, yeah, it seems better as a follow-up.
Comment on attachment 8867438 [details]
Bug 1363482: Part 8 - Preload preferences files off-thread during startup.

https://reviewboard.mozilla.org/r/138990/#review150968

r=me with minor changes.

::: modules/libpref/Preferences.cpp:1068
(Diff revision 1)
> -    offset += amtRead;
> -    if (offset == fileSize) {
> -      break;
> -    }
> +  }
> +
> +  auto data = result.unwrap();

Can you make this `const auto&`?

::: modules/libpref/Preferences.cpp:1233
(Diff revision 1)
>  
>  static nsresult pref_ReadPrefFromJar(nsZipArchive* jarReader, const char *name)
>  {
> -  nsZipItemPtr<char> manifest(jarReader, name, true);
> -  NS_ENSURE_TRUE(manifest.Buffer(), NS_ERROR_NOT_AVAILABLE);
> +  auto result = URLPreloader::ReadZip(jarReader, name);
> +  if (result.isErr()) {
> +    return result.unwrapErr();

I think we still want to return `NS_ERROR_NOT_AVAILABLE` (unless `ReadZip` gives a better error).

::: modules/libpref/Preferences.cpp:1235
(Diff revision 1)
>  {
> -  nsZipItemPtr<char> manifest(jarReader, name, true);
> -  NS_ENSURE_TRUE(manifest.Buffer(), NS_ERROR_NOT_AVAILABLE);
> +  auto result = URLPreloader::ReadZip(jarReader, name);
> +  if (result.isErr()) {
> +    return result.unwrapErr();
> +  }
> +  auto& manifest = result.unwrap();

Can you make this `const auto&`?
Attachment #8867438 - Flags: review?(erahm) → review+
Comment on attachment 8867438 [details]
Bug 1363482: Part 8 - Preload preferences files off-thread during startup.

https://reviewboard.mozilla.org/r/138990/#review150968

> Can you make this `const auto&`?

Yeah. I was assuming ParseBuf needed a non-const buffer in the first iteration, and didn't think to change it after I cleaned things up.
Blocks: 1370658
This is an assigned P1 bug without activity in two weeks. 

If you intend to continue working on this bug for the current release/iteration/sprint, remove the 'stale-bug' keyword.

Otherwise we'll reset the priority of the bug back to '--' on Monday, August 28th.
Keywords: stale-bug
Keywords: stale-bug
Comment on attachment 8867432 [details]
Bug 1363482: Part 2 - Allow pre-loading file and JAR entry contents off-thread during startup.

https://reviewboard.mozilla.org/r/138978/#review180658

Pretty sure this addresses the previous issues, thanks!

::: js/xpconnect/loader/URLPreloader.cpp:529
(Diff revisions 1 - 2)
> -        CacheKey key(CacheKey::TypeAppJar, nsDependentCString(path));
> +        CacheKey key(CacheKey::TypeAppJar, path);
>          return Read(key, readType);
>      }
>  
>      // Not an Omnijar archive, so just read it directly.
> -    FileLocation location(zip, path);
> +    FileLocation location(zip, path.BeginReading());

Sadly this probably needs to use `PromiseFlatCString`
Attachment #8867432 - Flags: review?(erahm) → review+
Comment on attachment 8867431 [details]
Bug 1363482: Part 1 - Fix FileLocation(nsIZipArchive*) and add a GetBaseZip() getter.

https://reviewboard.mozilla.org/r/138976/#review180660
Attachment #8867431 - Flags: review?(erahm) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/466dabdf2c4afe34a1a987f44c584c4abe1f48c4
Bug 1363482: Part 1 - Fix FileLocation(nsIZipArchive*) and add a GetBaseZip() getter. r=erahm

https://hg.mozilla.org/integration/mozilla-inbound/rev/1920f26831d238822ce8473051707aa210136195
Bug 1363482: Part 2 - Allow pre-loading file and JAR entry contents off-thread during startup. r=erahm

https://hg.mozilla.org/integration/mozilla-inbound/rev/fff508de0d197bc4a9a70a89d94937bec2b4051c
Bug 1363482: Part 3 - Preload string bundles off-thread during startup. r=erahm

https://hg.mozilla.org/integration/mozilla-inbound/rev/807cf55e7f45c4a65d573d976b41e0c4c5a2f3e6
Bug 1363482: Part 4 - Preload INI files off-thread during startup. r=erahm

https://hg.mozilla.org/integration/mozilla-inbound/rev/b4a57caae7d31696b3605daf24f385a5f20ecf24
Bug 1363482: Part 5 - Preload component manifests off-thread during startup. r=erahm

https://hg.mozilla.org/integration/mozilla-inbound/rev/1f5bed26622ab351107afa822a7218d6603deec6
Bug 1363482: Part 6 - Add Cu.readFile helper to access the file preloader from JS. r=bholley

https://hg.mozilla.org/integration/mozilla-inbound/rev/191e7abe8ce4f4398b3a19b22e53a4ec34b84a09
Bug 1363482: Part 7 - Preload XULStore.json off-thread during startup. r=chutten

https://hg.mozilla.org/integration/mozilla-inbound/rev/4b9237e1824060357c47315cc04700235bee6179
Bug 1363482: Part 8 - Preload preferences files off-thread during startup. r=erahm

https://hg.mozilla.org/integration/mozilla-inbound/rev/2f14f9759a376c1c9c0173e47b649cab47c9215e
Bug 1363482: Part 9 - Preload system stylesheets off-thread during startup. r=heycam

https://hg.mozilla.org/integration/mozilla-inbound/rev/a9a24a28013bb346dc75044d02d4266842349e6c
Bug 1363482: Part 10 - Preload addonStartup.json off-thread during startup. r=rhelmer
Blocks: 1395604
Depends on: 1396366
Depends on: 1404743
Depends on: 1417842
Depends on: 1418325
Depends on: 1448047
Blocks: 1385755
Performance Impact: --- → P1
Whiteboard: [qf:p1][triaged] → [triaged]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: