Closed Bug 775370 Opened 12 years ago Closed 10 years ago

Don't use PermissionManager to save stuff in nsStrictTransportSecurityService

Categories

(Core :: Security: PSM, defect, P2)

defect

Tracking

()

VERIFIED FIXED
mozilla35
Tracking Status
firefox34 --- verified
firefox35 --- verified

People

(Reporter: mounir, Assigned: keeler)

References

(Blocks 1 open bug, )

Details

Attachments

(2 files, 15 obsolete files)

70.40 KB, patch
keeler
: review+
Details | Diff | Splinter Review
42.44 KB, patch
keeler
: review+
Details | Diff | Splinter Review
security/manager/boot/src/nsStrictTransportSecurityService.cpp is using permission manager to save data but it's actually not related to permissions. We should save that to a global indexeddb for Firefox/Gecko.
No longer blocks: 766057
Depends on: 766057
Camilo and Sid were discussing doing something different for this before.
I spoke with Taras about this, and he recommended using a compressed, flat text file. The amount of data nsStrictTransportSecurityService needs to save is very small. Each entry consists of 253 bytes max for a domain name, 8 bytes for an expiration time, and a few more bytes for some metadata. Adoption of HSTS is not very widespread yet (looks like 1370 out of the Alexa top 1000000 as of March: http://www.veracode.com/blog/2013/03/security-headers-on-the-top-1000000-websites-march-2013-report/ ). Using any kind of off-the-shelf database to store this would be inefficient and overly complex.

The new setup would look like this:
* upon initialization, the service would read in the file and populate an in-memory hash table
* any lookups would use the in-memory table (very fast, particularly compared to what we do now)
* any changes to the data would update the table and spin off an asynchronous atomic write to the storage file

(this would have the added benefit of allowing the service to be thread-safe, given a lock on the in-memory table)
Assignee: nobody → dkeeler
Attached patch patch (obsolete) — Splinter Review
Camilo - when you get a chance, could you review this? This patch transitions the site security service away from the permission manager to the new backend, as described above.
Eventually, the backend file will be gzipped, but that will be at least a separate patch if not a follow-up bug.
There's a lot of refactoring going on here, so let me know if you want me to split this patch up into multiple parts.
Thanks!
Attachment #791468 - Flags: review?(cviecco)
Comment on attachment 791468 [details] [diff] [review]
patch

Garrett - if you could have a look as well, that would be great.
Attachment #791468 - Flags: review?(grobinson)
Comment on attachment 791468 [details] [diff] [review]
patch

Review of attachment 791468 [details] [diff] [review]:
-----------------------------------------------------------------

En excellent start but (besides the issues in each file):

How do we integrate "clear recent history" ?
Should we add some telemetry so that we can have an idea of when compression would be nice?

::: security/manager/boot/src/nsSiteSecurityService.cpp
@@ +41,5 @@
>  static PRLogModuleInfo *
>  GetSSSLog()
>  {
>    static PRLogModuleInfo *gSSSLog;
> +  if (!gSSSLog) {

braces++

@@ +60,2 @@
>    , mExpired(false)
>    , mIncludeSubdomains(false)

once we have PK pinning these would be separated into sts or pinning as pinning has also an include subdomains, and expiration fields.

@@ +111,2 @@
>  
> +#define SSS_STATE_FILE_NAME "sssStateFile.txt"

I would name this StateSecuriyServiceState.txt

@@ +151,5 @@
> +    SSSLOG(("read state line '%s'", line.get()));
> +
> +    // The format of each line is each property mHost, mExpireTime, mStsStatus,
> +    // mIncludeSubdomains of nsSSSHostEntry, separated by a single
> +    // comma each.

This I quite short. and unspecified. For each field specify the set of values possible.
mHost -> hostname in punycode? (ascii) format.

EX:
The state file is a comma separated list of fields. With X fields per line being:
hostname, stsexpiration time, stsStatus, stsIncludesubdomain. 
hostname is the punycode hostname
stsexpiration time is the expiration time in S from the unix epoch? (or something like that).....


Also, pkpin fields contain commas. What would you suggest for them, use ';' for the field separator in the file? (we could then reuse your parser?)

@@ +155,5 @@
> +    // comma each.
> +    nsAutoCString host;
> +    PRTime expireTime = 0;
> +    uint32_t stsStatus = 0;
> +    uint32_t includeSubdomains = false; // sscanf doesn't do bools

I think an explicit 0 is better.. (but I can be shown wrong)

@@ +266,5 @@
> +  mData.GetMutableData(&buf);
> +  NS_ENSURE_TRUE(buf, NS_ERROR_OUT_OF_MEMORY);
> +  rv = stringInputStream->SetData(buf, mData.Length());
> +  NS_ENSURE_SUCCESS(rv, rv);
> +  rv = NS_AsyncCopy(stringInputStream, outputStream, mEventTarget,

Dont you need a lock aroudn the file?
What happens is multple threads try to write the file at the same time?

@@ +276,5 @@
> +
> +nsresult
> +nsSiteSecurityService::AsyncWriteState()
> +{
> +  nsAutoCString output;

In all calls to the asyncwrite state you hae already a lock on the mutex. But it is not obvious that you need it, please document.
Attachment #791468 - Flags: review?(cviecco) → review-
Thanks, Camilo!

(In reply to Camilo Viecco (:cviecco) from comment #5)
> Comment on attachment 791468 [details] [diff] [review]
> How do we integrate "clear recent history" ?

Good call - I added this in.

> Should we add some telemetry so that we can have an idea of when compression
> would be nice?

Maybe this could be a follow-up. If compression is quick and easy, I'll just add it in and we won't need to worry about it anyway.

> Also, pkpin fields contain commas. What would you suggest for them, use ';'
> for the field separator in the file? (we could then reuse your parser?)

I'm not sure - I think we should cross that bridge when we come to it.

> Dont you need a lock aroudn the file?
> What happens is multple threads try to write the file at the same time?

I added a comment explaining this. Short answer is the writes are basically atomic, and the last one wins, which is what we want.
Attached patch patch v2 (obsolete) — Splinter Review
Attachment #791468 - Attachment is obsolete: true
Attachment #791468 - Flags: review?(grobinson)
Attachment #792999 - Flags: review?(cviecco)
Comment on attachment 792999 [details] [diff] [review]
patch v2

Brian - I think you should probably have a look at this. Let me know if it would be useful to meet on vidyo/irc/whatever to explain the goals/motivations behind this patch. Thanks!
Attachment #792999 - Flags: review?(brian)
Comment on attachment 792999 [details] [diff] [review]
patch v2

Review of attachment 792999 [details] [diff] [review]:
-----------------------------------------------------------------

Getting there, thanks for the clarifications on the locking of the file.

::: security/manager/boot/src/nsSiteSecurityService.cpp
@@ +75,4 @@
>  nsSiteSecurityService::nsSiteSecurityService()
> +  : mHostTablesMutex("nsSiteSecurityService::mHostTablesMutex")
> +  , mUsePreloadList(true)
> +  , mWorkerThread(nullptr)

+ mStateFile(nullptr),

@@ +105,4 @@
>  
> +  nsresult rv = NS_GetSpecialDirectory(NS_APP_USER_PROFILE_LOCAL_50_DIR,
> +                                       getter_AddRefs(mStateFile));
> +  if (NS_SUCCEEDED(rv)) {

Since you consider no profile ok, why not change logic here to:

if (NS_FAILED(rv)) {
   SSLOG(("no profile dir, no loading or writing to state is enabled"));
   return NS_OK;
}
rv = mStateFile->AppendNative(.....

@@ +174,5 @@
> +      lossOfData = true;
> +      continue;
> +    }
> +    host.Assign(line.get(), commaIndex);
> +    int32_t matches = PR_sscanf(line.get() + commaIndex, ",%lld,%lu,%lu",

Why lld and not llu for expire time? (curious)

@@ +299,5 @@
> +// This is because we synchronously marshall the data to be written and
> +// then fire off an event to our worker thread to actually do the writing.
> +nsresult
> +nsSiteSecurityService::AsyncWriteState()
> +{

Since mStateFile can be null, why not terminate early here on that case? (ie move condition from the Run(), into here.. in the run it should be error?

@@ +309,5 @@
> +  }
> +  mozilla::RefPtr<nsSSSStateWriter> job(new nsSSSStateWriter(output.get(),
> +                                                             mWorkerThread,
> +                                                             mStateFile));
> +  nsresult rv = mWorkerThread->Dispatch(job, NS_DISPATCH_NORMAL);

mWorkerThread could be null (not initialized), need to check for this.
Attachment #792999 - Flags: review?(cviecco) → review-
Comment on attachment 792999 [details] [diff] [review]
patch v2

Review of attachment 792999 [details] [diff] [review]:
-----------------------------------------------------------------

In general, there seems to be a lot to talk about. Maybe bugzilla/splinter is not the best way to have a conversation about this.

This logic is very much like what is in nsCertOverrideService.cpp. It seems very worthwhile to me to, at least, factor out the common logic between nsCertOverrideService and the SiteSecurityService. Also, it would be good to have an explanation of how nsCertOverrideService will get folded into this.

Also, in nsCertOverrideService, note that that service is restricted to being Init()'d on the main thread and that it does SharedSSLState::NoteCertOverrideServiceInstantiated(). I remember there were good reasons for doing things that way but I don't quite remember what they were. I suggest using hg/mxr blame to investigate why that was done that way and see if the same reasoning applies to this service too.

You are basically building a persistent, thread-safe hashtable that maps domains to values that avoids IO on the main thread and that understands the profile system. That's a lot of logic, even before we get to any SSL-specific knowledge that this code has. Also, if Taras gave you the recommendation to do this this way, then presumably other people are going to want to do things this way too. Both of these aspects make me thing that it would be useful to split this code into a generic smart persistent async hashtable part and a separate SiteSecurityService part that understands the SSL stuff. (In particular, if this is a good idea for SSL stuff, maybe it is also a good idea to replace the permission manager backend with this new backend in the future.)

::: security/manager/boot/src/nsSiteSecurityService.cpp
@@ +35,5 @@
>  #include "nsSTSPreloadList.inc"
>  
> +#define HSTS_SET 1
> +#define HSTS_UNSET 0
> +#define HSTS_KNOCKOUT 2

nit: use an enum.

@@ +42,5 @@
>  static PRLogModuleInfo *
>  GetSSSLog()
>  {
>    static PRLogModuleInfo *gSSSLog;
> +  if (!gSSSLog) {

This is not thread-safe. Is that intentional? If it isn't intended to be thread-safe, then what thread is it supposed to be on?

I suggest you just initialize the log module in the Init() of the service.

@@ +75,4 @@
>  nsSiteSecurityService::nsSiteSecurityService()
> +  : mHostTablesMutex("nsSiteSecurityService::mHostTablesMutex")
> +  , mUsePreloadList(true)
> +  , mWorkerThread(nullptr)

It isn't necessary to explicitly initialize mStateFile or mWorkerThread because they are nsCOMPtrs that default construct to nullptr.

@@ +114,2 @@
>  
> +  rv = ReadState();

nsSiteSecurityService will usually be initialized on the main thread, close to startup, right? That would make this main-thread IO at startup, which is double-bad. Is there a way to avoid that?

@@ +158,5 @@
> +    NS_ENSURE_SUCCESS(rv, rv);
> +    SSSLOG(("read state line '%s'", line.get()));
> +
> +    // The state file is a comma-separated list of fields per host, separated
> +    // by newlines. Each line consists of, in order:

Why invent a new format for storing HSTS state, when we could instead just store the actual Strict-Transport-Security header value and then parse that with your existing HSTS header parser?

Also, I am having trouble seeing how this file format would be extended to add key pinning and cert override and the other kinds of information, in an unambiguous way. Could you comment on how the file format would be extended for key pinning?

@@ +234,5 @@
> +
> +  return NS_OK;
> +}
> +
> +void nsSSSStateWriteFinished(void *aUnused, nsresult aRv)

nit: don't use the nsXXX prefix for new code. Put it in the mozilla namespace instead, unprefixed.

@@ +296,5 @@
> +}
> +
> +// When AsyncWriteState is called, the lock on mHostTable must be held.
> +// This is because we synchronously marshall the data to be written and
> +// then fire off an event to our worker thread to actually do the writing.

I find it is more obvious that such conditions are satisfied if you add a const MutexAutoLock & /*proofOfLock*/ witness argument to the function. Then there's almost zero chance that the function will ever get called without the lock being held, and it is easy to reviewers to see this.

@@ +362,5 @@
> +    entry->mHSTSExpireTime = expiretime;
> +    entry->mHSTSStatus = HSTS_SET;
> +    entry->mHSTSIncludeSubdomains = aIncludeSubdomains;
> +    if (!isPrivate) {
> +      AsyncWriteState();

Is it really necessary to write the data to disk as soon as we change the HSTS information? Usually this will happen when we are in a performance-sensitive area. Also, since every response from a server will probably have the Strict-Transport-Security flag set, wouldn't we be writing to this file over and over? It seems like it would be better to either flush the file out periodically or near shutdown (or both).
Attachment #792999 - Flags: review?(brian) → review-
Thank you for the reviews. I think this would benefit from a more formal design document, which will be coming soon.
Attached patch patch v3 (obsolete) — Splinter Review
This patch implements the linked design document. It doesn't yet convert nsCertOverrideService to use DataStorage (previously "SimpleStorage" - what I'm calling the new utility). That will not be a trivial amount of work and should be in a follow-up bug.
Attachment #792999 - Attachment is obsolete: true
Attachment #796159 - Flags: review?(brian)
Attachment #796159 - Flags: review?(grobinson)
Comment on attachment 796159 [details] [diff] [review]
patch v3

Review of attachment 796159 [details] [diff] [review]:
-----------------------------------------------------------------

Why make so generic? protocol buffers is now out of scope? What is the forward compat plan?
Comment on attachment 796159 [details] [diff] [review]
patch v3

Review of attachment 796159 [details] [diff] [review]:
-----------------------------------------------------------------

I would prefer to use protocol buffers and not make a generic solution yet (particularly if we plan to go to protocol buffers anyway).

::: security/manager/boot/src/DataStorage.cpp
@@ +111,5 @@
> +    rv = lineInputStream->ReadLine(line, &moreToRead);
> +    NS_ENSURE_SUCCESS(rv, rv);
> +
> +    // The backing file consist of a list of <key>,<value> pairs,
> +    // separated by newlines.

I allow comments on the file by allowing a '#' at the beggning of a line mean ignore me.

@@ +113,5 @@
> +
> +    // The backing file consist of a list of <key>,<value> pairs,
> +    // separated by newlines.
> +    nsCString key, value;
> +    int32_t commaIndex = line.FindChar(',', 0);

How do you handle escaping a value that has a ',' character?
and if your first external consumer is the certoverride why not make the separator '\t' as they do so you dont
have compatibiluty issues and can reuse their current file?
Attachment #796159 - Flags: review?(cviecco) → review-
Comment on attachment 796159 [details] [diff] [review]
patch v3

Review of attachment 796159 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Camilo Viecco (:cviecco) from comment #15)
> I would prefer to use protocol buffers and not make a generic solution yet
> (particularly if we plan to go to protocol buffers anyway).

I suggest that we file a follow-up bug to switch to protocol buffers. It isn't urgent to do that and in fact I don't know it even makes sense because this is just a key-value store. IMO, YAGNI applies to the protocol buffers argument, based on what we anticipate using this for. If future needs go beyond out anticipation, then we can deal with it in the future.

> I allow comments on the file by allowing a '#' at the beggning of a line
> mean ignore me.

We don't need to support comments because these files are not intended to be human-written or human-read. Also, these files are intended to be compact (efficient to read/write). Also, it is hard to preserve comments when rewriting a file.

> How do you handle escaping a value that has a ',' character?
> and if your first external consumer is the certoverride why not make the
> separator '\t' as they do so you dont
> have compatibiluty issues and can reuse their current file?

This last idea of being able to reuse the existing certoverride.txt file with this is good if we can do it.

::: browser/base/content/sanitize.js
@@ +421,5 @@
> +
> +        // Clear site security settings
> +        var sss = Components.classes["@mozilla.org/ssservice;1"]
> +                            .getService(Components.interfaces.nsISiteSecurityService);
> +        sss.clearState(Components.interfaces.nsISiteSecurityService.HEADER_HSTS);

Should the clearState parameter be a bitfield so that you can coalesce multiple calls to clearState into one?

Also, IIRC, there is an expectation that when data is cleared the clearing (on disk) is done before the dialog box closes, so that the user doesn't accidentally close the browser before all the data has actually been cleared on disk. So, do we need a "synchronous" flag to handle that?

::: security/manager/boot/src/DataStorage.cpp
@@ +111,5 @@
> +    rv = lineInputStream->ReadLine(line, &moreToRead);
> +    NS_ENSURE_SUCCESS(rv, rv);
> +
> +    // The backing file consist of a list of <key>,<value> pairs,
> +    // separated by newlines.

No need for comments because these files are not intended to be human-readable or human-writable. (Remember that eventually we want to compress them.)

@@ +113,5 @@
> +
> +    // The backing file consist of a list of <key>,<value> pairs,
> +    // separated by newlines.
> +    nsCString key, value;
> +    int32_t commaIndex = line.FindChar(',', 0);

The important thing is that the **key** doesn't have a comma. I think it is safe to assume that the keys are always domain names or URLs at the most. That means that we just need to find a separator that is URL-safe such as ' ' (space).
Attachment #796159 - Flags: review- → review?(cviecco)
Attached patch patch v4 (obsolete) — Splinter Review
I agree about punting on protobuf.
Changed DataStorage to take an optional delimiter to make the nsCertOverrideService transition easier.
Changed clearState to clearAll (with no arguments) because that's really what it does. This causes DataStorage to synchronously remove the backing file, so the clear-on-shutdown case is handled.
Also, as an FYI: I'm intending to introduce a more powerful enumeration-based interface to nsISiteSecurityService and DataStorage later, but that's not necessary now.
Attachment #796159 - Attachment is obsolete: true
Attachment #796159 - Flags: review?(grobinson)
Attachment #796159 - Flags: review?(cviecco)
Attachment #796159 - Flags: review?(brian)
Attachment #796936 - Flags: review?(cviecco)
Attached patch patch v5 (obsolete) — Splinter Review
This is the patch that includes the changes to only access Preferences on the main thread and to remove the NS_NewURI call.
Attachment #796936 - Attachment is obsolete: true
Attachment #796936 - Flags: review?(cviecco)
Attachment #796936 - Flags: review?(brian)
Attachment #797589 - Flags: review?(brian)
Comment on attachment 797589 [details] [diff] [review]
patch v5

Review of attachment 797589 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/manager/ssl/tests/unit/test_sss_savestate.js
@@ +60,5 @@
> +
> +
> +  do_test_pending();
> +  Services.prefs.setIntPref("test.datastorage.writetimer", 100);
> +  do_timeout(1000, checkStateWritten);

Is there a way to do this without the timeout? by adding an explicit syncronous push to the site security service? or allowing a writetemer=-1 value mean do sync writes?
Attachment #797589 - Flags: review?(cviecco) → review-
This patch has DataStorage notify observers when it has written to its file. The way it differentiates between different DataStorage instances is by sending the name of the file it wrote as data in the notification.
This patch also fixes a bug where the wrong table was being cleared after observing "last-pb-context-exited".
This will be folded in with the main patch before check-in; this is just for review purposes.
Attachment #798025 - Flags: review?(cviecco)
Attachment #798025 - Flags: review?(cviecco) → review+
Comment on attachment 797589 [details] [diff] [review]
patch v5

Review of attachment 797589 [details] [diff] [review]:
-----------------------------------------------------------------

I stopped about 2/3 through the review because I think we need to answer the questions of host vs. origin vs. URI. HSTS is defined to be per-host but I don't think key pinning or must-staple should be defined that way. What do you think?

::: security/manager/boot/src/DataStorage.cpp
@@ +1,1 @@
> +/* This Source Code Form is subject to the terms of the Mozilla Public

nit: VIM/Emacs mode line.

@@ +24,5 @@
> +DataStorage::DataStorage(const nsCString& aFilename, char aDelimiter)
> +  : mMutex("DataStorage::mMutex")
> +  , mFilename(aFilename)
> +  , mDirty(false)
> +  , mDelimiter(aDelimiter)

Why does the delimiter need to be a parameter? Can't we always just use '\t'?

@@ +29,5 @@
> +{
> +}
> +
> +nsresult
> +DataStorage::Init() {

nit: brace on its own line.

@@ +52,5 @@
> +    rv = mBackingFile->AppendNative(mFilename);
> +    NS_ENSURE_SUCCESS(rv, rv);
> +  }
> +
> +  rv = ReadData();

This will do I/O on the main thread. We should avoid doing that, if practical. If not so practical, then add a comment about this problem and file a follow-up bug. One way to solve the problem would be to kick off an async read and then have DataStorage::Get block while it waits for that thread to do the reading.

@@ +63,5 @@
> +    os->AddObserver(this, "profile-do-change", false);
> +  }
> +
> +  rv = NS_NewThread(getter_AddRefs(mWorkerThread));
> +  NS_ENSURE_SUCCESS(rv, rv);

Is it important to allocate the thread now? The vast majority of the time, this thread won't be doing anything. Could we just create the thread when we actually want to run it?

@@ +65,5 @@
> +
> +  rv = NS_NewThread(getter_AddRefs(mWorkerThread));
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  rv = SetupWriteTimer();

Similarly, should we lazily create the timer? The fewer things we do in Init(), the better, since this happens at startup.

@@ +77,5 @@
> +}
> +
> +nsresult
> +DataStorage::ReadData()
> +{

Which thread will this run on?

If you are depending on it running only on the main thread, please check for that here.

Also, you need to be holding the mutex here, or explain why it isn't necessary to hold the mutex.

@@ +90,5 @@
> +  nsresult rv = mBackingFile->Exists(&exists);
> +  NS_ENSURE_SUCCESS(rv, rv);
> +  if (!exists) {
> +    return NS_OK;
> +  }

This block is unnecessary. The calls below will fail with a specific error code if the file does not exist. You can just check that error code.

@@ +138,5 @@
> +  nsCString value;
> +  bool foundValue = false;
> +  if (aType == DataStorage_Private) {
> +    foundValue = mPrivateDataTable.Get(aKey, &value);
> +  }

I believe that private browsing mode is NOT supposed to fall back to using non-private data, so we shouldn't check mTemporaryDataTable or mPersistentDataDable if the caller is looking for private data.

@@ +158,5 @@
> +{
> +  MutexAutoLock lock(mMutex);
> +
> +  // The key must be a non-empty string containing no instances of the
> +  // delimiter (a space by default)

If we need to have parameterized delimiters then we shouldn't have a default.

@@ +161,5 @@
> +  // The key must be a non-empty string containing no instances of the
> +  // delimiter (a space by default)
> +  NS_ENSURE_TRUE(!aKey.IsEmpty(), NS_ERROR_UNEXPECTED);
> +  int32_t delimiterIndex = aKey.FindChar(mDelimiter, 0);
> +  NS_ENSURE_TRUE(delimiterIndex < 0, NS_ERROR_UNEXPECTED);

NS_ENSURE_ARG(!aKey.isEmpty());
NS_ENSURE_ARG(delimiterIndex < 0);

Then the result will be NS_ERROR_INVALID_ARG.

@@ +174,5 @@
> +    case DataStorage_Private:
> +      mPrivateDataTable.Put(aKey, aValue);
> +      break;
> +    default:
> +      NS_ERROR("given bad DataStorage storage type");

return NS_ERROR_INVALID_ARG;

@@ +178,5 @@
> +      NS_ERROR("given bad DataStorage storage type");
> +  }
> +
> +  if (aType == DataStorage_Persistent) {
> +    mDirty = true;

Here is where you could create the timer lazily.

@@ +213,5 @@
> +{
> +  // The check for mBackingFile being initialized has been hoisted to
> +  // AsyncWriteData (which is the only thing that kicks off this writer).
> +  // If it turns out that mBackingFile is not initialized, something is wrong.
> +  NS_ASSERTION(mBackingFile, "mBackingFile uninitialized unexpectedly");

MOZ_ASSERT(mBackingFile);

@@ +246,5 @@
> +  closure->aString.Append(aKey);
> +  closure->aString.Append(closure->aChar);
> +  closure->aString.Append(aValue);
> +  closure->aString.Append('\n');
> +  return PLDHashOperator::PL_DHASH_NEXT;

This uses unbounded memory and this file will continuosly grow over time. What can we do about that?

One thing we might do is maintain a "last used" timestamp on each entry and then store only the X most recently used entries.

What does permission manager do?

@@ +253,5 @@
> +// When AsyncWriteData is called, the lock for mMutex must be held.
> +// This is because we synchronously marshall the data to be written and
> +// then fire off an event to our worker thread to actually do the writing.
> +nsresult
> +DataStorage::AsyncWriteData()

nsresult
DataStorage::AsyncWriteData(const MutexAutoLock & /*proofOfLock*/)

This way, the compiler will stop you from ever calling this method without holding the lock due to forgetfulness.

@@ +272,5 @@
> +  if (entries == 0) {
> +    mDirty = false;
> +    return NS_OK;
> +  }
> +

This is the place where you can lazily create the thread.

@@ +295,5 @@
> +  if (mBackingFile) {
> +    bool exists = false;
> +    rv = mBackingFile->Exists(&exists);
> +    NS_ENSURE_SUCCESS(rv, rv);
> +    if (exists) {

No need to check whether the file exists before you delete it--just delete it and ignore the "file does not exist" error code.

Clear() will race with the asynchronous write of the file. You need to have Clear() cancel any pending asynchronous write. Otherwise, the data will get written after the Clear(), which is wrong.

@@ +322,5 @@
> +  NS_ENSURE_SUCCESS_VOID(rv);
> +}
> +
> +// The default time between writes, in milliseconds.
> +#define DATA_STORAGE_DEFAULT_WRITE_TIMER 60000

I heard that many (most?) browser sessions are less than 15 min. With this choice, you will almost always be forcing the data to be flushed to disk at shutdown. A smaller interval (~5 min) is probably better.

Also, the timer should get canceled when !mDirty.

@@ +341,5 @@
> +  }
> +
> +  mTimer = do_CreateInstance("@mozilla.org/timer;1", &rv);
> +  NS_ENSURE_SUCCESS(rv, rv);
> +  uint32_t writeTimer = Preferences::GetInt("test.datastorage.writetimer",

Is this pref needed for testing? If so, add a comment to that effect. Otherwise, remove the pref?

@@ +362,5 @@
> +  nsresult rv = NS_OK;
> +  if (strcmp(topic, "last-pb-context-exited") == 0) {
> +    mTemporaryDataTable.Clear();
> +  } else if (strcmp(topic, "profile-before-change") == 0) {
> +    rv = AsyncWriteData();

You need to wait for the async write to finish, because once we return you are not allowed to write to the profile directory any more. One way of doing so is to use the technique used by nsPSMBackgroundThread::requestExit to spin the event loop while the write is happening.

@@ +364,5 @@
> +    mTemporaryDataTable.Clear();
> +  } else if (strcmp(topic, "profile-before-change") == 0) {
> +    rv = AsyncWriteData();
> +    NS_ENSURE_SUCCESS(rv, rv);
> +  } else if (strcmp(topic, "profile-do-change") == 0) {

Need to indicate how you know that we're not doing an async write during this event.

@@ +376,5 @@
> +      NS_ENSURE_SUCCESS(rv, rv);
> +    }
> +    rv = ReadData();
> +    NS_ENSURE_SUCCESS(rv, rv);
> +  } else if (strcmp(topic, NS_PREFBRANCH_PREFCHANGE_TOPIC_ID) == 0) {

Why is this necessary?

::: security/manager/boot/src/DataStorage.h
@@ +1,1 @@
> +/* This Source Code Form is subject to the terms of the Mozilla Public

nit: VIM/Emacs modelines.

@@ +21,5 @@
> + * and DataStorage_Private are not saved. Data stored as DataStorage_Private
> + * masks all other data stored by the same key. Similarly, DataStorage_Temporary
> + * masks data stored as DataStorage_Persistent by the same key. However,
> + * data stored as DataStorage_Private is only visible from other private
> + * contexts.

As I noted above, private browsing data shouldn't just "mask" the other data. The two sets of data should be disjoint. At least, that's how I understand Ehsan's description of things.

@@ +49,5 @@
> +
> +/**
> + * DataStorage is a generic narrow string-based hash map that persists data
> + * on disk and handles persistent, temporary, and private data (see above).
> + * It also handles profile directory changes.

Need to add "thread-safe." Describe how/when data is flushed to disk (on a timer, and at shutdown) and how we prevent the file from growing without bound.

@@ +65,5 @@
> +  nsCString Get(const nsCString& aKey, DataStorageType aType);
> +  nsresult Put(const nsCString& aKey, const nsCString& aValue,
> +               DataStorageType aType);
> +  nsresult Clear();
> +  void DoTimerCallback();

DoTimerCallback should be private.

@@ +72,5 @@
> +  nsresult AsyncWriteData();
> +  nsresult ReadData();
> +  nsresult SetupWriteTimer();
> +
> +  Mutex mMutex;

Document which members mMutex is protecting.

::: security/manager/boot/src/nsSiteSecurityService.cpp
@@ +46,5 @@
>  #define SSSLOG(args) PR_LOG(GetSSSLog(), 4, args)
>  
>  ////////////////////////////////////////////////////////////////////////////////
>  
> +SiteSecurityState::SiteSecurityState(nsCString& aStateString)

IMO, this shouldn't be a constructor, but rather a function that returns a SiteSecurityState using the other constructor.

@@ +49,5 @@
>  
> +SiteSecurityState::SiteSecurityState(nsCString& aStateString)
> +  : mHSTSExpireTime(0)
> +  , mHSTSState(SecurityPropertyUnset)
> +  , mHSTSIncludeSubdomains(false)

key pinning and must-staple will also have this triple structure. I would expect that something like this would be better:

class State {
  PRTime mExpires;
  State mState;
  bool mIncludeSubdomains;
};

class KeyPin : public State { ... };

class SiteSecurityState {
  State hsts;
  State mustStaple;
  KeyPin pinning;
};

@@ +53,5 @@
> +  , mHSTSIncludeSubdomains(false)
> +{
> +  uint32_t hstsIncludeSubdomains = 0; // PR_sscanf doesn't handle bools.
> +  int32_t matches = PR_sscanf(aStateString.get(), "%lld,%lu,%lu",
> +                              &mHSTSExpireTime, &mHSTSState,

I am not sure it is OK to make assumptions about the size of an enum here. I think you need to do something similar to what you did with hstsIncludeSubdomains.

@@ +126,5 @@
> +  if (NS_FAILED(rv)) {
> +    mPreloadListTimeOffset = 0;
> +  }
> +  Preferences::AddStrongObserver(this, "test.currentTimeOffsetSeconds");
> +  mSiteStateStorage = new DataStorage(NS_LITERAL_CSTRING(SSS_STATE_FILE_NAME));

Now we're only supposed to use NS_LITERAL_CSTRING with a string literal argument:

NS_LITERLAL_CSTRING("SiteSecurityServiceState.txt")

This will be required/enforced soon.

@@ +171,5 @@
> +  SiteSecurityState siteState(expiretime, SecurityPropertySet, includeSubdomains);
> +  nsAutoCString stateString;
> +  siteState.ToString(stateString);
> +  nsAutoCString hostname;
> +  nsresult rv = GetHost(aSourceURI, hostname);

I'm pretty sure key pinning cannot be done per-host; it has to be done per-origin.

How about must-staple? Should it be per-host or per-origin?

@@ +176,5 @@
> +  NS_ENSURE_SUCCESS(rv, rv);
> +  SSSLOG(("SSS: setting state for %s", hostname.get()));
> +  DataStorageType storageType = isPrivate ? DataStorage_Private :
> +                                            DataStorage_Persistent;
> +  rv = mSiteStateStorage->Put(hostname, stateString, storageType);

If the state is equivalent to "no state" then we should remove the existing entry in the storage instead of adding a replacement, right?

Or, we can't do that because of the HSTS preload list?

@@ +368,5 @@
>  }
>  
>  NS_IMETHODIMP
> +nsSiteSecurityService::IsSecureURI(uint32_t aType, nsIURI* aURI,
> +                                   uint32_t aFlags, bool* aResult)

IMO, we shouldn't have a method that takes a URI like this. Instead, we should have a method that takes an nsIPrincipal. (Depends on the result of the hostname/origin issue.)

@@ +413,5 @@
>  }
>  
>  NS_IMETHODIMP
> +nsSiteSecurityService::IsSecureHost(uint32_t aType, const char* aHost,
> +                                    uint32_t aFlags, bool* aResult)

Similarly here. We don't need both IsSecureHost and IsSecureURI. We just need IsSecurePrincipal.

(Separate bug?)
Attachment #797589 - Flags: review?(brian) → review-
Comment on attachment 798025 [details] [diff] [review]
notify when DataStorage has written

Review of attachment 798025 [details] [diff] [review]:
-----------------------------------------------------------------

Please ask for a review again if you still need this patch to be reviewed after addressing the review comments for the previous patch (in particular, waiting for the write to happen during shutdown.)
Attachment #798025 - Flags: review?(brian)
Thanks for the review.

(In reply to Brian Smith (:briansmith, was :bsmith@mozilla.com) from comment #21)
> Comment on attachment 797589 [details] [diff] [review]
> patch v5
> 
> Review of attachment 797589 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I stopped about 2/3 through the review because I think we need to answer the
> questions of host vs. origin vs. URI. HSTS is defined to be per-host but I
> don't think key pinning or must-staple should be defined that way. What do
> you think?

I'm not sure what key pinning per-origin gets us - the scheme will always be https, so the only difference between a host and an origin would be the port. The only useful thing about that I see would be the ability to connect to different ports and not worry about either sharing keys or having all of your keys in the header. Is this an important use-case? (Similar argument with must-staple.) Maybe it would be best to say that key pinning / must-staple only applies to port 443.

> @@ +246,5 @@
> > +  closure->aString.Append(aKey);
> > +  closure->aString.Append(closure->aChar);
> > +  closure->aString.Append(aValue);
> > +  closure->aString.Append('\n');
> > +  return PLDHashOperator::PL_DHASH_NEXT;
> 
> This uses unbounded memory and this file will continuosly grow over time.
> What can we do about that?

I'm not sure what you mean by unbounded memory.

> One thing we might do is maintain a "last used" timestamp on each entry and
> then store only the X most recently used entries.
> 
> What does permission manager do?

The permission manager removes expired entries, I believe (although there are entries that don't expire).
This kind of improvement can be a follow-up.

> @@ +49,5 @@
> >  
> > +SiteSecurityState::SiteSecurityState(nsCString& aStateString)
> > +  : mHSTSExpireTime(0)
> > +  , mHSTSState(SecurityPropertyUnset)
> > +  , mHSTSIncludeSubdomains(false)
> 
> key pinning and must-staple will also have this triple structure. I would
> expect that something like this would be better:
> 
> class State {
>   PRTime mExpires;
>   State mState;
>   bool mIncludeSubdomains;
> };
> 
> class KeyPin : public State { ... };
> 
> class SiteSecurityState {
>   State hsts;
>   State mustStaple;
>   KeyPin pinning;
> };

Sounds good; I'll do that when I implement must-staple.

> @@ +176,5 @@
> > +  NS_ENSURE_SUCCESS(rv, rv);
> > +  SSSLOG(("SSS: setting state for %s", hostname.get()));
> > +  DataStorageType storageType = isPrivate ? DataStorage_Private :
> > +                                            DataStorage_Persistent;
> > +  rv = mSiteStateStorage->Put(hostname, stateString, storageType);
> 
> If the state is equivalent to "no state" then we should remove the existing
> entry in the storage instead of adding a replacement, right?
> 
> Or, we can't do that because of the HSTS preload list?

Right - what this should do is check in the preload list and only store a knockout if there's a preloaded entry (otherwise, it should remove the entry from the storage).

> @@ +368,5 @@
> >  }
> >  
> >  NS_IMETHODIMP
> > +nsSiteSecurityService::IsSecureURI(uint32_t aType, nsIURI* aURI,
> > +                                   uint32_t aFlags, bool* aResult)
> 
> IMO, we shouldn't have a method that takes a URI like this. Instead, we
> should have a method that takes an nsIPrincipal. (Depends on the result of
> the hostname/origin issue.)
> 
> @@ +413,5 @@
> >  }
> >  
> >  NS_IMETHODIMP
> > +nsSiteSecurityService::IsSecureHost(uint32_t aType, const char* aHost,
> > +                                    uint32_t aFlags, bool* aResult)
> 
> Similarly here. We don't need both IsSecureHost and IsSecureURI. We just
> need IsSecurePrincipal.
> 
> (Separate bug?)

Yes - separate bug for this.
Attached patch patch v6 (obsolete) — Splinter Review
Here is the updated patch. I can make an interdiff if you prefer, but DataStorage has significantly changed, so I'm not sure it would be helpful. If you could get to this as soon as possible, that would be great.
Attachment #797589 - Attachment is obsolete: true
Attachment #798025 - Attachment is obsolete: true
Attachment #801767 - Flags: review?(brian)
No longer depends on: 766057
Brian, this is time-sensitive with respect to our team goals. I realize it's a lot of code, but you've reviewed most of it already. Let me know if there's anything I can do to speed this up for you.
Flags: needinfo?(brian)
Comment on attachment 801767 [details] [diff] [review]
patch v6

Review of attachment 801767 [details] [diff] [review]:
-----------------------------------------------------------------

David, I reviewed just the DataStorage part. I will review the SiteSecurityService part over the weekend. I think it would be good for us to discuss this today (Friday) if you have time.

::: security/manager/boot/src/DataStorage.cpp
@@ +22,5 @@
> +using namespace mozilla;
> +
> +// The default time between data changing and a write, in milliseconds.
> +// (this comes out to 5 minutes)
> +static const uint32_t sDataStorageDefaultTimerDelay = 300000;

nit: this is easy to see if you write it as (5u * 60u * 1000u);

@@ +35,5 @@
> +}
> +
> +DataStorage::~DataStorage()
> +{
> +  if (mWorkerThread) {

When will this condition be true? I would expect that the thread should always have been destroyed by this point.

@@ +56,5 @@
> +  // If this fails, that's ok - we're probably in an xpcshell test.
> +  // It just means that we won't be reading or saving any state until
> +  // the profile directory exists.
> +  nsresult rv = NS_GetSpecialDirectory(NS_APP_USER_PROFILE_50_DIR,
> +                                       getter_AddRefs(mBackingFile));

It doesn't seem like a good idea to me to make this exception if it is done only for tests. Can't we just make all the xpcshell tests that would trigger the usage of this component initialize a profile? That way we are testing a realistic configuration.

Also, I would ask bsmedberg if it is ever the case that an XPCOM component will get initialized without NS_APP_USER_PROFILE_50_DIR being set. I believe NS_APP_USER_PROFILE_50_DIR is always set before XPCOM is started up, and I believe that we will never change profile directories after startup. This patch can be made simpler if those are both true.

@@ +62,5 @@
> +    rv = mBackingFile->Append(mFilename);
> +    NS_ENSURE_SUCCESS(rv, rv);
> +  }
> +
> +  rv = AsyncReadData(lock);

It seems like we should not call AsyncReadData when NS_GetSpecialDirectory fails.

@@ +66,5 @@
> +  rv = AsyncReadData(lock);
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  nsCOMPtr<nsIObserverService> os = services::GetObserverService();
> +  if (os) {

It seems like we should just fail if we can't get the observer service.

@@ +72,5 @@
> +    os->AddObserver(this, "profile-before-change", false);
> +    os->AddObserver(this, "profile-do-change", false);
> +  }
> +
> +  // For test purposes, we can set the write timer to be very fast.

nit: s/be very fast/fire frequently/?

@@ +75,5 @@
> +
> +  // For test purposes, we can set the write timer to be very fast.
> +  mTimerDelay = Preferences::GetInt("test.datastorage.writetimer",
> +                                    sDataStorageDefaultTimerDelay);
> +  Preferences::AddStrongObserver(this, "test.datastorage.writetimer");

check return value of AddStrongObserver.

@@ +80,5 @@
> +
> +  return NS_OK;
> +}
> +
> +class DataStorage::ReadingDone : public nsRunnable

Add a comment saying that one of these is posted to the main thread when of-the-main-thread file I/O completes.

Can you use NS_NewRunnableMethodWithArg(aDataStorage, DataStorage::FinishReadData, aData) instead of defining this class? AFAICT, you can.

@@ +86,5 @@
> +public:
> +  ReadingDone(DataStorage *aDataStorage, nsCString& aData)
> +    : mDataStorage(aDataStorage)
> +    , mData(aData)
> +  {

MOZ_ASSERT(!NS_IsMainThread());?

@@ +98,5 @@
> +};
> +
> +NS_IMETHODIMP
> +DataStorage::ReadingDone::Run()
> +{

MOZ_ASSERT(NS_IsMainThread());?

@@ +101,5 @@
> +DataStorage::ReadingDone::Run()
> +{
> +  nsresult rv = mDataStorage->FinishReadData(mData);
> +  NS_ENSURE_SUCCESS(rv, rv);
> +  return rv;

It does not do any good to return an error from Run() so you should just return NS_OK.

@@ +124,5 @@
> +NS_IMETHODIMP
> +DataStorage::Reader::Run()
> +{
> +  nsCString data;
> +  nsresult rv = NS_ConsumeStream(mInputStream, UINT32_MAX, data);

This will crash the process--the whole phone in the case of B2G--with OOM if the file is too large. We need to restrict the amount of memory we use in order for this to be acceptable to land. Ask the B2G team how much memory is reasonable for this component to use.

Also, NS_ConsumeStream will, AFAICT, truncate the result if the input file is larger than maxCount bytes. If this is what you want, then make a note of that.

@@ +133,5 @@
> +}
> +
> +// Each line is: <key>\t<score>\t<last accessed time>\t<value>
> +// Where <score> is a uint32_t as a string, <last accessed time> is a
> +// int64_t as a string, and the rest are strings.

nit: IMO, it is better to use nsresult to indicate errors than bool. If you switch to nsresult then you can also use NS_ENSURE_TRUE.

Either way, add a comment that when the function succeeds it returns true (if you stick with boolean) with aKeyOut and aEntryOut set, and otherwise it returns false (if you stick with boolean).

@@ +137,5 @@
> +// int64_t as a string, and the rest are strings.
> +bool
> +DataStorage::ParseLine(nsDependentCSubstring& aLine, nsCString& aKeyOut,
> +                       Entry& aEntryOut)
> +{

I find the organization of this code a little confusing. For example, take the processing of |accessed|. It is split into four separate parts that are interleaved with the processing of three other things that are similarly organized. That makes it hard to see that |accessed| is being completely/properly processed. IMO, it is clearer to put all the processing (including even the declaration of the variables like |accessedIndex|) together.

@@ +181,5 @@
> +    MutexAutoLock lock(mMutex);
> +
> +    // The backing file consists of a list of
> +    //   <key>\t<score>\t<last accessed time>\t<value>
> +    // separated by newlines.

Indicate what happens when the last line is terminated by a newline and/or is not terminated by a newline.

@@ +183,5 @@
> +    // The backing file consists of a list of
> +    //   <key>\t<score>\t<last accessed time>\t<value>
> +    // separated by newlines.
> +    int32_t previousIndex = 0;
> +    int32_t newlineIndex = aData.FindChar('\n', 0);

Nit: Not a big deal, but it is not as efficient to scan for newline and then scan for the tab separators within the newline. It is more efficient to parse each field and then look for the newline, so that you only have to visit each character once.

@@ +200,5 @@
> +      }
> +    }
> +  }
> +
> +  NotifyObservers("DataStorageRead");

TODO: Why isn't this using the runnable defined above?

@@ +202,5 @@
> +  }
> +
> +  NotifyObservers("DataStorageRead");
> +
> +  return (lossOfData ? NS_SUCCESS_LOSS_OF_INSIGNIFICANT_DATA : NS_OK);

nit: you can avoid this extra calculation by replacing |bool lossOfData = false;| with |nsresult rv = NS_OK;| and |lossOfData = true;| with |rv = NS_SUCCESSS_LOSS_OF_INSIGNIFICANT_DATA;|

@@ +215,5 @@
> +    return NS_OK;
> +  }
> +
> +  int64_t fileSize = 0;
> +  nsresult rv = mBackingFile->GetFileSize(&fileSize);

Why not just skip the GetFileSize call completely? AFAIACT, it is unnecessary.

@@ +230,5 @@
> +  }
> +
> +  nsCOMPtr<nsIInputStream> fileInputStream;
> +  rv = NS_NewLocalFileInputStream(getter_AddRefs(fileInputStream), mBackingFile);
> +  NS_ENSURE_SUCCESS(rv, rv);

I think your (rv == NS_ERROR_FILE_NOT_FOUND || rv == NS_ERROR_FILE_TARGET_DOES_NOT_EXIST) should go here.

But, NS_NewLocalFileInputStream can jank because open() can (and often does) block on disk I/O. So, you should do the file opening and closing off the main thread.

@@ +253,5 @@
> +
> +  Entry entry;
> +  bool foundValue = GetInternal(aKey, &entry, aType, lock);
> +  if (!foundValue) {
> +    return EmptyCString();

You need to document that it is not possible to distinguish between missing data and the empty string.

It seems wrong that we have to do two hash table lookups here. Shouldn't we use GetEntry to get a pointer to the entry, then update its score, and then returns its value? Then you could avoid copying the entire entry into |entry| when you really only need to copy (return) the value.

@@ +286,5 @@
> +      return &mTemporaryDataTable;
> +    case DataStorage_Private:
> +      return &mPrivateDataTable;
> +      break;
> +    default:

Just MOZ_CRASH here. You are going to crash anyway because GetInternal unconditionally dereference the pointer that you return from this function.

@@ +293,5 @@
> +
> +  return nullptr;
> +}
> +
> +/* static */

Explain the eviction strategy in a comment. It isn't clear how this is supposed to work.

Since you don't/can't pass proofOfLock here, document with a comment that the mutex must be held.

@@ +309,5 @@
> +void
> +DataStorage::MaybeEvictOneEntry(DataStorageType aType,
> +                                const MutexAutoLock& aProofOfLock)
> +{
> +  static const uint32_t sMaxDataEntries = 1024;

This seems to assume that all entries are small or at least roughly the same size. Perhaps we should enforce a maximum entry size if we are going to depend on this assumption.

@@ +330,5 @@
> +  // The key must be a non-empty string containing no instances of '\t'
> +  NS_ENSURE_TRUE(!aKey.IsEmpty(), NS_ERROR_INVALID_ARG);
> +  int32_t delimiterIndex = aKey.FindChar('\t', 0);
> +  NS_ENSURE_TRUE(delimiterIndex < 0, NS_ERROR_INVALID_ARG);
> +  // The value must not contain '\n'

aKey must not contain \n either, right?

@@ +343,5 @@
> +    MaybeEvictOneEntry(aType, lock);
> +  }
> +  entry.mValue = aValue;
> +  nsresult rv = PutInternal(aKey, entry, aType, lock);
> +  NS_ENSURE_SUCCESS(rv, rv);

Same comment as above. There's no need to do two hash operations; you can use GetEntry to get a read/write pointer to the existing entry and save a lookup.

@@ +356,5 @@
> +{
> +  DataStorageTable *table = GetTableForType(aType, aProofOfLock);
> +  table->Put(aKey, aEntry);
> +
> +  if (aType == DataStorage_Persistent && !mTimer) {

TODO: Need to document that the tables and mTimer are protected by mMutex.

@@ +378,5 @@
> +    NS_ENSURE_SUCCESS_VOID(rv);
> +  }
> +}
> +
> +class DataStorage::WritingDone : public nsRunnable

Replace with NS_NewRunnableMethodWithArg?

@@ +402,5 @@
> +
> +class DataStorage::Writer : public nsRunnable
> +{
> +public:
> +  Writer(nsCString& aData, nsIOutputStream *aOutputStream,

Same issue as the reader: if you open the file on the main thread then you will jank the main thread if open() blocks on I/O. Similarly, you will jank the main thread if close() blocks on I/O.

@@ +421,5 @@
> +
> +// Given that all reads of the backing file happen before all writes, and
> +// given that all writes of the backing file are from a single thread, and
> +// given run-to-completion semantics of event dispatch, there should be no
> +// concurrency issues regarding the backing file. Essentially, last writer wins.

It isn't clear that all reads of the backing file happen before all writes. But, it does seem to be the case that, because this component serializes all the reads and writes onto a single thread, that there cannot be concurrent reading and writing. That is something that should be documented in the documentation comment for the mWorkerThread.

@@ +448,5 @@
> +PLDHashOperator
> +DataStorage::WriteDataCallback(const nsACString& aKey, Entry aEntry, void *aArg)
> +{
> +  nsCString *output = (nsCString *)aArg;
> +  output->Append(aKey);

output->Assign(aKey);

@@ +455,5 @@
> +  output->Append('\t');
> +  output->AppendInt(aEntry.mLastAccessed);
> +  output->Append('\t');
> +  output->Append(aEntry.mValue);
> +  output->Append('\n');

Here, you are newline-terminating entries. But, in the comments above, you say entries are newline-separated. I think termination is better than separation so I suggest changing the comments above to say "terminated/termination/termiwhatever".

@@ +480,5 @@
> +
> +  nsCOMPtr<nsIOutputStream> outputStream;
> +  rv = NS_NewLocalFileOutputStream(getter_AddRefs(outputStream),
> +                                   mBackingFile,
> +                                   PR_CREATE_FILE | PR_TRUNCATE | PR_WRONLY);

As I mentioned above, this will jank the current thread (often the main thread).

@@ +520,5 @@
> +  mPrivateDataTable.Clear();
> +
> +  nsresult rv = NS_OK;
> +  if (mBackingFile) {
> +    rv = mBackingFile->Remove(false);

Add a comment:

// XXX Bug XXXXXX: Janks the main thread.

It isn't good for this to Jank but other parts of "clear recent data" also jank so we can punt a while.

@@ +605,5 @@
> +    NS_ENSURE_SUCCESS(rv, rv);
> +    // Run the thread to completion and prevent any further events
> +    // being scheduled to it.
> +    if (mWorkerThread) {
> +      rv = mWorkerThread->Shutdown();

It isn't great to do a synchronous join here; we're not supposed to jank during shutdown either. We should spin the event loop like nsPSMBackgroundThread does. (Note that you can't hold the lock while you spin the event loop.)

@@ +611,5 @@
> +      mWorkerThread = nullptr;
> +    }
> +    mBackingFile = nullptr;
> +    // We'll lose any data stored between these two events.
> +  } else if (strcmp(topic, "profile-do-change") == 0) {

Does this ever happen in practice? If you haven't seen it, then I'd ask bsmedberg if it can ever happen.

@@ +642,5 @@
> +// If it's been less than a day since this entry has been accessed, the score
> +// does not change.
> +// If it's been less than a week since being accessed, the score increases by 2.
> +// If it's been longer than that, the score decreases by 1.
> +// The minimum score is 0. The maximum score is 65536.

Let's talk about how this works, because I don't get it.

::: security/manager/boot/src/DataStorage.h
@@ +41,5 @@
> + *   with the backing filename as the data in the notification when this
> + *   has completed.
> + * - Get/Put/Remove do the expected things. If Get is called before the
> + *   "DataStorageRead" event is observed, an empty value may be unexpectedly
> + *   returned.

This cannot work like this. If we've stored a must-staple entry (or whatever) for a site then we should not ignore that entry, even if it means janking. That is, Get(), at least, should block and I think you might as well make all three operations block on the read.

@@ +51,5 @@
> + *   private data is cleared.
> + * - When DataStorage observes the topic "profile-before-change", all
> + *   persistent data is synchronously written to the backing file. The worker
> + *   thread responsible for these writes is then temporarily disabled to
> + *   prevent further writes to that file.

The worker thread should be permanently disabled at profile-before-change because after that even there is no profile for the thread to read/write data from/to.

@@ +59,5 @@
> + *   mechanism described above).
> + *   This restarts the worker thread, and new data will be written as
> + *   appropriate.
> + *   NB: Any data stored between observing the events "profile-before-change"
> + *   and "profile-do-change" will be lost.

Ask bsmedberg if this can ever happen.

@@ +64,5 @@
> + * - For testing purposes, the preference "test.datastorage.writetimer" can
> + *   be set to cause the asynchronous writing of data to happen more quickly.
> + *   The value in question is in milliseconds.
> + * - When there are no more references to the DataStorage object, it shuts
> + *   down its worker thread.

I see. But, is it possible for this to happen since the DataStorage is a strong observer for the observer service? It seems like the observer service will maintain a reference to the DataStorage forever.

@@ +74,5 @@
> + *   - If the entry has been touched in the past week, its score increases by 2.
> + *   - Otherwise, the entry's score is decreased by 1.
> + *   - The minimum score is 0 and the maximum is 65536.
> + *   - An entry that has a score no more than any other entry in the same
> + *     table is evicted.

Do you mean that the entries with the lowest scores are evicted first, when the table becomes full?

You should document the motivation for doing things this way. We are trying to prevent websites from flooding us with a bunch of unimportant entries that would, without this logic, cause us to evict important entries. We assume that entries that are accessed frequently (daily) over the course of a long period of time are important entries.

Why is the maximum score 65536? It seems like if we used a larger integer, we could probably avoid overflow for most practical purposes.

By the way, I think your logic might have a flaw. Let's say there are two entries, A (score 1000) and B (score 1000) that haven't been accessed for over a week. Now, the user accesses entry A. Then, A's score would become 999 but B's score would stay 1000. But, that doesn't make much sense considering that we just actually accessed A!

What is wrong with simply counting the number of distinct days that the entry has been accessed? If the counter is large enough then overflow wouldn't be an issue in practice.
Attached patch DataStorage patch (obsolete) — Splinter Review
Here is just the DataStorage patch, with the changes we discussed.
In particular:
 - I spoke with bsmedberg about the proper handling of profile notifications and fixed how DataStorage works in that respect
 - I spoke with fabrice about memory usage on b2g, and apparently 1MB in the parent process is "probably ok"
 - Some lingering issues were fixed by following the semantics of the permission manager (e.g. clearing data synchronously vs. asynchronously)

Let me know if there are further issues with this implementation as soon as possible so we can fix or resolve them. Thanks!
Attachment #801767 - Attachment is obsolete: true
Attachment #801767 - Flags: review?(brian)
Attachment #806147 - Flags: review?(brian)
Attached patch nsSiteSecurityService patch (obsolete) — Splinter Review
This is the rest of the changes (have nsSiteSecurityService use DataStorage, and tests).
Attachment #806150 - Flags: review?(brian)
Flags: needinfo?(brian)
Comment on attachment 806147 [details] [diff] [review]
DataStorage patch

Review of attachment 806147 [details] [diff] [review]:
-----------------------------------------------------------------

What is the story for testing this? Is your plan to test this indirectly by testing the strict transport security service?

::: security/manager/boot/src/DataStorage.cpp
@@ +80,5 @@
> +
> +class DataStorage::Reader : public nsRunnable
> +{
> +public:
> +  Reader(DataStorage *aDataStorage)

Use "DataStorage* aDataStorage" to match the Mozilla coding style guide.

@@ +89,5 @@
> +private:
> +  NS_DECL_NSIRUNNABLE
> +
> +  nsresult ParseLine(nsDependentCSubstring& aLine, nsCString& aKeyOut,
> +                     Entry& aEntryOut);

static?

@@ +97,5 @@
> +
> +NS_IMETHODIMP
> +DataStorage::Reader::Run()
> +{
> +  nsresult rv = NS_OK;

No need to initialize rv here. It is better not to initialize variables unnecessarily because then the compiler will help you see what code paths you missed with its "use of uninitialized variable" warnings.

@@ +105,5 @@
> +  {
> +    MutexAutoLock lock(mDataStorage->mMutex);
> +    if (mDataStorage->mBackingFile) {
> +      rv = NS_NewLocalFileInputStream(getter_AddRefs(fileInputStream),
> +                                      mDataStorage->mBackingFile);

This will call open(1) with mDataStorage->mMutex held, which means that anything that acquires the mutex on the main thread will block on the I/O that open(1) does. Is it possible to safely avoid this? I can think of a simple way:

0. acquire lock.
1. copy mDataStorage->mBackingFile into a local variable
2. release lock
3. open file and read its contents into memory
4. acquire lock.
5. check that mDataStorage->mBackingFile is equal to the name of the file we just read. If not, forget the data we just read.
6. parse the read data into mDataStorage.
7. release lock.

In theory, this is still racy because mDataStorage->mBackingFile could change A -> B -> C -> ... -> A  during step 3. But, I don't think it actually matters here.

@@ +110,5 @@
> +      // If we failed for some reason other than the file doesn't exist, bail.
> +      if (NS_FAILED(rv) && rv != NS_ERROR_FILE_NOT_FOUND && // on Windows
> +          rv != NS_ERROR_FILE_TARGET_DOES_NOT_EXIST) {      // on Unix
> +        NS_ENSURE_SUCCESS(rv, rv);
> +        return rv;

I think this is clearer (more concise, at least):

NS_ENSURE_TRUE(NS_SUCCEEED(rv) ||
               rv == NS_ERROR_FILE_NOT_FOUND ||
               rv == NS_ERROR_FILE_TARGET_DOES_NOT_EXIST, rv);

In particular, it seems strange to have "return rv;" after an NS_ENSURE_SUCCESS(rv, rv) since you know rv will be NS_OK at that point.

@@ +125,5 @@
> +  }
> +
> +  // Atomically parse the data and insert the entries read.
> +  {
> +    MutexAutoLock lock(mDataStorage->mMutex);

Should we clear mDataStorage->mPersistentDataTable here? Add a comment why or why not.

@@ +148,5 @@
> +      nsresult parseRV = ParseLine(line, key, entry);
> +      if (NS_SUCCEEDED(parseRV)) {
> +        mDataStorage->mPersistentDataTable.Put(key, entry);
> +      }
> +    } while (newlineIndex >= 0);

is this condition needed/correct? It seems like the "if (newlineIndex < 0) { break; }" takes care of the end of file condition.

@@ +153,5 @@
> +  }
> +
> +  // Notify that Get can proceed.
> +  {
> +    MonitorAutoLock readyLock(mDataStorage->mReadyMonitor);

Why do we need both mReadyMonitor and mMutex? Usually it is a bad idea to have multiple locks in use because it makes us more prone to deadlock.

@@ +162,5 @@
> +
> +  nsCOMPtr<nsIRunnable> job =
> +    NS_NewRunnableMethodWithArg<const char *>(mDataStorage,
> +                                              &DataStorage::NotifyObservers,
> +                                              "DataStorageRead");

Who consumes the "DataStorageRead" notifications? i.e. Why do we have this observer?

@@ +165,5 @@
> +                                              &DataStorage::NotifyObservers,
> +                                              "DataStorageRead");
> +  rv = NS_DispatchToMainThread(job, NS_DISPATCH_NORMAL);
> +  NS_ENSURE_SUCCESS(rv, rv);
> +  return rv;

return NS_OK;

@@ +173,5 @@
> +// Where <score> is a uint32_t as a string, <last accessed time> is a
> +// int64_t as a string, and the rest are strings.
> +// Returns a successful status if the line can be decoded into a key and entry.
> +// Otherwise, an error status is returned and the values assigned to the
> +// output parameters is in an undefined state.

Is <value> allowed to have \t in it? I assume so, but we should document it either way.

@@ +203,5 @@
> +  nsDependentCSubstring valuePart(aLine, valueIndex);
> +
> +  // Now attempt to decode the score part as a uint32_t.
> +  int32_t matches = PR_sscanf(scorePart.BeginReading(), "%lu",
> +                              &aEntryOut.mScore);

Should we be using scorePart.ToInteger() here? Seems safer than mucking with PR_sscanf directly.

@@ +210,5 @@
> +  }
> +
> +  // Now attempt to decode the last accessed time as a int64_t.
> +  matches = PR_sscanf(accessedPart.BeginReading(), "%lld",
> +                      &aEntryOut.mLastAccessed);

ditto.

@@ +218,5 @@
> +
> +  // Limit the length of keys and values (256 and 1024, respectively).
> +  if (keyPart.Length() > 256 || valuePart.Length() > 1024) {
> +    return NS_ERROR_UNEXPECTED;
> +  }

This chunk should be done before the sscanf. Document why we have these limits (to ensure that the dataset is not larger than 1.25MB).

@@ +243,5 @@
> +    NS_ENSURE_SUCCESS(rv, rv);
> +  } else {
> +    mBackingFile = nullptr;
> +  }
> +

Document why we dispatch the job even when mBackingFile is null.

@@ +248,5 @@
> +  RefPtr<Reader> job(new Reader(this));
> +  rv = mWorkerThread->Dispatch(job, NS_DISPATCH_NORMAL);
> +  NS_ENSURE_SUCCESS(rv, rv);
> +
> +  return rv;

return NS_OK;

@@ +265,5 @@
> +      }
> +    }
> +  }
> +
> +  MutexAutoLock lock(mMutex);

Again, it seems bad to have both mReadyMonitor and mMutex. The normal way of doing things is that you grab the monitor, do your while (!mReady) loop, and then, *without letting go of the monitor*, read the data. This way, we don't have to consider the "gap" between releasing mMonitor and acquiring mMutex.

@@ +291,5 @@
> +  bool foundValue = table->Get(aKey, aEntry);
> +  return foundValue;
> +}
> +
> +DataStorage::DataStorageTable *

Since this never returns null, you might as well have the function return a reference instead of a pointer. This makes the non-nullness of the result clearer in the callers.

@@ +293,5 @@
> +}
> +
> +DataStorage::DataStorageTable *
> +DataStorage::GetTableForType(DataStorageType aType,
> +                             const MutexAutoLock& aProofOfLock)

/*aProofOfLock*/ so you don't generate "unused parameter" warnings.

@@ +307,5 @@
> +    default:
> +      MOZ_CRASH("given bad DataStorage storage type");
> +  }
> +
> +  return nullptr;

This is unreachable code. You should just put the MOZ_CRASH outside the switch.

@@ +339,5 @@
> +
> +  DataStorageTable *table = GetTableForType(aType, aProofOfLock);
> +  if (table->Count() >= sMaxDataEntries) {
> +    KeyAndEntry toEvict;
> +    toEvict.mEntry.mScore = 4294967295u;

s/4294967295/numeric_limits<uint32_t>::max()/

@@ +341,5 @@
> +  if (table->Count() >= sMaxDataEntries) {
> +    KeyAndEntry toEvict;
> +    toEvict.mEntry.mScore = 4294967295u;
> +    table->EnumerateRead(EvictCallback, (void *)&toEvict);
> +    table->Remove(toEvict.mKey);

This doesn't work if all entries have the score numeric_limits<uint32_t>::max(). Add a comment explaining why this doesn't matter.

This function is O(n) which means every adding an entry to the table is O(n) instead of O(1). Reading is still O(1). But, since the number of entries is so small, I am not sure that the difference between O(n) and O(1) matters. Probably it would have been better to use a simpler, more memory-efficient sorted data structure with lg(n) additions/removals, which would allow us to store more entries. But, we can improve the efficiency of this if/when we need it.

@@ +369,5 @@
> +  bool exists = GetInternal(aKey, &entry, aType, lock);
> +  if (exists) {
> +    entry.UpdateScore();
> +  } else {
> +    MaybeEvictOneEntry(aType, lock);

Note that now our max memory is approximately 1.25MB per table * 3 types + 1.25MB of data pending write + 1MB of data currently being read from disk = 5MB + hash table overhead.

Should we count the maximum number of entries across all three tables instead of giving them each their own maximum allocation?

Granted, the worst case is probably much, much worse than type typical case.

@@ +381,5 @@
> +
> +nsresult
> +DataStorage::PutInternal(const nsCString& aKey, Entry& aEntry,
> +                         DataStorageType aType,
> +                         const MutexAutoLock& aProofOfLock)

/*aProofOfLock*/

@@ +433,5 @@
> +  {
> +    MutexAutoLock lock(mDataStorage->mMutex);
> +    rv = NS_NewLocalFileOutputStream(getter_AddRefs(outputStream),
> +                                     mDataStorage->mBackingFile,
> +                                     PR_CREATE_FILE | PR_TRUNCATE | PR_WRONLY);

Same issue with opening the file holding the lock as I mentioned with the reader.

@@ +450,5 @@
> +
> +  nsCOMPtr<nsIRunnable> job =
> +    NS_NewRunnableMethodWithArg<const char *>(mDataStorage,
> +                                              &DataStorage::NotifyObservers,
> +                                              "DataStorageWritten");

Same question as the reader event--who observes this event?

@@ +476,5 @@
> +}
> +
> +nsresult
> +DataStorage::AsyncWriteData(const MutexAutoLock& aProofOfLock)
> +{

MOZ_ASSERT(NS_IsMainThread());

@@ +495,5 @@
> +}
> +
> +nsresult
> +DataStorage::ClearTimer(const MutexAutoLock& aProofOfLock)
> +{

MOZ_ASSERT(NS_IsMainThread());

@@ +496,5 @@
> +
> +nsresult
> +DataStorage::ClearTimer(const MutexAutoLock& aProofOfLock)
> +{
> +  if (!mTimer) {

mTimer needs to be protected by the lock, right?

@@ +508,5 @@
> +}
> +
> +nsresult
> +DataStorage::Clear()
> +{

MOZ_ASSERT(NS_IsMainThread());

@@ +602,5 @@
> +    }
> +    // Run the thread to completion and prevent any further events
> +    // being scheduled to it. The thread may need the lock, so we can't
> +    // hold it here.
> +    rv = mWorkerThread->Shutdown();

I will look at this part again tomorrow.

@@ +622,5 @@
> +
> +// Updates this entry's score. Returns true if the score has actually changed.
> +// If it's been less than a day since this entry has been accessed, the score
> +// does not change. Otherwise, the score increases by 1.
> +// The default score is 0. The maximum score is 4294967295.

numeric_limits<uint32_t>::max().

@@ +628,5 @@
> +// unbounded resource use.
> +bool
> +DataStorage::Entry::UpdateScore()
> +{
> +  static const int64_t sOneDayInMicroseconds = int64_t(24 * 60 * 60) * PR_USEC_PER_SEC;

Would it be better to store the last-accessed time with day precision (instead of millisecond precision), and then compare the day-precise times for equality? Seems like the result would be about the same but we'd be storing less information (user visited site sometime today, vs user visited site at precisely date-time T) on disk, which seems like a good thing.

@@ +643,5 @@
> +    return false;
> +  }
> +
> +  // Otherwise, increment the score (but don't overflow uint32_t).
> +  if (mScore < 4294967295u) {

Perhaps we should have static const uint32_t MAX_SCORE = UINT32_MAX and use that everywhere.

It is too error prone to have the 4294967295u constant everywhere (anywhere)--too hard to spot typos.
Attachment #806147 - Flags: review?(brian) → review-
Comment on attachment 806150 [details] [diff] [review]
nsSiteSecurityService patch

Review of attachment 806150 [details] [diff] [review]:
-----------------------------------------------------------------

Please feel free to re-request review of this patch or an interdiff if you think review would be helpful.

It would be helpful to know your plan for adding Must-Staple support to SSS. Will the Must-Staple stuff share entries with the HSTS stuff, or it will have its own entries? Seems like it may need its own entries because of the port?

::: mobile/android/modules/Sanitizer.jsm
@@ +118,5 @@
> +
> +        // Clear site security settings
> +        Cc["@mozilla.org/ssservice;1"]
> +          .getService(Ci.nsISiteSecurityService)
> +          .clearAll();

I suggest you use the exact same code so that it is clearer to the person that refactors all of this to remove the code duplication what is the same and what is different.

::: security/manager/boot/src/nsSiteSecurityService.cpp
@@ +55,5 @@
> +  uint32_t hstsState = 0;
> +  uint32_t hstsIncludeSubdomains = 0; // PR_sscanf doesn't handle bools.
> +  int32_t matches = PR_sscanf(aStateString.get(), "%lld,%lu,%lu",
> +                              &mHSTSExpireTime, &hstsState,
> +                              &hstsIncludeSubdomains);

Do/should we support extra junk after the known stuff for forward compat?

If so, should we store that extra junk in a string that we then append to the end of things in ToString()?

@@ +83,2 @@
>  {
>  }

Isn't this equivalent to the compiler-generated copy constructor? If so we can nuke it.

@@ +90,5 @@
> +  aString.AppendInt(mHSTSExpireTime);
> +  aString.Append(',');
> +  aString.AppendInt(mHSTSState);
> +  aString.Append(',');
> +  aString.AppendInt((uint32_t)mHSTSIncludeSubdomains);

static_cast<uint32_t>

@@ +121,2 @@
>  
> +  mUsePreloadList = Preferences::GetBool("network.stricttransportsecurity.preloadlist", true);

Make sure the code wraps at 80 chars.

@@ +121,4 @@
>  
> +  mUsePreloadList = Preferences::GetBool("network.stricttransportsecurity.preloadlist", true);
> +  Preferences::AddStrongObserver(this, "network.stricttransportsecurity.preloadlist");
> +  mPreloadListTimeOffset = Preferences::GetInt("test.currentTimeOffsetSeconds", 0);

ditto. Line is too long

@@ +172,3 @@
>    bool isPrivate = flags & nsISocketProvider::NO_PERMANENT_STORAGE;
> +  DataStorageType storageType = isPrivate ? DataStorage_Private :
> +                                            DataStorage_Persistent;

nit: I suggest you put the colon under the question mark, because that composes better when you chain multiple things together (requires fixed-width font):

x = a == 1 ? b
  : a == 2 ? c
           : d;

Also, it's consistent with how I do it (though, people complain about how I do it to).

@@ +176,3 @@
>    NS_ENSURE_SUCCESS(rv, rv);
>  
> +  return rv;

return NS_OK;

@@ +190,5 @@
>    NS_ENSURE_SUCCESS(rv, rv);
>  
>    bool isPrivate = aFlags & nsISocketProvider::NO_PERMANENT_STORAGE;
> +  DataStorageType storageType = isPrivate ? DataStorage_Private :
> +                                            DataStorage_Persistent;

nit: ditto.

@@ +440,5 @@
>    // on the host, because the knockout entry indicates "we have no information
> +  // regarding the security status of this host".
> +  bool isPrivate = aFlags & nsISocketProvider::NO_PERMANENT_STORAGE;
> +  DataStorageType storageType = isPrivate ? DataStorage_Private :
> +                                            DataStorage_Persistent;

ditto colon.

@@ +446,5 @@
> +  SiteSecurityState siteState(value);
> +  if (siteState.mHSTSState != SecurityPropertyUnset) {
> +    SSSLOG(("Found entry for %s", host.get()));
> +    if (!siteState.IsExpired(aType) &&
> +        siteState.mHSTSState == SecurityPropertySet) {

If an entry is expired then we should remove it.

In datastorage, we should do this expiration check when we parse and/or add entries. And, probably we should clear expired entries before we try to delete a non-expired entry when we need to make room for a new entry.

::: security/manager/ssl/tests/unit/test_sss_eviction.js
@@ +20,5 @@
> +  // there should only be 1024 entries
> +  do_check_eq(lines.length, 1024);
> +  let foundLegitSite = false;
> +  for (let line of lines) {
> +    if (line.startsWith("frequentlyused.mozilla.org")) {

Use example.org/example.com when possible, and try to avoid using mozilla.org when we aren't testing things specific to mozilla.org.

::: security/manager/ssl/tests/unit/test_sss_readstate.js
@@ +14,5 @@
> +function checkStateRead(aSubject, aTopic, aData) {
> +  do_check_eq(aData, SSS_STATE_FILE_NAME);
> +
> +  do_check_false(gSSService.isSecureHost(Ci.nsISiteSecurityService.HEADER_HSTS,
> +                                         "expired.mozilla.org", 0));

ditto in this file.

::: security/manager/ssl/tests/unit/test_sss_savestate.js
@@ +2,5 @@
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +// The purpose of this test is to see that the site security service properly
> +// writes its state file.

If we allow extra junk at the end of the entry and we try to preserve it, then we should update this test or make a new test to make sure the extra junk gets preserved.

@@ +60,5 @@
> +
> +  for (let i = 0; i < 1000; i++) {
> +    SSService.processHeader(Ci.nsISiteSecurityService.HEADER_HSTS,
> +                            uris[i % (uris.length)],
> +                            "max-age=" + (i * 1000) + (i % 2 == 0 ? "; includeSubdomains" : ""), 0);

IMO, a little bit too clever, but OK.

Should we include an expired entry here?

::: security/manager/ssl/tests/unit/test_sts_preloadlist_perwindowpb.js
@@ +145,5 @@
> +  do_timeout(1250, function() {
> +    do_check_false(gSSService.isSecureHost(Ci.nsISiteSecurityService.HEADER_HSTS,
> +                                           "login.persona.org", 0));
> +    run_next_test();
> +  });

Did you intend to remove the "simulate leaving private browsing mode"?

::: security/manager/ssl/tests/unit/test_sts_preloadlist_selfdestruct.js
@@ +12,5 @@
>    // now simulate that it's 19 weeks later than it actually is
>    let offsetSeconds = 19 * 7 * 24 * 60 * 60;
>    Services.prefs.setIntPref("test.currentTimeOffsetSeconds", offsetSeconds);
>  
> +  do_execute_soon(function() {

You are using do_execute_soon because the SSS will not get notified of the current time offset change synchronously, right? Please add a comment to that effect.

::: security/manager/ssl/tests/unit/xpcshell.ini
@@ +24,5 @@
>  [test_sts_preloadlist_selfdestruct.js]
> +[test_sss_readstate.js]
> +[test_sss_readstate_empty.js]
> +[test_sss_savestate.js]
> +[test_sss_eviction.js]

Malformed and/or truncated input would also be great to have a test for.
Attachment #806150 - Flags: review?(brian) → review+
Thanks for the review. If I haven't responded to something (below), it means I agree and have made the change.

(In reply to Brian Smith (:briansmith, was :bsmith@mozilla.com) from comment #30)
> What is the story for testing this? Is your plan to test this indirectly by
> testing the strict transport security service?

Pretty much. Do you think that's insufficient?

> ::: security/manager/boot/src/DataStorage.cpp
> @@ +105,5 @@
> > +  {
> > +    MutexAutoLock lock(mDataStorage->mMutex);
> > +    if (mDataStorage->mBackingFile) {
> > +      rv = NS_NewLocalFileInputStream(getter_AddRefs(fileInputStream),
> > +                                      mDataStorage->mBackingFile);
> 
> This will call open(1) with mDataStorage->mMutex held, which means that
> anything that acquires the mutex on the main thread will block on the I/O
> that open(1) does. Is it possible to safely avoid this? I can think of a
> simple way:
> 
> 0. acquire lock.
> 1. copy mDataStorage->mBackingFile into a local variable
> 2. release lock
> 3. open file and read its contents into memory
> 4. acquire lock.
> 5. check that mDataStorage->mBackingFile is equal to the name of the file we
> just read. If not, forget the data we just read.
> 6. parse the read data into mDataStorage.
> 7. release lock.
> 
> In theory, this is still racy because mDataStorage->mBackingFile could
> change A -> B -> C -> ... -> A  during step 3. But, I don't think it
> actually matters here.

As I understand it, the backing file can only change once, when the profile is initialized (if DataStorage is created before the profile is available). So, I don't think we have to worry about that case.

> @@ +125,5 @@
> > +  }
> > +
> > +  // Atomically parse the data and insert the entries read.
> > +  {
> > +    MutexAutoLock lock(mDataStorage->mMutex);
> 
> Should we clear mDataStorage->mPersistentDataTable here? Add a comment why
> or why not.

Since it's asynchronous and Put can always proceed, I think it's best to not clear it. I added a comment.

> @@ +153,5 @@
> > +  }
> > +
> > +  // Notify that Get can proceed.
> > +  {
> > +    MonitorAutoLock readyLock(mDataStorage->mReadyMonitor);
> 
> Why do we need both mReadyMonitor and mMutex? Usually it is a bad idea to
> have multiple locks in use because it makes us more prone to deadlock.

The monitor serves a different function from the mutex (i.e. non-busy waiting). I added a comment to never acquire both the monitor and the lock at the same time.

> @@ +162,5 @@
> > +
> > +  nsCOMPtr<nsIRunnable> job =
> > +    NS_NewRunnableMethodWithArg<const char *>(mDataStorage,
> > +                                              &DataStorage::NotifyObservers,
> > +                                              "DataStorageRead");
> 
> Who consumes the "DataStorageRead" notifications? i.e. Why do we have this
> observer?

Tests. I added a comment.

> @@ +369,5 @@
> > +  bool exists = GetInternal(aKey, &entry, aType, lock);
> > +  if (exists) {
> > +    entry.UpdateScore();
> > +  } else {
> > +    MaybeEvictOneEntry(aType, lock);
> 
> Note that now our max memory is approximately 1.25MB per table * 3 types +
> 1.25MB of data pending write + 1MB of data currently being read from disk =
> 5MB + hash table overhead.
> 
> Should we count the maximum number of entries across all three tables
> instead of giving them each their own maximum allocation?
> 
> Granted, the worst case is probably much, much worse than type typical case.

I think it's fine as is. If it's a problem, we can lower the limit.

> @@ +476,5 @@
> > +}
> > +
> > +nsresult
> > +DataStorage::AsyncWriteData(const MutexAutoLock& aProofOfLock)
> > +{
> 
> MOZ_ASSERT(NS_IsMainThread());

I don't understand why this needs to be on the main thread. In particular, Clear() calls it, and Clear() can be called from off the main thread.

> @@ +495,5 @@
> > +}
> > +
> > +nsresult
> > +DataStorage::ClearTimer(const MutexAutoLock& aProofOfLock)
> > +{
> 
> MOZ_ASSERT(NS_IsMainThread());

It turns out it's unsafe to mix what threads are operating on a timer, so I had to re-do the timer handling. Now the timer just gets re-initialized when necessary (and never explicitly canceled). This happens by a dispatch to the worker thread. I added a boolean to keep track of if there are any pending writes that have't been dispatched as a way to not repeatedly re-initialized the timer unnecessarily.

> @@ +508,5 @@
> > +}
> > +
> > +nsresult
> > +DataStorage::Clear()
> > +{
> 
> MOZ_ASSERT(NS_IsMainThread());

Again, why does this need to be on the main thread?
Thanks for the review. Same deal with comments below.

(In reply to Brian Smith (:briansmith, was :bsmith@mozilla.com) from comment #31)
> It would be helpful to know your plan for adding Must-Staple support to SSS.
> Will the Must-Staple stuff share entries with the HSTS stuff, or it will
> have its own entries? Seems like it may need its own entries because of the
> port?

I still don't know what the right thing to do here is. If we use origin for must-staple but stay with host for HSTS, we probably will need separate entities, which is inefficient. It also complicates the implementation in that they have to be handled differently when looking up status (there may be ways to abstract this away, though). We should probably talk about this when we have time.

> ::: security/manager/boot/src/nsSiteSecurityService.cpp
> @@ +55,5 @@
> > +  uint32_t hstsState = 0;
> > +  uint32_t hstsIncludeSubdomains = 0; // PR_sscanf doesn't handle bools.
> > +  int32_t matches = PR_sscanf(aStateString.get(), "%lld,%lu,%lu",
> > +                              &mHSTSExpireTime, &hstsState,
> > +                              &hstsIncludeSubdomains);
> 
> Do/should we support extra junk after the known stuff for forward compat?

I don't think that's worth the effort (YAGNI).

> @@ +83,2 @@
> >  {
> >  }
> 
> Isn't this equivalent to the compiler-generated copy constructor? If so we
> can nuke it.

This constructor takes each parameter piece-wise, not another SiteSecurityState.

> @@ +446,5 @@
> > +  SiteSecurityState siteState(value);
> > +  if (siteState.mHSTSState != SecurityPropertyUnset) {
> > +    SSSLOG(("Found entry for %s", host.get()));
> > +    if (!siteState.IsExpired(aType) &&
> > +        siteState.mHSTSState == SecurityPropertySet) {
> 
> If an entry is expired then we should remove it.
> 
> In datastorage, we should do this expiration check when we parse and/or add
> entries. And, probably we should clear expired entries before we try to
> delete a non-expired entry when we need to make room for a new entry.

DataStorage doesn't know when an entry has expired. This is on purpose, because only the consumer of the data knows what it means, when it has expired, and if it's even safe to remove it when it has expired (for instance, it is not safe to remove an expired entry if it masks one on the HSTS preload list).

> @@ +60,5 @@
> > +
> > +  for (let i = 0; i < 1000; i++) {
> > +    SSService.processHeader(Ci.nsISiteSecurityService.HEADER_HSTS,
> > +                            uris[i % (uris.length)],
> > +                            "max-age=" + (i * 1000) + (i % 2 == 0 ? "; includeSubdomains" : ""), 0);
> 
> IMO, a little bit too clever, but OK.

I made it more clear and less clever.

> ::: security/manager/ssl/tests/unit/test_sts_preloadlist_perwindowpb.js
> @@ +145,5 @@
> > +  do_timeout(1250, function() {
> > +    do_check_false(gSSService.isSecureHost(Ci.nsISiteSecurityService.HEADER_HSTS,
> > +                                           "login.persona.org", 0));
> > +    run_next_test();
> > +  });
> 
> Did you intend to remove the "simulate leaving private browsing mode"?

Yes - that was an error, anyway.

> ::: security/manager/ssl/tests/unit/test_sts_preloadlist_selfdestruct.js
> @@ +12,5 @@
> >    // now simulate that it's 19 weeks later than it actually is
> >    let offsetSeconds = 19 * 7 * 24 * 60 * 60;
> >    Services.prefs.setIntPref("test.currentTimeOffsetSeconds", offsetSeconds);
> >  
> > +  do_execute_soon(function() {
> 
> You are using do_execute_soon because the SSS will not get notified of the
> current time offset change synchronously, right? Please add a comment to
> that effect.

It looks like it's synchronous on linux, but I think it isn't on windows. I'll investigate and comment as necessary.
Attached patch DataStorage patch v2 (obsolete) — Splinter Review
Addressed comments. Re-requesting review. I realize you have a lot of other things going on right now, but I'd like to get this finished up soon so we can move on to OCSP must-staple.
Attachment #806147 - Attachment is obsolete: true
Attachment #815501 - Flags: review?(brian)
Attached patch nsSiteSecurityService patch v2 (obsolete) — Splinter Review
Addressed comments. Carrying over r+.
Attachment #806150 - Attachment is obsolete: true
Attachment #815503 - Flags: review+
Comment on attachment 815501 [details] [diff] [review]
DataStorage patch v2

On the grounds that this patch needs to interact well with the xpcom infrastructure, I think someone who's an expert in xpcom should have a look at it, so r? to dougt. Let me know if anything isn't clear or you want some more background before diving in to this.
Attachment #815501 - Flags: review?(brian) → review?(doug.turner)
Comment on attachment 815501 [details] [diff] [review]
DataStorage patch v2

Review of attachment 815501 [details] [diff] [review]:
-----------------------------------------------------------------

David, sorry for the review lag. Unfortunately, I guess Splinter resets unpublished review comments when you change the review flags so I lost most of my pending comments. Here are the ones I remember. I think it is good to have a fresh set of eyes review this too.

::: security/manager/boot/src/DataStorage.cpp
@@ +34,5 @@
> +DataStorage::DataStorage(const nsString& aFilename)
> +  : mMutex("DataStorage::mMutex")
> +  , mPendingWrite(false)
> +  , mShuttingDown(false)
> +  , mReadyMonitor("DataStorage::mReadyMonitor")

I understand that mMutex and mReadyMonitor are used for different things. But, why not just use mReadyMonitor for both things?

@@ +142,5 @@
> +  }
> +
> +  // Atomically parse the data and insert the entries read.
> +  // Don't clear existing entries - they may have been inserted between when
> +  // this read was kicked-off and when it was run.

If you don't clear existing entries then do you need to merge the existing entries with the entries read from the file. Below, you seem to assume that there is no existing entry for whatever entry you read from the file.

@@ +279,5 @@
> +  // We may have been initialized before the profile is ready (or in a
> +  // context in which there is no profile - e.g. an xpcshell test).
> +  // This is ok - we simply won't save anything or respond with data that
> +  // has previously been saved until a profile is available.
> +  // This is similar to how the permission manager behaves.

If we've recorded that a site is must-staple or HSTS then we need to block connections to that server until we can determine from the must-staple or HSTS state for that host. If it is difficult to fix now, please file a follow-up bug to fix this.

In general, I think we're trying to hard here to make this stuff work for the case where xpcshell didn't create a profile directory for the Necko tests. We should just add do_get_profile() to all the Necko tests so that we can make Get() block on reading the file out of the profile. (Gecko doesn't support running without a profile directory any more, does it?) It is more important that we make this work correctly in the actual product than it is for us to avoid touching the existing xpcshell tests.

@@ +347,5 @@
> +    case DataStorage_Temporary:
> +      return mTemporaryDataTable;
> +    case DataStorage_Private:
> +      return mPrivateDataTable;
> +      break;

nit: break not necessary.

@@ +549,5 @@
> +  mPrivateDataTable.Clear();
> +
> +  // Asynchronously clear the file. This is similar to the permission manager
> +  // in that it doesn't wait to synchronously remove the data from its backing
> +  // storage either.

Sorry, missed this before too. I believe that the function that calls this Clear() method is supposed to block until all the data is deleted, so that the user doesn't shut down the browser without us having deleted the data. This will be especially important when we start using _exit(0) to shut down.
Comment on attachment 815501 [details] [diff] [review]
DataStorage patch v2

David, please answer my questions above and request review from me (with a new patch, if necessary).
Attachment #815501 - Flags: review?(doug.turner)
David, will you have cycles for wrapping this up in the next week or two? Please see comment 38 about getting the review process restarted.
Status: NEW → ASSIGNED
Component: Security → Security: PSM
Flags: needinfo?(dkeeler)
Priority: -- → P2
Target Milestone: --- → mozilla29
I had a look at this today - I'm concerned that this change may require basically every xpcshell test to call do_get_profile, but I'll continue investigating next week. I don't think there's a whole lot of work left, other than that.
Flags: needinfo?(dkeeler)
(In reply to David Keeler (:keeler) from comment #40)
> I had a look at this today - I'm concerned that this change may require
> basically every xpcshell test to call do_get_profile, but I'll continue
> investigating next week. I don't think there's a whole lot of work left,
> other than that.

If that's the case, it could just go in http://mxr.mozilla.org/mozilla-central/source/testing/xpcshell/head.js .
(In reply to Josh Matthews [:jdm] from comment #41)
> (In reply to David Keeler (:keeler) from comment #40)
> > I had a look at this today - I'm concerned that this change may require
> > basically every xpcshell test to call do_get_profile, but I'll continue
> > investigating next week. I don't think there's a whole lot of work left,
> > other than that.
> 
> If that's the case, it could just go in
> http://mxr.mozilla.org/mozilla-central/source/testing/xpcshell/head.js .

https://tbpl.mozilla.org/?tree=Try&rev=7588e1137072

Well, I had a go at that, but it turns out some xpcshell tests (e.g. dom/network/tests/unit/test_tcpserversocket.js, dom/plugins/test/unit/test_bug813245.js, etc. - see the xpcshell test failures) rely on do_get_profile not having been called before they're ran.
Another issue is it looks like nsISiteSecurityService is accessed in the child process (as evidenced by dom/browser-element/mochitest/test_browserElement_oop_ErrorSecurity.html - see the mochitest-2 failures). I imagine we'll need to remote it. I'll look into this tomorrow.
Blocks: hpkp
Attached patch DataStorage patch v2 rebased (obsolete) — Splinter Review
Attachment #815501 - Attachment is obsolete: true
Carrying over r+.
Attachment #815503 - Attachment is obsolete: true
Attachment #8481650 - Flags: review+
Comment on attachment 8481649 [details] [diff] [review]
DataStorage patch v2 rebased

Nathan, this patch makes use of some XPCOM locking/threading/timer primitives, and I would appreciate it if you could take a look (or suggest someone who could). The idea is to have a thread-safe, persistent data storage for small things (e.g. HSTS data) without incurring the overhead of mysql. A good way to start would be to read the documentation in DataStorage.h. (Also, you can have a look at the other patch in this bug to see how it's supposed to be used.) Thanks!
Attachment #8481649 - Flags: review?(nfroyd)
Comment on attachment 8481649 [details] [diff] [review]
DataStorage patch v2 rebased

Monica, how do the gtests look? (See also the xpcshell tests in the other patch.)
Attachment #8481649 - Flags: review?(mmc)
Comment on attachment 8481649 [details] [diff] [review]
DataStorage patch v2 rebased

Review of attachment 8481649 [details] [diff] [review]:
-----------------------------------------------------------------

I have some disruptive comments for you, mostly about encoding, so I didn't review DataStorage.cpp yet. Overall the API looks sane to me.

::: security/manager/boot/src/DataStorage.cpp
@@ +90,5 @@
> +
> +class DataStorage::Reader : public nsRunnable
> +{
> +public:
> +  Reader(DataStorage* aDataStorage)

mark explicit to avoid accidentally constructing

@@ +238,5 @@
> +
> +  return NS_OK;
> +}
> +
> +// Each line is: <key>\t<score>\t<last accessed time>\t<value>

oh, now I see why you disallow \t :( I don't think re-writing with protobufs is too difficult but understand if you don't want to do so at this late stage.

@@ +308,5 @@
> +DataStorage::AsyncReadData(bool& aHaveProfileDir,
> +                           const MutexAutoLock& /*aProofOfLock*/)
> +{
> +  aHaveProfileDir = false;
> +  // Allocate a Reader so that even if it isn' dispatched,

isn't

@@ +748,5 @@
> +  return NS_OK;
> +}
> +
> +static const int64_t sOneDayInMicroseconds = int64_t(24 * 60 * 60) *
> +                                             PR_USEC_PER_SEC;

put all consts together?

::: security/manager/boot/src/DataStorage.h
@@ +23,5 @@
> + * persists data on disk and additionally handles temporary and private data.
> + * However, if used in a context where there is no profile directory, data
> + * will not be persisted.
> + *
> + * Its lifecycle is as follows:

Please summarize this description in fewer sentences and move function-specific comments inline with their declarations.

@@ +44,5 @@
> + *   will block.
> + *   NB: It is not currently possible to differentiate between missing data
> + *   and data that is the empty string.
> + * - When any persistent data changes, a timer is initialized that will
> + *   eventually asynchronously write all persistent data to the backing file.

What happens if we shut down before the timer is complete?

@@ +51,5 @@
> + *   It is possible to receive a "DataStorageWritten" event while there exist
> + *   pending persistent data changes. However, those changes will cause the
> + *   timer to be reinitialized and another "DataStorageWritten" event will
> + *   be sent.
> + * - When DataStorage observes the topic "last-pb-context-exited", all

Inconsistent naming with "DataStorageReady" and "DataStorageWritten." I know these are not your events, but why not name yours consistently with existing ones?

@@ +61,5 @@
> + * - For testing purposes, the preference "test.datastorage.writetimer" can
> + *   be set to cause the asynchronous writing of data to happen more quickly.
> + *   The value in question is in milliseconds.
> + * - To prevent unbounded memory and disk use, the number of entries in each
> + *   table is limited to 1024. Evictions are handled in the following manner:

Summarize to LRU?

@@ +72,5 @@
> + * - NB: Instances of DataStorage have long lifetimes because they are strong
> + *   observers of events and won't go away until the observer service does.
> + *
> + * For each key/value:
> + * - The key must be a non-empty string containing no instances of '\t' or '\n',

Why this limitation? What if someone wants to use it later to encode, say, a protobuf, which is not guaranteed at all not to have embedded whitespace?

@@ +99,5 @@
> +
> +  DataStorage(const nsString& aFilename);
> +
> +  nsresult Init(bool& aDataWillPersist);
> +  nsCString Get(const nsCString& aKey, DataStorageType aType);

Here's where I want comments, complete with param and return value descriptions.

::: security/manager/ssl/tests/gtest/DataStorageTest.cpp
@@ +1,1 @@
> +/* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 2 -*- */

Nice! Thanks for getting this to work :)

@@ +21,5 @@
> +    new DataStorage(NS_LITERAL_STRING("DataStorageTest.txt")));
> +  bool dataWillPersist = false;
> +  EXPECT_EQ(NS_OK, d->Init(dataWillPersist));
> +  EXPECT_TRUE(dataWillPersist);
> +

You may want to break these up into related tests, unless initialization overhead is super-high. If someone breaks a test, it will just show up as SimpleTests failed.

@@ +23,5 @@
> +  EXPECT_EQ(NS_OK, d->Init(dataWillPersist));
> +  EXPECT_TRUE(dataWillPersist);
> +
> +  // Test Put/Get on Persistent data
> +  EXPECT_EQ(NS_OK, d->Put(NS_LITERAL_CSTRING("test"), NS_LITERAL_CSTRING("val"),

NS_LITERAL_CSTRING appears about a million times below. Maybe name and reuse it, in DataStorageTest?

@@ +150,5 @@
> +  }
> +  EXPECT_EQ(entries, 1024);
> +  result = d->Get(NS_LITERAL_CSTRING("key"), DataStorage_Persistent);
> +  EXPECT_TRUE(result.Equals(NS_LITERAL_CSTRING("val")));
> +}

Possible to test shutdown?
Attachment #8481649 - Flags: review?(mmc) → feedback+
OK, just saw comment 17 (sorry for not reading upthread before feedbacking). Does it still apply? Can we shove things like CertOverrideService and expect not to have embedded tabs?
Comment on attachment 8481649 [details] [diff] [review]
DataStorage patch v2 rebased

Review of attachment 8481649 [details] [diff] [review]:
-----------------------------------------------------------------

I have this feeling that when Taras suggested a flat compressed file, he probably envisioned a JSON file.  WebIDL probably makes doing that a lot nicer now than it would have been 12 months ago.  But it's probably not worth trying to rewrite the storage format at this point...

The XPCOM-ish bits look fine; kudos for getting the timer/thread bits correct.  Just a bunch of other small comments.

::: security/manager/boot/src/DataStorage.cpp
@@ +5,5 @@
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#include <limits>
> +
> +#include "DataStorage.h"

Nit: include the self header file before anything else so that we ensure it's not cargo-culting things.

@@ +8,5 @@
> +
> +#include "DataStorage.h"
> +
> +#include "mozilla/Preferences.h"
> +#include "mozilla/RefPtr.h"

Please don't use RefPtr, use nsRefPtr instead.

@@ +13,5 @@
> +#include "mozilla/Services.h"
> +#include "mozilla/unused.h"
> +#include "nsAppDirectoryServiceDefs.h"
> +#include "nsDirectoryServiceUtils.h"
> +#include "nsILineInputStream.h"

I don't think you need this include.

@@ +57,5 @@
> +
> +  MutexAutoLock lock(mMutex);
> +
> +  nsresult rv;
> +  rv = NS_NewThread(getter_AddRefs(mWorkerThread));

Ugh, another helper thread.  Can't we re-purpose another thread for this or something?

@@ +130,5 @@
> +{
> +  nsresult rv;
> +  // Do the clone the file while holding the lock and then open the stream
> +  // while not holding the lock trick (while the file won't change, it's
> +  // not guaranteed that concurrent operations using it are safe).

This comment (and its copy-pasted cousin some ways below) threw me for a little while, both in its wording and what it's trying to explain.  WDYT about something like "Concurrent operations on nsIFile objects are not guaranteed to be safe, so we must clone the file under the lock to obtain a copy we can safely use." ?

@@ +281,5 @@
> +  }
> +
> +  // Now attempt to decode the score part as a uint32_t.
> +  // XXX nsDependentCSubstring doesn't support ToInteger
> +  int32_t integer = ((nsCString)scorePart).ToInteger(&rv);

Please make this nsCString(scorePart).

@@ +290,5 @@
> +    return NS_ERROR_UNEXPECTED;
> +  }
> +  aEntryOut.mScore = (uint32_t)integer;
> +
> +  integer = ((nsCString)accessedPart).ToInteger(&rv);

This bit of code doesn't seem to match up with the comment above saying that "<last accessed time> is a int64_t as a string".  Please fix whichever is incorrect.  And likewise for the cast here.

@@ +305,5 @@
> +}
> +
> +nsresult
> +DataStorage::AsyncReadData(bool& aHaveProfileDir,
> +                           const MutexAutoLock& /*aProofOfLock*/)

I like this idiom, very nice.

@@ +311,5 @@
> +  aHaveProfileDir = false;
> +  // Allocate a Reader so that even if it isn' dispatched,
> +  // the DataStorageReady notification will be fired and Get
> +  // will be able to proceed (this happens in its destructor).
> +  RefPtr<Reader> job(new Reader(this));

nsRefPtr, please.

@@ +408,5 @@
> +  }
> +  return PLDHashOperator::PL_DHASH_NEXT;
> +}
> +
> +static const uint32_t MAX_SCORE = std::numeric_limits<uint32_t>::max();

Why not just use UINT32_MAX, and then you don't need to bother with numeric_limits?

@@ +421,5 @@
> +void
> +DataStorage::MaybeEvictOneEntry(DataStorageType aType,
> +                                const MutexAutoLock& aProofOfLock)
> +{
> +  static const uint32_t sMaxDataEntries = 1024;

This constant should be someplace else so you can use it when reading the data file.

@@ +428,5 @@
> +  if (table.Count() >= sMaxDataEntries) {
> +    KeyAndEntry toEvict;
> +    // If all entries have score MAX_SCORE, this won't actually remove
> +    // anything. This will never happen, however, because having that high
> +    // a score either means someone tampered with the backing file or every

I don't understand the logic of not guarding against somebody tampering with the backing file here.  Perhaps you should explicitly state that it doesn't matter if either of these hypothetical events happen, because then the Remove() call below would try to remove an entry with an empty key, and empty keys are not permitted in the table?  (And note in the key/value validation logic that eviction depends on having only non-empty keys?)

@@ +445,5 @@
> +  MutexAutoLock lock(mMutex);
> +
> +  nsresult rv;
> +  rv = ValidateKeyAndValue(nsDependentCSubstring(aKey, 0),
> +                           nsDependentCSubstring(aValue, 0));

Why explicitly construct dependent substrings of the entire string here?

@@ +590,5 @@
> +
> +  nsCString output;
> +  mPersistentDataTable.EnumerateRead(WriteDataCallback, (void*)&output);
> +
> +  RefPtr<Writer> job(new Writer(output, this));

nsRefPtr, please.

@@ +596,5 @@
> +  if (NS_WARN_IF(NS_FAILED(rv))) {
> +    return rv;
> +  }
> +
> +  mPendingWrite = false;

I realize runnable dispatch failure is unlikely, but it seems slightly better to move this above the check for dispatch failure.  Very minor nit, your call.

@@ +648,5 @@
> +  return NS_OK;
> +}
> +
> +void
> +DataStorage::SetTimer() {

{ on its own line, please.

Maybe MOZ_ASSERT(!NS_IsMainThread()) here?

@@ +654,5 @@
> +
> +  nsresult rv;
> +  if (!mTimer) {
> +    mTimer = do_CreateInstance("@mozilla.org/timer;1", &rv);
> +    unused << NS_WARN_IF(NS_FAILED(rv));

You should return here, otherwise you'll fall through and try to call InitWithFuncCallback with some kind of garbage.

@@ +691,5 @@
> +}
> +
> +void
> +DataStorage::ShutdownTimer()
> +{

Another MOZ_ASSERT(!NS_IsMainThread()) here?

@@ +729,5 @@
> +        if (NS_FAILED(rv)) {
> +          return rv;
> +        }
> +      }
> +      mShuttingDown = true;

Why is it correct to only set this if the shutdown tasks succeeded?  It's not like you're going to get another shot at doing the write and the timer cancellation.  (And, likewise, why only shutdown the worker thread if you succeeded in dispatching the tasks?)

@@ +779,5 @@
> +  if (daysSinceAccessed < 1) {
> +    return false;
> +  }
> +
> +  // Otherwise, increment the score (but don't overflow uint32_t).

Probably better to not mention the datatype here, just in case mScore changes datatypes at some point.

::: security/manager/boot/src/DataStorage.h
@@ +7,5 @@
> +#ifndef mozilla_DataStorage_h
> +#define mozilla_DataStorage_h
> +
> +#include "mozilla/Mutex.h"
> +#include "mozilla/Monitor.h"

Nit: alphabetical order.
Attachment #8481649 - Flags: review?(nfroyd) → feedback+
(In reply to [:mmc] Monica Chew (please use needinfo) from comment #48)

Thanks for the review - I've addressed your changes unless otherwise specified, below.

> Comment on attachment 8481649 [details] [diff] [review]
> @@ +238,5 @@
> > +
> > +  return NS_OK;
> > +}
> > +
> > +// Each line is: <key>\t<score>\t<last accessed time>\t<value>
> 
> oh, now I see why you disallow \t :( I don't think re-writing with protobufs
> is too difficult but understand if you don't want to do so at this late
> stage.

Yeah, I'm also re-thinking using text instead of binary. Also, I imagine in the future it'll need versioning. I think it's fine to fix in a follow-up, though.

> @@ +44,5 @@
> > + *   will block.
> > + *   NB: It is not currently possible to differentiate between missing data
> > + *   and data that is the empty string.
> > + * - When any persistent data changes, a timer is initialized that will
> > + *   eventually asynchronously write all persistent data to the backing file.
> 
> What happens if we shut down before the timer is complete?

On shutdown, we cancel the timer and basically do a synchronous write of any pending data.

> @@ +51,5 @@
> > + *   It is possible to receive a "DataStorageWritten" event while there exist
> > + *   pending persistent data changes. However, those changes will cause the
> > + *   timer to be reinitialized and another "DataStorageWritten" event will
> > + *   be sent.
> > + * - When DataStorage observes the topic "last-pb-context-exited", all
> 
> Inconsistent naming with "DataStorageReady" and "DataStorageWritten." I know
> these are not your events, but why not name yours consistently with existing
> ones?

I'm not sure what existing event names you mean?

> @@ +61,5 @@
> > + * - For testing purposes, the preference "test.datastorage.writetimer" can
> > + *   be set to cause the asynchronous writing of data to happen more quickly.
> > + *   The value in question is in milliseconds.
> > + * - To prevent unbounded memory and disk use, the number of entries in each
> > + *   table is limited to 1024. Evictions are handled in the following manner:
> 
> Summarize to LRU?

Well, it's not exactly least-recently-used. It's more like least-used-overall, where "used" means "the number of discrete 24-hour periods it was accessed in".
(In reply to Nathan Froyd (:froydnj) from comment #50)

Thank you for the review - I've addressed your comments unless otherwise specified, below.

> Comment on attachment 8481649 [details] [diff] [review]
> I have this feeling that when Taras suggested a flat compressed file, he
> probably envisioned a JSON file.  WebIDL probably makes doing that a lot
> nicer now than it would have been 12 months ago.  But it's probably not
> worth trying to rewrite the storage format at this point...

I'm thinking of tweaking it a bit to basically handle small blobs of binary data, but that would be a follow-up (as would be actually doing compression).

> @@ +57,5 @@
> > +
> > +  MutexAutoLock lock(mMutex);
> > +
> > +  nsresult rv;
> > +  rv = NS_NewThread(getter_AddRefs(mWorkerThread));
> 
> Ugh, another helper thread.  Can't we re-purpose another thread for this or
> something?

Maybe, but at the moment this implementation wants to be able to shutdown the thread to prevent further writes. I think sharing another thread with something else would involve changing this situation.

> @@ +428,5 @@
> > +  if (table.Count() >= sMaxDataEntries) {
> > +    KeyAndEntry toEvict;
> > +    // If all entries have score MAX_SCORE, this won't actually remove
> > +    // anything. This will never happen, however, because having that high
> > +    // a score either means someone tampered with the backing file or every
> 
> I don't understand the logic of not guarding against somebody tampering with
> the backing file here.  Perhaps you should explicitly state that it doesn't
> matter if either of these hypothetical events happen, because then the
> Remove() call below would try to remove an entry with an empty key, and
> empty keys are not permitted in the table?  (And note in the key/value
> validation logic that eviction depends on having only non-empty keys?)

I think the worst that can happen if we get a tampered backing file in this way is that we'll constantly have 1025 entries in the persistent table, and every time we try to insert something, that 1025th entry will be the one that gets evicted. I'm not too worried about this because if an attacker can modify this file, they can modify other files and do a lot worse damage.
Attached patch DataStorage patch v3 (obsolete) — Splinter Review
If you have time, I'd appreciate another review on this :) (I can also prepare an interdiff, if that would help.)
Attachment #8481649 - Attachment is obsolete: true
Attachment #8483780 - Flags: review?(nfroyd)
Comment on attachment 8481650 [details] [diff] [review]
nsSiteSecurityService patch v2 rebased

Whoops - wrong patch. Sorry for the bugspam.
Attachment #8481650 - Flags: review?(mmc)
Comment on attachment 8483780 [details] [diff] [review]
DataStorage patch v3

Review of attachment 8483780 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. My main feedback is that the class comment in DataStorage.h is too long. It needs to be short enough and organized so that a skim-reader can pick up on important stuff. I'd like to take one more look before checking in.

::: security/manager/boot/src/DataStorage.cpp
@@ +560,5 @@
> +  // Observed by tests.
> +  nsCOMPtr<nsIRunnable> job =
> +    NS_NewRunnableMethodWithArg<const char*>(mDataStorage,
> +                                             &DataStorage::NotifyObservers,
> +                                             "DataStorageWritten");

data-storage-written, to be consistent with existing event names like last-pb-context-exited?

::: security/manager/boot/src/DataStorage.h
@@ +37,5 @@
> + *   "DataStorageReady" event will still be emitted. This follows semantics
> + *   similar to the permission manager and allows tests that test
> + *   unrelated components to proceed without a profile.
> + * - There are three types of data: persistent, temporary, and private.
> + *   See the enum DataStorageType for more details.

omit this bullet

@@ +40,5 @@
> + * - There are three types of data: persistent, temporary, and private.
> + *   See the enum DataStorageType for more details.
> + * - Get/Put/Remove do the expected things on the specified type of data.
> + *   If Get is called before the "DataStorageReady" event is observed, it
> + *   will block.

omit this bullet and move to the Get declaration below

@@ +52,5 @@
> + *   pending persistent data changes. However, those changes will cause the
> + *   timer to be reinitialized and another "DataStorageWritten" event will
> + *   be sent.
> + * - When DataStorage observes the topic "last-pb-context-exited", all
> + *   private data is cleared.

omit this bullet, the comment is repeated below in the enum section

@@ +60,5 @@
> + *   disabled to prevent further writes to that file (the delayed-write timer
> + *   is cancelled when this happens).
> + * - For testing purposes, the preference "test.datastorage.writetimer" can
> + *   be set to cause the asynchronous writing of data to happen more quickly.
> + *   The value in question is in milliseconds.

better naming like test.datastorage.write_timer_ms let you omit comments like "The value in question is in milliseconds"

@@ +62,5 @@
> + * - For testing purposes, the preference "test.datastorage.writetimer" can
> + *   be set to cause the asynchronous writing of data to happen more quickly.
> + *   The value in question is in milliseconds.
> + * - To prevent unbounded memory and disk use, the number of entries in each
> + *   table is limited to 1024. Evictions are handled in the following manner:

by a modified LRU, then omit the below because you already replicate it in the implementation comments

@@ +107,5 @@
> +  // Initializes the DataStorage. Must be called before using.
> +  // aDataWillPersist returns whether or not data can be persistently saved.
> +  nsresult Init(/*out*/bool& aDataWillPersist);
> +  // Given a key and a type of data, returns a value. Returns an empty string
> +  // if the key is not present for that type of data.

If Get is called before the "DataStorageReady" event is observed, it
44	 *   will block.
Attachment #8483780 - Flags: review?(mmc) → feedback+
Comment on attachment 8483780 [details] [diff] [review]
DataStorage patch v3

Review of attachment 8483780 [details] [diff] [review]:
-----------------------------------------------------------------

I think all my comments were addressed, just two small things.

::: security/manager/boot/src/DataStorage.cpp
@@ +9,5 @@
> +#include "mozilla/Preferences.h"
> +#include "mozilla/Services.h"
> +#include "mozilla/unused.h"
> +#include "nsAppDirectoryServiceDefs.h"
> +#include "nsAutoPtr.h"

This looks unused.

@@ +246,5 @@
> +// int32_t as a string, and the rest are strings.
> +// <value> can contain anything but a newline.
> +// Returns a successful status if the line can be decoded into a key and entry.
> +// Otherwise, an error status is returned and the values assigned to the
> +// output parameters is in an undefined state.

Nit: I think it's slightly more correct to say "are in an undefined state".
Attachment #8483780 - Flags: review?(nfroyd) → review+
Rebased to deal with changes to DataStorage. Carrying over r+.
Attachment #8481650 - Attachment is obsolete: true
Attachment #8484325 - Flags: review+
Attached patch DataStorage patch v3.1 (obsolete) — Splinter Review
Addressed review comments (thanks for the quick turnaround, all!)
https://tbpl.mozilla.org/?tree=Try&rev=181b809b36a8
Attachment #8483780 - Attachment is obsolete: true
Attachment #8484327 - Flags: review?(mmc)
Comment on attachment 8484327 [details] [diff] [review]
DataStorage patch v3.1

Review of attachment 8484327 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8484327 - Flags: review?(mmc) → review+
Attachment #8485029 - Attachment description: patch v3.2 → DataStorage patch v3.2
https://hg.mozilla.org/mozilla-central/rev/a5084109d733
https://hg.mozilla.org/mozilla-central/rev/cd93888452f0
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: mozilla29 → mozilla35
Comment on attachment 8484325 [details] [diff] [review]
nsSiteSecurityService patch v2.1

Approval Request Comment
[Feature/regressing bug #]: Requirement for tako pinning (HPKP 787133)
[User impact if declined]: Cannot create trusted apps Bug 1016421
[Describe test coverage new/current, TBPL]: kept old HSTS plus new coverage
[Risks and why]: Might break HSTS, reduced via testing
[String/UUID change made/needed]: None
Attachment #8484325 - Flags: approval-mozilla-aurora?
Comment on attachment 8485029 [details] [diff] [review]
DataStorage patch v3.2


Approval Request Comment
[Feature/regressing bug #]: Requirement for tako pinning (HPKP 787133)
[User impact if declined]: Cannot create trusted apps Bug 1016421
[Describe test coverage new/current, TBPL]: kept old HSTS plus new coverage
[Risks and why]: Might break HSTS, reduced via testing
[String/UUID change made/needed]: None
Attachment #8485029 - Flags: approval-mozilla-aurora?
Comment on attachment 8484325 [details] [diff] [review]
nsSiteSecurityService patch v2.1

I spoke with rbarnes and sicking about this bug. This is required for bug 787133, which is required for 1016421. If something is to break it's most likely to break in HSTS / HPKP. Let's get this on Aurora so we can shake out anything as early as possible. Aurora+
Attachment #8484325 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8485029 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
I'd like to verify this bug as fixed. It appears that I could just verify the existence of SiteSecurityServiceState.txt. However, I don't see it anywhere in the profile directory. Is my assumption incorrect? Is there another way to verify this change? Paging Mr. Keeler...
Flags: needinfo?(dkeeler)
If you're using a new profile, it might not show up for 5 minutes after visiting a site that sends an HSTS header (e.g. bugzilla). The other option would be to visit a site and then close the browser, since that should cause it to flush the file out (if it doesn't, then that's a bug).
Flags: needinfo?(dkeeler)
Thanks David. I followed your instructions. I'm now able to see it on Windows, but not Mac. Would it differ by platform?
Flags: needinfo?(dkeeler)
It shouldn't, unless the profile directory is different somehow on OS X. On my Mac, it's at ~/Library/Application Support/Firefox/Profiles/<profile>/SiteSecurityServiceState.txt
Flags: needinfo?(dkeeler)
OK, got it. Dumb me was looking at the wrong Profiles directory (under ~/Library/Caches) by mistake. It is in the exact place you mention. Sorry. Marking verified now.
Status: RESOLVED → VERIFIED
Depends on: 1123971
Depends on: 1240566
No longer depends on: 1240566
Depends on: 1240566
Depends on: 1232586
Depends on: 1119778
No longer depends on: 1232586
Depends on: 1470918
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: