Closed Bug 820852 Opened 7 years ago Closed 7 years ago

Implement write poisoning on Windows

Categories

(Core :: XPCOM, defect)

x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla21
Tracking Status
relnote-firefox --- -

People

(Reporter: ehsan, Assigned: BenWa)

References

Details

Attachments

(2 files, 4 obsolete files)

Poison all of the writes!
So, I took a quick look around and there seems to be only one write function on Windows (weird, right?!) called NtFlushBuffersFile, which is what kernel32's FlushFileBuffers is implemented on top of.

NtFlushBuffersFile is documented here: <http://msdn.microsoft.com/en-us/library/windows/hardware/ff566454%28v=vs.85%29.aspx> (don't get confused by the Zw prefix.)  It is implemented in ntdll.dll and it's what calls the Windows kernel to perform the write.

For intercepting this call, we can use http://mxr.mozilla.org/mozilla-central/source/toolkit/xre/nsWindowsDllInterceptor.h.  Here's what the code will roughly look like:

WindowsDllInterceptor interceptor;

typedef NTSTATUS (WINAPI* FlushBuffersFile_fn)(HANDLE, PIO_STATUS_BLOCK);

FlushBuffersFile_fn gOriginalFlushBuffersFile;

NTSTATUS WINAPI
patched_FlushBuffersFile(HANDLE aFileHandle, PIO_STATUS_BLOCK aIoStatusBlock)
{
  // poison the write if needed, otherwise:
  return gOriginalFlushBuffersFile(aFileHandle, aIoStatusBlock);
}

void ReplaceFlushBuffersFile()
{
  interceptor.Init("ntdll.dll");
  interceptor.AddHook("NtFlushBuffersFile", reinterpret_cast<intptr_t>(patched_FlushBuffersFile), reinterpret_cast<void**>(&gOriginalFlushBuffersFile));
}
I promise fast reviews for a patch here (for the Windows bits.)  Rafael should review the rest.
Assignee: nobody → bgirard
Attached patch WIP (obsolete) — Splinter Review
Comment on attachment 697745 [details] [diff] [review]
WIP

Review of attachment 697745 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/build/mozPoisonWriteWin.cpp
@@ +27,5 @@
> +  AbortOnBadWrite();
> +  return gOriginalFlushBuffersFile(aFileHandle, aIoStatusBlock);
> +}
> +
> +bool ValidWriteAssert(bool ok)

Can |ok| ever be true?

@@ +39,5 @@
> +
> +  return false;
> +}
> +
> +void AbortOnBadWrite()

The name of this function is very misleading.

@@ +59,5 @@
> +  if (PoisoningDisabled)
> +      return;
> +
> +  // REVIEW Does this need to be static? Examples make this static but I don't see anything in WindowsDllInterceptor that suggests its required
> +  WindowsDllInterceptor interceptor;

It's best to follow suit here.
You might want to take a look at

https://bugzilla.mozilla.org/show_bug.cgi?id=826029#c16

We want to reenable writes just before returning from main, but we try to unload xul, so that can cause problems. For some reason on windowns we only succeed in unloading it in -process-updates runs, which is why I was getting crashes in the update tests.

Since it is not unloaded already in regular runs, maybe the best thing to do is to not try it at all, but we should at least figure out why it is not being unloaded.
(In reply to :Ehsan Akhgari from comment #6)
> Comment on attachment 697745 [details] [diff] [review]
> WIP
> 
> Review of attachment 697745 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: xpcom/build/mozPoisonWriteWin.cpp
> @@ +27,5 @@
> > +  AbortOnBadWrite();
> > +  return gOriginalFlushBuffersFile(aFileHandle, aIoStatusBlock);
> > +}
> > +
> > +bool ValidWriteAssert(bool ok)
> 
> Can |ok| ever be true?

It may later. I'm staying consistent with the windows implementation. I may refactor some common code to a mozPoisonWriteBase.cpp but I'm not sharing enough code yet.

> 
> @@ +39,5 @@
> > +
> > +  return false;
> > +}
> > +
> > +void AbortOnBadWrite()
> 
> The name of this function is very misleading.

I was trying to be consistent with the windows implementation. More logic will go in here soon.

> 
> @@ +59,5 @@
> > +  if (PoisoningDisabled)
> > +      return;
> > +
> > +  // REVIEW Does this need to be static? Examples make this static but I don't see anything in WindowsDllInterceptor that suggests its required
> > +  WindowsDllInterceptor interceptor;
> 
> It's best to follow suit here.

Ok. As long as it's not adding a static constructor.
Comment on attachment 698103 [details] [diff] [review]
Part 1: Poison writes to NtFlushBuffersFile and share code with mozPoisonWriteBase

Review of attachment 698103 [details] [diff] [review]:
-----------------------------------------------------------------

This is going on the right direction, but please check these issues first.

::: toolkit/xre/nsWindowsDllInterceptor.h
@@ +362,5 @@
>            return;
>          }
> +      } else if (origBytes[nBytes] == 0xB8) {
> +        // MOV 0xB8: http://ref.x86asm.net/coder32.html#xB8
> +        nBytes += 5;

Do we have tests for this? Can toolkit/xre/test/win/TestDllInterceptor.cpp be expanded to include it?

::: xpcom/build/mozPoisonWriteBase.cpp
@@ +40,5 @@
> +
> +std::vector<int>& getDebugFDs() {
> +  // We have to use new as some write happen during static destructors
> +  // so an static std::vector might be destroyed while we still need it.
> +  static std::vector<int> *DebugFDs = new std::vector<int>();

This is using a local static, which is guaranteed to have an atomic initialization, but I am not sure if msvc implements this correctly. Could you please check that?

::: xpcom/build/mozPoisonWriteWin.cpp
@@ +32,5 @@
> +}
> +
> +bool ValidWriteAssert(bool ok)
> +{
> +  if (ok) {

Why not just MOZ_ASSERT(ok)?

::: xpcom/build/nsXPComInit.cpp
@@ +525,1 @@
>  

Testing change?
Attachment #698103 - Flags: review?(respindola) → review-
Depends on: 761135
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #11)
> ::: xpcom/build/mozPoisonWriteBase.cpp
> @@ +40,5 @@
> > +
> > +std::vector<int>& getDebugFDs() {
> > +  // We have to use new as some write happen during static destructors
> > +  // so an static std::vector might be destroyed while we still need it.
> > +  static std::vector<int> *DebugFDs = new std::vector<int>();
> 
> This is using a local static, which is guaranteed to have an atomic
> initialization, but I am not sure if msvc implements this correctly. Could
> you please check that?

It's not:
http://stackoverflow.com/questions/10585928/is-static-init-thread-safe-with-vc2010

Let's make sure that debugFDs is always lock, which it currently is. Adding a comment to document this.
Attached patch patch (obsolete) — Splinter Review
I merged the two part since they didn't really apply on their own.
Attachment #698103 - Attachment is obsolete: true
Attachment #698147 - Attachment is obsolete: true
Attachment #699382 - Flags: review?(respindola)
Attachment #699382 - Flags: review?(ehsan)
Comment on attachment 699382 [details] [diff] [review]
patch

Review of attachment 699382 [details] [diff] [review]:
-----------------------------------------------------------------

I only skimmed over the mozPoisonWrite* stuff since I think they're mostly moving code around.  Please let me know if you want a more in-depth review.
Attachment #699382 - Flags: review?(ehsan) → review+
Comment on attachment 699382 [details] [diff] [review]
patch

Review of attachment 699382 [details] [diff] [review]:
-----------------------------------------------------------------

Just wanted to point out that there is a code change within a big code move block to make it more obvious to reviewers.

::: xpcom/build/mozPoisonWriteBase.cpp
@@ +116,5 @@
> +
> +    // We want the sha1 of the entire file, so please don't write to fd
> +    // directly; use sha1Stream.
> +#ifdef XP_WIN
> +    // The windows version doesn't open the file so there's a time-of-check-time-of-use issue :(

Note that this #ifdef is not from the move.
Comment on attachment 699382 [details] [diff] [review]
patch

Review of attachment 699382 [details] [diff] [review]:
-----------------------------------------------------------------

I only skimmed over the mozPoisonWrite* stuff since I think they're mostly moving code around.  Please let me know if you want a more in-depth review.

::: xpcom/build/mozPoisonWriteBase.cpp
@@ +116,5 @@
> +
> +    // We want the sha1 of the entire file, so please don't write to fd
> +    // directly; use sha1Stream.
> +#ifdef XP_WIN
> +    // The windows version doesn't open the file so there's a time-of-check-time-of-use issue :(

Can you please explain this?  The comment doesn't really tell me much about why this is needed.
mkstemp doesn't exist in windows. _mktemp_s only formats the name but doesn't open the file so a fopen_s is also needed.
Comment on attachment 699382 [details] [diff] [review]
patch

Review of attachment 699382 [details] [diff] [review]:
-----------------------------------------------------------------

Has this found any writes on try?

::: browser/app/nsBrowserApp.cpp
@@ +386,5 @@
>    {
>      ScopedLogging log;
>      result = do_main(argc, argv, xreDirectory);
>    }
>  

This depends on 761135, but it will be awesome to see it go!

::: toolkit/xre/test/win/TestDllInterceptor.cpp
@@ +115,5 @@
>        TestHook("user32.dll", "SetWindowLongA") &&
>        TestHook("user32.dll", "SetWindowLongW") &&
>  #endif
>        TestHook("user32.dll", "TrackPopupMenu") &&
> +      TestHook("ntdll.dll", "NtFlushBuffersFile") &&

Thanks!

::: xpcom/build/mozPoisonWriteBase.cpp
@@ +21,5 @@
> +using namespace mozilla;
> +
> +namespace mozilla {
> +
> +char *sProfileDirectory = NULL;

Please add an anonymous namespace to this file and move as much as you can to it. sProfileDirectory can be moved for example.

@@ +123,5 @@
> +    if (fopen_s(&stream, name, "w") != 0) {
> +      NS_RUNTIMEABORT("Um, how did we get here?");
> +    }
> +    fd = fileno(stream);
> +    // REVIEW is a fclose required on the stream

I think so. Since you have a FILE, you should use fclose on it and let it close the FD.  If you want, you can do a fdopen on the non-windows version so that we always just fclose.

Ehsan, is there something more like mkstemp on windows that returns an open file?
Attachment #699382 - Flags: review?(respindola) → review+
> > +    // REVIEW is a fclose required on the stream
> 
> I think so. Since you have a FILE, you should use fclose on it and let it
> close the FD.  If you want, you can do a fdopen on the non-windows version
> so that we always just fclose.

I just noticed that SHA1Stream calls fdopen, so maybe (depending on what windows apis we find) the best option is to just pass a FILE to SHA1Stream and have the caller call fdopen if needed.
Comment on attachment 699382 [details] [diff] [review]
patch

Review of attachment 699382 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/build/mozPoisonWriteBase.cpp
@@ +121,5 @@
> +    int fd = _mktemp_s(name, strlen(name) + 1);
> +    FILE *stream;
> +    if (fopen_s(&stream, name, "w") != 0) {
> +      NS_RUNTIMEABORT("Um, how did we get here?");
> +    }

You should do this in a loop to avoid the race condition.
Attachment #699382 - Flags: review+ → review-
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #20)
> Ehsan, is there something more like mkstemp on windows that returns an open
> file?

Not that I know of.
Attachment #699382 - Attachment is obsolete: true
Attachment #701175 - Flags: review?(ehsan)
Comment on attachment 701175 [details] [diff] [review]
patch + old mac warning fix

Review of attachment 701175 [details] [diff] [review]:
-----------------------------------------------------------------

I know that you're just moving a lot of this code, but you know that I don't give out r+'es that easily.  ;-)  (Sorry!)

::: xpcom/build/mozPoisonWrite.h
@@ +8,5 @@
>  #include <stdio.h>
>  
>  MOZ_BEGIN_EXTERN_C
>    void MozillaRegisterDebugFD(int fd);
> +  void MozillaRegisterDebugFile(FILE *f);

Nit: FILE, to match the Unregister function.

::: xpcom/build/mozPoisonWriteBase.cpp
@@ +47,5 @@
> +}
> +
> +void BaseCleanup() {
> +  if (sProfileDirectory)
> +    PL_strfree(sProfileDirectory);

Null check not needed.

@@ +49,5 @@
> +void BaseCleanup() {
> +  if (sProfileDirectory)
> +    PL_strfree(sProfileDirectory);
> +  sProfileDirectory = nullptr;
> +  delete &getDebugFDs();

This is terrible.  Just make getDebugFDs return a pointer.

@@ +58,5 @@
> +// computes the sha1 of the data that passes through it.
> +class SHA1Stream
> +{
> +public:
> +    SHA1Stream(FILE *stream) {

Nit: explicit.

@@ +59,5 @@
> +class SHA1Stream
> +{
> +public:
> +    SHA1Stream(FILE *stream) {
> +        mFile = stream;

Nit: please use an initializer list.

@@ +110,5 @@
> +                 reinterpret_cast<void*>(&rawStack), 0, nullptr);
> +    Telemetry::ProcessedStack stack = Telemetry::GetStackAndModules(rawStack);
> +
> +    nsPrintfCString nameAux("%s%s", sProfileDirectory,
> +                            "/Telemetry.LateWriteTmpXXXXXX");

Use something like NS_SLASH defined elsewhere in the code base.

@@ +119,5 @@
> +    // directly; use sha1Stream.
> +    FILE *stream;
> +#ifdef XP_WIN
> +    for (;;) {
> +      // The windows version doesn't open the file so there's a time-of-check-time-of-use issue :(

Please make this comment clearer.

@@ +128,5 @@
> +      }
> +      if (fopenResult != 0) {
> +        NS_RUNTIMEABORT("Um, how did we get here?");
> +      }
> +      break;

Hmm, this loop makes my head swing.  I think the following is more readable:

  int result;
  errno_t fOpenResult;
  do {
    result = _mkstemp_s(...);
    fOpenResult = fopen_s(...);
  } while (fOpenResult == EEXIST);

  if (fOpenResult != 0) {
    how did we get here?
  }

What do you think?

Or, even better, just implement your own mkstemp #ifdef XP_WIN earlier in the file and remove this #ifdef and just call it like the Unix code path.

@@ +157,5 @@
> +        //      cmdsize 632
> +        //      segname __TEXT
> +        //      vmaddr 0x0000000100000000
> +        // so to print the line matching the offset 123 one has to run
> +        // atos -o firefox 0x100000123.

This comment seems kind of Mac specific.

::: xpcom/build/mozPoisonWriteBase.h
@@ +5,5 @@
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +// Private interface for code shared between the platforms of mozPoisonWriteXXX
> +
> +#pragma once

Use #include guards.

@@ +20,5 @@
> +
> +struct DebugFDAutoLockTraits {
> +  typedef PRLock *type;
> +  const static type empty() {
> +    return NULL;

nullptr

::: xpcom/build/mozPoisonWriteWin.cpp
@@ +54,5 @@
> +  sNtDllInterceptor.AddHook("NtFlushBuffersFile", reinterpret_cast<intptr_t>(patched_FlushBuffersFile), reinterpret_cast<void**>(&gOriginalFlushBuffersFile));
> +}
> +}
> +
> +// REVIEW How to add a __cleanup method for BaseCleanup
\ No newline at end of file

What?
Attachment #701175 - Flags: review?(ehsan) → review+
(In reply to :Ehsan Akhgari from comment #27)
> Hmm, this loop makes my head swing.  I think the following is more readable:
> 
>   int result;
>   errno_t fOpenResult;
>   do {
>     result = _mkstemp_s(...);
>     fOpenResult = fopen_s(...);
>   } while (fOpenResult == EEXIST);
> 
>   if (fOpenResult != 0) {
>     how did we get here?
>   }
> 
> What do you think?

Testing for EEXIST is actually the wrong thing since "w" will just destroy the old file.

> 
> Or, even better, just implement your own mkstemp #ifdef XP_WIN earlier in
> the file and remove this #ifdef and just call it like the Unix code path.

We need to get the stream now and not the FD so there's not much to gain. I'll use the do {}

> 
> @@ +157,5 @@
> > +        //      cmdsize 632
> > +        //      segname __TEXT
> > +        //      vmaddr 0x0000000100000000
> > +        // so to print the line matching the offset 123 one has to run
> > +        // atos -o firefox 0x100000123.
> 
> This comment seems kind of Mac specific.

I'll let Raphael deal with this when we support the breakpad IDs.

> 
> ::: xpcom/build/mozPoisonWriteBase.h
> @@ +5,5 @@
> > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> > +
> > +// Private interface for code shared between the platforms of mozPoisonWriteXXX
> > +
> > +#pragma once
> 
> Use #include guards.

I though #pragma was ok you told me last
(In reply to comment #28)
> > ::: xpcom/build/mozPoisonWriteBase.h
> > @@ +5,5 @@
> > > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> > > +
> > > +// Private interface for code shared between the platforms of mozPoisonWriteXXX
> > > +
> > > +#pragma once
> > 
> > Use #include guards.
> 
> I though #pragma was ok you told me last

No, I was mistaken.  I brought it up on dev-platform and it turns out that it will screw us over if you #include the same file using different cases in its name on case-preserving but insensitive platforms such as Windows and Mac.
STR:
1) Build mac 64 debug with 'ac_add_options --enable-trace-malloc'
2) Run with 'python leaktest.py -- --trace-malloc malloc.log --shutdown-leaks=sdleak.log'

#0  0x00007fff968e5212 in __pthread_kill ()
#1  0x00007fff9101daf4 in pthread_kill ()
#2  0x00007fff91061dce in abort ()
#3  0x0000000100409b6d in PR_Assert (s=0x1004404bc "0 == rv", file=0x100442e80 "/Volumes/SSD-Mac1/Users/benoitgirard/ssd-mozilla/mozilla-central/tree/nsprpub/pr/src/pthreads/ptsynch.c", ln=175) at /Volumes/SSD-Mac1/Users/benoitgirard/ssd-mozilla/mozilla-central/tree/nsprpub/pr/src/io/prlog.c:554
#4  0x000000010042ceae in PR_Lock (lock=0x10c9152b0) at /Volumes/SSD-Mac1/Users/benoitgirard/ssd-mozilla/mozilla-central/tree/nsprpub/pr/src/pthreads/ptsynch.c:175
#5  0x0000000103853e5b in mozilla::DebugFDAutoLock::DebugFDAutoLock (this=0x7fff5fbff2b8) at mozPoisonWriteBase.h:42
#6  0x0000000103852805 in mozilla::DebugFDAutoLock::DebugFDAutoLock (this=0x7fff5fbff2b8) at mozPoisonWriteBase.h:43
#7  0x00000001038520f7 in MozillaRegisterDebugFD (fd=6) at /Volumes/SSD-Mac1/Users/benoitgirard/ssd-mozilla/mozilla-central/tree/xpcom/build/mozPoisonWriteBase.cpp:222
#8  0x000000010392335f in NS_TraceMallocDumpAllocations (pathname=0x106656ec0 "sdleak.log") at /Volumes/SSD-Mac1/Users/benoitgirard/ssd-mozilla/mozilla-central/tree/tools/trace-malloc/lib/nsTraceMalloc.c:1780
#9  0x0000000103923171 in NS_TraceMallocShutdown () at /Volumes/SSD-Mac1/Users/benoitgirard/ssd-mozilla/mozilla-central/tree/tools/trace-malloc/lib/nsTraceMalloc.c:1526
#10 0x0000000103927151 in TraceMallocShutdown::~TraceMallocShutdown (this=0x106d45cc0) at /Volumes/SSD-Mac1/Users/benoitgirard/ssd-mozilla/mozilla-central/tree/tools/trace-malloc/lib/nsTypeInfo.cpp:27
#11 0x0000000103927135 in TraceMallocShutdown::~TraceMallocShutdown (this=0x106d45cc0) at /Volumes/SSD-Mac1/Users/benoitgirard/ssd-mozilla/mozilla-central/tree/tools/trace-malloc/lib/nsTypeInfo.cpp:26
#12 0x00007fff91063307 in __cxa_finalize ()
#13 0x00007fff91064f57 in exit ()
#14 0x00000001000011ab in start ()
The problem is that we need to keep track of the DebugFD regardless of the state of the write poisoning. Otherwise we could enable and miss from opened DebugFD.
https://hg.mozilla.org/mozilla-central/rev/afbe1011ae1c
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
Anyone else seeing this error?
mozPoisonWriteBase.cpp
mozPoisonWriteWin.cpp
c:\t1\hg\comm-central\mozilla\xpcom\build\mozPoisonWriteBase.h(24) : error C2065: 'nullptr' : undeclared identifier
c:\t1\hg\comm-central\mozilla\xpcom\build\mozPoisonWriteBase.h(24) : error C2065: 'nullptr' : undeclared identifier
Depends on: 832973
How can I locally disable this?  Can I modify ValidWriteAssert return true?  Doing shutdown CC logging trips this.
Making it return |false| seems to work.
(In reply to Andrew McCreight [:mccr8] from comment #41)
> Making it return |false| seems to work.

You just need to register the IO as debugging:
http://mxr.mozilla.org/mozilla-central/source/xpcom/build/mozPoisonWrite.h#14
Thanks, that looks like an easy fix. Filed bug  837197 for it.
Blocks: 845098
You need to log in before you can comment on or make changes to this bug.