Closed Bug 1329294 Opened 3 years ago Closed 3 years ago

Windows content temp dir not in LocalLow for parent on new profile

Categories

(Core :: Security: Process Sandboxing, defect)

53 Branch
defect
Not set

Tracking

()

VERIFIED FIXED
mozilla55
Tracking Status
firefox52 --- wontfix
firefox-esr52 --- wontfix
firefox53 blocking verified
firefox54 --- verified
firefox55 --- verified

People

(Reporter: haik, Assigned: bobowen)

References

Details

(Whiteboard: sb+)

Attachments

(2 files, 1 obsolete file)

I'm running into failures running a new test on treeherder for bug 1309394 that attempts to write to the content temp directory. The write fails and the logs show the content temp directory being used is in <USER>\AppData\Local\Temp instead of in the AppData\LocalLow directory.

For example

  C:\Users\CLTBLD~1.T-W\AppData\Local\Temp

Instead of

  %USER PROFILE%\AppData\LocalLow\Mozilla\Temp-{UID}\

The content temp dir is intended to be in a low integrity write location in

  %USER PROFILE%\AppData\LocalLow

The test obtains the content temp dir from javascript using

  let dir = Services.dirsvc.get("ContentTmpD", Ci.nsILocalFile);

When running the test locally, the content temp dir is in LocalLow as expected.
See Also: → 1309394
Whiteboard: sb+
I found that the problem isn't occurring on win32 runs, just win64, and that the LocalLow directory is present. I added some code to the browser chrome tests from bug 1309394 to iterate over all directories reachable from $HOME. With a try run that just included bc6 tests, the following dir was found. I suspect this Temp-{UUID} dir was created by a previous test run on the same VM/machine.

  C:\Users\cltbld.T-W864-IX-090\AppData\LocalLow\Mozilla\Temp-{0011405b-ddc2-492e-afaf-69b5bb2848e0}

Needs more debugging.
See Also: → 1352383
OK I'm pretty sure I know what is causing this and it only happen on Windows because of the GPU process.
It almost certainly only happens on a new profile.

I'll probably land the patches here.
Assignee: nobody → bobowencode
Blocks: 1352383
Status: NEW → ASSIGNED
See Also: 1352383
[Tracking Requested - why for this release]:
I know we're late in the cycle, but it's just been realised that this affects new profiles when running with the GPU process and not just automation.
Also the fix is a simple move of the creation to earlier in the existing function.
Summary: Windows content temp dir not in LocalLow when run on treeherder → Windows content temp dir not in LocalLow for parent on new profile
Attachment #8853733 - Flags: review?(benjamin)
Attachment #8853734 - Flags: review?(haftandilian)
Attachment #8853734 - Flags: review?(haftandilian) → review+
This sounds like a possible release blocker for 53 (given that we are shipping the gpu process in a week and a half).  David, what do you think?

Bob, can you work with QE to test this in nightly and request uplift once you feel certain of the fix? Are there useful tests we can add here?
Flags: needinfo?(dvander)
Flags: needinfo?(bobowencode)
When you say "affects new profiles"  -- how serious of an affect?
I honestly don't know how serious the issue is. The GPU process tries to use XPCOM as little as possible. Bob, what can go wrong when this happens?
Flags: needinfo?(dvander)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #7)
> This sounds like a possible release blocker for 53 (given that we are
> shipping the gpu process in a week and a half).  David, what do you think?
> 
> Bob, can you work with QE to test this in nightly and request uplift once
> you feel certain of the fix? 

The main thing I know that this breaks is printing, which won't work until you restart, so that can be used to test.
It would possibly affect crash reporting as well as I think that uses the temp directory to store some information in the child to be picked up in the parent.

Other things that just use the content temp directory in the child process (possibly things like graphics drivers etc.), should be OK as it's the mismatch between the parent and child that is likely to cause issues.

The fix is very simple, I'll ping bsmedberg later to see if we can get this landed ASAP.
I've checked and the fix patch applies cleanly to Beta, enabling the test doesn't but I don't think we need to uplift that to Beta.

> Are there useful tests we can add here?

I've added an assert that should prevent this situation from occurring again and I'm enabling a test that we thought was just failing because of an issue in automation, because it seemed to pass locally, but I'm guessing the GPU process wasn't running locally for some reason.
Flags: needinfo?(bobowencode)
Attachment #8853733 - Flags: review?(benjamin) → review+
Pushed by bobowencode@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3b1bbf6392f4
Part 1: Ensure Content Temp Dir is created before use. r=bsmedberg
https://hg.mozilla.org/integration/mozilla-inbound/rev/08e2719fe7be
Part 2: Enable content temp file test on Windows. r=haik
Flags: needinfo?(bobowencode)
So this is crashing during the profile migration part of the refresh stack below.

I assume this is because it tries to show a modal dialog, which kicks off the GPU process.
As part of PrepareLauch it attempts to use nsXREDirProvider, before we have called nsXREDirProvider::DoStartup()

bsmedberg: I'm not sure what issues this could cause apart from with the content temp dir. I could just remove the assert for now, so we at least fix the new profile case.
What do you think?



xul.dll!nsXREDirProvider::GetFile Line 504
xul.dll!FindProviderFile Line 333
xul.dll!nsDirectoryService::Get Line 364
xul.dll!NS_GetSpecialDirectory Line 27
xul.dll!CrashReporter::OOPInit Line 3489
xul.dll!mozilla::ipc::GeckoChildProcessHost::PrepareLaunch Line 336
xul.dll!mozilla::ipc::GeckoChildProcessHost::AsyncLaunch Line 402
xul.dll!mozilla::gfx::GPUProcessHost::Launch Line 44
xul.dll!mozilla::gfx::GPUProcessManager::LaunchGPUProcess Line 149
xul.dll!gfxPlatform::Init Line 707
xul.dll!gfxPlatform::GetPlatform Line 533
xul.dll!CreateVsyncRefreshTimer Line 985
xul.dll!nsRefreshDriver::ChooseTimer Line 1100
xul.dll!nsRefreshDriver::EnsureTimerStarted Line 1316
xul.dll!nsRefreshDriver::ScheduleViewManagerFlush Line 2219
xul.dll!mozilla::PresShell::ScheduleViewManagerFlush Line 3735
xul.dll!SchedulePaintInternal Line 6325
xul.dll!InvalidateFrameInternal Line 6368
xul.dll!nsIFrame::InvalidateFrame Line 6427
xul.dll!nsIFrame::InvalidateFrameSubtree Line 6383
xul.dll!InvalidateCanvasIfNeeded Line 8810
xul.dll!nsCSSFrameConstructor::ContentRangeInserted Line 7915
xul.dll!nsCSSFrameConstructor::ContentInserted Line 7799
xul.dll!mozilla::PresShell::Initialize Line 1789
xul.dll!mozilla::dom::XULDocument::StartLayout Line 1926
xul.dll!mozilla::dom::XULDocument::DoneWalking Line 3033
xul.dll!mozilla::dom::XULDocument::ResumeWalk Line 2977
xul.dll!mozilla::dom::XULDocument::OnScriptCompileComplete Line 3471
xul.dll!NotifyOffThreadScriptCompletedRunnable::Run Line 2671
xul.dll!nsThread::ProcessNextEvent Line 1270
xul.dll!NS_ProcessNextEvent Line 389
xul.dll!nsXULWindow::ShowModal Line 386
xul.dll!nsContentTreeOwner::ShowAsModal Line 541
xul.dll!nsWindowWatcher::OpenWindowInternal Line 1317
xul.dll!nsWindowWatcher::OpenWindow Line 355
xul.dll!_NS_InvokeByIndex Line 57	Unknown
xul.dll!CallMethodHelper::Invoke Line 2010
xul.dll!CallMethodHelper::Call Line 1329
xul.dll!XPCWrappedNative::CallMethod Line 1296
xul.dll!XPC_WN_CallMethod Line 983
xul.dll!js::CallJSNative Line 291
xul.dll!js::InternalCallOrConstruct Line 455
xul.dll!InternalCall Line 500
xul.dll!js::CallFromStack Line 506
xul.dll!Interpret Line 2997
xul.dll!js::RunScript Line 395
xul.dll!js::InternalCallOrConstruct Line 473
xul.dll!InternalCall Line 500
xul.dll!js::Call Line 519
xul.dll!js::Wrapper::call Line 165
xul.dll!js::CrossCompartmentWrapper::call Line 353
xul.dll!js::Proxy::call Line 464
xul.dll!js::proxy_Call Line 716
xul.dll!js::CallJSNative Line 291
xul.dll!js::InternalCallOrConstruct Line 437
xul.dll!InternalCall Line 500
xul.dll!js::Call Line 519
xul.dll!JS_CallFunctionValue Line 2838
xul.dll!nsXPCWrappedJSClass::CallMethod Line 1214
xul.dll!nsXPCWrappedJS::CallMethod Line 614
xul.dll!PrepareAndDispatch Line 85
xul.dll!SharedStub Line 113
xul.dll!XREMain::XRE_mainRun Line 4340
Flags: needinfo?(bobowencode) → needinfo?(benjamin)
It's somewhat normal for code to be using nsXREDirProvider before DoStartup has been called. There are two cases where this happens:

* profile manager launch: in this case, we launch gecko without a profile at all, and so nsXREDirProvider::DoStartup is never going to be called
* profile migration: in this case, we know where the profile is going to be: nsXREDirProvider::SetProfile has already been called, but before we "start" the profile we let the profile migrator run and potentially migrate some files into the profile

I don't know whether this specific profile migration path is actually necessary any more: it was used to migrate pre-0.9 Firefox profiles and Mozilla suite/seamonkey profiles, but I doubt any of those are interesting!
Flags: needinfo?(benjamin)
(In reply to Bob Owen (:bobowen) from comment #16)
> Looks like putting it in SetFile didn't work so well:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=4f2aaa217c12730e9305f366ad9c472a17df66cb

Should have said SetProfile there, but anyway the problem was that xpcom has not been initialised by that point and we can't get the UUID service.

This patch takes a different tack.
I've moved the creation into LoadContentProcessTempDir, so well create it on first use (in the parent) or in DoStartup if we haven't already.

Try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=15b1443c9d3438412d84983a7da0fcbf3df35d52
Attachment #8856131 - Flags: review?(benjamin)
Attachment #8853733 - Attachment is obsolete: true
Attachment #8856131 - Flags: review?(benjamin) → review+
Pushed by bobowencode@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/59d5c20392e6
Part 1: Ensure Content Temp Dir is created before use. r=bsmedberg
Pushed by bobowencode@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/35b4896888d3
Part 2: Enable content temp file test on Windows. r=haik
Comment on attachment 8856131 [details] [diff] [review]
Part 1: Ensure Content Temp Dir is created before use

Approval Request Comment
[Feature/Bug causing the regression]:
Indirectly caused by the GPU process, as it triggers this potential underlying issue with new profiles and the content temp dir.

[User impact if declined]:
When triggered on new profiles this issue will mean printing does not work and possibly some crash information will be missing.

[Is this code covered by automated tests?]:
A test that we now realise was failing on Windows because of this issue has now been turned on.

[Has the fix been verified in Nightly?]:
Verified on local build.

[Needs manual test from QE? If yes, steps to reproduce]: 
Yes, STR - start firefox with a new profile and attempt to print.

[List of other uplifts needed for the feature/fix]:
None (in particular the other patch from this bug is not needed as it only enables a test).

[Is the change risky?]:
No.

[Why is the change risky/not risky?]:
Fairly simple change to ensure the content temp dir creation code is always triggered before first use.

[String changes made/needed]:
None.
Attachment #8856131 - Flags: approval-mozilla-beta?
Attachment #8856131 - Flags: approval-mozilla-aurora?
Do we want this on ESR52 too?
Flags: needinfo?(bobowencode)
https://hg.mozilla.org/mozilla-central/rev/59d5c20392e6
https://hg.mozilla.org/mozilla-central/rev/35b4896888d3
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
We probably will need to do an RC2 build for this as it just missed today's 53.0 RC build. 
That is if people agree that it is a release blocking issue. I am still not sure. 

Randell, what do you think?
Flags: needinfo?(rjesup)
Hi Florin, could you help find someone to verify if this issue was fixed as expected on a latest Nightly build? Thanks!
Flags: qe-verify+
Flags: needinfo?(florin.mezei)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #21)
> Do we want this on ESR52 too?

I'm not totally sure, I think this only gets triggered with the GPU process, which I think is only enabled for Fx53.
It applies cleanly to Fx52.0.2 release, so I'm guessing it will apply to ESR52, if we wanted to take it just in case.
Flags: needinfo?(bobowencode)
I have reproduced the issue using a Nightly 55.0a1 (id: 20170402030202). 
I can confirm that the issue is VERIFIED FIXED in the latest Nightly build on Windows 7 x64, Windows 8.1 x64 and Windows 10 x64.
Flags: needinfo?(florin.mezei)
I'm more worried about crash information working than printing not working, though the question is "for how long is printing not working?"  If it's just for say the first run, or until something common happens, then I'm tempted to not take this in RC.
The one factor that makes me think of taking it is kiosk/hotel/etc use, which may create a new profile for each user who sits down.  (I've seen a number of hotels that use Firefox at their business stations)

If it's "new profiles won't have printing working, and it won't work for some time (or will never work", then it's clearly a blocker.
Comment on attachment 8856131 [details] [diff] [review]
Part 1: Ensure Content Temp Dir is created before use

Let's take this for an RC2 build later this week.
Attachment #8856131 - Flags: approval-mozilla-beta?
Attachment #8856131 - Flags: approval-mozilla-beta+
Attachment #8856131 - Flags: approval-mozilla-aurora?
Attachment #8856131 - Flags: approval-mozilla-aurora+
(In reply to Bob Owen (:bobowen) from comment #25)
> I'm not totally sure, I think this only gets triggered with the GPU process,
> which I think is only enabled for Fx53.

If this requires the GPU process, then wontfix for ESR52 makes sense. If we find later that it affects more than just the GPU process, we can always set this back to affected and request approval then. Thanks!
Also verified this fix in RC 53.0 build 5 as part of the RC build sign off. Marking as VERIFIED FIXED.
Status: RESOLVED → VERIFIED
Depends on: 1358964
(In reply to Bob Owen (:bobowen) from comment #20)
> [Needs manual test from QE? If yes, steps to reproduce]: 
> Yes, STR - start firefox with a new profile and attempt to print.
Verified fixed Fx 54b4, Win 10 x64.
Flags: qe-verify+
Flags: needinfo?(rjesup)
You need to log in before you can comment on or make changes to this bug.