Closed Bug 1305436 Opened 4 years ago Closed 4 years ago

Firefox 49 won't start after installation

Categories

(Core :: Networking, defect)

49 Branch
x86_64
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox49 blocking fixed
relnote-firefox --- 49+
firefox50 + fixed
firefox51 + fixed
firefox52 + fixed

People

(Reporter: assistance, Assigned: dragana, NeedInfo)

References

()

Details

(Keywords: regression, Whiteboard: [necko-active])

Attachments

(2 files, 2 obsolete files)

User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:48.0) Gecko/20100101 Firefox/48.0
Build ID: 20160823121617

Steps to reproduce:

On Windows 7 64bit :
* Installed FF49 (x86 or x64)
or
* auto-upgraded from 48.0.2

After installation, launched Firefox


Actual results:

Firefox does not start.
The firefox.exe process is running (using between 5Mb and 50000Mb of memory), but nothing else happens.
The process does not use CPU (0% utilization).
No Firefox window appears.

A discussion is open on the Firefox support forum: https://support.mozilla.org/en-US/questions/1139904


Expected results:

Firefox should have opened a window.
OS: Unspecified → Windows 7
Hardware: Unspecified → x86_64
Component: Untriaged → General
i am going to confirm this bug based on having seen a couple of similar user reports in various support channels. 
normal troubleshooting steps like firefox safemode, new profile, reinstallation and windows safemode don't seem to have an effect on the issue and affected users don't appear to have the same security software or other obvious third-party programs running that might affect firefox.
Status: UNCONFIRMED → NEW
Component: General → Untriaged
Ever confirmed: true
Keywords: regression
Sorry, I made a mistake in my bug description: Firefox uses between *5000K and 50000K* of memory
Flags: needinfo?(assistance)
hi jim, we have tried running in firefox safemode & windows safemode as well and those didn't make a difference unfortunately: https://support.mozilla.org/en-US/questions/1139904?page=2#answer-921053
This is potentially related to bug 1304848. See bug 1304848 comment 7:
> One user has reported it is AVG Antivirus causing the issue.
> https://support.mozilla.org/questions/1140005#answer-921255
> The same person also got an error:
> Firefox cannot start correctly (0xc0000022) after version update
> https://support.mozilla.org/questions/1140218
> That error apparently occurs if Firefox is reinstalled and AVG is re-enabled.
i'm not sure that this would be related to bug 1304848 as the 2 users in the sumo thread that posted their list of installed programs didn't have avg present there
I confirm that I don't have any error message. Nothing related to "mozglue.dll".
The process starts, but nothing happens.
Component: Untriaged → Application Update
Product: Firefox → Toolkit
Component: Application Update → Untriaged
Product: Toolkit → Firefox
we had a very helpful irc user (thanks, firefoxer!) who was affected and successfully ran regression testing on the issue and came up with a stacktrace as well:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=afb08f0036485e6840155eb3104327345670cd63&tochange=3ea44d60cb3fe23293fe3326c43636336e4ac8dc
This looks like a socket thread related hang. Maybe someone from the necko team should take a look.
Flags: needinfo?(jduell.mcbugs)
Thanks to the awesome Aryx, he was able to create 2 try builds for us with bug 1260764 & bug 1260725 removed! So if assistance-multi & others can install these builds and see if any start up correctly, it would be much appreciated.

Try build with Bug 1260764 backed out:
https://archive.mozilla.org/pub/firefox/try-builds/archaeopteryx@coole-files.de-516caa78c3a85cf55f14acdf661d0d3dff5b4f1c/try-win32/firefox-48.0a1.en-US.win32.installer.exe
Zip build version:
https://archive.mozilla.org/pub/firefox/try-builds/archaeopteryx@coole-files.de-516caa78c3a85cf55f14acdf661d0d3dff5b4f1c/try-win32/firefox-48.0a1.en-US.win32.zip

Try build with Bug 1260725 backed out:
https://archive.mozilla.org/pub/firefox/try-builds/archaeopteryx@coole-files.de-3d23b6dfd423f4dfef4dac906719720267d9fc6a/try-win32/firefox-52.0a1.en-US.win32.installer.exe
Zip build version:
https://archive.mozilla.org/pub/firefox/try-builds/archaeopteryx@coole-files.de-3d23b6dfd423f4dfef4dac906719720267d9fc6a/try-win32/firefox-52.0a1.en-US.win32.zip

Test each try build. Let us know if there was any difference. If not, Aryx has offered to create more try builds for us with other patches backed out from the reported regression range if these don't solve the startup problem. 

Also it is not necessary to test both the installer & zip builds for each backout. The zip builds are the same as the installer builds but are for people who want to install them to custom directories.
Flags: needinfo?(assistance)
Hi Noah,

Sorry, the 2 builds have the same issue : firefox.exe running, no Firefox window.
Hopefully Honza or Patrick can take a look.
Flags: needinfo?(jduell.mcbugs) → needinfo?(honzab.moz)
Flags: needinfo?(mcmanus)
I'm a little confused - comment 8 would make me suspicious of bug 1260764 (and the stack trace does too)

but that bug landed as part of firefox 48. And the STR here is very clear that it worked with 48.0.2 and did not work with 49.0

how was that cset part of the regression range?

outside of that, a lock being obtained during startup does indeed sound consistent with the report. Is this reproducible? can we get a

It sounds like this is reproducible (if the test builds were ruled out) for someone .. please attach a http log https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging

dragana, you've been deep in this stuff for a while - do you want to glance at the stack trace (that's the attachment - it's not an nspr log)
Flags: needinfo?(mcmanus) → needinfo?(dd.mozilla)
this is pure guessing, but maybe it could be an issue between one of the bugs in that regression range + the fact that msvc2015 is used to build firefox? this combination would have been in place for mozilla-central at the time the pushlog would indicate that this regressed, but not in aurora, beta & release 48 (bug 1270664).
SocketThread holds the lock and hangs in PR_Accept while connecting the socket pair:

   8  Id: 1b28.1250 Suspend: 1 Teb: ffe7c000 Unfrozen "Socket Thread"
 # ChildEBP RetAddr
00 0526f048 72e26eff ntdll!NtWaitForSingleObject+0x15
01 0526f088 72e26d20 mswsock!SockWaitForSingleObject+0x1ba
02 0526f174 76c6673e mswsock!WSPSelect+0x3a6
03 0526f1f4 6b493be1 WS2_32!select+0x494
04 0526f45c 6b4495a9 nss3!socket_io_wait(int osfd = 0n86438164, int fd_type = 0n1, unsigned int timeout = 0xffffffff)+0x2c0 [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\nsprpub\pr\src\md\windows\w95sock.c @ 514]
05 (Inline) -------- nss3!_MD_Accept+0x103c2d [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\nsprpub\pr\src\md\windows\w95sock.c @ 137]
06 0526f47c 6b345929 nss3!SocketAccept+0x103c7c [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\nsprpub\pr\src\io\prsocket.c @ 383]
07 0526f48c 0e6ffdcb nss3!PR_Accept(struct PRFileDesc * fd = 0x04d11840, union PRNetAddr * addr = 0x0526f4b8, unsigned int timeout = 0xffffffff)+0x12 [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\nsprpub\pr\src\io\priometh.c @ 167]
08 0526f5d8 0e6ffb8a xul!mozilla::net::NewTCPSocketPair(struct PRFileDesc ** fd = 0x0526f5f0)+0x195 [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\netwerk\base\pollableevent.cpp @ 100]
09 0526f5f8 0e3a5eb0 xul!mozilla::net::PollableEvent::PollableEvent(void)+0x41 [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\netwerk\base\pollableevent.cpp @ 158]
0a 0526f7b4 0e3a53de xul!mozilla::net::nsSocketTransportService::Run(void)+0x67 [c:\builds\moz2_slave\m-rel-w32-00000000000000000000\build\src\netwerk\base\nssockettransportservice2.cpp @ 814]


We can try to get a TCP trace to see if the SYN packet is sent. With RawCap we can get a TCP trace of the local address. I am wondering why?
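For readers unfamiliar with the pattern in the stack trace above, here is a minimal sketch of a loopback TCP socket pair in Python. It is not the Firefox or NSPR code; the function name `new_tcp_socket_pair` is made up for illustration. It shows why an `accept()` without a timeout can wait forever: if something on the machine drops the local SYN, the listener never sees a connection.

```python
import socket

def new_tcp_socket_pair():
    """Build a connected (reader, writer) pair of TCP sockets over 127.0.0.1.

    Illustrative only: a rough analogue of a localhost socket pair,
    not the actual NSPR/necko implementation.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    writer = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)

        # Non-blocking connect: the SYN is sent and the call returns at once.
        writer.setblocking(False)
        writer.connect_ex(listener.getsockname())

        # Blocking accept with no timeout: this is the call that never
        # returns if the local SYN is filtered before it reaches the listener.
        reader, _peer = listener.accept()

        writer.setblocking(True)
        return reader, writer
    except OSError:
        writer.close()
        raise
    finally:
        listener.close()

reader, writer = new_tcp_socket_pair()
writer.sendall(b"x")          # wake-up byte written on one end...
received = reader.recv(1)     # ...is readable on the other end
reader.close()
writer.close()
```

On an unaffected machine the pair connects immediately; the hang only appears when the loopback connection is blocked.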
Flags: needinfo?(dd.mozilla)
(In reply to Patrick McManus [:mcmanus] from comment #13)

> It sounds like this is reproducible (if the test builds were ruled out) for
> someone .. please attach a http log
> https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
for affected users the issue is reproducible - however it seems like in this error state firefox is not able to produce that logging (the log file is created but not filled with content): https://support.mozilla.org/en-US/questions/1139904?page=3#answer-922572
(In reply to [:philipp] from comment #16)
> (In reply to Patrick McManus [:mcmanus] from comment #13)
> 
> > It sounds like this is reproducible (if the test builds were ruled out) for
> > someone .. please attach a http log
> > https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
> for affected users the issue is reproducible - however it seems like in this
> error state firefox is not able to produce that logging (the log file is
> created but not filled with content):
> https://support.mozilla.org/en-US/questions/1139904?page=3#answer-922572

For 49 we changed MOZ_LOG_MODULES into MOZ_LOG. Sorry!!! We should change the doc.

if they made a log, can they add io:5 as well (just add ",io:5")
This probably has to do with bug 698882 comment 73.
the http log only consisted of three lines then:
>2016-09-30 11:50:51.193000 UTC - [Main Thread]: D/nsSocketTransport STS dispatch [4c202e0]
>2016-09-30 11:50:51.193000 UTC - [Socket Thread]: D/nsSocketTransport STS thread init 1000 sockets
>2016-09-30 11:50:51.193000 UTC - [Socket Thread]: D/nsSocketTransport PollableEvent() using socket pair
https://support.mozilla.org/en-US/questions/1139904?page=3#answer-922631
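The three log lines above are what point at the socket pair: the last thing the Socket Thread logs is `PollableEvent() using socket pair`, and nothing follows. A small sketch of how one might pull the last message per thread out of such a log is below. The regular expression is an assumption based only on the three lines quoted here, and `last_message_per_thread` is a hypothetical helper name.

```python
import re

# Pattern inferred from the quoted lines only; other MOZ_LOG output may differ.
LINE_RE = re.compile(
    r"^>?\s*(?P<time>\S+ \S+) UTC - \[(?P<thread>[^\]]+)\]: "
    r"(?P<level>\w)/(?P<module>\S+) (?P<message>.*)$"
)

def last_message_per_thread(lines):
    """Map each thread name to the last message it logged."""
    last = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            last[match.group("thread")] = match.group("message")
    return last

LOG = [
    ">2016-09-30 11:50:51.193000 UTC - [Main Thread]: D/nsSocketTransport STS dispatch [4c202e0]",
    ">2016-09-30 11:50:51.193000 UTC - [Socket Thread]: D/nsSocketTransport STS thread init 1000 sockets",
    ">2016-09-30 11:50:51.193000 UTC - [Socket Thread]: D/nsSocketTransport PollableEvent() using socket pair",
]

result = last_message_per_thread(LOG)
```

Here `result["Socket Thread"]` is the socket pair message, which matches the stack trace showing the thread stuck while creating the pair.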
Thanks. 

from bug 698882 comment 73: McAfee seems to be hardcoded to expect an NSPR layer on top of the read side of the localhost pair.

maybe using msvc2015 confuses McAfee?
(In reply to Dragana Damjanovic [:dragana] from comment #20)
> maybe using msvc2015 confuses McAfee?
Hi dragana,

I don't use McAfee software.
(In reply to assistance-multi from comment #21)
> (In reply to Dragana Damjanovic [:dragana] from comment #20)
> > maybe using msvc2015 confuses McAfee?
> Hi dragana,
> 
> I don't use McAfee software.

Do you use any other anti-virus software?
(In reply to Dragana Damjanovic [:dragana] from comment #22)
> Do you use any other anti-virus software?

Yes, Kaspersky Anti-Virus 16.0.1.445(e)

But disabling the anti-virus does not solve the problem.
Thanks Dragana!

Not the first time there is trouble with local socket pairs and AV software...

Disabling an av software sometimes doesn't disable it.  Uninstallation may be the only way sometimes to prove an influence.

To get a full log please add "sync" module (no ":5" behind it) to MOZ_LOG_MODULES variable.  MOZ_LOG_MODULES works in Firefox v 48 - 52.
Flags: needinfo?(honzab.moz)
OK, I'll try to uninstall Kaspersky AV.
(In reply to Honza Bambas (:mayhemer) from comment #24)

> Disabling an av software sometimes doesn't disable it.  Uninstallation may
> be the only way sometimes to prove an influence.

I confirm that totally removing Kaspersky solves the problem : FF49.0.1 starts normally.
(In reply to assistance-multi from comment #26)
> (In reply to Honza Bambas (:mayhemer) from comment #24)
> 
> > Disabling an av software sometimes doesn't disable it.  Uninstallation may
> > be the only way sometimes to prove an influence.
> 
> I confirm that totally removing Kaspersky solves the problem : FF49.0.1
> starts normally.

Thanks you!

Dragana, I didn't understand from your comments if this is a crash or something more serious that causes the hang or if we could just add a timeout to PR_Accept (like 5 seconds) and in case of a timeout fallback to busy poll.
(In reply to Honza Bambas (:mayhemer) from comment #27)
> (In reply to assistance-multi from comment #26)
> > (In reply to Honza Bambas (:mayhemer) from comment #24)
> > 
> > > Disabling an av software sometimes doesn't disable it.  Uninstallation may
> > > be the only way sometimes to prove an influence.
> > 
> > I confirm that totally removing Kaspersky solves the problem : FF49.0.1
> > starts normally.
> 
> Thanks you!
> 
> Dragana, I didn't understand from your comments if this is a crash or
> something more serious that causes the hang or if we could just add a
> timeout to PR_Accept (like 5 seconds) and in case of a timeout fallback to
> busy poll.

I think syn packet is blocked and we wait in accept. So if we add timeout and do busy poll the problem is resolved. I could make a patch for that.
But I am wondering why 48 works and 49 doesn't? The code is the same. compiler changed?
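The idea discussed here (a timeout on the wait, then a fallback to busy polling) can be sketched as follows. This is a Python illustration under stated assumptions, not the patch: the 5 second figure is the one floated in the comment above, while `FALLBACK_POLL_INTERVAL_S` and both function names are hypothetical.

```python
import socket

ACCEPT_TIMEOUT_S = 5.0            # "like 5 seconds", as suggested in the thread
FALLBACK_POLL_INTERVAL_S = 0.025  # hypothetical value, for illustration only

def accept_with_timeout(listener, timeout_s=ACCEPT_TIMEOUT_S):
    """Return the accepted socket, or None if no peer shows up in time."""
    listener.settimeout(timeout_s)
    try:
        conn, _peer = listener.accept()
    except socket.timeout:
        return None  # caller should fall back instead of hanging
    conn.setblocking(True)
    return conn

def poll_timeout(wakeup_socket):
    """Pick the poll timeout for the event loop.

    With a working wake-up socket the loop can sleep until something happens
    (None). Without one it has to wake up periodically: the busy poll.
    """
    return None if wakeup_socket is not None else FALLBACK_POLL_INTERVAL_S
```

The trade-off raised in the thread applies: the fallback keeps the browser usable, but periodic wake-ups are less efficient than a real wake-up event.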
(In reply to Dragana Damjanovic [:dragana] from comment #28)
> 
> 
> I think syn packet is blocked and we wait in accept. So if we add timeout
> and do busy poll the problem is resolved. I could make a patch for that.
> But I am wondering why 48 works and 49 doesn't? The code is the same.
> compiler changed?


I was thinking the same things.. my only concern here is that kaspersky is a pretty big product and busy poll is pretty sub optimal. Minimally we need telemetry on how often it happens - but I'd prefer to think we could get around this somehow.

there has to be some difference in firefox between 48 and 49 - I don't know what it is either.
(In reply to Patrick McManus [:mcmanus] from comment #29)
> (In reply to Dragana Damjanovic [:dragana] from comment #28)
> > 
> > 
> > I think syn packet is blocked and we wait in accept. So if we add timeout
> > and do busy poll the problem is resolved. I could make a patch for that.
> > But I am wondering why 48 works and 49 doesn't? The code is the same.
> > compiler changed?
> 
> 
> I was thinking the same things.. my only concern here is that kaspersky is a
> pretty big product and busy poll is pretty sub optimal. Minimally we need
> telemetry on how often it happens - but I'd prefer to think we could get
> around this somehow.
> 
> there has to be some difference in firefox between 48 and 49 - I don't know
> what it is either.

In my humble opinion, I don't think it's a compiler change, but rather some white listing in kaspersky?

@assistance-multi, are you on the latest version of the Kaspersk AV software?


And, this is in Release, we need a simple fix that will make the product work, even inefficiently.  I would go with something that fixes this simply and reliably.  Then we can think of something better for beta+.
Flags: needinfo?(assistance)
assistance-multi from comment #23:
> (In reply to Dragana Damjanovic [:dragana] from comment #22)
> > Do you use any other anti-virus software?
> 
> Yes, Kaspersky Anti-Virus 16.0.1.445(e)

Honza, I can confirm that he has the latest version of Kaspersky's 2016 version. They keep updating & maintaining it even though they have a 2017 offering.

See https://support.kaspersky.com/12118 & expand the Patches A - E for version 16.0.1.445 section. Patch E for Kaspersky Anti-Virus 2016 was released on August 16, 2016.
Flags: needinfo?(assistance)
(In reply to Honza Bambas (:mayhemer) from comment #30)
>
> 
> In my humble opinion, I don't think it's a compiler change, but rather some
> white listing in kaspersky?

why? (I have no idea what it is - I don't see any evidence for either of those that fits with the facts... this is just the event socket at SYN time..)

> 
> @assistance-multi, are you on the latest version of the Kaspersk AV software?

The other thing is that obv not all kaspersky users are affected. I'm running it fine for instance - it's stock on moco installs.

> 
> 
> And, this is in Release, we need a simple fix that will make the product
> work, even inefficiently.  I would go with something that fixes this simply
> and reliably.  Then we can think of something better for beta+.

agreed.
I do not know Kaspersky, but probably there is a way to get a log. Maybe there is a way to log events like connection blocked and why.
I found lizzard needinfo'd some Kaspersky folk in bug 1271875 comment 7.

I'll do the same.
Flags: needinfo?(kaspersky-antivirus)
Flags: needinfo?(alexey.drozdov)
please note that this issue is not exclusive to kaspersky, as we also have plenty of users with other security software reporting the same issue:
mcafee - https://support.mozilla.org/en-US/questions/1140828
mse+zonealarm - https://support.mozilla.org/en-US/questions/1139904?page=3#answer-922677
avg - user was on irc, this is his software list: https://pastebin.mozilla.org/8913243
(In reply to [:philipp] from comment #8)
> Created attachment 8795802 [details]
> FIREFOX-DEBUG_17A4_2016-09-28_19-47-29-864.LOG
> 
> we had a very helpful irc user (thanks, firefoxer!) who was affected and
> successfully ran regression testing on the issue and came up with a
> stacktrace as well:
> https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=afb08f0036485e6840155eb3104327345670cd63&tochange=3ea44d60cb3fe23293fe3326c43636336e4ac8dc

Philipp, can you ask this person to try to find regression range again, but this time when browser starts also check if pages are loading. If bug 1260764 is removed it can be that firefox starts but I think that  socketThread hangs so it will not load anything. Or maybe he did it?

assistance-multi, can you try to find regression range for us?
Flags: needinfo?(madperson)
Flags: needinfo?(assistance)
(In reply to Dragana Damjanovic [:dragana] from comment #36)
> Philipp, can you ask this person to try to find regression range again, but
> this time when browser starts also check if pages are loading. 
unfortunately i have no way to reach out to that user again, as he was an anonymous user on an irc support channel (unless he follows this bug and reports back). we did run mozregression-testing just on the fact if a window opens at startup of the browser or not.

> If bug 1260764 is removed it can be that firefox starts but I 
> think that socketThread hangs so it will not load anything.
we did give "assistance-multi" test builds with the state just before bug 1260764 landed and he indeed reported that the window opened but without connectivity & firefox.exe still continued running after the firefox UI was closed: https://support.mozilla.org/en-US/questions/1139904?page=2#answer-922565
Flags: needinfo?(madperson)
[Tracking Requested - why for this release]: Browser startup issue.
Component: Untriaged → Networking
Product: Firefox → Core
i've tagged the user reports on sumo that i suspect to be this bug at https://support.mozilla.org/en-US/questions/firefox?tagged=bug1305436&show=all
Can they open a page?
in the cases where the UI opened successfully ('+'), there was no connectivity to the web, locally stored pages could be opened though.
(In reply to [:philipp] from comment #42)
> in the cases where the UI opened successfully ('+'), there was no
> connectivity to the web, locally stored pages could be opened though.

The problem is that socketThread hangs. bug 1260764 added a lock, so without that patch firefox would start but still no network request would be sent. Maybe also some other patches rearranged code and may cause firefox to start but still the main problem is that socketThread hangs.
That part of necko code did not change between 48 and 49. Under necko there is nspr. I am not sure if something changed there, probably not much. Maybe having io:5 log for 48 and 49 would help. nspr does not have a lot of logging, so I am not optimistic.

it would be good if people from kaspersky would help us figure out why our socket pair is blocked.
(In reply to Dragana Damjanovic [:dragana] from comment #44)
> Maybe having io:5 log 

Please don't forget that logging for the NSPR code must be set via NSPR_LOG_MODULES and NSPR_LOG_FILE (different path than MOZ_LOG_FILE!) vars.  Those still apply for NSPR and it won't change!
Bug 698882 was backed out from 48. I will make a couple of builds for users to try. Try is currently closed.
In the new version we set some additional socket options.
ok, please disregard my comment #40 - something must have gone wrong during testing, since try builds with just bug 1260725 backed out exhibit the same issue (no ui), so this was unrelated and it looks like a networking issue after all.
No longer blocks: 1260725
https://treeherder.mozilla.org/#/jobs?repo=try&revision=51836a9c48ad

This is a build where I removed setting RECVBUF
Flags: needinfo?(madperson)
Tracking 52+ for this issue which relates to Firefox not starting after install.
The build provided by Philipp in the support forum works fine for me : Firefox UI OK & Connectivity OK.

https://support.mozilla.org/fr/questions/1139904?page=4#answer-923592
yeah, thanks dragana!
Flags: needinfo?(madperson)
Given the severity and timing of this new issue, tracked for 50+
Attached patch bug_1305436.patch (obsolete) — Splinter Review
The main problem is recvbuf.

There are different ways to make this change, but I decided to use PR_NewTCPSocketPair to be on the safer path and not force users to try different builds. Let's go with this patch for uplift to release (if accepted) and we can remove NewTCPSocketPair() or try something else on nightly.

As a reminder, we increased recvbuf because of some bug in some older versions of windows with tcp-autotune (I forgot the real name) options set.
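For context, "setting recvbuf" means asking the OS for a specific receive buffer size on a socket. A minimal Python sketch of that socket option is below. The size is a hypothetical value picked for illustration, not the one Firefox uses, and `set_receive_buffer` is a made-up helper name.

```python
import socket

# Hypothetical size, for illustration only; not the value used by Firefox.
EXAMPLE_RCVBUF_BYTES = 64 * 1024

def set_receive_buffer(sock, size_bytes):
    """Request a receive buffer size and return what the OS reports back."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size_bytes)
    # The OS may round, double, or cap the requested value.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
granted = set_receive_buffer(sock, EXAMPLE_RCVBUF_BYTES)
sock.close()
```

Why this option made some security software block the loopback pair was never established in this bug; the sketch only shows which knob is being discussed.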
Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
Attachment #8797354 - Flags: review?(mcmanus)
Comment on attachment 8797354 [details] [diff] [review]
bug_1305436.patch

Review of attachment 8797354 [details] [diff] [review]:
-----------------------------------------------------------------

dragana and I talked about this face to face. the concern is that this approach will certainly regress bug 1248358. We can do that if we absolutely have no other recourse, but let's try a version that uses a smaller rcvbuf (but still non zero when << 14 as per that other bug)
Attachment #8797354 - Flags: review?(mcmanus)
Whiteboard: [necko-active]
Flags: needinfo?(assistance)
(In reply to Dragana Damjanovic [:dragana] from comment #57)
> can you please try if it starts and if it has connectivity. Thank you very
> much!

Hi Dragana.
I've only tested 64bit versions, but sorry to tell you that none has started correctly : process running, no UI.
Comment on attachment 8798368 [details] [diff] [review]
bug_1305436_v4.patch

Review of attachment 8798368 [details] [diff] [review]:
-----------------------------------------------------------------

this is pretty straightforward and low risk for uplifting.
Attachment #8798368 - Flags: review?(mcmanus) → review+
assistance-multi, may I ask you to try this build. It is good to have the last check that everything works.
Thank you!

https://archive.mozilla.org/pub/firefox/try-builds/dd.mozilla@gmail.com-a05195a7cbdb42ed131bcf01b69bc763b7358a31/try-win64/firefox-52.0a1.en-US.win64.zip
(In reply to Dragana Damjanovic [:dragana] from comment #61)
> https://archive.mozilla.org/pub/firefox/try-builds/dd.mozilla@gmail.com-a05195a7cbdb42ed131bcf01b69bc763b7358a31/try-win64/firefox-52.0a1.en-US.win64.zip

Hi Dragana, this build does not work, still the same issue.
Can you please make a http log, I really do not understand what is happening :(

Thank you very much!
https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging

and use these MOZ_LOG values (instead of MOZ_LOG_MODULES)
cd c:\
set MOZ_LOG=timestamp,sync,nsHttp:5,nsSocketTransport:5
set MOZ_LOG_FILE=%TEMP%\log.txt

this is just a reminder, please use the firefox that you downloaded from comment 61 ("cd" to the directory you downloaded that firefox)
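The same setup can be scripted. The sketch below builds the environment for a logged run in Python, using the MOZ_LOG value given above and the separate NSPR variables mentioned earlier in this bug. The helper name `logging_env`, the file name `nspr-log.txt`, and the use of `io:5` for NSPR_LOG_MODULES are assumptions for illustration.

```python
import os

MOZ_LOG_VALUE = "timestamp,sync,nsHttp:5,nsSocketTransport:5"

def logging_env(log_dir, base_env=None):
    """Return a copy of the environment with Firefox and NSPR logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Firefox 49+ reads MOZ_LOG (not MOZ_LOG_MODULES), per the comments above.
    env["MOZ_LOG"] = MOZ_LOG_VALUE
    env["MOZ_LOG_FILE"] = os.path.join(log_dir, "log.txt")
    # NSPR's own logging is configured separately and needs a different file.
    env["NSPR_LOG_MODULES"] = "io:5"
    env["NSPR_LOG_FILE"] = os.path.join(log_dir, "nspr-log.txt")  # hypothetical name
    return env

example_env = logging_env("logs", base_env={})
```

To use it, pass the result as `env=` to `subprocess.Popen` together with the path to the firefox.exe from the try build; that path depends on where the zip was unpacked.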
Sorry, sorry, I just figured out where the error is.
Really sorry. I do not need the log.
re 62 - I should have noticed in the review.. the timeout needs to be on accept not connect.. in theory it could be either way, but the stack trace that was provided earlier makes it clear it needs to be on accept. so we should still be able to get a workable patch using the same basic idea.
Attachment #8798368 - Attachment is obsolete: true
assistance-multi, can you try this one, please? Sorry that the last one did not work, it was my mistake.

Thank you very much for helping us!

https://archive.mozilla.org/pub/firefox/try-builds/dd.mozilla@gmail.com-44f63321298c52db35cfdd441cfe4be9a11c0a08/try-win64/firefox-52.0a1.en-US.win64.zip
Attachment #8798484 - Flags: review?(mcmanus)
Attachment #8798484 - Flags: review?(mcmanus) → review+
Keywords: checkin-needed
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/946e587ef22e
"Fix issue with Firefox 49 won't start after installation". r=mcmanus
Keywords: checkin-needed
Flags: needinfo?(assistance)
assistance-multi, can you try this one, please? Sorry that the last one did
not work, it was my mistake.

Thank you very much for helping us!

https://archive.mozilla.org/pub/firefox/try-builds/dd.mozilla@gmail.com-44f63321298c52db35cfdd441cfe4be9a11c0a08/try-win64/firefox-52.0a1.en-US.win64.zip
Flags: needinfo?(assistance)
I checked manually bug 1248358 with this change and there is no regression of that bug.
https://hg.mozilla.org/mozilla-central/rev/946e587ef22e
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
(In reply to Dragana Damjanovic [:dragana] from comment #69)
> assistance-multi, can you try this one, please? Sorry that the last one did
> not work, it was my mistake.
> Thank you very much for helping us!

Hi Dragana, this build works : Firefox UI + connectivity OK.
Thanks
(In reply to assistance-multi from comment #72)
> (In reply to Dragana Damjanovic [:dragana] from comment #69)
> > assistance-multi, can you try this one, please? Sorry that the last one did
> > not work, it was my mistake.
> > Thank you very much for helping us!
> 
> Hi Dragana, this build works : Firefox UI + connectivity OK.
> Thanks

Thanks a lot for your help, it is really appreciated!
Comment on attachment 8798484 [details] [diff] [review]
bug_1305436_v4.patch

Approval Request Comment
[Feature/regressing bug #]:  bug 698882
[User impact if declined]: Firefox does not start: the process runs but there is no UI at all. The bug is connected to some firewalls, but does not affect all users of these firewalls, just some of them. (The cause is that some firewalls block our internal socket pair.)
[Describe test coverage new/current, TreeHerder]: Tested by a user (comment 72). And there is no regression of bug 1248358 (I have tested it, comment 70)
[Risks and why]: low
[String/UUID change made/needed]: none
Attachment #8798484 - Flags: approval-mozilla-release?
Attachment #8798484 - Flags: approval-mozilla-beta?
Attachment #8798484 - Flags: approval-mozilla-aurora?
Can you tell me if a 49.x version of Firefox will be available (soon ?) with a fix for this bug ?
Thanks
(In reply to assistance-multi from comment #75)
> Can you tell me if a 49.x version of Firefox will be available (soon ?) with
> a fix for this bug ?
> Thanks

It is not my decision, I asked for an uplift, but decision depends on weighing risks and impact. The decision will be made soon and you will see it in this bug.
Comment on attachment 8798484 [details] [diff] [review]
bug_1305436_v4.patch

Fixes a recent (severe but not widespread) regression from Fx49, fix was verified on a try build, Aurora51+, Beta50+
Attachment #8798484 - Flags: approval-mozilla-beta?
Attachment #8798484 - Flags: approval-mozilla-beta+
Attachment #8798484 - Flags: approval-mozilla-aurora?
Attachment #8798484 - Flags: approval-mozilla-aurora+
Flags: needinfo?(dbolter)
Looks like this should be in the next beta release, 50.0b7, this Friday. 
We can test from there. 

Marking this as a 49 blocker, because this sounds bad enough that it should be a 49 dot release driver along with bug 1306472.   

Will affected users get this update? If we think they won't be able to, then we can put out the word that anyone affected will have to download and install a fresh copy of 49.0.2.
Andrei, can we replicate this issue? 
Dragana, does this take not just some specific antivirus software, but a particular network configuration? Can you or mcmanus help us with a test setup?
Flags: needinfo?(mcmanus)
Flags: needinfo?(dd.mozilla)
Flags: needinfo?(andrei.vaida)
Comment on attachment 8798484 [details] [diff] [review]
bug_1305436_v4.patch

Fix for an issue where some users with particular network configurations cannot see the Firefox UI on startup.   Taking this on m-r for a potential 49.0.2 dot release.
Attachment #8798484 - Flags: approval-mozilla-release? → approval-mozilla-release+
liz - the dev team was never able to reproduce. It's clearly related to firewall software, but we weren't able to find the exact circumstances despite trying some of the same software. We've relied on reporters verifying the fix on channels the fix is already included in.
Flags: needinfo?(mcmanus)
Then I think our testing should focus on making sure we didn't break normal updates and functioning. 

I don't have a good way to assess risk here other than your and Dragana's judgement.
The risk here is pretty low.. pretty much the new code just uses the current algorithm, but instead of blocking on it, it will time out and try some fallback strategies. So the vast majority that aren't seeing the problem will keep using the same code path.
I agree with Patrick.
Flags: needinfo?(dd.mozilla)
Flags: needinfo?(dbolter)
We were unable to reproduce this bug on an affected build. Testing was conducted under various conditions and with different popular antivirus software, with no luck.

assistance-multi, would it be possible for you to verify this fix on 49.0.2? We'd really appreciate it. Here's where you can download the build to test on:

http://archive.mozilla.org/pub/firefox/candidates/49.0.2-candidates/build1/
Flags: needinfo?(andrei.vaida)
> assistance-multi, would it be possible for you to verify this fix on 49.0.2?
> We'd really appreciate it. Here's where you can download the build to test
> on:
> http://archive.mozilla.org/pub/firefox/candidates/49.0.2-candidates/build1/

Hi Andrei, this build works fine.
To be more accurate, I run the following version :
49.0.2
Mozilla Firefox EME-free
Mozilla-EMEfree - 1.0
I found the installer here : http://archive.mozilla.org/pub/firefox/candidates/49.0.2-candidates/build1/win64-EME-free/fr/
Release Note Request (optional, but appreciated)
[Why is this notable]:
[Suggested wording]: Network issue prevents some users from seeing the Firefox UI on startup (Bug 1305436)
[Links (documentation, blog post, etc)]:
assistance-multi, thanks so much for the bug report and for working with us to test. Sending you a gift in appreciation!
Flags: needinfo?(assistance)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #91)
> assistance-multi, thanks so much for the bug report and for working with us
> to test. Sending you a gift in appreciation!

You're welcome Liz, as a longtime user (and fan) of Mozilla's software, I find it legitimate. It's really interesting to follow up your process for debugging/fixing a bug.

If I had time (and enough skills) it would be a pleasure to get involved in Mozilla's development, but it's not possible... maybe later ?
I'm not convinced that this issue is *only* caused by firewall issues. My 49.0.2 version exhibited the behaviour described above as soon as it updated to that version, and nothing I did, including disabling the Kaspersky firewall, made any difference. Uninstalling and reinstalling did not work. Installing 64-bit did not work. Renaming profile folder between installs to provoke creation of new profile did not work.

Eventually I fixed it by uninstalling every Mozilla app, deleting every Mozilla folder including profiles, deleting every Mozilla key in the registry, rebooting and installing again. It seems to me that the issue is related to either incomplete installation / uninstallation into the program folders, or more likely the registry.