Reporter

Description

•

5 years ago

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36

Steps to reproduce:

Thunderbird periodically (randomly) stops fetching emails from my POP server. When I suspect it has "stalled", I manually try to down load any emails. When I hit the "Get Messages" button, I get a popup from Thunderbird with the text shown below.

I'm using Thunderbird 60.8.0 (32 bit) on Windows 10. It has been happening on earlier versions, too. I just decided to finally report it.

Actual results:

Get a popup with the following text:

The POP3 mail server (pop.verizon.net) does not support UIDL or XTND XLST, which are required to implement the Leave on Server'',Maximum Message Size'' or ``Fetch Headers Only'' options. To download your mail, turn off these options in the Server Settings for your mail server in the Account Settings window.

Expected results:

Should have tried to download any new emails. There is nothing wrong with my configuration -- either on the email server or in Thunderbird. All I have to do to clear the problem is exit Thunderbird & re-start it. Then it immediately fetches new emails. Just by restarting Thunderbird the problem goes away -- until a random time in the future (usually several days) when it happens again.

WaltS48 [:walts48]

Comment 1

•

5 years ago

In my Verizon accounts settings.

I have Leave on server set to until I delete them.
Save disk space and fetch headers are disabled.

I don't have the problem you are seeing in any version I've used.

Are you using an antivirus program to scan emails? Try disabling it.

This sounds like a Thunderbird Support problem.

BigbirdPhila

Reporter

Comment 2

•

5 years ago

Thanks for your input, Walt. I don't want to ever delete them on the server, even if I clean up (delete) locally on my machine. Verizon email (now handled by AOL, a subsidary of Verizon) gives us unlimited disk space. That way I'll be able to always find an email if I really have to -- even if I deleted it locally.

I have fetch headers disabled. I don't have a "save disk space" option, though -- at least I can't find it.

I don't have an antivirus program scanning the emails.

WaltS48 [:walts48]

Comment 3

•

5 years ago

(In reply to BigbirdPhila from comment #2)

Thanks for your input, Walt. I don't want to ever delete them on the server, even if I clean up (delete) locally on my machine. Verizon email (now handled by AOL, a subsidary of Verizon) gives us unlimited disk space. That way I'll be able to always find an email if I really have to -- even if I deleted it locally.

I have fetch headers disabled. I don't have a "save disk space" option, though -- at least I can't find it.

"To save disk space, do not download: Messages larger than" is in the Disk Space item in "View settings for this account".

I don't have an antivirus program scanning the emails.

Okay.

BigbirdPhila

Reporter

Comment 4

•

5 years ago

Ok. Found it. I also have it disabled.

Interesting that you & I both have Verizon email addresses. I'm seeing the problem, you're not.

So, I'd like to add to the report that I also have Thunderbird accessing (polling) my Gmail account -- just in case it's some subtle interaction with Gmail that's causing the problem across all accounts. I'm accessing my Gmail account using POP, too.

Benjamin Flanagin

Updated

•

5 years ago

Component: Untriaged → Networking: POP

Product: Thunderbird → MailNews Core

BigbirdPhila

Reporter

Comment 5

•

5 years ago

Another observation: the problem just happened again. I shut down Thunderbird & re-started it. This did NOT fix the problem. Then I noticed that I had the Address Book window open (across the restart of Thunderbird). I closed the Address Book, exited Thunderbird, and re-started it. Problem went away. So, the problem persists across a Thunderbird restart if the Address Book stays open.

BigbirdPhila

Reporter

Comment 6

•

5 years ago

More info: I have Thunderbird accessing 3 email accounts. Two Verizon (AOL), and one Gmail. All via POP. When it happens to one account, it does NOT happen to the other two. I just had it happen to one of the Verizon accounts and the other Verizon account (same POP server) polled just fine.

I don't think it ever happened to my Gmail account. Not 100% sure of that since that's a very low-volume account. I would notice that the others have "stalled" long before noticing that nothing's been coming in on Gmail. Don't pay much attention to the Gmail account.

Wayne Mery (:wsmwk)

Comment 7

•

5 years ago

So this doesn't happen if fetch headers only is enabled?

Flags: needinfo?(bbird_0003)

BigbirdPhila

Reporter

Comment 8

•

5 years ago

I've never tried "fetch headers only".

Flags: needinfo?(bbird_0003)

Barry S. Finkel

Comment 9

•

5 years ago

The problem is in Thunderbird. It sends a POP3 "UIDL" command, and the (in my case inbound.att.net) server responds with a temporary error - "-ERR [SYS/TEMP] Server error - Please try again later." I have not yet looked at the various POP3 RFCs to see what they say about [temporary] error responses to commands. Thunderbird responds by sending the POP3 command "XTND XLST Message-Id" (as if no error message had occurred), and the inbound server replies "-ERR Invalid Command.". This sets a flag in Thunderbird - the server does not accept UIDL or XTND', The only way to clear this is to shut down Thunderbird and restart it. See this URL:

https://support.mozilla.org/en-US/questions/1268468

which I opened (and which was recently closed/archived, for no apparent reason). That thread has the results of Thunderbird POP3 traces that I ran (and still have running).

It seems to me that when Thunderbird receives the temporary error message, it should stop mail downloading and wait for the next user-supplied interval to get new messages. Note that I have not looked in detail at the Thunderbird source. Note that I am currently using Thunderbird 68.5.0 on Windows 7 32-bit (with a migration to Windows `10 64-bit shortly).

BigbirdPhila

Reporter

Comment 10

•

5 years ago

Thank you, Barry. Excellent analysis, IMHO. I was going to request again that someone take a look at this because it's still happening to me. I have no idea that Thunderbird has "stalled" on a particular account until I notice my phone is reporting incoming emails and my PC isn't. That's when I re-start Thunderbird and emails come flowing in for that account.

Can someone with knowledge of Thunderbird internals please take a look at this? It looks like Barry has figured out a simple solution.

Barry S. Finkel

Comment 11

•

5 years ago

Analysis was not hard, once I got the debug environmental variable from someone on forums.mozilla.org. Then I had to wait for an occurrence and look at the diagnostic log. I still have the logs, if anyone needs them. All of the explanations I had read on the Thunderbird forum and on the at&t and other ISP fora did not make sense to me. It made no sense that an ISP POP3 server would be down (mysteriously) until I (and others) restarted Thunderbird.

Barry S. Finkel

Comment 12

•

5 years ago

Two further comments. I had an another occurrence this morning. The log shows the same "RECV: -ERR [SYS/TEMP] Server error - Please try again later." response from the server and a further response "RECV: -ERR Invalid Command." I have not looked at the log in detail, but in the previous logs, Thunderbird does connect to the POP3 server after the error, at the user-specified interval, but Thunderbird does not send the UIDL nor XTND commands, so no mail is downloaded. I have not looked at the code, but it seems that if Thunderbird is changed so that it does not remember the error code from the XTND command, everything should work properly.

I am not sure what should be done if an initial download attempt sends the commands, and the POP3 server really does not support those commands. I have no idea how many POP3 servers remain that do not support these commands.

David Smith

Comment 13

•

5 years ago

This error happened to me, I found Barry's thread on SUMO, and did some digging into the source.

The error has to do with the logic in: comm-central/comm/mailnews/local/src/nsPop3Protocol.cpp.

Functions of interest:

nsPop3Protocol::SendUidlList() — Will trigger a call to GetUidlList() if either POP3_HAS_UIDL or POP3_UIDL_UNDEFINED capability flags are set. If not, it will redirect to SendXtndXlstMsgid.

nsPop3Protocol::SendXtndXlstMsgid() — Will call HandleNoUidListAvailable if XTND/XLST are not supported or undefined (as is the legitimate case for the AT&T servers).

nsPop3Protocol::HandleNoUidListAvailable() — Will abort any attempts to get mail if "leaving mail on server"-type settings are set. This is where the "fix" of turning off those settings comes into play.

nsPop3Protocol::GetUidlList(nsIInputStream* inputStream, uint32_t length) — Explicitly clears the POP3_UIDL_UNDEFINED flag. If the previous command failed, it will also clear the POP3_HAS_UIDL flag. There are no means of redefining either of those flags, which means after they are cleared, it's impossible to get mail again until you restart the client, and the capability flags are reset to undefined.

nsPop3Protocol::WaitForResponse(nsIInputStream* inputStream, uint32_t length) — This function determines if the query succeeded or failed (setting m_pop3ConData->command_succeeded). It looks for a response of "RECV: %s". If the first character of the %s is a "+", then it's a success. As we see from Barry's logs, the response is "RECV: -ERR [SYS/TEMP] Server error - Please try again later.", which means the first character of the %s is a "-", indicating an error.

The function then checks for various failure codes after the "-ERR " portion of the response, including "[SYS", which is what we see. However it makes no attempt to check for anything after that, and dumps everything into the common failure bucket, rather than consider the possibility of a temporary failure.

There's actually a comment in the GetUidlList function that explicitly assumes that command_succeeded will remain constant, though that's not the case.

The correct fix for this will be to switch from a simple bool to a three-state value which can correctly mark a failure as temporary, rather than permanent, and not clear the UIDL flags (and probably other flags that are used in the class) after receiving such an error.

BigbirdPhila

Reporter

Comment 14

•

4 years ago

Can someone please take a look at this bug? Some users have taken a look at the code and proposed possible solutions. It doesn't appear to be a difficult bug to fix. My email provider is AOL, with a "verizon.net" domain. Thunderbird is stalling 4 or 5 times a day for me now without reporting a problem. The only reason I know email delivery has stalled is because my phone is reporting incoming emails. When I manually tell Thunderbird to "Get All New Messages", I get the "... does not support UIDL or XTND XLST ..." popup. Exiting Thunderbird & re-starting it clears the error message/stall (temporarily). Please fix ASAP!!!

WaltS48 [:walts48]

Comment 15

•

4 years ago

I have been noticing this recently with my Verizon email account.

I ignore it and the next time Verizon checks for email I get the messages if there are any to retrieve. It's not a heavily used account.

Barry S. Finkel

Comment 16

•

4 years ago

To Walt548 - If you do nothing with Thunderbird, and Verizon continues to load mail later, then you do NOT have the same problem. With the problem that has been reported here, Thunderbird will connect to the server at each successive (user-defined) interval, but will NOT download mail because once it receives the temporary error, it tries a command that fails, and it remembers the failure. The ONLY way to clear Thunderbird's brain is to Exit and restart Thunderbird.

There are two environmental variables to get diagnostic information:

 MOZ_LOG = POP3:5,time
 MOZ_LOG_FILE=C:\Users\...\Desktop\pop3.log

I put the log file on my desktop; it can go anywhere. One has to set the variables and then restart Thunderbird to get the POP3 log entries. I do not know if one has to reboot to get the environmental variables loaded.

Barry S. Finkel

Comment 17

•

4 years ago

I have not looked at the code, but I think that the fix is simpler than what David Smith posted above.

First, parse the error message. If the error message is temporary, then all the code needs to do is to exit the mail downloads for this interval.

At the next user-configured interval, the code will again check for mail. With this solution, the code will not send the incorrect UIDL command to the mail server, and thus will not remember that the server does not accept the UIDL command.

Note that in the past few days, I have seen a temporary error on the outbound.att.net server once or twice each day, requiring my proce4ssing of the pop3.log diagnostic file and restarting Thunderbird.

WaltS48 [:walts48]

Comment 19

•

4 years ago

Still seeing this using Version 68.11.0?.

I'm also wondering why only Verizon and not other POP3 accounts.

Flags: needinfo?(bbird_0003)

WaltS48 [:walts48]

Updated

•

4 years ago

Flags: needinfo?(dsmith)

Flags: needinfo?(bsfinkel)

BigbirdPhila

Reporter

Comment 20

•

4 years ago

Yes, still seeing it in 68.11.0. I keep hoping it was fixed as I install each new update but, alas, not yet.

It has gotten more frequent recently. About 4 to 5 times a day now.

"Verizon" email has taken a noticeable step backwards in reliability ever since AOL took it over. I see (recoverable) errors now that I never saw when Verizon engineers had control. Like not being able to log into the smtp server to send an email for hours at a time -- which then clears on its own. But I (and most others) don't want to change email addresses because it's such a PIA. So, I'm stuck with re-starting Thunderbird 4 or 5 times a day and living with the other problems (that aren't Thunderbird's fault).

I'm disappointed that it has taken them so long to fix which looks like a pretty simple bug. I made a significant contribution to Thunderbird when I elected to start using it a couple years ago. I don't understand how they decide what bugs to fix.

Flags: needinfo?(bbird_0003)

Barry S. Finkel

Comment 21

•

4 years ago

Maybe if I get a chance, I will look at the code. I am not sure in what language the affected code is written. And I do not know the structure of the code, so I do not know if I can determine how to (as I wrote above) "exit the mail downloads for this interval".

Flags: needinfo?(bsfinkel)

David Smith

Comment 22

•

4 years ago

WaltS48: It's not just Verizon. I've encountered the error on AT&T/Yahoo. Apparently Verizon's servers are flakier, though.

Barry: The code is C++. I found where they had a webhost showing all the source code. searchfox.org works well enough, if you follow it down mailnews/local/src directories. The file is nsPop3Protocol.cpp.

I'm not as interested in trying to figure out Mozilla's contribution process at the moment. The codebase is too large, and I believe there are a lot of third party things that need to be dealt with, as well.

Flags: needinfo?(dsmith)

David Smith

Comment 23

•

4 years ago

Attached patch patch (obsolete) — Details — Splinter Review

The attached patch is based on the current hg repo, as of noon Pacific Time today.

It has not yet been tested, as I haven't gotten everything set up for building on my machine. The patch just covers the logic necessary for the problem happening in this bug. I don't know who's responsible for the module, or where to even begin looking at testing.

I also haven't checked whether any other parts of this code file could benefit from checking for temporary failures.

Bug 1577548 - Don't allow temporary failures to corrupt UIDL capability setting.

Barry S. Finkel

Comment 24

•

4 years ago

I have not looked in detail to see what this patch changes. I have looked at the base module, but there are parts that I do not understand. And I do not know C++. I see that this patch adds new boolean flags. It seems to me that maybe this patch is too complicated. See my post of two months ago, where I give my thoughts on what needs to be changed. Basically, we have to parse the message, which this patch does. If the message points to a temporary error, then all we need to do is to terminate this check for new mail messages. If we have not set any flags, then at the next user-supplied interval, Thunderbird will connect to the mail server (as usual) and download new messages. I think that the current code sees the error message, treats it as a permanent error, and then sends a UIDL command. This UIDL command fails, and Thunderbird remembers that fact. This causes the next attempts at mail retrieval to fail because Thunderbird has remembered that the UIDL command (which should not have been sent) has failed in the past. If we terminate the mail download before sending the UIDL command, then we do not have to change any UIDL code.

A related problem. This morning (09/23) between 4 AM and 8 AM CDT (0900 to 1300 UTC), Thunderbird experienced this UIDL error. When I went to look at the pop3.log file to extract the records, I found that the last record in the flog file had timestamp "2020-09-16 21:28:59.661000 UTC". That was seven days ago, and I do not know why the logging stopped. This is minor, as I am confident that the Thunderbird UIDL error this morning is the same as in this bug report.

David Smith

Comment 25

•

4 years ago

There were a few errors in your post, but it did make me take another look at the code logic, and I made some tweaks.

I see that this patch adds new boolean flags.

It adds a single boolean flag, to indicate whether the failure was temporary or not.

See my post of two months ago, where I give my thoughts on what needs to be changed. Basically, we have to parse the message, which this patch does. If the message points to a temporary error, then all we need to do is to terminate this check for new mail messages.

This isn't possible. The function that parses the error message does not know whether you're trying to download mail, as it also handles server responses for authentication and quitting, as well as certain other queries. As such, it cannot terminate the check for new mail; it's just not part of that code's scope.

Without another boolean flag, the response-handling code has no way to signal to any caller in the state machine anything other than succeed/fail. If the caller is trying to do mail downloads and sees the failure response to the UIDL query, it then has no way of knowing whether this means UIDL isn't supported (in which case it shouldn't try to send more of those queries, but can still continue with the fallback mechanisms) or if it's a temporary error and should abort the attempt to download additional mail entirely.

You're saying, "If the error is temporary, do X", but the function that knows whether it's temporary can't do X, and the function that can do X has no way of knowing if the error is temporary without an additional flag being set.

Further, this runs through a state machine. The response-handling function is not directly called by any other functions. Rather, each such function sets a value saying what part of the state machine it expects to run next. What happens after that is not explicitly known by the code that receives the server response.

Because the state machine flow is rather complex, and I absolutely do not want to touch anything that might impact authentication, I wanted to do the absolute minimally intrusive change.

I think that the current code sees the error message, treats it as a permanent error, and then sends a UIDL command. This UIDL command fails, and Thunderbird remembers that fact.

Slightly different order. The existing code sends a UIDL command as part of the process of acquiring mail (if UIDL is either known to be supported, or is not yet defined), then sees an error (either temporary or permanent), and disables UIDL in such a way that it can never be re-enabled.

This causes the next attempts at mail retrieval to fail because Thunderbird has remembered that the UIDL command (which should not have been sent) has failed in the past. If we terminate the mail download before sending the UIDL command, then we do not have to change any UIDL code.

The UIDL command should always be sent unless it's known that UIDL is not supported. It's the UIDL command's own failure that (currently) causes it to disable further use. You can't terminate before sending the UIDL command because the error comes from the UIDL command.

Remember that it should be perfectly functional to query a server that legitimately does not support UIDL, and the program should still work, falling back on XTND XLST. As such, terminating the download process when UIDL fails is not good behavior.

We also have to consider the potential of the UIDL failure happening on the very first attempt to download email, when the UIDL capability is still unknown. You still need to be able to distinguish between temp and permanent failures before you choose to set the capability to false, or you'll end up with the same problem noted in this bug, just slightly less common.

Note: The alternative is to not track the UIDL capability of the server at all. If the UIDL query fails, just immediately bail out. This generates extra traffic for mail servers that do not support UIDL because they have to fail the request every single time, but I don't think there are very many of them (but that's entirely speculative).

It would not be a difficult change, but it does change the handling of the fallback calls a bit, and I'm not sure about the desire to retain knowledge of the UIDL support capability. In this case, it would either require removing variables, or leaving in variables that do not perform their stated function. It's riskier in terms of potential side-effects.

David Smith

Comment 26

•

4 years ago

Attached patch patch 2 (obsolete) — Details — Splinter Review

Revision

Barry S. Finkel

Comment 27

•

4 years ago

I did not compare the two patches to see what has been changed. I do not understand the new UIDL code after

if (m_pop3ConData->command_temp_failed) return Error("pop3UidlFailure");

That code seems to do some UIDL testing/setting. I do not know what exactly the "return Error(...)" does. If all it does is terminate the current check for new messages, then I do not see why the new UIDL code after it is doing. I need to study the patch and code in more detail. In the current code, Thunderbird knows not to send a UIDL command because the mail server does not accept it. But when the error message is returned, and Thunderbird treats the error as permanent, then Thunderbird sends a UIDL command, which fails. I am not sure why Thunderbird sends the UIDL command.

David Smith

Comment 28

•

4 years ago

The GetUidlList function is only called if the server either definitely has UIDL capabilities, or we don't know whether it has UIDL capabilities yet. If we know that it does not support UIDL, this function will not be called.

The error return tells the state machine to exit processing. The error is returned only if there was a temporary error. With a temporary error, you can't determine whether the server supports UIDL or not, so you can't set the capability flag, and if UIDL is supported, then obviously you can't use UIDL if it's failing. So it just exits, as you suggested.

After that we know that either the read succeeded, or failed in a more permanent manner. If the capability flag is undefined, then it's safe to define it. We only ever want to define the capability when we know we got a valid success or failure, and once we know whether the server has UIDL capability or not, we don't want to change it. As such, even if UIDL fails later on, it won't undefine the result we already determined.

However, once we've passed this point, we could still be in a fail state (mainly if we started as undefined, and determined that UIDL was not supported). In the case of a failure, we need to redirect the state machine to use the XTND_XLST fallback instead.

Note: the final if check needs to be vs whether the results succeeded, not whether UIDL is supported. This fixes some edge cases the previous revision would have allowed. Updated to revision 3.

David Smith

Comment 29

•

4 years ago

Attached patch patch 3 (obsolete) — Details — Splinter Review

Barry S. Finkel

Comment 30

•

4 years ago

I don't know the code other than this one routine, which I do not understand in its entirety. It still seems to me that UIDL processing does not need to be changed, as it appears to work in my situation. Thunderbird knows no to send the UIDL command, and does not send it - except in this one case of a [temporary error], which is not parsed correctly, and then causes Thunderbird to send a UIDL command in error. Essentially, 1) initialize the new flag to false, 2) set the flag to true if the error is temporary, and 3) return Error if the flag is true.

m_pop3ConData->command_temp_failed = false;

 if (m_commandResponse.Find("[SYS/TEMP", true) >= 0)

   m_pop3ConData->command_temp_failed = true;

if (m_pop3ConData->command_temp_failed) return Error("pop3UidlFailure");

But I will defer to those who know the code better than I.

gene smith

Assignee

Updated

•

4 years ago

Comment 31

•

4 years ago

https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=70d0431e8250f3702ca3acc6673066492c15f6f0&selectedTaskRun=Jz4XMZHvQtmNp3h8m1EevQ.0

This is a link to a tb "try" build that incorporates the patch provided by David Smith. It looks reasonable to me but I don't work on tb's pop protocol much since most reported problems that I deal with are imap.

I don't have a yahoo based pop account set up so I haven't been able to duplicate the problem. It works with my ISP's pop account that supports UIDL so I don't think it breaks anything.

I did the "try" build for linux64 and win64 so if you need something else like osx or win32 let me know and I'll add another build.

The try build patches the current "daily" build of tb so it is "latest and greatest" but usually reasonable stable. On the paged linked above, click the green "B" next to windows or linux and down below then click on "Artifacts". In the long list you will find the installation file for windows64 called "target_installer.exe". For linux you can just find and unpack the target.tar.bz2 and run "thunderbird" from there.

Sorry for the delay in addressing this. I just recently became aware of the issue. Let me know how any testing goes.
Thanks!

Barry S. Finkel

Comment 32

•

4 years ago

There is a basic problem in testing this. I have no idea when this occurs until I see the message and no mail is downloaded. If I were to install this test build, and the problem did not recur, I would have no way of knowing if there happened to be a temporary error that the patch successfully bypassed.

There is a problem with the debug smtp.l;og file - see https://support.mozilla.org/en-US/questions/1319643#answer-1381375 .
That file begins with non-printable characters, so I cannot use the Unix (GOW) utility "grep" to search that log file to see if there is a line for a temporary error. Therefore, I have no idea if the fix worked. If the smtp.log file did not have the initial 500 Mb first line, then I could use grep periodically to look for temporary errors.

I just remembered the "find" command. I tried it find /I "-err" pop3.log.moz_log and it was able to process my existing log file, even though the first line has 550 Mb characters.

One question about the patch - did you use David's entire patch, or the three pieces that I believe are the only necessary changes?

gene smith

Assignee

Comment 33

•

4 years ago

•

Edited

I just used the whole thing with a couple minor tweaks. It mostly seemed right. Maybe could be "optimized", not sure.

I could add a MOZ_LOG that prints to the log file when the error is detected. Would that help?

I recorded a POP3:5 log and don't see any non-ascii or binary stuff. Is it on the first line? I'm running on linux. Maybe this is a windows only problem?
P/S: I recorded a log on windows but only have an imap account. Didn't see any bad chars or lines with 500Mb or whatever. Can you explain or may attach a log to this bug with that problem?

gene smith

Assignee

Comment 34

•

4 years ago

I recorded a log on windows but only have an imap account (IMAP:5). Didn't see any bad chars or lines with 500Mb or whatever. Can you explain or maybe attach a (partial) log to this bug with that problem? I couldn't tell exactly what the problem is from the support site link.

Also, you should still see the response code -ERR[SYS/TEMP] in the log when it occur, so I don't think it needs an new MOZ_LOG() call to detect the issue in the log.

Barry S. Finkel

Comment 35

•

4 years ago

Gene, as you wrote, I do not need anything else in the pop3.log file, as I can use "find" to locate any "-ERR" lines.

I am not sure that I can attach a log file to this record. If you look at my posting on the forum, you will see that the first line of the log file has around 550 M characters:

C:\Users\BSF\Desktop>head -1 pop3.log.moz_log | wc
1 15 88336449

C:\Users\BSF\Desktop>head -2 pop3.log.moz_log | wc
2 26 88336553

C:\Users\BSF\Desktop>dir pop3.log.moz_log
Volume in drive C is Win10-Boot
Volume Serial Number is F8E6-91B2

Directory of C:\Users\BSF\Desktop

01/06/2021 09:49 AM 130,774,133 pop3.log.moz_log
1 File(s) 130,774,133 bytes
0 Dir(s) 108,932,567,040 bytes free

C:\Users\BSF\Desktop>

Note that the first line in my current log file has (obviously) one line, 15 words (separated by white space), and over 88 million characters. The second line has (26-15) = 11 words and 104 characters. Here is information on the first line in the file:

C:\Users\BSF\Desktop>head -1 pop3.log.moz_log > pop3.line1

C:\Users\BSF\Desktop>wc pop3.line1
1 15 88336449 pop3.line1

C:\Users\BSF\Desktop>od -c pop3.line1 | more
0000000 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
520763660 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 1 - 0 6
520763700 0 3 : 2 8 : 5 6 . 6 9 7 0 0 0
520763720 U T C - [ ( n u l l ) 2 4
520763740 2 9 2 : M a i n T h r e a d
520763760 ] : D / P O P 3 [ t h i s =
520764000 4 1 2 2 C 3 2 0 ] C l e a r i
520764020 n g s e r v e r b u s y i
520764040 n n s P o p 3 P r o t o c o l
520764060 : : O n S t o p R e q u e s t \r
520764100 \n
520764101

C:\Users\BSF\Desktop>

The offsets in the output of the od (Octal Dump) command are octal. Note that there are a HUGE number of null characters in my current log file before the first real log line. As I wrote in the other forum post, I did a search of the source, and there were too many lines containing "smtp" for me to determine where in the code these null characters are being written.

gene smith

Assignee

Comment 36

•

4 years ago

Come to think of it, I may have seen a moz log with a weird first line. Lots of repeating zeros maybe. But I don't remember what I was doing when I saw it. I don't think I saw a reason. Maybe Wayne has seen this; he knows all the bugs. :)

Barry, can you tell me exactly how you are setting the env vars and running tb to produce a log. If you always see the weird chars on the first line, maybe I can duplicate it by doing exactly what you do.

Anyhow, have you tried to run the "try" build?

Flags: needinfo?(vseerror)

Barry S. Finkel

Comment 37

•

4 years ago

I have two environmental variables:

 MOZ_LOG            POP3:5,timestamp
 MOZ_LOG_FILE   c:\Users\...\Desktop\pop3.log

At one time I also had in MOZ_LOG:

 POP3:5,SMTP:5,timestamp

to log SMTP traffic. I needed that to debug a content-filter problem on the outbound.att.net mail server and also to debug a problem with my att.net password. I have these two global environmental variables set, and Thunderbird sees and uses them when it starts.

I have not yet had time to install the "try" build. As for the NULL characters - my logs show that on 03/10/2020 (Windows 7) I did not have to remove NULL characters, but on 03/17/2020 I had to do so. My update log shows 03/16/2020 09:41 Mozilla Thunderbird 68.6.0 (replaces 68.5.0) . So, it appears that the NULLS were introduced in 68.6.0. In Win 7 I was running 16-bit TB as I am in Win 10.

Barry S. Finkel

Comment 38

•

4 years ago

Gene, I looked at the link you provided above, and I do not see an executable that I can download. But I am not really familiar with the Bugzilla repository.

gene smith

Assignee

Comment 39

•

4 years ago

(In reply to Barry S. Finkel from comment #38)

Gene, I looked at the link you provided above, and I do not see an executable that I can download. But I am not really familiar with the Bugzilla repository.

Here's the direct link to the win64 installer:
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Jz4XMZHvQtmNp3h8m1EevQ/runs/0/artifacts/public/build/install/sea/target.installer.exe

You should be able to get to this yourself with the instructions I wrote above:

On the paged linked above, click the green "B" next to windows or linux and down below then click on "Artifacts". In the long list you will find the installation file for windows64 called "target_installer.exe". For linux you can just find and unpack the target.tar.bz2 and run "thunderbird" from there.

Here's the link again to the full "try" build:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=70d0431e8250f3702ca3acc6673066492c15f6f0&selectedTaskRun=Jz4XMZHvQtmNp3h8m1EevQ.0

Anyhow, no problem. Maybe you didn't see this or it's not clear. Let me know how it goes.

Barry S. Finkel

Comment 40

•

4 years ago

Gene, about the null characters in the pop3.log file -- I restarted Thunderbird today at 11:30 today. I ran some commands that showed that at 16:36 the log file was empty; this surprised me. Then at 21:40 I checked, and the log file was 40.8 Mb in size with a large of NULL characters at the beginning.

======
D:\bsftools> head -1 c:\users\BSF\desktop\pop3.log.moz_log | od
0000000

D:\bsftools>dir c:\users\BSF\desktop\pop3.log.moz_log
Volume in drive C is Win10-Boot
Volume Serial Number is F8E6-91B2

Directory of c:\users\BSF\desktop

01/11/2021 04:36 PM 0 pop3.log.moz_log
1 File(s) 0 bytes
0 Dir(s) 107,214,700,544 bytes free

D:\bsftools>
...
D:\bsftools>dir c:\users\BSF\desktop\pop3.log.moz_log
Volume in drive C is Win10-Boot
Volume Serial Number is F8E6-91B2

Directory of c:\users\BSF\desktop

01/11/2021 09:40 PM 40,867,751 pop3.log.moz_log
1 File(s) 40,867,751 bytes
0 Dir(s) 106,812,301,312 bytes free

D:\bsftools> head -1 c:\users\BSF\desktop\pop3.log.moz_log | od -c
0000000 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
231026340 \0 t e : 1 5 \r \n
231026351

D:\bsftools>

Is there a document that tells what patches were included in the upgrade from 68.5.0 to 68.6.0?

gene smith

Assignee

Comment 41

•

4 years ago

Is there a document that tells what patches were included in the upgrade from 68.5.0 to 68.6.0?

AFAIK, there is only this:
https://www.thunderbird.net/en-US/thunderbird/68.6.0/releasenotes/
but it doesn't detail the exact bugs or patches that go into 68.6.0. In the past, whoever published the release notes would link to a bugzilla search that show all the changes that impact the subject release. But that seems to have been dropped. You might be able to do it yourself if you can figure out the bugzilla search and if you really need the details. If you need help with this you could request a NI (Need Info) from Wayne using his email address from comment 7 in the "Request information from" box below. (Wayne's the expert at searching bugzilla!)

I haven't had a chance to try recording a pop log again in windows. I don't think it matters but I'm still not 100% sure how you are setting your env:

MOZ_LOG POP3:5,timestamp
MOZ_LOG_FILE c:\Users...\Desktop\pop3.log

Are you doing this on a command line or are you doing it from the the "System Properties" -> "Environment Variables" gui?

One other thing you might try to eliminate the leading nulls is to add a parameter "sync" to the MOZ_LOG:

MOZ_LOG POP3:5,timestamp,sync

This is just a shot in the dark...

Wayne Mery (:wsmwk)

Comment 42

•

4 years ago

(In reply to gene smith from comment #36)

Come to think of it, I may have seen a moz log with a weird first line. Lots of repeating zeros maybe. But I don't remember what I was doing when I saw it. I don't think I saw a reason. Maybe Wayne has seen this; he knows all the bugs. :)

not this time :(

(In reply to gene smith from comment #41)

Is there a document that tells what patches were included in the upgrade from 68.5.0 to 68.6.0?

AFAIK, there is only this:
https://www.thunderbird.net/en-US/thunderbird/68.6.0/releasenotes/
but it doesn't detail the exact bugs or patches that go into 68.6.0. In the past, whoever published the release notes would link to a bugzilla search that show all the changes that impact the subject release. But that seems to have been dropped. You might be able to do it yourself if you can figure out the bugzilla search and if you really need the details

Easier to use hg - https://hg.mozilla.org/releases/comm-esr68/pushloghtml?fromchange=THUNDERBIRD_68_5_0_RELEASE&tochange=THUNDERBIRD_68_6_0_RELEASE&full=1

Flags: needinfo?(vseerror)

Barry S. Finkel

Comment 43

•

4 years ago

"System Properties" -> "Environment Variables" gui
I will add "sync".

The above was not submitted last night due to a "collision"., but I do not see any updates that were entered after I typed my reply; so maybe whoever entered an update while I was updating backed-off that update, as I did a few minutes ago.

I just upgraded from 78.5.0 to 78.6.1 (I had not yet installed the "try" build). I checked the pop3.log.moz_log file, and it had zero length. Then I clicked "Get Messages", and two messages were downloaded. The resulting log file had 1,239,292 bytes; the first line had lots of initial NULL characters:

C:\Users\BSF\Desktop>head -1 POP3.LOG.MOZ_LOG | od -c
0000000 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
2262200 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 s
2262220 = 2 8 D 1 3 C C 0 ] R E C V :
2262240 t : 0 p x ; f o n t - s i
2262260 z e : 0 p x ; o v e r f l o
2262300 w : h i d d e n ; m s o - h
2262320 i d e : a l l " / > < / b o d
2262340 y > < / h t m l > \r \r \n
2262354

C:\Users\BSF\Desktop>

Were the null characters written (but the log file not closed) before I clicked "Get Messages"? I can't see why writing the first line of the log file would cause NULLS to be pre-pended to the line begin written. I do not know if the number of NULL characters is related to the amount of time between the restart and the downloading of the first message.

gene smith

Assignee

Comment 44

•

4 years ago

•

Edited

I just set up a pop3 account 64 bit win10 and recorded a log as you describe in env gui and ran tb 78.6.1 (64-bit) from cmd. I still don't see a problem with the log.

I didn't run a long time but on closing tb the log was still OK. I received one message while running and it logged to the file.

Do you see the leading zero and stuff if you just run tb a short time?

Maybe you can just zip the file and send it directly to me via my email if less than 20M. Lot of repeating zeros in the file should be compressed out of the zip making it much smaller I would think. Not sure I can tell anything by looking at the file but I'll try.
If you can see the problem while running tb a short time, that may also make the file smaller.
P/S: You can also leave off timestamp temporarily to make it smaller.
PP/S: Any chance the "unix tools" you are running somehow "corrupting" the file. I.e., if you don't run them do you see the zeros?

Barry S. Finkel

Comment 45

•

4 years ago

Here are some commands I just ran:

=====
C:\Users\BSF\Desktop>dir pop3*
Volume in drive C is Win10-Boot
Volume Serial Number is F8E6-91B2

Directory of C:\Users\BSF\Desktop

01/06/2021 10:05 AM 88,336,449 pop3.line1
01/06/2021 10:08 AM 88,336,449 pop3.line1a
07/09/2020 08:42 AM 0 pop3.log .moz_log
01/12/2021 10:12 AM 0 pop3.log.child-1.moz_log
01/12/2021 10:12 AM 0 pop3.log.child-10.moz_log
01/12/2021 11:32 AM 0 pop3.log.child-11.moz_log
01/12/2021 11:32 AM 0 pop3.log.child-12.moz_log
01/12/2021 01:08 PM 0 pop3.log.child-13.moz_log
01/12/2021 01:08 PM 0 pop3.log.child-14.moz_log
12/30/2020 09:02 PM 0 pop3.log.child-15.moz_log
12/27/2020 03:06 PM 0 pop3.log.child-16.moz_log
12/27/2020 03:06 PM 0 pop3.log.child-17.moz_log
01/12/2021 10:12 AM 0 pop3.log.child-2.moz_log
01/12/2021 10:12 AM 0 pop3.log.child-3.moz_log
01/12/2021 10:12 AM 0 pop3.log.child-4.moz_log
01/12/2021 10:12 AM 0 pop3.log.child-5.moz_log
01/12/2021 10:12 AM 0 pop3.log.child-6.moz_log
01/12/2021 10:12 AM 0 pop3.log.child-7.moz_log
01/12/2021 10:12 AM 0 pop3.log.child-8.moz_log
01/12/2021 10:12 AM 0 pop3.log.child-9.moz_log
01/14/2021 01:49 PM 0 pop3.log.moz_log
01/04/2021 03:36 PM <DIR> pop3logs
21 File(s) 176,672,898 bytes
1 Dir(s) 106,492,293,120 bytes free

C:\Users\BSF\Desktop>finderr

C:\Users\BSF\Desktop>rem batch scriupt to find "-ERR" in the Thunderbird log file

C:\Users\BSF\Desktop>rem 08Jan21 0941AM Barry Finkel

C:\Users\BSF\Desktop>find "-ERR" c:\users\BarryFinkel\desktop\pop3.log.moz_log

---------- C:\USERS\BSF\DESKTOP\POP3.LOG.MOZ_LOG

C:\Users\BSF\Desktop>wc pop3.log.moz_log
4465 58569 72905937 pop3.log.moz_log

C:\Users\BSF\Desktop>head -1 pop3.log.moz_log | wc
1 4 72414085

C:\Users\BSF\Desktop>

Note that the "dir" command shows that the log file has ZERO length. My "finderr" command took about 30 seconds to complete, so there is something in the log file. The Unix "wc" (word count) command shows 4,465 lines 58,569 "words" and 72,905,937 characters. The first line has 72,414,085 characters. It appears that the log file is not "officially" closed, so it appears to have no length.

I did the following: cat pop.log.moz_log > a.b [I use a.b as a temporary filename.]
Zip the a.b file into a.b.zip .

The resulting ZIP file is 608 Kb. I have just sent an e-mail. I hope your ISP allows zip files.

I am using the Unix utilities to read the file or extract certain lines from the file. I cannot believe that these utilities are corrupting the file.

gene smith

Assignee

Comment 46

•

4 years ago

•

Edited

Received the zip file ok. Thanks. On unzip, the first line is all 0's and is 99% of the file it seems. No idea where that is coming from. I can delete the first line and the file is a normal size and contains expected content.

Also, I've seen the files with "Child" in the name and have no idea what that is about. Kind of looks like backups or something. They all seem to be zero length too, per you "dir" output. I've always just ignored them.

Don't mean to "point fingers" at the unix tools but don't know what else, other than defective tb, could cause it. Maybe you've done this but for grins, try this:

Shutdown tb and all associated windows.
Run tb and record the log. Don't run any other utilities.
Click "get messages" to ensure some activity occurs.
Shutdown tb
Open log file with notepad.
Is the file OK or does it have the bad 0s?

I know the logfile may not be completely written until you completely shutdown tb. At least on linux, ls still shows non-zero length for the log file even with tb running. Not sure how windows dir works in this respect. (The "sync" logging option I mentioned is supposed to force a write to the log file more often. However, I've never seen it actually make a difference.)

Barry S. Finkel

Comment 47

•

4 years ago

The first line, I believe, is NOT all NULL characters. The end of that line contains the first real logged record.
C:\Users\BSF\Desktop>head -1 a.b | od -c
0000000 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0
*
424171540 [ t h i s = 1 A 9 2 8 6 6 0 ]
424171560 E n t e r i n g s t a t e :
424171600 2 3 \r \n
424171605

C:\Users\BSF\Desktop>

I will try your suggestions. The Unix utilities I am using are only reading and formatting the file; the utilities are not writing the file.

gene smith

Assignee

Comment 48

•

4 years ago

The first line, I believe, is NOT all NULL characters. The end of that line contains the first real logged record

Yes, I noticed that too. I guess it is 99.99% zeros. :)
I also noticed that the leading data for the first record (timestamp, etc) is not there at the end of the first line.

Barry S. Finkel

Comment 49

•

4 years ago

Thunderbird hung this evening; I have no idea why. I cancelled it via the Task Manager. I restarted it, and it downloaded some mail (as it had been in a "Not Responding" state for about 20 minutes). I immediately did

 copy pop3.log.moz_log aa.bb

and the aa.bb file does NOT have the initial NULL characters. The first line starts with a UTC timestamp. This is different from the a.b file I captured this afternoon. I have no idea why.

Barry S. Finkel

Comment 50

•

4 years ago

I just did some research on my existing log file. Process Explorer says that Thunderbird was started at 09:28 PM yesterday (01/14). This is the time I cancelled Thunderbird and restarted it. Last night the log file had no initial NULL characters. Now (01/15 4:13PM) the log file has many initial NULL characters. This is very confusing, as I am not sure how one pre-pends characters to an existing file. The "dir" command shows that the file has 71 Mb of characters, while last night, just after the restart, dir showed a zero file length. I do not know when the dir command showed a non-zero length, This is very confusing.

gene smith

Assignee

Comment 51

•

4 years ago

Is there any way to "watch" a file while windows has it open. For exmaple in linux you can do "tail -f logfile" and it will display the most recent last few lines of the file as they change. Maybe the unix tools have a "tail" tool. However not sure this will show anything. But if dir always shows zero length until you shut down tb and close the file, I doubt if a "tail" function would show anything either.

Another thing that occurred to me is maybe this is somehow an anti-virus artifact affecting the file. (Sorry, more finger pointing.)

Regarding the "prepending" of zeros, I guess maybe an external tool of some sort could replace the first line with a bunch of zero followed by the original 1st line. But I don't think the code in mozilla/tb would do that, but since you see it maybe it (or something) is...

Finally, I don't see in you comments if you ever did the simple short-term test I suggested in comment 46. If tb is writing the leading zero on tb shutdown, they should appear in the file after only running tb a few minutes.

Barry S. Finkel

Comment 52

•

4 years ago

The GOW (Gnu on Windows) Unix utilities are the same ones that are in Gnu Unix. The "tail -f" command will work, but it only displays lines added at the END of a file. So it will not help in finding characters added to the beginning of a file.

I wrote a simple batch script to look at the pop3.log file every minute.

C:\Users\BSF\Desktop>cat poplog.bat
rem Batch script to look for NULL characters in Thunderbird pop3.log file
rem 16Jan21 0619PM Barry Finkel bsfinkel@att.net

call cddesk
:loop
ECHO:| time
dir pop3.log.moz_log
head -1 pop3.log.moz_log | od -c
sleep 60s
goto :loop

C:\Users\BSF\Desktop>

I started the script at 18:29. At 18:56 I did an Exit in Thunderbird, and at 18:58 it was still running, so I did another Exit. At 19:00 I restarted Thunderbird, and at 19:10 (after my user-specified 10 minute wait interval), two new mail messages were downloaded. As I write this part at 19:48, the first line still is intact without NULL characters. I will wait to see when the NULL characters appear, and then wait to see if more NULLS are added later.

At 19:54 the log file has zero length; I am not sure why. I am doing nothing with Thunderbird except reading and deleting mail.

At 20:00 more mail arrived, and the log file has MANY initial NULL characters (octal 14065667).

At 20:04 the log file has reverted to zero-length.

At 20:30 the file has many NULL characters and some text. There were no messsages downloaded at 20:30. Here are the first of the 4,248 records in the log (with the timestamp info removed):

 D/POP3 [this=372B3E20] LoadUrl()
 D/POP3 [this=372B3E20] Initialize()
 I/POP3 [this=372B3E20] Connecting to server pop.att.yahoo.com:995
 D/POP3 [this=372B3E20] Setting server busy in nsPop3Protocol::LoadUrl()
 I/POP3 [this=372B3E20] Entering NET_ProcessPop3 25
 I/POP3 [this=372B3E20] Entering state: 1
 I/POP3 [this=372B3E20] Entering state: 2
 I/POP3 [this=372B3E20] Entering state: 4
 I/POP3 [this=372B3E20] RECV: +OK Hello from jpop-0.1
 I/POP3 [this=372B3E20] Entering state: 28
 D/POP3 [this=372B3E20] SendCapa()
 I/POP3 [this=372B3E20] SEND: CAPA
 I/POP3 [this=372B3E20] Entering NET_ProcessPop3 126
 I/POP3 [this=372B3E20] Entering state: 3
 I/POP3 [this=372B3E20] RECV: +OK CAPA list follows
 I/POP3 [this=372B3E20] Entering state: 29
 I/POP3 [this=372B3E20] RECV: IMPLEMENTATION jpop-0.1
 D/POP3 [this=372B3E20] Capability entry processed
 I/POP3 [this=372B3E20] Entering state: 29
 I/POP3 [this=372B3E20] RECV: EXPIRE-NEVER
 D/POP3 [this=372B3E20] Capability entry processed
 I/POP3 [this=372B3E20] Entering state: 29

At 20:42 the log file again reverted to zero length. No mail was downloaded at 20:40.

At 20:51 the log file now has initial NULLs and 4,196 records.

It is obvious that Thunderbird is not pre-pending NULLS to an existing file (and I have no idea how it could). Thunderbird is clearing an existing file (one that contains no NULLs) and writing a new file (that contains initial NULLs). I need to figure out what else I need to add to the script to get more information. I will send the script output file via e-mail.

gene smith

Assignee

Comment 53

•

4 years ago

•

Edited

Yes, I can see it here too on windows:

PS C:\Users\gds> head -1 tb.log.moz_log | od -c
0000000  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0142340  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0   2   0   2
0142360   1   -   0   1   -   1   7       0   5   :   3   8   :   0   5
0142400   .   1   1   7   0   0   0       U   T   C       -       [   (
0142420   n   u   l   l   )       1   5   6   1   6   :       M   a   i
0142440   n       T   h   r   e   a   d   ]   :       D   /   P   O   P
0142460   3       [   t   h   i   s   =   0   0   0   0   0   2   0   C
0142500   2   7   7   0   4   0   0   0   ]       L   o   a   d   U   r
0142520   l   (   )  \r  \n
0142525
PS C:\Users\gds>

I installed the GOW utilities and had the log file open in vim (from the GOW utils) and it said the file changed. I then saw in vim and with ls -l that the the file was zero length. I then sent a message to the pop account and again vim said the file changed. This time the zeros were at the top according to the above dump.
I was setting up your script to run when I saw this but I haven't yet run it to see the behavior over time. But I think this pretty much verifies your observations and this seems like another bug. (I've mostly forgotten what the actual subject of this bug is :)).

gene smith

Assignee

Comment 54

•

4 years ago

From comment 32:

There is a problem with the debug smtp.l;og file - see https://support.mozilla.org/en-US/questions/1319643#answer-1381375 .
That file begins with non-printable characters, so I cannot use the Unix (GOW) utility "grep" to search that log file to see if there is a line for a temporary error. Therefore, I have no idea if the fix worked. If the smtp.log file did not have the initial 500 Mb first line, then I could use grep periodically to look for temporary errors.

It looks like there is a grep option -a that allow it to ignore the "binary" data and just look for text, e.g.,

grep -a LIST tb.log.moz_log

This worked for me with the GOW version on a log file that grep claimed, without the -a, was binary. (You may already know that you can ignore case too with grep option -i so searching with option -ia might be appropriate.)

P/S: the GOW tools work pretty well with no hassle. I've used cygwin in the past for similar purposes, but it's been a while. Thanks for pointing this out!

Barry S. Finkel

Comment 55

•

4 years ago

Thanks for the "-a" parameter tip; I do not know/remember all of the grep parameters, and I never had to use grep on a binary file..
I revised my script:

C:\Users\BSF\Desktop>cat poplog1.bat
rem Batch script to look for NULL characters in Thunderbird pop3.log file
rem 16Jan21 0619PM Barry Finkel bsfinkel@att.net
rem 16Jan21 0902PM Added wc -l and tail -1

call cddesk
:loop
ECHO:| time
dir pop3.log.moz_log
head -1 pop3.log.moz_log | od -c
wc -l pop3.log.moz_log
tail -1 pop3.log.moz_log
sleep 60s
goto :loop

C:\Users\BSF\Desktop>

and I let it run from 21:06 until the present (09:29); I tried to cancel it, but it will not cancel.
I did "copy poplog1.out.txt poplog1.out1.txt", so I have a file that is not being updated. In that file:

21:00 log file length = 6,607,069 with pre-pend NULLs
21:11 log file length went to 0
21:20 log file length = 8,048,615 with pre-pend NULLs
09:21 The log file overnight grew; it never went back to 0 length. And, of course, the number of NULLs did not change.

I will send the log file via e-mail. The script is still running, so I will be able to tell if the log file length goes back to zero.
Note that in Chicago I am UTC-0600. So, the basic questions are:

Why is the log file sometimes written without leading NULL characters in the first line, and sometimes with lots of NULLs?
Why does the log file sometimes clear to zero length?

I do not know at the present time how I can modify my script to get answers to these two questions. I do not know if capturing the log before it resets will tell us anything; it will tell us what Thunderbird did, but it won't tell us what caused the log to reset. And a new zero-length log will tell us nothing about what happened.

FYI, the log problem is a secondary problem/bug; this but report concerns Thunderbird not parsing a temporary POP3 error on the server, and thus not downloading any mail until Thunderbird is restarted. I have not installed the test build yet, because
a) I upgraded to 78.6.1 last week, and b) I have been busy with this log problem to see if the test build fixes my other problem. The other problem occurs randomly, so will have to be continually looking at the log to see if I see a temporary error logged.

Barry S. Finkel

Comment 56

•

4 years ago

Gene, my script is still running, as my attempt to cancel it this morning did not succeed, and I see no need to cancel it. It currently is 1.4 Mb with 36,062 lines.

C:\Users\BSF\Desktop>grep " 0 p" poplog1.out.txt
01/16/2021 09:11 PM 0 pop3.log.moz_log
01/16/2021 09:11 PM 0 pop3.log.moz_log
01/16/2021 09:11 PM 0 pop3.log.moz_log
01/16/2021 09:11 PM 0 pop3.log.moz_log
01/16/2021 09:11 PM 0 pop3.log.moz_log
01/16/2021 09:11 PM 0 pop3.log.moz_log
01/16/2021 09:11 PM 0 pop3.log.moz_log
01/16/2021 09:11 PM 0 pop3.log.moz_log
01/16/2021 09:11 PM 0 pop3.log.moz_log
01/17/2021 12:26 PM 0 pop3.log.moz_log
01/17/2021 12:26 PM 0 pop3.log.moz_log
01/17/2021 12:26 PM 0 pop3.log.moz_log
01/17/2021 12:26 PM 0 pop3.log.moz_log
01/17/2021 02:14 PM 0 pop3.log.moz_log
01/17/2021 02:14 PM 0 pop3.log.moz_log
01/17/2021 02:14 PM 0 pop3.log.moz_log
01/17/2021 02:17 PM 0 pop3.log.moz_log
01/17/2021 02:17 PM 0 pop3.log.moz_log
01/17/2021 02:20 PM 0 pop3.log.moz_log

C:\Users\BSF\Desktop>

Note that the log was reset 21:11 yesterday, 12:26 today, and again at 14:14 today.
I would not mind if you created a new bug report for these log problems - NULLs and log reset. Maybe two bug reports?

gene smith

Assignee

Comment 57

•

4 years ago

Thanks for the "-a" parameter tip; I do not know/remember all of the grep parameters, and I never had to use grep on a binary file..

And thank you too since I was not familiar with "head" and "od" commands..

I ran the test again last night but just monitored the log periodically with your head/od command. I added IMAP:5 to the POP3:5 and it seemed for a while it was OK. Then I saw a the file go to just leading zeros and then it went to leading zero and OK stuff following as we've seen before. When I shutdown tb the file length with dir showed 0 and when I opened it with vim (from you GOW) it looked empty. Then I shutdown windows and quit for the night.
Unfortunately, I just started tb again and forgot that the log file would be overwritten since the env vars are still set. I was meaning to look at it again today but it's gone now. Oh, well.

Anyhow, we know something strange happens in windows. I've never seen this problem on linux so I need to check it there too.

I just received your comment 56 above that shows samples where the length goes to zero. I'm not sure if I asked this, but when you cleanly close down tb, is the log file still bad? Maybe when the buffers are "flushed out" it cleans up the file and gets rid of the leading zeros? That's what I meant to check from my file last night but I messed up... :(

gene smith

Assignee

Updated

•

4 years ago

Blocks: 1687377

Barry S. Finkel

Comment 58

•

4 years ago

Getting back to the original problem - I have not installed the test build for a number of reasons:

I installed newer releases of Firefox since the test version was created. I assume that I could install the back-level test version replacing a newer version, without any problems.
The UIDL problem has not recurred recently; the last occurrence was Dec 27. Without a recurrence, I would not know if the code is working. All I would be able to test is that the patch does not disrupt the downloading of mail from the server. To know if the code is working, I would have to look at the Mozilla log file periodically and see if there were any temporary errors logged. This is not possible, as anytime I open a URL in an e-mail message, the log file resets, and I loose all information that was in the log before it was reset.

Barry S. Finkel

Comment 59

•

4 years ago

A quick update to my previous update of two months ago - I have not worked on this because I was spending time on the MOZ_LOG bug 1687377. The same two items in my previous posting still apply. I am not sure that if I were to install the test version, which is a number of months old, that I would have problems with regression from my current 78.9.1 to a previous version on which the test version was built. And, since I have not had a recurrence with att.net since December 27, running with the test version (for about a day) will only tell me if the revised code breaks POP3 access to the server. It would not tell me if the revised code actually fixes the problem.

Should I install the text version for a day? Thanks.

gene smith

Assignee

Comment 60

•

4 years ago

Hi Barry, Was wondering when this original issue would be mentioned again. I wonder if maybe it has been fixed on the AOL/Verizon/att/Yahoo POP3 servers since haven't heard anything more about it and you say you haven't seen it since Dec.

Anyhow, if you still have it, probably no harm in trying it again. But if you don't have it downloaded locally, it's probably "expired" from the try server link and I'll have to rebuild it. Let me know.

I guess I've given up on anyone really caring about bug 1687377. I'll just suggest that the tb logging page mentions the issue and while logging don't click a link to open firefox.

Note to anyone reading this: The UIDL bug for this is mostly covered up to comment 33. Between comment 34 and 58 we mostly address the logging issue (TB log file gets reset to zero length when an http link is clicked to open firefox from an email) which has been moved into bug 1687377.

Barry S. Finkel

Comment 61

•

4 years ago

Gene, I did not download your test version, so it probably has expired. And I was unsure how to download it. If you re-build it, I will test it to ensure that it does not affect POP3 mail downloads. As for the logging - why can't FF and TB use their own log file? "Translate" MOZ_LOG_FILE internally to TB_MOZ_LOG_FILE or FF_MOZ_LOG_FILE. I have no idea how many TB or FF users are currently logging.

gene smith

Assignee

Comment 62

•

4 years ago

•

Edited

I just re-did the "try" build with the patch applied to yesterday's "daily" version here:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=66ca29ab1116f586339708b8c0916e519ddf2cc4

The "try" build incorporates the patch provided by David Smith. It looks reasonable to me but I don't work on tb's pop protocol much since most reported problems that I deal with are imap.

The try build patches the current "daily" build of tb so it is "latest and greatest" (as of yesterday) but usually reasonably stable. On the paged linked above, click the green "B" next to "Windows 2012 x64 opt" and down below then click on "Artifacts". In the long list you will find the installation file for windows64 called "target_installer.exe".

In case this is confusing, here's a direct link to "target_installer.exe":
https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/IpUU4WuPTnCPsDKSRd52fg/runs/0/artifacts/public/build/install/sea/target.installer.exe

As for the logging - why can't FF and TB use their own log file? "Translate" MOZ_LOG_FILE internally to TB_MOZ_LOG_FILE or FF_MOZ_LOG_FILE. I have no idea how many TB or FF users are currently logging.

I'm not sure I understand what you mean by this. The TB specific code has no knowledge of or access to these environment variables. It is all in the mozilla specific code that TB just links to and uses as a library. So any changes to "translate" the vars would require changes in mozilla code which there seems to be very little interest in.

Barry S. Finkel

Comment 63

•

4 years ago

I had a problem with the target_installer.exe . It started TB with a new profile, and I did not not want to "play" with my mail profile. So I renamed the profiles.ini file to profiles.ini.daily and restored my profiles.ini from a backup. I experimented with the profiles.ini file when I moved my TB mail files from Windows 7 to Windows 10, and I did not have access to my old profiles file to see what was there. When I have time, I will review the profiles.ini files and be ready to modify the daily-build file to use my default profile. Maybe later today. I am very cautious when working with my mail files. I did install 78.10.0 when TB came back up.

Barry S. Finkel

Comment 64

•

4 years ago

This morning I started your test daily Thunderbird. I will let it run a while and report any problems. But I doubt I will experience the mail server problem that started this bug report. One comment - I started with the "-P" parameter so that I could select the correct profile. The pop-up window gave me two choices - default and default-release. I had to go to the Appdata\Roaming\Thunderbird\Profiles directory to see the actual filenames: v73kxnlg.default-release and qf00ka5x.default. I know my TB profile via the 5x name. Maybe "-P" should give the full filenames?

gene smith

Assignee

Comment 65

•

4 years ago

Sorry for the long delay in responding to your comments. I'm not real sure how the -p parameter works or the difference between "default" and "default-release". I do know that you can rename a profile but the directory name where the profile lives doesn't change, which can be confusing. Also, good that you are using a different profile with the "try" build since you can run into an error when you run the release build again that forces you to make a new profile and the only way to avoid it is to run tb with env var MOZ_ALLOW_DOWNGRADE or with parameter --allow-downgrade (or maybe sometimes both, not sure).

Barry S. Finkel

Comment 66

•

4 years ago

Gene, I am using the SAME profile. There is no way I could read mail without the same folder setup. I have not noticed any problems, and, since I plan to install the April patches this weekend, I will have to reboot. At that time I will switch back to the "regular" TB. I assume that I will have to add the "downgrade" parameter to my tb.bat script.

C:\Users\BarryFinkel\AppData\Roaming\Thunderbird\Profiles>dir
Volume in drive C is Win10-Boot
Volume Serial Number is F8E6-91B2

Directory of C:\Users\BSF\AppData\Roaming\Thunderbird\Profiles

04/21/2021 10:12 AM <DIR> .
04/21/2021 10:12 AM <DIR> ..
04/25/2021 10:38 AM <DIR> ii4kvzw7.default-default
04/06/2020 09:21 PM <DIR> qaqkee8x.default-old
04/30/2021 09:18 AM <DIR> qf00ka5x.default
04/29/2021 10:43 PM <DIR> v73kxnlg.default-release
0 File(s) 0 bytes
6 Dir(s) 95,472,463,872 bytes free

C:\Users\BSF\AppData\Roaming\Thunderbird\Profiles>

gene smith

Assignee

Comment 67

•

4 years ago

Ok, guess you didn't say you were using a new profile; for some reason I just assumed it since you mentioned two different profile names.
Anyhow, if you see a message that you must create a new profile to run TB you will have to either create a new profile or use the env var or the parameter to use the existing profile. Just a heads-up; I'm not sure exactly when this kicks in.

Barry S. Finkel

Comment 68

•

4 years ago

I have seen no problems running the test build. I have not experienced the mail server error that prompted this bug report, so I cannot determine if the fix works. From looking at your code, it appears that the new code will resolve the problem. But since your code does more than I think is necessary, I do not know if your extra code will cause any problems. Right now, I see no reason why your test code cannot be included in a production release. I do not know if I have access to anyone else who has experienced this problem; if I can locate anyone, we should have them test the code, assuming that that person has experienced the server error recently.

gene smith

Assignee

Comment 69

•

4 years ago

Maybe David Smith who actually produced the patch would be interested in testing the try build? He's still on the CC list for this bug.

Barry S. Finkel

Comment 70

•

4 years ago

I was interested in finding persons who had experienced the problem with a server other than inbound.att. net; maybe persons had experienced recent problems with, say, verizon.net . I wrote to one person, who posted very early in this bug report; I have not gotten a reply. All of the other items I found in a Google search were much older (3+ years).

Barry S. Finkel

Comment 71

•

4 years ago

I have a problem with the test build. I am trying to send a short piece of mail with three attachments. TB will not send the mail; a pop-up window flashes on the screen, and I cannot read what it says. I did get into the state where TB told me that it does not have access to one attachment. I deleted it and re-attached it. Then TB complained about the second attachment. After deletion and re-attachment, TB complained about the third one. I have no idea what is happening.

Also, I changed tbdaily.bat to add SMTP logging, but I do not get any SMTP logging. Why?

D:\bsftools>type tbdaily.bat
rem batch script to set environmental variables and then start Thunderbird
rem 20Feb21 0728PM Barry Finkel
rem 21Feb21 0904AM Cloned from tb-append.bat; removed ",append"
rem 21Apr21 1011AM Cloned from tb.bat
rem 30Apr21 0916AM Added -P
rem 14May21 0234PM Added ",SMTP:5"

rem set MOZ_LOG=POP3:5,timestamp,sync,append
rem set MOZ_LOG=POP3:5,timestamp,sync
set MOZ_LOG=POP3:5,SMTP:5,timestamp,sync
set MOZ_LOG_FILE=C:\Users\BarryFinkel\Desktop\pop3.log
rem START C:"Program Files (x86)""Mozilla Thunderbird"\thunderbird.exe
START C:"Program Files""Thunderbird Daily"\thunderbird.exe -P

D:\bsftools>

gene smith

Assignee

Comment 72

•

4 years ago

If the notification message is too quick to read you can usually see it by one of these ways:

Look under Tools menu items and select "Activity Manager'. It should list it.
Check window's log of notification messages. Maybe it has one, not sure (Linux/KDE does).

Your MOZ_LOG variable looks right. You only see SMTP log entries when a messages is sent by you. So if you send a simple messages does it show some SMTP log items?

If you can't determine what the problem is, can you create a sample emails and attach it that has the attachment problem?

Barry S. Finkel

Comment 73

•

4 years ago

Gene, I did not see the info in the Activity Manager. But yesterday's records do not have a timestamp, and my activity might have been already removed from that log. And I found nothing in the Event Viewer. Friday afternoon, I took the mail that had three attachment (after I had sent it successfully via the web interface to my att.net mail), and I re-addressed it to myself. The mail was successfully delivered. I just send the same e-mail to my alternate ACM mailbox, and the mail was successfully sent without any error messages.

I just looked at the pop3.log.moz_log file, and the only lines with SMTP are in lines with "I/POP3", so the test mail I just sent is NOT being logged.

Barry S. Finkel

Comment 74

•

4 years ago

I tried an experiment this morning. I modified tbdaily.bat to interchange SMTP and POP3. I still get POP3 logged but not SMTP.

Is your patch going to a production release soon? I have tested it all that I can, and I have had no POP3 problems. But, I have had no server temporary failure to test if the code really recovers from the temporary failure.

gene smith

Assignee

Comment 75

•

4 years ago

Barry, again, you are right in what you see. There is no logging with default daily build as you are using with the "try" build. After doing a lot of testing trying to get SMTP logging to work I poked around in bugzilla and found that apparently the SMTP protocol is being ported from c++ to javascript here: Bug 1677683. I have no idea why this is a good idea but it does seem to be what is happening.

Possibly the problem you saw with sending is due to this change. And it seems SMTP logging is not present in the javascript port.

But at least the original c++ is still present but it is disabled by default. When I set the config editor preference mailnews.smtp.jsmodule to false and restart TB I see SMTP logging occur. You probably should set this so your sending code will be more stable. If you can duplicate the sending problem you had with the pref true and then verify that it goes away with mailnews.smtp.jsmodule false it would be helpful to file a new bug report regarding that.

I'll check some more about the lack of SMTP logging and probably file a new bug regarding that unless there are already plans to add it to the JS port.

Is your patch going to a production release soon?

No, I haven't submitted it as formal patch. I really need to look at it closer and see again what it is doing. (If it's only working around a bug in yahoo and friends POP server it might be hard sell to have it incorporated in the code base.)

Barry S. Finkel

Comment 76

•

4 years ago

I will test the "mailnews.smtp.jsmodule" setting later this week. I believe that the patch NEEDS to be submitted as a formal patch. It fixes a long-standing bug that I have experienced with the inbound.att.net server. Others have reported it with a POP3 server at Verizon.net. I do not remember the other ISPs for which this has been reported. Your patch does not cause any problems, as far as I can tell. But, as I have written before, I think it does more than it needs to do. It does not have to do any6thing w2ith any UIDL flag. Basically, all it has to do is this - parse the error message, and if the error is temporary, stop the current message download attempt. That is all. At the next user-supplied interval, TB will again connect to the server and try to download mail.

gene smith

Assignee

Comment 77

•

4 years ago

Magnus, Re: comment 75. It appears that the SMTP protocol has been ported to javascript. Is there a way to do SMTP logging like we could do before? There is a pref called mailnews.smtp.loglevel which defaults to Warn but setting it to Debug or Verbose or 5 doesn't cause an SMTP log to be recorded with env. vars MOZ_LOG=SMTP:5 and MOZ_LOG_FILE=~/tblog set in the usual way. Is this a missing feature or am I doing something wrong?
Note: if I set mailnews.smtp.jsmodule false SMTP logging occurs as expected since this enables to old c++ protocol code.
Also, in SmtpClient.jsm it appears there are calls to logging using, e.g., this.logger.debug("String to log") so there appears to be an attempt to support SMTP logging.

gene smith

Assignee

Comment 78

•

4 years ago

(In reply to Barry S. Finkel from comment #76)

I will test the "mailnews.smtp.jsmodule" setting later this week. I believe that the patch NEEDS to be submitted as a formal patch. It fixes a long-standing bug that I have experienced with the inbound.att.net server. Others have reported it with a POP3 server at Verizon.net. I do not remember the other ISPs for which this has been reported. Your patch does not cause any problems, as far as I can tell. But, as I have written before, I think it does more than it needs to do. It does not have to do any6thing w2ith any UIDL flag. Basically, all it has to do is this - parse the error message, and if the error is temporary, stop the current message download attempt. That is all. At the next user-supplied interval, TB will again connect to the server and try to download mail.

Ok, thanks for the info on this. I may be wrong, but I think that att, verizon and possibly aol all use the yahoo servers. So if yahoo fixed the bug it may no longer be a problem. But, I see your point that fixing this would tolerate this bug if it ever comes back (assuming it is really fixed and is common to yahoo servers) or happens with another server.
This may also depend somewhat on if there are plans to possibly convert the POP3 code over to javascript. I guess Magnus would know the answer to this.

gene smith

Assignee

Comment 79

•

4 years ago

Attached patch pop3-sys-temp-fix.diff (obsolete) — Details — Splinter Review

Barry, Here's a new "try" build that, I think, does what you suggest. It just catches the [sys/temp] error and puts up a message that a "temporary" server error has occurred and allows a normal retry (either click on get messages or wait for timed "biff" check) to occur.

Since I don't have a POP3 server that has this problem, to test this I have included in the build a "fake" response that will occur on the first check for new mail. So you will always see the message due to the "fake" response one time. But now with the fix, it won't cause new messages to not be retrieved like before.

The try build is here:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=429b1d9d5b6b93afe70f690688a84a7319c7f583

The target.installer.exe is here: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/RpCDzcWvRni9yJkI9-a71w/runs/0/artifacts/public/build/install/sea/target.installer.exe

I've also attached the changes. I left in some things that helped me to understand the resulting POP3 log. I added a suffix number x to the "RECVx:" occurrences so I can tell where in the code each occurs. Also, there is code added to do the one-shot "fake" [sys/temp] response and some changes to make the compiler happy and avoid crashing regarding freeing memory used by the "fake" line. This shouldn't affect the running of the code and I'll clean it up once we agree that it is OK.

I went back to your old forum post and verified that my "fake" response resulted in the log entries you collected probably over a year ago: https://support.mozilla.org/en-US/questions/1268468?page=3#answer-1294494.

Assignee: nobody → gds

Attachment #9177490 - Attachment is obsolete: true

Attachment #9177533 - Attachment is obsolete: true

Status: UNCONFIRMED → ASSIGNED

Ever confirmed: true

gene smith

Assignee

Comment 80

•

4 years ago

Magnus, please see comment 77. I intended to set NI to to you when I wrote it but apparently I didn't.

Flags: needinfo?(mkmelin+mozilla)

Magnus Melin [:mkmelin]

Comment 81

•

4 years ago

The js logging doesn't use the NSPR logging mechanism. The logs go to the error console (and stdout). E.g. setting the mailnews.smtp.loglevel pref to "All" will have all the SMTP activities log to those locations.

Flags: needinfo?(mkmelin+mozilla)

gene smith

Assignee

Comment 82

•

4 years ago

•

Edited

Ok, thanks Magnus. At some point I guess https://wiki.mozilla.org/MailNews:Logging needs to mention this about SMTP.

Barry, FWIW here's a way to combine your POP3 log and the new SMTP log into one file I call tblog.moz_log in one command line:

MOZ_LOG=POP3:5,timestamp,sync  MOZ_LOG_FILE=  <path to thunderbird exe>/thunderbird  
--allow-downgrade -p 2>&1 | grep -E '\/POP3 |mailnews.smtp' > ~/tblog.moz_log

I suppose you would have to port this to windows commands if you are interested.

You can also filter on "mailnews.smtp" in the error console and the right click and "Export visible messages to" a file or clipboard to save just the SMTP log items.

It also appears that there is no timestamp on the new SMTP log entries. Probably not a big deal with SMTP but maybe with the other protocols it is if they are ported to JS.

Edit: I like to see everything in the terminal mostly to see any printf outputs I may have added. But the command above filters everything to the tblog.moz_log file and nothing appears in the console. To still see the full terminal output from TB I added a "tee" to the pipeline, i.e.,

MOZ_LOG=POP3:5,timestamp,sync  MOZ_LOG_FILE=  <path to thunderbird exe>/thunderbird  
--allow-downgrade -p 2>&1 |tee /dev/tty | grep -E '\/POP3 |mailnews.smtp' > ~/tblog.moz_log

as per this article: https://unix.stackexchange.com/questions/71492/output-to-stdout-and-at-the-same-time-grep-into-a-file

Barry S. Finkel

Comment 83

•

4 years ago

Gene, I was able to change mailnews.smtp.jsmodule from True to False, and I get SMTP logging in the same log as POP3.

Also, before I install your new test build, I want to see your source changes. You wrote that you had attached them, but I do not see them.
Thanks.

gene smith

Assignee

Comment 84

•

4 years ago

•

Edited

It's under that "Attachments" section near the top as "pop3-sys-temp-fix.diff". Maybe you need to click the twisty widget > to see it?
Here's a direct link to the changes:
attachment 9222332 [details] [diff] [review]

FYI, you can also see the changes on the try server link
https://hg.mozilla.org/try-comm-central/rev/429b1d9d5b6b93afe70f690688a84a7319c7f583

Edit:

Gene, I was able to change mailnews.smtp.jsmodule from True to False, and I get SMTP logging in the same log as POP3.

Yes, that would be reverting to using the deprecated C++ smtp code so logging would work in the expected way.

gene smith

Assignee

Comment 85

•

4 years ago

Attached image sys-temp-message.png (obsolete) — Details

Barry, Just a heads-up that you will see the attached message on the first attempt to fetch POP3 email with the new "try" build. Here's the reason quoted from my comment above:

Since I don't have a POP3 server that has this problem, to test this I have included in the build a "fake" response that will occur on the first check for new mail. So you will always see the message due to the "fake" response one time. But now with the fix, it won't cause new messages to not be retrieved like before.

Barry S. Finkel

Comment 86

•

4 years ago

Gene, I forgot that the attachments are at the top of the page. I installed your test version a few minutes ago. I did get the test message; no problems so far. I have looked at the changes, and I assume that the "XXX" lines are temporary changes. You check for :RESP-CODES" and "RESP_CODES"; is this because either string can appear? Those changes do not seem related to the bug at hand. The new code for the bug at hand appears to be OK, but I do not know cpp, nor do I know the logic flow of the modules with changes. (I did write a few programs in C in my career, but I dislike that language, especially character-handling,)

gene smith

Assignee

Comment 87

•

4 years ago

•

Edited

Yes, the XXX items are just markers to remind me they are temporary and need a fix or removal.
The RESP-CODES vs. RESP_CODES is due to the fact that my ISPs POP3 server sends RESP_CODES but the RFC and the current TB code only looks for RESP-CODES capability string to know if response codes should even occur. So I changed it so that the fix works with my ISP's POP3 server and my "fake" response code SYS/TEMP or a "real" one is detected and acted upon when it occurs. I probably should have put an XXX on that too since I'm not sure that allowing my ISP to violate the RFC is right or not.

Here's the RFC I'm referring to: https://datatracker.ietf.org/doc/html/rfc2449#section-6.4

Barry S. Finkel

Comment 88

•

4 years ago

If your ISP violates the RFC, then it is a good idea to accommodate what that ISP does. The extra "" tests do not add any additional processing time. I had looked at the RFC earlier this morning; it was referenced in the code. I saw "-" but not "". I am not familiar with the POP3 RFCs. I am familiar with the SMTP and mail RFCs (5331 and 5322), as I was involved in a group that edited 531 and 532 to produce 2821 and 2822. I sent my first "e-mail" in 1974 via FTP and my first real e-mail in 1983.

gene smith

Assignee

Comment 89

•

4 years ago

If your ISP violates the RFC, then it is a good idea to accommodate what that ISP does.

I suppose this conforms to "Postel's Law": https://en.wikipedia.org/wiki/Robustness_principle

Yes, the RFC for POP3 extensions https://datatracker.ietf.org/doc/html/rfc2449#section-6.4 only shows RESP-CODES as valid and not RESP_CODE. However these "capability tags" are case insensitive so it might be argued that - and _ are the same since they are both on the same key, at least on US keyboard. (Probably not a good argument since AFAICT case only applies to letters.)

I was involved in a group that edited 531 and 532 to produce 2821 and 2822.

Yes, there you are at top of middle column!: https://datatracker.ietf.org/doc/html/rfc2822#section-8

I sent my first "e-mail" in 1974 via FTP

I had no idea how email worked and learned from my son (circa 2000) that you can actually send a simple email via telnet using HELO, RECP TO, etc strings. This was before much security was in place.

Barry S. Finkel

Comment 90

•

4 years ago

Success!!

C:\Users\BSF\Desktop>grep SYS pop3.log.moz_log | grep ERR
2021-05-26 01:52:40.602000 UTC - [Parent 7140: Main Thread]: I/POP3 [this=1d2c4c13400] RECV2: -ERR [SYS/TEMP] simulated temp error
2021-05-26 07:33:33.297000 UTC - [Parent 7140: Main Thread]: I/POP3 [this=1d2adc15800] RECV2: -ERR [SYS/TEMP] Server error - Please try again later.

C:\Users\BSF\Desktop>

From my log, I had an occurrence at 07:33 UTC today (02:33 Central), and TB recovered. I will send a piece of the log via e-mail.

gene smith

Assignee

Comment 91

•

4 years ago

That's good news! I looked at the log fragment you sent. Was TB sending a "STAT" command when the SYS/TEMP response occurred?

Barry S. Finkel

Comment 92

•

4 years ago

Gene, I do not know the POP3 protocol, and it seems to me to make no difference what command TB sent to the server. At 07:33:33 TB sent a STAT command, and the server responded with a temporary error. There was not much logged after that at 07:33:33. The TB waited until the next user-set interval (10 minutes in my case), and at 07:42:30 TB tried again to get mail, and succeeded.

These are the surrounding ten-minute intervals:

 2021-05-26 07:22:28.566000 UTC - [Parent 7140: Main Thread]: D/POP3 [this=1d2adc13c00] LoadUrl()
 2021-05-26 07:32:30.208000 UTC - [Parent 7140: Main Thread]: D/POP3 [this=1d2adc15800] LoadUrl()
 2021-05-26 07:42:30.250000 UTC - [Parent 7140: Main Thread]: D/POP3 [this=1d2adc13c00] LoadUrl()

gene smith

Assignee

Comment 93

•

3 years ago

Attached patch pop3-sys-temp-fix.patch (obsolete) — Details — Splinter Review

Barry has been battling this bug for well over a year so I want to go ahead and get the formal review process started. So here's my patch.
This is functionally the same as pop3-sys-temp-fix.diff except some temporary changes added for testing and debug have been removed.

Attachment #9224202 - Flags: review?(benc)

Ben Campbell

Comment 94

•

3 years ago

Comment on attachment 9224202 [details] [diff] [review] pop3-sys-temp-fix.patch Review of attachment 9224202 [details] [diff] [review]: ----------------------------------------------------------------- This looks pretty good to me. I'm going to hold off on a r+, pending a decision on the `ClearFlag(POP3_STOPLOGIN);` bit below. ::: mailnews/local/src/nsPop3Protocol.cpp @@ +1192,5 @@ > SetFlag(POP3_STOPLOGIN); > > + if (m_commandResponse.Find("[SYS/TEMP", true) >= 0) { > + m_pop3ConData->command_temp_fail = true; > + ClearFlag(POP3_STOPLOGIN); I might be missing something here, but I don't think it should clear the POP3_STOPLOGIN flag. A SYS/TEMP response is still an error, and I don't think there's any recovery path (short of just failing, showing the user an error dialog, and waiting for them to retry). @@ +1427,5 @@ > SetCapFlag(POP3_HAS_XSENDER); > m_pop3Server->SetPop3CapabilityFlags(m_pop3ConData->capability_flags); > + } else if (!PL_strcasecmp(line, "RESP-CODES") || > + !PL_strcasecmp(line, "RESP_CODES")) { > + // see RFC 2449, chapter 6.4 -- some servers use underscore, not hyphen. Good catch! I think the comment could be reworded slightly - It seems to imply the underscore thing is mentioned in the RFC, but I don't think that is the case. ``` // See RFC 2449, chapter 6.4. Also, some servers use underscore, not hyphen. ``` @@ +1432,5 @@ > SetCapFlag(POP3_HAS_RESP_CODES); > m_pop3Server->SetPop3CapabilityFlags(m_pop3ConData->capability_flags); > + } else if (!PL_strcasecmp(line, "AUTH-RESP-CODE") || > + !PL_strcasecmp(line, "AUTH_RESP_CODE")) { > + // see RFC 3206, chapter 6 -- some servers use underscore, not hyphen. same again @@ +2652,5 @@ > } > > int32_t nsPop3Protocol::GetUidlList(nsIInputStream* inputStream, > uint32_t length) { > + if (m_pop3ConData->command_temp_fail) { (this is just a thought - I'm not asking you to change this!): It feels to me like this block should be inside the: `if(!m_pop3ConData->command_succeeded) {...` block below. rationale: `command_temp_fail` is a dependent of `command_succeeded`. That is, it's only valid if `command_succeeded` is false. downside: it'd require some messing about with the ClearCapFlag(POP3_UIDL_UNDEFINED) too. @@ +2658,5 @@ > + nsCOMPtr<nsIMsgIncomingServer> server = do_QueryInterface(m_pop3Server); > + nsresult rv = server->GetRealHostName(hostName); > + NS_ENSURE_SUCCESS(rv, -1); > + NS_ConvertASCIItoUTF16 hostNameUnicode(hostName); > + return Error("pop3TempServerError", hostNameUnicode.get()); In theory, it should probably fall back to trying XTND XLST (ie set the state to POP3_SEND_XTND_XLST_MSGID), just like it does for a permanent error. Thus we'd avoid bothering the user with a error dialog. But I think that'd be a horrible can of worms - if the XTND XLST also fails then you want to remember that the UIDL error was only a temporary one and show an appropriate message, and that all just seems fraught with madness and complication. So bailing out now seems like the right choice to me!

Ben Campbell

Comment 95

•

3 years ago

Regarding the corrupted-log-file issue mentioned earlier (comment 32), it sounds like the same kind of thing as Bug 1701105.
That one was down to a log file being accessed from multiple processes, which caused the existing data in the file to be overwritten with zeros.
It'll be a different issue - it's a totally different log system, and I've never encountered the problem with MOZ_LOG logging (I'm on Linux). I don't think the MOZ_LOG stuff is used from multiple processes either - it certainly shouldn't be for pop3 logging. But maybe it's the same thing but due to multiple threads instead? And maybe only an issue on windows?
In any case - if anyone is still suffering from it, I'd suggest opening up a new bug to tackle it.

gene smith

Assignee

Comment 96

•

3 years ago

(In reply to Ben Campbell from comment #95)

Regarding the corrupted-log-file issue mentioned earlier (comment 32), it sounds like the same kind of thing as Bug 1701105.
That one was down to a log file being accessed from multiple processes, which caused the existing data in the file to be overwritten with zeros.
It'll be a different issue - it's a totally different log system, and I've never encountered the problem with MOZ_LOG logging (I'm on Linux). I don't think the MOZ_LOG stuff is used from multiple processes either - it certainly shouldn't be for pop3 logging. But maybe it's the same thing but due to multiple threads instead? And maybe only an issue on windows?
In any case - if anyone is still suffering from it, I'd suggest opening up a new bug to tackle it.

The corrupted log problem is branched over to bug 1687377 which is shown above as "blocking" this bug (I guess because I created it by cloning; it's not blocking). I found exactly what the problem is in the mozilla code and submitted a patch in that bug that Magnus didn't like. Also, got only one mozilla person to even comment on the bug in the mozilla code. So I just found a workaround for the problem by making a batch and bash script for windows and linux respectively that allows you to click an http link in TB and have it open in FF without FF clobbering (truncating) the file pointed to by env. var MOZ_LOG_FILE. This is all described in detail in bug 1687377. Sounds similar to the problem you had with that other logging facility.

gene smith

Assignee

Comment 97

•

3 years ago

Ben, my main concern with the patch is that the error message that pops up might be blocking (or modal, but not sure if that's the right term). If this occurs with unattended TB, will it block other background tasks from running? It turns the main window gray when it is present but I'm not sure if that causes other TB processes to stop running or just stops UI stuff.

In theory, it should probably fall back to trying XTND XLST (ie set the state to POP3_SEND_XTND_XLST_MSGID), just like it does for a permanent error. Thus we'd avoid bothering the user with a error dialog.
But I think that'd be a horrible can of worms - if the XTND XLST also fails then you want to remember that the UIDL error was only a temporary one and show an appropriate message, and that all just seems fraught with madness and complication.
So bailing out now seems like the right choice to me!

The original patch submitted by user David Smith attachment 9177562 [details] [diff] [review] did a bit more after the error message is put up. Not sure if this is what you are referring to. However, I never really understood what it was doing and user Barry didn't think the other stuff was needed either. I need to look closer again to make sure.

I'll check your other issues with the patch tomorrow. Thanks!

Barry S. Finkel

Comment 98

•

3 years ago

As I have written above, all that needs to be done when the temporary error is seen is this - terminate the current "Get Messages" attempt. That is all that needs to be done. At the next user-supplied interval (in my case 10 minutes), TB will again start a "Get Messages" process, and that should succeed. The current Gene patch that I am running, I believe, essentially does that. And I had a failure last week where TB was successful in recovery.

gene smith

Assignee

Comment 99

•

3 years ago

I might be missing something here, but I don't think it should clear the POP3_STOPLOGIN flag. A SYS/TEMP response is still an error, and I don't think there's any recovery path (short of just failing, showing the user an error dialog, and waiting for them to retry).

I'm not sure if the error we are dealing with can occur before login anyhow. If it could, it would prevent login due to the SYS/TEMP error if the flag is not cleared on SYS/TEMP error (unless I clear it after the user acks the pop-up message). If SYS/TEMP occurs the flag will be set so that's why I clear it. So it only remains set if the error is LOGIN-DELAY, IN-USE or SYS, not SYS/TEMP.

It feels to me like this block should be inside the:
if(!m_pop3ConData->command_succeeded) {...

I think it's OK as is since command_temp_fail is very specialized, unlike command_succeeded.

In theory, it should probably fall back to trying XTND XLST (ie set the state to POP3_SEND_XTND_XLST_MSGID), just like it does for a permanent error. Thus we'd avoid bothering the user with a error dialog.

Instead of a modal pop-up, I'd rather have just a notification style message and also record it to activity msg. However, I couldn't find a similar thing to base this on in the POP3 code so I just called the standard Error() used in all other POP3 error cases. I'd rather not block the whole POP3 UI with this ugly pop-up dialog (again, this is my biggest concern with the current patch, maybe user Barry thinks it OK as is?).

I'll look closer again at these tomorrow...

gene smith

Assignee

Comment 100

•

3 years ago

Attached patch pop3-sys-temp-fix-v2.patch (obsolete) — Details — Splinter Review

Ben, you were right. I don't need to reset the POP3_STOPLOGIN flag since if the SYS/TEMP error occurs before or during login, the current pop3 instance is destroyed due to authentication failure and a new one is constructed with the flag reset so login is tried again when check for new mail occurs.

I also tried to clarify the "dash" vs. "hypen" change in the comments. (My ISP's POP3 servers returns "RESP_CODES" incorrectly instead of "RESP-CODES" so to test this I had to allow "RESP_CODES".)

I left everything else the way it was for now.

Attachment #9222332 - Attachment is obsolete: true

Attachment #9222668 - Attachment is obsolete: true

Attachment #9224202 - Attachment is obsolete: true

Attachment #9224202 - Flags: review?(benc)

Attachment #9225058 - Flags: review?(benc)

Ben Campbell

Comment 101

•

3 years ago

Comment on attachment 9225058 [details] [diff] [review] pop3-sys-temp-fix-v2.patch Review of attachment 9225058 [details] [diff] [review]: ----------------------------------------------------------------- This looks good to me, and I think this is a big enough improvement to land as it is. There's still the issue of the potentially-blocking error message, and the question of if that's a bad thing or not. But I think that probably comes down to how the end-user experience is. Barry - does the version you tested you pop up an error message? And if so, does it block things or otherwise get in your way?

Attachment #9225058 - Flags: review?(benc) → review+

gene smith

Assignee

Comment 102

•

3 years ago

Barry, if having the message pop up about temporary error is alright with you, I probably should also make a new "try" build (without the "fake" error) using pop3-sys-temp-fix-v2.patch. That will also check that the changes don't affect unit tests.

Flags: needinfo?(bsfinkel)

Magnus Melin [:mkmelin]

Comment 103

•

3 years ago

Comment on attachment 9225058 [details] [diff] [review] pop3-sys-temp-fix-v2.patch Review of attachment 9225058 [details] [diff] [review]: ----------------------------------------------------------------- ::: mail/locales/en-US/chrome/messenger/localMsgs.properties @@ +88,5 @@ > > +#LOCALIZATION NOTE (pop3TempServerError): "Get Messages" is the label on the button that > +# causes new messages to be fetched. %S is where the POP3 server name will appear. > +pop3TempServerError=Mail server %S responded with a temporary error while retrieving \ > +messages. Click the `Get Messages' button to try again. Maybe "Please try again later." rather than saying to click a button. ::: mailnews/local/src/nsPop3Protocol.cpp @@ +2659,5 @@ > + nsCOMPtr<nsIMsgIncomingServer> server = do_QueryInterface(m_pop3Server); > + nsresult rv = server->GetRealHostName(hostName); > + NS_ENSURE_SUCCESS(rv, -1); > + NS_ConvertASCIItoUTF16 hostNameUnicode(hostName); > + return Error("pop3TempServerError", hostNameUnicode.get()); For just showing the notification, there is AlertUserEventUsingName (but IMAP only atm, not sure how easy it is to move elsewhere or generalize it)

gene smith

Assignee

Comment 104

•

3 years ago

Attached image notification-sys-temp.png (obsolete) — Details

Maybe "Please try again later." rather than saying to click a button.

Here's my new proposed wording for the message (now as a notification instead of a modal pop-up). I didn't just say "Please try again" since this message can occur in response to timed biff events or if the user manually clicks "Get Messages" so the retry will typically be automatic via the biff timer.
"Please try again" would only be relevant for manual retrieval and not during biff. So I think it should just say

Mail server %S responded with a temporary error while retrieving new messages
and will be tried again at the next check for new messages."

where "next check for new messages" will be automatic or manual. Anyhow, if this isn't OK, let me know the exact string wanted here and I'll use it.
The revised patch is in the next comment.

Attachment #9177562 - Attachment is obsolete: true

Attachment #9225058 - Attachment is obsolete: true

Attachment #9225416 - Flags: feedback?(mkmelin+mozilla)

Attachment #9225416 - Flags: feedback?(benc)

gene smith

Assignee

Comment 105

•

3 years ago

Attached patch pop3-sys-temp-fix-v3.patch (obsolete) — Details — Splinter Review

Magnus wrote:

For just showing the notification, there is AlertUserEventUsingName (but IMAP only atm, not sure how easy it is to move elsewhere or generalize it)

I looked at this before and it seemed harder then than now. Turned out pretty simple. I just made a change in the existing Error() function to handle my new error string as a notification instead of, like all others, as a dialog. So now the error shows up as an OS specific tray notification and is recorded to the TB Activity Mgr. It no longer shows up as a modal dialog that requires user interaction.

Rather than checking the error string in Error() I could alternatively add a new parameter to Error() that selects whether the error string appears as a dialog or a notification.

Attachment #9225419 - Flags: review?(benc)

Attachment #9225419 - Flags: feedback?(mkmelin+mozilla)

Ben Campbell

Comment 106

•

3 years ago

Comment on attachment 9225419 [details] [diff] [review] pop3-sys-temp-fix-v3.patch Review of attachment 9225419 [details] [diff] [review]: ----------------------------------------------------------------- Ooh cool - I didn't know about the notification thingy! Just goes to show you should never ask me about anything GUI-related ;-) Looks good to me. I'd normally lean toward passing in a "non-modal" flag or something to Error(), rather than hardcoding behaviour keyed by the error name... but there's already another error with special handling, and I think it makes sense to keep all the icky special cases in one place. If we decide there are other errors which should also use this mechanism, then that's the time to add the extra param to Error(). (in any case, one day I'd like to see all this dialog-popping stuff moved out of the protocol entirely, with error callbacks passing out enough information that upper layers can deal with errors in a context-appropriate way (eg some errors, it'd make sense to offer a retry option, or maybe in account setup you want to deal with errors silently as you probe the connection... etc etc))

Attachment #9225419 - Flags: review?(benc) → review+

Ben Campbell

Comment 107

•

3 years ago

•

Edited

Comment on attachment 9225416 [details]
notification-sys-temp.png

It's a bit of an unwieldly message... but that's largely Error() is splicing together your message with a generic one.
Hmm. Here's what you've got now:

Mail server mobile.charter.net responded with a temporary error while retrieving new messages and will be tried again at the next check for new messages.
Mail server mobile.charter.net responded: simulated temp error

How about reordering your new message a little so it doesn't repeat "Mail server mobile.charter.net responded" twice? eg:

Temporary error from mobile.charter.net while retrieving new messages. The operation will be retried at the next check for new messages.
Mail server mobile.charter.net responded: simulated temp error

Attachment #9225416 - Flags: feedback?(benc)

gene smith

Assignee

Comment 108

•

3 years ago

Attached patch pop3-sys-temp-fix-v4.patch (obsolete) — Details — Splinter Review

Ben, I went ahead and added a new parameter to Error(). It's a bool called "showAsDialog" and defaults to true.
I also changed the message text as you suggested.
This ended up touching all 3 file of the patch so I'm requesting another, hopefully last, review.

Barry, I also made another "try" build based on this patch. It won't have the simulated error on first message fetch like the old "try" build. If you could run this and see how it does I would appreciate it. It's here:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=e4d528b3528993cbcde8b629185f216e5d654a63&selectedTaskRun=a2_13S5XSgC4jGqFErnFXw.0
I'm also canceling the "Need Info" flag regarding your opinion since the blocking/modal dialog is now not present and is replaced with just a notification pop-up from the tray.

Attachment #9225416 - Attachment is obsolete: true

Attachment #9225419 - Attachment is obsolete: true

Attachment #9225416 - Flags: feedback?(mkmelin+mozilla)

Attachment #9225419 - Flags: feedback?(mkmelin+mozilla)

Flags: needinfo?(bsfinkel)

Attachment #9225435 - Flags: review?(benc)

Barry S. Finkel

Comment 109

•

3 years ago

(In reply to gene smith from comment #102)

Barry, if having the message pop up about temporary error is alright with you, I probably should also make a new "try" build (without the "fake" error) using pop3-sys-temp-fix-v2.patch. That will also check that the changes don't affect unit tests.

Gene, when the temp error occurred, I was not at my computer. I did not see a pop-up message. It was later, when I did a grep on the log file that I saw that a temporary error had occurred. I am not sure that a message is really needed. When TB does a "Get Messages" and there are no messages to download, there is no message. My wait/retry interval is set to 10 minutes. If I set the interval were set to 1 minute, I have no idea if the temporary error would recur. I have no idea what might cause the temporary error on the server (cluster), so I do not know if it would last a minute or more. Also, someone pointed out that the server is probably a collection of servers, so I do not know that my re-try would hit the same server that had the temporary error. Treat the temp error the same as "no messages downloaded" - no message to the end user. The end-user does not need to know that there was a temporary error, if TB recovers from the error.

C:\Users\BSF\Desktop>grep SYS pop3.log.210526.moz_log | grep ERR
2021-05-26 01:52:40.602000 UTC - [Parent 7140: Main Thread]: I/POP3 [this=1d2c4c13400] RECV2: -ERR [SYS/TEMP] simulated temp error
2021-05-26 07:33:33.297000 UTC - [Parent 7140: Main Thread]: I/POP3 [this=1d2adc15800] RECV2: -ERR [SYS/TEMP] Server error - Please try again later.

C:\Users\BSF\Desktop>

Barry S. Finkel

Comment 110

•

3 years ago

(In reply to gene smith from comment #108)

Created attachment 9225435 [details] [diff] [review]
pop3-sys-temp-fix-v4.patch

Barry, I also made another "try" build based on this patch. It won't have the simulated error on first message fetch like the old "try" build. If you could run this and see how it does I would appreciate it. It's here:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=e4d528b3528993cbcde8b629185f216e5d654a63&selectedTaskRun=a2_13S5XSgC4jGqFErnFXw.0
I'm also canceling the "Need Info" flag regarding your opinion since the blocking/modal dialog is now not present and is replaced with just a notification pop-up from the tray.

Gene, I might have time tomorrow night to install your test build. I want to install the 21H1 Feature Set tomorrow, and I learned from my mainframe system programmer days that it is not a good idea to make two changes at the same time. If a problem occurs, then one does not know which change caused the problem (maybe both). On the mainframe when this happened, we were able to put diagnostics into the code (we had the source for both sub-systems) and then revise the code to get around the problem. I spent many days looking at successful-branch traces (something that the mainframe has that Windows does not have - I do not know about Unix) to try to determine exactly what was happening. And I never was able to determine what change(s) caused the problem.

gene smith

Assignee

Comment 111

•

3 years ago

Barry, I didn't realize you didn't see the error message dialog when the SYS/TEMP error occurred. I'll look at the log you sent me and see if I can tell why.

gene smith

Assignee

Comment 112

•

3 years ago

Looking at the log, it appears that the SYS/TEMP response occurred in response to a STAT command. The code I added really doesn't affect that but only affects responses to UIDL that are SYS/TEMP. However, a failed STAT, even if SYS/TEMP, looks like it should still produce the "pop3StatFail" error

The STAT command did not succeed. Error getting message number and sizes.

Did you see an error saying that?

I just now simulated the SYS/TEMP error on the response to STAT and I do see the above error occur in a dialog. This also causes message to not be fetched until tried again on the next attempt (manually or via timer). Maybe all occurrences of SYS/TEMP should be just a notification? (I guess I was incorrectly thinking that SYS/TEMP only occurs in response to UIDL.)

Barry S. Finkel

Comment 113

•

3 years ago

In my case, a SYS/TEMP will not occur in response to a UIDL, because TB knows that my server does not accept the UIDL command. The problem with the standard TB code is that after a SYS/TEMP (not properly parsed), it tries a UIDL command. That UIDL command fails, and TB remembers that fact and thus does not retry the "Get Messages" after the user-set interval. I did not see any pop-up window when the failure occurred.

Barry S. Finkel

Comment 114

•

3 years ago

One additional comment - It does not matter what POP3 command was sent to the server; any command that gets a SYS/TEMP error response should result in the current "Get Messages" attempt to be terminated (I do not care if there is a message produced).

Barry S. Finkel

Comment 115

•

3 years ago

I had to do a reboot this morning because a full Defender scan caused windows to lose text. Before the reboot I shut down TB, and looked at the log. There were many SYS/TEMP occurrences, so I renamed the log to save it.

C:\Users\BSF\Desktop>grep SYS pop3.log.210607.moz_log | grep ERR
2021-06-06 21:30:00.283000 UTC - [Parent 5860: Main Thread]: I/POP3 [this=1e739077c00] RECV2: -ERR [SYS/TEMP] simulated temp error
2021-06-06 21:41:06.812000 UTC - [Parent 5860: Main Thread]: I/POP3 [this=1e722f0e000] RECV2: -ERR [SYS/TEMP] Server error - Please try again later.
2021-06-07 11:31:17.179000 UTC - [Parent 5860: Main Thread]: I/POP3 [this=1e722f0e000] RECV2: -ERR [SYS/TEMP] Server error - Please try again later.
2021-06-07 11:41:14.952000 UTC - [Parent 5860: Main Thread]: I/POP3 [this=1e722f0e000] RECV2: -ERR [SYS/TEMP] Server error - Please try again later.
2021-06-07 11:42:15.946000 UTC - [Parent 5860: Main Thread]: I/POP3 [this=1e722f0e000] RECV2: -ERR [SYS/TEMP] Server error - Please try again later.
2021-06-07 11:43:21.061000 UTC - [Parent 5860: Main Thread]: I/POP3 [this=1e722f0e000] RECV2: -ERR [SYS/TEMP] Server error - Please try again later.
2021-06-07 11:51:14.735000 UTC - [Parent 5860: Main Thread]: I/POP3 [this=1e722f0e000] RECV2: -ERR [SYS/TEMP] Server error - Please try again later.

C:\Users\BSF\Desktop>

Is there anything in this 51 Mb log that you need to see? I saw no pop[-up windows, but I have no idea if any pop-up would dissappear on its own. Your test window has me click "Yes", IIRC.

gene smith

Assignee

Comment 116

•

3 years ago

In my case, a SYS/TEMP will not occur in response to a UIDL, because TB knows that my server does not accept the UIDL command.

Your server does support UIDL. It comes back in response to the CAPA command and in your the log you sent me via email I see it occur and it works correctly. The message that occurred with un-patched code indicating it is not supported is misleading/wrong.

Is there anything in this 51 Mb log that you need to see?

For the SYS/TEMP errors, which command in the log was being sent when the errors occurred. You can tell by finding the line with SEND: **** preceding the SYS/TEMP error, for example from the log you sent via email

2021-05-26 07:32:31.160000 UTC - [Parent 7140: Main Thread]: I/POP3 [this=1d2adc15800] SEND: STAT
2021-05-26 07:33:33.297000 UTC - [Parent 7140: Main Thread]: I/POP3 [this=1d2adc15800] Entering NET_ProcessPop3 56
2021-05-26 07:33:33.297000 UTC - [Parent 7140: Main Thread]: I/POP3 [this=1d2adc15800] Entering state: 3
2021-05-26 07:33:33.297000 UTC - [Parent 7140: Main Thread]: I/POP3 [this=1d2adc15800] RECV2: -ERR [SYS/TEMP] Server error - Please try again later.

indicates the error occurred on the STAT command.

I saw no pop[-up windows, but I have no idea if any pop-up would dissappear on its own. Your test window has me click "Yes", IIRC.

The dialog window should not disappear until you click OK or YES (not sure which is shown). At least that's how it works for me.

Barry S. Finkel

Comment 117

•

3 years ago

Gene, two questions:

Do I have to click on the message to get TB to continue to download mail? If so, then the message is bad because a SYS/TEMP error might occur at a time when I am not at the computer.
Do you want me to install your test build? Or are you going to revise it based on newer information?

Thanks.

Ben Campbell

Comment 118

•

3 years ago

Comment on attachment 9225435 [details] [diff] [review] pop3-sys-temp-fix-v4.patch Review of attachment 9225435 [details] [diff] [review]: ----------------------------------------------------------------- Looks good to me! r+, assuming you and Barry are both happy about the end-user experience. (sorry about the delay reviewing this - I was away Monday, then distracted down message-copy rabbit holes yesterday!)

Attachment #9225435 - Flags: review?(benc) → review+

gene smith

Assignee

Comment 119

•

3 years ago

(In reply to Barry S. Finkel from comment #117)

Gene, two questions:

Do I have to click on the message to get TB to continue to download mail? If so, then the message is bad because a SYS/TEMP error might occur at a time when I am not at the computer.

The message I added with this patch is now just a notification that appears out of the system tray so it won't cause anything to be delayed since there is nothing to "click". However, even without the patch it is possible to have a SYS/TEMP error on commands other than UIDL (like STAT as indicated in your log from several weeks ago) where the message will cause a stoppage and require you to click OK. However, you say you never saw this which doesn't seem to make sense.

That said, if the dialog requiring a click of OK occurs in the middle of the night with TB running, yes, emails won't be "downloaded" until you click OK. However, unless I'm missing something, why would that matter? You will still download and then see any new emails after you click OK on the next timed fetch or if you click "Get Messages", so nothing is lost.

Do you want me to install your test build? Or are you going to revise it based on newer information?

I'm not sure if I will revise it at this time. I was hoping you would answer the question I asked in comment 116:

For the SYS/TEMP errors, which command in the log was being sent when the errors occurred? You can tell by finding the line with SEND: **** preceding the SYS/TEMP error, for example from the log you sent via email...

If you don't understand what I'm asking here, maybe you could just cut out part of the 51 Mb log file that contains where the SYS/TEMP errors occurs (some before to some after) and attach it above or email it to me.

So I guess there is no reason to install the new "try" build until this is resolved.

Barry S. Finkel

Comment 120

•

3 years ago

Gene, I sent you via yesterday (6/7) at 12:12 CDT an e-mall that contains all of the "SEND:" lines and the "SYS/TEMP" lines. Did you see that, and was that what you wanted?

Barry S. Finkel

Comment 121

•

3 years ago

Gene, another comment. If TB recovers from a temp error, then I do not need to know that fact (other to conclude the first time that the revised code is working). I see no need for a pop-up window where the user has to acknowledge.

gene smith

Assignee

Comment 122

•

3 years ago

•

Edited

(In reply to Barry S. Finkel from comment #121)

Gene, another comment. If TB recovers from a temp error, then I do not need to know that fact (other to conclude the first time that the revised code is working). I see no need for a pop-up window where the user has to acknowledge.

Well, if you click on "Get Messages" and you receive no new messages due to SYS/TEMP and some are actually available you might want to know something went temporarily wrong and you might want to try again.

Anyhow, I'm pretty sure I now know why you don't see any error dialogs when the random SYS/TEMPs occur. The Error() function is still actually getting called but because polling for new messages occurs in "biff" mode (biff is a weird term for checking for new mail via a timer) TB assumes there is no "window" to display the dialog onto. So inside Error(), if the "window" pointer comes back null, there is no attempt to put up a dialog. (I worked on another bug recently involving this but somehow I didn't notice this inside Error() until just now. Duh.)

So what this means is that I don't need the special notification and will just let it be a normal dialog with the OK button. This dialog for SYS/TEMP will only appear if SYS/TEMP occurs when you do "Get Messages" by clicking.

I'll need to update the patch one more time.

gene smith

Assignee

Comment 123

•

3 years ago

Attached patch pop3-sys-temp-fix-v5.patch — Details — Splinter Review

Ben, I'm sure you're getting tired of looking at this. Now I know why Barry doesn't see any dialogs for SYS/TEMP errors since it occurs during biff and there is no window. So backed out my previous addition of showing a notification and always have Error() show a dialog as it always has. So the new error message will only occur and require an OK when "Get Message" clicked which I'm pretty sure is OK with Barry.

Attachment #9225435 - Attachment is obsolete: true

Attachment #9226226 - Flags: review?(benc)

Ben Campbell

Comment 124

•

3 years ago

Comment on attachment 9226226 [details] [diff] [review] pop3-sys-temp-fix-v5.patch Review of attachment 9226226 [details] [diff] [review]: ----------------------------------------------------------------- LGTM. And no problem on all the revisions - I'm perfectly happy to review any number of patches! (This bug in particular has been great - I ended up learning a whole lot more about POP3 than I did before)

Attachment #9226226 - Flags: review?(benc) → review+

gene smith

Assignee

Comment 125

•

3 years ago

Thanks Ben.
Barry, With any luck this will appear in tomorrow's nightly build.

Keywords: checkin-needed-tb

Magnus Melin [:mkmelin]

Updated

•

3 years ago

status-thunderbird_esr78: --- → wontfix

Target Milestone: --- → 91 Branch

Pulsebot

Comment 126

•

3 years ago

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/8084ddf6d387
Avoid disable of fetch new messages after POP3 SYS/TEMP error. r=benc

Status: ASSIGNED → RESOLVED

Closed: 3 years ago

Keywords: checkin-needed-tb

Resolution: --- → FIXED

Barry S. Finkel

Comment 127

•

3 years ago

Is the fix in the nightly build? And from can I download that build? Thanks. I looked at the latest patch, and it looks OK to me.

gene smith

Assignee

Comment 128

•

3 years ago

•

Edited

Hi Barry, it got put in on Fri. evening per comment 126 above. I also see it in the official "comm-central" log. So you can get today's daily here: https://archive.mozilla.org/pub/thunderbird/nightly/2021/06/2021-06-13-09-51-45-comm-central/
You probably need file

File 	thunderbird-91.0a1.en-US.win64.installer.exe 	54M 	13-Jun-2021 11:23

P/S: This links to the most recent daily so you don't have provide a date in the url:
https://ftp.mozilla.org/pub/thunderbird/nightly/latest-comm-central/
(Not sure what the files from 31 May are doing here. But the installer file with the correct date is still down below.)

Barry S. Finkel

Comment 129

•

3 years ago

I installed the daily build this morning.

Barry S. Finkel

Comment 130

•

3 years ago

Ar 12:43 today I logged a successful recovery from a temporary server error. So, the fix does work. I saved the log in case anyone needs info from it.

gene smith

Assignee

Updated

•

10 months ago

Updated

•

5 months ago

patch 4 years ago David Smith 2.92 KB, patch		Details \| Diff \| Splinter Review
patch 2 4 years ago David Smith 2.87 KB, patch		Details \| Diff \| Splinter Review
patch 3 4 years ago David Smith 2.83 KB, patch		Details \| Diff \| Splinter Review
pop3-sys-temp-fix.diff 4 years ago gene smith 12.16 KB, patch		Details \| Diff \| Splinter Review
sys-temp-message.png 4 years ago gene smith 53.86 KB, image/png		Details
pop3-sys-temp-fix.patch 3 years ago gene smith 6.46 KB, patch		Details \| Diff \| Splinter Review
pop3-sys-temp-fix-v2.patch 3 years ago gene smith 6.53 KB, patch	benc : review+	Details \| Diff \| Splinter Review
notification-sys-temp.png 3 years ago gene smith 57.26 KB, image/png		Details
pop3-sys-temp-fix-v3.patch 3 years ago gene smith 8.10 KB, patch	benc : review+	Details \| Diff \| Splinter Review
pop3-sys-temp-fix-v4.patch 3 years ago gene smith 9.63 KB, patch	benc : review+	Details \| Diff \| Splinter Review
pop3-sys-temp-fix-v5.patch 3 years ago gene smith 6.45 KB, patch	benc : review+	Details \| Diff \| Splinter Review

Intermittent UIDL or XTND XLST Error Messages

D:\bsftools>

C:\Users\BSF\Desktop>