Closed Bug 1591811 Opened 6 years ago Closed 6 years ago

Lost all contacts in Personal Address Book during mbox -> sqlite migration of Thunderbird 71

Categories

(Thunderbird :: Address Book, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 73.0

People

(Reporter: ronjouch, Assigned: darktrojan)

Details

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0

Steps to reproduce:

  1. Run the beta channel
  2. Be automatically updated to Thunderbird 71.0b1 (64-bit) under Linux, released on October 24, 2019.
  3. Open the Address Book, click "Personal Address Book"

Actual results:

All my contacts are gone! :-(

Expected results:

My contacts (a few hundreds) should have been here.

Not sure how to help you dig that; feel free to ask for more info.

Geoff was working on that. Let's see if he has suggestions on debugging this.

Flags: needinfo?(geoff)

You are on the beta channel for automatic updates? If yes, you should have gotten build2 of 71.0b1 which fixes bug 1590637. If no, your build ID at help > troubleshooting will be older than 20191024.

What happens if you restart Thunderbird?

Component: Migration → Address Book
Flags: needinfo?(ronan)

(In reply to Wayne Mery (:wsmwk) from comment #3)

You are on the beta channel for automatic updates?

Yes.

You should have gotten build2 of 71.0b1 which fixes bug 1590637.
If no, your build ID at help > troubleshooting will be older than 20191024.

I'm running 71.0b1 20191024110308.

What happens if you restart Thunderbird?

Restarting doesn't help: after a restart, my "Personal Address Book" is still empty.

Additional information:

  • In my profile folder, I see the abook.mab.bak mork backup created on Thursday 24 Oct. It's 321kB with 780 contacts.
  • Next to it I have an abook.sqlite file weighing 1.5MB, and opening with sqlite-browser it I see the properties table contains 7757 rows of seemingly sane card/name/value data.
  • So, it looks like migration happened okay, but data isn't loaded/populated
  • I'd like to help more by adding logging, but looking at https://wiki.mozilla.org/MailNews:Logging , I'm not sure which env. variable(s) would be adequate for my needs. Can anyone help?
Flags: needinfo?(ronan)

Also, I'd like to avoid sharing the data. Open to questions to help debugging, though.

You can try repeating the conversion:

Move .sqlite file out of the way and rename the .mab.bak back to .mab. Then look at .filename preferences in the config editor and rename .sqlite back to .mab, like in ldap_2.servers.pab.filename. Also change the corresponding .dirType back to 2, like in ldap_2.servers.pab.dirType.

Make a backup of abook.mab.bak before you start.

Geoff will know more.

(In reply to Jorg K (GMT+2) from comment #6)

You can try repeating the conversion [...]

Thanks Jorg. Before I do, I'd like help to get logging from the module doing the conversion.
I tried following https://wiki.mozilla.org/MailNews:Logging and for example started Tb with env MOZ_LOG_FILE=~/tb.log MOZ_LOG=MODULE_NAME:2,timestamp thunderbird (I tried {Mailbox, IMAP} for MODULE_NAME). A file is created, but it stays desperately empty. What am I doing wrong? What's the right MODULE_NAME taking care of the conversion?

I'd say there is no logging option for AB conversion, but we'd really have to wait for the developer who implemented this.

(In reply to Jorg K (GMT+2) from comment #8)

we'd really have to wait for the developer who implemented this.

Hi, any news from Geoff?

I followed comment 6 to repeat the conversion under released-yesterday 71.0b2-20191030211447, with the same results as 71.0b1-20191024110308. To sum up,

  • {abook,history}.mab.bak got backed up. They're respectively {321kB, 4kB} and contain {700, 5} contacts.
  • {abook,history}.sqlite created, weighing {1.5MB, 330kB}. SQLite Browser opens abook.sqlite successfully, and I see 4 tables: {cards, list_cards, lists, properties}. First 3 ones are empty, and properties contains 7757 rows with card/name/value columns.
  • So, it looks like migration was performed and worked.
  • When I open my address books,
    • "Collected addresses" is populated with the 5 entries present in history.mab.bak.
    • ... but "Personal Address Book" appears empty.
  • So, migration got done for both address books, but something in the Personal Address Book data causes it to fail to load/populate in the UI.
  • I'd like to help more by looking at logs, but looking at https://wiki.mozilla.org/MailNews:Logging , I'm not sure which env. variable(s) would be adequate for my needs.
  • I cannot share the .mab files.

Feel free to ask for more debug info.

Unfortunately Geoff has been on PTO this week. We've shipped TB 71 beta with this migration to a about 34.000 thousand people now (https://stats.thunderbird.net/#beta) and this is the only complaint I've seen. I'm Geoff will investigate when he returns next week.

(In reply to Ronan Jouchet from comment #9)

  • {abook,history}.sqlite created, weighing {1.5MB, 330kB}. SQLite Browser opens abook.sqlite successfully, and I see 4 tables: {cards, list_cards, lists, properties}. First 3 ones are empty, and properties contains 7757 rows with card/name/value columns.
  • So, it looks like migration was performed and worked.
  • When I open my address books,
    • "Collected addresses" is populated with the 5 entries present in history.mab.bak.
    • ... but "Personal Address Book" appears empty.
  • So, migration got done for both address books, but something in the Personal Address Book data causes it to fail to load/populate in the UI.

A-ha! Digging a bit more, I just noticed the problem: in (correctly-migrated) history.sqlite, both the cards and properties table got populated, with respectively 5 and 38 entries.

However, in badly-migrated abook.sqlite, only the properties table got populated (with 7757 entries)! The cards table is empty!

So,

  1. It's not a UI data population bug, it's actually a partially-broken migration, and migration will have to be triggered again for users having gone through the broken migration.
  2. Geoff (or anyone wanting to dig the code), can you take a close look at what could fail in the migration code creating cards entries?

And I have logs. My previous comment was assuming incorrect data population. But now, re-running the migration and immediately opening the Error Console, here's what I see:

Creating new abook.sqlite                        MailMigrator.jsm:700:15
No chrome package registered for chrome://remote/content/RemoteAgent.jsm
Contract ID '@mozilla.org/remote/agent' was registered as a command line handler for entry 'm-remote', but could not be created.
[xpconnect wrapped mozIStorageError]           AddrBookDirectory.jsm:376
    handleError resource:///modules/AddrBookDirectory.jsm:376
Renaming abook.mab to abook.mab.bak              MailMigrator.jsm:751:15
Creating new history.sqlite                      MailMigrator.jsm:700:15
Renaming history.mab to history.mab.bak          MailMigrator.jsm:751:15

I opened my omni.ja, reached for AddrBookDirectory.jsm, and added console.logs, hoping to narrow down on the culprit entry, but my logging doesn't show up in the Error Console. However, I see for example the logging for Creating new abook.sqlite in MailMigrator.jsm, and it's done with console.log too.

Is there setup to do to have my console.logs show up in the Error Console? Or is the JS in omni.ja cached somewhere and my code didn't run? Am I missing something else?

Your safest bet is to get a debug version from here
https://queue.taskcluster.net/v1/task/bDRnOrXxRYu4Yc6g3lXkLg/runs/0/artifacts/public/build/target.tar.bz2 (assuming Linux64)
and add simple dump() calls.

If you edit omni.ja, just rename it to omni.zip, then unpack it. TB will then use the "flat files".

I don't think that console.log() will show anywhere in a non-debug version.

Happy hunting!

Hi Ronan, thanks for digging further into this. There's some good information here.

(In reply to Ronan Jouchet from comment #12)

[xpconnect wrapped mozIStorageError]           AddrBookDirectory.jsm:376
    handleError resource:///modules/AddrBookDirectory.jsm:376

That is an unfortunately unhelpful message. At least it confirms what I was thinking about where the problem might be. I'd really need to see the result and message properties of the error object before I can say any more.

Flags: needinfo?(geoff)

(In reply to Jorg K (GMT+1) (PTO to 15th Dec 2019, sporadically reading bugmail) from comment #14)

Your safest bet is to get a debug version and add simple dump() calls.
https://queue.taskcluster.net/v1/task/bDRnOrXxRYu4Yc6g3lXkLg/runs/0/artifacts/public/build/target.tar.bz2 (assuming Linux64)

It's been a while and I wouldn't want to use this old build from September on my new data. Can you give me a permalink to the latest tb-linux64 build, or a link to the Tb build dashboard? I tried fiddling with the taskcluster.net URL but never get anywhere :-/ .

If you edit omni.ja, just rename it to omni.zip, then unpack it. TB will then use the "flat files".

Okay, I did that and Tb still works. However...

I don't think that console.log() will show anywhere in a non-debug version.

... I'm confused, because I see that they do show in my release version: I see that the lines that do log to the Error Console I mentioned above are console.log calls. For example, I grepped that the Renaming ... logging comes from MailMigrator.jsm +751,

console.log(`Renaming ${fileName}.mab to ${backupFile.leafName}`);

, and I see it in the Error Console (Ctrl+Shift+J) in Thunderbird, with clear confirmation that the source is the console.log above:

Renaming abook.mab to abook.mab.bak              MailMigrator.jsm:751:15

Plus, if I click on MailMigrator.jsm:751:15 I'm even brought to the line in the file in my $EDITOR.

However, if I:

  1. Modify this line (in the flat file, after extracting & deleting omni.ja), for example logging 'HELLO Renaming'...
  2. Then re-run the migration (with a little rm ~/.thunderbird/ronj/abook.sqlite ~/.thunderbird/ronj/history.sqlite && cp ~/.thunderbird/ronj/abook.mab.bak ~/.thunderbird/ronj/abook.mab && cp ~/.thunderbird/ronj/history.mab.bak ~/.thunderbird/ronj/history.mab && thunderbird )
  3. Peek at the logs in the Error Console, looking for my changed console.log

, then I do not see my new HELLO code being run, I see the side effect of the pre-modification console.log ! And to add to my befuddlement, when I click on the Error Console link to MailMigrator.jsm:751:15, I'm brought to the file and I do see my modification.

→ What's going on here? Is there a code cache somewhere needing invalidation? Or is this JS compiled somewhere?

Flags: needinfo?(geoff)

Running Thunderbird with the argument --purgecaches should fix the last problem. I don't imagine using any other version will be of any more use to you than the one you have. (Just be aware that if an automatic update comes along it will mess up what you're working on and you'll have to start again, so maybe turn off updates for now.)

Flags: needinfo?(geoff)

(In reply to Geoff Lankow (:darktrojan) from comment #17)

Running Thunderbird with the argument --purgecaches should fix the last problem.

Thanks, that did the trick.

So, good news! I was able to progress: after I console.log the error rather than just Cu.reportError it, I see that AddrBookDirectory.jsm :: _bulkAddCards fails with

{
  "message": "UNIQUE constraint failed: cards.uid",
  "result": 19,
  "CONSTRAINT": 19
}

So, I added logging of each the uid for card being migrated, and yes, attempts to use the same uid are made. From what I see, the cards parameter of _bulkAddCards simply contains duplicate contacts, leading to attempting to migrate the same contact twice 🙂. Here's last run:

  • I have in this contact in my Mork database (nit: I'm not even sure I'm extracting this properly, gosh Mork is ugly):
(330C=Catherine)(29EB=704)(3310=catherine.ANONYMIZED@hotmail.com)(35C4=608118)(330F=ANONYMIZED)(3312=ANONYMIZED)(330E=3b510f8e-f08c-4bc4-a142-72d85a04a7b7)
  • But at the time of migration, when iterating on cards, my added logging in _bulkAddCards reports it will try to migrate this contact twice:
**** NEW CARD, uid= 3b510f8e-f08c-4bc4-a142-72d85a04a7b7 AddrBookDirectory.jsm:410:15
Name: DisplayName , Value: Catherine AddrBookDirectory.jsm:447:19
// ... 40 other cards being iterated on ...
**** NEW CARD, uid= 3b510f8e-f08c-4bc4-a142-72d85a04a7b7 AddrBookDirectory.jsm:410:15
Name: DisplayName , Value: Catherine AddrBookDirectory.jsm:447:19
// ... more stuff being done, and then when the async processing hits duplicated Catherine,
error.message: UNIQUE constraint failed: cards.uid AddrBookDirectory.jsm:461:19
error.result: 19 AddrBookDirectory.jsm:462:19
error: {"message":"UNIQUE constraint failed: cards.uid","result":19,"ERROR":1,"INTERNAL":2,"PERM":3,"ABORT":4,"BUSY":5,"LOCKED":6,"NOMEM":7,"READONLY":8,"INTERRUPT":9,"IOERR":10,"CORRUPT":11,"FULL":13,"CANTOPEN":14,"EMPTY":16,"SCHEMA":17,"TOOBIG":18,"CONSTRAINT":19,"MISMATCH":20,"MISUSE":21,"NOLFS":22,"AUTH":23,"FORMAT":24,"RANGE":25,"NOTADB":26} AddrBookDirectory.jsm:463:19

At this point I see these possibilities:

  1. I haven't looked at _bulkAddCards(cards)'s caller. Geoff, could you comb through it and look for possibilities to build a cards argument containing duplicate entries?
  2. If nothing comes out of 1. , maybe simply de-duplicate cards by uid before calling _bulkAddCards(cards) ?
  3. If 1. and 2. above feel too hackish / fixing without understanding the root cause, feel free to ask targeted questions about the Mork db / contact (as I'm not willing to share it). I tried inspecting my Mork db and looked for inconsistencies, but I'm not familiar enough with Mork (and did I say GOSH MORK IS UGLY 😄) to find anything.
Flags: needinfo?(geoff)

and did I say GOSH MORK IS UGLY 😄

If you think that's ugly you should see the code that operates it! That's what we're trying to get away from…

Seriously though, I don't know how you've got two cards with the same unique identifier but that's certainly a problem. Is it the same card, ie. are there two copies of the same contact? It looks like it.

I think what I'll do here is assign a new UID if one appears twice in the data. That way no data is lost.

Assignee: nobody → geoff
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Flags: needinfo?(geoff)

(In reply to Geoff Lankow (:darktrojan) from comment #19)

I don't know how you've got two cards with the same unique identifier but that's certainly a problem. Is it the same card, ie. are there two copies of the same contact? It looks like it.

Yes, my logging confirms that the migration code goes two times over the same contact (same firstname, lastname, email, nickname, etc). But again, I confirm that, grepping inside the original Mork db (by UUID, or name, nickname, or email), there is only one such contact. That's why my previous post was suggesting that the Mork db seems sane, but that it could be incorrect iterated on by the migration code.

I think what I'll do here is assign a new UID if one appears twice in the data. That way no data is lost.

So, I'd end up with duplicate contacts after imports, that I'd have to clean up manually? Sure if you see no other way, seems like a small price to pay to cover my case and avoid data loss. From what my logging indicates, about 50 other contacts were processed at the time of crashing, so I don't know if this duplicate entry was the first of a series, or the only one. I guess I'll know when I try your new-UID code, and feel free to ask for more info on my Mork db if you want to dig more :) .

It appears to be there only once because Mork avoids storing duplicate strings. You'll have more than one reference to it: ^330E.

(In reply to Geoff Lankow (:darktrojan) from comment #21)

It appears to be there only once because Mork avoids storing duplicate strings. You'll have more than one reference to it: ^330E.

Correct. In abook.mab I have three matches for ^330E: the contact itself:

(330C=Catherine ANONYMIZED)(29EB=704)
  (3310=catherine.ANONYMIZED@hotmail.com)(35C4=608118)(330F=ANONYMIZED)(3312
    =ANONYMIZEDANONYMIZEDANONYMIZEDANONYMIZEDANONYMIZED\
ANONYMIZED)(330E=3b510f8e-f08c-4bc4-a142-72d85a04a7b7)

, and two references:

  [2C0(^87^330C)(^8C=0)(^CC=704)(^8D=5)(^89^3310)(^B8^3585)(^8A^3310)
    (^B9^35C4)(^86=)(^85=)(^C3=)(^B7=)(^A1=)(^AE=)(^9A=)(^C4=)(^B2=)
    (^A7=)(^B3=)(^C6=)(^A4=)(^92=)(^A5=)(^A3=)(^9B=)(^CA^54C)(^9E=)
    (^B0=)(^CE=)(^93=)(^D0=)(^A0=)(^C7=)(^99=)(^83^1D9)(^A8=)(^C8=1)
    (^84^330F)(^9C=)(^A6=)(^B6=)(^AF=)(^CB=)(^90=)(^B1=)(^CD=)(^8F=)
    (^A2=)(^91=)(^88^3312)(^9F=)(^B4=)(^9D=)(^8B=)(^C9=)(^C5=)(^B5=)
    (^D5^330E)]
  [2EA(^87^330C)(^86=)(^85=)(^C3=)(^A1=)(^AE=)(^B9=297)(^C4=)(^CC=746)
    (^8C=0)(^D5^330E)(^88^3370)(^A7=)(^9A=)(^8A^3310)(^B3=)(^A6=)(^A4=)
    (^A5=)(^9E=)(^A3=)(^9B=)(^99=)(^B0=)(^8D=2)(^B2=)(^CE=)(^93=)(^D0=)
    (^CA^54C)(^B7=)(^A0=)(^C6=)(^C7=)(^83^1D9)(^A8=)(^C8=1)(^84^330F)
    (^92=)(^B6=)(^89^3310)(^AF=)(^CB=)(^90=)(^B1=)(^CD=)(^8F=)(^A2=)
    (^91=)(^B8^3371)(^9F=)(^B4=)(^9D=)(^8B=)(^C9=)(^C5=)(^9C=)(^B5=)]

This assigns a new UID if a duplicate is found in the cards being added. It also aborts the transaction and forwards a the SQLite error message to the calling function for logging, should a similar error happen again.

Attachment #9114458 - Flags: review?(mkmelin+mozilla)
Comment on attachment 9114458 [details] [diff] [review] 1591811-duplicate-uids-1.diff Review of attachment 9114458 [details] [diff] [review]: ----------------------------------------------------------------- Looks good, thx. r=mkmelin
Attachment #9114458 - Flags: review?(mkmelin+mozilla) → review+

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/b2b1ffe98ee5
Prevent two contacts with the same UID breaking the migration process; r=mkmelin DONTBUILD

Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 73.0

This is not fixed in Tb 72.0b2-20191212014636.
Is it supposed to be, or was the patch only applied on branch 73 but not 72?

This patch has not landed on 72.

Fix confirmed.

Deleted {abook, history}.sqlite, renamed {abook, history}.mab.bak to {abook, history}.mab wrongly-converted by 71 / 72, and launched 73.0b1. I end up with my expected contacts, plus five duplicates (about which I see nothing special in common) that I deleted.

Thanks for the fix!

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: