Closed Bug 713228 Opened 13 years ago Closed 13 years ago

SQLiteDatabaseCorruptException: database disk image is malformed: PRAGMA synchronous=1;

Categories

(Firefox for Android Graveyard :: General, defect, P2)

ARM
Android
defect

Tracking

(firefox11 fixed, firefox12+ fixed, fennec11+)

RESOLVED FIXED
Firefox 12
Tracking Status
firefox11 --- fixed
firefox12 + fixed
fennec 11+ ---

People

(Reporter: aaronmt, Assigned: gcp)

References

Details

(Keywords: dataloss)

Attachments

(1 file, 3 obsolete files)

I/ProfMigr( 1022): Opening path: /data/data/org.mozilla.fennec_aurora/files/mozilla/q1exgqly.default/places.sqlite
I/GeckoAboutHome( 1022): error reading json file
I/GeckoAboutHome( 1022): java.io.IOException: Bad file number
I/GeckoAboutHome( 1022): 	at java.util.zip.Inflater.setFileInputImpl(Native Method)
I/GeckoAboutHome( 1022): 	at java.util.zip.Inflater.setFileInput(Inflater.java:410)
I/GeckoAboutHome( 1022): 	at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:221)
I/GeckoAboutHome( 1022): 	at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:178)
I/GeckoAboutHome( 1022): 	at java.util.zip.ZipFile$ZipInflaterInputStream.read(ZipFile.java:459)
I/GeckoAboutHome( 1022): 	at org.mozilla.gecko.AboutHomeContent$7.run(AboutHomeContent.java:270)
I/GeckoAboutHome( 1022): 	at android.os.Handler.handleCallback(Handler.java:587)
I/GeckoAboutHome( 1022): 	at android.os.Handler.dispatchMessage(Handler.java:92)
I/GeckoAboutHome( 1022): 	at android.os.Looper.loop(Looper.java:130)
I/GeckoAboutHome( 1022): 	at org.mozilla.gecko.GeckoAppShell$LooperThread.run(GeckoAppShell.java:155)
I/Database( 1022): sqlite returned: error code = 11, msg = database corruption found by source line 40112
I/Database( 1022): sqlite returned: error code = 11, msg = database disk image is malformed
E/Database( 1022): Failure 11 (database disk image is malformed) on 0x1ca680 when preparing 'PRAGMA synchronous=1;'.
E/Database( 1022): Removing corrupt database: /data/data/org.mozilla.fennec_aurora/files/mozilla/q1exgqly.default/places.sqlite
E/Database( 1022): !@ - DB-onCorruption /data/data/org.mozilla.fennec_aurora/files/mozilla/q1exgqly.default/places.sqlite
E/Database( 1022): !@)bAdRfS
W/GLThreadManager( 1022): checkGLESVersion mGLESVersion = 131072 mMultipleGLESContextsAllowed = true
W/GLThread( 1022): onSurfaceCreated
E/Database( 1022): !@Deleting and re-creating corrupt database /data/data/org.mozilla.fennec_aurora/files/mozilla/q1exgqly.default/places.sqlite
E/Database( 1022): android.database.sqlite.SQLiteDatabaseCorruptException: database disk image is malformed: PRAGMA synchronous=1;
E/Database( 1022): 	at android.database.sqlite.SQLiteDatabase.native_execSQL(Native Method)
E/Database( 1022): 	at android.database.sqlite.SQLiteDatabase.execSQL(SQLiteDatabase.java:1904)
E/Database( 1022): 	at android.database.sqlite.SQLiteDatabase.<init>(SQLiteDatabase.java:2010)
E/Database( 1022): 	at android.database.sqlite.SQLiteDatabase.openDatabase(SQLiteDatabase.java:905)
E/Database( 1022): 	at org.mozilla.gecko.ProfileMigrator$PlacesTask.openPlaces(ProfileMigrator.java:312)
E/Database( 1022): 	at org.mozilla.gecko.ProfileMigrator$PlacesTask.migratePlaces(ProfileMigrator.java:336)
E/Database( 1022): 	at org.mozilla.gecko.ProfileMigrator$PlacesTask.run(ProfileMigrator.java:371)
E/Database( 1022): 	at java.lang.Thread.run(Thread.java:1019)
E/Database( 1022): !@)bAdRfS
I/Database( 1022): sqlite returned: error code = 14, msg = cannot open file at source line 25472
E/Database( 1022): sqlite3_open_v2("/data/data/org.mozilla.fennec_aurora/files/mozilla/q1exgqly.default/places.sqlite", &handle, 1, NULL) failed
I/ProfMigr( 1022): Error on places database:unable to open database file
W/GLThread( 1022): onSurfaceChanged(480, 699)
I/GLThread( 1022): sending render notification tid=9
D/CLIPBOARD( 1022): Hide Clipboard dialog at Starting input: finished by someone else... !

STR:

1. Install Aurora XUL (http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/2011-12-20-04-20-29-mozilla-aurora-android/fennec-10.0a2.multi.android-arm.apk)

2. Browse to CNN.com, bookmark CNN.com

3. Install Aurora Native (http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/latest-mozilla-aurora-android/fennec-11.0a2.multi.android-arm.apk)

ER: See CNN.com imported and bookmarked
AR: Dont see CNN.com imported and bookmarked

---
Tested via:

Samsung Galaxy SII (Android 2.3.4)
Samsung Nexus S (Android 4.0.3)

20111220042029
http://hg.mozilla.org/releases/mozilla-aurora/rev/0d8490e27457

20111223042034
http://hg.mozilla.org/releases/mozilla-aurora/rev/7c408ff60a8e
Keywords: dataloss
Nexus One (Android 2.3.4), slightly different, same actual result

I/GeckoApp( 1804): checking profile migration in: /data/data/org.mozilla.fennec_aurora/files/mozilla/n1kk77jy.default
I/ProfMigr( 1804): Opening path: /data/data/org.mozilla.fennec_aurora/files/mozilla/n1kk77jy.default/places.sqlite
I/Database( 1804): sqlite returned: error code = 11, msg = database corruption found by source line 40107
I/Database( 1804): sqlite returned: error code = 11, msg = database disk image is malformed
I/ProfMigr( 1804): Failed to get bookmarks: database disk image is malformed: , while compiling: SELECT places.url AS a_url, places.title AS a_title FROM (moz_places as places JOIN moz_bookmarks as bookmarks ON places.id = bookmarks.fk) WHERE places.hidden <> 1 ORDER BY bookmarks.dateAdded
I/Database( 1804): sqlite returned: error code = 11, msg = database corruption found by source line 40107
I/Database( 1804): sqlite returned: error code = 11, msg = database disk image is malformed
I/ProfMigr( 1804): Failed to get bookmarks: database disk image is malformed: , while compiling: SELECT places.url AS a_url, places.title AS a_title, history.visit_date AS a_date FROM (moz_historyvisits AS history JOIN moz_places AS places ON places.id = history.place_id) WHERE places.hidden <> 1 ORDER BY history.visit_date DESC
I/Database( 1804): sqlite returned: error code = 11, msg = database corruption found by source line 40107
I/Database( 1804): sqlite returned: error code = 11, msg = database disk image is malformed
I/ProfMigr( 1804): Failed to get favicons: database disk image is malformed: , while compiling: SELECT places.url AS a_url, favicon.data AS a_data, favicon.mime_type AS a_mime FROM (moz_places AS places JOIN moz_favicons AS favicon ON places.favicon_id = favicon.id)
Notes:

- places.sqlite was written by the SQLite that is used by XUL Fennec. I guess that's our own version which we update frequently.

- We're trying to read it in from Android, so using whatever SQLite Android bundles.

- Android seems to think the best way to deal with a database it thinks is corrupted is to delete it automatically (gee, thanks!). That could cause dataloss here, but perhaps the READONLY flag protects us from it.
Assignee: nobody → gpascutto
Priority: -- → P2
STR do not produce the issue on my Galaxy Tab 10.1 with Android 3.1. Slightly surprising as the Android SQLite should be inbetween the 2.3.4 and 4.0.3 that are reprotedly affected. Will try with the Nexus S next.
I'm able to reproduce this on the Samsung Galaxy SII (Android 2.3.4) combo. The same database that gives the error works correctly on the Galaxy Tab 10.1 and can be read locally.
tracking-fennec: --- → 11+
Patch to add a Java<->JNI<->SQLite Bridge. Simple API that works for the current users (parameter support requested by password manager). Updated ProfileMigrator to use this instead of the Android DB.

This makes the profile migration succeed on my SGS2.
Attachment #588998 - Flags: review?(blassey.bugs)
Comment on attachment 588998 [details] [diff] [review]
Patch 1. Add Bridge to use our own SQLite version. Update Profile Migrator.

Review of attachment 588998 [details] [diff] [review]:
-----------------------------------------------------------------

Just to cross your Ts you should get a storage peer to sign off on adding this code to their module (https://wiki.mozilla.org/Modules/All#storage).

::: mobile/android/base/ProfileMigrator.java
@@ +44,5 @@
>  
>  import android.database.Cursor;
>  import android.database.sqlite.SQLiteDatabase;
>  import android.database.sqlite.SQLiteException;
>  import android.database.sqlite.SQLiteStatement;

can these be removed now?

@@ +91,5 @@
>          + "places.id = bookmarks.fk) WHERE places.hidden <> 1 "
>          + "ORDER BY bookmarks.dateAdded";
> +    // Result column of relevant data
> +    final int bookmarkUrl   = 0;
> +    final int bookmarkTitle = 1;

this strikes me as slightly fragile. Please file a follow up to add an API to get column indexes from column names.

@@ +187,2 @@
>              try {
> +                ArrayList queryResult = db.query(historyQuery);

can we make that an ArrayList<Object[]> to avoid the cast below? Same comment for future uses as well obviously

::: mobile/android/base/sqlite/ByteArray.java
@@ +40,5 @@
> +/*
> + * Helper class which holds a byte[]-array, but
> + * which is derived from Object (which byte[] isn't).
> + */
> +public class ByteArray {

why not use the built in ByteBuffer?

::: mobile/android/base/sqlite/SQLiteBridge.java
@@ +77,5 @@
> +    // The result is returned as an ArrayList of Object[] arrays, with each
> +    // row being an entry in the ArrayList, and each column being one Object
> +    // in the Object[] array. The columns are of type null, ByteArray (BLOB),
> +    // or String (everything else).
> +    public ArrayList query(String aQuery, String[] aParams) throws SQLiteBridgeException {

ArrayList<Object[]>

::: mozglue/android/APKOpen.cpp
@@ -543,5 @@
>    MOZLOAD("mozalloc");
>    MOZLOAD("nspr4");
>    MOZLOAD("plc4");
>    MOZLOAD("plds4");
> -  MOZLOAD("mozsqlite3");

file a follow up to do this lib load (and your new library) through our custom lib loader

::: storage/android/SQLiteBridge.cpp
@@ +62,5 @@
> +    jenv->DeleteLocalRef(cls);
> +}
> +
> +jobject
> +JNI_Construct_ByteArray(JNIEnv* jenv, jbyteArray jBytes)

unneeded if you use ByteBuffer
Attachment #588998 - Flags: review?(blassey.bugs) → review+
Attachment #588998 - Flags: review?(bugmail)
I'll put here what I wrote on irc: i'd suggest not creating a separate lib for the sqlite bridge, but put that in mozglue, under mozglue/android, using the same tricks as for the jni bridge to libxul.so.
So, I initially only addressed Brad's comments, but I ran into a bug in elfhack, which crashed on libsqlitebridge. I bit the bullet and split the entire Gecko lib loading into 2 parts, moved everything from libsqlitebridge into mozglue using dlsym and made it use our custom linker. So basically, all the benefits, none of the disadvantages of the previous approach. (Ok, maybe the dylsym code in APKOpen.cpp isn't as pretty as a separate lib)

I also added getColumnIndex for the SQL query results.

There's a huge amount of changes, so requesting rereview.
Attachment #588998 - Attachment is obsolete: true
Attachment #588998 - Flags: review?(bugmail)
Attachment #589331 - Flags: review?(blassey.bugs)
Comment on attachment 589331 [details] [diff] [review]
Patch 1. Add Bridge to use our own SQLite version. Update Profile Migrator.

Review of attachment 589331 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/GeckoApp.java
@@ +1435,5 @@
>          if (sGREDir == null)
>              sGREDir = new File(this.getApplicationInfo().dataDir);
>  
> +        GeckoAppShell.loadSQLiteLibs(mAppContext.getApplication().getPackageResourcePath());
> +        checkMigrateProfile();

why aren't you doing this delayed on the background thread anymore?

::: mobile/android/base/ProfileMigrator.java
@@ +238,5 @@
>              BitmapDrawable image = (BitmapDrawable) Drawable.createFromStream(byteStream, "src");
>              if (image != null) {
>                  try {
>                      BrowserDB.updateFaviconForUrl(mCr, url, image);
> +                } catch (Exception e) {

what is being thrown here? I'm not a fan of blanket catching Exception in general.

::: mozglue/android/APKOpen.cpp
@@ +644,5 @@
> +  GETFUNC(sqlite3_column_type);
> +  GETFUNC(sqlite3_column_blob);
> +  GETFUNC(sqlite3_column_bytes);
> +  GETFUNC(sqlite3_column_text);
> +#undef GETFUNC

nit, move this function wrapping code to an init function in SQLiteBridge.cpp

::: mozglue/android/SQLiteBridge.cpp
@@ +79,5 @@
> +}
> +
> +// Calls public static ByteBuffer wrap (byte[] array)
> +static jobject
> +JNI_Construct_ByteBuffer(JNIEnv* jenv, jbyteArray jBytes)

rather than wrapping the bytearray, you should allocate a ByteBuffer, then get access to its data directly to operate on  http://docs.oracle.com/javase/1.4.2/docs/guide/jni/jni-14.html#NewDirectByteBuffer

@@ +85,5 @@
> +    jclass jcls;
> +    jmethodID jmid;
> +    jobject jObj;
> +
> +    jcls = jenv->FindClass("java/nio/ByteBuffer");

these sorts of operations (getting classes and method ids) should be done once in an init function

@@ +143,5 @@
> +    dbPath = jenv->GetStringUTFChars(jDb, NULL);
> +
> +    jobject jNull = jenv->NewGlobalRef(NULL);
> +    jclass objectClass = jenv->FindClass("java/lang/Object");
> +    jclass stringClass = jenv->FindClass("java/lang/String");

init function

@@ +243,5 @@
> +                int colLen = f_sqlite3_column_bytes(ppStmt, i);
> +
> +                // byte[] = jbyteArray, but that isn't a real Object
> +                // so we need to use the ByteBuffer wrapper
> +                jbyteArray jBytes = jenv->NewByteArray(colLen);

construct a ByteBuffer and get direct access to its buffer with GetDirectBufferAddress

@@ +247,5 @@
> +                jbyteArray jBytes = jenv->NewByteArray(colLen);
> +                if (jBytes == NULL) {
> +                    goto error_close;
> +                }
> +                jenv->SetByteArrayRegion(jBytes, 0, colLen, (jbyte*)blob);

this will just be a memcpy
Attachment #589331 - Flags: review?(blassey.bugs) → review-
Comment on attachment 589331 [details] [diff] [review]
Patch 1. Add Bridge to use our own SQLite version. Update Profile Migrator.

Review of attachment 589331 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozglue/android/APKOpen.cpp
@@ +616,5 @@
> +  MOZLOAD("mozalloc");
> +  MOZLOAD("nspr4");
> +  MOZLOAD("plc4");
> +  MOZLOAD("plds4");
> +  sqlite_handle = MOZLOAD("mozsqlite3");

You shouldn't need to MOZLOAD anything other than mozsqlite3.
Note this is going to conflict with bug 683127, which I'm hoping to land this week.
>why aren't you doing this delayed on the background thread anymore?

1) SQLiteLoadLibs needs to finish before GeckoLoadLibs starts, so those 2 need some kind synchronisation.
2) We already agreed that profile migration should be modal and blocking.

However, you're right that this ain't good - it will block the Java app loading even if profile migration doesn't need to run, and it will just show a blank until I get the splash screen for it landed. 

I moved it back to a thread, though it required some juggling because of (1).

>what is being thrown here? I'm not a fan of blanket catching Exception in general.

SQLiteConstraintExceptions (see bug 716729). I've narrowed it down now - it will go away eventually.

>rather than wrapping the bytearray, you should allocate a ByteBuffer, then get access to 
>its data directly to operate on  http://docs.oracle.com/javase/1.4.2/docs/guide/jni/jni-
>14.html#NewDirectByteBuffer
>construct a ByteBuffer and get direct access to its buffer with GetDirectBufferAddress

I implemented this as you suggested. However, this only works with direct ByteBuffers, which have no backing byte[]. Because the Java code does need that, it will end up copying everything again anyway. And the burden to do that has now shifted from the library code (once) to all users (every time). So this wasn't much of a win - had I known from the start I would have pushed back.

>Note this is going to conflict with bug 683127, which I'm hoping to land this week.

That sucks. This one needs to land too, it's a dataloss bug and it blocks bug 704682.
Attachment #589331 - Attachment is obsolete: true
Attachment #589481 - Flags: review?(blassey.bugs)
Blocks: 704682
(In reply to Gian-Carlo Pascutto (:gcp) from comment #11)
> Created attachment 589481 [details] [diff] [review]
> Patch 1. Add Bridge to use our own SQLite version. Update Profile Migrator.
> 
> >why aren't you doing this delayed on the background thread anymore?
> 
> 1) SQLiteLoadLibs needs to finish before GeckoLoadLibs starts, so those 2
> need some kind synchronisation.
> 2) We already agreed that profile migration should be modal and blocking.
Let's not make this change here.

Also, this shouldn't block gecko from starting. Please put the migration check back to the delayed runnable.

 
> 
> However, you're right that this ain't good - it will block the Java app
> loading even if profile migration doesn't need to run, and it will just show
> a blank until I get the splash screen for it landed. 
> 
> I moved it back to a thread, though it required some juggling because of (1).
we should start the gecko thread, then check if we need to do a migration. If we need to do a migration we should put up a modal dialog, not block gecko from starting.
>Also, this shouldn't block gecko from starting. Please put the migration check 
>back to the delayed runnable.

The checkMigrateProfile is calling ProfileMigrator.launchBackground(), which does GeckoAppShell.getHandler().post(new PlacesTask());

This means it shouldn't be blocking Gecko, as that is at that moment starting in another thread. Right?
Comment on attachment 589481 [details] [diff] [review]
Patch 1. Add Bridge to use our own SQLite version. Update Profile Migrator.

Review of attachment 589481 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/GeckoApp.java
@@ +1447,5 @@
>  
>          sGeckoThread = new GeckoThread(intent, mLastUri, mRestoreSession);
>          if (!ACTION_DEBUG.equals(intent.getAction()) &&
> +            checkAndSetLaunchState(LaunchState.Launching, LaunchState.Launched)) {
> +            new Thread(new LoadLibTask()).start();

this is going to impact start up time, please don't change this.

Instead, change loadSQLiteLibs() to ensureSQLiteLibsLoaded() and add a boolean variable to GeckoAppShell that sSQLiteLibsLoaded that says whether they need to be loaded or not. Protect the ensureSQLiteLibsLoaded function with a synchronised so two different threads don't try to load the libs.
Attachment #589481 - Flags: review?(blassey.bugs) → review-
Incorporated review comments. Added ByteBufferInputStream to avoid extra buffer copies.
Attachment #589481 - Attachment is obsolete: true
Attachment #589896 - Flags: review?(blassey.bugs)
Comment on attachment 589896 [details] [diff] [review]
Patch 1. Add Bridge to use our own SQLite version. Update Profile Migrator.

Review of attachment 589896 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/GeckoApp.java
@@ +1469,2 @@
>              sGeckoThread.start();
> +        }

remove the brackets

::: mobile/android/base/GeckoAppShell.java
@@ +420,5 @@
>          }
> +        return extractLibs;
> +    }
> +
> +    public synchronized static void ensureSQLiteLibsLoaded(String apkName) {

synchronizing on a static method uses the entire class as a lock.

Instead, make sSQLiteLibsLoaded a Boolean instead of a boolean and synchronize on that. 

To be clear, this should be: 
public static void ensureSQLiteLibsLoaded(String apkName) {
    if (sSQLiteLibsLoaded)
        return;
    synchronized(sSQLiteLibsLoaded) {
        if (sSQLiteLibsLoaded)
            return;
        loadSQLiteLibsNative(apkName, loadLibsSetup(apkName));
        sSQLiteLibsLoaded = true;
    }
}

::: mobile/android/base/GeckoThread.java
@@ +91,2 @@
>          GeckoAppShell.loadGeckoLibs(
>              app.getApplication().getPackageResourcePath());

not sure what the cost of app.getApplication().getPackageResourcePath() is, but just to be safe, only do it once and reuse the result

::: mobile/android/base/sqlite/ByteBufferInputStream.java
@@ +61,5 @@
> +        return mByteBuffer.get();
> +    }
> +
> +    @Override
> +    public int read(byte[] aBytes, int aOffset, int aLen) throws IOException {

I think both read methods need to be synchronized to be thread safe

::: mozglue/android/APKOpen.cpp
@@ +300,5 @@
> +SHELL_WRAPPER3(notifySmsDeleteFailed, jint, jint, jlong)
> +SHELL_WRAPPER2(notifyNoMessageInList, jint, jlong)
> +SHELL_WRAPPER8(notifyListCreated, jint, jint, jstring, jstring, jstring, jlong, jint, jlong)
> +SHELL_WRAPPER7(notifyGotNextMessage, jint, jstring, jstring, jstring, jlong, jint, jlong)
> +SHELL_WRAPPER3(notifyReadingMessageListFailed, jint, jint, jlong)

this patch is already big enough. Please split the cleaning up of these semi-colons into a separate patch. r=me on that clean up so it can land separately
Attachment #589896 - Flags: review?(blassey.bugs) → review+
Attachment #589896 - Flags: approval-mozilla-aurora?
FWIW, bug 683127 broke this, and this wasn't noticed until I got a crashing nightly in my hands, which means there are not tests for places migration.
:glandium, if this is broke again by another bug, should we reopen this bug?
No, bug 683127 was disabled.
Comment on attachment 589896 [details] [diff] [review]
Patch 1. Add Bridge to use our own SQLite version. Update Profile Migrator.

[Triage Comment]
Mobile only - approved for Aurora.
Attachment #589896 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
I now understand the root cause for this:

http://www.sqlite.org/wal.html

WAL support was added in 3.7.0. We use WAL support in our own SQLite database engine. 

It is *not* available in pre-Honeycomb Android:
http://stackoverflow.com/questions/2421189/version-of-sqlite-used-in-android

SQLite documentation says:

http://www.sqlite.org/wal.html
Backwards Compatibility

...To prevent older versions of SQLite from trying to recover a WAL-mode database (and making matters worse) the database file format version numbers (bytes 18 and 19 in the database header) are increased from 1 to 2 in WAL mode. Thus, if an older version of SQLite attempts to connect to an SQLite database that is operating in WAL mode, it will report an error along the lines of "file is encrypted or is not a database".

This explains this bug: Android versions prior to HoneyComb will not be able to read the XUL database because it was written with WAL and they don't support that. Android versions post-Honeycomb (i.e. my Galaxy Tab) work fine as they have more recent SQLite's on board.

Note that the claimed test result on the Samsung Nexus S (Android 4.0.3) is inconsistent. Might that have been a mistake?
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: