Closed
Bug 785032
Opened 13 years ago
Closed 13 years ago
SD card issues with new Tegras ("INFO: attempting to create file /mnt/sdcard/writetest ... Push File Failed to Validate!")
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: cmtalbert)
References
Details
(Keywords: intermittent-failure, Whiteboard: [red])
Attachments
(2 files, 1 obsolete file)
4.21 KB,
patch
|
cmtalbert
:
review+
|
Details | Diff | Splinter Review |
9.25 KB,
patch
|
jmaher
:
review+
|
Details | Diff | Splinter Review |
{
========= Started Running verify.py failed (results: 2, elapsed: 34 secs) (at 2012-08-23 03:14:28.642449) =========
python /builds/sut_tools/verify.py
in dir /builds/tegra-362/test/build (timeout 1200 secs)
watching logfiles {}
argv: ['python', '/builds/sut_tools/verify.py']
environment:
HOME=/home/cltbld
PATH=/tools/buildbot-0.8.4-pre-moz2/bin:/usr/local/bin:/usr/local/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/cltbld/bin
PWD=/builds/tegra-362/test/build
SUT_IP=10.250.51.202
SUT_NAME=tegra-362
using PTY: False
DEBUG: updateSUT: Using tegra 'tegra-362' found in env variable
INFO: Using tegra 'tegra-362' found in env variable
INFO: attempting to ping tegra
reconnecting socket
INFO: updateSUT.py: We're running SUTAgentAndroid Version 1.11
INFO: Got expected SUTAgent version '1.11'
INFO: attempting to create file /mnt/sdcard/writetest
Push File Failed to Validate!
program finished with exit code 1
}
s: tegra-362
https://tbpl.mozilla.org/php/getParsedLog.php?id=14627585&tree=Firefox
s: tegra-343
https://tbpl.mozilla.org/php/getParsedLog.php?id=14627866&tree=Firefox
Reporter | ||
Comment 1•13 years ago
|
||
slave: tegra-325
https://tbpl.mozilla.org/php/getParsedLog.php?id=14622443&tree=Mozilla-Inbound
slave: tegra-330
https://tbpl.mozilla.org/php/getParsedLog.php?id=14623386&tree=Mozilla-Inbound
slave: tegra-340
https://tbpl.mozilla.org/php/getParsedLog.php?id=14626225&tree=Mozilla-Inbound
SD card related, but different error:
slave: tegra-331
https://tbpl.mozilla.org/php/getParsedLog.php?id=14624360&tree=Mozilla-Inbound
Reporter | ||
Comment 2•13 years ago
|
||
Comment 3•13 years ago
|
||
Comment 4•13 years ago
|
||
Comment 5•13 years ago
|
||
Comment 6•13 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=14780614&tree=Mozilla-Inbound
s: tegra-119
(Not so sure about this one)
Comment 7•13 years ago
|
||
Comment 8•13 years ago
|
||
Comment 9•13 years ago
|
||
Comment 10•13 years ago
|
||
Comment 11•13 years ago
|
||
Comment 12•13 years ago
|
||
Comment 13•13 years ago
|
||
Comment 14•13 years ago
|
||
Comment 15•13 years ago
|
||
Comment 16•13 years ago
|
||
Updated•13 years ago
|
Whiteboard: [orange]
Comment 17•13 years ago
|
||
Comment 18•13 years ago
|
||
Comment 19•13 years ago
|
||
Comment 20•13 years ago
|
||
Comment 21•13 years ago
|
||
Comment 22•13 years ago
|
||
Comment 23•13 years ago
|
||
Comment 24•13 years ago
|
||
Comment 25•13 years ago
|
||
We've seen this problem on non-new tegras, so don't believe it's related to the way the new tegras are installed. Removing block on new tegra install.
No longer blocks: 767456
Comment 26•13 years ago
|
||
Don't you mean "on one and only one non-new tegra"? And "a problem other than this one"?
This bug is
INFO: attempting to create file /mnt/sdcard/writetest
Push File Failed to Validate!
and every one of the ones on 119, the only non-new tegra mentioned here, has instead been
INFO: attempting to create file /mnt/sdcard/writetest
removing file: /mnt/sdcard/writetest
True
reconnecting socket
a thing which we've known about for some time, but have been calling bug 681861 rather than bothering to file the bug I really wanted to file, "Intermittent Android 'True'".
Blocks: 767456
Comment 27•13 years ago
|
||
Apparently I'm wrong, and old tegras do something else which isn't this, but is this, but they don't do it visibly, they do it invisibly, but their invisible not-it is this.
No longer blocks: 767456
Comment 28•13 years ago
|
||
Comment 29•13 years ago
|
||
Comment 30•13 years ago
|
||
Comment 31•13 years ago
|
||
Comment 32•13 years ago
|
||
Comment 33•13 years ago
|
||
Comment 34•13 years ago
|
||
Comment 35•13 years ago
|
||
Comment 36•13 years ago
|
||
Comment 37•13 years ago
|
||
Comment 38•13 years ago
|
||
Comment 39•13 years ago
|
||
Comment 40•13 years ago
|
||
Comment 41•13 years ago
|
||
Comment 42•13 years ago
|
||
Comment 43•13 years ago
|
||
Comment 44•13 years ago
|
||
Passing it this to ctalbert since he volunteered to look into this.
Assignee: nobody → ctalbert
Comment 45•13 years ago
|
||
Comment 46•13 years ago
|
||
Comment 47•13 years ago
|
||
Comment 48•13 years ago
|
||
Assignee | ||
Comment 49•13 years ago
|
||
So, given we don't know exactly what is happening here, the current hypothesis is that the agent is starting before the sdcard is mounted. That shouldn't happen. The agent waits on the android.intent.action.MEDIA_MOUNTED action to fire its startup receiver and start the agent.[1] However, I found that the agent would startup just fine with ONLY this code active when no sdcard was in the system (I disabled the code in the watcher).
So, this patch adds a second check of the external storage mount state in all the places we start the agent -- both in the watcher and in the agent itself. It also outputs debugging information so we can see what is happening when we test this, and I ask very much that we test this before deploying it.
If we get an r+ here I'd like to stage this an then we can do a final patch for deployment with less/better debug output.
Attachment #667280 -
Flags: review?(jmaher)
Comment 50•13 years ago
|
||
Comment on attachment 667280 [details] [diff] [review]
Patch to help debug and possibly fix this
Review of attachment 667280 [details] [diff] [review]:
-----------------------------------------------------------------
just some nits
::: build/mobile/sutagent/android/watcher/WatcherService.java
@@ +937,5 @@
> + if (Environment.MEDIA_MOUNTED.compareTo(state) != 0) {
> + Log.i("SUTAgentWatcher", "SDcard not mounted, waiting another turn");
> + return;
> + } else if (Environment.MEDIA_MOUNTED_READ_ONLY.equals(state)) {
> + Log.e("SUTAgentWatcher", "SDCard mounted read only not starting agent now, try again in 60s");
nit: caps here should be SDcard
@@ +945,3 @@
> if (bStartSUTAgent && !GetProcessInfo(sProgramName))
> {
> + Log.i("SUTAgentWatcher", "Starting SUTAGent from watcher code");
NIT: caps here should be SUTAgent
Attachment #667280 -
Flags: review?(jmaher) → review+
Assignee | ||
Comment 51•13 years ago
|
||
Ok, this patch and the dependent bug patch in 797652 will allow you to test this. Callek can you stage this on some tegras and we can see if this works for us?
Attachment #667280 -
Attachment is obsolete: true
Attachment #667759 -
Flags: review+
Comment 52•13 years ago
|
||
(In reply to Clint Talbert ( :ctalbert ) from comment #51)
> Created attachment 667759 [details] [diff] [review]
> Addresses nits
>
> Ok, this patch and the dependent bug patch in 797652 will allow you to test
> this. Callek can you stage this on some tegras and we can see if this works
> for us?
Sure, will just have you/joel give me the apk's and I'll get it up.
Assignee | ||
Comment 53•13 years ago
|
||
Assignee | ||
Comment 54•13 years ago
|
||
Note that this isn't the real fix we want. We have to deal with the fact that the agent needs to run on devices without an external sdcard and so we can't have it never come up if there is no sdcard. In that instance, we simply have to change the fallback behavior so that the agent never uses a fallback that is not world readable/writable. But this code in comment 53 should enable us to understand if this is the problem we're seeing on this bug and then we can fix it properly.
Assignee | ||
Comment 55•13 years ago
|
||
This is the final patch we should go with. I tried to fix the fallback method; however, I can't for the life of me get the agent to write anything into /data/local/tmp although the location appears to be (and is supposed to be) world writeable/readable. The fallback does work, but given that the getTestRoot code has never actually returned null (although that is what it was coded to do) it can cause the agent to crash, so I had to null guard pruneCommandLog against that. I think this should get us where we want to go, especially for the watcher, which is the more difficult one of the pair to update. We can continue working on the agent's fallback code in getTestRoot and see if we can actually get that to work properly.
Attachment #668302 -
Flags: review?(jmaher)
Comment 56•13 years ago
|
||
Comment on attachment 668302 [details] [diff] [review]
Final patch
Review of attachment 668302 [details] [diff] [review]:
-----------------------------------------------------------------
this looks pretty good in general. I think we will be good in doing this, but I could see us changing the behaviour in the future as we hit other scenarios or find out more data.
::: build/mobile/sutagent/android/DoCommand.java
@@ +1324,5 @@
> + tmpFile.createNewFile();
> + } catch (IOException e){
> + Log.i("SUTAgentAndroid", "Caught exception creating file in /data/local/tmp: " + e.getMessage());
> + }
> +
nit: extra whitespace here.
@@ +1334,5 @@
> + return(Environment.getExternalStorageDirectory().getAbsolutePath());
> + }
> + if (tmpFile.exists())
> + {
> + Log.i("CLINT", "tmpfile exists");
extra debugging in here?
Attachment #668302 -
Flags: review?(jmaher) → review+
Comment 57•13 years ago
|
||
Reporter | ||
Comment 58•13 years ago
|
||
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated•13 years ago
|
Keywords: intermittent-failure
Updated•13 years ago
|
Whiteboard: [orange][red] → [red]
Updated•12 years ago
|
Product: mozilla.org → Release Engineering
You need to log in
before you can comment on or make changes to this bug.
Description
•