Closed
Bug 1050769
Opened 10 years ago
Closed 10 years ago
upload a new mobile_tp4.zip pageset to the 3 headed remote talos server (round 2)
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: jmaher, Assigned: sbruno)
Attachments
(1 file: 230.88 KB, text/plain)
+++ This bug was initially created as a clone of Bug #1030166 +++
^ that bug would have relevant information for updating this.
Thanks to the work of :edmorley in bug 1050161, we have another round of cleaned up network access, here is the updated mobile_tp4.zip:
http://people.mozilla.org/~jmaher/taloszips/zips/mobile_tp4.zip
shasum mobile_tp4.zip
7373b491baf27dda89f47365142ba0d2ff6df1c6 mobile_tp4.zip
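A minimal sketch of how the downloaded archive could be checked against the checksum above before it is uploaded. The helper name verify_sha1 is hypothetical, and the sketch assumes GNU sha1sum is available (it produces the same SHA-1 digest as shasum).

```shell
# Hypothetical helper: succeed only if FILE's SHA-1 digest equals EXPECTED.
verify_sha1() {
  file=$1
  expected=$2
  # sha1sum prints "<digest>  <name>"; keep only the digest.
  actual=$(sha1sum "$file" | cut -d' ' -f1)
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file has $actual, expected $expected" >&2
    return 1
  fi
}

# Usage (checksum taken from this comment):
# verify_sha1 mobile_tp4.zip 7373b491baf27dda89f47365142ba0d2ff6df1c6
```

Running the same check on the server after the upload would show whether the file that arrived is the one that was published here.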
Comment 1•10 years ago
|
||
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 2•10 years ago
|
||
It looks like this didn't work - in bug 1051993 I've pushed to try again, and I'm getting the same external connections as before. I've just re-downloaded the zip in comment 0 here and it does include my changes, so I think perhaps the wrong zip was uploaded to relengwebadm.private.scl3
Please could we try the upload again? :-)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 3•10 years ago
|
||
(In reply to Ed Morley [:edmorley] from comment #2)
> It looks like this didn't work - in bug 1051993 I've pushed to try again,
> and I'm getting the same external connections as before. I've just
> re-downloaded the zip in comment 0 here and it does include my changes, so I
> think perhaps the wrong zip was uploaded to relengwebadm.private.scl3
>
> Please could we try the upload again? :-)
Flags: needinfo?(bugspam.Callek)
Comment 4•10 years ago
|
||
Simone, can you take this on "today"? If not, n-i me back and I'll do it during my day.
Flags: needinfo?(bugspam.Callek) → needinfo?(sbruno)
Comment 5•10 years ago
|
||
Hey Ed,
Can you provide a single "changed file[name]" and a related shasum of said file, so we can verify against the server/extracted fileset that your change is in place as well?
Flags: needinfo?(emorley)
Comment 6•10 years ago
|
||
(In reply to Justin Wood (:Callek) from comment #5)
> Hey Ed,
>
> Can you provide a single "changed file[name]" and a related shasum of said
> file to so we can verify against the server/extracted fileset that your
> change is in place as well.
One of the changed files:
[/c/tpn]$ shasum amazon.com/www.amazon.com/index.html
9b132c56328821a1ab7d1c5d48d769328061f66a amazon.com/www.amazon.com/index.html
Flags: needinfo?(emorley)
Assignee | ||
Comment 7•10 years ago
|
||
I uploaded and extracted the zip file. The change seems to be in place:
# sha1sum amazon.com/www.amazon.com/index.html
9b132c56328821a1ab7d1c5d48d769328061f66a amazon.com/www.amazon.com/index.html
Flags: needinfo?(sbruno)
Comment 8•10 years ago
|
||
Thank you :-)
Assignee: nobody → sbruno
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Comment 9•10 years ago
|
||
I'm still seeing the same failures in bug 1051993's new try run.
I've gone over the relevant files with a fine toothcomb and am pretty sure I've not missed any external connections - I just think the pandas are not running the same pageset as the one that was updated in comment 7.
I think we need to rule out:
1) That the zip isn't extracting to the wrong directory structure (ie one too deep or something like that - and we then have a double pageset, one inside the other).
2) That the pandas are in fact using this server/pageset for the remote-tp4m_nochrome run
3) That there's no caching going on
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 10•10 years ago
|
||
For #1, could we get an |ls -al| from the top level of the talos repo, with the output attached here as a private file?
Assignee | ||
Comment 11•10 years ago
|
||
:edmorley:
After extracting, the mobile_tp4 folder is located here: /data/releng/src/talos-remote/www/talos-repo/talos/page_load_test/mobile_tp4 (as per instructions provided in https://bugzilla.mozilla.org/show_bug.cgi?id=1030166#c0). The old version of that folder was also located there, so basically I just replaced it with the new version.
Comment 12•10 years ago
|
||
Yeah, I know it should have worked, and that files were overwritten, but that could just mean the previous uploads went to the wrong place too. An |ls -al| would at least help with #1 above.
Assignee | ||
Comment 13•10 years ago
|
||
Reporter | ||
Comment 14•10 years ago
|
||
so the top level folder shows this:
./talos/page_load_test:
total 64
drwxr-xr-x 13 root root 4096 Aug 19 01:59 .
drwxr-xr-x 14 root root 4096 Jun 25 13:44 ..
drwxr-xr-x 2 root root 4096 Jun 20 2013 a11y
drwxr-xr-x 4 root root 4096 May 2 09:57 canvasmark
drwxr-xr-x 3 root root 4096 Dec 1 2011 dhtml
drwxr-xr-x 5 root root 4096 May 2 09:57 dromaeo
drwxr-xr-x 2 root root 4096 Oct 4 2012 kraken
lrwxrwxrwx 1 root root 19 Apr 24 12:04 mobile_tp4 -> ../../../mobile_tp4
-rw-r--r-- 1 root root 4281 Jun 29 2012 quit.js
drwxr-xr-x 2 root root 4096 Jun 25 13:44 scroll
drwxr-xr-x 3 root root 4096 Dec 1 2011 svg
drwxr-xr-x 2 root root 4096 Dec 1 2011 svg_opacity
drwxr-xr-x 3 root root 4096 Jul 19 2013 svgx
drwxr-xr-x 4 root root 4096 Jun 25 13:44 tart
lrwxrwxrwx 1 root root 12 Apr 24 12:04 tp4 -> ../../../tp4
-rw-r--r-- 1 root root 1781 Dec 1 2011 tp4m.manifest
drwxr-xr-x 2 root root 4096 Oct 4 2012 v8_7
but we want to see the contents of the mobile_tp4 directory.
Comment 15•10 years ago
|
||
Might unzip not be handling the symlinks correctly?
If I've understood the paths in previous comments correctly, the actual pageset is at:
/data/releng/src/talos-remote/www/talos-repo/talos/page_load_test/../../../mobile_tp4
ie:
/data/releng/src/talos-remote/www/mobile_tp4
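The path arithmetic above can be confirmed on the server rather than by hand. A minimal sketch, assuming GNU readlink is available; the helper name resolve_pageset is hypothetical.

```shell
# Hypothetical helper: print the real directory behind a (possibly symlinked) path.
resolve_pageset() {
  # readlink -f follows every symlink and normalises ".." components.
  readlink -f "$1"
}

# Usage (path from the comments above):
# resolve_pageset /data/releng/src/talos-remote/www/talos-repo/talos/page_load_test/mobile_tp4
```

If the printed path differs from the directory the zip was extracted into, the extraction went to the wrong place.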
Assignee | ||
Comment 16•10 years ago
|
||
Oh! I just extracted the content of zip file in /data/releng/src/talos-remote/www/mobile_tp4
Comment 17•10 years ago
|
||
By "I just extracted" do you mean in previous comments you had extracted the zip there, or you've just re-extracted now to this location? :-)
Reporter | ||
Comment 18•10 years ago
|
||
we also want to make sure that the unzipping puts the data in mobile_tp4, not mobile_tp4/mobile_tp4.
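A minimal sketch of a check for the doubled-directory case, assuming only a POSIX shell. The helper name check_not_nested is hypothetical.

```shell
# Hypothetical helper: fail if DIR contains a subdirectory with its own name,
# e.g. mobile_tp4/mobile_tp4, which would indicate a doubled extraction.
check_not_nested() {
  dir=${1%/}                 # drop a trailing slash, if any
  name=$(basename "$dir")
  if [ -d "$dir/$name" ]; then
    echo "NESTED: $dir/$name exists" >&2
    return 1
  fi
  echo "OK: $dir is not nested"
}

# Usage (path from the comments above):
# check_not_nested /data/releng/src/talos-remote/www/mobile_tp4
```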
Assignee | ||
Comment 19•10 years ago
|
||
"just extracted" means I have re-extracted it now to this location (sorry for the ambiguity).
Data is now under /data/releng/src/talos-remote/www/mobile_tp4 (not mobile_tp4/mobile_tp4)
I don't think this re-extraction will solve the problem, though, since the check in comment 7 was performed in /data/releng/src/talos-remote/www/talos-repo/talos/page_load_test/mobile_tp4.
Comment 20•10 years ago
|
||
Thank you - I've retriggered the jobs again, but as you say, if the shasum worked fine then that was unlikely to be the issue.
Failing that I guess this leaves:
(In reply to Ed Morley [:edmorley] from comment #9)
> 2) That the pandas are in fact using this server/pageset for the
> remote-tp4m_nochrome run
> 3) That there's no caching going on
Comment 21•10 years ago
|
||
https://tbpl.mozilla.org/php/getParsedLog.php?id=46259039&full=1&branch=try
06:52:07 INFO - 08-19 06:50:57.085 I/Gecko ( 2199): FATAL ERR_R: Non-local network connections are disabled and a connection attempt to g-ecx.images-amazon.com (54.239.132.83) was made.
I guess we could do something radical, like rename the pageset directory for a few mins and see if we get any failures? That would rule out us modifying the wrong location?
Comment 23•10 years ago
|
||
15:17 <simone|buildduty> edmorley|sheriffduty: I am ready to try the "rename the pageset directory for a few minutes" idea, if you want
15:21 <edmorley|sheriffduty> simone|buildduty: sure let's give it a go :-)
15:28 <simone|buildduty> edmorley|sheriffduty: page_load_test/mobile_tp4 renamed to m_tp4_
These jobs completed successfully after that:
https://tbpl.mozilla.org/php/getParsedLog.php?id=46262984&tree=Try
https://tbpl.mozilla.org/php/getParsedLog.php?id=46264271&tree=Try
https://tbpl.mozilla.org/php/getParsedLog.php?id=46264306&tree=Try
So the panda talos jobs aren't using the symlink at /data/releng/src/talos-remote/www/talos-repo/talos/page_load_test/mobile_tp4, when they refer to URLs such as:
http://talos-remote.pvt.build.mozilla.org/page_load_test/mobile_tp4/amazon.com/www.amazon.com/index.html
Comment 24•10 years ago
|
||
(In reply to Ed Morley [:edmorley] from comment #23)
> 15:17 <simone|buildduty> edmorley|sheriffduty: I am ready to try the "rename
> the pageset directory for a few minutes" idea, if you want
> 15:21 <edmorley|sheriffduty> simone|buildduty: sure let's give it a go :-)
> 15:28 <simone|buildduty> edmorley|sheriffduty: page_load_test/mobile_tp4
> renamed to m_tp4_
>
> These jobs completed successfully after that:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=46262984&tree=Try
This one at least seems to have failed...
07:44:37 INFO - INFO : RSS: Main: 149086208
07:44:37 INFO - Cycle 1(1): loaded http://talos-remote.pvt.build.mozilla.org/page_load_test/mobile_tp4/news.google.com/news.google.com/index.html (next: http://talos-remote.pvt.build.mozilla.org/page_load_test/mobile_tp4/m.news.google.com/news.google.com/index.html)
07:44:37 INFO - RSS: Main: 167141376
07:44:37 INFO - Cycle 1(1): loaded http://talos-remote.pvt.build.mozilla.org/page_load_test/mobile_tp4/m.news.google.com/news.google.com/index.html (next: http://talos-remote.pvt.build.mozilla.org/page_load_test/mobile_tp4/amazon.com/www.amazon.com/index.html)
07:44:37 INFO - RSS: Main: 165490688
07:44:37 INFO - __startBeforeLaunchTimestamp1408459391330__endBeforeLaunchTimestamp
07:44:37 INFO - __startAfterTerminationTimestamp1408459417181__endAfterTerminationTimestamp
07:44:37 INFO - Failed tp4m:
07:44:37 INFO - Stopped Tue, 19 Aug 2014 07:44:37
07:44:37 ERROR - Traceback (most recent call last):
> So the panda talos jobs aren't using the symlink at
> /data/releng/src/talos-remote/www/talos-repo/talos/page_load_test/mobile_tp4,
> when they refer to URLs such as:
> http://talos-remote.pvt.build.mozilla.org/page_load_test/mobile_tp4/amazon.com/www.amazon.com/index.html
Which makes me think this is false
Comment 25•10 years ago
|
||
(In reply to Justin Wood (:Callek) from comment #24)
> (In reply to Ed Morley [:edmorley] from comment #23)
> > 15:17 <simone|buildduty> edmorley|sheriffduty: I am ready to try the "rename
> > the pageset directory for a few minutes" idea, if you want
> > 15:21 <edmorley|sheriffduty> simone|buildduty: sure let's give it a go :-)
> > 15:28 <simone|buildduty> edmorley|sheriffduty: page_load_test/mobile_tp4
> > renamed to m_tp4_
> >
> > These jobs completed successfully after that:
> > https://tbpl.mozilla.org/php/getParsedLog.php?id=46262984&tree=Try
>
> This one at least seems to have failed...
For a non-local connection (this is a try run of the bug 1051993 patch), see the full log:
07:44:37 INFO - 08-19 14:43:29.697 I/GeckoDump( 2192): Cycle 1(1): loaded http://talos-remote.pvt.build.mozilla.org/page_load_test/mobile_tp4/m.news.google.com/news.google.com/index.html (next: http://talos-remote.pvt.build.mozilla.org/page_load_test/mobile_tp4/amazon.com/www.amazon.com/index.html)
07:44:37 INFO - 08-19 14:43:29.697 I/GeckoDump( 2192):
07:44:37 INFO - 08-19 14:43:30.869 I/GeckoDump( 2192): RSS: Main: 165490688
07:44:37 INFO - 08-19 14:43:30.869 I/GeckoDump( 2192):
07:44:37 INFO - 08-19 14:43:30.986 D/GeckoSuggestedSites( 2192): Number of suggested sites: 4
07:44:37 INFO - 08-19 14:43:31.267 D/GeckoSuggestedSites( 2192): Number of suggested sites: 4
07:44:37 INFO - 08-19 14:43:31.283 D/GeckoSuggestedSites( 2192): Number of suggested sites: 4
07:44:37 INFO - 08-19 14:43:31.439 V/GeckoFavicons( 2192): Cancelling favicon load 36.
07:44:37 INFO - 08-19 14:43:31.712 I/SUTAgentAndroid( 1889): 10.26.128.20 : activity
07:44:37 INFO - 08-19 14:43:32.048 I/Gecko ( 2192): FATAL ERR_R: Non-local network connections are disabled and a connection attempt to g-ecx.images-amazon.com (54.230.119.189) was made.
So this is still true:
> > So the panda talos jobs aren't using the symlink at
> > /data/releng/src/talos-remote/www/talos-repo/talos/page_load_test/mobile_tp4,
> > when they refer to URLs such as:
> > http://talos-remote.pvt.build.mozilla.org/page_load_test/mobile_tp4/amazon.com/www.amazon.com/index.html
Comment 26•10 years ago
|
||
If we were using this pageset, renaming it should result in 404s, not successfully loading the page & hitting the external network.
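The reasoning here can be written down as a small probe: request a pageset URL while the directory is renamed and look at the HTTP status. This is only a sketch. The helper name interpret_rename_probe and its output labels are hypothetical, and the usage lines assume curl is available on a host that can reach the server.

```shell
# Hypothetical helper: interpret the HTTP status seen for a pageset URL
# while the suspected directory is renamed.
interpret_rename_probe() {
  case "$1" in
    404) echo "served-from-renamed-dir" ;;   # the renamed copy was the live one
    200) echo "served-from-elsewhere" ;;     # some other copy is still being served
    *)   echo "inconclusive" ;;
  esac
}

# Usage:
# url=http://talos-remote.pvt.build.mozilla.org/page_load_test/mobile_tp4/amazon.com/www.amazon.com/index.html
# status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
# interpret_rename_probe "$status"
```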
Comment 27•10 years ago
|
||
BAH!!!!
So, it looks like we effectively have broken docs/made a mistake. While we did update the location and the zip file, we *did not* deploy it to the webheads, only to the admin node.
(I had to run ./update at /data/releng/src/talos-remote)
I'm going to get a task on myself to update our docs *today*.
That said, this bug should now be fixed, see before/after:
[jwood@foopy72.p5.releng.scl3.mozilla.com ~]$ curl http://talos-remote.pvt.build.mozilla.org/page_load_test/mobile_tp4/amazon.com/www.amazon.com/index.html 2>/dev/null | shasum
95e7711e074e22c4df1bae8fd0fd23fdb6a2f3b3 -
[jwood@foopy72.p5.releng.scl3.mozilla.com ~]$ curl http://talos-remote.pvt.build.mozilla.org/page_load_test/mobile_tp4/amazon.com/www.amazon.com/index.html 2>/dev/null | shasum
9b132c56328821a1ab7d1c5d48d769328061f66a -
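Since the stale content lived on the webheads rather than the admin node, the before/after check above could be repeated against each head. A minimal sketch under stated assumptions: the names fetch and check_heads are hypothetical, the head hostnames in the usage line are placeholders (the real ones are not given in this bug), and it is assumed that each head answers plain HTTP under its own hostname.

```shell
# Fetch PATH from one webhead. Assumption: each head answers plain HTTP under
# its own hostname; override this function if the setup differs.
fetch() {
  curl -fsS "http://$1/$2"
}

# Hypothetical helper: compare the SHA-1 of PATH as served by every head
# against EXPECTED, and fail if any head serves different content.
check_heads() {
  expected=$1
  path=$2
  shift 2
  rc=0
  for head in "$@"; do
    actual=$(fetch "$head" "$path" | sha1sum | cut -d' ' -f1)
    if [ "$actual" = "$expected" ]; then
      echo "OK $head"
    else
      echo "STALE $head ($actual)"
      rc=1
    fi
  done
  return $rc
}

# Usage (placeholder hostnames):
# check_heads 9b132c56328821a1ab7d1c5d48d769328061f66a \
#   page_load_test/mobile_tp4/amazon.com/www.amazon.com/index.html \
#   head1.example head2.example head3.example
```

Checking the served bytes rather than the files on the admin node is what would have caught the problem after comment 7.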
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Flags: needinfo?(bugspam.Callek)
Resolution: --- → FIXED
Comment 28•10 years ago
|
||
Thank you - has worked :-)
(green runs on https://tbpl.mozilla.org/?tree=Try&rev=863127ae067d)
Status: RESOLVED → VERIFIED
Comment 29•10 years ago
|
||
(In reply to Justin Wood (:Callek) from comment #27)
> BAH!!!!
>
> I'm going to get a task on myself to update our docs *today*.
Untested doc:
https://wiki.mozilla.org/index.php?title=ReleaseEngineering%2FBuildduty%2FOther_Duties&action=historysubmit&diff=1007067&oldid=1000544
Updated•7 years ago
|
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
|
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard