Closed Bug 1402966 Opened 7 years ago Closed 7 years ago

pingsender no longer works on Tier3 platforms

Categories

(Toolkit :: Telemetry, defect, P1)

Unspecified
FreeBSD
defect

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox-esr52 --- unaffected
firefox55 --- unaffected
firefox56 --- unaffected
firefox57 --- fixed
firefox58 --- fixed

People

(Reporter: jbeich, Assigned: gsvelto)

References

Details

(Keywords: regression)

Attachments

(2 files, 1 obsolete file)

$ strings obj-x86_64-unknown-freebsd12.0/dist/bin/pingsender | fgrep libcurl
$ truss obj-x86_64-unknown-freebsd12.0/dist/bin/pingsender http://localhost:36223/lookup_fail obj-x86_64-unknown-freebsd12.0/temp/xpc-profile-dWgBm2/saved-telemetry-pings/0557846b-9ac1-434c-9a2f-b9ef54e7df60 >after.log 2>&1 
$ diff before.log after.log
--- before
+++ after
[...]
-access("/lib/libcurl.so",F_OK)			 ERR#2 'No such file or directory'
-access("/usr/lib/libcurl.so",F_OK)		 ERR#2 'No such file or directory'
-access("/usr/lib/compat/libcurl.so",F_OK)	 ERR#2 'No such file or directory'
-access("/usr/local/lib/libcurl.so",F_OK)	 = 0 (0x0)
-openat(AT_FDCWD,"/usr/local/lib/libcurl.so",O_RDONLY|O_CLOEXEC|O_VERIFY,00) = 3 (0x3)
-fstat(3,{ mode=-rwxr-xr-x ,inode=136024,size=528140,blksize=131072 }) = 0 (0x0)
-mmap(0x0,4096,PROT_READ,MAP_PRIVATE|MAP_PREFAULT_READ,3,0x0) = 34365083648 (0x800519000)
-mmap(0x0,2572288,PROT_NONE,MAP_GUARD,-1,0x0)	 = 34391019520 (0x801dd5000)
-mmap(0x801dd5000,466944,PROT_READ|PROT_EXEC,MAP_PRIVATE|MAP_FIXED|MAP_NOCORE|MAP_PREFAULT_READ,3,0x0) = 34391019520 (0x801dd5000)
-mmap(0x802046000,12288,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_FIXED|MAP_PREFAULT_READ,3,0x71000) = 34393579520 (0x802046000)
-munmap(0x800519000,4096)			 = 0 (0x0)
-close(3)					 = 0 (0x0)
[...]
-poll(0x0,0,16)					 = 0 (0x0)
-socket(PF_INET,SOCK_STREAM,IPPROTO_TCP)		 = 4 (0x4)
-setsockopt(4,IPPROTO_TCP,TCP_NODELAY,0x7fffffffc8a0,4) = 0 (0x0)
-setsockopt(4,SOL_SOCKET,SO_NOSIGPIPE,0x7fffffffc8a0,4) = 0 (0x0)
-fcntl(4,F_GETFL,)				 = 2 (0x2)
-fcntl(4,F_SETFL,O_RDWR|O_NONBLOCK)		 = 0 (0x0)
-connect(4,{ AF_INET 127.0.0.1:36223 },16)	 ERR#36 'Operation now in progress'
-poll({ 4/POLLOUT },1,0)				 = 1 (0x1)
-getsockopt(4,SOL_SOCKET,SO_ERROR,0x7fffffffcab0,0x7fffffffca90) = 0 (0x0)
-getpeername(4,{ AF_INET 127.0.0.1:36223 },0x7fffffffc8dc) = 0 (0x0)
-getsockname(4,{ AF_INET 127.0.0.1:10643 },0x7fffffffc8dc) = 0 (0x0)
-poll(0x0,0,32)					 = 0 (0x0)
-mmap(0x0,28672,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 34365947904 (0x8005ec000)
-sendto(4,"POST /lookup_fail HTTP/1.1\r\nHo"...,404,MSG_NOSIGNAL,NULL,0) = 404 (0x194)
-poll({ 4/POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND },1,0) = 0 (0x0)
-poll({ 4/POLLIN },1,165)			 = 1 (0x1)
-poll({ 4/POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND },1,0) = 1 (0x1)
-recvfrom(4,"HTTP/1.1 404 Not Found\r\nconnec"...,16384,0,NULL,0x0) = 119 (0x77)
-close(4)					 = 0 (0x0)
[...]
This is a regression from mozilla-central changeset ebd3c3b2d644. Can you block bug 1401339?
Flags: needinfo?(gsvelto)
$ pkg info -x curl
curl-7.55.1

$ pkg info -l curl | fgrep lib/
        /usr/local/lib/libcurl.a
        /usr/local/lib/libcurl.so
        /usr/local/lib/libcurl.so.4
        /usr/local/lib/libcurl.so.4.4.0
Can someone shed light on why libcurlPath is required? I can't imagine GNU or BSD systems not having libcurl on the default search path.
$ ldconfig -r | fgrep search
        search directories: /lib:/usr/lib:/usr/lib/compat:/usr/local/lib:/usr/local/llvm40/lib

$ ldconfig -r | fgrep curl
        142:-lcurl.4 => /usr/local/lib/libcurl.so.4

$ LD_PRELOAD=libcurl.so truss /usr/bin/true 2>&1 | fgrep curl
access("/lib/libcurl.so",F_OK)                   ERR#2 'No such file or directory'
access("/usr/lib/libcurl.so",F_OK)               ERR#2 'No such file or directory'
access("/usr/lib/compat/libcurl.so",F_OK)        ERR#2 'No such file or directory'
access("/usr/local/lib/libcurl.so",F_OK)         = 0 (0x0)
openat(AT_FDCWD,"/usr/local/lib/libcurl.so",O_RDONLY|O_CLOEXEC|O_VERIFY,00) = 3 (0x3)
(In reply to Jan Beich from comment #3)
> Can someone shed light on why libcurlPath is required? I can't imagine GNU
> or BSD systems not having libcurl on the default search path.

We don't want to use the default search path because it is not always sane, but I'm glad to hardcode whatever other paths might be needed to find libcurl into the pingsender. Note that this will be a temporary fix so we don't need it to be perfect, that's because pingsender will be turned into Rust soon enough and then it won't need libcurl anymore.
Flags: needinfo?(gsvelto)
Attached patch [PATCH] WIP (obsolete) — Splinter Review
Here's a quick fix, would this be enough? I've got a NetBSD box here to test it but not a FreeBSD one.
Assignee: nobody → gsvelto
Status: NEW → ASSIGNED
Attachment #8912037 - Flags: feedback?(jbeich)
Comment on attachment 8912037 [details] [diff] [review]
[PATCH] WIP

Not enough, libcurlNames is still empty.
Attachment #8912037 - Flags: feedback?(jbeich) → feedback-
Comment on attachment 8912037 [details] [diff] [review]
[PATCH] WIP

Review of attachment 8912037 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/pingsender/pingsender_unix_common.cpp
@@ +86,5 @@
>      "/usr/lib32",
>      "/usr/lib64",
>      "/usr/lib/i386-linux-gnu", // Debian 32-bit x86
>      "/usr/lib/x86_64-linux-gnu", // Debian 64-bit x86
> +#elif defined(XP_UNIX) && !defined(XP_MACOSX) // Various BSDs

XP_UNIX check is redundant. The file is already limited to Unix by moz.build.
Good point, I have to adjust both preprocessor directives.
Attached patch [PATCH v2] WIPSplinter Review
Hopefully this should be better.
Attachment #8912037 - Attachment is obsolete: true
Attachment #8912042 - Flags: feedback?(jbeich)
Comment on attachment 8912042 [details] [diff] [review]
[PATCH v2] WIP

Thanks. Works fine on FreeBSD for the common case where LOCALBASE == /usr/local.

$ truss obj-x86_64-unknown-freebsd12.0/dist/bin/pingsender http://localhost:15008/lookup_fail obj-x86_64-unknown-freebsd12.0/temp/xpc-profile-5pKdOX/saved-telemetry-pings/8f6e978e-72d0-4df0-9fcd-9877d36d8f71 2>&1 | fgrep curl
openat(AT_FDCWD,"/usr/lib/libcurl.so",O_RDONLY|O_CLOEXEC|O_VERIFY,00) ERR#2 'No such file or directory'
openat(AT_FDCWD,"/usr/lib/libcurl.so.4",O_RDONLY|O_CLOEXEC|O_VERIFY,00) ERR#2 'No such file or directory'
openat(AT_FDCWD,"/usr/lib/libcurl-gnutls.so",O_RDONLY|O_CLOEXEC|O_VERIFY,00) ERR#2 'No such file or directory'
openat(AT_FDCWD,"/usr/lib/libcurl-gnutls.so.4",O_RDONLY|O_CLOEXEC|O_VERIFY,00) ERR#2 'No such file or directory'
openat(AT_FDCWD,"/usr/lib/libcurl.so.3",O_RDONLY|O_CLOEXEC|O_VERIFY,00) ERR#2 'No such file or directory'
openat(AT_FDCWD,"/usr/lib/libcurl-gnutls.so.3",O_RDONLY|O_CLOEXEC|O_VERIFY,00) ERR#2 'No such file or directory'
openat(AT_FDCWD,"/usr/local/lib/libcurl.so",O_RDONLY|O_CLOEXEC|O_VERIFY,00) = 3 (0x3)
Attachment #8912042 - Flags: feedback?(jbeich) → feedback+
Wait, why this madness ? if i get it right, curl is now a hidden runtime dependency ?

And why pingsender is using dlopen() directly, instead of relying on nspr's wrapper (which is fixed for us... see the patch that was never accepted in https://bugzilla.mozilla.org/show_bug.cgi?id=650772).

On OpenBSD, right now in -current we have /usr/local/lib/libcurl.so.25.11 and you have zero chance of knowning in advance what will/could be the minor/major. So yeah, conceptually dlopen(/usr/local/lib/libcurl.so) *should* work, but that's still ugly.

Oh well.
(In reply to Landry Breuil (:gaston) from comment #12)
> Wait, why this madness ?

Because there are other platforms that don't have sane defaults.

> if i get it right, curl is now a hidden runtime
> dependency ?

It isn't a dependency hence the dlopen(). The pingsender is a "best effort" tool that is used to send telemetry pings when the browser crashes or when it's already shutting down. If it fails to send a ping then it will be sent the next time the browser starts since it has already been persisted to disk. In other words, it doesn't need to be perfect, or even particularly good for that matter.

> And why pingsender is using dlopen() directly, instead of relying on nspr's
> wrapper (which is fixed for us... 

pingsender is a minimalistic tool and I don't want it to grow any larger especially because it will be entirely replaced within a year with a full Rust implementation that won't need external dependencies.

> see the patch that was never accepted in
> https://bugzilla.mozilla.org/show_bug.cgi?id=650772).
> 
> On OpenBSD, right now in -current we have /usr/local/lib/libcurl.so.25.11
> and you have zero chance of knowning in advance what will/could be the
> minor/major. So yeah, conceptually dlopen(/usr/local/lib/libcurl.so)
> *should* work, but that's still ugly.

I know it's ugly, I never intended it to be pretty, it just needs to work most of the time and if it doesn't it doesn't really matter. It's a stopgap solution that helps us receive telemetry faster:

https://www.a2p.it/wordpress/tech-stuff/mozilla/firefox-data-faster-shutdown-pingsender/
Comment on attachment 8912140 [details]
Bug 1402966 - Search for libcurl in more paths to support various *BSDs;

https://reviewboard.mozilla.org/r/183526/#review188678
Attachment #8912140 - Flags: review?(alessio.placitelli) → review+
Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/4741b93cfdf2
Search for libcurl in more paths to support various *BSDs; r=Dexter
Priority: -- → P1
https://hg.mozilla.org/mozilla-central/rev/4741b93cfdf2
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Comment on attachment 8912140 [details]
Bug 1402966 - Search for libcurl in more paths to support various *BSDs;

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1401339
[User impact if declined]: No user-visible impact though pingsender won't work on the various *BSD platforms, this will make telemetry slower
[Is this code covered by automated tests?]: Yes
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: No
[List of other uplifts needed for the feature/fix]: We need to uplift this because we uplifted bug 1401339 already
[Is the change risky?]: No
[Why is the change risky/not risky?]: It only adds the necessary library paths so that pingsender will work correctly on *BSD platforms
[String changes made/needed]: None
Attachment #8912140 - Flags: approval-mozilla-beta?
Comment on attachment 8912140 [details]
Bug 1402966 - Search for libcurl in more paths to support various *BSDs;

Help our bsd friends, taking it.
Should be in 57b4
Attachment #8912140 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to Gabriele Svelto [:gsvelto] from comment #18)
> [Is this code covered by automated tests?]: Yes
> [Has the fix been verified in Nightly?]: Yes
> [Needs manual test from QE? If yes, steps to reproduce]: No

Setting qe-verify- based on Gabriele's assessment on manual testing needs and the fact that this fix has automated coverage.
Flags: qe-verify-
You need to log in before you can comment on or make changes to this bug.