Closed Bug 1040280 Opened 10 years ago Closed 8 years ago

Create and use getaddr2 to get TTL on Firefox OS.

Categories

(Core :: Networking: DNS, defect)

x86
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: johnsullivan.pem, Unassigned, Mentored)

References

Details

Attachments

(2 files, 2 obsolete files)

      No description provided.
Blocks: 151929
Attachment #8458195 - Flags: review?(sworkman)
Attachment #8458196 - Flags: review?(sworkman)
Assignee: nobody → josullivan
Comment on attachment 8458195 [details] [diff] [review]
Patch to gecko to use getaddrinfo2 if available.

Review of attachment 8458195 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/dns/GetAddrInfo.cpp
@@ +22,5 @@
>  #define LOG(args)
>  #endif
>  
> +#if GETADDRINFO_PLATFORM_ID == 2 // FirefoxOS
> +#include "netdb.h"

Looks like this is already included in GetAddrInfo.h.

@@ +58,5 @@
> +
> +  addrinfo2* result;
> +  int rv = getaddrinfo2(aHost, NULL, &hints, &result);
> +  if (rv != 0) {
> +    return NS_ERROR_FAILURE;

Log this failure return code, and NS_WARN_IF.

@@ +76,5 @@
> +      addr.inet.ip = ((sockaddr_in *) cur->ai_addr)->sin_addr.s_addr;
> +
> +      // We don't have anything meaningful to use as the port so we just use 0.
> +      // This is consistent with the PR platform.
> +      addr.inet.port = 0;

Is there enough similarity here among the three functions to have a NetAddr constructor or Create function for inet and inet6?

@@ +87,5 @@
> +      addr.inet6.family = AF_INET6;
> +      memcpy(&addr.inet6.ip,
> +             &((sockaddr_in6 *) cur->ai_addr)->sin6_addr.s6_addr,
> +             sizeof(addr.inet6.ip.u8));
> +      addr.inet.port = 0;

inet6 (same as Windows code ;) ).

@@ +283,5 @@
>  }
> +#else
> +// Slight future proofing in case anyone forgets to conditionally include their
> +// code.
> +#error No implementation for platform.

I think we can skip the #define for the fallback/default, i.e. PR_GetAddrInfoByName - let's just always use it if DnsQuery or getaddrinfo2 are not going to be used.

::: netwerk/dns/GetAddrInfo.h
@@ +7,5 @@
>  #ifndef netwerk_dns_GetAddrInfo_h
>  #define netwerk_dns_GetAddrInfo_h
>  
> +#include "nsError.h"
> + 

Whitespace.

@@ +9,5 @@
>  
> +#include "nsError.h"
> + 
> +// Necessary to include here to check if getaddrinfo2 is available
> +#include "netdb.h"

For now, we know that getaddrinfo2 is FxOS only, so we can restrict this inclusion to #ifdef ANDROID, and then do the #ifdef GETADDRINFO2_AVAILABLE.

@@ +12,5 @@
> +// Necessary to include here to check if getaddrinfo2 is available
> +#include "netdb.h"
> +
> +#if defined GETADDRINFO2_AVAILABLE
> +#define GETADDRINFO_PLATFORM_ID 2 // FirefoxOS

Rather than defining platform IDs here, let's call these USE_GETADDRINFO2, USE_DNS_QUERY and USE_PR_GETADDRINFOBYNAME. Possibly we can rename the functions as well - GetAddrInfo_getaddrinfo2, GetAddrInfo_DnsQuery.
Comment on attachment 8458196 [details] [diff] [review]
Patch to bionic's libc to add getaddrinfo2

Review of attachment 8458196 [details] [diff] [review]:
-----------------------------------------------------------------

Changes look good. We just need to figure out how to package them...

::: libc/include/netdb.h
@@ +132,5 @@
> +	struct	addrinfo2 *ai_next;	/* next structure in linked list */
> +	/* the above fields should match adrrinfo's fields */
> +
> +	int ai2_ttl; /* always positive */
> +};

Can you #define the first part of this that matches addrinfo to ensure that they're both the same?

s/int ai2_ttl/unsigned int ttl/

::: libc/netbsd/net/getaddrinfo.c
@@ +318,5 @@
>  
>  void
> +freeaddrinfo2(struct addrinfo2 *ai)
> +{
> +	freeaddrinfo((struct addrinfo *) ai);

Tabs to spaces. Check the whole file.

Add a comment to say that we're relying on C free here (or whatever is appropriate).

@@ +596,5 @@
> +	}
> +
> +	// The downcast is safe here because res is only used as an output param
> +	return getaddrinfo2(hostname, servname, hints2, (struct addrinfo2 **) res);
> +}

Tabs to spaces...

@@ +1348,5 @@
>  	char *canonname;
>  	const HEADER *hp;
>  	const u_char *cp;
>  	int n;
> +	int ttl;

Yay!

@@ +1501,4 @@
>  			/* don't overwrite pai */
>  			ai = *pai;
>  			ai.ai_family = (type == T_A) ? AF_INET : AF_INET6;
> +			ai.ai2_ttl = ttl;

Yay!

@@ +1868,4 @@
>  
>  /*ARGSUSED*/
>  static void
> +_rfc6724_sort(struct addrinfo2 *list_sentinel)

Maybe we can copy this into the Windows code for A/AAAA merging?
Attachment #8458196 - Flags: review?(sworkman) → review+
https://github.com/mozilla-b2g/platform_bionic - I think we might be able to use this repo. Check it out and see if it's right.
It's about 6.5k commits behind upstream, but we could update it. Once we decide on a repo we'll have to update what the submodule in B2G is pointing at though.
Yeah, I took a deeper look, and I'm not convinced it's the right one. Let's keep looking.
Hi Michael! Can you give us some advice on how we could/should patch bionic? John is hacking on bionic's getaddrinfo implementation, adding a getaddrinfo2 that will provide DNS TTL for Necko's DNS cache. It looks like we fetch from codeaurora.org: Can we fork this? Should we commit upstream? Basically we're not sure what way to proceed here. Any advice?
Flags: needinfo?(mwu)
You don't really need to hack bionic if you're going to be working on a custom getaddrinfo. At that point, you might as well fork everything and build our own getaddrinfo on B2G and Android.

We actually already have our own resolver fork in other-licenses/android. This is used on Android and used to be used on B2G.

I think Android actually wants to provide a dns cache, but I don't know if they've actually added one in any version. In the cases where we know there's a dns cache, can we get away with a ttl of 30 and just rely on the OS? In newer versions of bionic, dns calls actually get passed through to a daemon that does the actual work, which could do caching in theory.
Flags: needinfo?(mwu)
Michael, thanks for the response. Forking seems like the smart approach.

For FxOS, forking and having our own getaddrinfo2 seems like a good idea.

For Android, it could be beneficial to share the DNS cache daemon with other apps, but it would be nice to be consistent across platforms with Necko's DNS cache. We're working elsewhere to refresh the cache at network change events (e.g. switching to and from VPN), and we might lose a bit of control with the DNS cache if we were to rely on the OS cache too much. I'm not sure how much work it would be, but version detection might muddy things too. But it would be nice to share cache hostname resolutions with other Android apps ... 

John, can you do a quick review of Android's OS cache implementation with respect to the size of the cache, as well as how (or if) it deals with ttl and expiration, please? I'm definitely leaning towards forking and getting ttl in Necko's cache, but it would be good to those details about the OS cache as well.
> John, can you do a quick review of Android's OS cache implementation with respect to
> the size of the cache, as well as how (or if) it deals with ttl and expiration, please?
> I'm definitely leaning towards forking and getting ttl in Necko's cache, but it would
> be good to those details about the OS cache as well.

Definitely, that sounds like a good idea. I'll take a look.
I took a look, here's what I found:

Overview of Android's Cache Implementation
------------------------------------------

Bionic takes care of allocating and maintaining the cache. When you call `getaddrinfo`, bionic will send a command to the netd daemon who will then either serve you the response from its cache or make the DNS query on your behalf [1].

TTL and Expiration
------------------

Bionic will pull out the *smallest* TTL from the answer [2], and use that to properly set the expiration time of the cache entry [3]. If the answer was negative (no entries found), it will properly set the expiration according to RFC-2308 [4][5]. If a parsing error occurs, the answer just isn't cached [2].

Size of the Cache
-----------------

The number of entries the cache will store is 640 [6]. This is not configurable [7]. If the cache is full, all expired entries will be removed, if it's still full the oldest record will be blindly removed to make room for the new one [8].

Extra Considerations
--------------------

It seems unlikely that we'd want to do this, but it's interesting to note that we can easily trick Bionic into thinking that we're the proxy (therefore keeping the cache in-process and not shipping off to the proxy) [9].

> [1] Relevant code: https://android.googlesource.com/platform/system/netd/+/master/DnsProxyListener.cpp#125
> [2] Relevant code: https://github.com/android/platform_bionic/blob/ab4fc82315567a1400bb25af3f835f1b5d80a0fe/libc/dns/resolv/res_cache.c#L1050
> [3] Relevant code: https://github.com/android/platform_bionic/blob/ab4fc82315567a1400bb25af3f835f1b5d80a0fe/libc/dns/resolv/res_cache.c#L1727
> [4] Relevant code: https://github.com/android/platform_bionic/blob/ab4fc82315567a1400bb25af3f835f1b5d80a0fe/libc/dns/resolv/res_cache.c#L1002
> [5] SO post about negative caching time: http://serverfault.com/a/426810
> [6] Relevant code: https://github.com/android/platform_bionic/blob/ab4fc82315567a1400bb25af3f835f1b5d80a0fe/libc/dns/resolv/res_cache.c#L138
> [7] Relevant code: https://github.com/android/platform_bionic/blob/ab4fc82315567a1400bb25af3f835f1b5d80a0fe/libc/dns/resolv/res_cache.c#L1370
> [8] Relevant code: https://github.com/android/platform_bionic/blob/ab4fc82315567a1400bb25af3f835f1b5d80a0fe/libc/dns/resolv/res_cache.c#L1708
> [9] Relevant code (set this to "local"): https://github.com/android/platform_bionic/blob/ab4fc82315567a1400bb25af3f835f1b5d80a0fe/libc/dns/net/getaddrinfo.c#L602
(In reply to josullivan from comment #12)
> I took a look, here's what I found:

Awesome research, John. Thanks loads. I did some reading in order to determine when these things were added and what versions they're in, in order to determine how many devices have them.

> Overview of Android's Cache Implementation
> ------------------------------------------
> 
> Bionic takes care of allocating and maintaining the cache. When you call
> `getaddrinfo`, bionic will send a command to the netd daemon who will then
> either serve you the response from its cache or make the DNS query on your
> behalf [1].

The system-wide cached was added on Jan 25th 2012, for v4.1.1. According to [10], that means approximately 25% of Android devices out there do not share the cache among all apps.

> TTL and Expiration
> ------------------
> 
> Bionic will pull out the *smallest* TTL from the answer [2], and use that to
> properly set the expiration time of the cache entry [3]. If the answer was
> negative (no entries found), it will properly set the expiration according
> to RFC-2308 [4][5]. If a parsing error occurs, the answer just isn't cached
> [2].

TTL support was added Feb 2nd 2011, for v4.0.1. According to [10], that means approx. 13.5% are using the estimated TTL.

> Size of the Cache
> -----------------
> 
> The number of entries the cache will store is 640 [6]. This is not
> configurable [7]. If the cache is full, all expired entries will be removed,
> if it's still full the oldest record will be blindly removed to make room
> for the new one [8].

This was increased at the same time as the system-wide cache was added. 640 refers to IPv4 entries, as far as I can tell, with a comment suggesting that IPv6 requires 2x as much space (I think that should be 4x, but in any case, it seems like a space limitation rather than number of elements). So, again, approx. 25% of the current market do not have this shared cache.

> Extra Considerations
> --------------------
> 
> It seems unlikely that we'd want to do this, but it's interesting to note
> that we can easily trick Bionic into thinking that we're the proxy
> (therefore keeping the cache in-process and not shipping off to the proxy)
> [9].

Yeah - I don't think we want to do this. And we'd have to set a system-wide env var. Whatever - we don't want to be responsible for all DNS caching, so let's not go there :)


Ok, so that is nice data; here are my thoughts about depending more on the Android OS cache (e.g. reducing Necko's DNS cache expiration to something very small so that we go to the OS cache far more often than not):

- 25% of the market would have a very small (64 IPv4 entries) cache.
- 25% would not benefit from the shared cache.
--> That 25% would be hitting the network far more often if we were to rely on the OS cache.

- 13% would not have ttl.

- BUT, 75% would have ttl in the OS cache and a decent-sized cache (although still smaller than Necko's) that's shared among all apps (or at least all those built with API level 16 and above).


I'm kind of averse to having code that adapts to different versions here. It can be harder to maintain, and without doing a major rewrite that TRULY depends on the newer shared cache, we'd have a hacky solution in Necko. I'd also like features like NetworkLinkService, which clears the cache on network changes, to be a bit more immune from behavioral variances bubbling up (although it has a load of platform dependencies itself, but a big chunk of common code nonetheless).

I'm also still hopeful that we can get one packaged version that lives in mozilla-central and is used in both Android and FxOS (and *maybe* MacOSX and Linux?). That would be very nice for maintenance.

Still, we can open a new bug as a reminder to revisit this for Android and FxOS in the future. Maybe by that stage getaddrinfo2 will be upstreamed? ;)

So, my preference is the forked version for now; revisit shared implementation later.

[10] Android Fragmentation (results taken from 7 days ending July 7th 2014): https://developer.android.com/about/dashboards/index.html
My last day at Mozilla is tomorrow and I was never able to get back to this. Bug 820391 and bug 1067679 took priority.

I was working on copying the pertinent parts of bionic's code into our codebase to ensure getaddrinfo2 was available to all Android-based platforms (including Firefox OS). This proved to be quite tricky though, and I ended up having to bring in loads and loads of files, as well as doing some visibility hacks here and there to access private parts of libc.

When someone picks this up again, they should find the first patch I will upload in a moment to be useful. This patch simply brings over the files in Bionic that Steve Workman pulled in before. *At least* these files will need to be included. These file also include the changes I made to add `getaddrinfo2()` (which are tested and seemingly functional). This patch will not build however.

The second patch should be much less useful. It has the horrible hackery I did to make it build, and it'll probably be easier for you to begin afresh. This patch will build, but may not run well.
Attachment #8458195 - Attachment is obsolete: true
Attachment #8458196 - Attachment is obsolete: true
Attachment #8458195 - Flags: review?(sworkman)
Assignee: josullivan → nobody
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: