Closed Bug 807678 Opened 12 years ago Closed 11 years ago

don't proliferate PRNetAddr type beyond receiving lookups from NSPR

Categories

(Core :: Networking: DNS, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla20

People

(Reporter: jaas, Assigned: jaas)

References

Details

Attachments

(1 file, 14 obsolete files)

Right now we use NSPR to do DNS queries and we propagate the resulting PRNetAddr type around our code base. In the future we are likely to not use NSPR for DNS queries. In anticipation of that we should stop using PRNetAddr and start using a custom address structure that can work easily with any DNS resolution library.
Attached patch fix v0.8 (obsolete) — Splinter Review
Posting here to save my work and possibly get some early feedback.

This compiles and mostly works on OS X, and probably Linux, including Android. I still need to get it working on Windows, there is an IPv6-related TODO documented, and there are some unit test failures I need to look into.
Attachment #683313 - Flags: feedback?(sworkman)
Attached patch fix v0.81 (obsolete) — Splinter Review
A number of compile and bug fixes. Still needs at least one more bug fixed, and it needs to be ported to Windows.
Attachment #683313 - Attachment is obsolete: true
Attachment #683313 - Flags: feedback?(sworkman)
Attachment #686534 - Flags: feedback?(sworkman)
Attached patch fix v0.82 (obsolete) — Splinter Review
More fixes.
Attachment #686534 - Attachment is obsolete: true
Attachment #686534 - Flags: feedback?(sworkman)
Attachment #686580 - Flags: feedback?(sworkman)
Attached patch fix v0.85 (obsolete) — Splinter Review
Fixes a major bug in record iteration that was causing test failures. This should pass try on Mac and Linux, and if so then all I need to do is port to Windows.
Attachment #686580 - Attachment is obsolete: true
Attachment #686580 - Flags: feedback?(sworkman)
Attachment #686646 - Flags: feedback?(sworkman)
Attached patch fix v0.86 (obsolete) — Splinter Review
Another bug fix.
Attachment #686646 - Attachment is obsolete: true
Attachment #686646 - Flags: feedback?(sworkman)
Attachment #687133 - Flags: feedback?(sworkman)
Tests are all green on OS X and Linux with this latest revision.

https://tbpl.mozilla.org/?tree=Try&rev=dc919f2f7c74
Comment on attachment 687133 [details] [diff] [review]
fix v0.86

Review of attachment 687133 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. I have a few minor comments inlined, mainly cleanup stuff. Can you look at the indentation as well? It seems tabs are being expanded to 4 spaces in a lot of places, and not two. Again, though, that's just cleanup.

::: netwerk/base/public/nsINetAddr.idl
@@ +22,5 @@
>      readonly attribute unsigned short family;
>  
>      /**
>       * @return Either the IP address (FAMILY_INET, FAMILY_INET6) or the path
> +     * (FAMILY_LOCAL) in string form.

Minor query: Any reason you removed "IP addresses are in the format produced by PR_NetAddrToString" and did just change it to "NetAddrToString"?

::: netwerk/base/src/nsNetAddr.cpp
@@ +48,2 @@
>      aAddress.SetCapacity(16);
> +    NetAddrToString(&mAddr, aAddress.BeginWriting(), 16);

Eventually, it would probably be good to use a macro or const for these max lengths. (Or maybe even an enum for the different address types, and let NetAddrToString decide how much to write?) But this isn't urgent, and it might be better for revision history to focus on refactoring in this patch instead of making it prettier.

::: netwerk/dns/DNS.h
@@ +36,5 @@
> +    ((a)->pr_s6_addr32[2] == 0) &&    \
> +    ((a)->pr_s6_addr32[3] == 0))
> +
> +namespace mozilla {
> +    namespace net {

This namespace block shouldn't be indented. Same for the code in namespace net itself.

@@ +94,5 @@
> +        // This class wraps a NetAddr union to provide C++ linked list
> +        // capabilities and other methods.
> +        class NetAddrElement : public LinkedListElement<NetAddrElement> {
> +        public:
> +            NetAddrElement(const PRNetAddr *prNetAddr);

Can you add to the class comment to explicitly say that NetAddrElement takes and converts a PRNetAddr to a NetAddr?

::: netwerk/dns/nsHostResolver.cpp
@@ +971,5 @@
>          uint32_t millis = static_cast<uint32_t>(elapsed.ToMilliseconds());
>  
>          // convert error code to nsresult.
>          nsresult status;
> +        AddrInfo *ai = nullptr;

Any reason why AddrInfo *ai can't be declared in the same scope as PRAddrInfo *prai?

::: netwerk/dns/nsHostResolver.h
@@ +69,5 @@
>       */
>      Mutex        addr_info_lock;
>      int          addr_info_gencnt; /* generation count of |addr_info| */
> +    mozilla::net::AddrInfo *addr_info;
> +    mozilla::net::NetAddr  *addr;

Can you typedef these at the top of the class, so they can just be referred to as AddrInfo and NetAddr?

@@ +85,2 @@
>      void   ResetBlacklist();
> +    void   ReportUnusable(mozilla::net::NetAddr *addr);

Nitpick: Since the prototype for Blacklisted is being changed anyway, can you change the indentation to match the subsequent prototypes?

::: netwerk/protocol/http/HttpBaseChannel.h
@@ +185,5 @@
>      nsHttpResponseHead * GetResponseHead() const { return mResponseHead; }
>      nsHttpRequestHead * GetRequestHead() { return &mRequestHead; }
>  
> +    const mozilla::net::NetAddr& GetSelfAddr() { return mSelfAddr; }
> +    const mozilla::net::NetAddr& GetPeerAddr() { return mPeerAddr; }

This is in mozilla::net already, right? So you should be ok writing NetAddr on its own.
Attachment #687133 - Flags: feedback?(sworkman) → feedback+
(In reply to Steve Workman [:sworkman] from comment #7)

> Looks good. I have a few minor comments inlined, mainly cleanup stuff. Can
> you look at the indentation as well? It seems tabs are being expanded to 4
> spaces in a lot of places, and not two. Again, though, that's just cleanup.

I found one place where my indentation did not match existing file style, and another place where someone else had inserted a bunch of tab characters. Aside from that there is just my new code, which uses four-space indentation. I actually prefer two-space (four is a waste, imho), but I don't know what new code in Necko should be using. Right now the module seems to be a mix.
Attached patch fix v0.87 (obsolete) — Splinter Review
Addresses Steve's comments. I used 2-space indentation for new code. Still working on the Windows port.
Attachment #687133 - Attachment is obsolete: true
Attached patch fix v0.88 (obsolete) — Splinter Review
More fixes. Not done with the Windows port yet.
Attachment #687710 - Attachment is obsolete: true
Attached patch fix v1.0 (obsolete) — Splinter Review
Added support for Windows plus various other fixes.
Attachment #688834 - Attachment is obsolete: true
Attachment #690856 - Flags: review?(sworkman)
Blocks: 820391
Attached patch fix v1.1 (obsolete) — Splinter Review
Fix for Windows test failures, some other minor improvements.

https://tbpl.mozilla.org/?tree=Try&rev=b1d12d89559b
Attachment #690856 - Attachment is obsolete: true
Attachment #690856 - Flags: review?(sworkman)
Attachment #691438 - Flags: review?(sworkman)
Comment on attachment 691438 [details] [diff] [review]
fix v1.1

Review of attachment 691438 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. r=me.

::: netwerk/base/src/nsNetAddr.cpp
@@ +34,3 @@
>      *aFamily = nsINetAddr::FAMILY_LOCAL;
>      break;
> +#endif

I assume the need for '#if' is to do with PR_AF_LOCAL having the conditional implicit in its definition?

::: netwerk/dns/DNS.cpp
@@ +116,5 @@
> +  }
> +#endif
> +  return false;
> +}
> +  

Trailing whitespace (cpearce has been strict with me on those - blame him for me nitpicking ;) ).

::: netwerk/dns/DNS.h
@@ +48,5 @@
> +
> +// Required buffer size for text form of an IP address.
> +// Includes space for null termination.
> +const int gIPv4CStrBufSize = 16;
> +const int gIPv6CStrBufSize = 46;

'k' prefix for constants (https://developer.mozilla.org/en-US/docs/Developer_Guide/Coding_Style?redirectlocale=en-US&redirectslug=Mozilla_Coding_Style_Guide#Operators)

::: netwerk/dns/nsHostResolver.cpp
@@ +833,5 @@
>              old_addr_info = rec->addr_info;
>              rec->addr_info = result;
>              rec->addr_info_gencnt++;
>          }
> +        delete old_addr_info;

You could use an nsAutoPtr here, within the braces, declared before MutexAutoLock. That way old_addr_info would be deleted automatically when it goes out of scope. But it's such a quick use anyway that it's not important. Up to you.

@@ +1069,3 @@
>  
>      return PL_DHASH_NEXT;
>  }

Nice reworking of the function.

::: netwerk/dns/nsHostResolver.h
@@ +70,5 @@
>       */
>      Mutex        addr_info_lock;
>      int          addr_info_gencnt; /* generation count of |addr_info| */
> +    mozilla::net::AddrInfo *addr_info;
> +    mozilla::net::NetAddr  *addr;

'typedef mozilla::net::AddrInfo AddrInfo;' at the top of the class definition; same for NetAddr. Then you only need to refer to AddrInfo, NetAddr in the rest of the class definition.
Attachment #691438 - Flags: review?(sworkman) → review+
(In reply to Steve Workman [:sworkman] from comment #14)
> Comment on attachment 691438 [details] [diff] [review]

> ::: netwerk/base/src/nsNetAddr.cpp
> @@ +34,3 @@
> >      *aFamily = nsINetAddr::FAMILY_LOCAL;
> >      break;
> > +#endif
> 
> I assume the need for '#if' is to do with PR_AF_LOCAL having the conditional
> implicit in its definition?

Not sure what you mean. The AF_LOCAL type only exists on Unix and OS/2, so its cases are always ifdef'd to those.

> You could use an nsAutoPtr here, within the braces, declared before
> MutexAutoLock. That way old_addr_info would be deleted automatically when it
> goes out of scope. But it's such a quick use anyway that it's not important.
> Up to you.

The case is so simple it doesn't seem worth it here, but a good idea in general.

> 'typedef mozilla::net::AddrInfo AddrInfo;' at the top of the class
> definition; same for NetAddr. Then you only need to refer to AddrInfo,
> NetAddr in the rest of the class definition.

I was doing that but for some reason it causes a symbol conflict on Windows and won't compile. I couldn't figure out why so I just when back to explicit namespacing. If you don't mind I'd like to leave it this way for now.
Attached patch fix v1.2 (obsolete) — Splinter Review
Fixes in response to Steve's comments. Thanks for the review Steve!
Attachment #691438 - Attachment is obsolete: true
Comment on attachment 691747 [details] [diff] [review]
fix v1.2

Patch needs Necko peer review, Patrick - can you either review this or delegate to Steve?
Attachment #691747 - Flags: review?(mcmanus)
Attachment #691747 - Flags: review?(mcmanus) → review?(sworkman)
Comment on attachment 691747 [details] [diff] [review]
fix v1.2

Steve already did the review, so I think we're good here. Thanks.
Attachment #691747 - Flags: review?(sworkman)
Depends on: 821685
Attached patch fix v1.3 (obsolete) — Splinter Review
Fixes a threading issue in debug-only code, and a bug in Windows IP->string code. The former was probably responsible for the tbox hangs.

https://tbpl.mozilla.org/?tree=Try&rev=08a7851f7c2a
Attachment #691747 - Attachment is obsolete: true
Comment on attachment 694016 [details] [diff] [review]
fix v1.3

Review of attachment 694016 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. Just confirm you need to lock the whole nsHostRecord in nsDNSService2.cpp, please.

::: netwerk/dns/DNS.cpp
@@ +35,5 @@
> +    s.sin6_family = AF_INET6;
> +    memcpy(&s.sin6_addr, src, sizeof(struct in_addr6));
> +    int result = getnameinfo((struct sockaddr *)&s, sizeof(struct sockaddr_in6),
> +                             dst, size, nullptr, 0, NI_NUMERICHOST);
> +    if (result == 0) {

This is the Windows IP->String bug you mentioned, right? Assuming handling the return code fixes it - looks good.

::: netwerk/dns/nsDNSService2.cpp
@@ +98,2 @@
>  
> +    MutexAutoLock lock(mHostRecord->addr_info_lock);

This looks fine - locking addr_info before the 'if' conditional is needed. But I'm wondering about the fact that we're essentially locking the whole nsHostRecord until the function completes. Do you still see the threading issue if the Unlocks stay where they are (and you call Lock directly). In other words, is the problem accessing addr_info outside a lock, or the rest of the nsHostRecord?
Attached patch fix v1.4 (obsolete) — Splinter Review
I didn't scope the lock for the debug code properly in the last patch. This fixes it and all tests pass.
Attachment #694016 - Attachment is obsolete: true
(In reply to Steve Workman [:sworkman] from comment #22)

> This is the Windows IP->String bug you mentioned, right? Assuming handling
> the return code fixes it - looks good.

Yeah, that's the bug.

> This looks fine - locking addr_info before the 'if' conditional is needed.
> But I'm wondering about the fact that we're essentially locking the whole
> nsHostRecord until the function completes. Do you still see the threading
> issue if the Unlocks stay where they are (and you call Lock directly). In
> other words, is the problem accessing addr_info outside a lock, or the rest
> of the nsHostRecord?

I had code that was a bit more specific about the locking but it was much uglier given the way the method is structured. I figured just locking the whole method was probably fine, nothing in there takes much time. I'll see if I can find something not-ugly that locks less though.
Attached patch fix v1.5 (obsolete) — Splinter Review
Locks less of getNextAddr.
Attachment #694376 - Attachment is obsolete: true
Comment on attachment 694417 [details] [diff] [review]
fix v1.5

Review of attachment 694417 [details] [diff] [review]:
-----------------------------------------------------------------

r=me

::: netwerk/dns/nsDNSService2.cpp
@@ +134,5 @@
>              return NS_ERROR_NOT_AVAILABLE;
>          }
>      }
>      else {
>          mHostRecord->addr_info_lock.Unlock();

Looks good - thanks for doing that.
Attachment #694417 - Flags: review+
Backed out again.

http://hg.mozilla.org/integration/mozilla-inbound/rev/76c18ac8fb97

The strange WinXP debug xpcshell test hang is back:

https://tbpl.mozilla.org/php/getParsedLog.php?id=18140897&tree=Mozilla-Inbound&full=1

And I got some email about memory:

Talos Regression :( Trace Malloc Allocs increase 0.736% on WINNT 5.2 Mozilla-Inbound
Previous: avg 446261.533 stddev 1278.822 of 30 runs up to revision 73ce12dc950b
New     : avg 449547.000 stddev 363.118 of 5 runs since revision ec4e88c2c77a
Change  : +3285.467 (0.736% / z=2.569)

Talos Regression :( Trace Malloc Leaks increase 0.168% on CentOS (x86_64) release 5 (Final) Mozilla-Inbound
Previous: avg 793385.500 stddev 68.142 of 30 runs up to revision 73ce12dc950b
New     : avg 794717.800 stddev 25.675 of 5 runs since revision 487d0f89749b
Change  : +1332.300 (0.168% / z=19.552)
Attached patch fix v1.6 (obsolete) — Splinter Review
Auditing memory usage was productive. Fixed two instances of the wrong memory free call (free->delete, and free->moz_free), one memory leak I introduced, and one memory leak that I think was already there. I filed bug 823837 on the pre-existing memory leak.
Attachment #694417 - Attachment is obsolete: true
I used to be able to re-reproduce a hang in xpcshell tests in a Win 7 vm, but I fixed that hang and I can't reproduce the hang that still exists in Win debug xpcshell tests. I have no more leads to go on, I may need to find a way to get access to a debugger on a tbox machine.
Attached patch fix v1.7Splinter Review
I have what appears to be a fix for the Win debug purple. I tracked it down to the same debug code that used to have a locking problem. All that code does is print out a list of host entries being cleared. Apparently the purple was caused by having the debug code remove address entries from the host record instead of just iterating over them. This patch just iterates over the entries, like the old code did.
Attachment #694722 - Attachment is obsolete: true
Hopefully it sticks this time...

http://hg.mozilla.org/integration/mozilla-inbound/rev/7f5fad93ef78
Depends on: 824459
https://hg.mozilla.org/mozilla-central/rev/7f5fad93ef78
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Blocks: 823837
Depends on: 825404
Depends on: 825492
Blocks: 826455
Depends on: 804605
Depends on: 857291
Depends on: 870652
You need to log in before you can comment on or make changes to this bug.