Last Comment Bug 308587 - PR_GetPhysicalMemorySize returns garbage on systems with more than 2GB RAM
: PR_GetPhysicalMemorySize returns garbage on systems with more than 2GB RAM
Status: RESOLVED FIXED
: fixed1.8
Product: NSPR
Classification: Components
Component: NSPR (show other bugs)
: other
: x86 Linux
: -- critical (vote)
: 4.6.1
Assigned To: Darin Fisher
: Wan-Teh Chang
:
Mentors:
Depends on:
Blocks: 295074
  Show dependency treegraph
 
Reported: 2005-09-14 18:40 PDT by Darin Fisher
Modified: 2006-03-12 18:55 PST (History)
4 users (show)
asa: blocking1.8b5+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
v1 patch (944 bytes, patch)
2005-09-14 18:43 PDT, Darin Fisher
no flags Details | Diff | Splinter Review
v2 patch (2.95 KB, patch)
2005-09-15 18:14 PDT, Darin Fisher
no flags Details | Diff | Splinter Review
v2.1 patch (3.20 KB, patch)
2005-09-15 18:56 PDT, Darin Fisher
wtc: review+
asa: approval1.8b5+
Details | Diff | Splinter Review

Description Darin Fisher 2005-09-14 18:40:37 PDT
PR_GetPhysicalMemorySize returns garbage on systems with more than 2GB RAM

I observed this with 32-bit x86-linux on a machine with 3GB RAM.

PR_GetPhysicalMemorySize returns 18446744072724996096.  The problem is with this
code:

    long pageSize = sysconf(_SC_PAGESIZE);
    long pageCount = sysconf(_SC_PHYS_PAGES);
    LL_I2L(bytes, pageSize * pageCount);

sizeof(long) is 4, so on my system "pageSize * pageCount" is a negative value. 
LL_I2L casts the negative value to a PRInt64, which ends up producing the bogus
value.

This bug ends up causing Firefox to think that it has a zero-sized memory cache
on my system, which obviously degrades performance.

I'll attach a candidate patch shortly.
Comment 1 Darin Fisher 2005-09-14 18:43:39 PDT
Created attachment 196105 [details] [diff] [review]
v1 patch

Here's a simple patch that solves the problem.	It's interesting that LL_UI2L
is actually identical to LL_I2L, so it is not enough to change the macro.  I
need the cast to force the correct result.  I decided to change the macro for
clarity only.
Comment 2 Darin Fisher 2005-09-14 18:44:52 PDT
I think it is crucial that this fix go into Firefox 1.5.
Comment 3 Wan-Teh Chang 2005-09-14 18:56:03 PDT
Comment on attachment 196105 [details] [diff] [review]
v1 patch

> #if defined(LINUX) || defined(SOLARIS)
> 
>     long pageSize = sysconf(_SC_PAGESIZE);
>     long pageCount = sysconf(_SC_PHYS_PAGES);
>-    LL_I2L(bytes, pageSize * pageCount);
>+    LL_UI2L(bytes, (unsigned long) pageSize * pageCount);
> 
> #elif defined(HPUX)

I think it's better to declare pageSize and pageCount
as PRUint64:

      PRUint64 pageSize, pageCount;
      LL_I2L(pageSize, sysconf(_SC_PAGESIZE));
      LL_I2L(pageCount, sysconf(_SC_PHYS_PAGES));
      LL_MUL(bytes, pageSize, pageCount);

This is for the case of a 32-bit OS running on a system
with more than 4GB RAM.

Could you test this?
Comment 4 Darin Fisher 2005-09-14 19:00:16 PDT
Yeah, I like that better.  I will try it now.
Comment 5 Darin Fisher 2005-09-14 19:05:28 PDT
Wan-Teh: Yeah, that patch works fine.  However, on platforms that do not have a
native 64-bit integer type, we would end up evaluating sysconf twice per LL_I2L
invocation.  So, temporary |long| variables are warranted.
Comment 6 Wan-Teh Chang 2005-09-14 19:13:04 PDT
Sigh.  You can just use regular arithmetic operators to
write that code.

    PRUint64 pageSize = sysconf(_SC_PAGESIZE);
    PRUint64 pageCount = sysconf(_SC_PHYS_PAGES);
    bytes = pageSize * pageCount;

It's 2005 now.  No need to use the LL_ macros.
Comment 7 Julien Pierre 2005-09-14 19:31:32 PDT
I agree with Wan-Teh. We should not support compilers that don't offer a long
long type anymore. I believe there are already several places in NSS which break
those compilers, and no one has been complaining.
Comment 8 timeless 2005-09-15 00:10:37 PDT
arg. please don't. some of us plan to deal with those edge cases.
Comment 9 Wan-Teh Chang 2005-09-15 09:58:32 PDT
Comment on attachment 196105 [details] [diff] [review]
v1 patch

We should also review the implementations for the other
platforms to see if they have the same overflow problem.

The code for HPUX also multiplies two 32-bit numbers
(in 32-bit HP-UX), so it needs to be fixed the same way.
(The type of info.physical_memory and info.page_size
is _T_LONG_T, which is either int32_t or int64_t depending
on whether the HP-UX OS is 32-bit or 64-bit.)  The code for
other platforms looks fine.

In the code for AIX, we have
    LL_MUL(bytes, kbytes, 1024);
It is wrong to use a constant in the LL_MUL macro.

timeless: please name a compiler you need to support that
doesn't have a 64-bit integer type (which doesn't need
to be called "long long").  All the compilers NSPR support
today have a 64-bit integer type.  Our portability guidelines
need to be reviewed from time to time, and I think it is
safe to say goodbye to the LL_ macros now.
Comment 10 Darin Fisher 2005-09-15 10:15:09 PDT
Wan-Teh: Ok, I'll take a crack at putting together a better patch shortly.
Comment 11 Darin Fisher 2005-09-15 18:14:55 PDT
Created attachment 196269 [details] [diff] [review]
v2 patch

Let me know if you want a more minimal patch.  I just decided to remove the LL_
macros from all #ifdef'd sections.
Comment 12 Wan-Teh Chang 2005-09-15 18:32:52 PDT
Comment on attachment 196269 [details] [diff] [review]
v2 patch

You missed this LL_ macro:

>     PRUint64 bytes = LL_ZERO;

In the following two changes:

>-    LL_I2L(bytes, pageSize * pageCount);
>+    bytes = (PRUint64) pageSize * pageCount;

and

>-        LL_I2L(bytes, info.physical_memory * info.page_size);
>+        bytes = (PRUint64) info.physical_memory * info.page_size;

I assume the (PRUint64) applies to the first operand rather
than the product (i.e., the typecast binds more tightly than
the * operator), and you rely on the automatic promotion of
the second operand to PRUint64.  Correct?
Comment 13 Darin Fisher 2005-09-15 18:49:48 PDT
> You missed this LL_ macro:
> 
> >     PRUint64 bytes = LL_ZERO;

Yeah, that was intentional.  I only changed code inside the #ifdef's, but I'll
go ahead and change that as well.


> I assume the (PRUint64) applies to the first operand rather
> than the product (i.e., the typecast binds more tightly than
> the * operator), and you rely on the automatic promotion of
> the second operand to PRUint64.  Correct?

Yes, exactly.  If you prefer, I will make the casting more explicit.
Comment 14 Darin Fisher 2005-09-15 18:56:43 PDT
Created attachment 196273 [details] [diff] [review]
v2.1 patch

BTW, I tested this patch on Linux-x86 (3 GB RAM), OSX-ppc (1 GB RAM), and WinXP
(1 GB RAM).
Comment 15 Darin Fisher 2005-09-16 18:41:43 PDT
Comment on attachment 196273 [details] [diff] [review]
v2.1 patch

I think we should fix this for Firefox 1.5
Comment 16 Darin Fisher 2005-09-16 18:47:05 PDT
fixed on NSPR trunk and NSPRPUB_PRE_4_2_CLIENT_BRANCH.

I assume that we will want to also land this on the following branches:
MOZILLA_1_8_BRANCH and NSPR_4_6_BRANCH.

I will await explicit approval before doing so.
Comment 17 Wan-Teh Chang 2005-09-16 23:32:42 PDT
Yes, MOZILLA_1_8_BRANCH and NSPR_4_6_BRANCH should
be kept in sync for as long as possible.
Comment 18 Darin Fisher 2005-09-19 11:04:55 PDT
fixed1.8, fixed-nspr4.6

Note You need to log in before you can comment on or make changes to this bug.