The default bug view has changed. See this FAQ.

Disable nice on systems that have multiple CPUs.

RESOLVED FIXED in mozilla7

Status

()

Core
Widget: Android
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: dougt, Assigned: gcp)

Tracking

Trunk
mozilla7
All
Linux
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: QA?)

Attachments

(2 attachments, 6 obsolete attachments)

(Reporter)

Description

6 years ago
On Mobile, we can and do tweak the nice value of the child process:

http://mxr.mozilla.org/mozilla-central/source/dom/ipc/ContentParent.cpp#217

However, this doesn't make much sense and can be a perf hit on devices that have multiple cores.

Lets change this so that on devices with multiple cores we stop doing this nicing.  We may also want to experiment with cpu scheduling.

Note:

I do not think that

`cat /proc/cpuinfo | grep processor | wc -l`

will work on all devices.  I have a tablet that has multiple cores which does not show more than one processor line.  Maybe that is a bug on the device, not sure.

Maybe we can try something like calling sched_getaffinity and seeing what is set.
(Reporter)

Updated

6 years ago
Assignee: nobody → doug.turner
`adb shell cat /proc/cpuinfo | grep  -i processor | wc -l` gives correct values on my atrix and nexus one, can you test on your device?
(Reporter)

Updated

6 years ago
Assignee: doug.turner → gpascutto
(Assignee)

Comment 2

6 years ago
On my Galaxy S2 /proc/cpuinfo seems to be correct as well, i.e. has 2 processor lines.

Using mfinkel's test runner, with nice:
Fennec 7.0a1,GT-I9100,20110615011310,3486,AVERAGE_TP
Fennec 7.0a1,GT-I9100,20110615011310,3119,AVERAGE_TP
Fennec 7.0a1,GT-I9100,20110615011310,4274,AVERAGE_TP

without nice:
Fennec 7.0a1,GT-I9100,20110615003204,3948,AVERAGE_TP
Fennec 7.0a1,GT-I9100,20110615003204,3505,AVERAGE_TP
Fennec 7.0a1,GT-I9100,20110615020648,3823,AVERAGE_TP

Variance looks a bit big to say a lot, but it doesn't look significantly different. I'll check also with nice=20 as asked by dougt and code up a patch.
(Assignee)

Comment 3

6 years ago
Created attachment 539597 [details] [diff] [review]
Patch v1
Attachment #539597 - Flags: review?(doug.turner)
(Reporter)

Comment 4

6 years ago
Comment on attachment 539597 [details] [diff] [review]
Patch v1

I think the best way to expose the CPU count is through system info.  This would allow anyone (including chrome script) to figure out how many CPUs there are.  Could you look at doing that:

http://mxr.mozilla.org/mozilla-central/source/xpcom/base/nsSystemInfo.cpp
Attachment #539597 - Flags: review?(doug.turner) → review-
(Assignee)

Comment 5

6 years ago
Looks like it's already there:
http://mxr.mozilla.org/mozilla-central/source/xpcom/base/nsSystemInfo.cpp#92

but it's checking the amount online instead of present, so it will go wrong with powersaving:
http://mxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/misc/prsystem.c#259

I'll change the behavior of the latter and rework the patch to use it instead.
(Assignee)

Comment 6

6 years ago
Darn, Android's libc gets it wrong:

android/bionic/libc/unistd/sysconf.c:

static int
__get_nproc_conf(void)
...
    if (line_parser_init(parser, "/proc/cpuinfo") < 0)
...

Will have to put the correct code into prsystem.c directly.
(Assignee)

Comment 7

6 years ago
Created attachment 539697 [details] [diff] [review]
Patch v2 (1/2)
Attachment #539597 - Attachment is obsolete: true
Attachment #539697 - Flags: review?(doug.turner)
(Assignee)

Comment 8

6 years ago
Created attachment 539698 [details] [diff] [review]
Patch v2 (2/2)
Attachment #539698 - Flags: review?(doug.turner)
(Reporter)

Comment 9

6 years ago
Comment on attachment 539697 [details] [diff] [review]
Patch v2 (1/2)

Review of attachment 539697 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/ipc/ContentParent.cpp
@@ +223,5 @@
> +        /* make the GUI thread have higher priority on single-cpu devices */
> +        nsresult rv;
> +        nsCOMPtr<nsIPropertyBag2> infoService =
> +            do_GetService(NS_SYSTEMINFO_CONTRACTID, &rv);
> +        if (NS_SUCCEEDED(rv)) {

nsCOMPtr<nsIPropertyBag2> infoService = do_GetService(NS_SYSTEMINFO_CONTRACTID)
if (infoService) {
  ...


There is no real need to deal with the nsresult

@@ +226,5 @@
> +            do_GetService(NS_SYSTEMINFO_CONTRACTID, &rv);
> +        if (NS_SUCCEEDED(rv)) {
> +            PRInt32 cpus;
> +            NS_NAMED_LITERAL_STRING(cpucount, "cpucount");
> +            rv = infoService->GetPropertyAsInt32(cpucount, &cpus);

infoService->GetPropertyAsInt32(cpucount, &cpus);


infoService->GetPropertyAsInt32(NS_LITERAL_STRING("cpucount"), &cpus);

no need for the named literal.

@@ +228,5 @@
> +            PRInt32 cpus;
> +            NS_NAMED_LITERAL_STRING(cpucount, "cpucount");
> +            rv = infoService->GetPropertyAsInt32(cpucount, &cpus);
> +            if (NS_SUCCEEDED(rv)) {
> +                if (nice != 0 && cpus == 1) {

What happens if GetPropertyAsInt32 fails -- does that mean we don't ever setpriority?  If GetPropertyAsInt32 fails, maybe we just set cpus to 1, and just do the existing logic.
Attachment #539697 - Flags: review?(doug.turner) → review-
(Reporter)

Comment 10

6 years ago
Comment on attachment 539698 [details] [diff] [review]
Patch v2 (2/2)

Review of attachment 539698 [details] [diff] [review]:
-----------------------------------------------------------------

::: nsprpub/pr/src/misc/prsystem.c
@@ +259,5 @@
>  #elif defined(XP_UNIX)
> +   /* for the benefit of devices with advanced power-saving, that
> +      actually hotplug their cpus in heavy load, try to figure out
> +      the real number of CPUs */
> +   char buf[512];

Maybe a #define for 512, then use that in the call to fgets() below.
Attachment #539698 - Flags: review?(ted.mielczarek)
Attachment #539698 - Flags: review?(doug.turner)
Attachment #539698 - Flags: review+
(Reporter)

Comment 11

6 years ago
ted needs to review the nspr bits  -- he is a module peer of nspr.
(Assignee)

Comment 12

6 years ago
Created attachment 539852 [details] [diff] [review]
Patch v3 (1/2)
Attachment #539697 - Attachment is obsolete: true
Attachment #539698 - Attachment is obsolete: true
Attachment #539698 - Flags: review?(ted.mielczarek)
Attachment #539852 - Flags: review?(doug.turner)
(Assignee)

Comment 13

6 years ago
Created attachment 539853 [details] [diff] [review]
Patch v3 (2/2)

Some small fixes/cleanups in addition to the comments from dougt.
Attachment #539853 - Flags: review?(doug.turner)
(Reporter)

Comment 14

6 years ago
Comment on attachment 539852 [details] [diff] [review]
Patch v3 (1/2)

i think you want to drop the rv from the call to do_GetService().  otherwise, looks good.
Attachment #539852 - Flags: review?(doug.turner) → review+
(Reporter)

Comment 15

6 years ago
first part w/ the rv correction:

http://hg.mozilla.org/mozilla-central/rev/35a679df430b
(Reporter)

Comment 16

6 years ago
backed out due to bustage.
(Assignee)

Comment 17

6 years ago
Created attachment 539977 [details] [diff] [review]
Patch v4 (1/2)
Attachment #539852 - Attachment is obsolete: true
Attachment #539853 - Attachment is obsolete: true
Attachment #539853 - Flags: review?(doug.turner)
Attachment #539977 - Flags: review?(doug.turner)
(Assignee)

Comment 18

6 years ago
Created attachment 539978 [details] [diff] [review]
Patch v4 (2/2)
Attachment #539978 - Flags: review?(ted.mielczarek)
(Reporter)

Comment 19

6 years ago
Comment on attachment 539977 [details] [diff] [review]
Patch v4 (1/2)

less buildfail now.
Attachment #539977 - Flags: review?(doug.turner) → review+
(Reporter)

Comment 20

6 years ago
http://hg.mozilla.org/integration/mozilla-inbound/rev/0be95155a3ef
http://hg.mozilla.org/mozilla-central/rev/0be95155a3ef
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla7
(Assignee)

Comment 22

6 years ago
Reopening for part 2: it will not work correctly on some Android devices without more advanced CPU-count detection in NSPR.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 539978 [details] [diff] [review]
Patch v4 (2/2)

Review of attachment 539978 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with a few changes. If you attach an updated patch I can land it in NSPR CVS for you.

::: nsprpub/pr/src/misc/prsystem.c
@@ +262,5 @@
> +      actually hotplug their cpus in heavy load, try to figure out
> +      the real number of CPUs */
> +   char buf[MAX_LINE];
> +   FILE *fin;
> +   const char *cpu_present = "/sys/devices/system/cpu/present";

This is Linux-only, so I think this ought to be inside an #if defined(LINUX).

@@ +268,5 @@
> +   fin = fopen(cpu_present, "r");
> +   if (fin != NULL) {
> +        if (fgets(buf, MAX_LINE, fin) != NULL) {
> +	    size_t strsize = strlen(buf);
> +	    if (strsize == 1 && isdigit(buf[0])) {

I can't find clear documentation about what values will be in this file. ( http://www.kernel.org/doc/Documentation/cputopology.txt only seems to give examples) Does this always hold true regardless of the actual value of buf[0]?

@@ +272,5 @@
> +	    if (strsize == 1 && isdigit(buf[0])) {
> +	        /* single core */
> +		numCpus = 1;
> +	    } else if (strsize >= 3 && strsize <= 5) {
> +	        /* should be of the form 0-999 */

Should you sanity-check the first digit here as well?
Attachment #539978 - Flags: review?(ted.mielczarek) → review+
(Assignee)

Comment 24

6 years ago
>I can't find clear documentation about what values will be in this file. ( 
>http://www.kernel.org/doc/Documentation/cputopology.txt only seems to give 
>examples) Does this always hold true regardless of the actual value of buf[0]?

The docs say the format is that of cpulist_parse
http://lxr.free-electrons.com/source/include/linux/cpumask.h#L573
which really calls
http://lxr.free-electrons.com/source/lib/bitmap.c#L590

I wish the cputopology.txt doc stated it more explicitly, but I believe the present CPUs are actually those that are numbered, which would limit it to be 0-indexed without gaps.

Maybe I can add a sanity check that the first number is actually 0, the second one a dash, and pull out of this routine if it isn't (because that means the above assumption is wrong).
(Assignee)

Comment 25

6 years ago
Created attachment 542200 [details] [diff] [review]
Patch 2/2 v5 Detect cpucount with hotplugging/powersaving
Attachment #539978 - Attachment is obsolete: true
Attachment #542200 - Flags: review?(ted.mielczarek)
Comment on attachment 542200 [details] [diff] [review]
Patch 2/2 v5 Detect cpucount with hotplugging/powersaving

Review of attachment 542200 [details] [diff] [review]:
-----------------------------------------------------------------

::: nsprpub/pr/src/misc/prsystem.c
@@ +297,2 @@
>  #elif defined(XP_UNIX)
> +    numCpus = sysconf( _SC_NPROCESSORS_CONF );

Did you mean to change the sysconf parameter here for the unix-but-not-linux case?
Attachment #542200 - Flags: review?(ted.mielczarek) → review+
(Assignee)

Comment 27

6 years ago
> Did you mean to change the sysconf parameter here for the unix-but-not-linux case?

Yes, "configured" instead of "currently online" sounds closer to what we really want, because the function is only called once at startup and the result cached.

These constants aren't standardized (man says "These values also exist, but may not be standard") but systems that know the one will know the other. There's no reason for us to do it one way on Linux and differently on other Unixy systems.

The risk here is if there are use cases for _SC_NPROCESSORS_CONF != _SC_NPROCESSORS_ONLN that are very differently from what we'd expect (Power saving or replacing faulty CPUs in a running system) and would cause us to behave wrongly.

Given that we have a relevant use case where we'd prefer the new behavior, and no known cases where we'd want the old one, this change seemed sensible to me.
Okay, that seems fine.
Checked in the NSPR patch to NSPR CVS:
Checking in pr/src/misc/prsystem.c;
/cvsroot/mozilla/nsprpub/pr/src/misc/prsystem.c,v  <--  prsystem.c
new revision: 3.37; previous revision: 3.36
done
(Reporter)

Comment 30

6 years ago
ted, when can we merge nspr into mc?  Otherwise this bug is ready to be marked fixed.
You can check with wtc, I think we can probably produce another beta tag and take a snapshot of that in m-c.

Comment 32

6 years ago
(In reply to comment #27)
>
> The risk here is if there are use cases for _SC_NPROCESSORS_CONF !=
> _SC_NPROCESSORS_ONLN that are very differently from what we'd expect (Power
> saving or replacing faulty CPUs in a running system) and would cause us to
> behave wrongly.
> 
> Given that we have a relevant use case where we'd prefer the new behavior,
> and no known cases where we'd want the old one, this change seemed sensible
> to me.

Gian-Carlo, thank you for your patch.

NSPR is not used just by Mozilla.  I assume you have considered
only the use cases of this function in Mozilla products.  So this
behavior change may break other NSPR-based applications.

I don't know what you meant by "the real number of CPUs".  Do
you mean the static count or the dynamic count?  Can you make it
unambiguous?  I guess you want the static count, to be consistent
with _SC_NPROCESSORS_CONF.

Re: the MAX_LINE macro: you can just pass sizeof(buf) to fgets().
(Assignee)

Comment 33

6 years ago
>NSPR is not used just by Mozilla.  I assume you have considered
>only the use cases of this function in Mozilla products.

No, we did consider other users. I presume a typical use of this function is to know the amount of threads to launch to get a task done as quickly as possible. (And note that the FF usage that lead to this bug is something entirely different) In that case, they need this fix too, because they'll get it wrong with current NSPR.

There's a risk some odd use case exists that goes well with the current behavior and wrong with the fixed one. Given that we know none of those but at least 2 opposite ones, I think fixing this is the most sensible course of action.

If anyone knows a use case that breaks with this new behavior, *now* would certainly be a good time to speak up.

>I don't know what you meant by "the real number of CPUs".  Do
>you mean the static count or the dynamic count?

The number of physical CPUs present that may be brought up into the OS at any time. That should fit with _SC_NPROCESSORS_CONF.

>Re: the MAX_LINE macro: you can just pass sizeof(buf) to fgets().

I'd prefer to get rid of the arbitrary buffer completely, but I'm not sure if there's any simple method that's portable enough for NSPR. I know getline(), but that's glibc only...
(Reporter)

Comment 34

6 years ago
wah-teh, what are the next steps here?
(Assignee)

Comment 35

6 years ago
This just landed: 
http://hg.mozilla.org/mozilla-central/rev/ec7ad3a1de60
(Assignee)

Updated

6 years ago
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago6 years ago
Resolution: --- → FIXED

Updated

6 years ago
Whiteboard: QA?
You need to log in before you can comment on or make changes to this bug.