Open Bug 1948051 Opened 11 months ago Updated 7 months ago

mach build hangs at the end of build (again). ctrl+c then restart build completes the build without hang

Categories

(Firefox Build System :: General, defect)

x86_64
Linux
defect

Tracking

(Not tracked)

People

(Reporter: manuel, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

Attached file mach-build-ctrl-c.html

For me, ./mach build sometimes hangs again, in a way very similar to Bug 1499382. It completes when running ./mach build again.

Attached file mach-build.html
Summary: mach build hangs → mach build hangs at the end of build (again). ctrl+c then restart build completes the build without hang
Severity: -- → S3

Does it still hang if you apply this patch:

--- a/python/mozbuild/mozbuild/controller/building.py
+++ b/python/mozbuild/mozbuild/controller/building.py
@@ -197,7 +197,7 @@ class BuildMonitor(MozbuildObject):
             poll_interval=0.1,
             metadata={"CPUName": get_cpu_brand()},
         )
-        self._resources_started = False
+        self._resources_started = None
 
         self.tiers = TierStatus(self.resources, metrics)
 
@@ -240,8 +240,9 @@ class BuildMonitor(MozbuildObject):
 
     def start_resource_recording(self):
         # This should be merged into start() once bug 892342 lands.
-        self.resources.start()
-        self._resources_started = True
+        #self.resources.start()
+        #self._resources_started = True
+        return
 
     def on_line(self, line):
         """Consume a line of output from the build system.
@@ -332,8 +333,7 @@ class BuildMonitor(MozbuildObject):
     def stop_resource_recording(self):
         if self._resources_started:
             self.resources.stop()
-
-        self._resources_started = False
+            self._resources_started = False
 
     def finish(self):
         """Record the end of the build."""
@@ -346,6 +346,8 @@ class BuildMonitor(MozbuildObject):
         self.warnings_database.save_to_file(self._warnings_path)
 
     def record_usage(self):
+        if not self.have_resource_usage:
+            return
         build_resources_profile_path = None
         try:
             # When running on automation, we store the resource usage data in
@@ -464,7 +466,7 @@ class BuildMonitor(MozbuildObject):
     @property
     def have_resource_usage(self):
         """Whether resource usage is available."""
-        return self.resources.start_time is not None
+        return self._resources_started is not None
 
     def get_resource_usage(self):
         """Produce a data structure containing the low-level resource usage information.
Flags: needinfo?(manuel)
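
For anyone reading along, here is a minimal sketch of what the patch above appears to do to the flag handling. FakeResources and PatchedMonitor are hypothetical stand-ins, not the real mozbuild classes; only the logic visible in the diff is mirrored. The point is that with start_resource_recording() turned into a no-op, the resource monitor never runs and record_usage() returns early, which should show whether the hang depends on resource recording.

```python
class FakeResources:
    """Hypothetical stand-in for the resource monitor (not the real mozbuild object)."""

    def __init__(self):
        self.started = False
        self.stopped = False

    def start(self):
        self.started = True

    def stop(self):
        self.stopped = True


class PatchedMonitor:
    """Mirrors only the flag handling shown in the diff above."""

    def __init__(self, resources):
        self.resources = resources
        # None means "recording was never started".
        self._resources_started = None

    def start_resource_recording(self):
        # The patch comments out the start, so this is a no-op.
        return

    def stop_resource_recording(self):
        # Only stop the monitor if it was actually started.
        if self._resources_started:
            self.resources.stop()
            self._resources_started = False

    @property
    def have_resource_usage(self):
        return self._resources_started is not None

    def record_usage(self):
        # Early return added by the patch.
        if not self.have_resource_usage:
            return None
        return "usage recorded"


resources = FakeResources()
monitor = PatchedMonitor(resources)
monitor.start_resource_recording()
monitor.stop_resource_recording()
print(monitor.have_resource_usage, resources.started, resources.stopped)
```

With the patch applied, neither start() nor stop() is ever called on the monitor, so a build that still hangs would point away from resource recording.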

I'm having trouble finding a way to reproduce this reliably, and I haven't tried the patch yet. So I can't tell for now, but I will update this bug when I have more information. Sorry!

Clearing needinfo for now.

Flags: needinfo?(manuel)
Duplicate of this bug: 1957417

For me, this happens when I didn't do a build for "some time".

Here are 2 profiles of when this happens:

To make it finish and capture the profiles, I killed the remaining sccache process.

For comparison, here are the same profiles when the build doesn't hang:

I believe the main difference is that the first one runs clang (and sccache). Note that I didn't change any file, but maybe some time-related file still changes?

I'll try disabling sccache and see if this makes a difference.

After disabling sccache I don't seem to get the problem anymore.

I'll enable it again and look with samply and possibly strace again.

What version of sccache are you using?

This is sccache 0.9.1, the one installed in .mozbuild automatically.

From strace it looks like it's waiting for a futex:

futex(0x7fd1fbaf5918, FUTEX_WAIT_PRIVATE, 1, NULL

I don't think I have the symbols for it at the moment, so I can't look into it further.
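
In case it helps others who hit this, here is a rough sketch for spotting a leftover sccache process on Linux and seeing what it is waiting on, without attaching strace. It only reads /proc/<pid>/comm, /proc/<pid>/status and /proc/<pid>/wchan. The function name find_processes and the proc_root parameter are my own, for illustration. It inspects the main thread only, so a wait in a worker thread may not show up, and wchan can be empty or "0" depending on the kernel and permissions.

```python
from pathlib import Path


def find_processes(name, proc_root="/proc"):
    """Return sorted (pid, state, wchan) tuples for processes whose comm equals name."""
    root = Path(proc_root)
    if not root.is_dir():
        return []  # no procfs here (not Linux, or not mounted)

    results = []
    for entry in root.iterdir():
        if not entry.name.isdigit():
            continue  # not a process directory
        try:
            comm = (entry / "comm").read_text().strip()
            if comm != name:
                continue
            state = ""
            for line in (entry / "status").read_text().splitlines():
                if line.startswith("State:"):
                    state = line.split(":", 1)[1].strip()
                    break
        except OSError:
            continue  # process exited or is not readable
        try:
            wchan = (entry / "wchan").read_text().strip()
        except OSError:
            wchan = ""  # wait channel unavailable
        results.append((int(entry.name), state, wchan))
    return sorted(results)


if __name__ == "__main__":
    for pid, state, wchan in find_processes("sccache"):
        print(pid, state, wchan)
```

Run it while ./mach build appears stuck. A sleeping sccache process that is still around after the build output has stopped would match what I saw before killing it.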
