Open Bug 853902 Opened 11 years ago Updated 2 years ago

glibc detected *** -loader: free(): corrupted unsorted chunks

Categories

(NSPR :: NSPR, defect)

x86_64
Linux
defect

Tracking

(Not tracked)

People

(Reporter: elio.maldonado.batiz, Unassigned)

Details

Attachments

(1 file)

As originally reported by Jan Stodola for Red Hat Enterprise Linux 6 on s390x:

Description of problem:
At the end of installation on s390x, when the user clicks the Reboot button, the following message is displayed:

detecting hardware...
waiting for hardware to initialize...
Running anaconda 13.21.9, the Red Hat Enterprise Linux 6 system installer - please wait.
15:51:29 DISPLAY variable not set. Starting text mode.                                  
*** glibc detected *** -loader: free(): corrupted unsorted chunks: 0x00000000800c6290 ***
======= Backtrace: =========                                                             
/lib64/libc.so.6(+0x8694e)[0x200006e494e]                                                
/lib64/libnspr4.so(+0x2e196)[0x20000e3a196]                                              
/lib64/libnspr4.so(+0xe0d0)[0x20000e1a0d0] 

See the attachment for the memory map.
The logs come from a text-mode installation, but I also saw this issue when running in VNC mode. The installation reboots without any other visible issue.

Version-Release number of selected component (if applicable):
RHEL6.0-20100203.3
anaconda-13.21.9-1.el6

How reproducible:
sometimes

Steps to Reproduce:
1. Install RHEL6; at the end of installation, press the Reboot button

Actual results:
*** glibc detected *** -loader: free(): corrupted unsorted chunks: 0x00000000800c6290 ***
...

Expected results:
Installation reboots without errors

Relevant comments from the discussion thread of the original bug:

Jan Stodola 2010-02-12 04:11:40 EST

I found a way to reproduce this issue every time:

1. start installation using kernel.img + initrd.img
2. in terminal, run ssh -x install@... to login and start the installation
3. select language and provide path to download install.img
4. start vnc installation
5. connect to the machine using vncviewer
6. resize terminal with running ssh connection from step 2
7. *** glibc detected ***

2) Analysis from Bob Relyea 2010-05-10 14:59:12 EDT

I've determined the underlying problem of the error messages.

They are coming from nspr when the shared library is being unloaded at exit time. At this point the loader has already sent the kill signal to init. When I single step through the loader code, I do not have any issues. I can only guess that there is a race as the system is being torn down that is affecting the state of the memory pool.

The thing to note about these error messages: if you are experiencing a crash, it's probably a crash in anaconda. These messages only happen when loader is exiting on its own.

(NB: Anaconda is Red Hat's installer)

I can prevent NSPR from freeing any memory on unload. This should be relatively safe since it's relatively rare that NSPR gets loaded and unloaded on the fly into an address space (pam is the main example). I can also prevent NSPR from freeing its main thread and get the same effect.
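
The first option can be illustrated with a minimal sketch. This is not the attached patch and not NSPR code: `lib_unloading`, `frees_performed`, `lib_free` and `lib_begin_unload` are hypothetical names used only to show the idea of skipping `free()` once unloading has started, on the assumption that the process is about to exit and the kernel will reclaim the memory anyway.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical flag: set once the library starts unloading. */
static bool lib_unloading = false;

/* Hypothetical counter so the effect of the flag can be observed. */
static int frees_performed = 0;

/* Hypothetical wrapper: release memory unless the library is unloading. */
static void lib_free(void *ptr)
{
    if (lib_unloading) {
        /* Deliberately leak: the allocator state may already be
         * unreliable, and the memory goes away with the process. */
        return;
    }
    free(ptr);
    frees_performed++;
}

/* Hypothetical unload hook; a real library would call something like
 * this at the start of its cleanup or destructor path. */
static void lib_begin_unload(void)
{
    lib_unloading = true;
}
```

The trade-off is the one described above: memory is leaked if the library is unloaded and reloaded within a long-lived process, which is why this is only reasonable when that pattern is rare.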

A third possibility is to place a short sleep after init. This causes loader to task switch and allows init to bring down the rest of the system (which is why single stepping appears to avoid the issue).
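
As a rough sketch of that third option, the helper below sends a signal and then sleeps briefly so the scheduler can run the signalled process before the caller continues toward exit. The function name `signal_then_pause`, the choice of signal, and the delay are all assumptions for illustration; the actual loader code is not shown in this bug.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: signal a process, then yield the CPU for
 * pause_ms milliseconds before returning to the caller.
 * Returns 0 on success, -1 if kill() fails. */
static int signal_then_pause(pid_t pid, int signo, long pause_ms)
{
    struct timespec delay;

    if (kill(pid, signo) != 0) {
        return -1;
    }

    delay.tv_sec = pause_ms / 1000;
    delay.tv_nsec = (pause_ms % 1000) * 1000000L;

    /* Sleeping forces a task switch, giving the signalled process
     * a chance to act before the caller carries on exiting. */
    nanosleep(&delay, NULL);
    return 0;
}
```

In the scenario described here the call would presumably target init (pid 1) with whatever signal loader already sends. For a harmless trial, passing signal 0 makes `kill()` perform only its error checking without delivering anything.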

Both of these solutions carry the risk of papering over some larger underlying problem. In that case someone needs to track what's going on with the underlying malloc system during shutdown. (I've seen the malloc system corrupted as early as the start of the exit call in the loader.)

bob

Patch that was applied; the fix was verified on RHEL-6.
Attachment #728292 - Attachment description: Don't call free if we are unloading the library... by rreleya → Don't call free if we are unloading the library... by bob rrelyea
Attachment #728292 - Attachment description: Don't call free if we are unloading the library... by bob rrelyea → Don't call free if we are unloading the library... by rrelyea

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: wtc → nobody
Severity: normal → S3