oracle_linux7.2_crash

The release of Oracle Linux 7.2 brought a small change that affects every application running on it: the default behaviour when a user logs out. Until then, the system did not remove a user's Inter Process Communication (IPC) objects at logout. This behaviour is controlled by the following configuration file: “/etc/systemd/logind.conf”

Default settings:

logind Config File

[Login]
#NAutoVTs=6
#ReserveVT=6
#KillUserProcesses=no
#KillOnlyUsers=
#KillExcludeUsers=root
#InhibitDelayMaxSec=5
#HandlePowerKey=poweroff
#HandleSuspendKey=suspend
#HandleHibernateKey=hibernate
#HandleLidSwitch=suspend
#HandleLidSwitchDocked=ignore
#PowerKeyIgnoreInhibited=no
#SuspendKeyIgnoreInhibited=no
#HibernateKeyIgnoreInhibited=no
#LidSwitchIgnoreInhibited=yes
#IdleAction=ignore
#IdleActionSec=30min
#RuntimeDirectorySize=10%
#RemoveIPC=yes                         <- When set to "yes", the user's IPC objects are removed after the user logs out.
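
Lines starting with "#" in this file only document the default, so it helps to check whether an explicit value has been set. The following is a minimal sketch, not an official tool: the function name effective_remove_ipc is made up for this example, and it assumes the setting lives only in the single file passed as an argument.

```shell
#!/bin/sh
# Print the RemoveIPC value that is explicitly set in a logind.conf-style file.
# Commented-out lines (starting with '#') are ignored, because they only
# document the default and do not set anything.
# NOTE: effective_remove_ipc is a hypothetical helper name for this sketch.
effective_remove_ipc() {
    conf="${1:-/etc/systemd/logind.conf}"
    # Keep the last uncommented RemoveIPC= assignment, if there is one.
    value=$(sed -n 's/^[[:space:]]*RemoveIPC[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$conf" 2>/dev/null | tail -n 1)
    if [ -n "$value" ]; then
        echo "$value"
    else
        echo "unset (compiled-in default applies)"
    fi
}
```

Called without an argument, the function reads /etc/systemd/logind.conf. A result of "unset" means the default of the installed systemd build applies, which on the release described here is "yes".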

This causes a problem for Oracle. Let's see why:

Oracle, like many other applications, relies heavily on semaphores, which are used to implement locks (latches, enqueues, mutexes, etc.). So what happens when the semaphores are removed after the DBA logs out?

Well, this happens:

Oracle Crash

Errors in file /u01/app/oracle/diag/rdbms/orcltest/orcltest/trace/orcltest_ofsd_10127_10129.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
2020-06-11T09:50:26.163989+02:00
PMAN termination due to ORA-27157 in action 'monitor DNFS IO SLAVES'
2020-06-11T09:50:26.164138+02:00
Errors in file /u01/app/oracle/diag/rdbms/orcltest/orcltest/trace/orcltest_pman_10137.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
2020-06-11T09:50:26.205582+02:00
Errors in file /u01/app/oracle/diag/rdbms/orcltest/orcltest/trace/orcltest_smon_10149.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with statu
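
The "Identifier removed" message means the semaphore set the process was waiting on no longer exists. One way to confirm this is to count the semaphore arrays owned by the database user before and after logging out. The sketch below assumes the column layout printed by "ipcs -s" on Linux (key, semid, owner, perms, nsems) and an OS user called oracle. The function name count_sem_arrays is made up for this example.

```shell
#!/bin/sh
# Count the semaphore arrays owned by a given user.
# Reads the output of `ipcs -s` on stdin; assumes the Linux column layout
#   key  semid  owner  perms  nsems
# NOTE: count_sem_arrays is a hypothetical helper name for this sketch.
count_sem_arrays() {
    owner="$1"
    # Data rows start with a hexadecimal key; header lines do not.
    awk -v owner="$owner" '$1 ~ /^0x/ && $3 == owner { n++ } END { print n + 0 }'
}
```

Typical usage would be "ipcs -s | count_sem_arrays oracle". If the count drops to 0 while the instance is still supposed to be running, the semaphores have been removed from under it.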

In a nutshell, Oracle crashes because its semaphores no longer exist: semop fails with "Identifier removed". You can of course investigate the stack:

Investigate the Stack

ksedsts()+244<-kjzdssdmp()+321<-kjzduptcctx()+692<-kjzdicrshnfy()+992<-ksuitm()+5857<-ksbrdp()+4223<-opirip()+1488<-opidrv()+616<-sou2o()+145<-opimai_real()+270<-ssthrdmain()+412<-main()+236<-__libc_start_m

--> library started --- execution begins
  --> main --- main procedure call
    --> ssthrdmain --- thread initiated
      --> opimai_real
        --> sou2o --- call
          --> opidrv --- driver call
            --> opirip
              --> ksbrdp --- protocol (RDP)
                --> ksuitm --- initial session allocation, where the program allocates the memory it needs to start (itm)
                  --> kjzdicrshnfy --- the memory allocation failed, so a crash notification is raised
                    --> kjzduptcctx --- notifies DIAG of the crash event
                      --> kjzdssdmp --- dumps the log

From the stack, you can see that the process failed while allocating the memory it needs to start, after which the crash was reported to DIAG and the log was dumped.

So, what is the solution? We have to change the setting and then restart the service, as follows:

Fix the configuration

--Edit /etc/systemd/logind.conf and set:
RemoveIPC=no

--Restart the service:
systemctl restart systemd-logind.service

--or, better, reboot the whole system:
systemctl reboot
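
If the same edit has to be made on several hosts, it can be scripted. The following is a sketch under stated assumptions, not a vetted tool: the function name set_remove_ipc_no is made up for this example, the file is assumed to contain only the [Login] section (so appending at the end is safe), and the script must be run as root to write the real file.

```shell
#!/bin/sh
# Set RemoveIPC=no in a logind.conf-style file, idempotently.
# NOTE: set_remove_ipc_no is a hypothetical helper name for this sketch.
set_remove_ipc_no() {
    conf="${1:-/etc/systemd/logind.conf}"
    tmp=$(mktemp) || return 1
    if grep -Eq '^[[:space:]]*#?[[:space:]]*RemoveIPC[[:space:]]*=' "$conf"; then
        # Replace the existing line, whether or not it is commented out.
        sed 's/^[[:space:]]*#\{0,1\}[[:space:]]*RemoveIPC[[:space:]]*=.*$/RemoveIPC=no/' "$conf" > "$tmp"
    else
        # No RemoveIPC line yet: append one (assumes [Login] is the only section).
        cat "$conf" > "$tmp"
        echo 'RemoveIPC=no' >> "$tmp"
    fi
    # Write back in place so ownership and permissions of the file are kept.
    cat "$tmp" > "$conf" && rm -f "$tmp"
}
```

Calling set_remove_ipc_no without an argument edits /etc/systemd/logind.conf. The service still has to be restarted afterwards, as shown above, for the change to take effect.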
  • oracle_linux7.2_crash.txt
  • Last modified: 2020/06/12 11:40
  • by andonovj