Overview
With the release of Linux 7.2 came a small change that affects every application meant to run on it: the default behaviour at user logout changed. Until then, the system did not remove a user's Inter-Process Communication (IPC) objects when that user logged out. The behaviour is controlled by the following configuration file: “/etc/systemd/logind.conf”
Default settings:
logind Config File
[Login]
#NAutoVTs=6
#ReserveVT=6
#KillUserProcesses=no
#KillOnlyUsers=
#KillExcludeUsers=root
#InhibitDelayMaxSec=5
#HandlePowerKey=poweroff
#HandleSuspendKey=suspend
#HandleHibernateKey=hibernate
#HandleLidSwitch=suspend
#HandleLidSwitchDocked=ignore
#PowerKeyIgnoreInhibited=no
#SuspendKeyIgnoreInhibited=no
#HibernateKeyIgnoreInhibited=no
#LidSwitchIgnoreInhibited=yes
#IdleAction=ignore
#IdleActionSec=30min
#RuntimeDirectorySize=10%
#RemoveIPC=yes    <- The setting "yes" will kill IPC after the user logs out.
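Because every line in the default file is commented out, it is easy to misread which value is actually set. The following is a minimal sketch, assuming a single logind.conf-style file and ignoring any drop-in directories; remove_ipc_setting is a hypothetical helper name, not a systemd tool. It prints the value explicitly set for RemoveIPC, or "unset" when only the commented default is present.

```shell
#!/bin/sh
# remove_ipc_setting (hypothetical helper): print the value explicitly set
# for RemoveIPC in a logind.conf-style file.
# Lines starting with '#' are comments and are ignored, so a file that only
# contains "#RemoveIPC=yes" prints "unset" (the built-in default applies).
remove_ipc_setting() {
    conf_file="${1:-/etc/systemd/logind.conf}"
    # Strip "RemoveIPC=" from uncommented lines; keep the last occurrence.
    value=$(sed -n 's/^[[:space:]]*RemoveIPC[[:space:]]*=[[:space:]]*//p' "$conf_file" 2>/dev/null | tail -n 1)
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf 'unset\n'
    fi
}
```

Called without an argument, it reads /etc/systemd/logind.conf. A result of "unset" means the system falls back to its built-in default, which is exactly the case described here.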
Well, that leads to a few problems for Oracle. Let's see which:
Consequences for Oracle
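Oracle, like many other applications, relies heavily on semaphores, which back its locking mechanisms (latches, enqueues, mutexes, etc.). These semaphores belong to the OS user that runs the database, so with RemoveIPC=yes they are removed once that user has fully logged out. So what happens when the semaphores are removed after the DBA logs out?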
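Before answering, it can be useful to see how many semaphore sets are at stake. The sketch below counts the semaphore sets owned by a given user in the output of `ipcs -s`. It assumes the usual Linux column layout (key, semid, owner, perms, nsems); count_sem_sets is a hypothetical helper name, and the account name "oracle" in the usage line is an assumption, since the owner of the instance is not named here.

```shell
#!/bin/sh
# count_sem_sets (hypothetical helper): read `ipcs -s` output on stdin and
# print how many semaphore sets are owned by the user given as $1.
# Data rows are recognised by a key starting with "0x"; the owner is column 3.
count_sem_sets() {
    awk -v user="$1" '$1 ~ /^0x/ && $3 == user { n++ } END { print n + 0 }'
}

# Usage (the account name "oracle" is an assumption):
#   ipcs -s | count_sem_sets oracle
```

Running it before and after the last session of that user logs out shows whether the sets survive the logout.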
Well this happens:
Oracle Crash
Errors in file /u01/app/oracle/diag/rdbms/orcltest/orcltest/trace/orcltest_ofsd_10127_10129.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
2020-06-11T09:50:26.163989+02:00
PMAN termination due to ORA-27157 in action 'monitor DNFS IO SLAVES'
2020-06-11T09:50:26.164138+02:00
Errors in file /u01/app/oracle/diag/rdbms/orcltest/orcltest/trace/orcltest_pman_10137.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with status: 43
ORA-27301: OS failure message: Identifier removed
ORA-27302: failure occurred at: sskgpwwait1
2020-06-11T09:50:26.205582+02:00
Errors in file /u01/app/oracle/diag/rdbms/orcltest/orcltest/trace/orcltest_smon_10149.trc:
ORA-27157: OS post/wait facility removed
ORA-27300: OS system dependent operation:semop failed with statu
In a nutshell, Oracle crashes because its semaphores no longer exist: semop fails with status 43 (EIDRM, "Identifier removed"). You can of course investigate the stack:
Investigate the Stack
ksedsts()+244<-kjzdssdmp()+321<-kjzduptcctx()+692<-kjzdicrshnfy()+992<-ksuitm()+5857<-ksbrdp()+4223<-opirip()+1488<-opidrv()+616<-sou2o()+145<-opimai_real()+270<-ssthrdmain()+412<-main()+236<-__libc_start_m

Read from right to left (oldest call first):
--> library started, execution begins
--> main procedure call
--> thread initiated
--> opimai_real
--> sou2o: call
--> opidrv: driver call
--> opirip
--> ksbrdp: protocol (RDP)
--> ksuitm: session allocation, initially for the program to allocate memory to start (itm)
--> kjzdicrshnfy: the failure shows up here, with the crash notification
--> kjzduptcctx: notifying DIAG of the crash event
--> kjzdssdmp: dump the log
From the stack, you can see that the process failed at ksuitm, the step annotated above as allocating the memory needed to start, and then notified DIAG and dumped the log.
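To confirm that another instance was hit by the same problem, it is usually enough to look for the ORA-27157 signature in its alert log. A minimal sketch follows: count_ipc_removed_errors is a hypothetical helper name, and the log path is left to the caller because it differs per installation.

```shell
#!/bin/sh
# count_ipc_removed_errors (hypothetical helper): print the number of lines
# containing ORA-27157 ("OS post/wait facility removed").
# Reads the files given as arguments, or stdin when none are given.
count_ipc_removed_errors() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the helper usable in scripts running under `set -e`.
    grep -c 'ORA-27157' "$@" || true
}

# Usage (ALERT_LOG is a placeholder for the path to your alert log):
#   count_ipc_removed_errors "$ALERT_LOG"
```

A non-zero count together with "Identifier removed" in the neighbouring lines points to removed semaphores rather than to another cause of the crash.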
Solution
So, what is the solution? We have to set RemoveIPC=no in /etc/systemd/logind.conf and then restart the systemd-logind service or, better, reboot the whole system, as follows:
Fix the configuration
# Edit /etc/systemd/logind.conf and set, in the [Login] section:
RemoveIPC=no
# Restart the service:
systemctl restart systemd-logind.service
# or, better, restart the whole system:
systemctl reboot
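If the edit has to be repeated on several servers, it can be scripted. The following is a minimal sketch, not a definitive implementation: set_remove_ipc_no is a hypothetical helper name, and appending at the end of the file assumes that [Login] is the only section in it. It rewrites an existing RemoveIPC line (commented out or not) to RemoveIPC=no, or appends the setting when no such line exists.

```shell
#!/bin/sh
# set_remove_ipc_no (hypothetical helper): force RemoveIPC=no in the
# logind.conf-style file given as $1. Safe to run more than once.
set_remove_ipc_no() {
    conf_file="$1"
    tmp_file=$(mktemp) || return 1
    pattern='^[[:space:]]*#?[[:space:]]*RemoveIPC[[:space:]]*=.*$'
    if grep -Eq "$pattern" "$conf_file"; then
        # Replace the existing line, whether it is commented out or not.
        sed -E "s/$pattern/RemoveIPC=no/" "$conf_file" > "$tmp_file"
    else
        # No such line: append it (assumes [Login] is the only section).
        { cat "$conf_file"; printf 'RemoveIPC=no\n'; } > "$tmp_file"
    fi
    # Write back through cat so the file keeps its owner and permissions.
    cat "$tmp_file" > "$conf_file" && rm -f "$tmp_file"
}

# Usage: try it on a copy first, then on the real file as root.
#   cp /etc/systemd/logind.conf /tmp/logind.conf.test
#   set_remove_ipc_no /tmp/logind.conf.test
```

The service restart or reboot shown above is still required afterwards for the new value to take effect.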