Just today (a specific time which I cannot tell) we had a time drif on the VM which most probably caused a FS to become a Read-Only. We don't know if the time drift caused the FS to become Read-Only, but we had to investigate anyway. There are couple command which you can use.
General trouble shooting isn't focused at specific component of the operation system, but it gives general idea of the system if there is performance issue, corruption and so on.
You can always check the logs. Very important log is:
/var/log/messages
That log records information about structural changes and other errors
You can also inspect the journals as follows:
Check journals since date
root@k8s-master:~# journalctl --since 2020-05-01 -- Logs begin at Fri 2020-05-01 13:35:29 UTC, end at Fri 2020-05-01 14:02:20 UTC. -- May 01 13:35:29 k8s-master systemd-journald[395]: Runtime journal (/run/log/journal/) is 1.2M, max 9.9M, 8.6M free. May 01 13:35:29 k8s-master kernel: Initializing cgroup subsys cpuset May 01 13:35:29 k8s-master kernel: Initializing cgroup subsys cpu May 01 13:35:29 k8s-master kernel: Initializing cgroup subsys cpuacct May 01 13:35:29 k8s-master kernel: Linux version 4.4.0-177-generic (buildd@lgw01-amd64-035) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12) ) #207-Ubuntu SMP Mon Mar 16 01:16:10 UTC 2020 (Ubuntu 4.4.0-177.2 May 01 13:35:29 k8s-master kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-4.4.0-177-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0 May 01 13:35:29 k8s-master kernel: KERNEL supported cpus: ************************************************************************
There are several commands, which can help you identify usage with the system:
Check Corruption
[root@*********** log]# e2fsck -nv /dev/mapper/oravg-oradata e2fsck 1.42.9 (28-Dec-2013) Warning! /dev/mapper/oravg-oradata is mounted. Warning: skipping journal recovery because doing a read-only filesystem check. /dev/mapper/oravg-oradata contains a file system with errors, check forced. Pass 1: Checking inodes, blocks, and sizes Pass 2: Checking directory structure Pass 3: Checking directory connectivity Pass 4: Checking reference counts Pass 5: Checking group summary information Free blocks count wrong (12969949, counted=13514969). Fix? no Free inodes count wrong (3931129, counted=3931397). Fix? no 1031 inodes used (0.03%, out of 3932160) 398 non-contiguous files (38.6%) 2 non-contiguous directories (0.2%) # of inodes with ind/dind/tind blocks: 0/0/0 Extent depth histogram: 748/7 2758691 blocks used (17.54%, out of 15728640) 0 bad blocks 3 large files 742 regular files 12 directories 0 character device files 0 block device files 0 fifos 0 links 0 symbolic links (0 fast symbolic links) 0 sockets ------------ 754 files
You can choose to trace a specific process and what commands to the kernel he is passing as follows:
Strace is a tool which can be installed and it is used to display what request the application is passing to the kernel to execute. Usually these commands are very low level: “Read pointer”, “Write data”, “Allocate latch” and etc.
First thing is to locate the session PID, in our example we want to monitor a NAGIOS client, therefore we have to minitor a process called: “xinet” which is a parent to the nagios:
[root@oraback libexec]# ps -leaf | grep -i xinet 0 S root 4622 31523 0 78 0 - 15298 pipe_w 10:33 pts/0 00:00:00 grep -i xinet 1 S root 6177 1 0 75 0 - 5416 - Jan19 ? 00:00:16 xinetd -stayalive -pidfile /var/run/xinetd.pid
Once we have located the parent process, we can start tracing:
Start tracing
[root@oraback libexec]# strace -o /tmp/x -fp 6177 Process 6177 attached - interrupt to quit Process 4644 attached (waiting for parent) Process 4644 resumed (parent 6177 ready) Process 4645 attached (waiting for parent) Process 4645 resumed (parent 4644 ready) Process 4644 suspended Process 4646 attached (waiting for parent) Process 4646 resumed (parent 4645 ready) Process 4647 attached (waiting for parent)
This command, will trace all info, into a file called: /tmp/x
In case of need, the trace can be extended to include more data using:
Enable Verbose mode
[root@oraback libexec]# strace -vo /tmp/x -fp 6177 Process 6177 attached - interrupt to quit Process 5089 attached (waiting for parent) Process 5089 resumed (parent 6177 ready) Process 5091 attached (waiting for parent) Process 5091 resumed (parent 5089 ready) Process 5089 suspended Process 5093 attached (waiting for parent) Process 5093 resumed (parent 5091 ready) Process 5093 detached Process 5089 resumed Process 5091 detached
After that we can examine the content of a trace file:
Check the log
[root@oraback libexec]# less /tmp/x 2280 <... nanosleep resumed> NULL) = 0 2280 nanosleep({1, 0}, <unfinished ...> 2286 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) 2286 futex(0x38891f8, FUTEX_WAKE_PRIVATE, 1) = 0 2286 futex(0x3889224, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 161, {1528735731, 389058000}, ffffffff <unfinished ...> 2287 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) 2279 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) 2287 futex(0x38d5658, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> 2279 futex(0x31f49f8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> 2287 <... futex resumed> ) = 0 2279 <... futex resumed> ) = 0 2287 futex(0x38d5684, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 319, {1528735726, 392983000}, ffffffff <unfinished ...> 2279 futex(0x31f4a24, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 319, {1528735726, 392985000}, ffffffff <unfinished ...> 2278 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) 2277 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) 2275 <... futex resumed> ) = -1 ETIMEDOUT (Connection timed out) 2278 futex(0x31f4968, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> 2277 futex(0x3884a78, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> 2275 futex(0x31f4b18, FUTEX_WAKE_PRIVATE, 1 <unfinished ...> 2278 <... futex resumed> ) = 0 2277 <... futex resumed> ) = 0 2275 <... futex resumed> ) = 0 2278 futex(0x31f4994, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1589, {1528735722, 422523000}, ffffffff <unfinished ...> 2277 futex(0x3884aa4, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1589, {1528735722, 422489000}, ffffffff <unfinished ...>