Hey everyone!! This one is going to be a very brief blog. System developers often run into an OOPS from the Linux kernel while tweaking kernel drivers.
A kernel OOPS is a serious but usually non-fatal error, and the report it produces helps developers debug the underlying problem.
It is like the kernel is a patient with an illness telling the doctor what is wrong, so that the developers can identify the issue and fix it.
It generally occurs when the kernel detects an invalid operation such as an illegal memory access, a NULL pointer dereference, or an invalid instruction. An OOPS doesn't necessarily mean that the system will stop working right there and then. However, the system is less reliable from that point on, and it could eventually halt.
An OOPS generates a detailed report that contains information such as the stack trace, CPU ID, stack pointer, and program counter.
While tweaking a PCIe Endpoint driver on iMX8MM, I happened to break the driver [by mistake of course :p]. Please have a look at the snapshot below:

With this OOPS, the system is telling you very clearly that the error was caused by the instruction at pci_epf_test_cmd_handler+0x4c, that is, 0x4c bytes into the function pci_epf_test_cmd_handler. This is also what the program counter (pc) shows, since the pc points at the instruction that faulted.
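If you want to pull the symbol and the offset out of a token like pci_epf_test_cmd_handler+0x4c in a script, a minimal Python sketch could look like the one below. The function name split_symbol_offset is my own (hypothetical), and the optional "/0x..." size suffix it tolerates is an assumption about how the token appears in your log.

```python
import re

# Matches "symbol+0xOFFSET", optionally followed by "/0xSIZE".
# The "/0xSIZE" part is assumed optional; it is ignored if present.
_SYM_RE = re.compile(
    r"^(?P<sym>[A-Za-z_.$][\w.$]*)\+0x(?P<off>[0-9a-fA-F]+)"
    r"(?:/0x(?P<size>[0-9a-fA-F]+))?$"
)

def split_symbol_offset(token):
    """Split 'symbol+0xoff' into (symbol, offset as int)."""
    m = _SYM_RE.match(token.strip())
    if m is None:
        raise ValueError("not a symbol+offset token: %r" % token)
    return m.group("sym"), int(m.group("off"), 16)

sym, off = split_symbol_offset("pci_epf_test_cmd_handler+0x4c")
print(sym, hex(off))  # pci_epf_test_cmd_handler 0x4c
```

The symbol name and the offset are all we need for the lookup that follows.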
Now, to check which line of code in the kernel caused this OOPS, do the following:
1. Go to the directory where your kernel build is present and, using the toolchain's 'nm' tool, look up the address of the symbol 'pci_epf_test_cmd_handler':
nxgxxx@lsv03xxxx:~/linux-imx$ aarch64-poky-linux-nm -n vmlinux | grep pci_epf_test_cmd_handler
ffff8000086280b0 t pci_epf_test_cmd_handler
2. We also have the offset of the faulting instruction within the function, which is '0x4c' as per the log. Adding this offset to the symbol address obtained in the above step, we get:
0xffff8000086280b0 + 0x4c = 0xffff8000086280fc
3. Using the toolchain's 'addr2line' utility and the 'vmlinux' ELF produced by the build (it has to be built with debug info for this to work), we get the line number in the kernel source that caused the issue:
nxgxxx@lsv03xxxx:~/linux-imx$ aarch64-poky-linux-addr2line -e vmlinux -f -i 0xffff8000086280fc
pci_epf_test_cmd_handler
/home/nxgxxx/linux-imx/drivers/pci/endpoint/functions/pci-epf-test.c:843
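The three steps above can also be sketched as a small Python helper, in case you need to do this often. This is only a sketch: the function names nm_address and fault_address are hypothetical, the nm line is the one from step 1, and the addr2line command is only built as a list and not executed, so the snippet runs even without the toolchain installed.

```python
def nm_address(nm_output, symbol):
    """Return the address of `symbol` from nm output lines ("address type name")."""
    for line in nm_output.splitlines():
        parts = line.split()
        # Exact name match, so similarly named symbols are not picked up.
        if len(parts) == 3 and parts[2] == symbol:
            return int(parts[0], 16)
    raise KeyError(symbol)

def fault_address(nm_output, symbol, offset):
    """Step 2: symbol address plus the offset reported in the OOPS."""
    return nm_address(nm_output, symbol) + offset

# Step 1: output of the nm command shown above.
nm_line = "ffff8000086280b0 t pci_epf_test_cmd_handler"

# Step 2: add the 0x4c offset from the log.
addr = fault_address(nm_line, "pci_epf_test_cmd_handler", 0x4c)
print(hex(addr))  # 0xffff8000086280fc

# Step 3: the addr2line invocation, built but not run here.
cmd = ["aarch64-poky-linux-addr2line", "-e", "vmlinux", "-f", "-i", hex(addr)]
print(" ".join(cmd))
```

On a machine that has the toolchain and the vmlinux file, the cmd list can be handed to subprocess.run() to get the function name and file:line.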

That's it. With the kind of tools we have nowadays, tracing an OOPS back to a line of source is very easy. There are alternatives too, such as inspecting a kernel dump with gdb.
One of these days, we will cover what pc, lr and other fields in an OOPS denote, just for the fun of it. Happy learning!!