Hi,
We are getting the kernel crash below when running "insmod" on a kernel module built for the T1040 processor with a 64-bit toolchain.
root@t1040rdb:/media/ram# insmod linux-kernel-bde.ko
linux_kernel_bde: module license 'Proprietary' taints kernel.
Disabling lock debugging due to kernel taint
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0x80000000001a0758
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=24 CoreNet Generic
Modules linked in: linux_kernel_bde(PO+)
CPU: 0 PID: 2700 Comm: insmod Tainted: P O 3.12.19-rt30-QorIQ-SDK-V1.6+gc29fe1a #3
task: c000000006937400 ti: c00000000657c000 task.ti: c00000000657c000
NIP: 80000000001a0758 LR: 800000000019b274 CTR: c00000000036ddb4
REGS: c00000000657f7a0 TRAP: 0300 Tainted: P O (3.12.19-rt30-QorIQ-SDK-V1.6+gc29fe1a)
MSR: 0000000080029000 <CE,EE,ME> CR: 44000444 XER: 20000000
SOFTE: 1
DEAR: 0000000000000000, ESR: 0000000000000000
GPR00: 800000000019b268 c00000000657fa20 80000000001a8ad0 000000000000002a
GPR04: 0000000044000444 000000000000000d 0000000000000008 0000000000000008
GPR08: 0000000000000000 0000000000000001 00000001a66b1cbc 0000000000000000
GPR12: 0000000024000442 c00000000fff4000 80000000001a7fc8 0000000000000154
GPR16: 0000000000000018 c000000000b7c518 0000000000000000 0000000000000124
GPR20: c000000000afc210 c00000000657fdc0 0000000000000001 80000000001a0b50
GPR24: c0000000069df1c0 c0000000007d9648 0000000000000001 c000000000b3e980
GPR28: 800000000019d118 80000000001a2068 ffffffffffffffed 800000000019c318
NIP [80000000001a0758] gmodule_get+0x0/0xffffffffffffbad8 [linux_kernel_bde]
LR [800000000019b274] ____versions+0x169ac/0x17968 [linux_kernel_bde]
Call Trace:
[c00000000657fa20] [800000000019b268] ____versions+0x169a0/0x17968 [linux_kernel_bde] (unreliable)
[c00000000657fab0] [c00000000000184c] .do_one_initcall+0x14c/0x1a0
[c00000000657fba0] [c0000000000acfc8] .load_module+0x1ea4/0x2394
[c00000000657fd40] [c0000000000ad564] .SyS_init_module+0xac/0xec
[c00000000657fe30] [c000000000000598] syscall_exit+0x0/0x8c
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
---[ end trace 75186f9f417e1c86 ]---
Segmentation fault
root@t1040rdb:/media/ram#
However, when the same module is compiled for a 32-bit kernel with a 32-bit toolchain, it works perfectly. Is some configuration missing? Can someone suggest how to debug this further?
Regards,
Chandra Shekhar
Your module crashed in what appears to be the first instruction of the gmodule_get function (which is oddly reported as having a negative size). Use gdb or objdump to see what that instruction is, and try to figure out what's going wrong.
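To make the "negative size" concrete: the oops prints the symbol as name+offset/size, and the size field here is 0xffffffffffffbad8. Below is a minimal user-space C sketch that reads that 64-bit value as a signed number. The helper names are made up for illustration and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Reinterpret a 64-bit value as a two's-complement signed number.
 * Written without an out-of-range cast, so the result does not rely on
 * implementation-defined conversion behaviour.
 * as_signed64 is a hypothetical helper name.
 */
int64_t as_signed64(uint64_t raw)
{
    if (raw <= (uint64_t)INT64_MAX)
        return (int64_t)raw;
    /* For the upper half of the range: value = -(~raw) - 1. */
    return -(int64_t)(~raw) - 1;
}

/* Size field taken from "gmodule_get+0x0/0xffffffffffffbad8" in the oops. */
int64_t gmodule_get_reported_size(void)
{
    int64_t size = as_signed64(0xffffffffffffbad8ULL);
    assert(size < 0);   /* this is the "negative size" */
    return size;        /* -0x4528 */
}
```

Read this way, the reported size is -0x4528 (-17704 bytes). That hints that the address lies outside the range the symbol lookup expected for it, although the oops alone does not establish why.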
Hi Scott,
I took an objdump of the module to investigate the failure, but could not work out where the negative size comes from. Below are the code and objdump snippets:
/* code for module insertion =====>
int __init
init_module(void)
{
    int rc;
    printk("chanmish[%s,%d]!!!!\n", __FUNCTION__,__LINE__);
    printk("chanmish[%s,%d] %p\n", __FUNCTION__,__LINE__,gmodule_get);
    /* Get our definition */
    _gmodule = gmodule_get();
    if(!_gmodule) return -ENODEV;
/* objdump for above section =====>
0000000000000000 <.init_module>:
0: 7c 08 02 a6 mflr r0
4: fb e1 ff f8 std r31,-8(r1)
8: 3f e2 00 00 addis r31,r2,0
c: f8 01 00 10 std r0,16(r1)
10: fb 81 ff e0 std r28,-32(r1)
14: 3b ff 00 00 addi r31,r31,0
18: fb a1 ff e8 std r29,-24(r1)
1c: 3f 82 00 40 addis r28,r2,64
20: fb c1 ff f0 std r30,-16(r1)
24: 3b ff 00 30 addi r31,r31,48
28: f8 21 ff 71 stdu r1,-144(r1)
2c: 3b 9c 00 40 addi r28,r28,64
30: 7f e4 fb 78 mr r4,r31
34: 38 a0 01 8f li r5,399
38: 7f 83 e3 78 mr r3,r28
3c: 3f a2 00 00 addis r29,r2,0
40: 48 00 00 01 bl 40 <.init_module+0x40>
44: 60 00 00 00 nop
48: 3d 22 00 00 addis r9,r2,0
4c: e8 c9 00 00 ld r6,0(r9)
50: 3c 62 00 58 addis r3,r2,88
54: 7f e4 fb 78 mr r4,r31
58: 38 a0 01 90 li r5,400
5c: 38 63 00 58 addi r3,r3,88
60: 3b bd 00 00 addi r29,r29,0
64: 48 00 00 01 bl 64 <.init_module+0x64>
68: 60 00 00 00 nop
6c: 3b c0 ff ed li r30,-19
70: 48 00 00 01 bl 70 <.init_module+0x70>
74: 60 00 00 00 nop
78: 2f a3 00 00 cmpdi cr7,r3,0
7c: f8 7d 00 08 std r3,8(r29)
80: 41 fe 01 10 beq+ cr7,190 <.init_module+0x190>
84: 7f e4 fb 78 mr r4,r31
88: 38 a0 01 95 li r5,405
8c: 7f 83 e3 78 mr r3,r28
90: 48 00 00 01 bl 90 <.init_module+0x90>
/* code for gmodule_get =====>
gmodule_t *
gmodule_get(void)
{
    printk(KERN_ERR "chanmish[%s,%d]\n",__FUNCTION__,__LINE__);
    _gmodule.name = _modname;
    return &_gmodule;
}
/* objdump for above section =====>
0000000000004880 <.gmodule_get>:
4880: 7c 08 02 a6 mflr r0
4884: 3c 82 00 00 addis r4,r2,0
4888: f8 01 00 10 std r0,16(r1)
488c: 38 84 00 00 addi r4,r4,0
4890: f8 21 ff 91 stdu r1,-112(r1)
4894: 3c 62 09 60 addis r3,r2,2400
4898: 38 84 00 20 addi r4,r4,32
489c: 38 a0 0e ad li r5,3757
48a0: 38 63 09 60 addi r3,r3,2400
48a4: 48 00 00 01 bl 48a4 <.gmodule_get+0x24>
48a8: 60 00 00 00 nop
48ac: 38 21 00 70 addi r1,r1,112
48b0: e8 01 00 10 ld r0,16(r1)
48b4: 3c 62 00 00 addis r3,r2,0
48b8: 38 63 00 00 addi r3,r3,0
48bc: 38 63 01 e8 addi r3,r3,488
48c0: 7c 08 03 a6 mtlr r0
48c4: 4e 80 00 20 blr
48c8: 00 00 00 00 .long 0x0
48cc: 00 00 00 01 .long 0x1
48d0: 80 00 00 00 lwz r0,0(0)
48d4: 60 00 00 00 nop
48d8: 60 00 00 00 nop
48dc: 60 00 00 00 nop
Please suggest what else I should try to resolve this issue.
Regards,
Chandra Shekhar
Is this the exact code that corresponds to the crash dump you posted? You don't get any output from the printk statements?
It's strange that the instruction dump was all XXXXXXXX.
How did you build the module? You used the same headers and config as the running kernel?
I tried building a simple out-of-tree module with SDK 1.6, and did not have this problem.
Hi Scott,
Below is the correct dump for the code/objdump snippet I shared. I am getting printk messages from "init_module" but not from "gmodule_get".
root@t1040rdb:/media/ram# insmod linux-kernel-bde.ko
linux_kernel_bde: module license 'Proprietary' taints kernel.
Disabling lock debugging due to kernel taint
chanmish[init_module,399]!!!!
chanmish[init_module,400] 80000000001a0758
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0x80000000001a0758
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=24 CoreNet Generic
Modules linked in: linux_kernel_bde(PO+)
CPU: 0 PID: 2700 Comm: insmod Tainted: P O 3.12.19-rt30-QorIQ-SDK-V1.6+gc29fe1a #3
task: c000000006937400 ti: c00000000657c000 task.ti: c00000000657c000
NIP: 80000000001a0758 LR: 800000000019b274 CTR: c00000000036ddb4
REGS: c00000000657f7a0 TRAP: 0300 Tainted: P O (3.12.19-rt30-QorIQ-SDK-V1.6+gc29fe1a)
MSR: 0000000080029000 <CE,EE,ME> CR: 44000444 XER: 20000000
SOFTE: 1
DEAR: 0000000000000000, ESR: 0000000000000000
GPR00: 800000000019b268 c00000000657fa20 80000000001a8ad0 000000000000002a
GPR04: 0000000044000444 000000000000000d 0000000000000008 0000000000000008
GPR08: 0000000000000000 0000000000000001 00000001a66b1cbc 0000000000000000
GPR12: 0000000024000442 c00000000fff4000 80000000001a7fc8 0000000000000154
GPR16: 0000000000000018 c000000000b7c518 0000000000000000 0000000000000124
GPR20: c000000000afc210 c00000000657fdc0 0000000000000001 80000000001a0b50
GPR24: c0000000069df1c0 c0000000007d9648 0000000000000001 c000000000b3e980
GPR28: 800000000019d118 80000000001a2068 ffffffffffffffed 800000000019c318
NIP [80000000001a0758] gmodule_get+0x0/0xffffffffffffbad8 [linux_kernel_bde]
LR [800000000019b274] ____versions+0x169ac/0x17968 [linux_kernel_bde]
Call Trace:
[c00000000657fa20] [800000000019b268] ____versions+0x169a0/0x17968 [linux_kernel_bde] (unreliable)
[c00000000657fab0] [c00000000000184c] .do_one_initcall+0x14c/0x1a0
[c00000000657fba0] [c0000000000acfc8] .load_module+0x1ea4/0x2394
[c00000000657fd40] [c0000000000ad564] .SyS_init_module+0xac/0xec
[c00000000657fe30] [c000000000000598] syscall_exit+0x0/0x8c
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
---[ end trace 75186f9f417e1c86 ]---
Segmentation fault
root@t1040rdb:/media/ram#
Also, the kernel module (linux-kernel-bde) is built with the same toolchain and header files that are used for the Linux kernel (uImage) build. However, the filesystem is built with bitbake.
Regards,
Chandra Shekhar
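One observation on this dump: the address that %p printed for gmodule_get (80000000001a0758) is identical to the faulting NIP. On 64-bit PowerPC with the ELFv1 ABI, a C function pointer is the address of a function descriptor, not of the first instruction. A descriptor begins with the entry address, whose upper 32 bits would be 0x80000000 for the module addresses shown in the oops. It is only an assumption that the CPU actually branched to the descriptor. The user-space C sketch below (hypothetical helper names) decodes that word as a PowerPC D-form instruction, to show what it would do if executed:

```c
#include <assert.h>
#include <stdint.h>

#define PPC_OPCODE_LWZ 32u

/* Fields of a PowerPC D-form instruction. */
struct dform {
    unsigned opcode;  /* most significant 6 bits */
    unsigned rt;      /* target register */
    unsigned ra;      /* base register; 0 means literal zero for loads */
    int32_t  d;       /* sign-extended 16-bit displacement */
};

struct dform decode_dform(uint32_t insn)
{
    struct dform f;
    uint32_t lo = insn & 0xffffu;

    f.opcode = insn >> 26;
    f.rt = (insn >> 21) & 0x1fu;
    f.ra = (insn >> 16) & 0x1fu;
    f.d = (lo < 0x8000u) ? (int32_t)lo : (int32_t)lo - 0x10000;
    return f;
}

/* Upper 32 bits of a 64-bit address, i.e. the first word stored big-endian. */
uint32_t upper_word(uint64_t addr)
{
    return (uint32_t)(addr >> 32);
}

/* Effective address of a D-form load given the register file. */
uint64_t dform_effective_address(struct dform f, const uint64_t gpr[32])
{
    uint64_t base = (f.ra == 0) ? 0 : gpr[f.ra];
    return base + (uint64_t)(int64_t)f.d;
}

/* Sanity check: 0x80000000 decodes as "lwz r0,0(0)". */
void check_module_address_word(void)
{
    struct dform f = decode_dform(0x80000000u);
    assert(f.opcode == PPC_OPCODE_LWZ && f.rt == 0 && f.ra == 0 && f.d == 0);
}
```

Decoded this way, 0x80000000 is lwz r0,0(0), a load from address 0. That would be consistent with the "data at address 0x00000000" fault, and objdump renders the same byte pattern as lwz r0,0(0) at offset 48d0 in the earlier listing.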
Hi Scott,
One more thing I noticed: the objdump generated for the .ko module shows an "out of bounds" message in a few places. Could this be the reason for the crash? I am not getting any clue how to proceed.
00000000000003f1 <__UNIQUE_ID_vermagic0>:
3f1: 76 65 72 6d andis. r5,r19,29293
3f5: 61 67 69 63 ori r7,r11,26979
3f9: 3d 33 2e 31 addis r9,r19,11825
3fd: 32 2e 31 39 addic r17,r14,12601
401: 2d 72 74 33 cmpdi cr2,r18,29747
405: 30 2d 51 6f addic r1,r13,20847
409: 72 49 51 2d andi. r9,r18,20781
40d: 53 44 4b 2d rlwimi. r4,r26,9,12,22
411: 56 31 2e 36 rlwinm r17,r17,5,24,27
415: 2b 67 63 32 cmpldi cr6,r7,25394
419: 39 66 65 31 addi r11,r6,25905
41d: 61 20 53 4d ori r0,r9,21325
421: 50 20 6d 6f rlwimi. r0,r1,13,21,23
425: 64 5f 75 6e oris r31,r2,30062
429: 6c 6f 61 64 xoris r15,r3,24932
42d: 20 6d 6f 64 subfic r3,r13,28516
431: 76 65 72 73 andis. r5,r19,29299
435: 69 6f 6e 73 xori r15,r11,28275
439: Address 0x0000000000000439 is out of bounds.
Regards,
Chandra
That looks like you're disassembling something that isn't code.
Could you show precisely how you are building this module?
Hi Scott,
Sorry for the very late reply; I was busy with some other priority issues. The kernel module in question is built using the standard out-of-tree module build process. Below is the Makefile for it.
MODULE := $(MOD_NAME).o
KMODULE := $(MOD_NAME).ko
PRE_COMPILED_OBJ := obj_$(MOD_NAME).o
obj-m := $(MODULE)
$(MOD_NAME)-y := $(MODULE_SYM) $(PRE_COMPILED_OBJ)
ifeq (,$(CROSS_COMPILE))
# CROSS compiler is powerpc64-fsl_networking-linux-
export CROSS_COMPILE
endif
SAVE_CFLAGS := ${CFLAGS}
include $(SDK)/make/Make.config
PWD := $(shell pwd)
ifneq ($(ARCH),)
# ARCH is powerpc
A := ARCH=$(ARCH)
export ARCH
endif
# Standard SDK include path for building source files that export
# kernel symbols.
override EXTRA_CFLAGS = -I${SDK}/include -I${SDK}/systems/linux/kernel/modules/include -I${SDK}/systems/bde/linux/include
# The precompiled object needs a dummy command file to avoid warnings
# from the Kbuild scripts (modpost stage).
# Kernels before 2.6.17 do not support external module symbols files,
# so we create a dummy to prevent build failures.
$(KMODULE):
	rm -f *.o *.ko .*.cmd
	rm -fr .tmp_versions
	ln -s $(LIBDIR)/$(MODULE) $(PRE_COMPILED_OBJ)_shipped
	echo "suppress warning" > .$(PRE_COMPILED_OBJ).cmd
	$(MAKE) -C $(KERNDIR) CROSS_COMPILE=$(CROSS_COMPILE) M=$(PWD) modules
	if [ ! -f Module.symvers ]; then echo "old kernel (pre-2.6.17)" > Module.symvers; fi
	cp -f $(KMODULE) $(LIBDIR)
	rm -f $(PRE_COMPILED_OBJ)_shipped
EXTRA_CFLAGS = $(CFLAGS)
CFLAGS := ${SAVE_CFLAGS}
That is not any "standard out of kernel directory module build process" that I've seen before, nor do I know what "$(SDK)/make/Make.config" is.
See https://www.kernel.org/doc/Documentation/kbuild/modules.txt for the standard procedure.
Hi Scott,
The line "include $(SDK)/make/Make.config" pulls in a file that defines some macros used in the code base. It is specific to the SDK code.
Regards,
Chandra
Have you tried compiling it in-tree? Or as built-in (instead of as a module)?
Hi Max,
The module in question is part of a vendor-provided SDK, so it is not possible to compile it in-tree. The same code base compiles and works perfectly with a 32-bit kernel and toolchain. The only issue is when we compile with a 64-bit kernel and toolchain. I suspect there is some issue with linking.
Hi Scott/Max,
I added the gcc flag "-mlongcall" while building and got rid of the crash during insmod. What exactly does this flag do?
Regards,
Chandra
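For context on the flag: GCC's RS/6000 and PowerPC options describe -mlongcall as making the compiler assume every call may be far away and emit a longer calling sequence, which is required for targets more than 32 MB from the call site. A plain bl carries a 24-bit word offset, so it reaches only about +/-32 MB. Below is a minimal user-space C sketch of that range check (hypothetical function names). It is applied to the addresses from the oops, on the assumption that the call was the direct bl at LR - 4:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if a PowerPC I-form branch (b/bl) located at `site` can reach
 * `target` directly. The LI field is 24 bits and is shifted left by 2,
 * giving a signed 26-bit byte displacement: -0x2000000 .. +0x1fffffc.
 */
int bl_can_reach(uint64_t site, uint64_t target)
{
    uint64_t dist;

    if (target >= site) {
        dist = target - site;
        return (dist % 4 == 0) && dist <= 0x1fffffcULL;
    }
    dist = site - target;
    return (dist % 4 == 0) && dist <= 0x2000000ULL;
}

/* Call site assumed to be LR - 4; target is the faulting NIP. */
int oops_call_was_in_range(void)
{
    const uint64_t lr  = 0x800000000019b274ULL;
    const uint64_t nip = 0x80000000001a0758ULL;

    assert(nip - (lr - 4) == 0x54e8);  /* displacement in bytes */
    return bl_can_reach(lr - 4, nip);
}
```

Under that assumption the displacement is only 0x54e8 bytes, well inside the bl range, so distance alone would not explain the crash. With -mlongcall GCC instead emits an indirect call through a register, which is a different code path from the direct bl.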