<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic gpu-viv bug when using PID namespaces in i.MX Processors</title>
    <link>https://community.nxp.com/t5/i-MX-Processors/gpu-viv-bug-when-using-PID-namespaces/m-p/1412900#M186958</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;This is a follow-up to &lt;A href="https://community.nxp.com/t5/i-MX-Processors/libGAL-segfaults-when-it-s-PID1/m-p/1388607" target="_blank" rel="noopener"&gt;https://community.nxp.com/t5/i-MX-Processors/libGAL-segfaults-when-it-s-PID1/m-p/1388607&lt;/A&gt;, which has been stale for a month. That thread is complicated and its tests were not run on the BSP, so I wanted to start fresh with a new post.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hardware: imx8mp evk SCH-46370 REV A1 with 8MPLUSLPD4 CPU board (rev x1)&lt;/P&gt;&lt;P&gt;Software: LF_v5.10.72-2.2.0_images_IMX8MPEVK BSP with 5.10.72-lts-5.10.y+ga68e31b63f86 kernel as is&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Running the following commands leads to a segfault in the second command:&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;unshare -f -p ./benchmark_model --graph=mobilenet_v1_1.0_224_quant.tflite --use_nnapi=true &amp;gt; /dev/null &amp;amp;&lt;/P&gt;&lt;P&gt;unshare -f -p ./benchmark_model --graph=mobilenet_v1_1.0_224_quant.tflite --use_nnapi=true&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;with the following backtrace:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Core was generated by `./benchmark_model --graph=mobilenet_v1_1.0_224_quant.tflite --use_nnapi=true'.
Program terminated with signal SIGSEGV, Segmentation fault.
(gdb) bt
#0  0x0000ffff964177f0 in ?? () from /usr/lib/libGAL.so
#1  0x0000ffff96417a44 in ?? () from /usr/lib/libGAL.so
#2  0x0000ffff9644df9c in ?? () from /usr/lib/libGAL.so
#3  0x0000ffff964064bc in gcoVX_CreateHW () from /usr/lib/libGAL.so
#4  0x0000ffff964066b0 in gcoVX_Construct () from /usr/lib/libGAL.so
#5  0x0000ffff964068dc in gcoVX_SwitchContext () from /usr/lib/libGAL.so
#6  0x0000ffff975440d0 in ?? () from /usr/lib/libOpenVX.so.1
#7  0x0000ffff97798eb8 in vsi_nn_CreateContext () from /usr/lib/libovxlib.so.1.1.0
#8  0x0000ffff97bb6798 in nnrt::Execution::Execution(nnrt::Compilation*) () from /usr/lib/libnnrt.so.1
#9  0x0000ffff97cda76c in ANeuralNetworksExecution_create () from /usr/lib/libneuralnetworks.so
#10 0x0000ffff981dc684 in tflite::delegate::nnapi::NNAPIDelegateKernel::Invoke(TfLiteContext*, TfLiteNode*, int*) () from /usr/lib/libtensorflow-lite.so.2.6.0
#11 0x0000ffff981ce524 in tflite::Subgraph::Invoke() () from /usr/lib/libtensorflow-lite.so.2.6.0
#12 0x0000ffff98388590 in tflite::Interpreter::Invoke() () from /usr/lib/libtensorflow-lite.so.2.6.0
#13 0x0000aaaabb45fc6c in ?? ()
#14 0x0000aaaabb462c10 in ?? ()
#15 0x0000aaaabb45de68 in ?? ()
#16 0x0000ffff97d64994 in __libc_start_main (main=0xaaaabb45d780, argc=3, argv=0xfffff4999f88, init=&amp;lt;optimized out&amp;gt;, fini=&amp;lt;optimized out&amp;gt;, rtld_fini=&amp;lt;optimized out&amp;gt;, 
    stack_end=&amp;lt;optimized out&amp;gt;) at ../csu/libc-start.c:332
#17 0x0000aaaabb45dbb8 in ?? ()&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is because the gpu-viv driver allocates resources based on PID number, and it does not cope with two resource-holding processes sharing the same PID, which happens here because each benchmark_model runs as PID 1 of its own PID namespace.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I was able to work around this issue with the following patch, also available here: &lt;A href="https://github.com/atmark-techno/linux-5.10-at/commit/b4de9635b00ba52fafc35b953f20260eb78f593e" target="_blank" rel="noopener"&gt;https://github.com/atmark-techno/linux-5.10-at/commit/b4de9635b00ba52fafc35b953f20260eb78f593e&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;From b53c1a5bcc28db552dcc28fbb52289b5d043396c Mon Sep 17 00:00:00 2001
From: Dominique Martinet &amp;lt;dominique.martinet@atmark-techno.com&amp;gt;
Date: Tue, 8 Feb 2022 16:07:12 +0900
Subject: [PATCH] gpu-viv: make galcore functions use the global init pid
 namespace

using the container namespace leads to crashes when multiple processes
have the same PID

diff --git a/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_debug.h b/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_debug.h
index 852b2f552460..2de1d984cc99 100644
--- a/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_debug.h
+++ b/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_debug.h
@@ -97,7 +97,7 @@ typedef va_list gctARGUMENTS;
 
 #if LINUX_VERSION_CODE &amp;gt;= KERNEL_VERSION(2,6,24)
 #   define gcmkGETPROCESSID() \
-        task_tgid_vnr(current)
+        task_tgid_nr(current)
 #else
 #   define gcmkGETPROCESSID() \
         current-&amp;gt;tgid
@@ -105,7 +105,7 @@ typedef va_list gctARGUMENTS;
 
 #if LINUX_VERSION_CODE &amp;gt;= KERNEL_VERSION(2,6,24)
 #   define gcmkGETTHREADID() \
-        task_pid_vnr(current)
+        task_pid_nr(current)
 #else
 #   define gcmkGETTHREADID() \
         current-&amp;gt;pid
diff --git a/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_linux.h b/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_linux.h
index a436edb11d9a..57b0629569aa 100644
--- a/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_linux.h
+++ b/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_linux.h
@@ -330,7 +330,7 @@ _GetProcessID(
     )
 {
 #if LINUX_VERSION_CODE &amp;gt;= KERNEL_VERSION(2,6,24)
-    return task_tgid_vnr(current);
+    return task_tgid_nr(current);
 #else
     return current-&amp;gt;tgid;
 #endif
diff --git a/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_os.c b/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_os.c
index 5532efadd1e1..a0b274a35288 100644
--- a/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_os.c
+++ b/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_os.c
@@ -109,7 +109,7 @@ _GetThreadID(
     )
 {
 #if LINUX_VERSION_CODE &amp;gt;= KERNEL_VERSION(2,6,24)
-    return task_pid_vnr(current);
+    return task_pid_nr(current);
 #else
     return current-&amp;gt;pid;
#endif&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'd appreciate an acknowledgement of this issue, as well as further analysis of the possible side effects of my patch, as I have no way of checking what the closed-source libGAL and other gpu-viv libraries do with that PID (e.g. there could be unwanted side effects if they notice somewhere that the PID doesn't match).&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
    <pubDate>Wed, 16 Feb 2022 23:12:42 GMT</pubDate>
    <dc:creator>martinetd</dc:creator>
    <dc:date>2022-02-16T23:12:42Z</dc:date>
    <item>
      <title>gpu-viv bug when using PID namespaces</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/gpu-viv-bug-when-using-PID-namespaces/m-p/1412900#M186958</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;This is a follow-up to &lt;A href="https://community.nxp.com/t5/i-MX-Processors/libGAL-segfaults-when-it-s-PID1/m-p/1388607" target="_blank" rel="noopener"&gt;https://community.nxp.com/t5/i-MX-Processors/libGAL-segfaults-when-it-s-PID1/m-p/1388607&lt;/A&gt;, which has been stale for a month. That thread is complicated and its tests were not run on the BSP, so I wanted to start fresh with a new post.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hardware: imx8mp evk SCH-46370 REV A1 with 8MPLUSLPD4 CPU board (rev x1)&lt;/P&gt;&lt;P&gt;Software: LF_v5.10.72-2.2.0_images_IMX8MPEVK BSP with 5.10.72-lts-5.10.y+ga68e31b63f86 kernel as is&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Running the following commands leads to a segfault in the second command:&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;unshare -f -p ./benchmark_model --graph=mobilenet_v1_1.0_224_quant.tflite --use_nnapi=true &amp;gt; /dev/null &amp;amp;&lt;/P&gt;&lt;P&gt;unshare -f -p ./benchmark_model --graph=mobilenet_v1_1.0_224_quant.tflite --use_nnapi=true&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;with the following backtrace:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Core was generated by `./benchmark_model --graph=mobilenet_v1_1.0_224_quant.tflite --use_nnapi=true'.
Program terminated with signal SIGSEGV, Segmentation fault.
(gdb) bt
#0  0x0000ffff964177f0 in ?? () from /usr/lib/libGAL.so
#1  0x0000ffff96417a44 in ?? () from /usr/lib/libGAL.so
#2  0x0000ffff9644df9c in ?? () from /usr/lib/libGAL.so
#3  0x0000ffff964064bc in gcoVX_CreateHW () from /usr/lib/libGAL.so
#4  0x0000ffff964066b0 in gcoVX_Construct () from /usr/lib/libGAL.so
#5  0x0000ffff964068dc in gcoVX_SwitchContext () from /usr/lib/libGAL.so
#6  0x0000ffff975440d0 in ?? () from /usr/lib/libOpenVX.so.1
#7  0x0000ffff97798eb8 in vsi_nn_CreateContext () from /usr/lib/libovxlib.so.1.1.0
#8  0x0000ffff97bb6798 in nnrt::Execution::Execution(nnrt::Compilation*) () from /usr/lib/libnnrt.so.1
#9  0x0000ffff97cda76c in ANeuralNetworksExecution_create () from /usr/lib/libneuralnetworks.so
#10 0x0000ffff981dc684 in tflite::delegate::nnapi::NNAPIDelegateKernel::Invoke(TfLiteContext*, TfLiteNode*, int*) () from /usr/lib/libtensorflow-lite.so.2.6.0
#11 0x0000ffff981ce524 in tflite::Subgraph::Invoke() () from /usr/lib/libtensorflow-lite.so.2.6.0
#12 0x0000ffff98388590 in tflite::Interpreter::Invoke() () from /usr/lib/libtensorflow-lite.so.2.6.0
#13 0x0000aaaabb45fc6c in ?? ()
#14 0x0000aaaabb462c10 in ?? ()
#15 0x0000aaaabb45de68 in ?? ()
#16 0x0000ffff97d64994 in __libc_start_main (main=0xaaaabb45d780, argc=3, argv=0xfffff4999f88, init=&amp;lt;optimized out&amp;gt;, fini=&amp;lt;optimized out&amp;gt;, rtld_fini=&amp;lt;optimized out&amp;gt;, 
    stack_end=&amp;lt;optimized out&amp;gt;) at ../csu/libc-start.c:332
#17 0x0000aaaabb45dbb8 in ?? ()&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is because the gpu-viv driver allocates resources based on PID number, and it does not cope with two resource-holding processes sharing the same PID, which happens here because each benchmark_model runs as PID 1 of its own PID namespace.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I was able to work around this issue with the following patch, also available here: &lt;A href="https://github.com/atmark-techno/linux-5.10-at/commit/b4de9635b00ba52fafc35b953f20260eb78f593e" target="_blank" rel="noopener"&gt;https://github.com/atmark-techno/linux-5.10-at/commit/b4de9635b00ba52fafc35b953f20260eb78f593e&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;From b53c1a5bcc28db552dcc28fbb52289b5d043396c Mon Sep 17 00:00:00 2001
From: Dominique Martinet &amp;lt;dominique.martinet@atmark-techno.com&amp;gt;
Date: Tue, 8 Feb 2022 16:07:12 +0900
Subject: [PATCH] gpu-viv: make galcore functions use the global init pid
 namespace

using the container namespace leads to crashes when multiple processes
have the same PID

diff --git a/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_debug.h b/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_debug.h
index 852b2f552460..2de1d984cc99 100644
--- a/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_debug.h
+++ b/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_debug.h
@@ -97,7 +97,7 @@ typedef va_list gctARGUMENTS;
 
 #if LINUX_VERSION_CODE &amp;gt;= KERNEL_VERSION(2,6,24)
 #   define gcmkGETPROCESSID() \
-        task_tgid_vnr(current)
+        task_tgid_nr(current)
 #else
 #   define gcmkGETPROCESSID() \
         current-&amp;gt;tgid
@@ -105,7 +105,7 @@ typedef va_list gctARGUMENTS;
 
 #if LINUX_VERSION_CODE &amp;gt;= KERNEL_VERSION(2,6,24)
 #   define gcmkGETTHREADID() \
-        task_pid_vnr(current)
+        task_pid_nr(current)
 #else
 #   define gcmkGETTHREADID() \
         current-&amp;gt;pid
diff --git a/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_linux.h b/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_linux.h
index a436edb11d9a..57b0629569aa 100644
--- a/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_linux.h
+++ b/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_linux.h
@@ -330,7 +330,7 @@ _GetProcessID(
     )
 {
 #if LINUX_VERSION_CODE &amp;gt;= KERNEL_VERSION(2,6,24)
-    return task_tgid_vnr(current);
+    return task_tgid_nr(current);
 #else
     return current-&amp;gt;tgid;
 #endif
diff --git a/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_os.c b/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_os.c
index 5532efadd1e1..a0b274a35288 100644
--- a/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_os.c
+++ b/drivers/mxc/gpu-viv/hal/os/linux/kernel/gc_hal_kernel_os.c
@@ -109,7 +109,7 @@ _GetThreadID(
     )
 {
 #if LINUX_VERSION_CODE &amp;gt;= KERNEL_VERSION(2,6,24)
-    return task_pid_vnr(current);
+    return task_pid_nr(current);
 #else
     return current-&amp;gt;pid;
#endif&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'd appreciate an acknowledgement of this issue, as well as further analysis of the possible side effects of my patch, as I have no way of checking what the closed-source libGAL and other gpu-viv libraries do with that PID (e.g. there could be unwanted side effects if they notice somewhere that the PID doesn't match).&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Feb 2022 23:12:42 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/gpu-viv-bug-when-using-PID-namespaces/m-p/1412900#M186958</guid>
      <dc:creator>martinetd</dc:creator>
      <dc:date>2022-02-16T23:12:42Z</dc:date>
    </item>
    <item>
      <title>Re: gpu-viv bug when using PID namespaces</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/gpu-viv-bug-when-using-PID-namespaces/m-p/1413765#M187023</link>
      <description>&lt;P&gt;Hello martinetd,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;We know about this issue on previous BSPs, but it is supposed to be fixed in 5.10.72_2.2.0. If you still see the failure, please let us know.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards&lt;/P&gt;</description>
      <pubDate>Tue, 15 Feb 2022 14:42:45 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/gpu-viv-bug-when-using-PID-namespaces/m-p/1413765#M187023</guid>
      <dc:creator>Bio_TICFSL</dc:creator>
      <dc:date>2022-02-15T14:42:45Z</dc:date>
    </item>
    <item>
      <title>Re: gpu-viv bug when using PID namespaces</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/gpu-viv-bug-when-using-PID-namespaces/m-p/1413977#M187039</link>
      <description>&lt;P&gt;Hello &lt;a href="https://community.nxp.com/t5/user/viewprofilepage/user-id/34846"&gt;@Bio_TICFSL&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The trace I gave here was on the 5.10.72_2.2.0 BSP.&lt;/P&gt;&lt;P&gt;In the previous thread, I incorrectly said this was fixed in this version. That was because the benchmark_model default changed from NPU to CPU, and the crash doesn't happen when using the CPU; using the NPU still fails the same way.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Wed, 16 Feb 2022 00:17:08 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/gpu-viv-bug-when-using-PID-namespaces/m-p/1413977#M187039</guid>
      <dc:creator>martinetd</dc:creator>
      <dc:date>2022-02-16T00:17:08Z</dc:date>
    </item>
    <item>
      <title>Re: gpu-viv bug when using PID namespaces</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/gpu-viv-bug-when-using-PID-namespaces/m-p/1414617#M187073</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;OK, this will need to go to the developers. It will take time to get an answer; it is possible that a fix will appear in the next release of the BSP.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regards&lt;/P&gt;</description>
      <pubDate>Wed, 16 Feb 2022 17:28:29 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/gpu-viv-bug-when-using-PID-namespaces/m-p/1414617#M187073</guid>
      <dc:creator>Bio_TICFSL</dc:creator>
      <dc:date>2022-02-16T17:28:29Z</dc:date>
    </item>
    <item>
      <title>Re: gpu-viv bug when using PID namespaces</title>
      <link>https://community.nxp.com/t5/i-MX-Processors/gpu-viv-bug-when-using-PID-namespaces/m-p/1414837#M187085</link>
      <description>&lt;P&gt;Thank you! Please do not hesitate to reach out to me if you or the developers have any questions.&lt;/P&gt;</description>
      <pubDate>Wed, 16 Feb 2022 23:13:40 GMT</pubDate>
      <guid>https://community.nxp.com/t5/i-MX-Processors/gpu-viv-bug-when-using-PID-namespaces/m-p/1414837#M187085</guid>
      <dc:creator>martinetd</dc:creator>
      <dc:date>2022-02-16T23:13:40Z</dc:date>
    </item>
  </channel>
</rss>

