Unable to get bounding box and label information when using tflite model for detection on imx93 FRDM

Aditya_Vashista
Contributor II

Hello,

I am trying to write a Python application for the FRDM i.MX 93 that detects objects using the provided ssdlite_mobilenet_v2_coco_quant_uint8_float32_no_postprocess_vela.tflite model, which is downloaded by the GoPoint detection demo. I want to get the bounding-box coordinates and labels from the tensor converter and decoder (via GStreamer) in my application, in order to customize the overlays from Python and also do special processing for particular detected classes.

After reading the NNStreamer documentation I enabled option7 on the tensor decoder in the hope of getting the needed data, but I am not getting it. Instead, I am getting a byte stream of 1,228,800 bytes, which corresponds to a 640x480x4 frame. When I try to visualize this frame using OpenCV (cv2), it comes out as a dark black frame with blue overlays of bounding boxes and labels wherever an object is in front of the camera.
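
For reference, this is roughly how I visualize that buffer (a minimal sketch; treating the output as RGBA is my assumption, based on 640 x 480 x 4 = 1,228,800 bytes):

import numpy as np
import cv2

def show_decoder_output(raw_data):
    # 640 * 480 * 4 = 1,228,800 bytes, so I interpret the buffer as RGBA
    frame = np.frombuffer(raw_data, dtype=np.uint8).reshape(480, 640, 4)
    # OpenCV expects BGR, so convert before displaying
    cv2.imshow("tensor_decoder output", cv2.cvtColor(frame, cv2.COLOR_RGBA2BGR))
    cv2.waitKey(1)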

I also tried adding option7=1 in the batch script used by the detection demo, but I still can't get the bounding-box and label info on the terminal.

My pipeline string is as follows:

# Imports used by the snippets in this post
import json
import numpy as np
import cv2
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

pipeline_str = (
    "v4l2src name=cam_src device=/dev/video0 num-buffers=-1 ! "
    "video/x-raw,width=640,height=480,framerate=30/1 ! "
    "tee name=t "

    # NN branch: resize to the 300x300 model input, run on the NPU, decode
    "t. ! queue name=thread-nn max-size-buffers=2 leaky=2 ! "
    "imxvideoconvert_pxp ! video/x-raw,width=300,height=300,format=BGR ! "
    "videoconvert ! video/x-raw,format=RGB ! "
    "tensor_converter ! "
    "tensor_filter framework=tensorflow-lite model=/opt/gopoint-apps/downloads/ssdlite_mobilenet_v2_coco_quant_uint8_float32_no_postprocess_vela.tflite "
    "custom=Delegate:External,ExtDelegateLib:libethosu_delegate.so ! "
    "tensor_decoder mode=bounding_boxes "
    "option1=mobilenet-ssd "
    "option2=/opt/gopoint-apps/downloads/coco_labels_list.txt "
    "option3=/opt/gopoint-apps/downloads/box_priors.txt:0.5:10.0:10.0:0.5:0.5:0.5 "
    "option4=640:480 option5=300:300 option7=1 ! "
    "appsink name=npusink emit-signals=true max-buffers=1 drop=true "

    # Frame branch: full-resolution BGR frames for OpenCV
    "t. ! queue name=thread-frame max-size-buffers=2 leaky=2 ! "
    "videoconvert ! video/x-raw,format=BGR ! appsink name=framesink emit-signals=true max-buffers=1 drop=true"
)
pipeline = Gst.parse_launch(pipeline_str)
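
Side note: if the decoder output really is an RGBA overlay video rather than structured data, I could presumably composite it onto the camera frames inside GStreamer, e.g. with compositor as sketched below. This is not what I ultimately want, since I need the numeric data in Python, but it would at least confirm what the decoder is emitting. Untested sketch:

overlay_pipeline_str = (
    "v4l2src device=/dev/video0 ! "
    "video/x-raw,width=640,height=480,framerate=30/1 ! tee name=t "
    "t. ! queue max-size-buffers=2 leaky=2 ! videoconvert ! comp.sink_0 "
    "t. ! queue max-size-buffers=2 leaky=2 ! "
    "imxvideoconvert_pxp ! video/x-raw,width=300,height=300,format=BGR ! "
    "videoconvert ! video/x-raw,format=RGB ! tensor_converter ! "
    "tensor_filter framework=tensorflow-lite "
    "model=/opt/gopoint-apps/downloads/ssdlite_mobilenet_v2_coco_quant_uint8_float32_no_postprocess_vela.tflite "
    "custom=Delegate:External,ExtDelegateLib:libethosu_delegate.so ! "
    "tensor_decoder mode=bounding_boxes option1=mobilenet-ssd "
    "option2=/opt/gopoint-apps/downloads/coco_labels_list.txt "
    "option3=/opt/gopoint-apps/downloads/box_priors.txt:0.5:10.0:10.0:0.5:0.5:0.5 "
    "option4=640:480 option5=300:300 ! comp.sink_1 "
    "compositor name=comp sink_1::zorder=2 ! videoconvert ! autovideosink"
)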

 

My sink processing functions are as follows:

def get_frame(frame_sink):
    sample = frame_sink.emit("pull-sample")
    if sample:
        buffer = sample.get_buffer()
        success, map_info = buffer.map(Gst.MapFlags.READ)
        if success:
            frame_data = map_info.data
            buffer.unmap(map_info)
            return frame_data
    else:
        print("No data got from FRAME SINK!")
    return None

def get_detections(npu_sink):
    sample = npu_sink.emit("pull-sample")
    if sample:
        buffer = sample.get_buffer()
        success, map_info = buffer.map(Gst.MapFlags.READ)
        # I also tried inspecting attached metadata here, but this call raises
        # a TypeError (api_type_get_tags needs an API type argument), so it is
        # commented out:
        # meta = Gst.Buffer.get_meta(buffer, Gst.Meta.api_type_get_tags())
        # print(meta)
        if success:
            try:
                raw_data = map_info.data
                print("NPU DATA type: ", type(raw_data), " len: ", len(raw_data))
                if not isinstance(raw_data, bytes):
                    raw_data = bytes(raw_data)
                # I expected structured text, so I tried parsing it as JSON
                data_str = raw_data.decode("utf-8", errors="replace")
                detection_data = json.loads(data_str)
                buffer.unmap(map_info)
                return (detection_data.get("bounding_boxes", []),
                        detection_data.get("labels", []))
                # Alternative I also tried: interpreting the buffer as float32
                # rows of (x, y, w, h, class_id, conf):
                # array = np.frombuffer(map_info.data, dtype=np.float32)
                # buffer.unmap(map_info)
                # num_detections = len(array) // 6
                # array = array.reshape((num_detections, 6))
                # filtered = array[array[:, 5] > 0]
                # print(filtered)
                # detections = []
                # for x, y, w, h, class_id, conf in filtered:
                #     detections.append({"x": x, "y": y, "w": w, "h": h,
                #                        "class_id": class_id, "conf": conf})
                # return detections
            except Exception as e:
                print("error parsing NPU data: ", e)
                buffer.unmap(map_info)
    else:
        print("No data got from NPU SINK!")
    return [], []
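
As a fallback, I am also considering dropping tensor_decoder entirely and pulling the raw output tensors from an appsink placed right after tensor_filter, then doing the SSD box decoding myself in Python. A minimal sketch of the pulling side (my assumption: each model output tensor arrives as its own GstMemory in the buffer, and the dtype below is a guess for this model):

def pull_raw_tensors(raw_sink):
    # Pull one other/tensors sample from an appsink connected directly
    # after tensor_filter (i.e. without tensor_decoder in between)
    sample = raw_sink.emit("pull-sample")
    if sample is None:
        return []
    buffer = sample.get_buffer()
    tensors = []
    for i in range(buffer.n_memory()):
        mem = buffer.peek_memory(i)
        ok, info = mem.map(Gst.MapFlags.READ)
        if ok:
            # dtype is a guess; the real type depends on the model's outputs
            tensors.append(np.frombuffer(info.data, dtype=np.float32).copy())
            mem.unmap(info)
    return tensors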

def main():
    global pipeline, npusink, framesink

    pipeline = start_pipeline()
    while True:
        state = pipeline.get_state(Gst.CLOCK_TIME_NONE)
        # print("Pipeline Status: ", state)
        frame = get_frame(framesink)
        bounding_boxes, labels = get_detections(npusink)
        # print("boxes: ", len(bounding_boxes))
        # print("labels: ", len(labels))

        if frame is not None:
            frame_np = np.frombuffer(frame, dtype=np.uint8).reshape(480, 640, 3)
            # TODO: extract the bounding box and label information from npusink
            # and pass the frame, bounding-box info and labels to
            # recognize_faces() to get overlays and unique person detections
            cv2.imshow("Detection Output", frame_np)

        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cleanup()
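
Once I do get the box and label data in Python, the overlay step itself (the TODO above) should be straightforward; what I have in mind is roughly this, assuming pixel-space boxes as x, y, w, h (the names here are mine, not from the demo):

def draw_detections(frame_np, detections, labels):
    # detections: list of dicts with pixel-space x, y, w, h, class_id, conf
    for det in detections:
        x, y, w, h = (int(det["x"]), int(det["y"]),
                      int(det["w"]), int(det["h"]))
        name = labels[int(det["class_id"])] if labels else str(det["class_id"])
        cv2.rectangle(frame_np, (x, y), (x + w, y + h), (255, 0, 0), 2)
        cv2.putText(frame_np, "%s %.2f" % (name, det["conf"]),
                    (x, max(y - 5, 0)), cv2.FONT_HERSHEY_SIMPLEX,
                    0.5, (255, 0, 0), 1)
    return frame_np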

 

Please help me with this query. Also, my hypothesis is that the model provided in the detection example is not compatible with option7 of the tensor decoder, and hence I am not getting the structured detection information but a completely processed, overlaid black frame instead. If so, can you suggest another compatible model that can do the same thing? It would be great if you could also provide the model properties.

 

P.S.: I am also attaching the demo application:
