Yes, although it was a bit complicated for my usage on an IMX8MP. I am using the Qt5 GUI widget set with GStreamer. I used the GStreamer imxcompositor_g2d element, which uses the IMX8MP's 2D graphics engine hardware to composite a slow-frame-rate overlay stream, produced from an appsrc, on top of the video stream.
Very basically, in GStreamer (there is a lot more in my actual usage):
"imxcompositor_g2d name=c sink_1::alpha=1.0 ! waylandsink name=\"videoSink\""
"videotestsrc pattern=smpte ! c.sink_0"
"appsrc name=\"appsrc\" ! videoconvert ! c.sink_1"
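As a rough sketch of how those three fragments might be joined into one description string of the kind gst_parse_launch() accepts: buildPipelineDescription is a hypothetical helper name, not something from my actual code, and it assumes the test video feeds sink_0 while the overlay feeds sink_1.

```cpp
#include <string>

// Hypothetical helper: joins the three pipeline fragments into a single
// description string of the kind accepted by gst_parse_launch().
// Assumption: the video layer goes to sink_0 and the overlay to sink_1.
std::string buildPipelineDescription(const std::string &appsrcName,
                                     const std::string &sinkName)
{
    // Compositor output goes to the Wayland sink; sink_1 (overlay) is opaque
    // wherever the overlay image itself has opaque pixels.
    std::string desc =
        "imxcompositor_g2d name=c sink_1::alpha=1.0 ! waylandsink name=\"" +
        sinkName + "\" ";

    // Background video layer.
    desc += "videotestsrc pattern=smpte ! c.sink_0 ";

    // Overlay layer, fed by the application through the appsrc.
    desc += "appsrc name=\"" + appsrcName + "\" ! videoconvert ! c.sink_1";

    return desc;
}
```

The element names ("appsrc", "videoSink") are passed in so the application can later look the elements up by the same names.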
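Then I used Qt5 to generate the overlay in a QImage with the same size and pixel format as the frame, using normal Qt5 drawing, so anything that Qt5 can draw (any fonts, graphics etc.) will be overlaid.
In the GStreamer callback for the appsrc frame I then effectively did a memcpy from the QImage to the GStreamer buffer (strictly, the call below wraps the QImage's pixel data in a buffer rather than copying it, so the QImage has to stay alive while the buffer is in use):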
buffer = gst_buffer_new_wrapped_full((GstMemoryFlags)0, (gpointer)owOverlayImageDisplay->bits(), owOverlayImageDisplay->sizeInBytes(), 0, owOverlayImageDisplay->sizeInBytes(), NULL, NULL);
The appsrc and QImage were configured to match each other for this (same frame size and pixel format).
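To illustrate what "configured as needed" can involve, here is a minimal sketch of the numbers that have to agree between the two sides. It is not my actual configuration: the struct and function names are hypothetical, and it assumes the QImage uses QImage::Format_ARGB32, whose bytes sit in memory as B, G, R, A on a little-endian CPU, which corresponds to GStreamer's "BGRA" format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical description of the overlay stream.
struct OverlayFormat {
    int width;   // pixels, must match the QImage width
    int height;  // pixels, must match the QImage height
    int fps;     // overlay frame rate, e.g. 4
};

// Caps string for the appsrc.
// Assumption: QImage::Format_ARGB32 on a little-endian CPU, i.e. "BGRA".
std::string overlayCaps(const OverlayFormat &f)
{
    assert(f.width > 0 && f.height > 0 && f.fps > 0);
    return "video/x-raw,format=BGRA,width=" + std::to_string(f.width) +
           ",height=" + std::to_string(f.height) +
           ",framerate=" + std::to_string(f.fps) + "/1";
}

// Bytes per frame, assuming tightly packed rows at 4 bytes per pixel.
// This should equal QImage::sizeInBytes() for a 32-bit image.
std::size_t overlayFrameBytes(const OverlayFormat &f)
{
    return static_cast<std::size_t>(f.width) * f.height * 4;
}

// Duration of one overlay frame in nanoseconds (GStreamer's time unit).
std::uint64_t frameDurationNs(const OverlayFormat &f)
{
    return 1000000000ULL / static_cast<std::uint64_t>(f.fps);
}

// Presentation timestamp of the n-th overlay frame, counting from zero.
std::uint64_t framePtsNs(const OverlayFormat &f, std::uint64_t n)
{
    return n * frameDurationNs(f);
}
```

For example, with a hypothetical 1920x1080 overlay at 4 frames per second, each buffer is 1920 * 1080 * 4 bytes and each frame lasts 250 ms. The caps string would be set on the appsrc, and the duration and timestamp applied to each buffer before it is pushed.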
The result was a QImage being overlaid on the video at 4 frames per second, with relatively low CPU usage.