Hi Joe:
In my experience, bare metal code can give good performance and a small memory footprint. If your application is very simple and doesn't need multiple tasks, bare metal code is a good fit.
As your application grows in size or complexity, an RTOS may be the better choice.
An RTOS can run multiple tasks concurrently and switch between them based on events and priorities. It also includes many synchronization tools by default, for example mutexes, semaphores, events, and message queues. With bare metal code you have to write and maintain these yourself, which becomes a big problem.
An RTOS has a scheduler that decides which task to execute based on each task's priority. It can suspend the running task in order to execute a higher priority task.
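The scheduling idea can be sketched in a few lines of C. This is only a simplified illustration, not how any particular RTOS is implemented: task_t, pick_next_task and should_preempt are hypothetical names, and I assume that a larger number means a higher priority (some kernels use the opposite convention).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical task descriptor; not taken from any real RTOS. */
typedef struct {
    const char *name;
    unsigned priority;  /* assumption: larger value = higher priority */
    bool ready;         /* true when the task is able to run */
} task_t;

/* Return the index of the highest-priority ready task, or -1 if none is ready. */
int pick_next_task(const task_t *tasks, size_t count)
{
    assert(tasks != NULL || count == 0);
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (!tasks[i].ready)
            continue;
        if (best < 0 || tasks[i].priority > tasks[(size_t)best].priority)
            best = (int)i;
    }
    return best;
}

/* True if the running task should be suspended in favor of the candidate. */
bool should_preempt(const task_t *running, const task_t *candidate)
{
    return candidate->ready && candidate->priority > running->priority;
}
```

A real kernel also has to save and restore the CPU context when it switches tasks, which is the part that is hard to get right by hand.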
Also, it is easier to develop with an RTOS, and the code can be ported to other platforms.
......
For the PCIe and SATA interfaces on the LS1012A processor, I would suggest asking on the QorIQ forum:
https://community.nxp.com/community/qoriq
Regards
Daniel