Here's my calculation of PTS, DTS and frame duration:
unsigned int currentFrameTimestamp = this->getCurrentTimestamp() - this->startTime; // [ms]
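pAVPacket->pts = static_cast<int64_t>(currentFrameTimestamp) * 90; // Presentation timestamp in 90 kHz ticks (90000 = 1 second). Widened to 64 bits before multiplying so long recordings do not overflow.
pAVPacket->dts = pAVPacket->pts; // Decoding timestamp. Same as presentation timestamp.
pAVPacket->duration = static_cast<int64_t>(currentFrameTimestamp - this->previousFrameTimestamp) * 90; // Duration of the frame, converted from ms to the same 90 kHz ticks as pts/dts.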
this->previousFrameTimestamp = currentFrameTimestamp; // Used to calculate the duration of the next frame.
It plays correctly in any player and shows the right duration, even if some frames are dropped from time to time. getCurrentTimestamp() returns the system time in milliseconds.
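For anyone who wants to try the timestamp arithmetic on its own, here is a minimal sketch without any FFmpeg dependency. The names FrameClock, FrameTiming and kTicksPerMs are made up for illustration, and it assumes a 1/90000 stream time base and no B-frames (so DTS equals PTS), as in the snippet above.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the stream time base is 1/90000, so 1 ms equals 90 ticks.
constexpr int64_t kTicksPerMs = 90;

// Hypothetical container for the three values written into the packet.
struct FrameTiming {
    int64_t pts;      // presentation timestamp, in 90 kHz ticks
    int64_t dts;      // decoding timestamp, in 90 kHz ticks
    int64_t duration; // time since the previous frame, in 90 kHz ticks
};

// Hypothetical helper (not part of FFmpeg): remembers the previous
// frame's timestamp so each call can derive the frame duration.
class FrameClock {
public:
    // elapsedMs: milliseconds since the first frame; must not decrease.
    FrameTiming next(int64_t elapsedMs) {
        assert(elapsedMs >= previousMs_);
        FrameTiming t;
        t.pts = elapsedMs * kTicksPerMs;
        t.dts = t.pts; // assumes no B-frames
        t.duration = (elapsedMs - previousMs_) * kTicksPerMs;
        previousMs_ = elapsedMs;
        return t;
    }

private:
    int64_t previousMs_ = 0;
};
```

With this sketch, a frame captured 40 ms after the start gets a PTS of 3600. If the next frame only arrives at 120 ms because one was dropped, its PTS is 10800 and its duration is 7200 ticks, so the gap is reflected in the timing rather than shortening the video. In real FFmpeg code, av_rescale_q is probably the more general way to convert between time bases.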
I also figured out how to create the MP4 container along with the H.264 encoded video stream on the fly, so no post-processing step is needed to create the container. If anyone's interested, let me know and I'll post some code for it.
Cheers!