Hi Patrick,
I have generated high-speed PWM signals two ways. I don't think either way is all that "elegant", and "cost-effective" will be determined by your application. They were cost-effective for me.
The first was to use the PWM circuit in a DSP. With the Freescale DSP56721 (which I would NOT recommend) running at 200 MHz (100 MHz for the timer), I was able to get 10-nanosecond resolution, which gives you one part in 200 if set for a 500 kHz PWM frequency (a 2-microsecond period). The period is also adjustable with 10-nanosecond resolution. It was cost-effective because I needed the DSP for math anyway.
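To make the arithmetic behind those numbers explicit, here is a rough sketch in Python. The function names are my own for illustration and have nothing to do with the DSP56721 registers; it only assumes the timer counts one tick per timer clock cycle.

```python
def tick_ns(timer_clock_hz):
    """Length of one timer tick in nanoseconds (the edge resolution)."""
    return 1e9 / timer_clock_hz


def pwm_steps(timer_clock_hz, pwm_freq_hz):
    """Number of timer ticks in one PWM period, i.e. distinct duty steps."""
    return timer_clock_hz // pwm_freq_hz


# Numbers from the post: 100 MHz timer clock, 500 kHz PWM frequency.
print(tick_ns(100_000_000))             # 10.0 ns per tick
print(pwm_steps(100_000_000, 500_000))  # 200 steps per period
```

The same two lines tell you what you give up if you raise the PWM frequency: doubling it to 1 MHz on the same timer clock halves the step count to 100.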
Less cost-effective, but more elegant, is to build the PWM in an FPGA and control it from the MCU. Since most of my designs use an FPGA anyway, it was an easy choice for me. An advantage is that you can build the PWM circuit to your exact specifications, limited only by the clock you have available.
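For anyone who has not built one, the usual FPGA approach is a free-running counter plus a comparator. Below is a small behavioral model of that idea in Python, not HDL and not my actual design; the names period_ticks and duty_ticks are hypothetical stand-ins for the two registers the MCU would write.

```python
def pwm_period(period_ticks, duty_ticks):
    """Model one PWM period of a counter-compare circuit.

    The counter runs 0 .. period_ticks - 1, one step per clock.
    The output is high (1) while the counter is below duty_ticks.
    Returns the output level for each clock tick as a list.
    """
    return [1 if count < duty_ticks else 0 for count in range(period_ticks)]


# 200 ticks per period with 50 ticks high is a 25% duty cycle.
wave = pwm_period(200, 50)
print(sum(wave) / len(wave))  # 0.25
```

In real hardware the comparator and counter are a handful of flip-flops and LUTs, so the resolution is set by whatever clock you can route to them.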
I would also like to hear of ways to do it directly from an 8-bit MCU.