Thanks to everyone for the responses. I submitted a service request to Freescale regarding this issue and they gave me a copy of the validation code for the watchdog module in Kinetis K devices. It included a test with these comments:
/*********STARTFUNC**************************************************
* Name: watchdog_enable_in_lpoclk
* Purpose: To verify the bug fix IPGear ticket 20832 .
* Algorithm:
* Inputs: NONE
* Outputs: NONE
* Return value: NONE
* Assumptions : NONE
* Child Functions used: NONE
* PASS Criteria: global_fail_count == 0, global_pass_count==0xC
***********ENDFUNC***************************************************/
//The root cause is that update information need 2-3 watchdog clock synchronized to watchdog counter domain.
//and lpo clock is 1KHZ, divide by 5 to generate watchdog counter clock
//it need 10 ms to synchronize this to watchdog clock domain
//usd ptd10 toggle to check this time gap.
I think this is the same 2-3 watchdog counter clock cycle requirement that zhaohuiliu suggested. I tried adding a delay of at least 2-3 LPO clock cycles between refresh attempts, and that solved my problem. In terms of my bus clock, this meant adding a delay of 200,000 bus clock cycles to my application's main loop, just so that I could use the LPO clock.
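To illustrate the spacing requirement, here is a minimal sketch in C of a refresh guard that skips a refresh when the previous one was too recent, instead of blocking the main loop. It is only a sketch under stated assumptions: all function and type names are hypothetical, `fake_wdog_refresh` stands in for the real WDOG_REFRESH register so the code builds on a host, the tick source (for example a 1 ms timer) is assumed to exist elsewhere, and the minimum gap is left as a parameter because the right value depends on the watchdog clock and prescaler in use. The values 0xA602 and 0xB480 are the refresh sequence for the Kinetis watchdog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Refresh sequence values for the Kinetis watchdog refresh register. */
#define WDOG_REFRESH_KEY1 0xA602u
#define WDOG_REFRESH_KEY2 0xB480u

/* Hypothetical stand-in for WDOG_REFRESH so the sketch builds on a host.
 * On target, write to the real register from the device header instead. */
volatile uint16_t fake_wdog_refresh;
unsigned refresh_count; /* counts completed refresh sequences (host testing only) */

/* Number of bus clock cycles spanned by `wdog_cycles` periods of the
 * watchdog counter clock, rounded up. The caller must pick inputs whose
 * result fits in 32 bits. */
uint32_t bus_cycles_for_wdog_cycles(uint32_t bus_hz, uint32_t wdog_hz,
                                    uint32_t wdog_cycles)
{
    assert(wdog_hz > 0);
    uint64_t n = (uint64_t)bus_hz * wdog_cycles;
    return (uint32_t)((n + wdog_hz - 1u) / wdog_hz);
}

/* Write the two-value refresh sequence. On real hardware the two writes
 * should not be separated by an interrupt. */
void wdog_write_refresh_sequence(void)
{
    fake_wdog_refresh = WDOG_REFRESH_KEY1;
    fake_wdog_refresh = WDOG_REFRESH_KEY2;
    refresh_count++;
}

/* State for the guard: when the last refresh happened and how far apart
 * refreshes must be, both in ticks of the assumed tick source. */
typedef struct {
    uint32_t last_refresh_tick;
    uint32_t min_gap_ticks;
    bool     has_refreshed;
} wdog_guard_t;

/* Refresh only if at least min_gap_ticks have elapsed since the previous
 * refresh. Returns true if a refresh was issued. The unsigned subtraction
 * keeps the comparison correct when the tick counter wraps around. */
bool wdog_refresh_if_due(wdog_guard_t *g, uint32_t now_ticks)
{
    if (g->has_refreshed &&
        (uint32_t)(now_ticks - g->last_refresh_tick) < g->min_gap_ticks) {
        return false; /* too soon: skip rather than delay the main loop */
    }
    wdog_write_refresh_sequence();
    g->last_refresh_tick = now_ticks;
    g->has_refreshed = true;
    return true;
}
```

The main loop would call `wdog_refresh_if_due(&guard, current_tick)` on every pass. This only works if the watchdog time-out is comfortably longer than the minimum gap. `bus_cycles_for_wdog_cycles` is there for cases where the gap has to be expressed in bus clock cycles, for example when the delay is a simple counted loop.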
I've gone back through the "K20 Sub-Family Reference Manual" to try to find some mention of this 2-3 LPO cycle design requirement. The closest thing I've found is in Section 23.9 "Restrictions on watchdog operation":
"You must take care not only to refresh the watchdog within the watchdog timer's actual time-out period, but also provide enough allowance for the time it takes for the refresh sequence to be detected by the watchdog timer, on the watchdog clock."
If this is the section of the reference manual that describes the design requirement, then I think it needs to be made much clearer, especially for the case where the LPO clock is selected, which, as Fred Roeber pointed out, seems like the natural choice for the watchdog.
I tried to use the LPO clock because I was having trouble initializing the MCG. I found that if I got stuck in a loop waiting for an MCG status bit to be set, the watchdog didn't always cause a reset when it was clocked by the same bus clock I was trying to initialize. Using the LPO clock to drive the watchdog seemed like an ideal solution, but I don't want to add a 2-3 ms delay to my application. My MCG problem was caused by a different K20 problem and I have a workaround for that now, so I've gone back to clocking the watchdog from the bus clock.