I have consulted the AE team about this. Here is the feedback from Enwei:
1. AN2255 is a very good reference. It explains the exact MSCAN low-power mode requirements and gives a recommended program flowchart that we can provide to the customer.
2. Yes, as the customer said, some CPU cycles are still consumed before the CPU core really enters STOP mode. The MSCAN SLPAK bit check takes several CPU cycles (how many depends on the user's C code and the compiler optimization), and so does the assembly instruction that enters STOP mode (WFI, 2 CPU cycles excluding the time spent waiting for an interrupt or event).

3. Regarding the customer's question, "What is the maximum permitted delay between checking that SLPAK is set and entering STOP mode? (how many MSCAN or CPU cycles...)": the maximum delay equals the minimum time between SLPAK being set and being cleared again (when the next CAN frame arrives and wakes up the MSCAN). In other words, it is the minimum arrival interval of the next CAN frame, which is the inter-frame space defined in the Bosch CAN specification.

If the current CAN node is not an error-passive node, the minimum inter-frame space is the 3-bit intermission plus bus idle (one arbitration length, 12 bits for a standard frame with an 11-bit ID).

So the minimum time interval is 15 * CAN bus bit period, which depends on the CAN bus baud rate you are using.
For example, on a 500 kbit/s CAN bus the bit period is 1/500 kHz = 2 us, so the minimum time interval is 15 * 2 us = 30 us.
Based on the above, consider a typical configuration of a KEA part with MSCAN: FEE mode generates a 40 MHz CPU core clock from an 8 MHz external crystal reference. The CPU clock cycle is then 1/40 MHz = 25 ns, so 30 us corresponds to 30/0.025 = 1200 CPU cycles. The CPU can run roughly 1000 assembly instructions in that time (most Cortex-M0+ Thumb instructions are single-cycle), which is enough for a lot of low-power preparation work.
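The arithmetic above can be re-run for other baud rates and core clocks with a few lines of C. This is only a sketch: the function names are made up for illustration, and the 15-bit window is the assumption from the discussion above, passed in as a parameter rather than hard-coded.

```c
#include <stdint.h>

/* CAN bit period in nanoseconds for a bit rate given in bit/s. */
static uint32_t can_bit_time_ns(uint32_t bitrate_bps)
{
    return 1000000000u / bitrate_bps;
}

/* Length of a window of `bits` CAN bit periods, in nanoseconds. */
static uint32_t window_ns(uint32_t bitrate_bps, uint32_t bits)
{
    return bits * can_bit_time_ns(bitrate_bps);
}

/* Number of CPU clock cycles that fit into that window. */
static uint32_t window_cpu_cycles(uint32_t bitrate_bps, uint32_t bits,
                                  uint32_t cpu_hz)
{
    uint64_t ns = window_ns(bitrate_bps, bits);
    return (uint32_t)((ns * cpu_hz) / 1000000000u);
}
```

With 500000 bit/s, 15 bits and a 40000000 Hz core clock, `window_ns` gives 30000 ns and `window_cpu_cycles` gives 1200 cycles.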
PS: The debug breakpoint the customer set is not a meaningful test. The KEA CPU core executes millions of instructions per second, so a breakpoint cannot reproduce the timing of the real-world case.
Based on the above, here are two suggestions for the customer's reference:
1. Make the MSCAN sleep request and acknowledge check the last step of the MCU low-power mode preparation, and simplify step 3 as much as possible to shorten the time. We suggest configuring the MSCAN wake-up feature during MSCAN initialization, because the wake-up feature only takes effect in MSCAN sleep/shutdown mode and has no effect in normal mode.

2. As stated in AN2255, at the high-level application layer it is suggested that the CAN network operational modes be handled by network management software (for most automotive ECUs, CAN network management is a must). For example, a specific message can be broadcast to tell every node to go into sleep mode at the same time.
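As a rough illustration of suggestion 1, the ordering "sleep request, acknowledge check, then STOP with nothing in between" could be sketched as below. This is not the AN2255 code. The register struct, the function names and the stub are hypothetical stand-ins so the logic can be tried on a host PC, and the bit masks assume SLPRQ is bit 1 of CANCTL0 and SLPAK is bit 1 of CANCTL1, which should be checked against the reference manual. On the target, use the device header's register definitions and a real WFI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two MSCAN control registers. */
typedef struct {
    volatile uint8_t ctl0; /* holds SLPRQ */
    volatile uint8_t ctl1; /* holds SLPAK */
} mscan_regs_t;

#define SLPRQ_MASK 0x02u /* sleep request (assumed bit position) */
#define SLPAK_MASK 0x02u /* sleep acknowledge (assumed bit position) */

static unsigned stop_entries; /* counts calls to the stub below */

/* Host-side stub; on the target this would execute WFI for STOP mode. */
static void enter_stop_stub(void) { stop_entries++; }

/* Request MSCAN sleep as the very last preparation step, wait for the
 * acknowledge, then enter STOP immediately. Returns false and withdraws
 * the request if no acknowledge is seen within max_polls extra polls. */
static bool mscan_sleep_then_stop(mscan_regs_t *can, uint32_t max_polls,
                                  void (*enter_stop)(void))
{
    can->ctl0 |= SLPRQ_MASK;
    while ((can->ctl1 & SLPAK_MASK) == 0u) {
        if (max_polls-- == 0u) {
            can->ctl0 &= (uint8_t)~SLPRQ_MASK; /* give up, stay awake */
            return false;
        }
    }
    enter_stop(); /* no other work between the check and STOP */
    return true;
}
```

All other low-power preparation (clocks, pins, wake-up configuration) is assumed to have been done before this function is called, so that the code between the SLPAK check and the STOP entry stays within the time budget discussed above.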