Since you have only a month to complete the job, I would strongly advise against changing the MCU family (if you can avoid it).
You say you need 4 SCIs, but the 9S08GW has 3 (according to the block diagram -- although somewhere I saw 4 mentioned -- I don't know this MCU). Here's what I would do (and have done in a similar situation when I needed more SCIs than the MCU supported):
Assuming enough CPU idle time is left in your application and the baud rate is not excessively high, you could possibly implement soft SCIs using Input Captures (RX) and some timer for TX (or no timer, if you can afford to block while outputting the bit stream for a byte).
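To make the timer-driven TX idea a bit more concrete, here is a minimal sketch in portable C (no S08 registers). It assumes 8N1 framing, and all names (soft_sci_bit_ticks, soft_tx_start, soft_tx_tick) are made up for illustration. The timer interrupt would call soft_tx_tick once per bit time and write the returned level to the TX pin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Timer ticks per bit, rounded to nearest. Both arguments are in Hz. */
static uint32_t soft_sci_bit_ticks(uint32_t timer_hz, uint32_t baud)
{
    assert(baud > 0u);
    return (timer_hz + baud / 2u) / baud;
}

typedef struct {
    uint16_t shift;     /* remaining frame bits, LSB goes out first */
    uint8_t  bits_left; /* 0 means the transmitter is idle */
} soft_tx_t;

/* Load one 8N1 frame: start bit (0), 8 data bits LSB first, stop bit (1).
   Returns false if a frame is still being sent. */
static bool soft_tx_start(soft_tx_t *tx, uint8_t byte)
{
    if (tx->bits_left != 0u) {
        return false;
    }
    tx->shift = (uint16_t)(((uint16_t)byte << 1) | (1u << 9));
    tx->bits_left = 10u;
    return true;
}

/* Call once per bit time (e.g. from a timer interrupt).
   Returns the level to drive on the TX pin; the idle line is high. */
static bool soft_tx_tick(soft_tx_t *tx)
{
    if (tx->bits_left == 0u) {
        return true;
    }
    bool level = (tx->shift & 1u) != 0u;
    tx->shift >>= 1;
    tx->bits_left--;
    return level;
}
```

As an example of the arithmetic: with an assumed 4 MHz timer clock and 9600 baud, soft_sci_bit_ticks gives 417 ticks per bit. The blocking variant is the same loop with a busy-wait of one bit time between calls instead of a timer interrupt.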
Also, instead of Input Captures (if not enough are available), you could use a general-purpose input interrupt (like KBI) combined with a general-purpose timer to sample the incoming bits at the right times.
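The RX side could be sketched roughly like this, again in portable C with 8N1 framing assumed and hypothetical names (soft_rx_on_edge, soft_rx_on_sample). The edge interrupt (Input Capture or KBI) detects the start bit, then a timer samples the pin in the middle of each following bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { RX_IDLE, RX_DATA, RX_STOP } soft_rx_state_t;

typedef struct {
    soft_rx_state_t state;
    uint8_t shift;         /* bits collected so far */
    uint8_t bits_seen;     /* data bits sampled in this frame */
    uint8_t byte;          /* last completed byte */
    bool    framing_error; /* true if the stop bit was sampled low */
} soft_rx_t;

/* Call on a falling edge of the RX pin (the start bit).
   Returns the delay, in timer ticks, before the first sample: 1.5 bit
   times, which lands in the middle of data bit 0. Returns 0 if the edge
   is ignored because a frame is already in progress. */
static uint32_t soft_rx_on_edge(soft_rx_t *rx, uint32_t bit_ticks)
{
    assert(bit_ticks > 0u);
    if (rx->state != RX_IDLE) {
        return 0u;
    }
    rx->state = RX_DATA;
    rx->shift = 0u;
    rx->bits_seen = 0u;
    rx->framing_error = false;
    return bit_ticks + bit_ticks / 2u;
}

/* Call after the initial delay and then once per bit time, passing the
   sampled pin level. Returns true when the frame has ended. */
static bool soft_rx_on_sample(soft_rx_t *rx, bool level)
{
    switch (rx->state) {
    case RX_DATA:
        /* Data arrives LSB first, so shift in from the top. */
        rx->shift = (uint8_t)((rx->shift >> 1) | (level ? 0x80u : 0u));
        if (++rx->bits_seen == 8u) {
            rx->state = RX_STOP;
        }
        return false;
    case RX_STOP:
        rx->framing_error = !level;
        rx->byte = rx->shift;
        rx->state = RX_IDLE;
        return true;
    default:
        return false;
    }
}
```

In a real driver you would mask the edge interrupt while a frame is in progress and re-enable it after the stop bit, so that edges inside the data bits don't cost you extra interrupts.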
Regarding TX, there's also the possibility of using a single TX 'multiplexed' across more than one channel (assuming a common baud rate).
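One way the multiplexed TX could be scheduled is a simple round-robin over the channels that have bytes waiting. This is only a sketch of the selection logic; MUX_CHANNELS and mux_next_channel are hypothetical names, and the actual routing (external mux select lines or pin reassignment) depends on your hardware.

```c
#include <assert.h>
#include <stdint.h>

#define MUX_CHANNELS 3 /* assumed number of outputs sharing the one TX */

/* Round-robin: starting after the channel served last, return the first
   channel that has bytes pending, or -1 if all queues are empty. */
static int mux_next_channel(const uint8_t pending[MUX_CHANNELS], int last)
{
    assert(last >= 0 && last < MUX_CHANNELS);
    for (int step = 1; step <= MUX_CHANNELS; step++) {
        int ch = (last + step) % MUX_CHANNELS;
        if (pending[ch] != 0u) {
            return ch;
        }
    }
    return -1;
}
```

The important detail is to switch channels only after the previous byte has completely left the pin, stop bit included. Switching as soon as the transmit buffer is empty would cut off the tail of the frame still being shifted out.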
If this works, then maybe you don't even have to move away from the QE32 that you already know well.
Hope this helps.