Is there a 32-bit microcontroller that can be programmed in assembly, using a double or quad accumulator to assemble 8- or 16-bit values into 32-bit groups that can then be stored in memory? I need a micro that can function this way.
There's none that can't do that. A 30-year-old 8-bit micro (even a 4-bit micro) could easily emulate that requirement. Anything that's "Turing Complete" could.
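To make that concrete, here is a rough sketch in C of packing bytes into 32-bit words with one ordinary register acting as the "accumulator". The function name `pack_bytes_be` and the choice to put the first byte in the most significant position are my assumptions, not something from the question; swap the shift direction if you need the other byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pack n bytes from src into 32-bit words in dst.
 * Assumption: the first byte of each group lands in the top 8 bits.
 * Returns the number of complete words written; a trailing partial
 * group (n % 4 leftover bytes) is dropped. */
size_t pack_bytes_be(const uint8_t *src, size_t n, uint32_t *dst)
{
    size_t words = 0;
    uint32_t acc = 0;                  /* the single "accumulator" */

    for (size_t i = 0; i < n; i++) {
        acc = (acc << 8) | src[i];     /* shift in one more byte */
        if ((i & 3u) == 3u) {          /* every fourth byte: word is full */
            dst[words++] = acc;        /* store it to memory */
            acc = 0;
        }
    }
    return words;
}
```

Any compiler will turn the loop body into a shift, an OR and a store, which is also all the hand-written assembly would be. The same idea works for 16-bit inputs with a shift of 16 and a store every second value.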
Or do you have some sort of unmentioned speed requirement, such as needing to read the data at multiple gigahertz? The data rate and the data size (how many kilobytes or megabytes) are far more important than how many accumulators the CPU has.
You don't need "multiple accumulators" (at least not ones running in parallel), because the memory path is going to be the bottleneck when storing the data if there's more of it than will fit into the chip's internal SRAM. Many CPUs (the Freescale MCF5329 is the one I'm most familiar with) have around 32 kB of internal RAM running at 240MHz. After that you spill to external SDRAM that can only manage about 30MHz (it's clocked at 80MHz, but takes 10-15 clocks per 4-word cycle). And so on.
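The SDRAM figure is just arithmetic, and a small C sketch shows where it comes from. I'm reading "10-15 clocks per 4 word cycle" as a burst that moves four 32-bit words in 10 to 15 bus clocks; the function name `burst_words_per_sec` is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: sustained word rate of a burst-oriented memory.
 * Assumption: each burst moves words_per_burst words and occupies
 * clocks_per_burst bus clocks, back to back with no other overhead.
 * The result is truncated to whole words per second. */
uint64_t burst_words_per_sec(uint64_t bus_clock_hz,
                             uint64_t words_per_burst,
                             uint64_t clocks_per_burst)
{
    return (bus_clock_hz * words_per_burst) / clocks_per_burst;
}
```

With an 80MHz bus and 4-word bursts, `burst_words_per_sec(80000000, 4, 10)` gives 32,000,000 words per second and `burst_words_per_sec(80000000, 4, 15)` gives about 21,300,000, so the effective rate sits between roughly 21 and 32 million words per second. Compare that against your incoming data rate before worrying about how many accumulators the core has.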