There's none that can't do that. A 30-year-old 8-bit micro (even a 4-bit micro) could easily emulate that. Anything that's Turing complete could.
Or do you have some sort of unmentioned speed requirement, such as needing to read the data at multiple gigahertz? The data rate and the data size (how many kBytes or Megabytes) are far more important.
You don't need "multiple accumulators" (at least not ones running in parallel), because the memory path is going to be the bottleneck for storing the data if there's more than will fit into the chip's internal SRAM. Many CPUs (the Freescale MCF5329 is the one I'm most familiar with) have around 32 kBytes of internal RAM running at 240MHz. After that you spill to the external SDRAM, which can only manage an effective rate of about 30MHz (it's clocked at 80MHz, but takes 10-15 clocks per 4-word cycle). And so on.
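To show where the "about 30MHz" figure comes from, here is a rough back-of-the-envelope sketch of the arithmetic. The function name and the constants are just illustrative, taken from the numbers quoted above rather than from a datasheet, and it assumes every access is a full 4-word cycle with nothing else competing for the bus.

```python
def effective_word_rate_mhz(bus_clock_mhz, clocks_per_cycle, words_per_cycle=4):
    """Sustained transfer rate in millions of words per second.

    Hypothetical helper: each memory cycle moves `words_per_cycle` words
    and occupies the bus for `clocks_per_cycle` clocks.
    """
    return bus_clock_mhz * words_per_cycle / clocks_per_cycle


# Figures quoted in the post (assumed, not checked against a datasheet).
SDRAM_CLOCK_MHZ = 80.0

best_case = effective_word_rate_mhz(SDRAM_CLOCK_MHZ, 10)   # 10 clocks per 4 words
worst_case = effective_word_rate_mhz(SDRAM_CLOCK_MHZ, 15)  # 15 clocks per 4 words

print(f"best case:  {best_case:.1f} Mwords/s")
print(f"worst case: {worst_case:.1f} Mwords/s")
```

So the external SDRAM lands somewhere between roughly 21 and 32 million words per second, against 240MHz for the internal RAM. That gap is why the data rate and data size matter much more than how many accumulators the core has.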
Tom