DNPU cluster


547 Views
Doyle
Contributor II

NPU cluster dream - 

So, I have a dream. I want to put four of the 16168 cards on a PCIe-switched riser and build an NPU cluster to run an LLM. I would LOVE for it to run on Windows alongside my GPUs in a heterogeneous compute environment, or I could install it on one of my Ubuntu machines. Just how nuts am I? I've been pushing Gemini and Cursor to help me figure this out. Cursor seems to think this is plausible, BUT it will not be able to come up with a viable driver without some more info. Cursor wants me to ask for Windows host enablement: a programmer's guide, firmware package, reference driver or porting kit, and redistribution terms, or else confirmation that none of this exists and what NDA path applies (he may be wishlisting here).

 

3 Replies

532 Views
Doyle
Contributor II

P.S. This card looks like the riser card I want: https://www.google.com/search?q=4+Port+M.2+NVMe+PCIe+4.0+Switch+Expansion+Card+No+Bifurcation+Needed...

 


464 Views
AldoG
NXP TechSupport

Hello,

It seems that you are trying to create something similar to the ARA240:
https://www.nxp.com/products/ARA240

You may also be interested in the i.MX8MP FRDM board to pair with the ARA240:
https://www.nxp.com/design/design-center/development-boards-and-designs/FRDM-IMX8MPLUS

Best regards/Saludos,
Aldo.


441 Views
Doyle
Contributor II

Yes! Thank you, the Ara240 is at the heart of the matter. The reference to the 16168 module was about the Gateworks GW16168, which as far as I can tell isn't tremendously different from the Ara240 16GB M.2 module (apologies if that was an obscure reference). What I'm trying to build would more properly be called a PCIe expansion card rather than an SBC. Switched PCIe riser cards were originally built as M.2 M-key drive expansions, but I want to repurpose that PCIe infrastructure to host 4 Ara240 M.2 modules, giving me a cluster to run LLMs locally. I'm not sure if the Tensor LLM structure would work as well as it does for my GPUs, but I'd love to give it a try. My highest hope is that I can achieve a heterogeneous compute environment on my PC. Anywho, the real kicker would seem to be the driver construction. My Cursor agent feels like it can do a lot, but there are real limits to what it can put together without further software-level info about the Ara240 chip.
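
Before any driver work, a first sanity check on the Ubuntu side would be whether all four modules even enumerate behind the PCIe switch. Here is a rough sketch of that check, which scans the standard Linux sysfs PCI directory for devices with a given vendor ID. The vendor ID below is a made-up placeholder (the real one for the Ara240 is not given anywhere in this thread and would have to be read from `lspci -nn`), and the function name is hypothetical.

```python
from pathlib import Path

# ASSUMPTION: placeholder value only. The module's actual PCI vendor ID is not
# stated in this thread; read it from `lspci -nn` on your own machine.
ASSUMED_VENDOR_ID = "0x1234"


def find_pci_devices(vendor_id, sysfs_root="/sys/bus/pci/devices"):
    """Return sorted PCI addresses whose sysfs 'vendor' file matches vendor_id."""
    root = Path(sysfs_root)
    matches = []
    if not root.is_dir():
        # Not a Linux host, or sysfs is not mounted: nothing to report.
        return matches
    for dev in root.iterdir():
        try:
            # Each device directory holds a 'vendor' file such as "0x8086".
            vendor = (dev / "vendor").read_text().strip().lower()
        except OSError:
            continue  # skip entries without a readable vendor file
        if vendor == vendor_id.lower():
            matches.append(dev.name)
    return sorted(matches)


if __name__ == "__main__":
    found = find_pci_devices(ASSUMED_VENDOR_ID)
    print(f"{len(found)} matching device(s): {found}")
```

If fewer than four addresses come back, the problem is in the riser, slot, or power delivery rather than in any driver, which at least narrows down what still needs info from NXP.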
