Using i.MX6Q to Build a Palm-Sized Heterogeneous Mini-HPC


This work grew out of my daughter's idea; she completed it with my guidance.


Cradle-1 Palm-Sized Mini-HPC

The world's first full-function heterogeneous mini-HPC. This is what it looks like:

(Image: Cradle-1, the world's first CPU+GPU heterogeneous palm-sized supercomputer)

1 Architecture

        Overall: CPU+GPU heterogeneous, 4 nodes, connected by a 100 Mbit/s Ethernet switch

        Nodes: Freescale i.MX6 Quad mini-PCs, each with 4 ARM Cortex-A9 cores and 1 Vivante GC2000 GPU

2  Software

        OS: Ubuntu 11.10 (Linaro)

        OpenCL driver: Vivante GC2000 OpenCL driver

        Compilers: C/C++: gcc 4.6.1; Fortran 90/95: gfortran 4.6.1

        MPI Parallel Computing: MPICH2 1.4-1

        NFS network file system: nfs-kernel-server 1.2.4

        SSH security:   openssh   1:5.8

3 Hardware

        The hardware of all nodes is the same; only the software configurations differ slightly. One node is assigned as the master, and the others are slaves. The nodes were originally TV sticks with Android 4.0 installed. Each node's hardware specification is:

        CPU: 4 Cortex-A9 cores at 1.2 GHz

        GPU: 1 Vivante GC2000 GPU

        RAM: 1 GB DDR

        ROM: 8 GB SD

        NIC: USB 2.0 100 Mbit/s Ethernet adapter (not part of the original TV stick; we added it)

        Wi-Fi: 150 Mbit/s

        Display interface: HDMI

        Network switch: 5-port 100 Mbit/s Ethernet switch

4 Network

        Each node has one USB 2.0 NIC and one Wi-Fi interface. The Wi-Fi link serves as a backup for the wired NIC connection. The network configuration is:

        IP Address assignment:  (baby1 - baby4 are the four computing nodes)

        baby1: 100M NIC 192.168.10.1 WIFI 192.168.0.111

        baby2: 100M NIC 192.168.10.2 WIFI 192.168.0.112

        baby3: 100M NIC 192.168.10.3 WIFI 192.168.0.113

        baby4: 100M NIC 192.168.10.4 WIFI 192.168.0.114
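
The address plan above could be captured in a small script that emits `/etc/hosts`-style entries and a plain list of hostnames for an MPI host file. This is only a sketch: the addresses come from the table above, but the helper names (`hosts_entries`, `host_list`, `pick_address`) and the `nic_up` flag are made up for illustration, and the post does not say how the nodes were actually configured or how the Wi-Fi fallback is triggered.

```python
# Address plan copied from the post; everything else is illustrative.
NODES = {
    "baby1": {"nic": "192.168.10.1", "wifi": "192.168.0.111"},
    "baby2": {"nic": "192.168.10.2", "wifi": "192.168.0.112"},
    "baby3": {"nic": "192.168.10.3", "wifi": "192.168.0.113"},
    "baby4": {"nic": "192.168.10.4", "wifi": "192.168.0.114"},
}


def hosts_entries(nodes, iface="nic"):
    """Return /etc/hosts-style lines mapping each node name to one interface."""
    return [f"{addrs[iface]}\t{name}" for name, addrs in sorted(nodes.items())]


def host_list(nodes):
    """Return the node names in order, one per line of a simple MPI host file."""
    return sorted(nodes)


def pick_address(name, nodes, nic_up=True):
    """Prefer the wired NIC; fall back to Wi-Fi when the NIC link is down.

    `nic_up` is a hypothetical flag; a real setup would probe the link.
    """
    addrs = nodes[name]
    return addrs["nic"] if nic_up else addrs["wifi"]


print("\n".join(hosts_entries(NODES)))
print("\n".join(host_list(NODES)))
```

Keeping the wired and Wi-Fi addresses in one table makes it easy to regenerate either set of entries if the backup link has to be used.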

5 Performance

        Cradle-1 has 16 ARM Cortex-A9 cores at 1.2 GHz and 4 Vivante GC2000 GPUs. The total computing power of these 20 computing devices is more than 100 GFLOPS, which is more powerful than an ordinary desktop. The whole machine is only a little bigger than a palm, and its total power consumption is less than 15 watts.
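
To put those figures in perspective, the arithmetic can be written out as below. The sketch uses only the numbers quoted in this post (4 nodes, 4 CPU cores and 1 GPU per node, more than 100 GFLOPS, less than 15 W); the constant and function names are illustrative, and the results are bounds implied by the claims, not measurements.

```python
NODE_COUNT = 4
CPU_CORES_PER_NODE = 4
GPUS_PER_NODE = 1
TOTAL_GFLOPS_CLAIMED = 100.0  # "more than 100 GFLOPS": a lower bound
TOTAL_POWER_W = 15.0          # "less than 15 watts": an upper bound


def summarize():
    """Derive device count and the bounds implied by the quoted totals."""
    devices = NODE_COUNT * (CPU_CORES_PER_NODE + GPUS_PER_NODE)
    return {
        "devices": devices,
        # Average per device if the total were exactly the claimed minimum.
        "min_avg_gflops_per_device": TOTAL_GFLOPS_CLAIMED / devices,
        # Minimum total over maximum power gives a lower bound on efficiency.
        "min_gflops_per_watt": TOTAL_GFLOPS_CLAIMED / TOTAL_POWER_W,
    }


for key, value in summarize().items():
    print(key, round(value, 2))
```

The per-device figure is only an average over all 20 devices; the post gives no separate CPU and GPU numbers.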

        The overall architecture of Cradle-1 is almost the same as that of China's Tianhe-1A or Titan at Oak Ridge National Laboratory. They use the same kind of software stack: Linux + OpenCL + MPI. Cradle-1 supports C/C++ and Fortran 90/95, and almost any kind of parallel computing algorithm can run on it; the only difference is the scale.

        We wrote an MPI parallel program for large matrix multiplication that runs as 4 processes. Each process has 5 threads: four for the four CPU cores and one for GPU computing.
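
The program itself is not shown in this post. As a rough illustration of the 4 x 5 layout described above, the following pure-Python sketch splits the rows of the result matrix into four blocks (standing in for the MPI processes) and then splits each block among five worker threads (standing in for the four CPU threads and the GPU thread). It uses no MPI or OpenCL, all function names are hypothetical, and the equal-sized split is an assumption; a real run would likely give the GPU thread a different share of the work.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(n_rows, parts):
    """Split range(n_rows) into contiguous (start, stop) blocks whose sizes differ by at most 1."""
    base, extra = divmod(n_rows, parts)
    blocks, start = [], 0
    for i in range(parts):
        stop = start + base + (1 if i < extra else 0)
        blocks.append((start, stop))
        start = stop
    return blocks


def multiply_rows(a, b, start, stop):
    """Compute rows [start, stop) of the product a * b using plain lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(a[i], col)) for col in cols]
            for i in range(start, stop)]


def simulated_cluster_matmul(a, b, processes=4, workers=5):
    """Mimic the layout: one row block per process, one sub-block per worker."""
    result = []
    for p_start, p_stop in split_rows(len(a), processes):  # one simulated MPI process
        sub = [(p_start + s, p_start + e)
               for s, e in split_rows(p_stop - p_start, workers)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # pool.map returns results in submission order, so rows stay sorted.
            for part in pool.map(lambda se: multiply_rows(a, b, *se), sub):
                result.extend(part)
    return result


a = [[i + j for j in range(3)] for i in range(23)]
b = [[(i * j) % 5 for j in range(4)] for i in range(3)]
print(simulated_cluster_matmul(a, b) == multiply_rows(a, b, 0, len(a)))
```

Splitting by rows means each worker needs only its own rows of the first matrix plus the whole second matrix, which is why row-block decomposition is a common starting point for distributed matrix multiplication.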

6 Appearance

(Images: Cradle-1 viewed from the front, back, top, left, and right)

A single node. It has three interfaces: HDMI on the right, the wireless keyboard-and-mouse adapter at the upper left, and the power connection at the lower left.

One node is running Ubuntu 11.10.

A simple OpenCL program we wrote to display the OpenCL driver information.

Accessing baby1's desktop from a notebook over remote desktop. This is baby1's sign-in screen. baby1 has an x11vnc server installed.

Signed in to baby1, with a terminal open.

Running an MPI test program to confirm that all babies (baby1 - baby4) are working.

    Any comments? Please email audrey.tao@hotmail.com


Comments

Nice.

Did you add a cooling fan or heatsink to it?

No fans. There is not much heat; it runs only a little warm. Also, the shells of the babies are made of aluminium.

Congrats, nice setup!

I wonder if doing the sample code in Fortran would speed up the processing (floating point operations per second). Are you using this setup to do some scientific/academic processing?

Last update: 03-30-2013 09:11 PM