In this article, we explore a novel approach to performing neural network computations efficiently using a reconfigurable sparse master graph architecture. The proposed method combines the strengths of FPGA and CPU implementations, addressing the limitations of traditional all-to-all connectivity in large-scale networks. By combining multiple instances of a sparse 3R3X (3-regular 3-XORSAT) problem, we construct a dense master graph that can be processed efficiently on an FPGA.
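The article does not spell out how the instances are overlaid, so the following Python sketch illustrates one plausible reading: each 3R3X instance contributes a sparse symmetric coupling matrix over the same set of p-bits, and the master graph is the union of their edge sets. The names build_master_graph and active_couplings are hypothetical, introduced only for illustration.

```python
import numpy as np

def build_master_graph(instances):
    """Overlay K sparse instance graphs (each an n-by-n symmetric
    coupling matrix on the same p-bits) into one dense master graph.
    The adjacency is the union of all instance edge sets."""
    weights = np.stack(instances)            # shape (K, n, n)
    adjacency = (weights != 0).any(axis=0)   # union of instance edges
    return adjacency, weights

def active_couplings(adjacency, weights, k):
    """Multiplex instance k onto the shared master wiring
    (a software analogue of the instance selector)."""
    return np.where(adjacency, weights[k], 0.0)

# Hypothetical usage with two 4-p-bit toy instances:
inst_a = np.zeros((4, 4)); inst_a[0, 1] = inst_a[1, 0] = 1.0
inst_b = np.zeros((4, 4)); inst_b[2, 3] = inst_b[3, 2] = -1.0
adj, w = build_master_graph([inst_a, inst_b])
couplings_0 = active_couplings(adj, w, k=0)
```

Under this reading, switching instances reduces to changing the selector index rather than resynthesizing the FPGA design, which is what makes the architecture reconfigurable.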
The key insight behind this approach is that the master graph can be reconfigured to match the all-to-all topology used on the CPU, ensuring seamless communication between different parts of the network. This is achieved through an instance selector, which multiplexes the neighbors and clock of each p-bit from a single phase-shifted clock. The acceptance rates for swap attempts are roughly equal between the master graph and the all-to-all topology, confirming the equivalence of the two representations.
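The swap attempts suggest a replica-exchange (parallel tempering) scheme, although the article does not name it explicitly. As a minimal sketch under that assumption, the standard Metropolis criterion for swapping two replicas held at inverse temperatures beta_i and beta_j is:

```python
import numpy as np

def attempt_swap(energies, betas, i, j, rng):
    """Standard parallel-tempering swap: exchange replicas i and j
    with probability min(1, exp((beta_i - beta_j) * (E_i - E_j)))."""
    delta = (betas[i] - betas[j]) * (energies[i] - energies[j])
    return rng.random() < np.exp(min(0.0, delta))  # cap at 1, avoid overflow

# Hypothetical usage: the fraction of accepted attempts is the
# acceptance rate compared between the master graph and the CPU.
rng = np.random.default_rng(1)
accepted = sum(attempt_swap([-1.0, -0.5], [1.0, 0.5], 0, 1, rng)
               for _ in range(1000))
acceptance_rate = accepted / 1000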
To demonstrate the effectiveness of this approach, we present results from 1000 independent runs across various problem sizes, showing a consistent success rate of around 37.5% for both the master graph and the CPU. Furthermore, the mean success probability p_i grows approximately linearly with the number of swap attempts, indicating efficient convergence toward the maximum success probability.
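As an illustration of how such a curve can be computed, the sketch below estimates the empirical success probability from per-run outcomes; the data here are synthetic placeholders, not the article's measurements.

```python
import numpy as np

def success_probability(run_results):
    """Estimate the empirical p_i from independent runs.
    run_results[r, t] is True if run r has found a solution within
    t+1 swap attempts; averaging over runs gives p_i per attempt."""
    return np.asarray(run_results, dtype=float).mean(axis=0)

# Hypothetical usage: 1000 runs with made-up solution-hit times.
rng = np.random.default_rng(0)
hit_times = rng.geometric(0.001, size=1000)        # synthetic placeholder
attempts = np.arange(1, 5001)
results = hit_times[:, None] <= attempts[None, :]  # success indicators
p_i = success_probability(results)                 # p_i vs. swap attempts
```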
In summary, this article introduces a novel reconfigurable sparse master graph approach for efficient neural network computations that combines the strengths of FPGA and CPU implementations. By establishing equivalence between the master graph and the all-to-all topology, it enables seamless communication between different parts of the network and improves computational efficiency. These findings have significant implications for large-scale neural network computations, where the reconfigurable sparse master graph approach can substantially reduce computational requirements while maintaining accuracy.