



### sEMG-based Gesture Recognition with Spiking Neural Networks on Low-power FPGA

Matteo A. Scrugli, Gianluca Leone, Paola Busia, Paolo Meloni, Università degli Studi di Cagliari, Italy

17-19 January 2024 Munich, Germany

### **SNNs on FPGA**

- SNNs
  - Event-based for energy efficiency





- FPGA as a target
  - DSPs
  - Flexible BRAMs
  - Additional RL
    - Use-case flexible encoding/decoding
    - Emulate event-based behaviour with additional clock-based circuitry
  - Affordable costs


### State-of-the-art

| Work                   | Dataset               | Encoding                     | Classes | Accuracy | Device    | Energy            | Power  | MOPS   |
|------------------------|-----------------------|------------------------------|---------|----------|-----------|-------------------|--------|--------|
| Behrenbeck, J.<br>2019 | custom<br>4 subjects  | Delta                        | 4       | 84.8%    | SpiNNaker | N.R. <sup>1</sup> | 1-4 W  | N.R.   |
| Cheng, L.<br>2021      | custom<br>8 subjects  | Population                   | 8       | 94.4%    | N.R.      | N.R.              | N.R.   | 0.013  |
| Tanzarella, S.<br>2023 | custom<br>5 subjects  | HD-sEMG<br>Decomposition     | 10      | 95%      | Jetson    | 0.97 mJ           | 100 mW | 7.97*  |
| Xu, M.<br>2023         | custom<br>10 subjects | Event-driven<br>Differential | 6       | 98.78%   | N.R.      | N.R.              | N.R.   | 6.57   |
| Vitale, A.<br>2022     | NinaPro DB5           | Delta                        | 12      | 74%      | Loihi     | 246 mJ            | 41 mW  | 11.56* |
| Ours                   | NinaPro DB5           | Delta                        | 12      | 85.6%    | FPGA      | 35.68 µJ          | 1.7 mW | 2.336  |

<sup>1</sup> Not Reported.

\* Estimated from the paper.
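Several rows above, including ours, use delta encoding to turn the sEMG signal into spikes. As a rough illustration (not the encoder used in this work), the sketch below emits an UP or DOWN event whenever the signal moves by at least a threshold away from a running reference. The function name `delta_encode`, the threshold value, and the choice to allow several events per sample are assumptions made for the example.

```python
def delta_encode(samples, threshold):
    """Return a list of (time_index, polarity) events for one channel.

    polarity is +1 (UP) or -1 (DOWN). The reference level moves by one
    threshold step per emitted event, so large jumps produce several events.
    """
    events = []
    reference = samples[0]
    for t, x in enumerate(samples[1:], start=1):
        while x - reference >= threshold:
            reference += threshold
            events.append((t, +1))
        while reference - x >= threshold:
            reference -= threshold
            events.append((t, -1))
    return events


# Hypothetical signal and threshold, for illustration only.
signal = [0.0, 0.5, 1.2, 1.1, 0.0]
print(delta_encode(signal, threshold=0.5))
```

A slowly varying signal produces no events at all, which is where the sparsity of the spike stream comes from.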



### SYNtzulu: architecture



Copyright © 2023

### **Assessment: low-end implementation**

#### Lattice iCE40UP5k FPGA

- 5280 logic cells (4-LUT + Carry + FF)
- Around 1 Mbit on-chip RAM
- PLL, 2 x SPI, 2 x I2C hard IPs
- Two internal oscillators (10 kHz and 48 MHz) for simple designs
- Eight DSP multiplier blocks



| LUT & FF   | DSP     | BRAM     | SPRAM    |
|------------|---------|----------|----------|
| 4506 (85%) | 2 (25%) | 21 (70%) | 4 (100%) |
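The percentages in the table can be reproduced from the device totals. The sketch below is only a sanity check: the 5280 logic cells and 8 DSP blocks come from the list above, while the totals of 30 BRAM (EBR) blocks and 4 SPRAM blocks are taken from the iCE40UP5K datasheet and are not stated on this slide.

```python
# Device totals: logic cells and DSP blocks as listed above;
# BRAM and SPRAM block counts are assumed from the iCE40UP5K datasheet.
DEVICE = {"LUT & FF": 5280, "DSP": 8, "BRAM": 30, "SPRAM": 4}
USED = {"LUT & FF": 4506, "DSP": 2, "BRAM": 21, "SPRAM": 4}


def utilization(used, device):
    """Return the rounded utilization percentage for each resource."""
    return {name: round(100 * used[name] / device[name]) for name in used}


for name, pct in utilization(USED, DEVICE).items():
    print(f"{name}: {USED[name]}/{DEVICE[name]} = {pct}%")
```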

![](_page_6_Figure_0.jpeg)

### SNN and training

- Feed-forward SNN with 4 fully connected (FC) layers
- PyTorch package **SLAYER** (Spike LAYER Error Reassignment)

![](_page_7_Figure_3.jpeg)

| Input<br>channels | L1 | L2  | L3 | L4 | Max axonal<br>delay | Loss      | Learning rate | Patience  |
|-------------------|----|-----|----|----|---------------------|-----------|---------------|-----------|
| 96                | 64 | 128 | 64 | 13 | 62                  | SpikeRate | Up to 10e-5   | 40 epochs |
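To give a sense of the model size, the layer widths in the table can be turned into neuron and synapse counts. This is a minimal sketch that assumes every layer is fully connected to the next and has no bias terms; the helper names are made up for the example. The 13 outputs match the 12 gestures plus the Rest class that appear in the confusion matrix.

```python
# Input channels followed by the widths of L1..L4, as given in the table.
LAYER_SIZES = [96, 64, 128, 64, 13]


def count_synapses(sizes):
    """Number of weights when each layer is fully connected to the next."""
    return sum(n_in * n_out for n_in, n_out in zip(sizes, sizes[1:]))


def count_neurons(sizes):
    """Number of spiking neurons (the input channels are not neurons)."""
    return sum(sizes[1:])


print(count_synapses(LAYER_SIZES))  # 23360
print(count_neurons(LAYER_SIZES))   # 269
```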

### Accuracy

- Accuracy of 85.6%.
- 45% of the errors are concentrated in the first row and first column of the confusion matrix, i.e. they involve the Rest class.
- Network outputs are filtered: an isolated output is discarded and replaced by the last valid output.
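One possible reading of the output filter is sketched below. It is not the implementation used in this work: here an output counts as "isolated" when neither the previous nor the next raw output agrees with it, and in that case the last valid label is repeated. The function name `filter_isolated` and this definition of isolation are assumptions.

```python
def filter_isolated(outputs):
    """Replace isolated labels with the most recent valid label.

    A label is treated as valid if the previous or the next raw output
    carries the same label; the very first label is accepted as-is when
    nothing valid has been seen yet.
    """
    filtered = []
    last_valid = None
    for i, label in enumerate(outputs):
        prev_same = i > 0 and outputs[i - 1] == label
        next_same = i + 1 < len(outputs) and outputs[i + 1] == label
        if prev_same or next_same or last_valid is None:
            last_valid = label
        filtered.append(last_valid)
    return filtered


# Hypothetical stream of predicted class indices.
print(filter_isolated([0, 0, 3, 0, 0, 5, 5, 5, 2]))
```

Because this version looks one sample ahead, it adds one inference period of latency if run online.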

![](_page_8_Figure_5.jpeg)

Confusion matrix (rows: true labels, columns: predicted labels).

| True label | Rest | Idx Flx | Idx Ext | Mid Flx | Mid Ext | Ring Flx | Ring Ext | Lit Flx | Lit Ext | Thm Add | Thm Abd | Thm Flx | Thm Ext |
|------------|------|---------|---------|---------|---------|----------|----------|---------|---------|---------|---------|---------|---------|
| Rest       | 7078 | 36      | 1       | 28      | 43      | 1        | 4        | 26      | 1       | 39      | 86      | 2       | 32      |
| Idx Flx    | 62   | 268     | 30      | 15      | 21      | 5        | 17       | 11      | 0       | 0       | 2       | 2       | 2       |
| Idx Ext    | 31   | 23      | 224     | 8       | 12      | 3        | 1        | 0       | 0       | 0       | 2       | 0       | 0       |
| Mid Flx    | 30   | 31      | 3       | 285     | 14      | 7        | 8        | 13      | 3       | 3       | 0       | 0       | 0       |
| Mid Ext    | 17   | 4       | 3       | 0       | 244     | 8        | 4        | 5       | 7       | 0       | 2       | 0       | 0       |
| Ring Flx   | 43   | 8       | 2       | 12      | 12      | 258      | 2        | 14      | 2       | 3       | 0       | 2       | 2       |
| Ring Ext   | 39   | 30      | 3       | 17      | 36      | 8        | 218      | 7       | 3       | 2       | 0       | 0       | 10      |
| Lit Flx    | 48   | 5       | 5       | 6       | 13      | 18       | 26       | 237     | 7       | 10      | 6       | 3       | 11      |
| Lit Ext    | 52   | 0       | 0       | 5       | 5       | 10       | 13       | 13      | 253     | 0       | 13      | 7       | 0       |
| Thm Add    | 7    | 10      | 0       | 0       | 0       | 0        | 10       | 4       | 0       | 161     | 6       | 70      | 7       |
| Thm Abd    | 71   | 2       | 4       | 0       | 0       | 0        | 2        | 3       | 3       | 7       | 190     | 17      | 19      |
| Thm Flx    | 21   | 0       | 4       | 0       | 0       | 0        | 0        | 0       | 0       | 64      | 2       | 166     | 10      |
| Thm Ext    | 32   | 0       | 10      | 2       | 2       | 0        | 0        | 7       | 2       | 9       | 3       | 6       | 252     |

Repetition 5 of each exercise was used as the test set.
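The two headline figures on this slide (85.6% accuracy, 45% of errors involving Rest) can be checked directly against the confusion matrix. The sketch below transcribes the matrix from the slide and recomputes both; the helper names are illustrative only.

```python
# Confusion matrix transcribed from the slide; class order is
# Rest, Idx Flx, Idx Ext, Mid Flx, Mid Ext, Ring Flx, Ring Ext,
# Lit Flx, Lit Ext, Thm Add, Thm Abd, Thm Flx, Thm Ext.
CM = [
    [7078, 36, 1, 28, 43, 1, 4, 26, 1, 39, 86, 2, 32],
    [62, 268, 30, 15, 21, 5, 17, 11, 0, 0, 2, 2, 2],
    [31, 23, 224, 8, 12, 3, 1, 0, 0, 0, 2, 0, 0],
    [30, 31, 3, 285, 14, 7, 8, 13, 3, 3, 0, 0, 0],
    [17, 4, 3, 0, 244, 8, 4, 5, 7, 0, 2, 0, 0],
    [43, 8, 2, 12, 12, 258, 2, 14, 2, 3, 0, 2, 2],
    [39, 30, 3, 17, 36, 8, 218, 7, 3, 2, 0, 0, 10],
    [48, 5, 5, 6, 13, 18, 26, 237, 7, 10, 6, 3, 11],
    [52, 0, 0, 5, 5, 10, 13, 13, 253, 0, 13, 7, 0],
    [7, 10, 0, 0, 0, 0, 10, 4, 0, 161, 6, 70, 7],
    [71, 2, 4, 0, 0, 0, 2, 3, 3, 7, 190, 17, 19],
    [21, 0, 4, 0, 0, 0, 0, 0, 0, 64, 2, 166, 10],
    [32, 0, 10, 2, 2, 0, 0, 7, 2, 9, 3, 6, 252],
]


def accuracy(cm):
    """Fraction of samples on the diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def rest_error_share(cm, rest=0):
    """Fraction of all errors that lie in the Rest row or Rest column."""
    n = len(cm)
    errors = sum(cm[i][j] for i in range(n) for j in range(n) if i != j)
    rest_errors = sum(
        cm[i][j]
        for i in range(n)
        for j in range(n)
        if i != j and (i == rest or j == rest)
    )
    return rest_errors / errors


print(f"accuracy: {accuracy(CM):.1%}")                 # 85.6%
print(f"errors involving Rest: {rest_error_share(CM):.0%}")  # 45%
```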

| Vitale, A. 2022                     | Our work                     |
|-------------------------------------|------------------------------|
| **Common aspects**                  |                              |
| NinaPro DB5 data                    | NinaPro DB5 data             |
| Delta encoding method               | Delta encoding method        |
| 12 classes                          | 12 classes                   |
| **Divergent aspects**               |                              |
| Loihi platform                      | Lattice FPGA                 |
| 100 mW of power consumption         | 1.7 mW of power consumption  |
| 8 sEMG channels                     | 16 sEMG channels             |
| Repetitions 3 and 5 in the test set | Repetition 5 in the test set |
| Accuracy up to 74%                  | Accuracy up to 85.6%         |
| 11.5 MOPS                           | 2.3 MOPS                     |

### **Power consumption**

![](_page_10_Figure_1.jpeg)

- Dynamic frequency scaling with two system frequencies: 10 kHz and 22.5 MHz


### Sparsity and inference time

| Metric             | Value  |
|--------------------|--------|
| Sparsity           | 90.99% |
| Effective sparsity | 77.11% |
| Inference time     | 2.9 ms |
| Inference period   | 100 ms |
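The inference time and period give the fraction of time the system is actively computing. The sketch below computes that duty cycle and shows a simple two-state average-power model. It assumes the system draws a fixed active power while an inference runs and a fixed idle power otherwise; the power values passed in are hypothetical placeholders, not measurements from this work.

```python
INFERENCE_TIME_MS = 2.9     # from the table above
INFERENCE_PERIOD_MS = 100.0  # from the table above


def duty_cycle(active_ms, period_ms):
    """Fraction of each period spent computing an inference."""
    return active_ms / period_ms


def average_power_mw(p_active_mw, p_idle_mw, duty):
    """Two-state model: time-weighted mean of active and idle power."""
    return duty * p_active_mw + (1.0 - duty) * p_idle_mw


duty = duty_cycle(INFERENCE_TIME_MS, INFERENCE_PERIOD_MS)
print(f"active fraction: {duty:.1%}")  # 2.9%

# Hypothetical power figures, for illustration only.
print(average_power_mw(p_active_mw=10.0, p_idle_mw=1.0, duty=duty))
```

With such a small active fraction, the idle power dominates the average in this model, which is why a low idle clock matters.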

![](_page_11_Figure_2.jpeg)


### Conclusions

- Real-time sEMG classification system.
- Classification accuracy in line with the state of the art.
- Hardware (Lattice iCE40 UltraPlus FPGA) and model optimized for operational efficiency.
- Low power consumption (1.7 mW).