RNN, LSTM, GRU

This study highlights the resource requirements of Simple RNN, LSTM, and GRU models.

Only TFLM could be used for these models (1). Also, all the models were deployed on the NUCLEO-L4R5ZI board.

Edge Impulse requires an enterprise account, Renesas eAI Translator cannot convert RNNs (might be solved by further adjustments), and Ekkono does not support RNNs.

Simple 0 is excluded from the figures due to its negligible resource requirements compared to the other models and keeping the figures readable. Simple 2 is also excluded from the figures because its basic version failed to run. All models' information is available in the tables below.

Model	Variant	Parameters	MACs	Error	Exe (ms)	Flash (kB)	RAM (kB)
Simple 0	basic	5	9	0	0.107	110.4375	8.140625
Simple 0	int8 only	5	9	0.005785	0.14567	111.640625	7.1328125
Simple 1	basic	8288	827200	0	107.2074	218.65625	112.6328125
Simple 1	int8 only	8288	827200	0.004366	292.9049	590.8125	103.2578125
Simple 2	basic	32960	3292800	-	-	-	-
Simple 2	int8 only	32960	3292800	0.004072	292.9049	615.1875	106.3828125
Shakespeare 1	basic	12513	1056300	0	141.1776	251.578125	127.625
Shakespeare 1	int8 only	12513	1056300	0.020862	168.4068	620.84375	107.5859375
Shakespeare 2	basic	37249	3321900	0	377.1719	348.203125	175.625
Shakespeare 2	int8 only	37249	3321900	0.021797	323.4767	645.1953125	107.5859375
LSTM	basic	26912	2702400	0	362.2378	472.9609375	268.734375
LSTM	int8 only	26912	2702400	0.013989	565.3515	769.8203125	258.359375
GRU	basic	20896	2094400	0	276.7367	496.984375	298.734375
GRU	int8 only	20896	2094400	0.044773	374.4158	809.7890625	295.359375

Models

Error

Execution Time

Flash Size

RAM Usage

Summary

Model Correctness: Except the basic version of Simple 2 which runs out of memory, all models got relatively acceptable error rates. The most concerning case belongs to GRU which might require some attention.
Execution Time: It is surprising that except for Shakespeare 2, the int8 only versions of the models have higher execution times.
Flash Size: It is surprising that the int8 only versions of the models have larger flash sizes.
RAM Usage: int8 only requires less RAM. (1)
1. We expected the int8 only to have the advantage with a bigger margin.
Conclusion: In most cases, the basic version of the models is more efficient.

LSTM vs GRU

It's believed that LSTM is more powerful than GRU, but GRU is more parameter and computation efficient. Also, in our study, we see that LSTM has more parameters and MACs than GRU. However, it is interesting that using TFLM, GRU has a larger flash size and RAM usage than LSTM. Still, it's executed faster than LSTM.