Software Design

Overview

The software subsystem processes and classifies forearm EMG signals into discrete gesture commands. It is responsible for ingesting raw sensor data, organizing it into temporally structured inputs, and interfacing with the trained Temporal Convolutional Network (TCN) to perform gesture inference. In addition to model execution, the software subsystem manages data preprocessing, train-validation splitting, performance evaluation (including confusion matrix generation), and result visualization. Together, these components enable reliable interpretation of muscle activity and serve as the computational bridge between the physical sensing hardware and higher-level control or interaction logic.

ML Code

TDNN

Time-delay neural networks are feedforward models that incorporate temporal context by concatenating time-shifted inputs, rather than explicitly modeling state over time. While computationally simple, this approach is generally less robust for complex temporal patterns and less flexible than more modern sequence models.
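The time-shifted-concatenation idea can be sketched in a few lines. This is an illustrative toy (the function name, delays, and tensor sizes are ours, not part of the project code): each delayed copy of the signal is stacked along the channel dimension so a plain feedforward layer can see a short window of history.

```python
import torch

# Hypothetical sketch of the TDNN idea: temporal context is built by
# concatenating time-shifted copies of the input, after which an
# ordinary feedforward layer can be applied. Sizes are illustrative.
def time_delay_features(x: torch.Tensor, delays=(0, 1, 2)) -> torch.Tensor:
    """x: (batch, time, channels) -> (batch, time, channels * len(delays))."""
    shifted = []
    for d in delays:
        # Shift the signal back by d steps, zero-padding the front so the
        # result stays causal and keeps the original length.
        pad = x.new_zeros(x.shape[0], d, x.shape[2])
        shifted.append(torch.cat([pad, x[:, : x.shape[1] - d, :]], dim=1))
    return torch.cat(shifted, dim=-1)

x = torch.randn(8, 100, 3)      # 8 windows, 100 time steps, 3 EMG channels
feats = time_delay_features(x)  # -> (8, 100, 9)
```

Because the temporal context is fixed by the chosen delays, widening it means widening the input layer, which is the inflexibility noted above.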

LSTM

LSTMs, a widely used form of recurrent neural network, were considered due to their strong ability to capture temporal dependencies and their extensive support in machine-learning frameworks such as PyTorch and TensorFlow. However, LSTMs rely on sequential processing, which makes them comparatively slower and less efficient on resource-constrained hardware. Although LSTMs are particularly well suited for forecasting tasks, the EMG signals in this project are relatively consistent over short time windows, reducing the need for long-term temporal memory.

TCN

Temporal Convolutional Networks combine the strengths of convolutional models with effective temporal modeling. TCNs use causal, dilated convolutions to capture temporal dependencies while allowing for parallel computation, making them significantly faster and more efficient than recurrent models.
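The causal, dilated convolution at the heart of a TCN can be demonstrated with a single PyTorch layer. This is a minimal illustration, not the project's model: left-padding by (kernel_size - 1) * dilation ensures each output sample depends only on current and past inputs while the whole sequence is processed in parallel.

```python
import torch
import torch.nn as nn

# One causal, dilated 1-D convolution (illustrative sizes).
kernel_size, dilation = 4, 2
conv = nn.Conv1d(in_channels=3, out_channels=32,
                 kernel_size=kernel_size, dilation=dilation)

x = torch.randn(1, 3, 100)                # (batch, channels, length)
pad = (kernel_size - 1) * dilation        # 6 steps of left padding
y = conv(nn.functional.pad(x, (pad, 0)))  # pad only the left side: causal
# y has the same length as x, and y[..., t] never sees x[..., t+1:]
```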

Final Choice

We selected a TCN because prior work shows they often match or slightly outperform LSTMs on sequence-classification tasks such as physiological signal analysis while maintaining lower inference latency. Even when performance gains are marginal, the improved computational efficiency and suitability for embedded deployment make TCNs the best fit for this project.

TCN Architecture

from pytorch_tcn import TCN

num_classes = 5  # five hand gesture categories

model = TCN(
  num_inputs = 3, # 3 sensors
  num_channels = [32, 32, 64],
  kernel_size = 4,
  dilations = None,
  dilation_reset = None,
  dropout = 0.1,
  causal = True,
  use_norm = 'weight_norm',
  activation = 'relu',
  kernel_initializer = 'xavier_uniform',
  use_skip_connections = False,
  input_shape = 'NCL',
  embedding_shapes = None,
  embedding_mode = 'add',
  use_gate = False,
  lookahead = 0,
  output_projection = num_classes,
  output_activation = None
)

Explanation of the Architecture

num_inputs = 3

Specifies the number of input channels per time step. In this project, the model receives EMG data from three forearm sensors, each treated as a separate input channel.

num_channels = [32, 32, 64]

Defines the number of convolutional filters in each temporal block of the network. The increasing channel depth allows the model to learn progressively higher-level temporal features from the EMG signals.

kernel_size = 4

Sets the temporal width of the convolutional kernels, meaning each filter processes windows of four consecutive time steps to capture short-term temporal dependencies in the signal.

dilations = None

Uses the default exponentially increasing dilation factors across layers, enabling the network to model long-range temporal dependencies without increasing kernel size.
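A back-of-the-envelope calculation shows why exponential dilations matter. Assuming three temporal blocks with dilations 1, 2, 4 and two causal convolutions of kernel size 4 per block (a common TCN layout; the library's internals may differ slightly), the receptive field grows well beyond the kernel width:

```python
# Approximate receptive field under the stated assumptions.
kernel_size = 4
dilations = [1, 2, 4]        # default: doubles with depth
convs_per_block = 2          # assumed two convolutions per temporal block
receptive_field = 1 + convs_per_block * (kernel_size - 1) * sum(dilations)
# -> 43 time steps of context from kernels only 4 steps wide
```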

dilation_reset = None

Indicates that dilation factors are not periodically reset and instead follow the default dilation progression through the network depth.

dropout = 0.1

Applies a 10% dropout rate during training to reduce overfitting by randomly disabling neurons and encouraging more robust feature learning.

causal = True

Enforces causal convolutions so that predictions at a given time step depend only on past and present inputs, which is essential for real-time EMG gesture recognition.

use_norm = 'weight_norm'

Applies weight normalization to stabilize training by decoupling the magnitude and direction of weight vectors, improving convergence behavior.

activation = 'relu'

Uses the Rectified Linear Unit activation function, introducing nonlinearity while maintaining computational efficiency.

kernel_initializer = 'xavier_uniform'

Initializes convolutional weights using Xavier uniform initialization, helping maintain stable signal variance across layers at the start of training.

use_skip_connections = False

Disables residual skip connections between layers. While skip connections can improve gradient flow, they were not used in this configuration to keep the architecture simpler.

input_shape = 'NCL'

Specifies the input tensor format as (batch size, number of channels, sequence length), which is appropriate for time-series EMG data.
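As an illustration of the 'NCL' layout, a raw EMG window read from a CSV typically arrives as (time steps, sensors) and must be transposed and batched before inference. The window size here is hypothetical:

```python
import numpy as np
import torch

# A raw window as read from a CSV: 200 rows (time steps) x 3 sensor columns.
window = np.random.randn(200, 3).astype(np.float32)   # (L, C)

# Transpose to channels-first and add a batch dimension -> (N, C, L).
x = torch.from_numpy(window).T.unsqueeze(0)           # torch.Size([1, 3, 200])
```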

embedding_shapes = None

Indicates that no additional learned embeddings (e.g., for categorical metadata) are incorporated into the model.

embedding_mode = 'add'

Defines how embeddings would be combined with inputs if present; here, embeddings would be added element-wise, though none are used.

use_gate = False

Disables gated activations (as in gated TCNs). While gating can improve expressiveness, it increases computational cost and was not required for this task.

lookahead = 0

Specifies zero lookahead, meaning the model does not access future time steps. Although this parameter is deprecated in the library, a value of zero reinforces strict causality.

output_projection = 5

Projects the final network output to five classes, corresponding to the five hand gesture categories used in the classification task.

output_activation = None

Applies no activation function at the output layer, producing raw logits that are later processed by a loss function such as softmax cross-entropy during training.
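The logits-plus-cross-entropy contract described above can be shown end to end. A stand-in classifier (not the real TCN) is used here so the sketch is self-contained; the batch size and window length are illustrative:

```python
import torch
import torch.nn as nn

# Stand-in for the TCN: any module mapping (N, C, L) to (N, num_classes)
# raw logits follows the same training contract.
stand_in = nn.Sequential(
    nn.Conv1d(3, 32, kernel_size=4),  # toy temporal feature extractor
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),          # pool over the time dimension
    nn.Flatten(),
    nn.Linear(32, 5),                 # project to the five gesture classes
)

x = torch.randn(16, 3, 200)           # (N, C, L)
logits = stand_in(x)                  # raw scores, shape (16, 5), no activation
labels = torch.randint(0, 5, (16,))
loss = nn.CrossEntropyLoss()(logits, labels)  # applies log-softmax internally
loss.backward()
```

Because CrossEntropyLoss applies log-softmax itself, adding a softmax at the output layer would be redundant during training, which is why output_activation is left as None.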

Confusion matrix for EMG gesture classification

Confusion matrix summarizing the performance of the EMG gesture classification model across all CSV files. Rows correspond to the true gesture labels and columns to the predicted labels. The model demonstrates near-perfect classification for active gestures, achieving 100% accuracy for full_fist, open_hand, and true_rest, and 90% accuracy for air_pinch, with a single misclassification as open_hand. The primary source of error is confusion between rest and true_rest: most rest samples are predicted as true_rest, indicating highly similar EMG signatures. Because these two classes differ mainly in whether the arm is supported on a flat surface rather than in forearm muscle activation, they may be better merged into a single class.

The confusion matrix was generated by evaluating the trained EMG gesture classification model on a held-out validation set created using an 80/20 split of the available training data. Specifically, 80% of the labeled EMG samples were used to train the model, while the remaining 20% were reserved for evaluation to assess generalization performance on unseen data. After training, the model’s predictions on this validation subset were compared against the corresponding ground-truth labels, and the resulting counts of correct and incorrect classifications were aggregated into the confusion matrix.
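The evaluation procedure can be sketched with scikit-learn. Dummy arrays stand in for the real EMG features and model predictions; the class names match the project's gesture labels, everything else is illustrative:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

classes = ["air_pinch", "full_fist", "open_hand", "rest", "true_rest"]

X = np.random.randn(500, 30)           # placeholder feature windows
y = np.random.randint(0, 5, size=500)  # placeholder gesture labels

# 80/20 train-validation split, stratified so class ratios are preserved.
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# ... train the model on (X_tr, y_tr) ...

y_pred = np.random.randint(0, 5, size=len(y_val))  # stand-in for predictions
cm = confusion_matrix(y_val, y_pred, labels=range(5))  # rows: true, cols: predicted
```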

Next Steps

Code Architecture

.
├── README.md
├── activate_scripts
│   └── …
├── data
│   ├── air_pinch
│   │   └── …
│   ├── full_fist
│   │   └── …
│   ├── open_hand
│   │   └── …
│   ├── rest
│   │   └── …
│   └── true_rest
│       └── …
├── gpu-requirements.txt
├── ml
│   ├── checkpoints
│   │   └── best_model.pt
│   ├── confusion_matrix.py
│   ├── model.py
│   └── quantize_tcn.py
├── python
│   ├── clean_data.py
│   ├── main.py
│   └── read_sensor_values.py
└── requirements.txt