What is PyBCI?

Statement of need

PyBCI addresses the growing need for real-time Brain-Computer Interface (BCI) software capable of handling diverse physiological sensor data streams. By leveraging robust machine learning libraries such as PyTorch, scikit-learn and TensorFlow, alongside the Lab Streaming Layer protocol, PyBCI facilitates the integration of real-time data analysis and model training. This opens up avenues for researchers and practitioners not only to receive and analyze physiological sensor data but also to develop, test and deploy machine learning models seamlessly, fostering innovation in the rapidly evolving field of BCIs.

General Overview

PyBCI is a Python based brain-computer interface software package designed to receive one or more Lab Streaming Layer (LSL) enabled physiological sensor data streams. An understanding of time-series data analysis, the Lab Streaming Layer protocol, and machine learning techniques is required to integrate innovative ideas with this interface. An LSL marker stream is required to train the model: each received marker epochs the data received on the accepted data streams based on a configurable time window around certain markers. Custom marker strings can optionally be split and overlapped to count for more than one marker, for example:

A baseline marker may be sent once for a 60 second window, whereas target actions may only be ~0.5 s long, so to conform when testing the model and to give a standardised window length, it is desirable to split the 60 s window following the received baseline marker into ~0.5 s windows. By overlapping windows we try to account for potentially missed signal patterns/aliasing; as a rule of thumb, an overlap of 50% or greater is advised when testing a model (see the Shannon-Nyquist criterion). See here for more information on epoch timing.
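To make the splitting and overlap concrete, below is a small standalone sketch (illustrative only, not PyBCI's internal implementation) that divides a 60 s baseline epoch into 0.5 s windows with 50% overlap; the 250 Hz sample rate and random data are assumed placeholders.

    import numpy as np

    fs = 250                           # assumed sample rate in Hz
    epoch = np.random.randn(60 * fs)   # placeholder 60 s single-channel baseline epoch

    window_s = 0.5                     # desired window length in seconds
    overlap = 0.5                      # 50% overlap, per the rule of thumb above
    window_len = int(window_s * fs)
    step = int(window_len * (1 - overlap))

    # Slide a 0.5 s window across the 60 s epoch in 50%-overlapping steps.
    windows = [
        epoch[start:start + window_len]
        for start in range(0, len(epoch) - window_len + 1, step)
    ]
    print(f"{len(windows)} windows of {window_len} samples each")  # 240 windows of 125 samples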

Once the data has been epoched it is sent for feature extraction. A general feature extraction class is provided, which can be configured for general time- and/or frequency-based features, ideal for data stream types like “EEG” and “EMG”. Since data analysis, preprocessing and feature extraction techniques can vary greatly between devices, a custom feature extraction class can be created for each data stream type; an illustrative sketch is given below. See here for more information on feature extraction.
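As an illustration of the kind of per-epoch processing a custom class might perform, the sketch below computes a simple time-domain feature (RMS) and a frequency-domain feature (dominant frequency) per channel. The class and method names here are hypothetical; the interface PyBCI actually expects from a custom feature extraction class is described in the feature extraction documentation linked above.

    import numpy as np

    class CustomFeatureExtractor:
        """Hypothetical per-device extractor (names are illustrative, not
        PyBCI's required interface). Operates on one epoch of shape
        (channels, samples) and returns a 1-D feature vector."""

        def __init__(self, sample_rate=250):
            self.sample_rate = sample_rate

        def process_epoch(self, epoch):
            rms = np.sqrt(np.mean(epoch ** 2, axis=1))        # time domain: RMS per channel
            spectra = np.abs(np.fft.rfft(epoch, axis=1))      # frequency domain: magnitude spectrum
            freqs = np.fft.rfftfreq(epoch.shape[1], d=1.0 / self.sample_rate)
            dominant = freqs[np.argmax(spectra[:, 1:], axis=1) + 1]  # peak frequency, skipping DC
            return np.concatenate([rms, dominant])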

Finally, a customisable PyTorch, sklearn or TensorFlow classifier can be passed to the bci class. Once a defined number of epochs has been obtained for each received epoch/marker type, the classifier can begin to fit the model. It's advised to use ReceivedMarkerCount() to query the number of training epochs received for each marker type; once the minimum count across all types is greater than or equal to minimumEpochsRequired (default 10 of each epoch type), the model will begin to fit. Once fit, the classifier info can be queried with CurrentClassifierInfo(), which returns the model used and its accuracy. When enough epochs have been received or a high enough accuracy is obtained, TestMode() can be called. Once in test mode you can query what pybci estimates the current bci epoch to be (typically a “baseline” marker is given in the training period to represent the no-action state). Review the examples for sklearn, PyTorch and TensorFlow model implementations; a minimal sketch of the workflow is shown below.
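The sketch below outlines this training-to-testing loop. Only ReceivedMarkerCount(), CurrentClassifierInfo(), TestMode(), minimumEpochsRequired and createPseudoDevice are named in the text above; the import path, the TrainMode() call, the clf keyword, the structure returned by ReceivedMarkerCount() and the CurrentClassifierMarkerGuess() query are assumptions here, so check the linked examples for the exact API.

    import time
    from sklearn.neural_network import MLPClassifier
    from pybci import PyBCI   # assumed import path and class name

    MIN_EPOCHS = 10                                  # matches the minimumEpochsRequired default
    clf = MLPClassifier(max_iter=1000)               # any sklearn-style classifier

    bci = PyBCI(minimumEpochsRequired=MIN_EPOCHS,
                createPseudoDevice=True,             # run without LSL hardware (see below)
                clf=clf)                             # 'clf' keyword is an assumption

    bci.TrainMode()                                  # assumed call to start collecting training epochs
    while True:
        counts = bci.ReceivedMarkerCount()           # assumed to map marker type -> epochs received
        if counts and min(counts.values()) >= MIN_EPOCHS:
            break
        time.sleep(1)

    info = bci.CurrentClassifierInfo()               # returns the model used and its accuracy
    print(info)

    bci.TestMode()                                   # switch to live classification
    for _ in range(20):
        guess = bci.CurrentClassifierMarkerGuess()   # assumed query for the estimated current epoch
        print("Estimated marker:", guess)
        time.sleep(0.5)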

All of the examples found on the GitHub repository that are not in a dedicated folder have a pseudo LSL data generator enabled by default by setting createPseudoDevice=True, so the examples can run without the need for LSL-capable hardware.