Distributed Machine Learning on EEG data using Apache Spark

A prototype of a distributed computing engine for processing EEG data

The application implements a basic machine learning process:

Loading the data (in this case from the local file system, not Hadoop)
Feature extraction (here we just scale all metrics to the [0,1] range)
Training the classifier (we train a logistic regression classifier)
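
The feature extraction step above can be illustrated with a minimal, Spark-free sketch of min-max scaling. This is not the application's code: the function name `min_max_scale`, the sample values, and the choice to map a constant column to 0.5 are assumptions made for illustration. Spark MLlib provides a `MinMaxScaler` transformer that performs this kind of per-feature rescaling on a distributed dataset; whether the prototype uses it is not stated here.

```python
from typing import List


def min_max_scale(rows: List[List[float]]) -> List[List[float]]:
    """Rescale every column of `rows` to the [0, 1] range.

    Each value becomes (value - column_min) / (column_max - column_min).
    A constant column has no range to scale by, so it is mapped to 0.5
    (an assumption made for this sketch).
    """
    if not rows:
        return []
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled_row = []
        for value, low, high in zip(row, lows, highs):
            if high == low:
                scaled_row.append(0.5)
            else:
                scaled_row.append((value - low) / (high - low))
        scaled.append(scaled_row)
    return scaled


# Hypothetical metrics: one row per recording, one column per metric.
samples = [[10.0, 200.0], [20.0, 400.0], [30.0, 300.0]]
scaled_samples = min_max_scale(samples)
```

Scaling each metric independently keeps a metric with large raw values from dominating the logistic regression simply because of its units.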

Further improvements:

Load more data
Add better signal processing methods
Improve feature extraction process
Add other classifiers, such as those listed here: https://spark.apache.org/docs/latest/ml-classification-regression.html
Present the metrics of a model better (accuracy, ROC …)
Manage classifiers, i.e. make it possible to load them from files
Design easy application management through arguments such as the input file location, which signal processing methods to use, which classifier to use, where to save results, what metrics to track…
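
As a rough idea of what the argument-driven management in the last item could look like, here is a small sketch using Python's standard `argparse` module. Nothing in it exists in the application yet: every flag name, default value, and choice (`--input`, `--classifier`, `--signal-processing`, `--output`, `--metrics`) is hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a command-line parser; all flag names here are hypothetical."""
    parser = argparse.ArgumentParser(
        description="Train a classifier on EEG data (illustrative sketch)."
    )
    parser.add_argument("--input", required=True,
                        help="location of the input file")
    parser.add_argument("--signal-processing", nargs="*", default=[],
                        help="names of signal processing methods to apply")
    parser.add_argument("--classifier", default="logistic-regression",
                        choices=["logistic-regression", "decision-tree",
                                 "random-forest"],
                        help="which classifier to train")
    parser.add_argument("--output", default="results",
                        help="where to save results")
    parser.add_argument("--metrics", nargs="+", default=["accuracy"],
                        choices=["accuracy", "roc"],
                        help="which metrics to track")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch can be
# run anywhere; the file path is a placeholder.
args = build_parser().parse_args(
    ["--input", "data/eeg.csv", "--metrics", "accuracy", "roc"]
)
```

Keeping all options in one parser means a new classifier or metric only needs a new entry in the relevant `choices` list plus the code that handles it.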

Running the application

Probably the easiest way to get this application running is to clone or check it out from GitHub. It will run even without Apache Spark or Hadoop installed. Setting it up in IntelliJ IDEA is really easy; I don't know about Eclipse.
