Benchmarking

Finding the best seizure prediction algorithms

The top-performing algorithms from the ‘Melbourne-University AES-MathWorks-NIH Seizure Prediction Challenge’ have provided an initial set of benchmarking algorithms. In the contest, the test set was divided into a public test set (30% of the test set), used to rank algorithms until the end of the contest, and a private test set (70% of the test set), used to determine the winners at the end of the contest.

People participating in Epilepsyecosystem.org are encouraged to train their algorithms on the contest data with knowledge of the training and public test set labels. To obtain Area Under the Curve (AUC) performance scores on the private test set (as per the terms and conditions), contact Dr Levin Kuhlmann (levin.kuhlmann@monash.edu) and submit your complete test set predictions as a solution file, with each prediction scaled between 0 and 1 as an estimate of preictal probability. Algorithms are ranked on the evolving ecosystem leaderboard.
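
As a rough sketch of the expected shape (the segment filenames below are placeholders, and the ‘File’/‘Class’ column names are an assumption based on the contest’s sample submission format, not an ecosystem specification), a solution file can be assembled in Python like this:

    import pandas as pd

    # Hypothetical preictal probabilities, one per test segment; real
    # filenames follow the contest's naming scheme.
    predictions = {
        "new_1_1.mat": 0.87,
        "new_1_2.mat": 0.12,
        "new_1_3.mat": 0.55,
    }

    # 'File'/'Class' column names are assumed from the contest's sample
    # solution file.
    solution = pd.DataFrame(
        {"File": list(predictions), "Class": list(predictions.values())}
    )
    solution.to_csv("solution.csv", index=False)

The evaluator can then score such a file against the withheld private test set labels with, for example, sklearn.metrics.roc_auc_score(hidden_labels, solution["Class"]).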

The Incentive - The Ultimate Benchmark

The top algorithms in the ecosystem will have the opportunity to be evaluated annually on the full dataset of 15 patients from the NeuroVista trial; the contest data represent a subset of data from three of these patients. Evaluating algorithms on the full trial dataset will help us find the best algorithms for the widest range of patients and motivate larger-scale clinical trials of seizure prediction devices. The ecosystem organisers will invite the top teams to submit their code for independent evaluation on the full dataset.

To participate in the ultimate benchmark test, people are required to:

  • Prepare algorithms with Python 3 (preferably using Anaconda).

    • This will ensure a standardised evaluation of seizure prediction performance and computational efficiency of the algorithms.

  • Make algorithms efficient such that the time taken to classify a 10-minute data segment is at most 30 seconds on a single core.

    • This duration needs to include all feature calculation and classification steps of a pretrained algorithm.

  • Make algorithms that use at most 100 MB of RAM when classifying a 10-minute data segment (a sketch for checking both the runtime and memory budgets follows this list).

  • Submit code so that the independent evaluator can easily retrain and test your algorithm on new data (using the same filename structure given for the contest data) and on different data segment file durations; a minimal interface sketch is given below.
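
A minimal sketch for checking the runtime and memory budgets on a single segment, assuming the contest format of 16 iEEG channels sampled at 400 Hz (classify() below is a stand-in for a real pretrained pipeline):

    import time
    import tracemalloc

    import numpy as np

    # Stand-in for one 10-minute segment: 16 channels at 400 Hz (assumed
    # from the contest data format).
    segment = np.random.randn(16, 400 * 600).astype(np.float32)

    def classify(segment):
        # Placeholder: all feature calculation and classification steps
        # of the pretrained algorithm belong inside this call.
        return float(np.mean(np.abs(segment)) > 0.5)

    tracemalloc.start()
    start = time.perf_counter()
    probability = classify(segment)
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    print(f"Runtime: {elapsed:.1f} s (budget: 30 s on a single core)")
    print(f"Peak allocations: {peak_bytes / 1e6:.0f} MB (budget: 100 MB)")

Note that tracemalloc only tracks Python-level allocations; a stricter check of total resident memory could use, for example, resource.getrusage.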
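
One possible shape for such a submission (a sketch, not a required API) exposes retraining and prediction as single calls, with channel counts and segment durations never hard-coded. The .mat layout and the I_J_K.mat naming below, where the trailing index is the class label, are assumptions based on the contest data:

    from pathlib import Path

    import numpy as np
    from scipy.io import loadmat
    from sklearn.linear_model import LogisticRegression

    def load_segment(path):
        # Assumed contest .mat layout: a 'dataStruct' record whose 'data'
        # field is a (samples, channels) array.
        mat = loadmat(path)
        return mat["dataStruct"]["data"][0, 0].T  # -> (channels, samples)

    class SeizurePredictor:
        def __init__(self):
            self.model = LogisticRegression(max_iter=1000)

        def _features(self, data):
            # Duration-agnostic features (per-channel mean and standard
            # deviation), so segments of any length yield the same
            # feature dimension.
            return np.concatenate([data.mean(axis=1), data.std(axis=1)])

        def fit(self, training_dir):
            X, y = [], []
            for path in sorted(Path(training_dir).glob("*.mat")):
                # Assumed I_J_K.mat naming: K is the class (1 = preictal).
                y.append(int(path.stem.rsplit("_", 1)[-1]))
                X.append(self._features(load_segment(path)))
            self.model.fit(np.array(X), np.array(y))
            return self

        def predict_proba(self, path):
            x = self._features(load_segment(path))
            return float(self.model.predict_proba([x])[0, 1])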

Evaluation of algorithm performance on the full trial dataset will also include weighting algorithm outputs by time-varying factors, such as circadian and multidien rhythms, interictal spike rates, and/or the temperature of each patient’s local region of residence. This will help to find new ways to tailor patient-specific algorithms. In addition, performance will be evaluated on different seizure types within a patient, giving insight into the role seizure types play in seizure prediction.