This work is a group industry project, completed as the final assignment for the FourthBrain Machine Learning Engineer program.
FourthBrain Machine Learning Engineer program website
Name | GitHub profile name | GitHub profile link |
---|---|---|
Christina Morgenstern | morgen01 | Link |
Bashar Naji | basharnaji | Link |
Pawel Dymek | pdymek | Link |
Retinal Optical Coherence Tomography (OCT) is a non-invasive imaging test. OCT uses light waves to take cross-section pictures of your retina. With OCT, your ophthalmologist can see each of the retina's distinctive layers. This allows your ophthalmologist to map and measure their thickness.
It is estimated that 30 million OCT scans are performed each year, and the analysis and interpretation of these images takes up a significant amount of time.
To speed up this process, Machine Learning models can flag scans of patients who might have a disease, so that ophthalmologists can focus on those patients first. One of the challenges in building accurate Machine Learning models is the scarcity of well-annotated data. In this project, we tackle this problem by using the SinGAN model to generate realistic synthetic data that enlarges our training set and improves our prediction results.
Challenges in OCT image analysis:
- Availability of data
- Labeling requires effort and expertise
- Class imbalance
- Speed of analysis is important
- Algorithms still produce erroneous results and require expert intervention
Tools and technologies used:
- Python
- TensorFlow
- Flask
- AWS
- Google Colab
- Kaggle
- GitHub
The dataset is imbalanced across the different classes.
The system workflow is summarized as follows:
- The technician/doctor uploads the OCT scan into our system
- Any Personally Identifiable Information (PII) is removed from the image to protect the privacy of the patient and to comply with local and medical laws. A unique ID is generated for the transaction
- The image is converted to the JPG format that the model requires
- The model will process the image and make a prediction on the patient’s status
- The result of the model and the image will be stored for future reference
- The result will be passed back to the ophthalmologist
- The ophthalmologist will confirm the diagnosis or correct it and that result is fed back to our model storage
- The model should be retrained whenever its performance (accuracy) drops below 92% and every time we acquire 500 new annotated images:
  - SinGAN is used to generate new synthetic data
  - The Xception model is retrained on the new "training" data
- Once we reach 250,000 real annotated images, we should consider turning off the SinGAN step, as we will probably have acquired a sufficient number of labelled images.
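The retraining rules in the list above can be written down as a small policy check. The following is only a minimal sketch: the thresholds (92%, 500, 250,000) come from the text, while `ModelStatus`, `should_retrain`, and `should_use_singan` are hypothetical names introduced for illustration and are not part of the project's code.

```python
from dataclasses import dataclass

# Thresholds taken from the workflow description.
ACCURACY_THRESHOLD = 0.92      # retrain when accuracy drops below 92%
NEW_IMAGES_PER_RETRAIN = 500   # retrain after every 500 new annotated images
SINGAN_CUTOFF = 250_000        # consider disabling SinGAN from this many real images


@dataclass
class ModelStatus:
    """Hypothetical snapshot of the deployed model and its data store."""
    accuracy: float             # latest measured accuracy, in [0, 1]
    new_annotated_images: int   # images confirmed since the last retraining
    total_real_images: int      # all real annotated images collected so far


def should_retrain(status: ModelStatus) -> bool:
    """Return True if either retraining trigger from the workflow fires."""
    return (
        status.accuracy < ACCURACY_THRESHOLD
        or status.new_annotated_images >= NEW_IMAGES_PER_RETRAIN
    )


def should_use_singan(status: ModelStatus) -> bool:
    """Return True while synthetic data generation is still considered useful."""
    return status.total_real_images < SINGAN_CUTOFF
```

For example, `should_retrain(ModelStatus(0.95, 500, 10_000))` is true because 500 new annotated images have accumulated, even though the accuracy is still above the threshold.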
The Xception model was trained on different randomly selected fractions of the data (100%, 75%, 50%, 10%). For subsequent tuning we chose the model trained on 10% of the data, because it reaches a good baseline accuracy with a relatively small amount of data, which is an important factor for practical use of the model. The remaining experiments focus on improving this 10% model.
Training Data Fraction | Test Accuracy | Test Loss (cross entropy) |
---|---|---|
100% | 98.97% | 0.0463 |
75% | 99.48% | 0.0293 |
50% | 98.14% | 0.0587 |
10% | 92.15% | 0.2108 |
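Drawing a random fraction of the training data, as in the table above, could look like the sketch below. It is an illustration under stated assumptions: `sample_fraction` and the file names are hypothetical, and the source does not say whether the original sampling was stratified per class, so this version samples uniformly.

```python
import random


def sample_fraction(items, fraction, seed=0):
    """Return a reproducible random subset containing `fraction` of `items`."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    items = list(items)
    # Keep at least one item for non-empty input, never more than available.
    k = min(len(items), max(1, round(len(items) * fraction)))
    rng = random.Random(seed)  # fixed seed so the subset can be reproduced
    return rng.sample(items, k)


# Hypothetical usage: pick 10% of the image paths for training.
image_paths = [f"img_{i}.jpeg" for i in range(1000)]
subset = sample_fraction(image_paths, 0.10)
```

With an imbalanced dataset, sampling each class separately would preserve the class proportions more reliably than uniform sampling.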
We experimented with different trainable-layer settings: either all layers were trainable, or only selected blocks at the end of the network.
Training data fraction | Trainable layers | Test accuracy | Test loss | Training time (mean ± std. dev. of 7 runs) |
---|---|---|---|---|
10% | all | 92.15% | 0.2108 | 2 min 32 s ± 1.95 s |
10% | Block 14 (last 6 layers) | 85.74% | 0.4135 | 2 min 3 s ± 809 ms |
10% | Block 13 and 14 (last 16 layers) | 98.14% | 0.0662 | 2 min 1 s ± 875 ms |
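Restricting training to the last layers of a Keras model can be sketched as follows. This is an assumption-laden illustration, not the project's actual training code: `set_trainable_tail` and `build_classifier` are hypothetical helpers, `num_classes`, the input size, and the optimizer are placeholders, and the layer counts (6 and 16) are simply the values reported in the table.

```python
def set_trainable_tail(model, n_trainable):
    """Freeze every layer of `model` except the last `n_trainable` ones."""
    layers = list(model.layers)
    cut = max(0, len(layers) - n_trainable)
    for index, layer in enumerate(layers):
        layer.trainable = index >= cut
    return model


def build_classifier(num_classes, n_trainable):
    """Build an Xception-based classifier with only the tail of the base trainable."""
    # TensorFlow is imported lazily so the helper above works without it.
    import tensorflow as tf

    base = tf.keras.applications.Xception(
        weights="imagenet",          # start from ImageNet weights (assumption)
        include_top=False,           # drop the original ImageNet classifier
        input_shape=(299, 299, 3),   # default Xception input size
    )
    set_trainable_tail(base, n_trainable)

    # New classification head, always trainable.
    x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(x)

    model = tf.keras.Model(base.input, outputs)
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Under these assumptions, `build_classifier(num_classes, n_trainable=16)` would correspond to the "Block 13 and 14" row, and `n_trainable=6` to the "Block 14" row. The `trainable` flags are set before `compile`, since Keras requires recompiling for later changes to take effect.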
Training data | Test accuracy | Test loss (cross entropy) |
---|---|---|
10% | 92.15% | 0.2108 |
10% + SinGAN | 98.86% | 0.0328 |
Multiple new images are generated based on sample images.
The application is deployed at the following address:
http://octclf.eba-yntjrjbn.us-west-2.elasticbeanstalk.com/