Captcha Solver
Amazon Captcha Project
Overview
In this project I built a machine learning model that solves Amazon's six-character captchas. It took a lot of research, along with image transformation and machine learning work. It was challenging, and it taught me a lot about image processing and neural networks.
Development
Generating Captchas
Data
One of the biggest challenges was collecting enough data to train the model. I decided to write a captcha generator that labels each captcha automatically as it creates it. To get images that look close to Amazon's six-character captchas, I had to learn about image transformations, noise and randomness, and character merging.
Generated (200x70)
Real (200x70)
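To give an idea of how a self-labeling generator can work, here is a minimal sketch using Pillow. It is not the generator from this project: the uppercase-only alphabet, the default font, the rotation range, the spacing, and the noise level are all assumptions chosen for illustration, and names such as `render_char` and `make_captcha` are hypothetical. The point it shows is that the label comes for free, because the generator picks the text before drawing it.

```python
import random
import string

from PIL import Image, ImageDraw, ImageFont, ImageOps

WIDTH, HEIGHT = 200, 70             # image size from the captions above
ALPHABET = string.ascii_uppercase   # assumption: uppercase letters only
LENGTH = 6


def render_char(ch, font, rng):
    """Draw one character on a small tile, enlarge it, and rotate it randomly."""
    small = Image.new("L", (14, 14), 255)            # white background
    ImageDraw.Draw(small).text((4, 1), ch, fill=0, font=font)
    tile = small.resize((40, 40))                    # default font is tiny, so upscale
    return tile.rotate(rng.uniform(-25, 25), fillcolor=255)


def make_captcha(rng):
    """Return (image, label). The label is known because we chose the text."""
    label = "".join(rng.choice(ALPHABET) for _ in range(LENGTH))
    font = ImageFont.load_default()
    canvas = Image.new("L", (WIDTH, HEIGHT), 255)

    for i, ch in enumerate(label):
        tile = render_char(ch, font, rng)
        # Tiles are 40 px wide but placed about 28 px apart, so neighbours
        # overlap and glyphs can touch (a crude form of character merging).
        x = 8 + i * 28 + rng.randint(-4, 4)
        y = rng.randint(10, 20)
        # Use the inverted tile as a mask so only dark glyph pixels are pasted.
        canvas.paste(tile, (x, y), mask=ImageOps.invert(tile))

    # Sprinkle random dark pixels over the image as simple noise.
    draw = ImageDraw.Draw(canvas)
    for _ in range(300):
        point = (rng.randrange(WIDTH), rng.randrange(HEIGHT))
        draw.point(point, fill=rng.randint(0, 160))

    return canvas, label
```

Calling `make_captcha(random.Random(0))` returns a 200x70 grayscale image together with its six-character label, and saving the image under a filename that contains the label is one simple way to keep the two together on disk.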
Developing a Neural Network
Training
Training the model was a long process with a lot of trial and error. I first implemented a base five-character model, then gradually generated more and more images while scaling the model up to six characters. I used hyperparameter tuning to find the best-performing configuration. The model struggled with warped characters at first, but eventually learned to recognize them.
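One common way to set up a fixed-length captcha model is to predict one class per character position, so moving from five to six characters means adding one more position. The sketch below shows only the label side of that idea, and it is an assumption rather than a description of the model trained here: the uppercase alphabet and the `encode` and `decode` helpers are hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: uppercase letters only


def encode(label, length):
    """Turn a label into one class index per character position."""
    if len(label) != length:
        raise ValueError(f"expected {length} characters, got {len(label)}")
    return [ALPHABET.index(ch) for ch in label]


def decode(scores):
    """Pick the highest-scoring class at each position and join the characters.

    `scores` is a list with one row per position; each row holds one score
    per character in ALPHABET (for example, a network's per-position outputs).
    """
    chars = []
    for row in scores:
        best = max(range(len(row)), key=row.__getitem__)  # argmax over classes
        chars.append(ALPHABET[best])
    return "".join(chars)
```

With this layout, `encode("ABCXYZ", 6)` gives six training targets, and the same helpers work for a five-character model by passing `length=5`.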
Model Performance on Real Captchas
Results
Validating the model's accuracy was a critical step in the project. At first, I trained the model on a mixture of generated and real Amazon captchas, but the ratio was off, and the model learned mostly from the generated ones. Eventually, I pre-trained the model on generated captchas only and then fine-tuned it on real ones, which reached 93% accuracy on a validation set of real Amazon captchas.
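For a captcha solver, the strictest way to measure accuracy is to count a captcha as solved only when every character is correct. The sketch below assumes that definition, which may or may not be how the 93% figure above was computed. The function names are hypothetical.

```python
def exact_match_accuracy(predictions, labels):
    """Fraction of captchas where the whole predicted string equals the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    solved = sum(pred == true for pred, true in zip(predictions, labels))
    return solved / len(labels)


def char_accuracy(predictions, labels):
    """Fraction of individual characters predicted correctly, position by position."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    total = sum(len(true) for true in labels)
    if total == 0:
        raise ValueError("need at least one character")
    hits = sum(
        p == t
        for pred, true in zip(predictions, labels)
        for p, t in zip(pred, true)
    )
    return hits / total
```

For equal-length strings, per-character accuracy is never lower than exact-match accuracy, since one wrong character out of six fails the whole captcha. That makes the exact-match number the more honest one to report for a solver.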
Conclusion
Lessons Learned and Skills Gained
Reflection
Overall, this project was a challenging but rewarding learning experience. I gained skills in image processing, neural networks, and machine learning. The project taught me to be patient and persistent, and to never give up on a difficult problem. The Amazon captcha project will always be a highlight of my portfolio.