Captcha Solver

Amazon Captcha Project

Overview

TensorFlow

GCP

OpenCV

Python

This project involved building a machine learning model that solves Amazon's six-character captchas. It required substantial research, image transformation work, and model training. The project was challenging, but it taught me a lot about image processing and neural networks.

Development

Generating Captchas

Data

One of the biggest challenges was collecting enough data to train the model. I decided to build a captcha generator that labels each image automatically as it creates it. To get output that was close to Amazon's six-character captchas, I had to learn about image transformations, noise and randomness, and character merging.
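The automatic labeling idea can be sketched as follows. This is a minimal illustration, not the project's generator: it only samples the label and a per-character layout, and leaves the actual drawing (for example with OpenCV) out. The uppercase alphabet, the `Glyph` and `CaptchaSpec` types, and the `glyph_width`, `max_overlap`, and `max_angle` parameters are all assumptions made for the example. Only the 200x70 image size and the six-character length come from the text.

```python
import random
import string
from dataclasses import dataclass

ALPHABET = string.ascii_uppercase  # assumed character set, not confirmed by the source
CAPTCHA_LENGTH = 6
WIDTH, HEIGHT = 200, 70  # image size from the captions


@dataclass
class Glyph:
    char: str     # character to draw
    x: int        # left edge in pixels
    angle: float  # rotation in degrees
    scale: float  # size multiplier


@dataclass
class CaptchaSpec:
    label: str    # ground-truth text, known because we chose it
    glyphs: list  # one Glyph per character, left to right


def sample_captcha_spec(rng, glyph_width=30, max_overlap=8, max_angle=25.0):
    """Sample a random label plus the layout a renderer would draw.

    The label is chosen first, so every generated image is labeled for free.
    A positive overlap pulls a character into its left neighbor, which is
    one simple way to imitate merged characters.
    """
    label = "".join(rng.choice(ALPHABET) for _ in range(CAPTCHA_LENGTH))
    glyphs = []
    x = rng.randint(2, 10)  # random left margin
    for char in label:
        angle = rng.uniform(-max_angle, max_angle)
        scale = rng.uniform(0.9, 1.1)
        glyphs.append(Glyph(char, x, angle, scale))
        x += glyph_width - rng.randint(0, max_overlap)
    return CaptchaSpec(label, glyphs)


spec = sample_captcha_spec(random.Random(0))
print(spec.label, [g.x for g in spec.glyphs])
```

Passing a seeded `random.Random` keeps the dataset reproducible, which helps when comparing models trained on the same generated images.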

Generated (200x70)

Real (200x70)

Developing a Neural Network

Training

Training the model was a long process that involved a lot of trial and error. I first implemented a base five-character model, then generated progressively more images while scaling the model up to six characters. I used hyperparameter tuning to select the best model. The model had considerable trouble with warped characters at first, but it eventually learned to recognize them.
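The text does not describe the network's output layout. One common choice for fixed-length captchas is a separate classification output per character position, and the sketch below shows the label encoding such a layout would need. It is written in plain Python for clarity. The uppercase alphabet and the function names `encode_label` and `decode_prediction` are assumptions for illustration.

```python
import string

ALPHABET = string.ascii_uppercase  # assumed character set, not confirmed by the source
CAPTCHA_LENGTH = 6                 # the base model would use 5 here


def encode_label(label):
    """Turn a label into one one-hot row per character position."""
    if len(label) != CAPTCHA_LENGTH:
        raise ValueError(f"expected {CAPTCHA_LENGTH} characters, got {len(label)}")
    target = []
    for char in label:
        row = [0.0] * len(ALPHABET)
        row[ALPHABET.index(char)] = 1.0  # raises ValueError for unknown characters
        target.append(row)
    return target


def decode_prediction(scores):
    """Pick the highest-scoring character at each position."""
    chars = []
    for row in scores:
        best = max(range(len(row)), key=row.__getitem__)
        chars.append(ALPHABET[best])
    return "".join(chars)


target = encode_label("ABCXYZ")
print(len(target), len(target[0]))
print(decode_prediction(target))
```

Under this layout, moving from five to six characters changes only the number of output positions, so the rest of the pipeline can stay the same.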

Model Performance on Real Captchas

Results

Validating the model's accuracy on real captchas was a critical step in the project. At first, I trained the model on a mixture of generated and real Amazon captchas, but the ratio was off and the model learned mostly from the generated images. Eventually, I pre-trained the model on generated captchas only and then fine-tuned it on real ones, which resulted in 93% accuracy on a validation set of real Amazon captchas.
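Accuracy for a captcha solver can be measured in two ways, and the text does not say which one the 93% figure refers to. The sketch below computes both: whole-captcha accuracy, where all characters must match, and per-character accuracy, which is more lenient. The function names and the sample strings are made up for illustration.

```python
def captcha_accuracy(predictions, labels):
    """Fraction of captchas where every character is correct."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(pred == true for pred, true in zip(predictions, labels))
    return correct / len(labels)


def character_accuracy(predictions, labels):
    """Fraction of individual characters that are correct."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    total = 0
    correct = 0
    for pred, true in zip(predictions, labels):
        total += len(true)
        correct += sum(p == t for p, t in zip(pred, true))
    return correct / total


# Hypothetical validation batch: one captcha has a single wrong character.
labels = ["ABCDEF", "GHIJKL", "MNOPQR", "STUVWY"]
predictions = ["ABCDEF", "GHIJKL", "MNOPQR", "STUVWX"]
print(captcha_accuracy(predictions, labels))
print(character_accuracy(predictions, labels))
```

Whole-captcha accuracy is the stricter number, since a single wrong character makes the entire prediction count as a miss.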

Conclusion

Lessons Learned and Skills Gained

Reflection

Overall, this project was a challenging but rewarding learning experience. I gained skills in image processing, neural networks, and machine learning. It also taught me to be patient and persistent when working through a difficult problem. The Amazon captcha project will always be a highlight of my portfolio.