I have been meaning to free up space on my phone for a long time. My WhatsApp Images folder holds around 37,000 images. Some of those pictures are important to me and I do not want to delete them, but the majority are memes in Malayalam and English. I concluded that deleting the memes alone would free up a significant amount of space on my phone.
Now that I had a need, the next step was to come up with solutions. I first tried fine-tuning a GoogLeNet to predict whether an image was a meme. But because of the similarity between memes and faces, some of the images classified as memes were photos of my friends and family that I wanted to keep.
As the first method did not meet my requirement, I turned to my favorite ML detection method, YOLOv5. Combined with a powerful annotation tool like Roboflow, YOLO can do wonders, so I decided to give it a try. I downloaded an artificially created dataset of 2,000 images containing text and fine-tuned a pretrained YOLOv5 on it, reaching an accuracy of 99.2%. I am using accuracy here because detecting text in some of the images is a difficult task in itself.
In order to classify an image as a meme, just detecting text is not enough. The text on my T-shirt can throw the model off. So I decided to use some domain expertise. I assumed that text would cover at least 5 percent of the area of a meme. I defined a metric called text coverage = (text area / image area) × 100.
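To make the text coverage metric concrete, here is a minimal sketch of how it could be computed from detected bounding boxes. It assumes the detector returns boxes as (x1, y1, x2, y2) pixel coordinates; the function names `text_coverage` and `is_meme` are hypothetical and not taken from my actual script. Overlapping boxes are simply summed here, so the result is capped at 100.

```python
from typing import Iterable, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels, top-left and bottom-right corners.
Box = Tuple[float, float, float, float]


def text_coverage(boxes: Iterable[Box], width: int, height: int) -> float:
    """Return the percentage of the image area covered by text boxes."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    text_area = 0.0
    for x1, y1, x2, y2 in boxes:
        # Clip each box to the image so stray coordinates cannot inflate the area.
        w = max(0.0, min(x2, width) - max(x1, 0.0))
        h = max(0.0, min(y2, height) - max(y1, 0.0))
        text_area += w * h
    # Overlapping boxes are counted twice in this sketch, hence the cap.
    return min(100.0, text_area * 100 / (width * height))


def is_meme(boxes: Iterable[Box], width: int, height: int,
            min_coverage: float = 5.0) -> bool:
    """Apply the 5 percent rule: enough text coverage means 'probably a meme'."""
    return text_coverage(boxes, width, height) >= min_coverage


# A caption strip across a 100x100 image versus small text on a T-shirt.
caption = [(0, 0, 100, 20)]
tshirt = [(40, 40, 50, 42)]
print(is_meme(caption, 100, 100), is_meme(tshirt, 100, 100))
```

With this rule, a small logo or slogan on clothing stays well under the threshold, while a typical meme caption clears it easily.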
In order to reduce errors and make manual inspection easier, I set text coverage thresholds of 5, 10, 20, 30, 40, and 50 percent, and applied YOLO to the images iteratively, moving them out in descending order of text coverage.
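The thresholding step can be sketched roughly as follows. This is only an illustration of the idea, assuming the text coverage of each image has already been computed; the helper names `bucket_for` and `group_by_bucket` are hypothetical, and the actual moving of files is left out. Each image lands in the highest threshold it reaches, so the most text-heavy images can be reviewed and deleted first.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence

# Thresholds from the post, in percent of image area covered by text.
THRESHOLDS = (5, 10, 20, 30, 40, 50)


def bucket_for(coverage: float,
               thresholds: Sequence[int] = THRESHOLDS) -> Optional[int]:
    """Return the highest threshold the coverage reaches, or None if below all."""
    for threshold in sorted(thresholds, reverse=True):
        if coverage >= threshold:
            return threshold
    return None


def group_by_bucket(coverages: Dict[str, float]) -> Dict[Optional[int], List[str]]:
    """Group file names by bucket; the None key holds images to keep."""
    groups: Dict[Optional[int], List[str]] = defaultdict(list)
    for name, coverage in coverages.items():
        groups[bucket_for(coverage)].append(name)
    return dict(groups)


# Example with made-up file names and coverage values.
sample = {"meme1.jpg": 62.0, "meme2.jpg": 12.5, "family.jpg": 1.3}
for bucket, names in group_by_bucket(sample).items():
    print(bucket, names)
```

Inspecting one bucket at a time keeps each manual check small: the 50 percent bucket should be almost all memes, while the 5 percent bucket deserves a closer look before anything is deleted.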
This method solved my problem. I see potential for a product here.