Run CLIP on iPhone to search photos

I built an app called Queryable, which integrates the CLIP model on iOS to search the Photos album offline. It is available on the App Store today, and I thought it might be helpful to others who are as frustrated with the search function of Photos as I was, so I wrote this article to introduce it.

CLIP comes from OpenAI. To run on iOS devices in real time, I made a compromise between performance and model size and finally chose the ViT-B-32 model, splitting it into a separate Text Encoder and Image Encoder. In Python, the basic usage looks like this:

```python
import clip
import torch
from PIL import Image

# Load the ViT-B-32 CLIP model and its preprocessing pipeline
model, preprocess = clip.load("ViT-B/32", device="cpu")

# Compute the image and text embeddings
image_feature = model.encode_image(preprocess(Image.open("photo.jpg")).unsqueeze(0))
text_feature = model.encode_text(clip.tokenize("rainy night"))

# Cosine similarity between the image and the text
sim = torch.nn.functional.cosine_similarity(image_feature, text_feature)
```

Integrate CLIP into iOS

I exported the Text Encoder and Image Encoder to Core ML models using the coremltools library (a rough export sketch is included at the end of this post). The reason I split the Text Encoder and Image Encoder into two models is that, when you actually use this Photos search app, your input text changes with every query, but the content of the Photos library is fixed: the image features can be computed once in advance, and each search only needs to run the Text Encoder (the second sketch at the end of this post outlines this query-time flow).

Compared to the search function of the iPhone Photos app, how much does CLIP-based album search improve things? The answer: it is overwhelmingly better.

The CLIP model itself resizes the input image to a very small size, so if your image is stored in iCloud, this does not affect search accuracy, except that you cannot view the original image in the search results.
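Below is a minimal sketch of what the export might look like with coremltools, assuming the PyTorch CLIP package from OpenAI; the TextEncoder wrapper, input name, and output file name are my own illustration, not the exact script Queryable uses.

```python
# Hedged sketch: converting CLIP's Text Encoder to Core ML with coremltools.
# The wrapper class, tensor name, and file name below are illustrative.
import clip
import coremltools as ct
import numpy as np
import torch


class TextEncoder(torch.nn.Module):
    """Exposes only the text branch of CLIP so it can be traced on its own."""

    def __init__(self, clip_model):
        super().__init__()
        self.clip_model = clip_model

    def forward(self, tokens):
        return self.clip_model.encode_text(tokens)


model, _ = clip.load("ViT-B/32", device="cpu")
model.eval()

# Trace the text branch with a dummy tokenized prompt (shape 1 x 77)
example_tokens = clip.tokenize("a dog on the beach")
traced = torch.jit.trace(TextEncoder(model), example_tokens)

# Convert the traced graph to a Core ML program; the Image Encoder is exported
# the same way, traced with a dummy 1 x 3 x 224 x 224 image tensor instead.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="tokens", shape=tuple(example_tokens.shape), dtype=np.int32)],
    convert_to="mlprogram",
)
mlmodel.save("TextEncoder.mlpackage")
```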
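To make the split concrete, here is a rough outline of the query-time flow in plain PyTorch rather than the app's Swift and Core ML code; the photo paths and the `search` helper are illustrative. Image features for the whole library are computed once and cached, and each query only encodes the text and ranks the cached features by cosine similarity.

```python
# Hedged sketch of the search flow: precompute image features once,
# then encode only the text at query time and rank by cosine similarity.
import clip
import torch
from PIL import Image

model, preprocess = clip.load("ViT-B/32", device="cpu")
model.eval()

# One-off indexing pass over the photo library (paths are illustrative)
photo_paths = ["IMG_0001.jpg", "IMG_0002.jpg", "IMG_0003.jpg"]
with torch.no_grad():
    images = torch.stack([preprocess(Image.open(p)) for p in photo_paths])
    image_features = model.encode_image(images)
    image_features /= image_features.norm(dim=-1, keepdim=True)


def search(query: str, top_k: int = 3):
    """Encode the query text and rank the cached image features against it."""
    with torch.no_grad():
        text_feature = model.encode_text(clip.tokenize([query]))
        text_feature /= text_feature.norm(dim=-1, keepdim=True)
    # Features are normalized, so the dot product is the cosine similarity
    sims = (image_features @ text_feature.T).squeeze(1)
    best = sims.argsort(descending=True)[:top_k]
    return [(photo_paths[i], sims[i].item()) for i in best.tolist()]


print(search("rainy night"))
```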


