App Store
Shipped
iOS Lead → Solo Engineer
Turning an AI menu translation prototype into a shipped iOS product
App Store-published iOS AI menu translation product built for real travel scenarios. I took it from a course project into a shipped, multi-region, cost-conscious system with OCR, streaming translation, and layout-preserving overlays.
Started in a course team, then independently pushed into a shipped product with backend convergence, multi-region deployment, and layered cost / abuse control.
MenuGenie is an App Store-published iOS menu translation product. It began as a prototype in a course collaboration; I pushed it into a product built for real travel scenarios, so users can understand a foreign-language menu quickly at the table instead of receiving a detached block of translated text.
The difficulty with foreign-language menus is not just translation. In a restaurant, users rarely have time to slowly decode a complex menu. Real menus often mix vertical and horizontal text, multi-column layouts, prices set tightly against dish names, and inconsistent typography. If the translated result loses the original layout, users still have to remap names, prices, and descriptions by hand, so the real cognitive cost does not drop. In practice, most users simply want to know what they can eat and what they should order.
A second problem is that existing tools often stop at literal translation. When a menu contains regional dishes, unusual ingredients, or non-obvious dish names, users still need to leave the screen to search for dish context, ingredients, or allergen information, then come back and re-orient themselves on the same menu. Some tools also misplace overlays or force users to re-upload the image after switching away. MenuGenie was built to close that broken flow so users could read, decide, and look deeper into a dish within one product.
This project had two stages. In the first, it took shape as a product prototype in a course collaboration, where I was the iOS frontend lead, responsible for turning camera capture, translation screens, overlay rendering, reading modes, and the core interaction flow into a usable iOS prototype.
I later continued pushing the project independently and turned it from a presentable prototype into a shipped, deployable, maintainable product. That later stage included not only feature extension, but also backend convergence, performance diagnosis, Cloudflare edge traffic handling, cost protection, and additional stability and troubleshooting work on the iOS side.
Early on, I turned the backend capabilities into a complete iOS usage flow: users move from capture into the translation screen and understand the menu through overlay rendering and multiple reading modes. I then pushed that flow toward a more productized experience by adding multi-page menu handling, history replay, map exploration, and dish context, turning MenuGenie from a one-off translation screen into a product flow that supports revisiting, comparing, and extending menu understanding.
On the system side, the earliest version used a mixed engineering structure built from Java Spring Boot, a standalone Python OCR service, Cloud SQL, and AWS S3. That setup worked during the prototype phase, but later exposed fragmented stacks, scattered deployment paths, and higher monthly cost. I then consolidated the system around GCP, converging it into a deployment model centered on FastAPI, Cloud Run, Firestore, and GCS, while also adding cloud cleanup and cost-control mechanisms so the side project could stay economical to operate.
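The cleanup and cost-control side can be sketched as a simple retention policy. This is a minimal sketch, not the actual service code: the `RETENTION_DAYS` value, the `(name, created_at)` pair shape, and the object names are all assumptions. In the real system the pairs would come from `bucket.list_blobs()` in the `google-cloud-storage` client (`blob.name` / `blob.time_created`), and stale names would then be deleted with `blob.delete()`.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention window for uploaded menu images / thumbnails.
RETENTION_DAYS = 30

def stale_objects(objects, now=None, retention_days=RETENTION_DAYS):
    """Return the names of objects older than the retention window.

    `objects` is an iterable of (name, created_at) pairs with timezone-aware
    datetimes, mirroring blob.name / blob.time_created from GCS listings.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [name for name, created in objects if created < cutoff]

if __name__ == "__main__":
    # Illustrative data only: one object well past retention, one recent.
    now = datetime(2025, 6, 1, tzinfo=timezone.utc)
    objs = [("menus/a.jpg", now - timedelta(days=45)),
            ("menus/b.jpg", now - timedelta(days=2))]
    print(stale_objects(objs, now=now))  # ['menus/a.jpg']
```

Keeping the policy decision as a pure function like this makes the deletion job easy to test without touching a real bucket.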
On backend efficiency, I also fixed several issues that directly affected waiting time. Breaking the OCR timeline down across frontend and backend logs confirmed that the bottleneck was not Document AI itself, but synchronous side work: GCS uploads, thumbnail generation, and Firestore writes. I reordered the pipeline and moved non-critical work behind the first visible result, which brought OCR processing time down from 10.21s to 6.36s (roughly −38%). To make later troubleshooting and interrupted flows more manageable, I also added OSLog structured logging, task cancellation and interruption cleanup, and CameraService modularization, strengthening the project's performance, stability, and maintenance base.
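The reordering idea can be sketched in a few lines of asyncio. This is a stand-in, not the production handler: `run_ocr` and `side_task` are hypothetical placeholders for the Document AI call and for the GCS upload / thumbnail / Firestore steps, and the timings are artificial. The point is only the ordering: the handler returns as soon as the OCR result is ready, and schedules the side work behind it (in a real FastAPI service, `BackgroundTasks` plays this role).

```python
import asyncio
import time

async def run_ocr(image: bytes) -> str:
    await asyncio.sleep(0.05)        # stands in for the Document AI call
    return "recognized text blocks"

async def side_task(name: str) -> None:
    await asyncio.sleep(0.05)        # stands in for an upload / write

async def handle_upload(image: bytes) -> str:
    """Return the first visible result immediately; defer side work."""
    text = await run_ocr(image)
    # Non-critical work is scheduled after the result the user waits on.
    # Keep references so the tasks are not garbage-collected mid-flight;
    # FastAPI's BackgroundTasks handles this lifetime in the real service.
    _pending = [asyncio.create_task(side_task(n))
                for n in ("gcs_upload", "thumbnail", "firestore_write")]
    await asyncio.sleep(0)           # yield once so side work starts
    return text

if __name__ == "__main__":
    start = time.perf_counter()
    print(asyncio.run(handle_upload(b"menu.jpg")))
    print(f"first result in {time.perf_counter() - start:.2f}s")
```

Run sequentially, the three side tasks would add their full duration to the user-visible wait; deferred, the wait collapses to roughly the OCR time alone, which is the shape of the 10.21s → 6.36s change described above.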
The frontend is a native iOS app. The backend is built primarily on FastAPI, uses Google Document AI for OCR, and relies on Firestore and GCS for state and storage. To support multi-region usage, I deployed the services across Cloud Run regions in Australia and Singapore, and used a Cloudflare Worker for geographic routing so the app only has to talk to a single entry point.
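The Worker itself is JavaScript, but the routing decision is small enough to sketch in Python. Everything here is illustrative: the hostnames are hypothetical, and the AU/NZ split is an assumed policy, not the documented one. The real Worker would read the visitor country (Cloudflare exposes it on `request.cf.country`) and proxy the request to the closer Cloud Run deployment.

```python
# Hypothetical per-region origins; the region IDs are real Cloud Run
# regions (Sydney and Singapore), the hostnames are placeholders.
REGIONS = {
    "australia-southeast1": "https://api-au.example.com",
    "asia-southeast1": "https://api-sg.example.com",
}

# Assumed policy: AU/NZ traffic goes to the Australian deployment,
# everything else to Singapore.
AU_COUNTRIES = {"AU", "NZ"}

def pick_origin(country_code: str) -> str:
    """Map a two-letter country code to the backend origin to proxy to."""
    region = ("australia-southeast1" if country_code in AU_COUNTRIES
              else "asia-southeast1")
    return REGIONS[region]
```

The app never sees any of this: it talks to the single Cloudflare entry point, and the Worker decides which regional Cloud Run service actually serves the request.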
MenuGenie has moved from a course-stage prototype into an App Store-published iOS product with the basic conditions needed for continued deployment, maintenance, and cost control.
The project has now moved beyond proof of concept into a sustainably operating product state: the core flow is live, latency and cost have both been materially reduced, and the maintenance boundary is clear.
App Store: Shipped
OCR Pipeline: −38%
Infra Cost: ~−96%