Following the wave of Ghibli-style and pixel-art AI images sparked by ChatGPT, Google's Gemini 2.5 Flash Image model (codenamed Nano Banana) recently launched a new "photo-to-figure" mode that is drawing further attention in AI image generation. Users can try the feature directly from the Gemini homepage through a "Generate Image with Imagen" button marked with a banana emoji, without needing to open Google AI Studio.
The mode's core function is to turn ordinary photos into images of detailed, collectible-style figurines. Users upload a photo and enter a specific prompt, and the AI generates a complete scene that includes the figurine, its packaging, and a display base. For example, a user might enter: "Please turn this photo into a figurine. The background should be a cardboard box with a translucent plastic window and a person from the photo printed on it. Place the figurine version of the photo on a round plastic base in front of the box. The PVC material should be clearly visible, and an indoor background is preferred." Because each element of the scene is specified in the prompt, the result is highly customizable, which gives anime enthusiasts and collectors a new creative tool.
On a technical level, Gemini 2.5 Flash Image, released on August 26th, is Google's most advanced image generation and editing model. Enterprise users can access it through Vertex AI, priced at $30 per 1 million output tokens. At that rate, a single image (approximately 1,290 output tokens) costs about $0.039 (roughly 0.28 RMB). This AI-based approach offers significant cost and speed advantages over professional design software or traditional figurine production processes.
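The per-image figure follows directly from the token price. The short Python sketch below reproduces that arithmetic using only the numbers quoted above. The function names are illustrative, and the USD-to-RMB exchange rate of 7.2 is an assumption chosen to match the article's rough conversion, not a quoted rate.

```python
import math

# Figures quoted in the article.
PRICE_PER_MILLION_OUTPUT_TOKENS_USD = 30.0
TOKENS_PER_IMAGE = 1290  # approximate output tokens per generated image

# Assumption: illustrative exchange rate, not taken from the article.
ASSUMED_USD_TO_RMB = 7.2


def cost_per_image_usd(tokens_per_image=TOKENS_PER_IMAGE,
                       price_per_million=PRICE_PER_MILLION_OUTPUT_TOKENS_USD):
    """Cost of one image in USD: tokens * (price per token)."""
    return tokens_per_image * price_per_million / 1_000_000


def images_for_budget(budget_usd):
    """Whole number of images a given USD budget covers at the quoted rate."""
    return math.floor(budget_usd / cost_per_image_usd())


if __name__ == "__main__":
    usd = cost_per_image_usd()
    print(f"Cost per image: ${usd:.4f} (about ${usd:.3f})")
    print(f"Approx. RMB at assumed rate: {usd * ASSUMED_USD_TO_RMB:.2f}")
    print(f"Images for a $100 budget: {images_for_budget(100)}")
```

With the quoted numbers, 1,290 tokens at $30 per million tokens comes to $0.0387, which rounds to the $0.039 cited above. Actual token counts per image may vary, so this should be read as an estimate rather than a billing formula.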