GPT Image 2 — Fuse 16 Images, Render Any Text, Edit Any Photo
OpenAI's latest AI image model. Upload up to 16 reference photos and fuse them into one coherent scene, render multilingual typography readably inside images, and edit any photo from a natural-language prompt — on Nano Banana.

Supported uploads: JPEG, PNG, or WEBP, max 30MB each, up to 16 images.
What GPT Image 2 Can Do — With the Prompts to Try
Ten ways creators are already using GPT Image 2 in real workflows. Each one below is a specific capability of the model, the kind of output it produces, and a ready-to-copy prompt you can paste into the tool above.

Thinking Mode: Generate Multiple Coherent Images from One Prompt
Unlike earlier models that produce a single image per prompt, GPT Image 2 can reason through a brief, verify details against its world knowledge, and return multiple consistent images from one instruction. Useful for storyboards, campaign variants, and anything that needs "the same subject seen several ways."
Generate 4 consistent product shots of the same ceramic coffee mug: front view on white cyclorama, side view with morning light, top-down flat-lay with coffee beans scattered around, lifestyle shot on a wooden café table. Keep the mug design identical in all four.
Use case: storyboards · campaign variants · product photography sets · consistent character sheets

Multi-Reference Fusion — Up to 16 Input Images in One Prompt
Upload up to sixteen photos and refer to them in your prompt by number ("image 1", "image 2"…). GPT Image 2 reasons across all of them at high fidelity and fuses subject, style, background, lighting, and composition into a single coherent output. This is the clearest "what I wanted but couldn't get before" moment for most creators.
Combine the character from image 1, the outfit from image 2, the background from image 3, and the lighting mood from image 4 into one coherent photograph. Match the camera angle of image 1.
Use case: product placement · virtual try-on · composite scenes · brand asset adaptation
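Before sending a batch of references, it can help to check them against the upload limits this page states (JPEG, PNG, or WEBP; 30MB each; up to 16 images). The following is a minimal sketch of such a check. The function and constant names are hypothetical and not part of any official SDK, and reading "30MB" as 30 MiB is an assumption.

```python
from pathlib import Path

# Limits quoted on this page. Names are illustrative, not an official API.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
MAX_BYTES_PER_IMAGE = 30 * 1024 * 1024  # assumption: "30MB" read as 30 MiB
MAX_IMAGES = 16


def check_references(files):
    """Return a list of problems for (filename, size_in_bytes) pairs.

    An empty list means the batch fits the stated limits.
    """
    problems = []
    if len(files) > MAX_IMAGES:
        problems.append(f"too many images: {len(files)} > {MAX_IMAGES}")
    for index, (name, size) in enumerate(files, start=1):
        # Compare extensions case-insensitively so "photo.JPG" is accepted.
        if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
            problems.append(f"image {index} ({name}): unsupported format")
        if size > MAX_BYTES_PER_IMAGE:
            problems.append(f"image {index} ({name}): larger than 30 MB")
    return problems
```

The problem messages use the same "image 1", "image 2" numbering as the prompts, so a rejected file is easy to match to the reference it was meant to supply.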

Near-Perfect Multilingual Text Rendering
GPT Image 2 renders text inside images (headlines on posters, labels on product packaging, signage, menus, recipe captions) far more readably than previous models, with significant gains on non-Latin scripts including Japanese, Korean, Chinese, Hindi, and Bengali. The long-standing "garbled AI text" problem is largely solved for short-to-medium strings.
Design a 3:4 vertical poster for a new Chinese bubble tea launch. Modern minimalist style with vibrant colors, appetizing visuals, bold Chinese and English typography reading '春季限定 · Spring Edition'.
Use case: posters · ad creatives · product packaging · menu boards · multilingual campaigns

Step-by-Step Recipe & Instructional Infographics
Dense structured layouts with labeled ingredients, process arrows, quantity callouts, and a plated hero shot: GPT Image 2 handles all of it in a single generation. Labels render legibly in the supported scripts, though quality may vary for dense text in less-represented languages.
Create a step-by-step recipe infographic for creamy garlic mushroom pasta, top-down view, minimal white background, ingredient photos labeled with exact quantities like '200g spaghetti' and '150g mushrooms', dotted lines and icons for each process step, final plated dish at the bottom, clean modern style.
Use case: recipe cards · how-to guides · Xiaohongshu / Pinterest infographics · educational visuals

Tutorial Screenshots with Accurate UI & Labels
Product documentation teams have been waiting for this one. GPT Image 2 can generate a realistic software UI — with correct toolbar labels, menu text, numbered step annotations, and arrow callouts — accurate enough to serve as a tutorial illustration without a real screen capture.
Generate a realistic screenshot tutorial showing step-by-step how to configure domain capture in Charles Proxy. Include detailed English labels on every UI element, clean professional layout, numbered steps 1 through 5 with arrows, and a brief caption under each step.
Use case: product docs · onboarding tutorials · help center articles · app store screenshots

360° Immersive & Historical Virtual Tours
An under-the-radar capability driving a wave of viral posts: GPT Image 2 generates equirectangular 360° panoramas that can be loaded directly into VR headsets or mobile viewers. The thinking step pulls in period architecture, signage, and cultural detail automatically.
360 equirectangular image of Istiklal Street, Istanbul in 1900, highly detailed historical architecture, accurate period signage and text on shops, realistic atmosphere, cinematic lighting.
Use case: virtual tours · museum exhibits · immersive storytelling · time-travel experiences
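A full equirectangular panorama covers 360° horizontally and 180° vertically, so the image is twice as wide as it is tall. This page does not state what dimensions the model returns for panoramas, so a quick check before loading an output into a viewer may be useful. The sketch below uses hypothetical helper names and shows the size check plus the standard pixel-to-angle mapping.

```python
def is_equirectangular_size(width, height):
    """A full 360 x 180 degree equirectangular image has a 2:1 aspect ratio."""
    return width > 0 and height > 0 and width == 2 * height


def pixel_to_angles(x, y, width, height):
    """Map a pixel position to (yaw, pitch) in degrees.

    Yaw runs from -180 at the left edge to +180 at the right edge.
    Pitch runs from +90 at the top edge to -90 at the bottom edge.
    """
    yaw = (x / width) * 360.0 - 180.0
    pitch = 90.0 - (y / height) * 180.0
    return yaw, pitch
```

For example, the center pixel of a 3840×1920 image maps to yaw 0 and pitch 0, which is the direction a viewer faces by default. A 3840×2160 image fails the size check because it is 16:9 rather than 2:1.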

Organizational Charts & Business Diagrams (with Iterative Edits)
Complex hierarchy, connector lines, department names, small footnotes — the stuff that used to mean an afternoon in PowerPoint. GPT Image 2 generates it in one pass, then lets you iterate on specifics ("fix the footnote, add two subsidiaries under Engineering") while keeping the rest of the layout identical.
Create a professional organizational chart for a public tech company in clean corporate style, with accurate department names, clear hierarchy boxes, connecting lines, and small footnote text at the bottom.
Edit the previous organizational chart: fix the footnote text, add two new subsidiaries under Engineering, update the CEO name. Keep exact same style and layout.
Use case: org charts · flowcharts · business diagrams · consulting deliverables · pitch decks

Commercial-Grade Menu Boards & Magazine Spreads
Dense body copy, photography, and brand identity in one layout. GPT Image 2 produces output that holds up at print resolution, not just as a thumbnail.
Create a full detailed restaurant menu board for a modern Italian café — elegant design with appetizing food photos, clear prices, Chinese and English dish names, readable small-text descriptions, high-resolution commercial quality, print-ready.
Use case: restaurant menus · product catalogs · magazine editorials · brand books · print collateral

Style Transfer Between Photos
Take the aesthetic of one image and apply it to the subject of another. GPT Image 2 preserves the subject's identity, composition, and pose from one reference while borrowing the color palette, medium, and mood from another.
Apply the art style from image 1 to the subject in image 2. Keep the composition, facial identity, and pose from image 2 exactly as shown.
Use case: art direction · illustration · concept exploration · brand-aesthetic consistency

Natural-Language Photo Editing — No Masks, No Layers
Upload any photo, describe the change in plain English, and GPT Image 2 locates the region, applies the edit, and preserves everything you didn't mention. Background swap, object removal, outfit change, outpainting, photo restoration — all from one prompt interface.
Replace the background with a rainy Tokyo street at night. Keep the subject, clothing, and facial features unchanged. Match the lighting of the new scene — cool blue rim light from the back, warm street-lamp glow on the face.
Use case: background replacement · object removal · outfit change · photo restoration · outpainting
Frequently Asked Questions About GPT Image 2
What is GPT Image 2?
GPT Image 2 is OpenAI's latest image generation and editing model. It turns a text prompt into a high-resolution image, edits existing photos with natural-language instructions, and composes up to 16 reference images into a single coherent output — all from the same model.
How is GPT Image 2 different from Nano Banana 2?
Both are strong image models with different sweet spots. GPT Image 2 is stronger on multi-reference reasoning, text rendered inside images (labels, posters, product copy), and complex cross-image composition. Nano Banana 2 has best-in-class character consistency across long image series and tends to be faster for straightforward single-image edits. Both are in your Nano Banana account, so you can run the same prompt through each and keep the result that fits.
What languages does GPT Image 2 render text in?
GPT Image 2 renders Latin-script text (English and other European languages) and shows significant gains on non-Latin scripts: Japanese, Korean, Chinese, Hindi, and Bengali are the languages OpenAI specifically calls out as dramatically improved. In practice the model also handles other scripts reasonably for short strings, though quality may vary for dense paragraphs in less-represented languages.
Does GPT Image 2 support transparent backgrounds?
No — GPT Image 2 doesn't currently output transparent PNG. If you need a transparent background, use the AI Remove Background free tool as a follow-up step, or pick a different model in your Nano Banana account that supports alpha channels.
What resolutions does GPT Image 2 output?
Up to 4K-class output — the model supports resolutions with a maximum long edge of around 3840px, covering 3840×2160 landscape and 2160×3840 portrait, plus 1:1, 3:2, 2:3, and flexible ratios in between. Outputs above 2K are considered experimental on OpenAI's side and may vary in consistency.
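As a rough illustration of the arithmetic, the sketch below derives pixel dimensions for an aspect ratio from the 3840px long-edge cap quoted above. It assumes the long edge is the only constraint and reads "2K" as a 2048px long edge. Both are assumptions, and the function names are hypothetical. The real service may apply other limits, such as a total pixel budget.

```python
MAX_LONG_EDGE = 3840       # "around 3840px" per this page
EXPERIMENTAL_ABOVE = 2048  # assumption: "2K" read as a 2048px long edge


def size_for_ratio(ratio_w, ratio_h, long_edge=MAX_LONG_EDGE):
    """Return (width, height) for an aspect ratio with the longer side set to long_edge."""
    if ratio_w <= 0 or ratio_h <= 0:
        raise ValueError("ratio parts must be positive")
    if long_edge > MAX_LONG_EDGE:
        raise ValueError(f"long edge {long_edge} exceeds {MAX_LONG_EDGE}")
    if ratio_w >= ratio_h:
        # Landscape or square: width is the long edge.
        return long_edge, round(long_edge * ratio_h / ratio_w)
    # Portrait: height is the long edge.
    return round(long_edge * ratio_w / ratio_h), long_edge


def is_experimental(width, height):
    """True when the long edge is above the assumed 2K threshold."""
    return max(width, height) > EXPERIMENTAL_ABOVE
```

With these assumptions, size_for_ratio(16, 9) gives 3840×2160 and size_for_ratio(9, 16) gives 2160×3840, matching the landscape and portrait sizes listed above. Both fall in the range described as experimental.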
Is GPT Image 2 free to use on Nano Banana?
New users get free credits on sign-up, and you can earn more through daily check-ins and by inviting friends. After your free credits run out, each generation uses a small number of credits based on the output quality tier. See the pricing page for current rates.
Can I use images generated or edited with GPT Image 2 commercially?
Yes. Once you've generated an image on a paid plan, you can use it for personal, commercial, or creative projects. All outputs carry invisible provenance watermarking (per OpenAI's responsibility-by-design policy), but this does not affect visual quality or usage rights.
How do I write prompts for multi-image edits?
Label your references clearly. Something like "Place the product from image 1 into the scene in image 2. Apply the color style from image 3. Keep the camera angle from image 2." is far more reliable than "combine these photos." Tell GPT Image 2 which image supplies which role (subject, background, style, lighting, outfit) so it does not have to guess how the references relate.
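The role-per-image pattern described above can be templated so every reference gets an explicit job. This is a minimal sketch with a hypothetical helper name. It only assembles the prompt text and does not call any API. The wording of the generated sentences is one possible phrasing, not a required format.

```python
MAX_IMAGES = 16  # reference limit stated on this page


def build_fusion_prompt(role_to_image):
    """Build a multi-reference prompt from a {role: image_number} mapping.

    Example roles: "subject", "background", "style", "lighting", "outfit".
    """
    if not role_to_image:
        raise ValueError("at least one role is required")
    parts = []
    for role, number in role_to_image.items():
        if not 1 <= number <= MAX_IMAGES:
            raise ValueError(f"image number {number} is outside 1..{MAX_IMAGES}")
        parts.append(f"Use the {role} from image {number}.")
    parts.append("Combine them into one coherent photograph.")
    return " ".join(parts)
```

Calling build_fusion_prompt({"subject": 1, "background": 2}) produces "Use the subject from image 1. Use the background from image 2. Combine them into one coherent photograph." Any further constraint, such as which camera angle to keep, can be appended by hand.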
Does GPT Image 2 really preserve faces and logos on edits?
It preserves them substantially better than earlier image-to-image models — OpenAI processes every input image at high fidelity specifically to retain faces, logos, and product details. It's not pixel-perfect on every edit, and very aggressive style transfers can soften identity. For the strongest face consistency across a series of edits, Nano Banana 2 is usually the stronger choice.




