Screenshot caption
Also known as: screenshot headline, screenshot overlay text
What is a screenshot caption?
A screenshot caption is the marketing text overlaid on an App Store screenshot. Captions sit above, below, or alongside the device mockup and state the one point the frame is there to make. A typical caption is a 4-8 word headline ("Track your runs in real time"), sometimes paired with a 1-2 line sub-caption for context.
Captions matter on two dimensions. For human readers, they tell the story: each frame's caption is a beat in the value-proposition narrative across the set. For App Store indexing, Apple's OCR scans caption text and adds the recognized words to the app's keyword footprint, so caption copy is a free ASO surface.
What makes a screenshot caption convert?
The strongest captions lead with the benefit ("Read 100 books a year"), avoid jargon, and use sentence case rather than ALL CAPS. They're short enough to read at thumbnail size in the above-the-fold area of App Store search results. They avoid stacking too many ideas: one frame, one message.
The weakest captions are generic ("Welcome to MyApp"), passive ("MyApp helps you do things"), or so dense the reader has to stop and parse them. App Store search-result thumbnails give readers maybe one second before they swipe past.
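The guidance above can be turned into a rough pre-flight check. The following is a minimal sketch, not an official tool: the function name `lint_caption`, the 4-8 word bounds (taken from the typical headline length mentioned earlier), and the list of generic openers are all assumptions to adjust for your own screenshot set.

```python
# Hypothetical thresholds based on the guidance in this entry; tune as needed.
MIN_WORDS, MAX_WORDS = 4, 8
GENERIC_OPENERS = ("welcome to",)  # assumption: extend with your own phrases


def lint_caption(caption: str) -> list[str]:
    """Return a list of warnings for a single screenshot headline."""
    warnings = []

    # Headline length: typical captions run 4-8 words.
    words = caption.split()
    if not MIN_WORDS <= len(words) <= MAX_WORDS:
        warnings.append(
            f"word count {len(words)} is outside {MIN_WORDS}-{MAX_WORDS}"
        )

    # Sentence case is preferred over ALL CAPS.
    letters = [c for c in caption if c.isalpha()]
    if letters and all(c.isupper() for c in letters):
        warnings.append("caption is ALL CAPS")

    # Generic openers such as "Welcome to MyApp" carry no benefit.
    if caption.lower().startswith(GENERIC_OPENERS):
        warnings.append("generic opener")

    return warnings
```

Running it over every headline in a set before design work starts is a cheap way to catch the obvious problems. It cannot judge whether a caption leads with a benefit, so treat an empty result as "no obvious issues", not as approval.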
How does caption text affect localization?
Caption text is the highest-leverage localization surface. A German caption can run 60-80 percent longer than the English source (per IBM/W3C text-expansion data), which breaks tight layouts. A Japanese caption changes register (keigo vs futsutai) depending on app category. Caption rewrites typically account for more of the localization work than the rest of the screenshot adaptation.
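To see what the 60-80 percent expansion figure means for a layout, here is a small arithmetic sketch. The function names and the per-line character budget are hypothetical, and the expansion range is simply the one quoted above; real German translations will vary.

```python
def expanded_length_range(
    source_chars: int, low_pct: int = 60, high_pct: int = 80
) -> tuple[int, int]:
    """Estimate the translated length range, rounding up to whole characters.

    Integer arithmetic is used so the result does not depend on float rounding.
    """
    low = (source_chars * (100 + low_pct) + 99) // 100
    high = (source_chars * (100 + high_pct) + 99) // 100
    return low, high


def fits_budget(caption: str, max_chars: int) -> bool:
    """Check the worst-case expanded length against a layout budget.

    max_chars is an assumed limit for your own design, not an Apple figure.
    """
    _, worst_case = expanded_length_range(len(caption))
    return worst_case <= max_chars
```

For the 28-character headline "Track your runs in real time", `expanded_length_range(28)` gives an estimated 45 to 51 characters, so a layout sized tightly around the English text would likely need a smaller font, a second line, or a shorter rewrite in German.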
Related terms
- Apple OCR: Apple's optical character recognition system that scans the text inside App Store screenshots and adds the recognized words to the app's effective keyword index.
- First three frames: the screenshots Apple displays above the fold in App Store search results: positions 1, 2, and 3 from your screenshot set.
- 60/40 rule: the informal guideline derived from Apple's App Store Review Guidelines that screenshots should be at least 60 percent product UI and at most 40 percent marketing collateral (lifestyle photography, oversized text, decorative graphics).
- RTL: RTL (right-to-left) is the layout direction used by Arabic, Hebrew, and several other scripts.
- Keigo: the Japanese system of honorific and humble speech registers used to signal social context.