Boosting Photo App Retention with Face Detection and Clustering
Using DSFD, InsightFace, and DBSCAN to surface the photos people actually care about — and how it moved 30-day retention by up to 150 basis points.
People don't care about all their photos. They care about photos of people they love. That was the bet behind our retention work for AT&T Photos and Verizon Photos: if we surface the photos that matter most — the ones with familiar faces — users will come back more often.
We needed a face detector that handles real-world photo library conditions: small faces in group shots, side profiles, partial occlusion, wildly varying lighting. We went with Dual Shot Face Detector (DSFD) for its multi-scale feature processing. A single family photo might contain faces ranging from 20 pixels to 200 pixels wide — DSFD handles this range without needing separate detection passes.
Once faces are detected, InsightFace (trained with ArcFace loss) generates 512-dimensional embeddings. Three properties mattered for our use case:
We fine-tuned on a diverse dataset to improve accuracy on underrepresented demographics. When your product serves millions of people, equitable performance isn't a nice-to-have.
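To make the embedding step concrete, here is a minimal sketch of how 512-dimensional face embeddings can be compared. It assumes cosine similarity on L2-normalized vectors, which is the usual pairing for ArcFace-style embeddings but is not confirmed as our exact production setup. The helper names and the synthetic vectors are hypothetical stand-ins; no real model is called.

```python
import numpy as np


def l2_normalize(embeddings: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products equal cosine similarity."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    return embeddings / np.clip(norms, 1e-12, None)


def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Return the pairwise cosine similarity between all rows."""
    unit = l2_normalize(embeddings)
    return unit @ unit.T


# Synthetic stand-ins for real face embeddings (hypothetical data):
# two noisy views of "person A" and one view of "person B".
rng = np.random.default_rng(0)
person_a = rng.normal(size=512)
person_b = rng.normal(size=512)
faces = np.stack([
    person_a + 0.1 * rng.normal(size=512),
    person_a + 0.1 * rng.normal(size=512),
    person_b + 0.1 * rng.normal(size=512),
])

sim = cosine_similarity_matrix(faces)
print(np.round(sim, 2))
```

In this toy setup the two views of the same person score close to 1, while the cross-person pair sits near 0. That separation is what the clustering step relies on.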
With embeddings computed, DBSCAN was the natural clustering choice:
Epsilon, DBSCAN's neighborhood radius, was the critical knob. Too low and you split one person into multiple clusters; too high and you merge different people. We built a validation set from manually labeled photo libraries and found the sweet spot empirically.
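The epsilon search can be sketched as a simple sweep against labeled data. This is an illustration under stated assumptions, not our production code: it uses scikit-learn's `DBSCAN` with a cosine metric, scores each candidate with a pairwise F1 (a metric chosen here for illustration), and runs on synthetic embeddings. The grid values, `min_samples=2`, and the helper names `pairwise_f1` and `sweep_eps` are all hypothetical.

```python
from itertools import combinations

import numpy as np
from sklearn.cluster import DBSCAN


def pairwise_f1(true_ids, pred_labels) -> float:
    """F1 over face pairs: a pair is positive when both faces share a cluster.

    DBSCAN noise points (label -1) never count as sharing a cluster.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_ids)), 2):
        same_true = true_ids[i] == true_ids[j]
        same_pred = pred_labels[i] == pred_labels[j] and pred_labels[i] != -1
        if same_pred and same_true:
            tp += 1
        elif same_pred:
            fp += 1  # merged two different people
        elif same_true:
            fn += 1  # split one person apart
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def sweep_eps(embeddings, true_ids, eps_grid, min_samples=2):
    """Cluster once per candidate eps and return the best eps plus all scores."""
    scores = {}
    for eps in eps_grid:
        labels = DBSCAN(
            eps=eps, min_samples=min_samples, metric="cosine"
        ).fit_predict(embeddings)
        scores[eps] = pairwise_f1(true_ids, labels)
    best = max(scores, key=scores.get)
    return best, scores


# Synthetic validation set (hypothetical): 3 people, 10 faces each.
rng = np.random.default_rng(42)
centers = rng.normal(size=(3, 512))
true_ids = np.repeat(np.arange(3), 10)
embeddings = centers[true_ids] + 0.5 * rng.normal(size=(30, 512))

best_eps, scores = sweep_eps(embeddings, true_ids, [0.05, 0.1, 0.3, 0.5, 1.1])
print(best_eps, scores)
```

On this toy data the smallest values leave every face as noise, the largest merges everyone into one cluster, and the middle of the grid recovers the three people cleanly. Real libraries are far messier, so the usable range is typically much narrower than this example suggests.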
The face clusters powered three user-facing features:
| App | 30-Day Retention Improvement |
|---|---|
| AT&T Photos | +30 bps |
| Verizon Photos | +150 bps |
The gap between AT&T and Verizon came down to baseline engagement — Verizon's user base had more room to move.
In a subscription business with millions of users, a gain of 150 basis points (1.5 percentage points) in 30-day retention compounds over time, because each retained user represents years of subscription revenue. The infrastructure cost of the CV pipeline was a rounding error compared to the retained lifetime value.
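To put the basis-point figures in user terms, here is a small arithmetic sketch. It assumes the uplift is an absolute change in the retention rate, and it uses a cohort size of one million purely as a hypothetical round number; actual cohort sizes are not stated here.

```python
def extra_retained_users(cohort_size: int, uplift_bps: float) -> int:
    """Convert a retention uplift in basis points into additional retained users.

    One basis point is 0.01 percentage points, i.e. 1/10,000 of the cohort.
    """
    return round(cohort_size * uplift_bps / 10_000)


# Hypothetical cohort size, chosen only to make the arithmetic concrete.
COHORT_SIZE = 1_000_000

# Uplift values taken from the results table above.
uplifts_bps = {"AT&T Photos": 30, "Verizon Photos": 150}

extra = {
    app: extra_retained_users(COHORT_SIZE, bps)
    for app, bps in uplifts_bps.items()
}
for app, users in extra.items():
    print(f"{app}: +{users:,} retained users per {COHORT_SIZE:,}")
```

Under these assumptions, 30 bps works out to 3,000 additional retained users per million and 150 bps to 15,000 per million, before counting the subscription revenue each of those users goes on to generate.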