Not generic stock photos. Not "anyone wearing a suit."
Photos that look exactly like you.
The Problem I Was Solving:
Most text-to-image models (Stable Diffusion, DALL-E, Midjourney) are great at creating "a person in a blazer" but terrible at creating you in a blazer.
You can try prompt engineering with descriptions like "brown hair, glasses, oval face," but the output is always someone who looks similar, never identical.
Consistency across multiple images is nearly impossible.
The Technical Approach:
Here's the architecture that made it work:
1. Model Training (Per-User Fine-Tuning)
- User uploads ~30 photos (diverse angles, expressions, lighting)
- We fine-tune a lightweight diffusion model specifically on that person's face
- Training takes ~10 minutes on consumer GPUs (optimized for speed compared with traditional DreamBooth approaches)
- Each model is isolated, encrypted, and stored per-user (no shared dataset pollution)
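To make the training setup concrete, here's a minimal sketch of how a per-user fine-tuning job could be structured. The class name `FineTuneJob`, the minimum-photo threshold, and the adapter filename are illustrative assumptions, not our actual implementation; the real fine-tune itself is stubbed out.

```python
from dataclasses import dataclass, field
from pathlib import Path

MIN_PHOTOS = 20  # hypothetical lower bound near the suggested ~30 uploads

@dataclass
class FineTuneJob:
    user_id: str
    photo_paths: list[Path]
    # Each user's weights live in an isolated, per-user directory,
    # so no training data or weights are ever shared across users.
    output_dir: Path = field(init=False)

    def __post_init__(self):
        if len(self.photo_paths) < MIN_PHOTOS:
            raise ValueError(
                f"need at least {MIN_PHOTOS} photos, got {len(self.photo_paths)}"
            )
        self.output_dir = Path("models") / self.user_id  # per-user isolation

    def run(self) -> Path:
        # In the real system this would launch a short LoRA-style
        # fine-tune of the diffusion model on the user's photos.
        # Here we only model the bookkeeping around it.
        return self.output_dir / "adapter.safetensors"

job = FineTuneJob("user-123", [Path(f"photo_{i}.jpg") for i in range(30)])
print(job.run())  # models/user-123/adapter.safetensors
```

The point of the sketch: isolation is enforced structurally (one output directory per user), not by policy alone.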
2. Facial Feature Lock
This was the hardest part.
Standard fine-tuning often "drifts": the model starts hallucinating features that weren't in the training set (wrong eye color, different nose shape, etc.).
We implemented:
- Identity-preserving loss function that penalizes deviation from core facial geometry
- Expression decoupling so you can change mood/expression without changing facial structure
- Lighting-invariant encoding to maintain consistency across different photo concepts
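To illustrate the first of these, here's a toy version of an identity-preserving penalty. In a real system the embeddings would come from a face-recognition network; here they're plain vectors, and the weighting is an assumption for illustration.

```python
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """1 - cosine similarity; 0 means perfectly aligned embeddings."""
    return 1.0 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def identity_loss(gen_emb: np.ndarray, ref_embs: np.ndarray, weight: float = 1.0) -> float:
    """Penalize deviation of the generated face's embedding from the
    center of the user's reference embeddings (the training photos)."""
    ref_center = np.mean(ref_embs, axis=0)
    return weight * cosine_distance(gen_emb, ref_center)

# Toy check: an embedding aligned with the references incurs ~zero penalty,
# while an orthogonal one (a "drifted" face) is penalized heavily.
refs = np.array([[1.0, 0.0], [0.9, 0.1]])
print(round(identity_loss(np.array([1.0, 0.05]), refs), 3))  # 0.0
```

Adding a term like this to the standard diffusion objective is one way to keep core facial geometry anchored while the rest of the image (expression, lighting, scene) stays free to vary.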
3. Fast Inference Pipeline
- Text prompt → concept parsing → facial feature injection → diffusion head
- 5-second generation time (optimized inference pipeline)
- User can iterate on concepts without re-training
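The prompt flow above can be sketched in a few lines. The identity-token mechanism and stage names here are assumptions for illustration; the diffusion call itself is stubbed out.

```python
# Hypothetical placeholder token bound to the user's fine-tuned model:
IDENTITY_TOKEN = "<user_face>"

def parse_concept(prompt: str) -> dict:
    """Split the free-text prompt into a scene concept. The subject slot
    is always filled by the locked identity, never by the text, so a
    prompt can't override whose face is generated."""
    return {"scene": prompt.strip(), "subject": IDENTITY_TOKEN}

def inject_identity(concept: dict, user_id: str) -> str:
    # Bind the placeholder token to this user's fine-tuned weights.
    return f"{concept['subject']}:{user_id} in {concept['scene']}"

def generate(prompt: str, user_id: str) -> str:
    conditioned = inject_identity(parse_concept(prompt), user_id)
    # The diffusion head would run here (~5 s per image); iterating on
    # the prompt only changes the conditioning, never the trained model.
    return conditioned

print(generate("a navy blazer, studio lighting", "user-123"))
```

This is also why iteration needs no re-training: the per-user weights are fixed, and only the conditioning text changes between generations.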
4. Privacy Architecture
- Models are never shared across users
- Exportable on request
- Auto-deleted after subscription cancellation
- Zero training data retention post-model creation
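Those four guarantees map onto a simple model lifecycle. The class and method names below are illustrative assumptions; the sketch just shows that each guarantee corresponds to a concrete state transition.

```python
class UserModel:
    """Sketch of the per-user model lifecycle implied by the guarantees above."""

    def __init__(self, user_id: str):
        self.user_id = user_id
        self.training_data = ["photo_1.jpg"]   # held only during training
        self.weights = f"weights:{user_id}"    # encrypted, per-user store
        # Zero training-data retention once the model exists:
        self.training_data = None

    def export(self) -> str:
        """Exportable on request: hand the user their own weights."""
        return self.weights

    def on_cancellation(self) -> None:
        """Auto-delete the model when the subscription ends."""
        self.weights = None

m = UserModel("user-123")
assert m.training_data is None  # nothing retained after training
print(m.export())               # weights:user-123
m.on_cancellation()
assert m.weights is None        # gone after cancellation
```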
The Results:
Early testers (mostly LinkedIn creators) report:
- Photos are indistinguishable from real headshots
- Consistency across 50+ generated images
- Posting frequency up 3× because friction is removed
Technical Challenges We're Still Solving:
- Hands (the classic generative AI problem; still working on this)
- Full-body shots (current focus is chest-up portraits, but expanding)
- Extreme lighting conditions (edge cases like backlighting or harsh shadows)
Open Question for This Community:
What's the ethical framework for identity-locked generative models?
On one hand:
- User controls their own likeness
- Private models prevent misuse by others
- It's just efficiency for legitimate use cases
On the other hand:
- Deepfake potential (even if we prevent it, the architecture is out there)
- Erosion of "photographic truth"
- Accessibility could enable bad actors
We've implemented safeguards (watermarking, user verification, exportable audit trails), but I'm curious:
How should tools like this balance convenience with responsibility?
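For concreteness, here's the simplest possible version of one safeguard mentioned above, watermarking, as a least-significant-bit mark on an 8-bit image. A production watermark would be far more robust (and survive compression); this only illustrates the idea.

```python
import numpy as np

def embed_mark(pixels: np.ndarray, bit: int) -> np.ndarray:
    """Set the least significant bit of every pixel to `bit` (0 or 1)."""
    return (pixels & 0xFE) | bit

def read_mark(pixels: np.ndarray) -> int:
    """Recover the mark by majority vote over pixel LSBs."""
    return int(np.round(np.mean(pixels & 1)))

img = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
marked = embed_mark(img, 1)
print(read_mark(marked))  # 1
```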
Happy to dive deeper into the technical architecture or discuss the ethical implications. Would love this community's take.