How I cloned myself in 3 steps using AI tools

The 3‑Part AI Cloning Workflow (That Actually Works)
The other morning, I decided to test something ridiculous: could I make a fully consistent AI version of myself (same look, same vibe, same voice) and drop him into cinematic scenes? Turns out… yep. I now have a digital clone who looks and talks like me, and it only took three tools: Higgsfield Consistent Characters, ElevenLabs, and Google’s Veo 3 (plus a quick CapCut pass at the end to stitch it all together). Here’s how I did it, and how you can too.

1. Higgsfield – Train Your Face
Higgsfield’s “Soul” model is how you get a consistent character. The key is training it on your face. I uploaded ~25 photos of myself from different angles, with different expressions and lighting, so the model understood what makes me, me. Once trained, I used that model to generate cinematic stills with my face in totally different settings: walking through Valencia, chilling on a Costa Rican beach at golden hour, or standing like a Western outlaw. Same character in every shot.

The screenshot above is straight from Higgsfield’s training screen. It recommends uploading 20+ photos and shows exactly what to do (and what to avoid) for the best results: mix angles, lighting, and expressions, and avoid duplicates.

Pro tips for training
- Use 20–30 photos – the more variety, the better.
- Include different angles, lighting, outfits, and expressions.
- Make sure your eyes are visible in most images.
- Mix close‑ups with medium shots to give the model context.
Spend a little extra time on your dataset, and you won’t have to fight for consistency later. Once you have a trained model, you can reuse it forever; just swap in new prompts and you’re good to go.
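
Higgsfield’s training all happens in its web UI, but I like sanity-checking the photo set before uploading. Here’s a minimal Python sketch for that; the folder name, size threshold, and exact-duplicate check are my own conventions, not anything Higgsfield requires:

```python
import hashlib
from pathlib import Path

from PIL import Image  # pip install Pillow

PHOTO_DIR = Path("training_photos")  # hypothetical folder of selfies
MIN_SIDE = 512                       # assumed minimum useful resolution, in pixels

def check_dataset(photo_dir: Path) -> None:
    """Count usable photos, flagging exact duplicates and low-resolution shots."""
    seen_hashes = set()
    kept = 0
    for path in sorted(photo_dir.glob("*")):
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png", ".webp"}:
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        if digest in seen_hashes:
            print(f"Exact duplicate, skip: {path.name}")
            continue
        seen_hashes.add(digest)
        with Image.open(path) as img:
            if min(img.size) < MIN_SIDE:
                print(f"Low resolution {img.size}: {path.name}")
                continue
        kept += 1
    print(f"{kept} usable photos (aim for 20–30 with varied angles, lighting, and expressions).")

if __name__ == "__main__":
    check_dataset(PHOTO_DIR)
```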

2. ElevenLabs – Clone Your Voice
Next up: the voice. I wanted my digital double to actually sound like me, so I headed to ElevenLabs. After uploading a few minutes of my own speech, their VoiceLab created a scarily accurate clone. From there, I can type or paste any script and get audio back in my own tone, cadence, and personality.

When I later bring the visuals and audio together, I swap out any default voices with my ElevenLabs clone. It matches the way I talk (tone, rhythm, and attitude), so the final product feels authentically me. You can even tweak inflection and pacing to suit the mood (serious, playful, monotone, whatever you’re going for).

Quick tip: record your training audio in a quiet room and speak naturally. Don’t over‑enunciate; you want the model to pick up your true speech patterns.
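
Once the clone exists in VoiceLab, you don’t have to generate every line by hand in the browser: ElevenLabs also exposes a text‑to‑speech API. Here’s a rough sketch against the REST endpoint; the API key and voice ID are placeholders, and treat the model name and exact parameters as assumptions to check against the current docs:

```python
import requests  # pip install requests

API_KEY = "YOUR_ELEVENLABS_API_KEY"   # placeholder
VOICE_ID = "YOUR_CLONED_VOICE_ID"     # the voice you cloned in VoiceLab

def speak(text: str, out_path: str = "voiceover.mp3") -> None:
    """Generate an MP3 of `text` in the cloned voice and save it to disk."""
    resp = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
        headers={"xi-api-key": API_KEY, "Content-Type": "application/json"},
        json={
            "text": text,
            "model_id": "eleven_multilingual_v2",  # assumed model; use whatever your plan offers
        },
        timeout=120,
    )
    resp.raise_for_status()
    with open(out_path, "wb") as f:
        f.write(resp.content)  # response body is the audio bytes

if __name__ == "__main__":
    speak("Hey, it's my digital clone speaking.")
```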
3. Google Veo 3 – Bring It to Life
With your stills ready and your voice cloned, it’s time to animate. Google’s Veo 3 (available via the Flow platform) turns frames into short videos. I uploaded my Higgsfield stills and added prompts telling the model exactly what I wanted “me” to say. It generated 8‑second talking‑head clips for each scene, with lip sync, facial expressions, and subtle head movement included.
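I did all of this through the Flow UI, but Veo is also reachable from the Gemini API if you’d rather script it. The sketch below assumes the google-genai Python SDK and a preview Veo 3 model name; double-check the model string and the async pattern against the current docs before relying on them:

```python
import time

from google import genai  # pip install google-genai

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")  # placeholder key

# Kick off an async video-generation job (text-to-video shown here;
# image-to-video from a Higgsfield still works similarly, with extra parameters).
operation = client.models.generate_videos(
    model="veo-3.0-generate-preview",  # assumed preview model name
    prompt=(
        "A man walking through Valencia at golden hour, looking into the camera "
        'and saying: "I built a digital clone of myself this week."'
    ),
)

# Poll until the job finishes, then download the clip.
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

video = operation.response.generated_videos[0]
client.files.download(file=video.video)
video.video.save("clone_scene_01.mp4")
```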

Breaking the dialogue into short chunks helps keep the syncing tight; I ran four separate prompts instead of stuffing an entire paragraph into one video (a tiny helper for splitting a script is sketched below). The stills came to life talking, reacting, and moving naturally, like my digital doppelgänger.
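If you’d rather not eyeball the splits, here’s a small helper that packs whole sentences into chunks that fit one ~8‑second clip. The 2.5 words‑per‑second speaking rate is my own rough assumption; tune it to your pace:

```python
import re

WORDS_PER_SECOND = 2.5  # rough speaking rate (my assumption; adjust to your pace)
MAX_SECONDS = 8         # the clips in this workflow run about 8 seconds

def chunk_script(script: str) -> list[str]:
    """Pack whole sentences into chunks short enough for one clip."""
    max_words = int(WORDS_PER_SECOND * MAX_SECONDS)  # ~20 words per chunk
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    chunks, current = [], []
    for sentence in sentences:
        if current and len(" ".join(current + [sentence]).split()) > max_words:
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks

if __name__ == "__main__":
    script = (
        "The other morning I decided to test something ridiculous. "
        "Could I make a fully consistent AI version of myself? "
        "Turns out, yep. Here is how I did it, and how you can too."
    )
    for n, chunk in enumerate(chunk_script(script), start=1):
        print(f"Prompt {n}: {chunk}")
```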

Fun idea: prompt your clone to say something outrageous just to freak your friends out. It’s uncanny.

4. CapCut – Sync & Polish
Finally, editing. I pulled my clips into CapCut, replaced the default Veo audio tracks with the ElevenLabs voiceovers, and synced everything up. CapCut’s timeline tools make it easy to fine‑tune lip sync, trim dead space, and add transitions. Toss on some background music or color correction if you’re feeling fancy. When you’re done, you’ve got a full-blown AI version of yourself (face, voice, and personality) in any scene you can imagine.
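
CapCut is the friendlier route, but if you only want the mechanical part of the swap (drop the Veo audio, lay in the ElevenLabs track), ffmpeg can do it from a script. This sketch assumes you have ffmpeg installed and the voiceover already trimmed to length; the filenames are placeholders:

```python
import subprocess

def swap_audio(video_in: str, voiceover: str, video_out: str) -> None:
    """Replace a clip's audio track with the voiceover, copying the video stream as-is."""
    subprocess.run(
        [
            "ffmpeg", "-y",
            "-i", video_in,    # Veo clip (its original audio gets dropped)
            "-i", voiceover,   # ElevenLabs MP3
            "-map", "0:v:0",   # video from the first input
            "-map", "1:a:0",   # audio from the second input
            "-c:v", "copy",    # don't re-encode the video
            "-shortest",       # stop at the shorter of the two streams
            video_out,
        ],
        check=True,
    )

if __name__ == "__main__":
    swap_audio("clone_scene_01.mp4", "voiceover.mp3", "clone_scene_01_final.mp4")
```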
⚡ Quick AI-Image & Video News (Week of July 31, 2025)
| Hot Drop | Why It Matters |
| --- | --- |
| Ideogram “Character”: brand-new one-image cloning model. Upload a single selfie, get endless consistent variations. Available to everyone for free. | Zero-friction character consistency is now table stakes. Expect every generator to copy this. |
| Runway Aleph: announced July 25. Their next-gen in-context video model lets you edit existing footage (add, remove, restyle objects, even change camera angles) with just text. | Moves Runway from “make a clip” to “rewrite any clip.” A giant leap for post-production. |
| MiniMax Hailuo 02: their new text- and image-to-video model just launched, delivering rich cinematic output with high temporal consistency. | The upgraded model from China’s MiniMax is a serious Veo 3 rival: faster, smoother, and now publicly available. |
| JSON Prompting Craze (Veo 3): creators are structuring prompts like mini JSON blocks for multi-scene flow and better control (a minimal example follows right after this table). | It’s like building a storyboard, but in code. Great for scenes longer than 8 seconds. |
| HiDream-I1 (open-source): 17B sparse Diffusion Transformer with instruction-based editing, now trending again after its May release. | For local power users: fine-tune your own AI image editor without the black box. |
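
To make that JSON-prompting trend concrete, here’s what a structured multi-scene prompt might look like. There’s no official schema; every field name below is illustrative, and the point is simply that a structured block gives you per-scene control:

```python
import json

# Illustrative multi-scene "JSON prompt" for Veo 3. The schema is made up;
# creators invent their own fields, and the structure itself is what helps.
storyboard = {
    "style": "cinematic, shallow depth of field, golden hour",
    "character": "the consistent man from the reference still, casual linen shirt",
    "scenes": [
        {
            "id": 1,
            "setting": "narrow street in Valencia",
            "action": "walks toward camera, smiles",
            "dialogue": "I built a digital clone of myself this week.",
            "duration_seconds": 8,
        },
        {
            "id": 2,
            "setting": "Costa Rican beach at sunset",
            "action": "sits on driftwood, gestures at the ocean",
            "dialogue": "Same face, same voice, completely different scene.",
            "duration_seconds": 8,
        },
    ],
}

# Paste the resulting JSON block straight into the prompt field.
print(json.dumps(storyboard, indent=2))
```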
Final notes
This workflow takes a bit of setup, but the payoff is huge: one consistent character, one consistent voice, and limitless scenes. Train your Higgsfield model right, clone your voice once, and you can crank out content fast.
I’m curious: what would you do with your own AI clone? Want me to break this down on video, or share my exact prompts and templates? Hit reply and I’ll put something together. Let’s build our army of digital twins together.
— Khalil