Grok Imagine - turns photos to video + voice

With the latest Grok Imagine release, you upload a photo and prompt for what you want the person to say.

It will then make a 6-second video where the person speaks whatever line you've given them. In fact, they perform it.

As well as being rocket fuel for A.I. psychosis, this obviously has exciting implications for anyone generating femdom material...
(This post was last modified: 30 Oct 2025, 10:03 by AI_addict_hypnofan.)
Incidentally, maybe it's time for a Video AI sub-forum, to go alongside images, text and voice?

Although it's still expensive, there are new models launching all the time.
Six months on, and Grok Imagine is much more advanced. You can now combine up to 7 photos plus a text prompt to turn your erotic thoughts/fantasies into 10-second videos at 720p resolution.

Even with just one photo, so long as it's above around 1000 by 1000 pixels, Grok can now create a digital avatar.

I really just recommend you try it out, but this is clearly massively significant for those of us who don't have much of a "mind's eye".

It's not going to be explicit, but it will be perfectly tailored to you. In fact, if you're uploading your own image, it likely will actually involve you as a character.
(This post was last modified: 20 Mar 2026, 14:59 by AI_addict_hypnofan.)
Any examples?
(20 Mar 2026, 15:32 )Like Ra Wrote: Any examples?


Grok Imagine is really, really good at spells.  The results look like they're from a movie.  Upload the image of yourself / your representation, plus the person you want to be casting the spell.   If you're using an A.I. generated image for the spell caster, it actually makes it more photorealistic to fit with yours.

I've been using LLMs to write the prompts for me.  For instance, I will ask one to write 10 historically-evidenced spells where in the foreground a woman is casting a spell to make a man impotent, or locked in chastity, or bound to her, etc., while in the background or to the side he looks on (usually spellbound / befuddled / concerned).
Grok 4.20 is pretty good at it, but not the best.  I like the Chinese models such as Z.ai's GLM-5, DeepSeek, Kimi K2.5, and others you can find on OpenRouter.

I'm using this format for the prompts:

Title/Summary
Visual
Action
Audio 

It can follow detailed requests for sound effects and speech.  Music is slightly behind but still worth playing around with.  

A paid SuperGrok account is pretty much essential, as apparently free generations are very limited now.  Also, people trying to use it to remove clothes or generate conventional sex acts have been complaining non-stop for months that it won't let them.  So it goes without saying: don't expect to be able to do that.

But for fetish, femdom, hypnosis, fantasy, role-play and suggestive themes, it's mind-blowing what Grok's new multi-image-to-video model can do.  It literally feels like it allows you to think in erotic video, in a way that's never been possible before.
(This post was last modified: 21 Mar 2026, 18:30 by AI_addict_hypnofan.)
(20 Mar 2026, 15:32 )Like Ra Wrote: Any examples?
(21 Mar 2026, 18:24 )AI_addict_hypnofan Wrote:  The results look like they're from a movie.
I mean, any real examples?

This is what I got some months ago: https://www.likera.com/forum/mybb/Thread...Year-Cards