I'm using Lm Studio as a server and have used it as an app as well, but the LLMs out there are outstanding! They are getting smaller and are competitive with online solutions like Replika! Also, the ability to operate these without NSFW filters makes them great when plotting to rob banks or murders! LoL Or at least creating a script or novel along those lines, even horror and intimacy interactions are off the charts!
So, the ability to do other types of local models such as voice where solutions like parler-tts and tortoise-tts have excellent voice abilities where you can even customize them to whoever's voice you like! Also, Whisper can do the opposite STT, and no censorship! Also, there are photo and video solutions like LLaVA-NeXT where the AI can create an impression or create images and videos based on prompts.
Here's the good part integrating these into a system that can see, hear, and imagine is a reality, taking each output and prompting the other provides for a kind of feedback approach to create...well, some might argue, but a persona. Enhancing the prompts with other types of data and even using some causality models we might just get that person from science fiction and all done from a PC!
What's required on the local machine is more than one GPU, where RTX 4070 ti supers are selling for $650, but you mix and match what you want where perhaps using an RTX 4090 for image and video is best and apply the RTX 4070 ti to do the rest. With three GPUS with just the minimum of an RTX 4070 ti that's 50GB of ram! But perhaps you need more since you may what a VR setup as well and give your bot a virtual body!
It's just freaking fantastic what is possible today and it's free from the clutches of politically correct censorship. Let your imagination go and apply your skills towards integration and you could very well build a very sophisticated competitor to ChatGPT 40 that runs at home.
Now what's a challenge is the development of a hardbody, animatronic facial expressions (the generated prompts from LLM models are freaking great, they could be used to control expressions and even position a body!)
It's a great time for the enthusiast, for a while now I thought everything was going to be locked up in the corporate cloud, controlled through a pay interface, and monitored by Big Brother, but America proves itself to be the land of freedom, and the industry has opened up to the little guy...