html UI for chatbot


Merlin

Re: html UI for chatbot
« Reply #15 on: December 08, 2011, 11:01:45 pm »
I would love to allow selectable TTS.

For number 3, there is a client-side (JavaScript) version of espeak:
https://github.com/kripken/speak.js

Demo:
http://syntensity.com/static/espeak.html

For tight integration, lip-synch should be driven by the audio. For loose integration, like what you might see in a cartoon, it could be driven by the returning text stream or be almost completely random as long as it runs while the audio is playing.
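The loose variant could be sketched roughly like this: pick a random mouth frame on a timer, and only run the timer while the audio is playing. This is a minimal sketch, not a finished player. The names `nextMouthFrame`, `createLooseLipSync` and `attachToAudio` are made up for illustration, frame 0 is assumed to be the closed mouth, and the audio is assumed to come from a standard HTML audio element (which fires `playing`, `pause` and `ended` events).

```javascript
// Pick a random mouth frame index that differs from the previous one,
// so the mouth visibly changes on every tick.
function nextMouthFrame(frameCount, previous, rng = Math.random) {
  if (frameCount < 2) return 0;
  // Choose one of the (frameCount - 1) indices other than `previous`.
  const offset = 1 + Math.floor(rng() * (frameCount - 1));
  return (previous + offset) % frameCount;
}

// Hypothetical driver: calls showFrame(index) at a fixed interval while running.
function createLooseLipSync(frameCount, showFrame, intervalMs = 120) {
  let timer = null;
  let current = 0; // assumption: frame 0 is the closed mouth

  return {
    start() {
      if (timer !== null) return; // already running
      timer = setInterval(() => {
        current = nextMouthFrame(frameCount, current);
        showFrame(current);
      }, intervalMs);
    },
    stop() {
      if (timer === null) return;
      clearInterval(timer);
      timer = null;
      current = 0;
      showFrame(0); // close the mouth when the audio stops
    },
    isRunning() {
      return timer !== null;
    },
  };
}

// Start and stop the animation from the audio element's own events.
function attachToAudio(audio, sync) {
  audio.addEventListener("playing", () => sync.start());
  audio.addEventListener("pause", () => sync.stop());
  audio.addEventListener("ended", () => sync.stop());
}
```

In a page, `showFrame` would swap the visible mouth image, and `attachToAudio(document.querySelector("audio"), sync)` would tie the animation to playback.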

Playing the images is trivial. I am thinking about how to add more structure on top of that so that creation of the avatars can be standardized.




Bragi

Re: html UI for chatbot
« Reply #16 on: December 09, 2011, 08:51:16 am »
Ahh, great. I hadn't yet looked for a JavaScript version of espeak. That solves one more problem: server load.
At first glance, this appears to be an automated port, so some testing will have to be done. By default, the espeak lib doesn't support lip-syncing (there are no viseme events), but it does raise phoneme events (in the form of callbacks), which can be converted into viseme events. Second, those callbacks don't work as-is in a web environment, because the audio is rendered first and then played. So either the callback data needs to be buffered while the audio is being rendered, or the lib needs to be modified so that it supplies an array of viseme events when audio playback starts.
Also, we'll need a reliable way to know when the audio starts to play. I don't know how to do that in JavaScript, but I figure that if games can do it, it should be possible.
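A rough sketch of the buffering idea: collect the phoneme events during rendering, convert them into a viseme timeline, and look up the active viseme from the playback position. Everything here is an assumption for illustration. The event shape `{ phoneme, timeMs }`, the phoneme symbols, the viseme names and the functions `phonemesToVisemes` and `visemeAt` are hypothetical, not part of espeak or speak.js, and the events are assumed to arrive sorted by time.

```javascript
// Hypothetical phoneme-to-viseme table; a real one would cover the full
// phoneme set of the voice in use.
const PHONEME_TO_VISEME = new Map([
  ["p", "closed"], ["b", "closed"], ["m", "closed"],
  ["f", "lip-teeth"], ["v", "lip-teeth"],
  ["a", "open"], ["e", "wide"], ["i", "wide"],
  ["o", "round"], ["u", "round"],
]);
const REST = "rest";

// Convert buffered phoneme events (sorted by timeMs) into a viseme timeline,
// merging consecutive phonemes that map to the same viseme.
function phonemesToVisemes(phonemeEvents) {
  const timeline = [];
  for (const { phoneme, timeMs } of phonemeEvents) {
    const viseme = PHONEME_TO_VISEME.get(phoneme) ?? REST;
    const last = timeline[timeline.length - 1];
    if (!last || last.viseme !== viseme) {
      timeline.push({ viseme, timeMs });
    }
  }
  return timeline;
}

// Return the viseme that is active at a given playback time.
function visemeAt(timeline, timeMs) {
  let active = REST;
  for (const entry of timeline) {
    if (entry.timeMs > timeMs) break;
    active = entry.viseme;
  }
  return active;
}
```

If the rendered audio is played through an HTML audio element, its `playing` event marks the start of playback and `audio.currentTime * 1000` gives the position to pass to `visemeAt`, so the timeline stays aligned with the sound even if playback stalls.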

Quote
For tight integration, lip-synch should be driven by the audio. For loose integration, like what you might see in a cartoon, it could be driven by the returning text stream or be almost completely random as long as it runs while the audio is playing.

Playing the images is trivial. I am thinking about how to add more structure on top of that so that creation of the avatars can be standardized.
Well, I already have my system, which I will be porting to JavaScript so that the same avatars can be used both online and offline. It's based on Verbot's system (I added some things to make it more flexible).
It's structured, but still leaves a lot of freedom in the animations: there is no division like eye-blink, nose-move, and so on, but rather idles, visemes, backgrounds, and so on, combined with Z-order and transparency. Eyes, hair, eyebrows and the like can be handled by convention.
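To make the grouping a bit more concrete, here is a minimal sketch of what such an avatar description could look like in JavaScript. This is not the actual format of my system or of Verbot's. The field names (`image`, `z`, `opacity`), the file names and the `composeFrame` function are all invented for illustration.

```javascript
// Hypothetical avatar definition: layers are grouped by role, not by body part.
// Each layer carries a Z-order and an opacity (transparency).
const avatar = {
  backgrounds: [{ image: "room.png", z: 0, opacity: 1 }],
  idles: [
    { image: "body.png", z: 1, opacity: 1 },
    { image: "eyes-blink.png", z: 3, opacity: 1 }, // "eyes" only by convention
  ],
  visemes: {
    rest: { image: "mouth-rest.png", z: 2, opacity: 1 },
    open: { image: "mouth-open.png", z: 2, opacity: 1 },
  },
};

// Build the list of layers to draw for one frame, back to front.
function composeFrame(avatarDef, visemeName) {
  const hasViseme = Object.prototype.hasOwnProperty.call(avatarDef.visemes, visemeName);
  // Fall back to the resting mouth for unknown viseme names.
  const mouth = hasViseme ? avatarDef.visemes[visemeName] : avatarDef.visemes.rest;
  const layers = [...avatarDef.backgrounds, ...avatarDef.idles, mouth];
  // Sort by Z-order; the sort is stable, so equal z values keep their listed order.
  return layers.sort((a, b) => a.z - b.z);
}
```

Calling `composeFrame(avatar, "open")` yields the background, body, open mouth and eyes in drawing order. A renderer would then draw each image with its opacity, which is why the format does not need to know which layer is an eye or an eyebrow.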

 

