Ai Dreams Forum

Member's Experiments & Projects => General Project Discussion => Topic started by: jcbdev on January 04, 2018, 08:51:59 pm

Title: QLearning/Experience replay with texas holdem (python)
Post by: jcbdev on January 04, 2018, 08:51:59 pm
Hello!

Just joined the forum.  Been working on a fun project over Christmas and thought I'd drop by here to get some advice/feedback from some real pros  ;)

here is the project:
https://github.com/jcbdev/holdemq

The general concept is to build a bot (I'm thinking Slack-based at the mo) that constantly plays itself in the background, hopefully improving as it goes.  I also hope to let real people play games against it over chat.  Those real games will be fed back into the background trainer too.

It's very early stages and I haven't really figured out what my network or hyperparameters should look like yet.  But the spine of the code is there.  Currently it will just start a table with 10 AI players and run that over and over again ad infinitum.

Would appreciate it if anyone has the time to have a look and offer any advice, from an AI perspective, on what I could do (or even just fun suggestions!).
Title: Re: QLearning/Experience replay with texas holdem (python)
Post by: ivan.moony on January 04, 2018, 09:00:27 pm
Hi and welcome :)

May I ask, what is the starting set of knowledge? Do the AI players have some predefined knowledge, or are they just guessing the game rules as they play?
Title: Re: QLearning/Experience replay with texas holdem (python)
Post by: jcbdev on January 04, 2018, 09:32:29 pm
Hello!

No, there is no starting set of knowledge.  The network takes in the current state of the board and outputs an action (check/fold/raise etc).

Over time it builds a "memory" of actions taken against a particular board state and the resultant rise or fall in the player's stack/pot (as the score to measure performance).  QLearning uses this memory to replay previous experiences (in order) constantly to try and learn a prediction function that optimizes the chance of eventual success.  It's based on Google DeepMind's work that taught an AI to play Atari 2600 games with no previous knowledge of the rules of the game.
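
For anyone following along, here is a minimal sketch of that memory-and-replay idea. It is not the code from the repo: it uses a plain lookup table instead of a network, the names (`ReplayMemory`, `replay`, `ACTIONS`) and the values of `alpha` and `gamma` are made up for illustration, and it samples the memory at random as the DeepMind Atari work does, which may differ from replaying in order.

```python
import random
from collections import defaultdict, deque

# Hypothetical action set, for illustration only.
ACTIONS = ["fold", "check", "raise"]


class ReplayMemory:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity=10000):
        # Oldest experiences are dropped once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def remember(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Random sampling (an assumption here) breaks up correlated hands.
        size = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), size)


def replay(q, memory, batch_size=32, alpha=0.1, gamma=0.95, rng=random):
    """Apply the Q-learning update to a batch of remembered experiences."""
    for state, action, reward, next_state, done in memory.sample(batch_size, rng):
        # A finished hand has no future value.
        best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
        target = reward + gamma * best_next
        # Move the current estimate a fraction alpha towards the target.
        q[(state, action)] += alpha * (target - q[(state, action)])


# Usage: one remembered hand where raising won 1 unit of stack.
q_table = defaultdict(float)
memory = ReplayMemory()
memory.remember("s0", "raise", 1.0, "s1", True)
replay(q_table, memory, batch_size=1)
```

With a network, the table update would be replaced by a training step towards the same `target` value.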

At the start there is an epsilon parameter that controls the network's chance of taking a random action over the predicted network action.  At the beginning this is high (so it takes lots of random actions) but over time it decays (the hope being that the network knows what it's doing by this point!)
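
A rough sketch of the epsilon idea, assuming a multiplicative decay with a floor. The function names and the `decay` and `epsilon_min` values are illustrative guesses, not the settings used in the project.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: pick a random action with probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Otherwise take the index of the highest predicted value.
    return max(range(len(q_values)), key=lambda i: q_values[i])


def decay_epsilon(epsilon, decay=0.995, epsilon_min=0.01):
    """Shrink epsilon a little each step, but never below epsilon_min."""
    return max(epsilon_min, epsilon * decay)


# Usage: start fully random and decay after every hand.
epsilon = 1.0
for _ in range(2000):
    action = choose_action([0.1, 0.9, 0.3], epsilon)
    epsilon = decay_epsilon(epsilon)
```

Keeping a small floor on epsilon means the bot never stops exploring entirely, which seems useful if the opponents keep changing.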

Title: Re: QLearning/Experience replay with texas holdem (python)
Post by: Zero on January 06, 2018, 11:13:27 am
Welcome, jcbdev.

That's an interesting project! Have you considered participating in an existing AI poker competition?
Title: Re: QLearning/Experience replay with texas holdem (python)
Post by: ranch vermin on January 06, 2018, 11:20:51 am
I've heard about Q-learning. Does the Q come from "quantum", just as a catchphrase?   I also heard that Q-learning came from the guy who made the dimitri hexapod here -> https://www.youtube.com/watch?v=CVHd2_NUgIs

Are you using an existing library, or is it from scratch?   I'm a from-scratch developer myself, and I like really fast-access networks (that are possibly a little lossy), because getting the number of Q-samples up could be the secret to tackling problems with larger sensor spaces.

I bet Texas hold'em poker would have a nice small fixed sensor space, which would be good for a beginner's implementation.
Title: Re: QLearning/Experience replay with texas holdem (python)
Post by: keghn on January 09, 2018, 04:26:27 pm

Q Learning Explained: 

https://www.youtube.com/watch?v=aCEvtRtNO-M&t=76s
Title: Re: QLearning/Experience replay with texas holdem (python)
Post by: ivan.moony on January 09, 2018, 08:30:55 pm
Maybe you should make a phone app out of it, call it "Win Machine" and go to Vegas to test it. And then spend all the winnings on girls :D
Title: Re: QLearning/Experience replay with texas holdem (python)
Post by: jcbdev on February 07, 2018, 04:20:14 pm
Do you think they'd get suspicious if I kept checking my phone after every move?
Title: Re: QLearning/Experience replay with texas holdem (python)
Post by: ivan.moony on February 07, 2018, 05:14:23 pm
hehe, it would be fine if we had those tech glasses :)
Title: Re: QLearning/Experience replay with texas holdem (python)
Post by: AgentSmith on March 16, 2018, 03:32:03 am
Seems that your approach is similar to https://en.wikipedia.org/wiki/TD-Gammon where you try to learn a value function by letting bots play against each other. The difference is that in TD-Gammon state-value functions are learned. You are learning a Q-function instead. Q-functions are more complex and harder to learn than state-value functions. But since Texas hold'em is not fully observable, unlike backgammon, it seems to be a reasonable approach, since Q-functions do not rely on a probabilistic model of the successor state as state-value functions do.
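
To make that last point concrete, here is a small sketch of how a greedy action would be picked from each kind of function. Everything in it is hypothetical: `model` stands in for a transition model that returns (probability, reward, next state) triples, and the toy states and values are invented for illustration.

```python
def greedy_from_v(state, actions, model, v, gamma=0.95):
    """Pick an action from a state-value function; needs a transition model."""

    def expected_value(action):
        # Average over the possible successor states given by the model.
        return sum(p * (r + gamma * v[s2]) for p, r, s2 in model(state, action))

    return max(actions, key=expected_value)


def greedy_from_q(state, actions, q):
    """Pick an action from a Q-function; no model of successors is needed."""
    return max(actions, key=lambda a: q[(state, a)])


# Toy usage with invented numbers.
def toy_model(state, action):
    # Hypothetical: raising always leads to "win", anything else to "lose".
    return [(1.0, 0.0, "win")] if action == "raise" else [(1.0, 0.0, "lose")]


v_table = {"win": 1.0, "lose": -1.0}
q_table = {("s", "fold"): 0.0, ("s", "raise"): 0.5}

choice_v = greedy_from_v("s", ["fold", "raise"], toy_model, v_table)
choice_q = greedy_from_q("s", ["fold", "raise"], q_table)
```

In poker the successor state depends on hidden cards, so a function like `toy_model` is hard to write down, whereas `greedy_from_q` only needs the learned values.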