
Cardiff University

Final year project

Gomoku AI Player

Daniel Ford - C1224795

supervised by

Yukun Lai

May 6, 2016

1 Abstract

This report investigates different AI (Artificial Intelligence) approaches to the game Gomoku. It also shows the process taken to reach the final system, beginning at the planning and research stage, proceeding through the design of each part of the system, such as the UI (User Interface) and the AI algorithms, and continuing to summaries of the implementations.

The AI algorithms explored in this report include a Heuristic Function, MiniMax, Iterative Deepening and Monte Carlo Tree Search. The report will show a comparison of how each approach played against the others and draw conclusions as to why the results are as they are.

The report will end by showing work that could be planned for the future and reflecting on the work carried out.

  • 1 Abstract
  • 2 Introduction
  • 3 Background
    • 3.1 The game of Gomoku
    • 3.2 Gomoku and Artificial Intelligence
      • 3.2.1 Approaches
  • 4 Creating the Game
    • 4.1 Project Management
    • 4.2 Research
      • 4.2.1 Gomocup
    • 4.3 Design
      • 4.3.1 The Board
      • 4.3.2 User Interface
      • 4.3.3 The Player
  • 5 Creating the AI Player
    • 5.1 MiniMax
    • 5.2 Heuristic Evaluation
      • 5.2.1 Implementing the Heuristic
    • 5.3 Alpha Beta Pruning
    • 5.4 Depth limited MiniMax implementation in Java
      • 5.4.1 Deciding a move
      • 5.4.2 Minimising and maximising
    • 5.5 Iterative Deepening
      • 5.5.1 Limiting the depth
      • 5.5.2 Incrementing the depth over time
      • 5.5.3 Iterative Deepening implementation
    • 5.6 Monte Carlo Tree Search
      • 5.6.1 Multi-Armed Bandit Problem
      • 5.6.2 UCB-1
      • 5.6.3 Selection
      • 5.6.4 Expansion
      • 5.6.5 Simulation
      • 5.6.6 Back Propagation
      • 5.6.7 Full MCTS Pseudocode
      • 5.6.8 MCTS Implementation
  • 6 Experiments and Discussions
    • 6.1 Experiments
    • 6.2 Results and Conclusions
      • 6.2.1 Set Up
      • 6.2.2 Heuristic VS other algorithms
      • 6.2.3 Depth Limited MiniMax VS Iterative Deepening
      • 6.2.4 Iterative Deepening VS Monte Carlo Tree Search
  • 7 Future Work
    • 7.1 Improving the Heuristic
    • 7.2 Improving MiniMax and Iterative Deepening
    • 7.3 More algorithms
      • 7.3.1 Threat Space Search
      • 7.3.2 Reinforcement Learning
    • 7.4 Game Improvements
    • 7.5 Testing
  • 8 Reflection
  • Appendix A Java Code Listings
    • A.1 MiniMax - minimising
    • A.2 MiniMax - maximising
    • A.3 MCTS - selection
    • A.4 MCTS - expansion
    • A.5 MCTS - random simulation
    • A.6 MCTS - back propagate

2 Introduction

The project I have chosen for this assignment is to create an AI (Artificially Intelligent) player for the game Gomoku. The main aim of the project is to implement multiple AI approaches and compare how they play by gathering statistics from games played against each other under different configurations.

Another important aim of the project is that a UI (User Interface) should be implemented so that a user can easily interact with and play against the AI player, or against another human, should they so choose. For this project, I intend to write all of the code for the UI layer, the 'data layer' and the AI players from scratch.

Figure 1: Counters placed on a 19 x 19 Go Board

3.2 Gomoku and Artificial Intelligence

Gomoku is often used as a starting point for creating AI players for the game of Go due to being simpler to program and providing an easier way to test UCT (Upper Confidence Bound 1 applied to trees) in e.g. the Monte Carlo algorithm [5].

When discussing solved games, freestyle Gomoku comes under the category of solved rather than partially solved or unsolved. A solved game is a game whose outcome can be predicted from any position given that both of the players play perfectly [14].

On a board of size 15 x 15 it has been proven that, with perfect play, black should always win. I will be implementing freestyle Gomoku, which follows the theory that, with perfect play, the first player should either win or the game should end in a draw. However, this has only been proven on a 15 x 15 board, though it is thought to most likely hold on larger boards as well [3]. Whether or not an AI player wins when playing first will depend entirely on the strength of the algorithm used and many other factors, such as how far the game tree is explored or, if a heuristic evaluation function is used, how good that heuristic function is.

However, freestyle Gomoku is not the only version that has been declared solved. Renju rules are also said to have been solved, in 2001 [15].

3.2.1 Approaches

Due to Gomoku being a common starting point to create AI approaches for Go [5], a number of approaches to Gomoku already exist.

Most approaches are based on a simple MiniMax implementation or similar algorithms such as NegaMax. These approaches can then be further improved by developing a heuristic and adding other performance enhancements.

Other commonly used approaches include the Monte Carlo Tree Search algorithm. However, for a game such as Gomoku or Go, due to the large search space involved, light random playouts do not always play well and as such the algorithm can be improved with a strong heuristic evaluation to help with e.g. move selection in the simulation phase [5].

The approach which has been used to solve freestyle Gomoku is threat space search [2]. This approach was developed to model the way that a human Gomoku player finds winning threat sequences. Using this algorithm, a strong player was created that even won Gold in the Computer Olympiads, winning all games using black and half of the games using white.

For this project, the AI approaches that I would like to implement first are the MiniMax algorithm, Iterative Deepening, a Heuristic evaluation function and Monte Carlo Tree Search. As well as these algorithms, I would like to implement a number of improvements on top of each implementation.

If I have time, I would like to also research and implement other possible implementations which I could then also use for comparison against the other approaches mentioned.

From the Gomocup website [6], I found that Gomocup AI players are usually:

  • Given 30 seconds to decide a move
  • Not allowed to have a starting size larger than 2MB
  • Not allowed to exceed a size of 20MB at any point during the tournament

In the Gomocup, a standard interface is provided which your program, referred to as the 'brain', must interact with. The Gomocup allows AI players to be written in a range of languages, including Java, so in the future it may be interesting to see how the AI players created in this project would compete. However, part of this project is creating the actual board interface, so compatibility with other programs will not be the primary focus, although it may be simple enough to adapt the AI players' code to interact with another interface in the future.

4.3 Design

In terms of game design, I would like the game to be designed in such a way that I can change the dimensions of the board and everything will still work, adapting dynamically. I intend to make all of the classes strongly object-oriented and reusable.

As a simple, initial design summary of the entire program, in the form of a UML class diagram, I think that the program could look something like this:

Figure 2: Initial UML class diagram representation of the project

4.3.1 The Board

Visualising the board's interface as rows and columns, one can see that the board can be represented as a multidimensional array, where placing a counter in the top-left corner corresponds to placing a counter at position (0, 0).

Going forward in the design, I would like to create two separate representations of the board: a data representation and a user interface representation. These representations will be implemented separately, yet interact with each other. I believe that this is important as it follows the principle of Separation of Concerns [11].

Having this separation allows me to create a 'data layer' for the board that can be developed and tested independently of the user interface.
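
To make this concrete, a minimal sketch of such a data layer backed by a two-dimensional array, using illustrative names (Board, placeCounter, EMPTY) rather than the project's actual classes, could look like the following:

// A minimal sketch of a board 'data layer' backed by a two-dimensional
// array. The names Board, placeCounter and EMPTY are illustrative
// assumptions, not the names used in the project itself.
public class Board {

    public static final int EMPTY = 0;

    private final int[][] grid;   // grid[row][column], EMPTY = unoccupied
    private final int size;       // board dimension, e.g. 15 or 19

    public Board(int size) {
        this.size = size;
        this.grid = new int[size][size];
    }

    // Places a counter for the given player identity; returns false if the
    // position is out of bounds or already occupied.
    public boolean placeCounter(int row, int column, int playerIdentity) {
        if (row < 0 || row >= size || column < 0 || column >= size) {
            return false;
        }
        if (grid[row][column] != EMPTY) {
            return false;
        }
        grid[row][column] = playerIdentity;
        return true;
    }

    public int get(int row, int column) {
        return grid[row][column];
    }

    public int getSize() {
        return size;
    }
}

Placing a counter in the top-left corner would then be a call such as placeCounter(0, 0, identity), matching the row/column view described above, and the board dimension is supplied at construction time so it can change dynamically.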

4.3.2 User Interface

In terms of the user interface, it will need to show clear intent and be easy and self-explanatory for the user to use and navigate. In terms of code design, I think that the User Interface could be as simple as the following class diagram.

Figure 3: An initial UML design of the board

The Game class would be created in the main 'setup' class, which sets up both the data layer and this UI class.

When this class is called, it will create the UI representation of the board, creating tiles using the Tile class and the counters using the Counter class. I have created these two elements as two separate classes as I believe it creates a clear distinction of layers, and when the user interacts with the board it will be clear which element they are interacting with.

In terms of the actual UI, I would like the user flow to look like the following simplified flow diagram:

Figure 4: A simple diagram of a user’s flow through the system

Figure 4 shows that I would like the first screen to be a set up screen. This would allow the user to configure what kind of players are playing the game, for example, is it two human players, or a human versus an AI using the MiniMax algorithm? I think that the configuration could also go beyond this, allowing the user to configure the algorithms themselves. The options may also allow the user to change the board size and how many counters in a row are needed to achieve a win.

Figure 5: An initial concept design of the game setup

While the user is playing the game, they will be shown who the current player is. I believe that it might also be nice to implement features such as showing

4.3.3 The Player

The Player class will need to be created in an abstract way so that it can be extended by both a Computer-style class and a Human-style class. It makes sense to have an extendable superclass as it provides basic functionality which would otherwise be repeated, whilst also providing a structure for the subclasses to follow.

As an initial approach, as part of my design, I believe all that would need to be implemented for this abstract class is a 'unique identifier', a colour for that player, the game state to interact with and an abstract method for the subclasses to use to actually take a move.

Listing 2: A basic abstract Player design

// Assuming the JavaFX Paint class is used for the counter colour.
import javafx.scene.paint.Paint;

public abstract class Player {

    private int identity;   // unique identifier for this player
    private Paint color;    // the colour of this player's counters

    public Player(int identity, String color) {
        this.identity = identity;
        this.color = Paint.valueOf(color);
    }

    // Subclasses (human or computer) decide how a move is actually taken.
    public abstract void takeMove(SetupGame state);
}
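
As an illustration of how this abstract class might be extended, a computer player could override takeMove roughly as follows. The names ComputerPlayer and chooseMove, and the way the move is applied, are assumptions for the sketch rather than the project's actual code:

// Illustrative sketch only: how a computer player might extend the
// abstract Player class. ComputerPlayer and chooseMove are hypothetical
// names, and applying the chosen move to SetupGame is left abstract
// because the real interface is defined elsewhere in the project.
public class ComputerPlayer extends Player {

    public ComputerPlayer(int identity, String color) {
        super(identity, color);
    }

    @Override
    public void takeMove(SetupGame state) {
        int[] move = chooseMove(state);
        // ... apply 'move' to the shared game state here ...
    }

    // Placeholder for an algorithm-specific search (MiniMax, MCTS, ...)
    // returning a { row, column } pair.
    private int[] chooseMove(SetupGame state) {
        return new int[] { 0, 0 };
    }
}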

5 Creating the AI Player

Now I will discuss the AI players that I have implemented. I will do this by first discussing each algorithm's approach and the specific parts of it which make the algorithm different from other approaches. I will then continue explaining the algorithm by providing pseudocode that I have created.

I think it is important to provide pseudocode and not just code snippets alone as I believe that this makes the report easier to read due to being more ’programming language agnostic’. By this I mean that I would like to provide code in a way that is easily understood by readers coming from different programming language backgrounds.

As well as this pseudocode, I will explain how I have translated it into the programming language used for this project, i.e. Java, so that a reader can see how the pseudocode translates into 'real code'.

5.1 MiniMax

The MiniMax algorithm was designed to address two players playing against each other in a zero-sum game. A zero-sum game is a game where each player's gain of utility is balanced by the other player's loss of utility, so that adding the two players' utilities together gives a total of zero.

MiniMax works by building a tree of possible outcomes in which the MiniMax player attempts to maximise its own score, whilst the other player tries to minimise the MiniMax player's score by maximising its own. This can be formally visualised as:

Figure 8: [9]

The problem with this basic approach is that the algorithm would have to search the entire search space of the board, starting at a size of 19! × 19! (where ! means factorial). This would lead to the algorithm taking a large amount of time to make even a simple decision.

To improve this algorithm, steps would need to be taken to make it so that the algorithm would not have to search the entire search space of the board. This could be done by implementing: a Heuristic Evaluation function, Iterative Deepening or pruning techniques such as Alpha Beta pruning.
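
As a rough sketch of what a depth-limited MiniMax with Alpha-Beta pruning looks like in Java (GameState, Move and the methods used on them are hypothetical stand-ins for the project's own classes, not the actual implementation described later), the core recursion might be:

import java.util.List;

// Hypothetical interfaces standing in for the project's board classes.
interface Move { }

interface GameState {
    boolean isTerminal();
    int evaluate();                 // heuristic score of the position
    List<Move> getPossibleMoves();
    void applyMove(Move move);
    void undoMove(Move move);
}

// Sketch of depth-limited MiniMax with Alpha-Beta pruning.
public final class MiniMaxSketch {

    public static int minimax(GameState state, int depth,
                              int alpha, int beta, boolean maximising) {
        // Stop at a leaf: either the depth limit or the end of the game.
        if (depth == 0 || state.isTerminal()) {
            return state.evaluate();
        }

        if (maximising) {
            int best = Integer.MIN_VALUE;
            for (Move move : state.getPossibleMoves()) {
                state.applyMove(move);
                best = Math.max(best, minimax(state, depth - 1, alpha, beta, false));
                state.undoMove(move);
                alpha = Math.max(alpha, best);
                if (beta <= alpha) {
                    break;   // prune: the minimiser will never allow this branch
                }
            }
            return best;
        } else {
            int best = Integer.MAX_VALUE;
            for (Move move : state.getPossibleMoves()) {
                state.applyMove(move);
                best = Math.min(best, minimax(state, depth - 1, alpha, beta, true));
                state.undoMove(move);
                beta = Math.min(beta, best);
                if (beta <= alpha) {
                    break;   // prune: the maximiser will never allow this branch
                }
            }
            return best;
        }
    }
}

Iterative Deepening, discussed later, would then call a routine of this shape repeatedly with an increasing depth limit until the time allowed for a move runs out.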

5.2 Heuristic Evaluation

In games with large search spaces, it is useful to create a heuristic function which will evaluate which moves might be relevant to the computer player in the current state of the game, optionally also providing 'scores' for those moves. Providing scores for moves can help in situations where, even with the heuristic function, the end of the game is not reached, as it lets the computer player weigh up the value of moves at the deepest point it did manage to reach in the search tree.

For the purpose of this project, the evaluation function I will make will be a simple one, however, I can think of many improvements which would be useful for taking the project forward and making the existing algorithm even stronger.

Figure 9 shows that the black player has placed 2 counters and the white 'computer' player has only played one, so it is the computer's turn. In terms of the heuristic, I have designed it such that it will take the move that is currently most valuable to itself, i.e. it should block the black player.

Figure 9: Black has played 2 counters, Computer is white.

This heuristic will first gather candidate moves by looking at the search space around all existing counters. Each move is then evaluated by counting how many counters there are in a chain on either side of the proposed move. The heuristic will then assign a negative or positive value depending on whether or not the move belongs to the current player.

For example, in the case of Figure 9, the heuristic would assign a score of −3 to column 11, row 7 (or column 8, row 10) because, if the black opponent plays here, they will gain a score with a greater impact than any score the white player could gain for itself.

This heuristic is similar to the Threat utility function, where moves are chosen by the computer player using a greedy approach [2]. By this I mean that the move picked will be the one most valuable in the current state, i.e. the player will block the opponent if that is more valuable than furthering its own 'chain' of counters.
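
As a rough sketch of this kind of chain-counting evaluation (the names scoreMove, countChain and the int[][] board representation are assumptions for illustration, not the project's implementation), scoring a single candidate move might look like this:

// Sketch of a simple chain-counting heuristic for one candidate move.
// The board is assumed to be an int[][] where 0 = empty and other values
// identify the player owning each counter; all names are illustrative.
public final class HeuristicSketch {

    // The four line directions on the board: horizontal, vertical and
    // the two diagonals.
    private static final int[][] DIRECTIONS = {
            {0, 1}, {1, 0}, {1, 1}, {1, -1}
    };

    // Scores a proposed move for 'currentPlayer': positive if the move
    // extends the current player's own chains, negative if the square is
    // more valuable to the opponent (i.e. it should be blocked).
    public static int scoreMove(int[][] board, int row, int col,
                                int currentPlayer, int opponent) {
        int ownBest = 0;
        int opponentBest = 0;
        for (int[] dir : DIRECTIONS) {
            ownBest = Math.max(ownBest, countChain(board, row, col, dir, currentPlayer));
            opponentBest = Math.max(opponentBest, countChain(board, row, col, dir, opponent));
        }
        // Greedy choice: take whichever side's chain through this square is
        // longer; a square worth blocking is reported as a negative score.
        return (ownBest >= opponentBest) ? ownBest : -opponentBest;
    }

    // Counts counters belonging to 'player' in a line through (row, col),
    // looking outwards in both directions from the proposed move.
    private static int countChain(int[][] board, int row, int col,
                                  int[] dir, int player) {
        int count = 0;
        for (int sign = -1; sign <= 1; sign += 2) {
            int r = row + sign * dir[0];
            int c = col + sign * dir[1];
            while (r >= 0 && r < board.length && c >= 0 && c < board[0].length
                    && board[r][c] == player) {
                count++;
                r += sign * dir[0];
                c += sign * dir[1];
            }
        }
        return count;
    }
}

A fuller implementation would also need to cap chain counts at the winning length and weight open-ended chains more heavily, which is the kind of improvement discussed in the Future Work section.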

The weakness of this approach is that the heuristic doesn't look ahead in the game. A move might be the most valuable for the current state, but upon going further into the game tree, another move may turn out to have been better. For example, an opponent might play moves with space between them, in an attempt to eventually create a situation where they will have a winning combination either way.