Minimax algorithm with Tic-Tac-Toe (but each player can only have 3 tacs on the board)

【Title】Minimax Algorithm with TicTacToe (But each player can only have 3 tacs on board) 【Posted】2022-01-13 22:13:53 【Question】:

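I am currently making a TicTacToe game that uses the Minimax algorithm for Player vs. Computer. Right now the code only has hard mode for PvE: a working bot, built on Minimax, that cannot be beaten. However, I need to find a way to incorporate a special rule...

Each player may only have 3 "tacs" on the board at a time. Once they reach 3, they must move a tac from one position to another. I'm not entirely sure how to implement this. Could someone give me some ideas?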

"""
TicTacToe - 3 symbols max
By Gage Gunn (???), Trevor Purta (Player vs. Computer), Toby Howie (Player vs. Player), Rachel Powers (Computer vs. Computer)
"""

# -------- Modules --------

import os
import time
import random

# -------- Global Variables --------

human = 'O'
bot = 'X'
amount_of_x = 0
amount_of_o = 0
ID = '' #Detecting which game mode (PvP, PvE, EvE)
ID2 = '' #Detecting which difficulty in PvE
ID3 = '' #Detecting which difficulty in EvE

# -------- Functions --------

def printBoard(board):
  print(board[1] + ' | ' + board[2] + ' | ' + board[3] + '        1 | 2 | 3')
  print(board[4] + ' | ' + board[5] + ' | ' + board[6] + '        4 | 5 | 6')
  print(board[7] + ' | ' + board[8] + ' | ' + board[9] + '        7 | 8 | 9')
  print('\n')


def spaceIsFree(position):
  if board[position] == ' ':
    return True
  else:
    return False


def insertLetter(letter, position):
  if spaceIsFree(position):
    board[position] = letter
    printBoard(board)
    if (checkDraw()):
      print("Draw!")
      exit()
    if checkForWin():
        if letter == 'X':
          print("Bot wins!")
          exit()
        else:
          print("Player wins!")
          exit()

    return


  else:
    print("Can't insert there!")
    position = int(input("Please enter new position:  "))
    insertLetter(letter, position)
    return


def checkForWin():
  if (board[1] == board[2] and board[1] == board[3] and board[1] != ' '):
    return True
  elif (board[4] == board[5] and board[4] == board[6] and board[4] != ' '):
    return True
  elif (board[7] == board[8] and board[7] == board[9] and board[7] != ' '):
    return True
  elif (board[1] == board[4] and board[1] == board[7] and board[1] != ' '):
    return True
  elif (board[2] == board[5] and board[2] == board[8] and board[2] != ' '):
    return True
  elif (board[3] == board[6] and board[3] == board[9] and board[3] != ' '):
    return True
  elif (board[1] == board[5] and board[1] == board[9] and board[1] != ' '):
    return True
  elif (board[7] == board[5] and board[7] == board[3] and board[7] != ' '):
    return True
  else:
    return False


def checkWhichMarkWon(mark):
  if board[1] == board[2] and board[1] == board[3] and board[1] == mark:
    return True
  elif (board[4] == board[5] and board[4] == board[6] and board[4] == mark):
    return True 
  elif (board[7] == board[8] and board[7] == board[9] and board[7] == mark):
    return True
  elif (board[1] == board[4] and board[1] == board[7] and board[1] == mark):
    return True
  elif (board[2] == board[5] and board[2] == board[8] and board[2] == mark):
    return True
  elif (board[3] == board[6] and board[3] == board[9] and board[3] == mark):
    return True
  elif (board[1] == board[5] and board[1] == board[9] and board[1] == mark):
    return True
  elif (board[7] == board[5] and board[7] == board[3] and board[7] == mark):
    return True
  else:
    return False


def checkDraw():
  for key in board.keys():
    if (board[key] == ' '):
      return False
  return True


def playerMove():
  position = int(input("Enter the position for 'O':  "))
  insertLetter(human, position)
  return


def compMove():
  bestScore = -800
  bestMove = 0
  for key in board.keys():
    if (board[key] == ' '):
      board[key] = bot
      score = minimax(board, 0, False)
      board[key] = ' '
      if (score > bestScore):
        bestScore = score
        bestMove = key

  insertLetter(bot, bestMove)
  return


def minimax(board, depth, isMaximizing):
  if (checkWhichMarkWon(bot)):
    return 1
  elif (checkWhichMarkWon(human)):
    return -1
  elif (checkDraw()):
    return 0

  if (isMaximizing):
    bestScore = -800
    for key in board.keys():
      if (board[key] == ' '):
        board[key] = bot
        score = minimax(board, depth + 1, False)
        board[key] = ' '
        if (score > bestScore):
          bestScore = score
    return bestScore

  else:
    bestScore = 800
    for key in board.keys():
      if (board[key] == ' '):
        board[key] = human
        score = minimax(board, depth + 1, True)
        board[key] = ' '
        if (score < bestScore):
          bestScore = score
    return bestScore


board = {1: ' ', 2: ' ', 3: ' ',
         4: ' ', 5: ' ', 6: ' ',
         7: ' ', 8: ' ', 9: ' '}

def pick_game_type(): #Pick PvP, PvE, EvE
  global ID
  print("""
First, Pick which game mode you would like to play!
(Player vs. Player [1], Player vs. Computer [2], Computer vs. Computer [3])
  """)
  type_choice = input('> ')
  if type_choice == "1": #Set global ID for [1]
    ID = "1"
  elif type_choice == "2": #Set global ID for [2]
    ID = "2"
  elif type_choice == "3": #Set global ID for [3]
    ID = "3"

# -------- Script --------

os.system('cls') #Clear console
print("""
Welcome to Wacky Tacky Toe, Tac, Tic!
By Gage Gunn, Trevor Purta, Toby Howie, and Rachel Power with an s
""")

pick_game_type() #Choose gamemode

if ID == '1': #PvP Chosen
  pass

elif ID == '2': #PvE Chosen
  os.system('cls')

  print('''
Welcome to Player vs Computer! In this version of Tic, Tac, Toe, each player can have a maximum of 3 tacs on the board at a time!
Which difficulty would you like to play? (Easy [1] or Hard [2])''')
  ID2 = input('> ')

  valid_input = False
  while not valid_input: #Checks validity of ID2
    if ID2 in ['1', '2']:
      valid_input = True
    else:
      ID2 = input('> ')
  
  if ID2 == '1': #Easy was chosen
    pass

  elif ID2 == '2':
    print('You have chosen hard! You are O, the computer is X. The computer will go first')
    while not checkForWin():
      compMove()
      playerMove()

elif ID == '3': #EvE Chosen
  pass

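【Comments】:

This is pretty broad. Where are you stuck, specifically? The minimax algorithm is exactly the same whether or not the rule about moving pieces after 3 exists; that rule is handled by the TTT move generator. Minimax just says "give me all legal moves for this position" and applies them one at a time, regardless of what the moves are. The problem here is that your move generation is too tightly coupled to the minimax routine. I suggest decoupling it behind an API like get_moves() that MM can call.

You will also need to define a rule (or a maximum depth, or pattern tracking) to prevent the minimax move expansion from entering an infinite loop where players move their tacs back and forth in a repeating pattern.

Where I'm stuck is including that very specific rule while the bot still makes the best choice. I'm pretty new to minimax, so this is a first attempt, you know? I think I'm getting my conditions mixed up, haha. So when you say decouple, do you mean making the move process more independent of minimax? What would be an example I could study to get a better idea? (Thanks for the input, by the way, I really appreciate it :)) @ggorlen

That's true! I can see scenarios where both the player and the bot could get into a loop like that! Thanks, I never thought of that @AlainT.

This is in Java, but see this and this for examples of functions that return legal moves for the MM algorithm to iterate over.

【Answer 1】: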

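You need to separate the components of the gameplay into reusable functions that can be combined to simulate moves for the minimax calculation.

Because you need to run simulations, it is best to have a self-contained data structure that represents the state of the game (the board), including whose turn it is to play. (I use a 10-character string where the first character is the current player and the other 9 are the current state of the board.)

The first thing you need to implement is a function that simulates a move (returning a new game state):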

def playMove(game,toPos,fromPos):
    played =  [game[0] if i==toPos else "." if i==fromPos else p
               for i,p in enumerate(game)]
    return "".join(played)
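
To make the state encoding concrete, here is a small usage sketch. It repeats `playMove` from above so that it runs on its own; the sample states are made up for illustration, and it relies on the answer's convention that a plain placement is expressed as `fromPos == toPos`.

```python
def playMove(game, toPos, fromPos):
    # game[0] is the current player; game[1:] holds the nine cells ("." = empty)
    played = [game[0] if i == toPos else "." if i == fromPos else p
              for i, p in enumerate(game)]
    return "".join(played)

empty = "X" + "." * 9            # X to play on an empty board
placed = playMove(empty, 5, 5)   # placement: fromPos == toPos
print(placed)                    # X....X....

# Made-up state: X already has tacs on 1, 2 and 5 (O's tacs left out for brevity)
full = "XXX..X...."
moved = playMove(full, 9, 1)     # move the tac on 1 to 9
print(moved)                     # X.X..X...X
```

Because strings are immutable, every call returns a fresh state and the original is left untouched, which is what makes the recursive simulation safe.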

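You also need a function to switch the current player. It needs to be separate so that you can easily evaluate the outcome for the current player before simulating the opponent's moves: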

def switchPlayer(game):
    return "XO"[game[0]=="X"]+game[1:]  # returns a new game state
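
The indexing trick may be easier to follow with a quick check. This sketch repeats `switchPlayer` and applies it to a made-up state; only the first character should change.

```python
def switchPlayer(game):
    # A bool indexes like 0/1, so "XO"[True] is "O" and "XO"[False] is "X"
    return "XO"[game[0] == "X"] + game[1:]

state = "X....X...."                  # X is the current player, X tac on 5
other = switchPlayer(state)
print(other)                          # O....X....
print(switchPlayer(other) == state)   # True: switching twice is a no-op
```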

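To simulate moves recursively, you need a function that returns the list of possible moves. Under your game rules, a move is defined by a target position (toPos) and an optional origin position (fromPos). The first 3 moves have no origin (let fromPos = toPos for those):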

def getMoves(game):
    player,*board = game
    moves = [(i,i) for i,p in enumerate(board,1) if p=="."]  # 1st 3 moves
    if board.count(player)==3:                               # moves 4+
        origins = [r for r,p in enumerate(board,1) if p==player ]
        moves   = [ (i,r) for i,_ in moves for r in origins] # add origins
    return moves
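
As a rough check of the two phases, the sketch below repeats `getMoves` and calls it on an empty board and on a made-up mid-game position where the current player already has three tacs.

```python
def getMoves(game):
    player, *board = game
    # Placement phase: every empty cell is a (toPos, fromPos) pair with toPos == fromPos
    moves = [(i, i) for i, p in enumerate(board, 1) if p == "."]
    if board.count(player) == 3:
        # Movement phase: pair every empty cell with every tac the player owns
        origins = [r for r, p in enumerate(board, 1) if p == player]
        moves = [(i, r) for i, _ in moves for r in origins]
    return moves

opening = "X" + "." * 9
print(len(getMoves(opening)))      # 9 placements

# Made-up position: X on 1, 2, 5 and O on 3, 4, 9, so cells 6, 7, 8 are empty
midgame = "X" + "XXOOX...O"
moves = getMoves(midgame)
print(moves[:3])                   # [(6, 1), (6, 2), (6, 5)]
```

With three empty cells and three origins, the movement phase yields 3 x 3 = 9 candidate moves.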

For both the simulation and the actual game, you need a function that determines whether the game has been won:

winPatterns = [(1,2,3),(4,5,6),(7,8,9),(1,4,7),(2,5,8),(3,6,9),(1,5,9),(3,5,7)]
def isWin(game):
    return any(all(game[i]==game[0] for i in p) for p in winPatterns)
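
Note that `isWin` only checks lines held by the current player, `game[0]`, which is why it has to be called before switching players. A small sketch with a made-up board shows the difference:

```python
winPatterns = [(1,2,3),(4,5,6),(7,8,9),(1,4,7),(2,5,8),(3,6,9),(1,5,9),(3,5,7)]

def isWin(game):
    # True only if the current player (game[0]) fills one of the eight lines
    return any(all(game[i] == game[0] for i in p) for p in winPatterns)

won = "O" + "...OOO..."          # O holds the middle row 4-5-6
print(isWin(won))                # True
print(isWin("X" + won[1:]))      # False: same board, but X is the current player
```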

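Rating the outcome of a move could use some complex formula, but to keep things simple, let's just say a win is 1 and not winning is zero. The minimax function recursively simulates moves and rates them according to the "nearest" decisive move (1 for a win, -1 for a loss).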

maxDepth = 6       # maximum depth = computer's skill
memo = dict()      # memoization (for performance)
def minimax(game,depth=maxDepth):
    if (game,depth) in memo:      # known rating at this depth 
        return memo[game,depth]   
    if isWin(game): return 1      # no moves after a win (base condition)
    if not depth: return 0        # limit depth (avoids infinite loops)
    oGame  = switchPlayer(game)   # will simulate opponent's moves
    oRatings = []                 # opponent's minimax ratings
    for move in getMoves(oGame):   
        played = playMove(oGame,*move)           # simulate
        oRatings.append(minimax(played,depth-1)) # get rating    
    rating = -max(oRatings or [1])               # worst countermove
    for d in range(1,depth+1):
        memo[game,d]=rating    # memorize for this depth and lower
    return rating

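Since the game data structure is hashable (it is a string), it can be used for memoization, which greatly improves the performance of the minimax function.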
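
As a variation on the same idea, the hand-written `memo` dictionary could be swapped for the standard library's `functools.lru_cache`. The sketch below is an assumption-laden alternative, not the answer's code: `cachedMinimax` is a hypothetical name, and unlike the original it caches each `(game, depth)` pair only, without also filling in the lower depths.

```python
from functools import lru_cache

winPatterns = [(1,2,3),(4,5,6),(7,8,9),(1,4,7),(2,5,8),(3,6,9),(1,5,9),(3,5,7)]

def playMove(game, toPos, fromPos):
    return "".join(game[0] if i == toPos else "." if i == fromPos else p
                   for i, p in enumerate(game))

def switchPlayer(game):
    return "XO"[game[0] == "X"] + game[1:]

def getMoves(game):
    player, *board = game
    moves = [(i, i) for i, p in enumerate(board, 1) if p == "."]
    if board.count(player) == 3:
        origins = [r for r, p in enumerate(board, 1) if p == player]
        moves = [(i, r) for i, _ in moves for r in origins]
    return moves

def isWin(game):
    return any(all(game[i] == game[0] for i in p) for p in winPatterns)

@lru_cache(maxsize=None)          # works because both arguments are hashable
def cachedMinimax(game, depth):
    """Rating of `game` from the point of view of game[0], who just moved."""
    if isWin(game):
        return 1                  # the player who just moved has won
    if depth == 0:
        return 0                  # depth limit also stops back-and-forth loops
    oGame = switchPlayer(game)    # now simulate the opponent's replies
    replies = [cachedMinimax(playMove(oGame, *m), depth - 1)
               for m in getMoves(oGame)]
    return -max(replies, default=1)   # the opponent's best reply is our worst case

# Made-up position: X (who just moved) holds 1, 2, 9; O holds 4, 5 and can take 6
losing = "X" + "XX.OO...X"
print(cachedMinimax(losing, 2))   # -1: O wins with its next placement
```

Calling `cachedMinimax.cache_info()` afterwards shows how many positions were served from the cache.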

With the minimax function, you can determine the best move for the computer player (the move with the highest minimax rating):

def bestMove(game):
    return max(getMoves(game),key=lambda m:minimax(playMove(game,*m)))

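When more than one move has the highest rating, you can pick randomly among the best ones by shuffling the list of moves before taking the max().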
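
A minimal sketch of that shuffle-then-max idea follows. `pickBest` is a hypothetical helper that is not part of the answer, and the ratings are made up; in the answer's terms it would be called as `pickBest(getMoves(game), lambda m: minimax(playMove(game, *m)))`.

```python
import random

def pickBest(moves, rate, rng=random):
    """Return one of the highest-rated moves, chosen at random among ties."""
    shuffled = list(moves)
    rng.shuffle(shuffled)            # randomize the order of the candidates
    return max(shuffled, key=rate)   # max() keeps the first maximal element it sees

# Made-up ratings for illustration: two moves tie for the best score
ratings = {(1, 1): 0, (5, 5): 1, (9, 9): 1}
choice = pickBest(ratings, ratings.get)
print(choice in {(5, 5), (9, 9)})    # True
```

The approach works because `max()` returns the first of several equal maxima, so randomizing the order randomizes which tied move comes out.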

Putting it all together in a game loop:

def printBoard(game):
    for r in range(1,10,3):
          print("("+") (".join(game[r:r+3])+")",list(range(r,r+3)))

from random import sample
while True:
    player,computer = sample("XOXO",2) # random starter, random computer
    game = player + "."*9
    while True:
        printBoard(game)
        if isWin(game): break
        game = switchPlayer(game)
        player = game[0]
        if player == computer:
            toPos,fromPos = bestMove(game)
            print(f"Computer plays {player} at {toPos}",end=" ")
            print(f"from {fromPos}"*(fromPos!=toPos))
        else:
            strMoves = [(str(t),str(f)) for t,f in getMoves(game)]
            while True:
                fromPos = toPos = input(f"Position to play for {player}: ")
                if not toPos in (f for f,_ in strMoves):
                    print("invalid position"); continue
                if game[1:].count(player)<3: break
                fromPos = input(f"Position to remove {player} from: ")
                if (toPos,fromPos) not in strMoves:
                    print("invalid move")
                else: break
        game = playMove(game,int(toPos),int(fromPos))
    print( f"{player}'s WIN !!!\n")
    if input("play again (y/n)") != "y": break
    print()

Sample run:

(.) (.) (.) [1, 2, 3]
(.) (.) (.) [4, 5, 6]
(.) (.) (.) [7, 8, 9]
Computer plays O at 5 
(.) (.) (.) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(.) (.) (.) [7, 8, 9]
Position to play for X: 1
(X) (.) (.) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(.) (.) (.) [7, 8, 9]
Computer plays O at 3 
(X) (.) (O) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(.) (.) (.) [7, 8, 9]
Position to play for X: 7
(X) (.) (O) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(X) (.) (.) [7, 8, 9]
Computer plays O at 4 
(X) (.) (O) [1, 2, 3]
(O) (O) (.) [4, 5, 6]
(X) (.) (.) [7, 8, 9]
Position to play for X: 6
(X) (.) (O) [1, 2, 3]
(O) (O) (X) [4, 5, 6]
(X) (.) (.) [7, 8, 9]
Computer plays O at 9 from 3
(X) (.) (.) [1, 2, 3]
(O) (O) (X) [4, 5, 6]
(X) (.) (O) [7, 8, 9]
Position to play for X: 3
Position to remove X from: 7
(X) (.) (X) [1, 2, 3]
(O) (O) (X) [4, 5, 6]
(.) (.) (O) [7, 8, 9]
Computer plays O at 2 from 4
(X) (O) (X) [1, 2, 3]
(.) (O) (X) [4, 5, 6]
(.) (.) (O) [7, 8, 9]
Position to play for X: 8
Position to remove X from: 3
(X) (O) (.) [1, 2, 3]
(.) (O) (X) [4, 5, 6]
(.) (X) (O) [7, 8, 9]
Computer plays O at 3 from 2
(X) (.) (O) [1, 2, 3]
(.) (O) (X) [4, 5, 6]
(.) (X) (O) [7, 8, 9]
Position to play for X: 7
Position to remove X from: 8
(X) (.) (O) [1, 2, 3]
(.) (O) (X) [4, 5, 6]
(X) (.) (O) [7, 8, 9]
Computer plays O at 4 from 9
(X) (.) (O) [1, 2, 3]
(O) (O) (X) [4, 5, 6]
(X) (.) (.) [7, 8, 9]
Position to play for X: 
Position to play for X: 9
Position to remove X from: 1
(.) (.) (O) [1, 2, 3]
(O) (O) (X) [4, 5, 6]
(X) (.) (X) [7, 8, 9]
Computer plays O at 8 from 4
(.) (.) (O) [1, 2, 3]
(.) (O) (X) [4, 5, 6]
(X) (O) (X) [7, 8, 9]
Position to play for X: 2
Position to remove X from: 6
(.) (X) (O) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(X) (O) (X) [7, 8, 9]
Computer plays O at 1 from 3
(O) (X) (.) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(X) (O) (X) [7, 8, 9]
Position to play for X: 3
Position to remove X from: 7
(O) (X) (X) [1, 2, 3]
(.) (O) (.) [4, 5, 6]
(.) (O) (X) [7, 8, 9]
Computer plays O at 6 from 8
(O) (X) (X) [1, 2, 3]
(.) (O) (O) [4, 5, 6]
(.) (.) (X) [7, 8, 9]
Position to play for X: 7
Position to remove X from: 2
(O) (.) (X) [1, 2, 3]
(.) (O) (O) [4, 5, 6]
(X) (.) (X) [7, 8, 9]
Computer plays O at 4 from 1
(.) (.) (X) [1, 2, 3]
(O) (O) (O) [4, 5, 6]
(X) (.) (X) [7, 8, 9]
O's WIN !!!

play again (y/n) ? n

【Comments】:

Just tried it out, and it works flawlessly! This is very concise and honestly a beautiful piece of code. I have a lot to learn :)
