Exploration vs Exploitation of Chess Openings Learner with Metropolis-Hastings Algorithm

dc.contributor.advisorRoy, Kallol, juhendaja
dc.contributor.authorSaarmann, Kristiine
dc.contributor.otherTartu Ülikool. Loodus- ja täppisteaduste valdkondet
dc.contributor.otherTartu Ülikool. Arvutiteaduse instituutet
dc.date.accessioned2023-08-30T07:07:24Z
dc.date.available2023-08-30T07:07:24Z
dc.date.issued2022
dc.description.abstractOpening lines are an integral part of ensuring a win, or at least a draw in chess. However, there are no well known openings learners that are available for the public that respond dynamically, thus allowing players to learn multiple openings at the same time. The problem with implementing such an algorithm is the fact that most chess databases use a notation that only expresses the movement of each step, not the board state as a whole. It has been reported that the frequencies of the opening moves follow a power-law with an exponent increase linearly with game depth. This thesis presents an innovative approach that can compare the states in different opening lines and dynamically choose which one to respond with. To be able to implement such a response, it is necessary to determine how to pick which state to transition to. We have proposed a weighted Metropolis Hastings algorithm to decide which move should be played by the computer. The move is jointly decided as a weighted sum of the Metropolis Hastings output and parameters set by the learner. The algorithm allows for the modification of the parameters used, as well as the weight (importance) of them. Our method gives the flexibility of exploration (from Metropolis-Hastings) vs exploitation (from the player parameters) in the chess openings and adapts the game accordingly. We have converted the opening lines represented in the string form to a list of matrices. This matrix encodes the board state of the game, with each entry of the matrix encoding the presence or the absence of a piece, the type of the piece, and the color. This transformation allows us to find matching states in the opening lines, even at a different move order. We have reported the distribution of chess board states under different number of iterations, parameters and weights, depending on the preferred playing style. We believe this is the first kind of study of chess openings from weighted stochastic approach and we are not aware of similar study to the best of our knowledge. Our method has the applicability for all chess learners and players, both new and experienced. In future we want to extend our method to include more complex variations, such as playing in positions with colors reversed, with one piece missing/displaced.et
dc.identifier.urihttps://hdl.handle.net/10062/91784
dc.language.isoenget
dc.publisherTartu Ülikoolet
dc.rightsopenAccesset
dc.rightsAttribution-NonCommercial-NoDerivatives 4.0 International*
dc.rights.urihttp://creativecommons.org/licenses/by-nc-nd/4.0/*
dc.subjectAcceptance/Rejection Samplinget
dc.subjectMonte-Carlo Methodset
dc.subjectMarkov Chainet
dc.subjectMetropolis- Hastingset
dc.subjectChesset
dc.subjectChess Opening Moveset
dc.subject.othermagistritöödet
dc.subject.otherinformaatikaet
dc.subject.otherinfotehnoloogiaet
dc.subject.otherinformaticset
dc.subject.otherinfotechnologyet
dc.titleExploration vs Exploitation of Chess Openings Learner with Metropolis-Hastings Algorithmet
dc.typeThesiset

Files

Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Saarmann_MSc_computer_science_2022.pdf
Size:
1.49 MB
Format:
Adobe Portable Document Format
Description:
License bundle
Now showing 1 - 1 of 1
No Thumbnail Available
Name:
license.txt
Size:
1.71 KB
Format:
Item-specific license agreed upon to submission
Description: