enakai00/reversi.ipynb

Created November 7, 2016 08:34

Reinforcement learning example for mini-max method Reversi.

yaneurao commented Nov 16, 2016 (edited)

Typo in the notebook:
× シュミレーション
○ シミュレーション ("simulation")

"Interestingly, the win rate was lower than it was against the simple minimax method."

The discussion assumes that a player using the evaluation function "own disc count - opponent's disc count" is stronger than a random player, but that is not self-evident.

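In fact, in Reversi it is often considered better to hold fewer discs in the opening, because that leaves you more squares to play on in the middle game and endgame. For opening play, the moves of a random player are far better than those chosen by the "own disc count - opponent's disc count" evaluation function.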

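So unless you first measure the win rate between a random player and a player using the "own disc count - opponent's disc count" evaluation function, you cannot say whether the result quoted above is really "interesting".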
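
A check along those lines could look roughly like the sketch below. It is not the notebook's code: the 8x8 board, the function names, and the one-ply greedy player (used here as a simple stand-in for a minimax player with the same evaluation function) are all assumptions made for illustration.

```python
import random

SIZE = 8  # assumed board size; the notebook's board may differ
DIRS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]

def new_board(size=SIZE):
    """Starting position: 0 is empty, 1 and -1 are the two players."""
    b = [[0] * size for _ in range(size)]
    m = size // 2
    b[m - 1][m - 1] = b[m][m] = -1
    b[m - 1][m] = b[m][m - 1] = 1
    return b

def flips(board, r, c, player):
    """Opponent discs that would be flipped if `player` moved at (r, c)."""
    if board[r][c] != 0:
        return []
    n = len(board)
    result = []
    for dr, dc in DIRS:
        line = []
        y, x = r + dr, c + dc
        while 0 <= y < n and 0 <= x < n and board[y][x] == -player:
            line.append((y, x))
            y, x = y + dr, x + dc
        # The run of opponent discs must be closed by one of our own discs.
        if line and 0 <= y < n and 0 <= x < n and board[y][x] == player:
            result.extend(line)
    return result

def legal_moves(board, player):
    n = len(board)
    return [(r, c) for r in range(n) for c in range(n) if flips(board, r, c, player)]

def play(board, move, player):
    """Return a copy of the board with `move` applied for `player`."""
    new = [row[:] for row in board]
    r, c = move
    for y, x in flips(board, r, c, player) + [(r, c)]:
        new[y][x] = player
    return new

def disc_diff(board, player):
    """Evaluation discussed above: own disc count minus opponent's disc count."""
    return sum(cell for row in board for cell in row) * player

def random_player(board, player, rng):
    return rng.choice(legal_moves(board, player))

def greedy_player(board, player, rng):
    """One-ply player that maximizes disc_diff; ties are broken at random."""
    moves = legal_moves(board, player)
    scores = [disc_diff(play(board, m, player), player) for m in moves]
    best = max(scores)
    return rng.choice([m for m, s in zip(moves, scores) if s == best])

def play_game(first, second, rng):
    """Play one game; return the final disc difference for the first player."""
    board, player = new_board(), 1
    agents = {1: first, -1: second}
    passes = 0
    while passes < 2:  # the game ends when both sides have to pass
        if legal_moves(board, player):
            board = play(board, agents[player](board, player, rng), player)
            passes = 0
        else:
            passes += 1
        player = -player
    return disc_diff(board, 1)

def win_rate(games=200, seed=0):
    """Fraction of games greedy_player wins against random_player,
    alternating which side moves first."""
    rng = random.Random(seed)
    wins = 0
    for i in range(games):
        if i % 2 == 0:
            wins += play_game(greedy_player, random_player, rng) > 0
        else:
            wins += play_game(random_player, greedy_player, rng) < 0
    return wins / games
```

Calling `print(win_rate())` gives an estimate of how often the disc-difference player beats the random player. To test the point about the opening specifically, one could restrict the greedy player to the first several moves and let both sides play randomly afterwards; that variation is left out here.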

Author

enakai00 commented Nov 20, 2016

I see. Thank you.
