| Please use this identifier to cite this item: http://hdl.handle.net/11133/3491 |
 
| Title: | 深層学習とプレイアウトに基づく囲碁アルゴリズム (Go Algorithm Based on Deep Learning and Playout) |
| Other titles: | シンソウ ガクシュウ ト プレイ アウト ニ モトズク イゴ アルゴリズム; Go Algorithm Based on Deep Learning and Playout |
| Authors: | 伊藤, 雅 (ITOH, Masaru); 伊藤, 有人 (ITO, Arito) |
| Issue date: | Sunday, March 31, 2019 |
| Publisher: | 愛知工業大学 (Aichi Institute of Technology) |
| Abstract: | This paper describes a Go algorithm based on deep learning and playout. The algorithm runs in a small-resource environment consisting of one CPU and one GPU. The best next move is obtained with a Value-Monte-Carlo tree search, which is a best-first search method. The proposed method omits the tree policy used by AlphaGo. Instead, when a leaf node is expanded, the 20 candidate moves with the highest probability, obtained in synchronization with the SL policy network, are added as its children. The win/loss evaluation that AlphaGo obtains from its rollout policy is replaced by playout, which is commonly used in ordinary Monte-Carlo tree search. Nodes are evaluated not by the ordinary UCB1 value but by the action value used in AlphaGo. Numerical experiments confirmed the statistical significance of the proposed method and identified both the best mixing parameter value and the node expansion threshold. |
| URI: | http://hdl.handle.net/11133/3491 |
| Appears in collections: | No. 54 (54号) |
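The expansion and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract gives only the limit of 20 candidates per expansion and the use of an action value; everything else here is an assumption made for illustration: the names `Node`, `top_k_moves`, and `mixed_evaluation`, the representation of the policy output as a move-to-probability dictionary, and the AlphaGo-style mixing form (1 - lam) * value + lam * playout, which is assumed to be what the "mixing parameter" refers to.

```python
from dataclasses import dataclass, field

TOP_K = 20  # candidates kept per expansion, as stated in the abstract


def top_k_moves(policy_probs, k=TOP_K):
    """Return the k (move, probability) pairs with the highest probability."""
    ranked = sorted(policy_probs.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:k]


def mixed_evaluation(value_estimate, playout_result, lam):
    """Blend a value estimate with a playout outcome.

    Assumption: AlphaGo-style mixing, (1 - lam) * v + lam * z.
    """
    return (1.0 - lam) * value_estimate + lam * playout_result


@dataclass
class Node:
    prior: float = 0.0          # probability assigned by the policy output
    visits: int = 0             # number of evaluations backed up through this node
    total_value: float = 0.0    # sum of backed-up evaluations
    children: dict = field(default_factory=dict)

    def action_value(self):
        """Mean backed-up evaluation; 0.0 for an unvisited node."""
        return self.total_value / self.visits if self.visits else 0.0

    def expand(self, policy_probs, k=TOP_K):
        """Add only the k most probable moves as children."""
        for move, prob in top_k_moves(policy_probs, k):
            self.children[move] = Node(prior=prob)

    def update(self, evaluation):
        """Back up one evaluation into this node's statistics."""
        self.visits += 1
        self.total_value += evaluation
```

In this sketch, a leaf would be expanded once its visit count passes the node expansion threshold mentioned in the abstract; the threshold value itself is not given in the abstract and is left out here.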
 
    
     
Items stored in this repository are protected by copyright unless otherwise indicated.