Soichiro Nishimori
Verified email at g.ecc.u-tokyo.ac.jp
Title
Cited by
Year
Pgx: Hardware-accelerated parallel game simulators for reinforcement learning
S Koyamada, S Okano, S Nishimori, Y Murata, K Habara, H Kita, S Ishii
Advances in Neural Information Processing Systems 36, 45716-45743, 2023
Cited by 29, 2023
Mjx: A framework for Mahjong AI research
S Koyamada, K Habara, N Goto, S Okano, S Nishimori, S Ishii
2022 IEEE Conference on Games (CoG), 504-507, 2022
Cited by 4, 2022
A policy gradient primal-dual algorithm for constrained MDPs with uniform PAC guarantees
T Kitamura, T Kozuno, M Kato, Y Ichihara, S Nishimori, A Sannai, ...
arXiv preprint arXiv:2401.17780, 2024
Cited by 3, 2024
JAX-CORL: Clean Single-file Implementations of Offline RL Algorithms in JAX
S Nishimori
URL https://github.com/nissymori/JAX-CORL, 2024
Cited by 2*, 2024
A Batch Sequential Halving Algorithm without Performance Degradation
S Koyamada, S Nishimori, S Ishii
arXiv preprint arXiv:2406.00424, 2024
2024
Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains
S Nishimori, XQ Cai, J Ackermann, M Sugiyama
arXiv preprint arXiv:2404.07465, 2024
2024
End-to-End Policy Gradient Method for POMDPs and Explainable Agents
S Nishimori, S Koyamada, S Ishii
arXiv preprint arXiv:2304.09769, 2023
2023