Combinatorial Bandit and Reinforcement Learning for Online Recommendation
Wei Chen, April 13, 9:00
In online recommendation, the platform (or agent) recommends a list of items to the user and uses the user’s click feedback to improve future recommendations. The problem was originally formulated as a cascading bandit problem. In this talk, we first show how online recommendation fits into the general combinatorial multi-armed bandit (CMAB) framework. We then present several of our latest advances that extend the classical formulation. First, we show how to extend the cascading bandit to the case where we cannot assume that the user browses items in the recommended order, so the feedback is much weaker than in the standard cascading bandit. Second, we show how to integrate the cascading bandit into the reinforcement learning framework to accommodate stateful, multi-stage recommendations. Finally, we discuss our technical advances in variance-based techniques that improve the regret bounds for the cascading bandit and related CMAB problems.
Bio

Wei Chen is a Principal Researcher at Microsoft Research Asia and the Chair of the MSRA Theory Center. He is a guest professor at several universities, including Tsinghua University, Shanghai Jiao Tong University, Hong Kong University of Science and Technology – Guangzhou, and Shenzhen University. He is a standing committee member of the Technical Committee on Theoretical Computer Science of the China Computer Federation (CCF), and a member of the CCF Technical Committee on Big Data. He is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE). He has been recognized by Elsevier as one of the most cited Chinese researchers (2021-2023), and is ranked among the top 2% of scientists worldwide by the Stanford ranking (2020-2023).
Wei Chen’s main research interests include online learning and optimization, social and information networks, network game theory and economics, distributed computing, and fault tolerance. He has done influential research on the algorithmic study of social influence propagation and maximization and on combinatorial online learning, with more than 15,000 citations in total on these topics. He coauthored a monograph in English in 2013 and was the sole author of a monograph in Chinese in 2020, both on information and influence propagation in social networks. He has served as an editor, conference chair, and program committee member for many academic journals and conferences. He has won several best paper awards, including the 2021 ICDM 10-Year Highest-Impact Paper Award and the William C. Carter Award for best paper at DSN’2000. Wei Chen holds bachelor’s and master’s degrees from Tsinghua University and a Ph.D. degree in computer science from Cornell University.
For more information, visit his home page at http://research.microsoft.com/en-us/people/weic/.