Pure Exploration in Infinitely-Armed Bandit Models with Fixed-Confidence

Open archive

Aziz, Maryam | Anderton, Jesse | Kaufmann, Emilie | Aslam, Javed

Published by CCSD

International audience. We consider the problem of near-optimal arm identification in the fixed-confidence setting of the infinitely-armed bandit problem, when nothing is known about the arm reservoir distribution. We (1) introduce a PAC-like framework within which to derive and cast results; (2) derive a sample complexity lower bound for near-optimal arm identification; (3) propose an algorithm that identifies a nearly-optimal arm with high probability and derive an upper bound on its sample complexity which is within a log factor of our lower bound; and (4) discuss whether our log²(1/δ) dependence is inescapable for "two-phase" (select arms first, identify the best later) algorithms in the infinite setting. This work permits the application of bandit models to a broader class of problems where fewer assumptions hold.
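To make the "two-phase" template concrete, the sketch below draws a batch of arms from the reservoir and then runs a standard best-arm identification routine (successive elimination here) on that batch. It is illustrative only, not the algorithm or the bounds from the paper: the batch size m, the assumed near-optimality probability alpha, the Hoeffding-style confidence radius, and the sample_reservoir_arm/pull interfaces are all assumptions introduced for the example.

```python
import math
import random


def two_phase_pac(sample_reservoir_arm, pull, alpha, epsilon, delta):
    """Hedged sketch of a generic "two-phase" (epsilon, delta)-PAC procedure for an
    infinitely-armed bandit: draw a batch of arms from the reservoir, then run a
    best-arm identification routine on that finite batch.

    sample_reservoir_arm(): returns a handle to a fresh arm from the reservoir.
    pull(arm): returns one reward in [0, 1] from that arm.
    alpha: assumed probability that a single reservoir draw is near-optimal.
    """
    # Phase 1: draw enough arms so that, with probability at least 1 - delta/2,
    # the batch contains at least one near-optimal arm (under the alpha assumption).
    m = max(1, math.ceil(math.log(2.0 / delta) / alpha))
    arms = [sample_reservoir_arm() for _ in range(m)]

    # Phase 2: successive elimination on the batch with confidence delta/2.
    counts = {a: 0 for a in arms}
    sums = {a: 0.0 for a in arms}
    active = list(arms)
    t = 0
    while len(active) > 1:
        t += 1
        for a in active:
            sums[a] += pull(a)
            counts[a] += 1
        # Hoeffding-style radius with a union bound over arms and rounds.
        radius = math.sqrt(math.log(4.0 * m * t * t / delta) / (2.0 * t))
        means = {a: sums[a] / counts[a] for a in active}
        best = max(means.values())
        # Keep only arms whose upper confidence bound reaches the leader's lower bound.
        active = [a for a in active if means[a] + radius >= best - radius]
        if radius <= epsilon / 2:
            break
    # Return the empirically best surviving arm.
    return max(active, key=lambda a: sums[a] / counts[a])


if __name__ == "__main__":
    # Toy reservoir: each new arm is Bernoulli with a mean drawn uniformly in [0, 1].
    def sample_reservoir_arm():
        return random.random()

    def pull(mean):
        return 1.0 if random.random() < mean else 0.0

    arm = two_phase_pac(sample_reservoir_arm, pull, alpha=0.05, epsilon=0.1, delta=0.05)
    print("selected arm mean:", arm)
```

Intuitively, Phase 1 controls the chance of missing every near-optimal arm in the reservoir and Phase 2 controls the chance of misidentifying the best arm within the drawn batch; drawing on the order of log(1/δ) arms and then certifying them with on the order of log(1/δ) samples each is the source of the log²(1/δ) dependence discussed in point (4) of the abstract.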

Suggestions

From the same author

On Multi-Armed Bandit Designs for Dose-Finding Trials

Open archive | Aziz, Maryam | CCSD

International audience. We study the problem of finding the optimal dosage in early stage clinical trials through the multi-armed bandit lens. We advocate the use of the Thompson Sampling principle, a flexible algor...
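The blurb above advocates the Thompson Sampling principle; a minimal, generic sketch of that principle for Bernoulli rewards with Beta(1, 1) priors follows. It is an illustration of the principle only, not the dose-finding design studied in the paper; the pull interface, the priors, and the horizon are assumptions made for the example.

```python
import random


def thompson_sampling(pull, n_arms, horizon):
    """Hedged sketch of Thompson Sampling for Bernoulli rewards with Beta(1, 1) priors.

    pull(arm): returns a 0/1 reward for the chosen arm (assumed interface).
    """
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(horizon):
        # Sample a mean estimate from each arm's Beta posterior and play the argmax.
        samples = [random.betavariate(1 + successes[a], 1 + failures[a])
                   for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        reward = pull(arm)
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures
```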

Near Instance-Optimal PAC Reinforcement Learning for Deterministic MDPs

Open archive | Tirinzoni, Andrea | CCSD

International audience. In probably approximately correct (PAC) reinforcement learning (RL), an agent is required to identify an ε-optimal policy with probability 1 − δ. While minimax optimal algorithms exist for th...

A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players

Open archive | Boursier, Etienne | CCSD

International audience. We study a multiplayer stochastic multi-armed bandit problem in which players cannot communicate, and if two or more players pull the same arm, a collision occurs and the involved players rece...
