IRIS - Institutional Research Information System

We study online learning for optimal allocation when the resource to be allocated is time. An agent receives task proposals sequentially according to a Poisson process and can either accept or reject a proposed task. If she accepts the proposal, she is busy for the duration of the task and obtains a reward that depends on the task duration. If she rejects it, she remains on hold until a new task proposal arrives. We study the regret incurred by the agent, first when she knows her reward function but does not know the distribution of the task duration, and then when she does not know her reward function either. This natural setting bears similarities with contextual (one-armed) bandits, but with the crucial difference that the normalized reward associated with a context depends on the whole distribution of contexts.
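The accept/reject setting described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the method of the paper: it assumes that proposals arriving while the agent is busy are lost, that durations are positive, and that the agent uses a threshold rule "accept a task of duration x when reward(x) >= c * x". Under those assumptions, a standard renewal-reward argument suggests that the best long-run reward rate c solves c = arrival_rate * E[max(reward(X) - c * X, 0)], which also shows why the value of one task depends on the whole distribution of durations. The names `estimate_threshold` and `simulate_reward_rate` are hypothetical and chosen for illustration only.

```python
import random


def estimate_threshold(durations, reward, arrival_rate, iters=60):
    """Solve c = arrival_rate * mean(max(reward(x) - c * x, 0)) by bisection.

    `durations` is a sample of observed task durations (assumed positive).
    The left side minus the right side is monotone in c, so bisection works.
    """
    n = len(durations)

    def gap(c):
        surplus = sum(max(reward(x) - c * x, 0.0) for x in durations) / n
        return arrival_rate * surplus - c

    # gap(0) >= 0 and gap(hi) <= 0, so the root lies in [0, hi].
    lo = 0.0
    hi = arrival_rate * sum(max(reward(x), 0.0) for x in durations) / n
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate_reward_rate(threshold, arrival_rate, sample_duration, reward,
                         horizon, seed=0):
    """Simulate the threshold policy and return total reward / horizon.

    A task that straddles the horizon is counted in full, which is a
    small boundary effect for long horizons.
    """
    rng = random.Random(seed)
    t = 0.0
    total = 0.0
    while True:
        t += rng.expovariate(arrival_rate)  # idle until the next proposal
        if t >= horizon:
            break
        x = sample_duration(rng)
        if reward(x) >= threshold * x:  # accept: busy for x, earn reward(x)
            total += reward(x)
            t += x
        # reject: stay on hold and wait for the next proposal
    return total / horizon


if __name__ == "__main__":
    # Illustrative choices only: uniform durations and a concave reward.
    draw = lambda rng: rng.uniform(0.1, 2.0)
    r = lambda x: x ** 0.5
    sample_rng = random.Random(1)
    sample = [draw(sample_rng) for _ in range(5000)]
    c_hat = estimate_threshold(sample, r, arrival_rate=1.0)
    print("estimated threshold:", c_hat)
    print("simulated rate:", simulate_reward_rate(c_hat, 1.0, draw, r, 50000.0))
```

In this sketch the threshold is estimated from a sample of durations with a known reward function, which loosely mirrors the first case in the abstract (known reward, unknown duration distribution). It does not attempt to reproduce the learning algorithms or regret bounds of the paper.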

Boursier, E.; Garrec, T.; Perchet, V.; Scarsini, M. (2021). Making the most of your day: online learning for optimal allocation of time. In Advances in Neural Information Processing Systems (pp. 11208-11219). https://proceedings.neurips.cc/paper/2021/hash/5d2c2cee8ab0b9a36bd1ed7196bd6c4a-Abstract.html

Making the most of your day: online learning for optimal allocation of time

Boursier E.; Garrec T.; Perchet V.; Scarsini M.

2021



Conference year: 2021

Keywords: online learning, scheduling

Appears in categories: 04.1 - Contribution in conference proceedings (Paper in Proceedings)

Files in this record:

File	Size	Format
NeurIPS2021BGPS.pdf (Open Access; type: publisher's version; license: all rights reserved)	715.91 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11385/219158
