On-line estimation of an optimal treatment allocation strategy