This study investigates the application of reinforcement learning to the adaptive tuning of neighborhood probabilities in stochastic multi-neighborhood search. The aim is to provide a tuning method for heterogeneous scenarios that is more flexible and robust than traditional offline tuning. We propose a novel mix of learning components for multi-neighborhood Simulated Annealing, which considers both the cost- and time-effectiveness of moves. To assess the performance of our approach, we employ two real-world case studies in timetabling, namely examination timetabling and sports timetabling, for which multi-neighborhood Simulated Annealing has already obtained remarkable results using offline tuning techniques. Experimental data show that our approach obtains better results on benchmarking datasets than the analogous algorithm tuned with state-of-the-art offline techniques, while requiring less tuning effort.
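As a rough illustration of the general idea (not the paper's specific learning components), the following sketch shows a multi-neighborhood Simulated Annealing loop in which each neighborhood's selection probability is adapted online from the observed effectiveness of its moves. All names (`adaptive_sa`, the reward rule, the learning rate `lr`) are hypothetical choices for this sketch; a reward based on cost improvement per unit time could be substituted where indicated.

```python
import math
import random

def adaptive_sa(initial, neighborhoods, cost, t0=10.0, cooling=0.95,
                steps_per_temp=100, t_min=0.01, lr=0.1, seed=0):
    """Illustrative multi-neighborhood SA with adaptive selection weights.

    `neighborhoods` is a list of functions (solution, rng) -> new solution.
    Each neighborhood carries a weight; weights are normalized into
    selection probabilities and updated from a simple reward signal.
    """
    rng = random.Random(seed)
    sol, c = initial, cost(initial)
    best, best_c = sol, c
    weights = [1.0] * len(neighborhoods)  # one adaptive weight per neighborhood
    t = t0
    while t > t_min:
        for _ in range(steps_per_temp):
            total = sum(weights)
            k = rng.choices(range(len(neighborhoods)),
                            weights=[w / total for w in weights])[0]
            cand = neighborhoods[k](sol, rng)
            cand_c = cost(cand)
            delta = cand_c - c
            # standard Metropolis acceptance criterion
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                sol, c = cand, cand_c
                if c < best_c:
                    best, best_c = sol, c
            # reward: cost improvement of the attempted move; a
            # time-aware variant would divide by the move's running time
            reward = max(0.0, -delta)
            weights[k] = (1 - lr) * weights[k] + lr * reward
            weights[k] = max(weights[k], 1e-3)  # keep some exploration
        t *= cooling  # geometric cooling schedule
    return best, best_c
```

For example, minimizing a sum of squares with a "small step" and a "large step" neighborhood lets the weight update gradually favor whichever step size is currently paying off.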