Virtually every organism gathers information about its noisy environment and builds models from those data, mostly using neural networks. Here, we use stochastic thermodynamics to analyze the learning of a classification rule by a neural network. We show that the information acquired by the network is bounded by the thermodynamic cost of learning and introduce a learning efficiency η≤1. We discuss the conditions for optimal learning and analyze Hebbian learning in the thermodynamic limit.
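To make the learning setting concrete, the following is a minimal sketch of Hebbian learning of a classification rule, assuming a teacher-student perceptron scenario. This is an illustrative choice, not necessarily the exact model analyzed here. The names `hebbian_student`, `generalization_error`, `teacher`, and the Gaussian inputs are all hypothetical. The sketch computes neither the thermodynamic cost nor the efficiency η. It only shows how a network can pick up a rule from labeled samples.

```python
import numpy as np


def hebbian_student(inputs, labels):
    """Hebbian weights: label-weighted sum of the inputs, scaled by 1/sqrt(N)."""
    n_features = inputs.shape[1]
    return labels @ inputs / np.sqrt(n_features)


def generalization_error(student, teacher):
    """Probability that student and teacher disagree on a fresh isotropic input."""
    overlap = student @ teacher / (
        np.linalg.norm(student) * np.linalg.norm(teacher)
    )
    # For isotropic inputs the disagreement probability is the angle
    # between the two weight vectors divided by pi.
    return np.arccos(np.clip(overlap, -1.0, 1.0)) / np.pi


rng = np.random.default_rng(0)
n_features, n_samples = 200, 2000  # assumed sizes, chosen for illustration

# Assumption: the rule to be learned is set by a fixed random teacher vector.
teacher = rng.standard_normal(n_features)

# Assumption: inputs are i.i.d. standard Gaussian vectors.
inputs = rng.standard_normal((n_samples, n_features))
labels = np.sign(inputs @ teacher)  # true classification of each sample

student = hebbian_student(inputs, labels)
error = generalization_error(student, teacher)

# The same rule learned from far fewer samples, for comparison.
student_few = hebbian_student(inputs[:20], labels[:20])
error_few = generalization_error(student_few, teacher)

print(f"generalization error with {n_samples} samples: {error:.3f}")
print(f"generalization error with 20 samples: {error_few:.3f}")
```

In this sketch the student sees each sample once and never revisits it, which is what makes the rule Hebbian. Increasing the ratio of samples to input dimension should lower the generalization error.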