Learning deeper convolutional neural networks has become a tendency in recent years. However, many empirical evidences suggest that performance improvement cannot be attained by simply stacking more layers. In this paper, we consider the issue from an information theoretical perspective, and propose a novel method Relay Backpropagation, which encourages the propagation of effective information through the network in training stage. By virtue of the method, we achieved the first place in ILSVRC 2015 Scene Classification Challenge. Extensive experiments on two challenging large scale datasets demonstrate the effectiveness of our method is not restricted to a specific dataset or network architecture.