Recent advances in visual tracking are based on siamese feature extractors and template matching. For this category of trackers, latest research focuses on better feature embeddings and similarity measures. In this work, we focus on building holistic object representations for tracking. The framework is designed to be used on top of previous trackers with no further need for training. We, therefore, present a new framework for obtaining additional object templates during the tracking process. Since the number of stored templates is limited, our method only keeps the most diverse ones. We achieve this by providing a new diversity measure in the space of siamese features recently introduced in the field. The obtained representation contains information beyond the ground truth object location provided to the system. It is then useful for tracking itself but also for further tasks which require a visual understanding of objects. Strong empirical results on tracking benchmarks indicate that our method can improve the performance and robustness of state-of-the-art trackers while barely reducing their speed.