According to the World Health Organization, Mpox (monkeypox) is a significant disease reported in 110 countries, mostly in South Asia and Africa. The number of Mpox cases has risen rapidly, and the medical community is concerned about the emergence of a new pandemic. Detecting Mpox with traditional methods (test kits) is costly and slow. There is therefore a need for deep-learning-based automated methods that can diagnose Mpox from skin images with high accuracy. In this work, we propose a fast and reliable multi-class automated diagnosis model for skin lesion images, including Mpox, built on transformer-based deep learning architectures. We also investigate how self-supervised learning, self-distillation, and shifted-window techniques affect classification performance when transformer-based architectures are trained on multi-class skin lesion images. The Mpox Skin Lesion Dataset, Version 2.0, publicly released in 2024, was used for training, validation, and testing. The proposed SwinTransformer architecture achieved about 8% higher classification accuracy than its closest competitor in the literature. The ViT, MAE, DINO, and SwinTransformer architectures achieved classification accuracies of 93.10%, 84.60%, 90.40%, and 93.71%, respectively. These results show that Mpox and other skin lesions can be diagnosed from images with high accuracy, and that the model can support doctors in decision-making. In addition, the study offers guidance on which transformer-based architectures and techniques to use in other medical fields where few images are available.