CowMask produces state-of-the-art results on the ImageNet dataset while using only 10% of the labeled data during training: the model achieves a top-5 error rate of 8.76% and a top-1 error rate of 26.06%. CowMask makes it possible to train models of state-of-the-art quality with a simpler architecture. The researchers also tested CowMask for semi-supervised training on the SVHN, CIFAR-10, and CIFAR-100 datasets.
Models trained with the proposed approach achieved results comparable to the state of the art. The project code is available in an open repository on GitHub.
Intern project by Geoff French, hosted by Avital Oliver and Tim Salimans. This project explores the use of CowMask for…
Consistency in semi-supervised training
Consistency regularization is a semi-supervised training technique that combines a small amount of labeled data with a large amount of unlabeled data. The model is trained so that its predictions on unlabeled examples remain stable under perturbations of the input.
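The idea above can be sketched as a loss term that penalizes disagreement between two predictions for the same unlabeled example. Mean squared error between class-probability vectors is one common choice; this is an illustrative sketch, not necessarily the exact loss used in the paper.

```python
import numpy as np

def consistency_loss(student_probs, teacher_probs):
    """MSE consistency between two class-probability vectors for the same
    unlabeled input (a common choice in consistency regularization).

    This is a sketch; the function name and the use of MSE are assumptions
    for illustration, not taken from the paper.
    """
    return float(np.mean((student_probs - teacher_probs) ** 2))
```

When the two predictions agree, the loss is zero; the further they drift apart under perturbation, the larger the penalty added to the supervised objective.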
How CowMask works
The researchers adapted the Mean Teacher framework as the basis of their approach. It uses two networks: a student and a teacher, both of which predict a vector of class probabilities. The student network is trained in the standard manner using gradient descent. After each parameter update, the teacher's weights are set to an exponential moving average of the student's weights. A decay hyperparameter controls how quickly and how smoothly the teacher network follows the student network.
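The exponential-moving-average update described above can be sketched as follows. The parameter dictionary layout and the `decay` name are assumptions for illustration; `decay` is the hyperparameter that regulates how closely the teacher tracks the student (higher decay means a slower, more stable teacher).

```python
import numpy as np

def ema_update(teacher_params, student_params, decay=0.99):
    """Set each teacher weight to an exponential moving average of the
    corresponding student weight, as in the Mean Teacher framework.

    teacher_params, student_params -- dicts mapping parameter names to arrays
    decay -- EMA decay rate in [0, 1); controls the teacher's inertia
    """
    return {
        name: decay * teacher_params[name] + (1.0 - decay) * student_params[name]
        for name in teacher_params
    }
```

For example, with `decay=0.9`, a teacher weight of 1.0 and a student weight of 2.0 produce an updated teacher weight of 1.1: the teacher moves only a tenth of the way toward the student on each step.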
CowMask generation algorithm pseudocode
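A minimal NumPy sketch of the generation idea behind CowMask: smooth per-pixel Gaussian noise, then threshold it so that roughly a target fraction of pixels is masked, which yields irregular cow-spot-like blobs. The function name, the fixed smoothing scale, and the thresholding details are assumptions for illustration; the paper's algorithm also randomizes the smoothing scale and mask proportion.

```python
import numpy as np
from statistics import NormalDist

def _gaussian_smooth(x, sigma):
    # Separable Gaussian blur implemented with 1-D convolutions.
    radius = int(3 * sigma)
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-t ** 2 / (2 * sigma ** 2))
    kernel /= kernel.sum()
    for axis in (0, 1):
        x = np.apply_along_axis(
            lambda row: np.convolve(row, kernel, mode="same"), axis, x)
    return x

def cow_mask(shape, p=0.5, sigma=8.0, rng=None):
    """Generate a blobby binary mask by thresholding smoothed Gaussian noise.

    p     -- target fraction of pixels set to 1
    sigma -- smoothing scale; larger sigma gives larger blobs
    """
    rng = rng or np.random.default_rng()
    noise = rng.standard_normal(shape)
    smooth = _gaussian_smooth(noise, sigma)
    # Pick a threshold so that ~p of the (approximately Gaussian) smoothed
    # values fall below it, giving a mask with roughly proportion p of ones.
    thresh = smooth.mean() + NormalDist().inv_cdf(p) * smooth.std()
    return (smooth <= thresh).astype(np.float32)
```

The smoothing scale trades off blob size against blob count: small `sigma` produces many small spots, large `sigma` a few large ones, while the threshold keeps the masked area near the target proportion.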
CowMask was used for two types of consistency regularization: mask-based erasure and mask-based mixing.
Pseudocode for the two types of regularization: erasure and mixing.
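The two uses of a mask can be sketched in a few lines: erasure removes the masked-out region of a single image, while mixing blends two images according to the mask. The function names are assumptions for illustration, and the actual method may fill erased regions or weight the consistency loss differently.

```python
import numpy as np

def erase_with_mask(image, mask):
    """Mask-based erasure: keep pixels where mask == 1, zero the rest.
    (A sketch; erased regions could also be filled with noise or a mean.)"""
    return image * mask

def mix_with_mask(image_a, image_b, mask):
    """Mask-based mixing: take image_a where mask == 1, image_b elsewhere."""
    return image_a * mask + image_b * (1.0 - mask)
```

In the mixing case, the teacher's predictions for the two source images are typically blended in the same proportion as the pixels, so the mixed image gets a correspondingly mixed training target.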
Details of the approach evaluation are available in the original article.