Vision Transformer - PyTorch (WIP)

Implementation of Vision Transformer, a simple way to achieve state-of-the-art (SOTA) results in vision classification with only a single transformer encoder, in PyTorch. There's really not much to code here, but we may as well lay out all the code to expedite the attention revolution and get everyone on the same page.
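
To make the "single transformer encoder" idea concrete, here is a minimal sketch. It is not this repository's code or API: the class name `MinimalViT`, its parameter names, and the default sizes are all hypothetical, and it leans on PyTorch's built-in `nn.TransformerEncoder` (with `batch_first=True`, which assumes a reasonably recent PyTorch) rather than a hand-written attention block. The image is cut into non-overlapping patches, each patch is linearly embedded, a learnable class token and position embeddings are added, and the encoder output at the class token is fed to a linear classifier.

```python
import torch
from torch import nn


class MinimalViT(nn.Module):
    """Hypothetical minimal Vision Transformer sketch (not this repo's API)."""

    def __init__(self, image_size=32, patch_size=16, num_classes=10,
                 dim=64, depth=2, heads=4, mlp_dim=128, channels=3):
        super().__init__()
        assert image_size % patch_size == 0, "image size must be divisible by patch size"
        num_patches = (image_size // patch_size) ** 2
        patch_dim = channels * patch_size ** 2
        self.patch_size = patch_size

        # Linear projection of flattened patches to the model dimension.
        self.to_embedding = nn.Linear(patch_dim, dim)
        # Learnable class token and position embeddings (one extra slot for the class token).
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embedding = nn.Parameter(torch.randn(1, num_patches + 1, dim) * 0.02)

        # A single stack of standard transformer encoder layers.
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=mlp_dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, img):
        b, c, h, w = img.shape
        p = self.patch_size
        # (b, c, h, w) -> (b, h/p, w/p, c, p, p) -> (b, num_patches, c * p * p)
        patches = (
            img.reshape(b, c, h // p, p, w // p, p)
            .permute(0, 2, 4, 1, 3, 5)
            .reshape(b, -1, c * p * p)
        )
        x = self.to_embedding(patches)
        cls = self.cls_token.expand(b, -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embedding
        x = self.encoder(x)
        # Classify from the encoder output at the class-token position.
        return self.head(x[:, 0])


if __name__ == "__main__":
    model = MinimalViT().eval()  # eval() disables dropout in the encoder layers
    with torch.no_grad():
        logits = model(torch.randn(2, 3, 32, 32))
    print(logits.shape)  # torch.Size([2, 10])
```

With the assumed defaults, a 32x32 image and 16x16 patches give 4 patch tokens plus the class token, so the encoder sees sequences of length 5. Real configurations use larger images and model widths; the small sizes here only keep the sketch quick to run.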

Citations

@inproceedings{
    anonymous2021an,
    title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
    author={Anonymous},
    booktitle={Submitted to International Conference on Learning Representations},
    year={2021},
    url={https://openreview.net/forum?id=YicbFdNTTy},
    note={under review}
}