Phil Wang | 85314cf0b6 | patch for scaling factor, thanks to @urkax | 2021-01-21 09:39:42 -08:00
Phil Wang | e8ca6038c9 | allow for DistillableVit to still run predictions | 2021-01-11 10:49:14 -08:00
Phil Wang | 2263b7396f | allow distillable efficient vit to restore efficient vit as well | 2020-12-25 19:31:25 -08:00
Phil Wang | 74074e2b6c | offer easy way to turn DistillableViT to ViT at the end of training | 2020-12-25 11:16:52 -08:00
Phil Wang | 0c68688d61 | bump for release | 2020-12-25 09:30:48 -08:00
Phil Wang | db98ed7a8e | allow for overriding alpha as well on forward in distillation wrapper | 2020-12-24 11:18:36 -08:00
Phil Wang | dc4b3327ce | no grad for teacher in distillation | 2020-12-24 11:11:58 -08:00
Phil Wang | aa9ed249a3 | add knowledge distillation with distillation tokens, in light of new finding from facebook ai | 2020-12-24 10:39:15 -08:00
Phil Wang | 59787a6b7e | allow for mean pool with efficient version too | 2020-12-23 18:15:40 -08:00
Phil Wang | 24339644ca | offer a way to use mean pooling of last layer | 2020-12-23 17:23:58 -08:00
Phil Wang | b786029e18 | fix the dimension per head to be independent of dim and heads, to make sure users do not have it be too small to learn anything | 2020-12-17 07:43:52 -08:00
Phil Wang | 9624181940 | simplify mlp head | 2020-12-07 14:31:50 -08:00
Phil Wang | 6c8dfc185e | remove float(-inf) as masking value | 2020-11-13 12:25:21 -08:00
Phil Wang | 7a214d7109 | allow for training on different image sizes, provided images are smaller than what was passed as image_size keyword on init | 2020-10-25 13:17:42 -07:00
Phil Wang | 6d1df1a970 | more efficient | 2020-10-22 22:37:06 -07:00
Phil Wang | d65a8c17a5 | remove dropout from last linear to logits | 2020-10-16 13:58:23 -07:00
Phil Wang | f7c164d910 | assert minimum number of patches | 2020-10-16 12:19:50 -07:00
Phil Wang | 5b5d98a3a7 | dropouts are more specific and aggressive in the paper, thanks for letting me know @hila-chefer | 2020-10-14 09:22:16 -07:00
Phil Wang | b0e4790c24 | bump package | 2020-10-13 13:12:19 -07:00
Phil Wang | ced464dcb4 | Update setup.py | 2020-10-11 00:06:26 -07:00
Phil Wang | a0fa41070f | norm cls token before sending to mlp head | 2020-10-10 12:08:42 -07:00
Phil Wang | b298031c17 | write up example for using efficient transformers | 2020-10-07 19:15:21 -07:00
Phil Wang | d66b29e4cf | cleanup stray print | 2020-10-07 11:22:45 -07:00
Phil Wang | f7123720c3 | add masking | 2020-10-07 11:21:03 -07:00
Phil Wang | 8fb261ca66 | fix a bug and add suggestion for BYOL pre-training | 2020-10-04 14:55:29 -07:00
Phil Wang | 825a9484d1 | small bug fix | 2020-10-04 12:39:51 -07:00
Phil Wang | ee8088b3ea | first commit | 2020-10-04 12:35:01 -07:00