Commit Graph

21 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Phil Wang | 34e6284f95 | Update README.md | 2020-12-24 10:58:41 -08:00 |
| Phil Wang | aa9ed249a3 | add knowledge distillation with distillation tokens, in light of new finding from facebook ai | 2020-12-24 10:39:15 -08:00 |
| Phil Wang | ea0924ec96 | update readme | 2020-12-23 19:06:48 -08:00 |
| Phil Wang | 24339644ca | offer a way to use mean pooling of last layer | 2020-12-23 17:23:58 -08:00 |
| Phil Wang | b786029e18 | fix the dimension per head to be independent of dim and heads, to make sure users do not have it be too small to learn anything | 2020-12-17 07:43:52 -08:00 |
| Phil Wang | a656a213e6 | update diagram | 2020-12-04 12:26:28 -08:00 |
| Long M. Lưu | 3f50dd72cf | Update README.md | 2020-11-21 18:37:03 +07:00 |
| Phil Wang | 4f84ad7a64 | authors are now known | 2020-11-03 14:28:20 -08:00 |
| Phil Wang | c74bc781f0 | cite | 2020-11-03 11:59:05 -08:00 |
| Phil Wang | c1043ab00c | update readme | 2020-10-26 19:01:03 -07:00 |
| Phil Wang | 5b5d98a3a7 | dropouts are more specific and aggressive in the paper, thanks for letting me know @hila-chefer | 2020-10-14 09:22:16 -07:00 |
| Phil Wang | 0b2b3fc20c | add dropouts | 2020-10-13 13:11:59 -07:00 |
| Phil Wang | b298031c17 | write up example for using efficient transformers | 2020-10-07 19:15:21 -07:00 |
| Phil Wang | f7123720c3 | add masking | 2020-10-07 11:21:03 -07:00 |
| Phil Wang | 8fb261ca66 | fix a bug and add suggestion for BYOL pre-training | 2020-10-04 14:55:29 -07:00 |
| Phil Wang | 112ba5c476 | update with link to Yannics video | 2020-10-04 13:53:47 -07:00 |
| Phil Wang | f899226d4f | add diagram | 2020-10-04 12:47:08 -07:00 |
| Phil Wang | ee8088b3ea | first commit | 2020-10-04 12:35:01 -07:00 |
| Phil Wang | ea03db32f0 | Update README.md | 2020-10-03 15:49:27 -07:00 |
| Phil Wang | 30362d50dc | Update README.md | 2020-10-03 15:49:02 -07:00 |
| Phil Wang | efb40e0b01 | Initial commit | 2020-10-03 15:47:26 -07:00 |