Commit Graph

44 Commits

Author SHA1 Message Date
Phil Wang
e712003dfb add CrossViT 2021-03-30 00:53:27 -07:00
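The CrossViT commit wires up a dual-branch architecture: a small-patch and a large-patch encoder exchanging information through cross attention. A minimal usage sketch, assuming the `vit_pytorch.cross_vit` import path and keyword names follow the repository's README; all values are illustrative:

```python
import torch
from vit_pytorch.cross_vit import CrossViT  # import path assumed from the repo layout

v = CrossViT(
    image_size = 256,
    num_classes = 1000,
    depth = 4,               # number of multi-scale encoding blocks
    sm_dim = 192,            # small-patch (high-resolution) branch dimension
    sm_patch_size = 16,      # small-patch size
    sm_enc_depth = 2,
    sm_enc_heads = 8,
    sm_enc_mlp_dim = 2048,
    lg_dim = 384,            # large-patch (low-resolution) branch dimension
    lg_patch_size = 64,
    lg_enc_depth = 3,
    lg_enc_heads = 8,
    lg_enc_mlp_dim = 2048,
    cross_attn_depth = 2,    # rounds of cross attention between the two branches
    cross_attn_heads = 8,
    dropout = 0.1,
    emb_dropout = 0.1
)

img = torch.randn(1, 3, 256, 256)
preds = v(img)  # (1, 1000)
```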
Phil Wang
8135d70e4e use hooks to retrieve attention maps for user without modifying ViT 2021-03-29 15:10:12 -07:00
Phil Wang
3067155cea add recorder class, for recording attention across layers, for researchers 2021-03-29 11:08:19 -07:00
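These two commits (3067155cea and 8135d70e4e) give researchers a way to pull per-layer attention maps out of a trained ViT without touching its code: a Recorder wrapper that registers forward hooks on the attention modules. A minimal sketch, assuming the wrap/forward/eject API the README describes:

```python
import torch
from vit_pytorch import ViT
from vit_pytorch.recorder import Recorder  # import path assumed

v = ViT(
    image_size = 256, patch_size = 32, num_classes = 1000,
    dim = 1024, depth = 6, heads = 16, mlp_dim = 2048
)

v = Recorder(v)  # attaches forward hooks; the underlying ViT is unmodified

img = torch.randn(1, 3, 256, 256)
preds, attns = v(img)
# attns holds the attention weights across all layers,
# e.g. (batch, layers, heads, patches + 1, patches + 1)

v = v.eject()  # remove the hooks and recover the original ViT
```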
Phil Wang
ab7315cca1 cleanup 2021-03-27 22:14:16 -07:00
Phil Wang
15294c304e remove masking, as it adds complexity for little benefit 2021-03-23 12:18:47 -07:00
Phil Wang
b900850144 add deep vit 2021-03-23 11:57:13 -07:00
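b900850144 brings in DeepViT, whose re-attention mechanism remixes attention maps across heads so that much deeper ViTs keep training. A sketch, assuming DeepViT takes the same constructor arguments as the base ViT:

```python
import torch
from vit_pytorch.deepvit import DeepViT  # import path assumed

v = DeepViT(
    image_size = 256,
    patch_size = 32,
    num_classes = 1000,
    dim = 1024,
    depth = 6,      # re-attention is what makes larger depths viable
    heads = 16,
    mlp_dim = 2048,
    dropout = 0.1,
    emb_dropout = 0.1
)

img = torch.randn(1, 3, 256, 256)
preds = v(img)  # (1, 1000)
```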
Phil Wang
78489045cd readme 2021-03-09 19:23:09 -08:00
Phil Wang
173e07e02e cleanup and release 0.8.0 2021-03-08 07:28:31 -08:00
Phil Wang
0e63766e54 Merge pull request #66 from zankner/masked_patch_pred: Masked Patch Prediction (suggested in #63, work in progress) 2021-03-08 07:21:52 -08:00
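PR #66 adds masked patch prediction, a BERT-style self-supervised objective: mask out image patches and train the ViT to predict them, no labels needed. A training-loop sketch, assuming an MPP wrapper along the lines the PR and README describe; the keyword names and probabilities here are illustrative:

```python
import torch
from vit_pytorch import ViT
from vit_pytorch.mpp import MPP  # import path assumed

model = ViT(
    image_size = 256, patch_size = 32, num_classes = 1000,
    dim = 1024, depth = 6, heads = 8, mlp_dim = 2048
)

mpp_trainer = MPP(
    transformer = model,
    patch_size = 32,
    dim = 1024,
    mask_prob = 0.15,          # fraction of patches drawn into the prediction task
    random_patch_prob = 0.30,  # of those, fraction swapped for a random patch
    replace_prob = 0.50        # of those, fraction replaced by the mask token
)

images = torch.randn(8, 3, 256, 256)
loss = mpp_trainer(images)  # self-supervised: no labels required
loss.backward()
```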
Zack Ankner
a6cbda37b9 added to readme 2021-03-08 09:34:55 -05:00
Phil Wang
3744ac691a remove patch size from T2TViT 2021-02-21 19:15:19 -08:00
Phil Wang
e3205c0a4f add token to token ViT 2021-02-19 22:28:53 -08:00
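e3205c0a4f adds Tokens-to-Token ViT, which replaces naive patch embedding with a progressive tokenization front end; the later commit 3744ac691a follows from this, since the token-to-token stages subsume the patch size. A sketch, assuming the constructor the README shows:

```python
import torch
from vit_pytorch.t2t import T2TViT  # import path assumed

v = T2TViT(
    dim = 512,
    image_size = 224,
    depth = 5,
    heads = 8,
    mlp_dim = 512,
    num_classes = 1000,
    t2t_layers = ((7, 4), (3, 2), (3, 2))  # (kernel size, stride) per token-to-token stage
)

img = torch.randn(1, 3, 224, 224)
preds = v(img)  # (1, 1000)
```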
Phil Wang
4fc7365356 incept idea for using nystromformer 2021-02-17 15:30:45 -08:00
Phil Wang
5db8d9deed update readme about non-square images 2021-01-12 06:55:45 -08:00
Phil Wang
e8ca6038c9 allow for DistillableViT to still run predictions 2021-01-11 10:49:14 -08:00
Phil Wang
1106a2ba88 link to official repo 2021-01-08 08:23:50 -08:00
Phil Wang
f95fa59422 link to resources for vision people 2021-01-04 10:10:54 -08:00
Phil Wang
be1712ebe2 add quote 2020-12-28 10:22:59 -08:00
Phil Wang
1a76944124 update readme 2020-12-27 19:10:38 -08:00
Phil Wang
74074e2b6c offer easy way to turn DistillableViT to ViT at the end of training 2020-12-25 11:16:52 -08:00
Phil Wang
e0007bd801 add distill diagram 2020-12-24 11:34:15 -08:00
Phil Wang
dc4b3327ce no grad for teacher in distillation 2020-12-24 11:11:58 -08:00
Phil Wang
aa8f0a7bf3 Update README.md 2020-12-24 10:59:03 -08:00
Phil Wang
34e6284f95 Update README.md 2020-12-24 10:58:41 -08:00
Phil Wang
aa9ed249a3 add knowledge distillation with distillation tokens, in light of new findings from Facebook AI 2020-12-24 10:39:15 -08:00
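This commit and its follow-ups (dc4b3327ce runs the teacher without gradients, 74074e2b6c adds conversion back to a plain ViT, e8ca6038c9 keeps the distillable model usable for inference) implement DeiT-style distillation through a learned distillation token. A sketch, assuming the DistillableViT / DistillWrapper API the README documents; the teacher choice and hyperparameters are illustrative:

```python
import torch
from torchvision.models import resnet50
from vit_pytorch.distill import DistillableViT, DistillWrapper  # import path assumed

teacher = resnet50(pretrained = True)  # any pretrained conv net can serve as teacher

v = DistillableViT(
    image_size = 256, patch_size = 32, num_classes = 1000,
    dim = 1024, depth = 6, heads = 8, mlp_dim = 2048
)

distiller = DistillWrapper(
    student = v,
    teacher = teacher,   # run under no_grad internally (per dc4b3327ce)
    temperature = 3,     # softens the teacher's logits
    alpha = 0.5,         # trade-off between label loss and distillation loss
    hard = False         # soft (DeiT-style) vs. hard-label distillation
)

img = torch.randn(2, 3, 256, 256)
labels = torch.randint(0, 1000, (2,))

loss = distiller(img, labels)
loss.backward()

v = v.to_vit()  # per 74074e2b6c: convert to a plain ViT once training is done
```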
Phil Wang
ea0924ec96 update readme 2020-12-23 19:06:48 -08:00
Phil Wang
24339644ca offer a way to use mean pooling of last layer 2020-12-23 17:23:58 -08:00
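24339644ca exposes an alternative to classifying from the CLS token: mean-pool the final layer's patch tokens instead. Assuming this landed as a `pool` keyword (the README documents 'cls' and 'mean'):

```python
from vit_pytorch import ViT

v = ViT(
    image_size = 256, patch_size = 32, num_classes = 1000,
    dim = 1024, depth = 6, heads = 16, mlp_dim = 2048,
    pool = 'mean'  # average all patch tokens rather than reading off the CLS token
)
```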
Phil Wang
b786029e18 fix the dimension per head to be independent of dim and heads, so users do not end up with a per-head dimension too small to learn anything 2020-12-17 07:43:52 -08:00
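This change decouples the per-head width from `dim` and `heads`, so `dim // heads` can no longer silently shrink to something unlearnable. Assuming it surfaced as a `dim_head` keyword with a sane default:

```python
from vit_pytorch import ViT

v = ViT(
    image_size = 256, patch_size = 32, num_classes = 1000,
    dim = 1024, depth = 6, mlp_dim = 2048,
    heads = 16,
    dim_head = 64  # per-head dimension is explicit, no longer inferred as dim // heads
)
```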
Phil Wang
a656a213e6 update diagram 2020-12-04 12:26:28 -08:00
Long M. Lưu
3f50dd72cf Update README.md 2020-11-21 18:37:03 +07:00
Phil Wang
4f84ad7a64 authors are now known 2020-11-03 14:28:20 -08:00
Phil Wang
c74bc781f0 cite 2020-11-03 11:59:05 -08:00
Phil Wang
c1043ab00c update readme 2020-10-26 19:01:03 -07:00
Phil Wang
5b5d98a3a7 dropouts are more specific and aggressive in the paper, thanks for letting me know @hila-chefer 2020-10-14 09:22:16 -07:00
Phil Wang
0b2b3fc20c add dropouts 2020-10-13 13:11:59 -07:00
Phil Wang
b298031c17 write up example for using efficient transformers 2020-10-07 19:15:21 -07:00
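b298031c17 documents how to drop a more efficient attention backbone into the ViT shell, and the later commit 4fc7365356 points at Nystromformer specifically. A sketch, assuming `vit_pytorch.efficient.ViT` accepts any transformer with a compatible forward signature, and using lucidrains' nystrom-attention package (both names inferred from the commit messages):

```python
import torch
from vit_pytorch.efficient import ViT        # variant that accepts an external transformer
from nystrom_attention import Nystromformer  # package name assumed

efficient_transformer = Nystromformer(
    dim = 512,
    depth = 12,
    heads = 8,
    num_landmarks = 256  # rank of the Nystrom approximation of full attention
)

v = ViT(
    dim = 512,
    image_size = 2048,   # sub-quadratic attention makes large images tractable
    patch_size = 32,
    num_classes = 1000,
    transformer = efficient_transformer
)

img = torch.randn(1, 3, 2048, 2048)
preds = v(img)  # (1, 1000)
```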
Phil Wang
f7123720c3 add masking 2020-10-07 11:21:03 -07:00
Phil Wang
8fb261ca66 fix a bug and add suggestion for BYOL pre-training 2020-10-04 14:55:29 -07:00
Phil Wang
112ba5c476 update with link to Yannic's video 2020-10-04 13:53:47 -07:00
Phil Wang
f899226d4f add diagram 2020-10-04 12:47:08 -07:00
Phil Wang
ee8088b3ea first commit 2020-10-04 12:35:01 -07:00
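ee8088b3ea is where the core implementation lands. For reference, the canonical usage the README has carried since then (values illustrative; the dropout keywords arrived later, per 0b2b3fc20c and 5b5d98a3a7):

```python
import torch
from vit_pytorch import ViT

v = ViT(
    image_size = 256,  # non-square images also supported (see 5db8d9deed)
    patch_size = 32,
    num_classes = 1000,
    dim = 1024,
    depth = 6,
    heads = 16,
    mlp_dim = 2048,
    dropout = 0.1,
    emb_dropout = 0.1
)

img = torch.randn(1, 3, 256, 256)
preds = v(img)  # (1, 1000)
```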
Phil Wang
ea03db32f0 Update README.md 2020-10-03 15:49:27 -07:00
Phil Wang
30362d50dc Update README.md 2020-10-03 15:49:02 -07:00
Phil Wang
efb40e0b01 Initial commit 2020-10-03 15:47:26 -07:00