Phil Wang | 4ef72fc4dc | 2022-05-03 10:29:29 -07:00
add EsViT, by popular request: an alternative to DINO that is compatible with efficient ViTs by accounting for a regional self-supervised loss
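A hedged sketch of the region-level idea, with all names and shapes illustrative rather than the repo's actual API: on top of the usual view-level DINO-style loss, each student region token is matched to its most similar teacher region token, and a cross-entropy between their projected distributions is added.

```python
import torch
import torch.nn.functional as F

def regional_loss(student_regions, teacher_regions, temp_student = 0.1, temp_teacher = 0.04):
    # region tokens after the projector: (batch, num_regions, dim) - illustrative shapes
    sim = F.normalize(student_regions, dim = -1) @ F.normalize(teacher_regions, dim = -1).transpose(-1, -2)
    match = sim.argmax(dim = -1)  # index of the most similar teacher region per student region

    matched = torch.gather(teacher_regions, 1, match.unsqueeze(-1).expand_as(student_regions))

    teacher_probs = (matched / temp_teacher).softmax(dim = -1).detach()  # no gradient through the teacher
    student_logprobs = (student_regions / temp_student).log_softmax(dim = -1)
    return -(teacher_probs * student_logprobs).sum(dim = -1).mean()
```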
Phil Wang | 81661e3966 | 2022-04-06 16:43:06 -07:00
fix mbconv residual block

Phil Wang | 13f8e123bb | 2022-04-06 16:34:40 -07:00
fix maxvit - need feedforwards after attention
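For context, a minimal sketch of the pattern this fix restores: every attention layer is followed by its own feedforward, each under a pre-norm residual (dims and hyperparameters are illustrative).

```python
import torch
from torch import nn

class TransformerBlock(nn.Module):
    def __init__(self, dim, heads = 8, mlp_mult = 4, dropout = 0.):
        super().__init__()
        self.attn_norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, dropout = dropout, batch_first = True)
        self.ff_norm = nn.LayerNorm(dim)
        self.ff = nn.Sequential(
            nn.Linear(dim, dim * mlp_mult),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(dim * mlp_mult, dim)
        )

    def forward(self, x):
        normed = self.attn_norm(x)
        x = self.attn(normed, normed, normed)[0] + x   # attention + residual
        x = self.ff(self.ff_norm(x)) + x               # the feedforward that must follow
        return x
```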
Phil Wang | c7bb5fc43f | 2022-04-06 16:12:17 -07:00
maxvit intent to build (#211)
complete MaxViT, a hybrid of MBConv and block / grid efficient self-attention
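The two attention layouts alternated in MaxViT, sketched with einops (the window size w is an assumed value): block attention mixes tokens within non-overlapping windows, while grid attention mixes tokens that share a position across all windows.

```python
import torch
from einops import rearrange

x = torch.randn(2, 64, 32, 32)  # (batch, channels, height, width)
w = 8                           # assumed window size

block = rearrange(x, 'b d (h w1) (w w2) -> (b h w) (w1 w2) d', w1 = w, w2 = w)  # local windows
grid  = rearrange(x, 'b d (w1 h) (w2 w) -> (b h w) (w1 w2) d', w1 = w, w2 = w)  # dilated grid
# self-attention over dim 1 of either tensor gives block or grid attention respectively
```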
Phil Wang | d93cd84ccd | 2022-03-31 15:22:24 -07:00
let windowed tokens exchange information across heads a la talking heads prior to pointwise attention in sep-vit
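A hedged sketch of the cross-head mixing added here, in the spirit of talking-heads attention; the module name is illustrative. A learned 1x1 convolution over the head dimension lets the windowed tokens exchange information across heads before the pointwise attention runs.

```python
import torch
from torch import nn

class CrossHeadMixing(nn.Module):
    def __init__(self, heads):
        super().__init__()
        self.mix = nn.Conv2d(heads, heads, 1, bias = False)  # 1x1 conv acts across heads only

    def forward(self, x):
        # x: (batch, heads, num_window_tokens, dim_head)
        return self.mix(x)

mix = CrossHeadMixing(heads = 8)
out = mix(torch.randn(2, 8, 64, 32))  # same shape, heads have now exchanged information
```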
Phil Wang | 5d4c798949 | 2022-03-31 14:35:11 -07:00
cleanup sepvit

Phil Wang | d65a742efe | 2022-03-31 14:30:23 -07:00
intent to build (#210)
complete SepViT, from bytedance AI labs

Phil Wang | 8c54e01492 | 2022-03-31 13:25:21 -07:00
do not layernorm on last transformer block for scalable vit, as there is already one in mlp head
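Why the trailing norm goes away: in a pre-norm ViT the MLP head already begins with a LayerNorm, so normalizing again after the last transformer block would stack two LayerNorms back to back with nothing in between. A minimal illustration:

```python
from torch import nn

dim, num_classes = 256, 1000

mlp_head = nn.Sequential(
    nn.LayerNorm(dim),          # this norm makes a final post-transformer norm redundant
    nn.Linear(dim, num_classes)
)
```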
Phil Wang | df656fe7c7 | 2022-03-31 09:51:12 -07:00
complete learnable memory ViT, for efficient fine-tuning; it may also play into continual learning
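The idea, sketched with illustrative names: the pretrained ViT stays frozen, and a handful of learnable "memory" tokens are concatenated to the token sequence, so fine-tuning trains only the memory tokens and the task head.

```python
import torch
from torch import nn

class MemoryTokens(nn.Module):
    def __init__(self, dim, num_memory_tokens = 4):
        super().__init__()
        self.memory = nn.Parameter(torch.randn(num_memory_tokens, dim) * 0.02)

    def forward(self, x):
        # x: (batch, num_tokens, dim) -> (batch, num_tokens + num_memory_tokens, dim)
        mem = self.memory.unsqueeze(0).expand(x.shape[0], -1, -1)
        return torch.cat((x, mem), dim = 1)
```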
Phil Wang | 4e6a42a0ca | 2022-03-30 10:50:57 -07:00
correct need for post-attention dropout
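The correction in miniature: dropout belongs on the attention output projection as well, not only on the attention matrix (dims are illustrative).

```python
from torch import nn

dim = 256
to_out = nn.Sequential(
    nn.Linear(dim, dim),
    nn.Dropout(0.1)   # post-attention dropout on the output projection
)
```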
Phil Wang | 9cd56ff29b | 2022-03-26 14:02:49 -07:00
CCT allow for rectangular images

Phil Wang | 2aae406ce8 | 2022-03-23 10:42:35 -07:00
add proposed parallel vit from facebook ai for exploration purposes
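A sketch of the parallel block being explored: instead of running attention then feedforward sequentially, several branches read the same input and their outputs are summed into a single residual update (the stand-in branches below are illustrative).

```python
import torch
from torch import nn

class ParallelBlock(nn.Module):
    def __init__(self, branches):
        super().__init__()
        self.branches = nn.ModuleList(branches)

    def forward(self, x):
        # all branches read the same input; their outputs form one residual update
        return x + sum(branch(x) for branch in self.branches)

dim = 256
block = ParallelBlock([
    nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim)),  # stand-in for an attention branch
    nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim)),  # stand-in for a feedforward branch
])
out = block(torch.randn(2, 64, dim))
```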
Phil Wang | c2b2db2a54 | 2022-03-22 17:37:59 -07:00
fix window size of None in scalable vit for rectangular images
Phil Wang | 719048d1bd | 2022-03-22 17:19:58 -07:00
some better defaults for scalable vit

Phil Wang | d27721a85a | 2022-03-22 17:02:47 -07:00
add scalable vit, from bytedance AI

Phil Wang | 6db20debb4 | 2022-03-01 16:50:17 -08:00
add patch merger
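A minimal sketch of a patch merger: a fixed set of learned queries attends over the incoming tokens, so any number of input tokens is merged down to num_output_tokens.

```python
import torch
from torch import nn

class PatchMerger(nn.Module):
    def __init__(self, dim, num_output_tokens):
        super().__init__()
        self.scale = dim ** -0.5
        self.norm = nn.LayerNorm(dim)
        self.queries = nn.Parameter(torch.randn(num_output_tokens, dim))

    def forward(self, x):
        x = self.norm(x)                                   # x: (batch, n, dim)
        attn = (self.queries @ x.transpose(-1, -2)) * self.scale
        return attn.softmax(dim = -1) @ x                  # (batch, num_output_tokens, dim)

merger = PatchMerger(dim = 256, num_output_tokens = 8)
out = merger(torch.randn(2, 64, 256))  # 64 tokens merged down to 8
```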
Phil Wang | 1bae5d3cc5 | 2022-01-31 08:55:31 -08:00
allow for rectangular images for efficient adapter

Phil Wang | 25b384297d | 2022-01-28 17:49:58 -08:00
return None from extractor if no attention layers

Phil Wang | 64a07f50e6 | 2022-01-24 17:24:41 -08:00
epsilon should be inside square root
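The numerical point behind this one-character fix, in miniature: the epsilon must sit under the square root, since the gradient of sqrt blows up as its argument approaches zero.

```python
import torch

x = torch.zeros(4, requires_grad = True)
var = x.var()

stable   = torch.sqrt(var + 1e-5)  # finite value and gradient even at var = 0
unstable = torch.sqrt(var) + 1e-5  # d/dvar sqrt(var) -> infinity as var -> 0
```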
Phil Wang | c1528acd46 | 2022-01-22 13:17:30 -08:00
fix feature maps in Nest, thanks to @MarkYangjiayi

Phil Wang | 1cc0f182a6 | 2022-01-06 13:14:41 -08:00
decoder positional embedding needs to be reapplied https://twitter.com/giffmana/status/1479195631587631104
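The gist of the linked thread, sketched with illustrative shapes: an MAE-style decoder must add positional embeddings to the full, unshuffled token set, mask tokens included, since the mask tokens are identical copies and otherwise carry no position.

```python
import torch

batch, num_patches, dim = 2, 16, 64
decoder_pos_emb = torch.randn(num_patches, dim)

decoder_tokens = torch.randn(batch, num_patches, dim)  # unshuffled: encoded tokens + mask tokens
decoder_tokens = decoder_tokens + decoder_pos_emb      # reapply positions before decoding
```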
Phil Wang | 0082301f9e | 2022-01-03 12:56:25 -08:00
build @jrounds suggestion

chinhsuanwu | f2414b2c1b | 2021-12-30 05:52:23 +08:00
Update MobileViT

Phil Wang | 70ba532599 | 2021-12-28 10:58:21 -08:00
add ViT for small datasets https://arxiv.org/abs/2112.13492

Phil Wang | e52ac41955 | 2021-12-25 12:31:21 -08:00
allow extractor to only return embeddings, to ready vision transformers for use in x-clip

Phil Wang | 2c368d1d4e | 2021-12-21 11:11:39 -08:00
add extractor wrapper
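A hedged sketch of what such a wrapper can do, not the repo's exact API: register forward hooks on chosen submodules and hand back their outputs alongside the model's prediction.

```python
import torch
from torch import nn

class Extractor(nn.Module):
    def __init__(self, model, layers):
        super().__init__()
        self.model = model
        self.outputs = []
        for layer in layers:
            # collect each hooked layer's output on every forward pass
            layer.register_forward_hook(lambda _, __, out: self.outputs.append(out))

    def forward(self, x):
        self.outputs.clear()
        pred = self.model(x)
        return pred, list(self.outputs)
```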
Phil Wang | b983bbee39 | 2021-12-21 10:22:59 -08:00
release MobileViT, from @murufeng

murufeng | 89d3a04b3f | 2021-12-21 20:48:34 +08:00
Add files via upload

Phil Wang | 365b4d931e | 2021-12-03 19:52:40 -08:00
add adaptive token sampling paper

Phil Wang | b45c1356a1 | 2021-11-22 22:53:02 -08:00
cleanup

Phil Wang | ff44d97cb0 | 2021-11-22 18:08:49 -08:00
make initial channels customizable for PiT

Phil Wang | b69b5af34f | 2021-11-22 17:39:36 -08:00
dynamic positional bias for crossformer, the more efficient way as described in the appendix of the paper
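Dynamic position bias in miniature (layer sizes are illustrative): a small MLP maps relative coordinates to per-head biases, so the same parameters serve any window size. The more efficient variant feeds the MLP only the unique relative offsets rather than every token pair.

```python
import torch
from torch import nn

heads, window = 4, 7
mlp = nn.Sequential(
    nn.Linear(2, 64), nn.ReLU(),
    nn.Linear(64, heads)
)

# all unique relative (dy, dx) offsets for a window x window grid
coords = torch.stack(torch.meshgrid(
    torch.arange(-(window - 1), window),
    torch.arange(-(window - 1), window),
    indexing = 'ij'
), dim = -1).float().reshape(-1, 2)

bias = mlp(coords)  # ((2w-1)^2, heads), gathered per token pair at attention time
```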
Phil Wang | 36e32b70fb | 2021-11-22 17:10:53 -08:00
complete and release crossformer

Phil Wang | 768e47441e | 2021-11-22 16:21:55 -08:00
crossformer without dynamic position bias

Phil Wang | 6665fc6cd1 | 2021-11-22 12:42:24 -08:00
cleanup region vit

Phil Wang | 9f8c60651d | 2021-11-22 10:19:48 -08:00
clearer mae

Phil Wang | 5ae555750f | 2021-11-21 15:50:19 -08:00
add SimMIM
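The gist of SimMIM, sketched: a random subset of patch tokens is replaced with a learned mask token, the encoder runs as usual, a linear head predicts raw pixels, and an L1 loss is taken on the masked patches only (the helper below is illustrative).

```python
import torch
import torch.nn.functional as F

def simmim_loss(pred_pixels, target_pixels, mask):
    # pred/target: (batch, num_patches, pixels_per_patch), mask: (batch, num_patches) bool
    loss = F.l1_loss(pred_pixels, target_pixels, reduction = 'none')
    return loss[mask].mean()  # average only over the masked patches
```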
Phil Wang | dc57c75478 | 2021-11-14 12:24:48 -08:00
cleanup

Phil Wang | e8f6d72033 | 2021-11-12 20:08:48 -08:00
release masked autoencoder
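The masked-autoencoder recipe in a few lines (ratios and shapes are illustrative): keep a small random subset of patches, encode only those, then decode the full set with mask tokens filling the gaps and regress the pixels of the masked patches.

```python
import torch

batch, num_patches, keep = 2, 16, 4  # keep 25% of patches visible

rand = torch.rand(batch, num_patches).argsort(dim = -1)   # a random permutation per sample
visible_idx, masked_idx = rand[:, :keep], rand[:, keep:]

# the encoder runs only on tokens gathered at visible_idx; the decoder then sees the
# encoded tokens plus identical mask tokens at masked_idx, and the reconstruction
# loss is computed on the masked patches alone
```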
Phil Wang | cb1729af28 | 2021-11-07 17:18:59 -08:00
more efficient feedforward for regionvit

Phil Wang | 06d375351e | 2021-11-07 09:47:28 -08:00
add RegionViT paper

Phil Wang | f196d1ec5b | 2021-10-05 09:23:44 -07:00
move freqs in RvT to linspace

Yonghye Kwon | 24ac8350bf | 2021-08-30 18:25:03 +09:00
remove unused package

Yonghye Kwon | ca3cef9de0 | 2021-08-30 18:05:16 +09:00
Cleanup Attention Class

Phil Wang | 73ed562ce4 | 2021-08-21 09:03:42 -07:00
Merge pull request #147 from developer0hye/patch-4
Make T2T process any scale image

Yonghye Kwon | ca0bdca192 | 2021-08-21 22:35:26 +09:00
Make model process any scale image
Related to #145
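Why arbitrary sizes need care in T2T: each soft split is an unfold, whose output length follows standard convolution arithmetic, so padding has to be chosen to cover every input size (the kernel/stride/padding values below are illustrative).

```python
import math

def unfold_output_size(size, kernel, stride, padding):
    # standard convolution arithmetic for one spatial dimension
    return math.floor((size + 2 * padding - kernel) / stride) + 1

# e.g. kernel 7, stride 4, padding 3 maps 224 -> 56, and also handles 225, 226, ...
assert unfold_output_size(224, 7, 4, 3) == 56
```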
Yonghye Kwon | 1c70271778 | 2021-08-21 22:25:46 +09:00
Support image with width and height less than the image_size
Related to #145

Yonghye Kwon | 946815164a | 2021-08-20 13:44:57 +09:00
Remove unused package

Phil Wang | aeed3381c1 | 2021-08-19 08:22:55 -07:00
use hardswish for levit

Phil Wang | 3f754956fb | 2021-08-14 08:06:23 -07:00
remove last transformer layer in t2t