lucidrains | e05cd6d8b8 | some models only return embeddings with some kwarg on forward | 2025-07-27 08:46:43 -07:00
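A minimal sketch of the pattern referenced here: a forward pass that can skip the classification head and hand back pooled embeddings when asked. The names `TinyViTHead` and `return_embeddings` are illustrative, not the repo's exact API.

```python
import torch
from torch import nn

class TinyViTHead(nn.Module):
    # hypothetical head illustrating the kwarg-on-forward pattern
    def __init__(self, dim = 256, num_classes = 1000):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.to_logits = nn.Linear(dim, num_classes)

    def forward(self, tokens, return_embeddings = False):
        # tokens: (batch, seq, dim) from the transformer trunk
        embed = self.norm(tokens.mean(dim = 1))  # mean pool over tokens

        if return_embeddings:
            return embed  # skip the classification head entirely

        return self.to_logits(embed)
```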
lucidrains | b46233c3d6 | need to be able to invoke with eval no grad | 2025-07-27 08:25:58 -07:00
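Invoking a module under eval mode and no-grad typically looks like the sketch below; `extract_features` is a hypothetical helper, not this repo's API.

```python
import torch

def extract_features(model, images):
    # hypothetical helper: disable dropout / batchnorm updates via eval(),
    # and skip autograd graph construction via no_grad()
    was_training = model.training
    model.eval()
    with torch.no_grad():
        out = model(images)
    model.train(was_training)  # restore the previous train/eval mode
    return out
```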
lucidrains | 68e13a3c7d | bit more flexible | 2025-07-27 08:14:48 -07:00
lucidrains | b22dc0ecd2 | add a wrapper for accepting video and processing the images individually, optionally able to add time positional embeddings - for use in two robotics works | 2025-07-27 08:05:48 -07:00
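A rough sketch of such a wrapper, assuming the wrapped image model maps (batch, channels, height, width) to (batch, dim) embeddings; `VideoFrameWrapper` and its parameters are illustrative names, not the repo's.

```python
import torch
from torch import nn

class VideoFrameWrapper(nn.Module):
    # hypothetical wrapper: run an image model on each frame independently,
    # then optionally add a learned positional embedding over time
    def __init__(self, image_model, dim, max_frames = 64, use_time_pos_emb = True):
        super().__init__()
        self.image_model = image_model
        self.time_pos_emb = nn.Parameter(torch.zeros(max_frames, dim)) if use_time_pos_emb else None

    def forward(self, video):
        # video: (batch, time, channels, height, width)
        b, t = video.shape[:2]
        frames = video.flatten(0, 1)          # (b * t, c, h, w)
        embeds = self.image_model(frames)     # (b * t, dim)
        embeds = embeds.unflatten(0, (b, t))  # (b, t, dim)

        if self.time_pos_emb is not None:
            embeds = embeds + self.time_pos_emb[:t]

        return embeds
```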
lucidrains | db05a141a6 | add the proposed jumbo vit from Fuller et al. of Carleton University | 2025-03-05 10:50:34 -08:00
lucidrains | 9f49a31977 | 1.9.2 | 2025-01-19 05:53:11 -08:00
Phil Wang | c3018d1433 | 1.9.1 | 2025-01-04 07:55:49 -08:00
lucidrains | e7cba9ba6d | add a simple vit flavor for a new bytedance paper that proposes to break out of the traditional single residual stream architecture - "hyper-connections" | 2024-12-20 17:43:50 -08:00
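Very loosely, hyper-connections widen the single residual stream into several. Below is a bare-bones sketch of the static variant; it is an approximation of the idea, not the paper's exact formulation or this repo's implementation.

```python
import torch
from torch import nn

class HyperConnection(nn.Module):
    # sketch: maintain several residual streams; the wrapped layer reads a
    # learned mixture of them and writes its output back to every stream
    # with a learned per-stream weight
    def __init__(self, num_streams, layer):
        super().__init__()
        self.layer = layer
        self.read_weights = nn.Parameter(torch.ones(num_streams) / num_streams)
        self.write_weights = nn.Parameter(torch.ones(num_streams))

    def forward(self, streams):
        # streams: (num_streams, batch, seq, dim)
        inp = torch.einsum('s b n d, s -> b n d', streams, self.read_weights)
        out = self.layer(inp)  # (batch, seq, dim)
        return streams + out * self.write_weights.view(-1, 1, 1, 1)

# usage: expand the token stream once at the start, reduce at the end
# streams = tokens.unsqueeze(0).expand(4, -1, -1, -1)
# ... pass streams through stacked HyperConnection-wrapped layers ...
# tokens = streams.sum(dim = 0)
```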
lucidrains | 56373c0cbd | make value residual learned | 2024-11-24 08:21:28 -08:00
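The value-residual idea, roughly: later attention layers blend their values with the values cached from the first layer, and this commit makes the blend coefficient learned. A sketch of the shape of the idea, not the repo's exact code:

```python
import torch
from torch import nn

def mix_value_residual(values, first_values, mix_logits):
    # values, first_values: (batch, heads, seq, dim_head)
    # mix_logits: learned per-head logits, squashed to (0, 1) gates
    gate = mix_logits.sigmoid().view(1, -1, 1, 1)
    return values * gate + first_values * (1. - gate)

heads = 8
mix_logits = nn.Parameter(torch.zeros(heads))  # starts at an even 0.5 / 0.5 blend

v       = torch.randn(2, heads, 64, 32)  # this layer's values
v_first = torch.randn(2, heads, 64, 32)  # values cached from the first layer
v = mix_value_residual(v, v_first, mix_logits)
```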
lucidrains | 24196a3e8a | allow for qk norm to be turned off for na vit nested tensor | 2024-11-20 10:59:22 -08:00
Phil Wang | 141239ca86 | fix value residual | 2024-10-31 06:48:24 -07:00
lucidrains | 0b5c9b4559 | add value residual based simple vit | 2024-10-28 09:19:00 -07:00
lucidrains | e300cdd7dc | fix multiheaded qk rmsnorm in nViT | 2024-10-10 19:15:17 -07:00
Phil Wang | 36ddc7a6ba | go all the way with the normalized vit, fix some scales | 2024-10-10 10:42:37 -07:00
Phil Wang | 74b62009f8 | go for multi-headed rmsnorm for the qknorm on hypersphere vit | 2024-10-10 08:09:58 -07:00
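Multi-headed RMSNorm for QK norm means each attention head gets its own learned gain when normalizing queries and keys before the dot product. A minimal sketch; the module name and scaling convention are assumed, not taken from the repo:

```python
import torch
import torch.nn.functional as F
from torch import nn

class MultiheadRMSNorm(nn.Module):
    # per-head RMSNorm: l2-normalize then rescale by sqrt(dim_head),
    # with a separate learned gain for every head
    def __init__(self, heads, dim_head):
        super().__init__()
        self.scale = dim_head ** 0.5
        self.gamma = nn.Parameter(torch.ones(heads, 1, dim_head))

    def forward(self, x):
        # x: (batch, heads, seq, dim_head), e.g. queries or keys
        return F.normalize(x, dim = -1) * self.gamma * self.scale
```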
Phil Wang | f50d7d1436 | add a hypersphere vit, adapted from https://arxiv.org/abs/2410.01131 | 2024-10-09 07:32:25 -07:00
lucidrains | 82f2fa751d | address https://github.com/lucidrains/vit-pytorch/issues/330 | 2024-10-04 07:01:48 -07:00
lucidrains | fcb9501cdd | add register tokens to the nested tensor 3d na vit example for a researcher | 2024-08-28 12:21:31 -07:00
lucidrains | c4651a35a3 | 1.7.11 | 2024-08-21 19:24:13 -07:00
Phil Wang | 5e808f48d1 | 3d version of navit nested tensor | 2024-08-21 07:23:21 -07:00
Phil Wang | bed48b5912 | fix tests | 2024-08-20 15:35:04 -07:00
lucidrains | 73199ab486 | Nested navit (#325): add a variant of NaViT using nested tensors | 2024-08-20 15:12:29 -07:00
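Nested tensors let NaViT batch variable-resolution images, whose patch sequences differ in length, without padding tokens or attention masks. A small sketch is below; the jagged layout was still in beta in recent PyTorch releases, so the exact API may shift.

```python
import torch
from torch import nn

# three "images" whose patch sequences have different lengths
seqs = [torch.randn(196, 256), torch.randn(64, 256), torch.randn(400, 256)]
nt = torch.nested.nested_tensor(seqs, layout = torch.jagged)

# pointwise modules like Linear broadcast over the jagged dimension,
# so every sequence keeps its own length end to end
proj = nn.Linear(256, 256)
out = proj(nt)
```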
Phil Wang | 4f22eae631 | 1.7.5 | 2024-08-07 08:46:18 -07:00
lucidrains | 9992a615d1 | attention re-use in lookup vit should use pre-softmax attention matrix | 2024-07-19 19:23:38 -07:00
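The fix, in essence: when reusing an attention matrix in the opposite direction, transpose the raw pre-softmax similarity logits and re-apply softmax, rather than transposing an already row-normalized matrix, whose columns would not sum to one. A sketch with made-up shapes:

```python
import torch

q = torch.randn(2, 8, 64, 32)  # e.g. highres-token queries
k = torch.randn(2, 8, 16, 32)  # e.g. lowres-token keys

# compute the similarity logits once
sim = torch.einsum('b h i d, b h j d -> b h i j', q, k) * 32 ** -0.5

attn_one_way   = sim.softmax(dim = -1)                    # rows normalized
attn_other_way = sim.transpose(-2, -1).softmax(dim = -1)  # re-normalized after transposing
```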
Phil Wang | 4b2c00cb63 | when cross attending in look vit, make sure context tokens are normalized | 2024-07-19 10:23:12 -07:00
Phil Wang | ec6c48b8ff | norm not needed when reusing attention in lookvit | 2024-07-19 10:00:03 -07:00
Phil Wang | 547bf94d07 | 1.7.1 | 2024-07-19 09:49:44 -07:00
lucidrains | e3256d77cd | fix t2t vit having two layernorms, and make the final layernorm in the distillation wrapper configurable, defaulting to False for vit | 2024-06-11 15:12:53 -07:00
lucidrains | 90be7233a3 | rotary needs to be done with full precision to be safe | 2024-05-11 08:04:32 -07:00
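One common way to keep rotary embeddings in full precision under mixed-precision training is to disable autocast around the rotation itself. The sketch below assumes the rotate-half convention and a `freqs` tensor broadcastable to the input; it is not necessarily this repo's exact code.

```python
import torch

def apply_rotary(x, freqs):
    # run the rotation in float32 even under autocast, since
    # half-precision sin / cos loses positional accuracy
    orig_dtype = x.dtype
    with torch.autocast(device_type = x.device.type, enabled = False):
        x = x.float()
        freqs = freqs.float()
        x1, x2 = x.chunk(2, dim = -1)
        rotated = torch.cat((-x2, x1), dim = -1)
        out = x * freqs.cos() + rotated * freqs.sin()
    return out.to(orig_dtype)
```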
Phil Wang | bca88e9039 | address https://github.com/lucidrains/vit-pytorch/issues/300 | 2024-05-02 08:46:39 -07:00
Phil Wang | 96f66d2754 | address https://github.com/lucidrains/vit-pytorch/issues/306 | 2024-04-18 09:44:29 -07:00
Phil Wang | 12249dcc5f | address https://github.com/lucidrains/vit-pytorch/issues/304 | 2024-04-17 09:40:03 -07:00
SOUMYADIP MAL | 8b8da8dede | Update setup.py (#303) | 2024-04-17 08:21:30 -07:00
lucidrains | 5578ac472f | address https://github.com/lucidrains/vit-pytorch/issues/292 | 2023-12-23 08:11:39 -08:00
lucidrains | d446a41243 | share an idea that should be tried if it has not been | 2023-11-14 16:55:36 -08:00
lucidrains | 0ad09c4cbc | allow channels to be customizable for cvt | 2023-10-25 14:47:58 -07:00
Phil Wang | 92b69321f4 | 1.6.2 | 2023-10-24 12:47:38 -07:00
lucidrains | 53fe345e85 | no longer needed with einops 0.7 | 2023-10-19 18:16:46 -07:00
Phil Wang | 1616288e30 | add xcit (#284): add xcit; use Rearrange layers; give cross correlation transformer a final norm at end; document | 2023-10-13 09:15:13 -07:00
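The core operation in XCiT is cross-covariance attention (XCA): attention is computed across feature channels rather than tokens, with l2-normalized queries and keys and a learned per-head temperature. A minimal sketch of the mechanism, not a copy of the repo's module:

```python
import torch
import torch.nn.functional as F
from torch import nn

class XCA(nn.Module):
    # cross-covariance attention: channels attend to channels, so the
    # attention matrix is (dim_head x dim_head) instead of (tokens x tokens)
    def __init__(self, dim, heads = 8):
        super().__init__()
        self.heads = heads
        self.temperature = nn.Parameter(torch.ones(heads, 1, 1))
        self.to_qkv = nn.Linear(dim, dim * 3, bias = False)
        self.to_out = nn.Linear(dim, dim)

    def forward(self, x):
        b, n, d, h = *x.shape, self.heads
        qkv = self.to_qkv(x).chunk(3, dim = -1)
        # arrange as (batch, heads, dim_head, tokens)
        q, k, v = (t.reshape(b, n, h, -1).permute(0, 2, 3, 1) for t in qkv)
        # l2-normalize along the token dimension, scale by learned temperature
        q, k = F.normalize(q, dim = -1), F.normalize(k, dim = -1)
        attn = (q @ k.transpose(-2, -1) * self.temperature).softmax(dim = -1)
        out = (attn @ v).permute(0, 3, 1, 2).reshape(b, n, d)
        return self.to_out(out)
```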
lucidrains | bbb24e34d4 | give a learned bias to and from registers for maxvit + register token variant | 2023-10-06 10:40:26 -07:00
lucidrains | df8733d86e | improvise a max vit with register tokens | 2023-10-06 10:27:36 -07:00
lucidrains | 3fdb8dd352 | fix pypi | 2023-10-01 08:14:20 -07:00
lucidrains | a36546df23 | add simple vit with register tokens example, cite | 2023-10-01 08:11:40 -07:00
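Register tokens are a few extra learned tokens concatenated to the patch sequence before the transformer and sliced off afterwards. A sketch of the mechanics, with illustrative shapes:

```python
import torch
from torch import nn
from einops import repeat

num_register_tokens, dim = 4, 256
register_tokens = nn.Parameter(torch.randn(num_register_tokens, dim))

patches = torch.randn(2, 196, dim)  # (batch, num_patches, dim)
r = repeat(register_tokens, 'n d -> b n d', b = patches.shape[0])
tokens = torch.cat((r, patches), dim = 1)  # what the transformer sees

# ... run the transformer on `tokens` ...

tokens = tokens[:, num_register_tokens:]  # drop registers before pooling
```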
lucidrains | d830b05f06 | address https://github.com/lucidrains/vit-pytorch/issues/279 | 2023-09-10 09:32:57 -07:00
Phil Wang | 8208c859a5 | just remove PreNorm wrapper from all ViTs, as it is unlikely to change at this point | 2023-08-14 09:48:55 -07:00
Phil Wang | 4264efd906 | 1.4.2 | 2023-08-14 07:59:35 -07:00
lucidrains | 950c901b80 | fix linear head in simple vit, thanks to @atkos | 2023-08-10 14:36:21 -07:00
Phil Wang | 3e5d1be6f0 | address https://github.com/lucidrains/vit-pytorch/pull/274 | 2023-08-09 07:53:38 -07:00
Phil Wang | 6e2393de95 | wrap up NaViT | 2023-07-25 10:38:55 -07:00
Phil Wang | 32974c33df | one can pass a callback to token_dropout_prob for NaViT that takes in height and width and calculates the appropriate dropout rate | 2023-07-24 14:52:40 -07:00
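Per the commit message, `token_dropout_prob` can be a callable receiving the image height and width and returning the dropout rate. The schedule below is hypothetical, only there to show the shape of such a callback; consult the repo README for the exact signature.

```python
# a hypothetical schedule: drop more tokens for larger images, capped at 0.5
def token_dropout_prob(height, width):
    return min(0.5, (height * width) / (1024 * 1024) * 0.5)

# passed at construction, per the commit message:
# v = NaViT(..., token_dropout_prob = token_dropout_prob)
```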