vit-pytorch

mirror of https://github.com/lucidrains/vit-pytorch.git synced 2025-12-30 08:02:29 +00:00

Author	SHA1	Message	Date
lucidrains	dd6462d19b	release small navit perf 1.16.3	2025-12-06 04:57:12 -08:00
Amit Moryossef	a1ee1daa1a	optimize NaViT with SDPA and vectorized forward pass (#353) - Replace manual attention with F.scaled_dot_product_attention - Use repeat_interleave instead of meshgrid for position computation - Build image_ids efficiently with repeat_interleave instead of F.pad - Remove unused Rearrange import. ~56% speedup (91ms -> 58ms on 512 variable-sized images). Numerically equivalent (max diff ~5e-4, within flash attention tolerance). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude <noreply@anthropic.com>	2025-12-06 04:56:40 -08:00
lucidrains	3cff5e547a	address https://github.com/lucidrains/vit-pytorch/issues/352 1.16.2	2025-12-02 05:21:52 -08:00
lucidrains	fdaf7f92b9	fix positional embed for mean pool case and cleanup	2025-11-27 17:01:47 -08:00
lucidrains	0ebd4edab9	address https://github.com/lucidrains/vit-pytorch/issues/351 1.16.0	2025-11-27 06:07:43 -08:00
lucidrains	aa49c2783a	VAAT should have two ears 1.15.7	2025-11-22 08:32:23 -08:00
lucidrains	6aa0374313	register tokens for the AST in VAAT 1.15.6	2025-11-22 08:12:01 -08:00
lucidrains	b35a97de05	improvise a variant of VAT with audio cortex before fully generalizing it 1.15.5	2025-11-22 07:51:19 -08:00
lucidrains	1374b93145	the paper claims finetuning everything was better, but just allow for freezing the visual cortex, what PI proposes 1.15.4	2025-11-09 10:59:55 -08:00
lucidrains	4386742cd1	an option to return zero for decorr aux loss if insufficient samples 1.15.3	2025-11-09 10:08:06 -08:00
lucidrains	5cf8384c56	add a vit with decorrelation auxiliary losses for mha and feedforwards, right after prenorm - this is in line with a paper from the netherlands, but without extra parameters or their manual sgd update scheme	2025-10-28 12:17:32 -07:00
lucidrains	f7d59cecb5	some register tokens cannot hurt for VAT 1.14.5	2025-10-24 14:00:38 -07:00
lucidrains	a583cb5988	last tweak to vat 1.14.4	2025-10-23 12:21:09 -07:00
lucidrains	25871013f5	forgot task conditioning for vat 1.14.2	2025-10-23 10:55:16 -07:00
lucidrains	e66862bcd5	add VAT from iclr 2026, which claims SOTA on libero using a relatively simple scheme (#350) 1.14.1	2025-10-23 10:23:53 -07:00
lucidrains	39fd9ac8be	for n-dimensional vit, have a method for fetching muon friendly parameters 1.12.5	2025-10-13 12:07:48 -07:00
lucidrains	3becf087bb	have a language model address https://github.com/lucidrains/vit-pytorch/issues/348	2025-09-25 06:21:13 -07:00
lucidrains	f6bc14c81d	able to return embed from vit-nd-rotary 1.12.2	2025-09-23 07:21:34 -07:00
lucidrains	845c844b3b	add a vit nd with rotary nd, from Jerry Xiong at UIUC 1.12.1	2025-09-21 10:45:42 -07:00
lucidrains	5f2bc0c796	with assistance from claude (yes it did the einops equation building here), generalize to n-dimensions 1.12.0	2025-09-21 06:22:43 -07:00
lucidrains	35bf273037	1.11.7 1.11.7	2025-08-17 18:07:42 -07:00
Baraa sameeh	1123063a5e	Make all CCT regularization parameters user-configurable. (#346)	2025-08-17 18:07:25 -07:00
lucidrains	f8bec5ede2	able to project the image embedding before applying time positional embedding for accept video wrapper 1.11.6	2025-08-13 10:15:18 -07:00
lucidrains	297e7d00a2	handle channel first for accept video wrapper 1.11.5	2025-08-03 08:29:40 -07:00
lucidrains	29ac8e143c	fix when video time seq len less than max time seq len for video acceptor 1.11.4	2025-07-27 09:00:56 -07:00
lucidrains	e05cd6d8b8	some models only return embeddings with some kwarg on forward 1.11.3	2025-07-27 08:46:43 -07:00
lucidrains	b46233c3d6	need to be able to invoke with eval no grad 1.11.2	2025-07-27 08:25:58 -07:00
lucidrains	68e13a3c7d	bit more flexible 1.11.1	2025-07-27 08:14:48 -07:00
lucidrains	b22dc0ecd2	add a wrapper for accepting video and processing the images individually, optionally able to add time positional embeddings - for use in two robotics work 1.11.0	2025-07-27 08:05:48 -07:00
lucidrains	db05a141a6	add the proposed jumbo vit from Fuller et al. of Carleton University	2025-03-05 10:50:34 -08:00
lucidrains	9f49a31977	1.9.2 1.9.2	2025-01-19 05:53:11 -08:00
JacobLinCool	ab63fc9cc8	remove duplicated qkv computation in na_vit_nested_tensor_3d.py (#341)	2025-01-19 05:52:46 -08:00
Phil Wang	c3018d1433	1.9.1 1.9.1	2025-01-04 07:55:49 -08:00
Kale Kundert	b7ed6bad28	add option to set frame padding for 3D CCT (#339)	2025-01-04 07:55:27 -08:00
lucidrains	e7cba9ba6d	add a simple vit flavor for a new bytedance paper that proposes to break out of the traditional one residual stream architecture - "hyper-connections"	2024-12-20 17:43:50 -08:00
lucidrains	56373c0cbd	make value residual learned 1.8.9	2024-11-24 08:21:28 -08:00
lucidrains	24196a3e8a	allow for qk norm to be turned off for na vit nested tensor 1.8.8	2024-11-20 10:59:22 -08:00
Phil Wang	f6d7287b6b	readme	2024-11-19 08:20:38 -08:00
lucidrains	d47c57e32f	fix tests	2024-11-10 09:43:54 -08:00
lucidrains	0449865786	update minimum version for nested tensor of NaViT	2024-11-10 09:37:48 -08:00
lucidrains	6693d47d0b	update comment for navit 3d	2024-11-07 20:02:07 -08:00
Phil Wang	141239ca86	fix value residual 1.8.7	2024-10-31 06:48:24 -07:00
lucidrains	0b5c9b4559	add value residual based simple vit 1.8.6	2024-10-28 09:19:00 -07:00
lucidrains	e300cdd7dc	fix multiheaded qk rmsnorm in nViT 1.8.5	2024-10-10 19:15:17 -07:00
Phil Wang	36ddc7a6ba	go all the way with the normalized vit, fix some scales 1.8.4	2024-10-10 10:42:37 -07:00
Phil Wang	1d1a63fc5c	cite for hypersphere vit adapted from ngpt	2024-10-10 10:15:04 -07:00
Phil Wang	74b62009f8	go for multi-headed rmsnorm for the qknorm on hypersphere vit 1.8.2	2024-10-10 08:09:58 -07:00
Phil Wang	f50d7d1436	add a hypersphere vit, adapted from https://arxiv.org/abs/2410.01131 1.8.1	2024-10-09 07:32:25 -07:00
lucidrains	82f2fa751d	address https://github.com/lucidrains/vit-pytorch/issues/330 1.7.14	2024-10-04 07:01:48 -07:00
lucidrains	fcb9501cdd	add register tokens to the nested tensor 3d na vit example for researcher 1.7.12	2024-08-28 12:21:31 -07:00

370 Commits