Commit Graph

  • fb5014f0ee get a version of n-dimensional vit with golden gate polar coordinate embeddings into the repo for future use main lucidrains 2025-12-25 09:11:13 -08:00
  • 7e703f239f get a version of n-dimensional vit with golden gate polar coordinate embeddings into the repo for future use 1.17.1 lucidrains 2025-12-25 06:57:56 -08:00
  • 1b1fff3526 get a version of n-dimensional vit with golden gate polar coordinate embeddings into the repo for future use 1.17.0 lucidrains 2025-12-25 06:55:10 -08:00
  • 0b7518ef45 educate Phil Wang 2025-12-21 07:06:20 -08:00
  • 077d8c188f fix distill 1.16.5 lucidrains 2025-12-10 15:52:10 -08:00
  • 5888f05300 1.16.4 1.16.4 lucidrains 2025-12-07 04:32:52 -08:00
  • d518e89573 cache position grids in NaViT forward pass (#354) Amit Moryossef 2025-12-07 13:32:30 +01:00
  • dd6462d19b release small navit perf 1.16.3 lucidrains 2025-12-06 04:57:12 -08:00
  • a1ee1daa1a optimize NaViT with SDPA and vectorized forward pass (#353) Amit Moryossef 2025-12-06 13:56:40 +01:00
  • 3cff5e547a address https://github.com/lucidrains/vit-pytorch/issues/352 1.16.2 lucidrains 2025-12-02 05:21:52 -08:00
  • fdaf7f92b9 fix positional embed for mean pool case and cleanup lucidrains 2025-11-27 17:01:47 -08:00
  • ad80b6c51e fix positional embed for mean pool case and cleanup 1.16.1 lucidrains 2025-11-27 16:56:36 -08:00
  • 0ebd4edab9 address https://github.com/lucidrains/vit-pytorch/issues/351 1.16.0 lucidrains 2025-11-27 06:07:43 -08:00
  • aa49c2783a VAAT should have two ears 1.15.7 lucidrains 2025-11-22 08:32:23 -08:00
  • 6aa0374313 register tokens for the AST in VAAT 1.15.6 lucidrains 2025-11-22 08:12:01 -08:00
  • b35a97de05 improvise a variant of VAT with audio cortex before fully generalizing it 1.15.5 lucidrains 2025-11-22 07:51:19 -08:00
  • 1374b93145 the paper claims finetuning everything was better, but still allow for freezing the visual cortex, which is what PI proposes 1.15.4 lucidrains 2025-11-09 10:59:55 -08:00
  • 4386742cd1 an option to return zero for decorr aux loss if insufficient samples 1.15.3 lucidrains 2025-11-09 10:08:06 -08:00
  • 5cf8384c56 add a vit with decorrelation auxiliary losses for mha and feedforwards, right after prenorm - this is in line with a paper from the netherlands, but without extra parameters or their manual sgd update scheme lucidrains 2025-10-28 12:17:32 -07:00
  • cbf6723063 add a vit with decorrelation auxiliary losses for mha and feedforwards, right after prenorm - this is in line with a paper from the netherlands, but without extra parameters or their manual sgd update scheme 1.15.2 lucidrains 2025-10-26 18:25:13 -07:00
  • c07a55cc83 add a vit with decorrelation auxiliary losses for mha and feedforwards, right after prenorm - this is in line with a paper from the netherlands, but without extra parameters or their manual sgd update scheme 1.15.1 lucidrains 2025-10-26 18:09:57 -07:00
  • 2f32a78790 add a vit with decorrelation auxiliary losses for mha and feedforwards, right after prenorm - this is in line with a paper from the netherlands, but without extra parameters or their manual sgd update scheme 1.15.0 lucidrains 2025-10-26 17:49:38 -07:00
  • f7d59cecb5 some register tokens cannot hurt for VAT 1.14.5 lucidrains 2025-10-24 14:00:38 -07:00
  • a583cb5988 last tweak to vat 1.14.4 lucidrains 2025-10-23 12:21:09 -07:00
  • 25871013f5 forgot task conditioning for vat 1.14.2 lucidrains 2025-10-23 10:55:16 -07:00
  • e66862bcd5 add VAT from iclr 2026, which claims SOTA on libero using a relatively simple scheme (#350) 1.14.1 lucidrains 2025-10-23 10:23:53 -07:00
  • d5d6c3f38f add VAT from iclr 2026, which claims SOTA on libero using a relatively simple scheme (#350) 1.14.0 Phil Wang 2025-10-23 10:16:40 -07:00
  • 39fd9ac8be for n-dimensional vit, have a method for fetching muon friendly parameters 1.12.5 lucidrains 2025-10-13 12:07:48 -07:00
  • 3becf087bb have a language model address https://github.com/lucidrains/vit-pytorch/issues/348 lucidrains 2025-09-25 06:21:13 -07:00
  • 0b273a2148 have a language model address https://github.com/lucidrains/vit-pytorch/issues/348 1.12.4 lucidrains 2025-09-25 06:16:04 -07:00
  • 98cbdab5a4 have a language model address https://github.com/lucidrains/vit-pytorch/issues/348 1.12.3 lucidrains 2025-09-25 06:12:37 -07:00
  • f6bc14c81d able to return embed from vit-nd-rotary 1.12.2 lucidrains 2025-09-23 07:21:34 -07:00
  • 845c844b3b add a vit nd with rotary nd, from Jerry Xiong at UIUC 1.12.1 lucidrains 2025-09-21 10:45:42 -07:00
  • 5f2bc0c796 with assistance from claude (yes it did the einops equation building here), generalize to n-dimensions 1.12.0 lucidrains 2025-09-21 06:22:43 -07:00
  • 35bf273037 1.11.7 1.11.7 lucidrains 2025-08-17 18:07:42 -07:00
  • 1123063a5e Make all CCT regularization parameters user-configurable. (#346) Baraa sameeh 2025-08-18 09:07:25 +08:00
  • f8bec5ede2 able to project the image embedding before applying time positional embedding for accept video wrapper 1.11.6 lucidrains 2025-08-13 10:15:18 -07:00
  • 297e7d00a2 handle channel first for accept video wrapper 1.11.5 lucidrains 2025-08-03 08:29:40 -07:00
  • 29ac8e143c fix when video time seq len less than max time seq len for video acceptor 1.11.4 lucidrains 2025-07-27 09:00:56 -07:00
  • e05cd6d8b8 some models only return embeddings with some kwarg on forward 1.11.3 lucidrains 2025-07-27 08:46:43 -07:00
  • b46233c3d6 need to be able to invoke with eval no grad 1.11.2 lucidrains 2025-07-27 08:25:58 -07:00
  • 68e13a3c7d bit more flexible 1.11.1 lucidrains 2025-07-27 08:14:48 -07:00
  • b22dc0ecd2 add a wrapper for accepting video and processing the images individually, optionally able to add time positional embeddings - for use in two robotics works 1.11.0 lucidrains 2025-07-27 08:05:48 -07:00
  • db05a141a6 add the proposed jumbo vit from Fuller et al. of Carleton University lucidrains 2025-03-05 10:50:34 -08:00
  • 1de866d15d add the proposed jumbo vit from Fuller et al. of Carleton University 1.10.1 lucidrains 2025-03-05 07:56:48 -08:00
  • 26a6eebc8a add the proposed jumbo vit from Fuller et al. of Carleton University 1.10.0 lucidrains 2025-03-05 07:46:39 -08:00
  • 9f49a31977 1.9.2 1.9.2 lucidrains 2025-01-19 05:53:11 -08:00
  • ab63fc9cc8 remove duplicated qkv computation in na_vit_nested_tensor_3d.py (#341) JacobLinCool 2025-01-19 21:52:46 +08:00
  • c3018d1433 1.9.1 1.9.1 Phil Wang 2025-01-04 07:55:49 -08:00
  • b7ed6bad28 add option to set frame padding for 3D CCT (#339) Kale Kundert 2025-01-04 10:55:27 -05:00
  • e7cba9ba6d add a simple vit flavor for a new bytedance paper that proposes to break out of the traditional one residual stream architecture - "hyper-connections" lucidrains 2024-12-20 17:43:50 -08:00
  • 13b313b6d8 add a simple vit flavor for a new bytedance paper that proposes to break out of the traditional one residual stream architecture - "hyper-connections" 1.9.0 lucidrains 2024-12-20 17:31:14 -08:00
  • 56373c0cbd make value residual learned 1.8.9 lucidrains 2024-11-24 08:21:28 -08:00
  • 24196a3e8a allow for qk norm to be turned off for na vit nested tensor 1.8.8 lucidrains 2024-11-20 10:59:22 -08:00
  • f6d7287b6b readme Phil Wang 2024-11-19 08:20:26 -08:00
  • d47c57e32f fix tests lucidrains 2024-11-10 09:43:54 -08:00
  • 0449865786 update minimum version for nested tensor of NaViT lucidrains 2024-11-10 09:37:48 -08:00
  • 6693d47d0b update comment for navit 3d lucidrains 2024-11-07 20:02:02 -08:00
  • 141239ca86 fix value residual 1.8.7 Phil Wang 2024-10-31 06:48:24 -07:00
  • 0b5c9b4559 add value residual based simple vit 1.8.6 lucidrains 2024-10-28 09:19:00 -07:00
  • e300cdd7dc fix multiheaded qk rmsnorm in nViT 1.8.5 lucidrains 2024-10-10 19:15:17 -07:00
  • 36ddc7a6ba go all the way with the normalized vit, fix some scales 1.8.4 Phil Wang 2024-10-10 10:42:37 -07:00
  • 5f85d7b987 go all the way with the normalized vit, fix some scales 1.8.3 Phil Wang 2024-10-10 10:40:32 -07:00
  • 1d1a63fc5c cite for hypersphere vit adapted from ngpt Phil Wang 2024-10-10 10:15:04 -07:00
  • 74b62009f8 go for multi-headed rmsnorm for the qknorm on hypersphere vit 1.8.2 Phil Wang 2024-10-10 08:09:58 -07:00
  • f50d7d1436 add a hypersphere vit, adapted from https://arxiv.org/abs/2410.01131 1.8.1 Phil Wang 2024-10-09 07:32:25 -07:00
  • cc17cf0be3 add a hypersphere vit, adapted from https://arxiv.org/abs/2410.01131 1.8.0 Phil Wang 2024-10-09 07:23:10 -07:00
  • 82f2fa751d address https://github.com/lucidrains/vit-pytorch/issues/330 1.7.14 lucidrains 2024-10-04 07:01:48 -07:00
  • fcb9501cdd add register tokens to the nested tensor 3d na vit example for researcher 1.7.12 lucidrains 2024-08-28 12:21:31 -07:00
  • c4651a35a3 1.7.11 1.7.11 lucidrains 2024-08-21 19:24:13 -07:00
  • 9d43e4d0bb Add ViViT variant with factorized self-attention (#327) roydenwa 2024-08-22 04:23:38 +02:00
  • 5e808f48d1 3d version of navit nested tensor 1.7.10 Phil Wang 2024-08-21 07:23:21 -07:00
  • d7cc18c761 3d version of navit nested tensor 1.7.9 Phil Wang 2024-08-21 07:09:32 -07:00
  • 035aa4fc0b 1.7.8 1.7.8 Phil Wang 2024-08-21 07:07:40 -07:00
  • bed48b5912 fix tests Phil Wang 2024-08-20 15:35:04 -07:00
  • 73199ab486 Nested navit (#325) 1.7.7 lucidrains 2024-08-20 15:12:29 -07:00
  • 771fb6daaf Nested navit (#325) 1.7.6 Phil Wang 2024-08-20 15:07:20 -07:00
  • 4f22eae631 1.7.5 1.7.5 Phil Wang 2024-08-07 08:46:18 -07:00
  • dfc8df6713 add the u-vit implementation with simple vit + register tokens Phil Wang 2024-08-07 08:45:50 -07:00
  • 9992a615d1 attention re-use in lookup vit should use pre-softmax attention matrix 1.7.4 lucidrains 2024-07-19 19:23:38 -07:00
  • 4b2c00cb63 when cross attending in lookup vit, make sure context tokens are normalized 1.7.3 Phil Wang 2024-07-19 10:23:12 -07:00
  • ec6c48b8ff norm not needed when reusing attention in lookup vit 1.7.2 Phil Wang 2024-07-19 10:00:03 -07:00
  • 547bf94d07 1.7.1 1.7.1 Phil Wang 2024-07-19 09:49:44 -07:00
  • bd72b58355 add lookup vit, cite, document later Phil Wang 2024-07-19 09:48:49 -07:00
  • e3256d77cd fix t2t vit having two layernorms, and make final layernorm in distillation wrapper configurable, default to False for vit 1.7.0 lucidrains 2024-06-11 15:12:53 -07:00
  • 90be7233a3 rotary needs to be done with full precision to be safe 1.6.9 lucidrains 2024-05-11 08:04:14 -07:00
  • bca88e9039 address https://github.com/lucidrains/vit-pytorch/issues/300 1.6.8 Phil Wang 2024-05-02 08:46:39 -07:00
  • 96f66d2754 address https://github.com/lucidrains/vit-pytorch/issues/306 1.6.7 Phil Wang 2024-04-18 09:44:29 -07:00
  • 12249dcc5f address https://github.com/lucidrains/vit-pytorch/issues/304 1.6.6 Phil Wang 2024-04-17 09:39:45 -07:00
  • 8b8da8dede Update setup.py (#303) SOUMYADIP MAL 2024-04-17 20:51:30 +05:30
  • 5578ac472f address https://github.com/lucidrains/vit-pytorch/issues/292 1.6.5 lucidrains 2023-12-23 08:11:39 -08:00
  • d446a41243 share an idea that should be tried if it has not been 1.6.4 lucidrains 2023-11-14 16:55:36 -08:00
  • 0ad09c4cbc allow channels to be customizable for cvt 1.6.3a lucidrains 2023-10-25 14:47:58 -07:00
  • 92b69321f4 1.6.2 1.6.2 Phil Wang 2023-10-24 12:47:38 -07:00
  • fb4ac25174 Fix typo in LayerNorm (#285) Artem Lukin 2023-10-24 21:47:21 +02:00
  • 53fe345e85 no longer needed with einops 0.7 1.6.1 lucidrains 2023-10-19 18:16:43 -07:00
  • efb94608ea readme Phil Wang 2023-10-19 09:38:35 -07:00
  • 51310d1d07 add xcit diagram lucidrains 2023-10-13 09:18:12 -07:00
  • 91211ecbef actually add xcit image xcit lucidrains 2023-10-13 09:15:31 -07:00
  • 1616288e30 add xcit (#284) 1.6.0 Phil Wang 2023-10-13 09:15:13 -07:00