lucidrains
|
25871013f5
|
forgot task conditioning for vat
|
2025-10-23 10:55:16 -07:00 |
|
lucidrains
|
e66862bcd5
|
add VAT from iclr 2026, which claims SOTA on libero using a relatively simple scheme (#350)
|
2025-10-23 10:23:53 -07:00 |
|
lucidrains
|
39fd9ac8be
|
for n-dimensional vit, have a method for fetching muon friendly parameters
|
2025-10-13 12:07:48 -07:00 |
|
lucidrains
|
f6bc14c81d
|
able to return embed from vit-nd-rotary
|
2025-09-23 07:21:34 -07:00 |
|
lucidrains
|
845c844b3b
|
add a vit nd with rotary nd, from Jerry Xiong at UIUC
|
2025-09-21 10:45:42 -07:00 |
|
lucidrains
|
5f2bc0c796
|
with assistance from claude (yes it did the einops equation building here), generalize to n-dimensions
|
2025-09-21 06:22:43 -07:00 |
|
Baraa sameeh
|
1123063a5e
|
Make all CCT regularization parameters user-configurable. (#346)
|
2025-08-17 18:07:25 -07:00 |
|
lucidrains
|
f8bec5ede2
|
able to project the image embedding before applying time positional embedding for accept video wrapper
|
2025-08-13 10:15:18 -07:00 |
|
lucidrains
|
297e7d00a2
|
handle channel first for accept video wrapper
|
2025-08-03 08:29:40 -07:00 |
|
lucidrains
|
29ac8e143c
|
fix when video time seq len less than max time seq len for video acceptor
|
2025-07-27 09:00:56 -07:00 |
|
lucidrains
|
e05cd6d8b8
|
some models only return embeddings with some kwarg on forward
|
2025-07-27 08:46:43 -07:00 |
|
lucidrains
|
b46233c3d6
|
need to be able to invoke with eval no grad
|
2025-07-27 08:25:58 -07:00 |
|
lucidrains
|
68e13a3c7d
|
bit more flexible
|
2025-07-27 08:14:48 -07:00 |
|
lucidrains
|
b22dc0ecd2
|
add a wrapper for accepting video and processing the images individually, optionally able to add time positional embeddings - for use in two robotics works
|
2025-07-27 08:05:48 -07:00 |
|
lucidrains
|
db05a141a6
|
add the proposed jumbo vit from Fuller et al. of Carleton University
|
2025-03-05 10:50:34 -08:00 |
|
JacobLinCool
|
ab63fc9cc8
|
remove duplicated qkv computation in na_vit_nested_tensor_3d.py (#341)
|
2025-01-19 05:52:46 -08:00 |
|
Kale Kundert
|
b7ed6bad28
|
add option to set frame padding for 3D CCT (#339)
|
2025-01-04 07:55:27 -08:00 |
|
lucidrains
|
e7cba9ba6d
|
add a simple vit flavor for a new bytedance paper that proposes to break out of the traditional one residual stream architecture - "hyper-connections"
|
2024-12-20 17:43:50 -08:00 |
|
lucidrains
|
56373c0cbd
|
make value residual learned
|
2024-11-24 08:21:28 -08:00 |
|
lucidrains
|
24196a3e8a
|
allow for qk norm to be turned off for na vit nested tensor
|
2024-11-20 10:59:22 -08:00 |
|
lucidrains
|
d47c57e32f
|
fix tests
|
2024-11-10 09:43:54 -08:00 |
|
lucidrains
|
0449865786
|
update minimum version for nested tensor of NaViT
|
2024-11-10 09:37:48 -08:00 |
|
lucidrains
|
6693d47d0b
|
update comment for navit 3d
|
2024-11-07 20:02:07 -08:00 |
|
Phil Wang
|
141239ca86
|
fix value residual
|
2024-10-31 06:48:24 -07:00 |
|
lucidrains
|
0b5c9b4559
|
add value residual based simple vit
|
2024-10-28 09:19:00 -07:00 |
|
lucidrains
|
e300cdd7dc
|
fix multiheaded qk rmsnorm in nViT
|
2024-10-10 19:15:17 -07:00 |
|
Phil Wang
|
36ddc7a6ba
|
go all the way with the normalized vit, fix some scales
|
2024-10-10 10:42:37 -07:00 |
|
Phil Wang
|
74b62009f8
|
go for multi-headed rmsnorm for the qknorm on hypersphere vit
|
2024-10-10 08:09:58 -07:00 |
|
Phil Wang
|
f50d7d1436
|
add a hypersphere vit, adapted from https://arxiv.org/abs/2410.01131
|
2024-10-09 07:32:25 -07:00 |
|
lucidrains
|
82f2fa751d
|
address https://github.com/lucidrains/vit-pytorch/issues/330
|
2024-10-04 07:01:48 -07:00 |
|
lucidrains
|
fcb9501cdd
|
add register tokens to the nested tensor 3d na vit example for researcher
|
2024-08-28 12:21:31 -07:00 |
|
roydenwa
|
9d43e4d0bb
|
Add ViViT variant with factorized self-attention (#327)
* Add FactorizedTransformer
* Add variant param and check in fwd method
* Check if variant is implemented
* Describe new ViViT variant
|
2024-08-21 19:23:38 -07:00 |
|
Phil Wang
|
5e808f48d1
|
3d version of navit nested tensor
|
2024-08-21 07:23:21 -07:00 |
|
lucidrains
|
73199ab486
|
Nested navit (#325)
add a variant of NaViT using nested tensors
|
2024-08-20 15:12:29 -07:00 |
|
Phil Wang
|
dfc8df6713
|
add the u-vit implementation with simple vit + register tokens
|
2024-08-07 08:45:57 -07:00 |
|
lucidrains
|
9992a615d1
|
attention re-use in lookup vit should use pre-softmax attention matrix
|
2024-07-19 19:23:38 -07:00 |
|
Phil Wang
|
4b2c00cb63
|
when cross attending in lookup vit, make sure context tokens are normalized
|
2024-07-19 10:23:12 -07:00 |
|
Phil Wang
|
ec6c48b8ff
|
norm not needed when reusing attention in lookup vit
|
2024-07-19 10:00:03 -07:00 |
|
Phil Wang
|
bd72b58355
|
add lookup vit, cite, document later
|
2024-07-19 09:48:58 -07:00 |
|
lucidrains
|
e3256d77cd
|
fix t2t vit having two layernorms, and make final layernorm in distillation wrapper configurable, default to False for vit
|
2024-06-11 15:12:53 -07:00 |
|
lucidrains
|
90be7233a3
|
rotary needs to be done with full precision to be safe
|
2024-05-11 08:04:32 -07:00 |
|
Phil Wang
|
bca88e9039
|
address https://github.com/lucidrains/vit-pytorch/issues/300
|
2024-05-02 08:46:39 -07:00 |
|
Phil Wang
|
96f66d2754
|
address https://github.com/lucidrains/vit-pytorch/issues/306
|
2024-04-18 09:44:29 -07:00 |
|
Phil Wang
|
12249dcc5f
|
address https://github.com/lucidrains/vit-pytorch/issues/304
|
2024-04-17 09:40:03 -07:00 |
|
lucidrains
|
5578ac472f
|
address https://github.com/lucidrains/vit-pytorch/issues/292
|
2023-12-23 08:11:39 -08:00 |
|
lucidrains
|
d446a41243
|
share an idea that should be tried if it has not been
|
2023-11-14 16:55:36 -08:00 |
|
lucidrains
|
0ad09c4cbc
|
allow channels to be customizable for cvt
|
2023-10-25 14:47:58 -07:00 |
|
Artem Lukin
|
fb4ac25174
|
Fix typo in LayerNorm (#285)
Co-authored-by: Artem Lukin <artyom.lukin98@gmail.com>
|
2023-10-24 12:47:21 -07:00 |
|
lucidrains
|
53fe345e85
|
no longer needed with einops 0.7
|
2023-10-19 18:16:46 -07:00 |
|
Phil Wang
|
1616288e30
|
add xcit (#284)
* add xcit
* use Rearrange layers
* give cross correlation transformer a final norm at end
* document
|
2023-10-13 09:15:13 -07:00 |
|