lucidrains
0ad09c4cbc
allow channels to be customizable for cvt
2023-10-25 14:47:58 -07:00
Artem Lukin
fb4ac25174
Fix typo in LayerNorm (#285)
Co-authored-by: Artem Lukin <artyom.lukin98@gmail.com>
2023-10-24 12:47:21 -07:00
lucidrains
53fe345e85
no longer needed with einops 0.7
2023-10-19 18:16:46 -07:00
Phil Wang
1616288e30
add xcit (#284)
* add xcit
* use Rearrange layers
* give the cross-covariance transformer a final norm at the end
* document
2023-10-13 09:15:13 -07:00
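For context, a minimal sketch of the cross-covariance attention (XCA) at the heart of XCiT: attention is computed over feature channels rather than over tokens, with l2-normalized queries and keys and a learned per-head temperature. The class name and defaults below are illustrative, not the repository's exact code.

```python
import torch
import torch.nn.functional as F
from torch import nn

class XCA(nn.Module):
    def __init__(self, dim, heads = 8):
        super().__init__()
        self.heads = heads
        self.temperature = nn.Parameter(torch.ones(heads, 1, 1))
        self.to_qkv = nn.Linear(dim, dim * 3, bias = False)
        self.to_out = nn.Linear(dim, dim)

    def forward(self, x):  # x: (batch, tokens, dim)
        q, k, v = self.to_qkv(x).chunk(3, dim = -1)
        # split heads, then move channels onto the attention axis: (b, h, d_head, n)
        q, k, v = (t.unflatten(-1, (self.heads, -1)).permute(0, 2, 3, 1) for t in (q, k, v))
        q, k = F.normalize(q, dim = -1), F.normalize(k, dim = -1)
        # (d_head x d_head) attention map - cost is linear in sequence length
        attn = (q @ k.transpose(-2, -1) * self.temperature).softmax(dim = -1)
        out = (attn @ v).permute(0, 3, 1, 2).flatten(-2)  # back to (b, n, dim)
        return self.to_out(out)
```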
lucidrains
bbb24e34d4
give a learned bias to and from registers for maxvit + register token variant
2023-10-06 10:40:26 -07:00
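A hedged sketch of the idea in the two commits above: register tokens join MaxViT's windowed attention, and attention to and from the registers receives a learned per-head bias. `RegisterWindowAttention` and its parameter names are hypothetical; the repository's implementation differs in detail.

```python
import torch
from torch import nn

class RegisterWindowAttention(nn.Module):
    def __init__(self, dim, heads = 8, dim_head = 64):
        super().__init__()
        inner = heads * dim_head
        self.heads = heads
        self.scale = dim_head ** -0.5
        self.to_qkv = nn.Linear(dim, inner * 3, bias = False)
        self.to_out = nn.Linear(inner, dim, bias = False)
        # learned per-head biases for register <-> window attention
        self.bias_to_registers = nn.Parameter(torch.zeros(heads))
        self.bias_from_registers = nn.Parameter(torch.zeros(heads))

    def forward(self, x, registers):
        # x: (batch, window_tokens, dim), registers: (batch, num_registers, dim)
        r = registers.shape[1]
        q, k, v = self.to_qkv(torch.cat((registers, x), dim = 1)).chunk(3, dim = -1)
        q, k, v = (t.unflatten(-1, (self.heads, -1)).transpose(1, 2) for t in (q, k, v))
        sim = q @ k.transpose(-2, -1) * self.scale
        sim[:, :, :r, r:] += self.bias_from_registers.view(1, -1, 1, 1)  # registers attending to window tokens
        sim[:, :, r:, :r] += self.bias_to_registers.view(1, -1, 1, 1)    # window tokens attending to registers
        out = (sim.softmax(dim = -1) @ v).transpose(1, 2).flatten(-2)
        return self.to_out(out)
```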
lucidrains
df8733d86e
improvise a max vit with register tokens
2023-10-06 10:27:36 -07:00
lucidrains
680d446e46
document in readme later
2023-10-03 09:26:02 -07:00
lucidrains
a36546df23
add simple vit with register tokens example, cite
2023-10-01 08:11:40 -07:00
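For reference, a minimal sketch of the register-token recipe the cited paper proposes: learned tokens are appended to the patch sequence, take part in attention, and are discarded before pooling. `encoder` is a stand-in for any transformer encoder; names are illustrative.

```python
import torch
from torch import nn

class ViTWithRegisters(nn.Module):
    def __init__(self, encoder, dim, num_register_tokens = 4):
        super().__init__()
        self.register_tokens = nn.Parameter(torch.randn(num_register_tokens, dim) * 0.02)
        self.encoder = encoder

    def forward(self, patch_tokens):  # (batch, n, dim), positions already added
        b, n, _ = patch_tokens.shape
        registers = self.register_tokens.expand(b, -1, -1)
        x = self.encoder(torch.cat((patch_tokens, registers), dim = 1))
        x = x[:, :n]            # drop the registers before pooling
        return x.mean(dim = 1)  # global average pool, as in simple vit
```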
lucidrains
d830b05f06
address https://github.com/lucidrains/vit-pytorch/issues/279
2023-09-10 09:32:57 -07:00
Phil Wang
8208c859a5
just remove PreNorm wrapper from all ViTs, as it is unlikely to change at this point
2023-08-14 09:48:55 -07:00
Phil Wang
b194359301
add a simple vit with qknorm, since the authors seem to be promoting the technique on Twitter
2023-08-14 07:58:45 -07:00
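A hedged sketch of query-key normalization as commonly formulated: q and k are l2-normalized per head and the usual 1/sqrt(d) scale is replaced by a learned temperature. The repository's file may differ in the exact norm used.

```python
import torch
import torch.nn.functional as F
from torch import nn

class QKNormAttention(nn.Module):
    def __init__(self, dim, heads = 8, dim_head = 64):
        super().__init__()
        inner = heads * dim_head
        self.heads = heads
        self.to_qkv = nn.Linear(dim, inner * 3, bias = False)
        self.to_out = nn.Linear(inner, dim, bias = False)
        # a learned temperature takes over the role of the 1/sqrt(d) scale
        self.temperature = nn.Parameter(torch.full((heads, 1, 1), 10.))

    def forward(self, x):
        q, k, v = self.to_qkv(x).chunk(3, dim = -1)
        q, k, v = (t.unflatten(-1, (self.heads, -1)).transpose(1, 2) for t in (q, k, v))
        q, k = F.normalize(q, dim = -1), F.normalize(k, dim = -1)  # cosine-similarity logits
        attn = (q @ k.transpose(-2, -1) * self.temperature).softmax(dim = -1)
        return self.to_out((attn @ v).transpose(1, 2).flatten(-2))
```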
lucidrains
950c901b80
fix linear head in simple vit, thanks to @atkos
2023-08-10 14:36:21 -07:00
Phil Wang
3e5d1be6f0
address https://github.com/lucidrains/vit-pytorch/pull/274
2023-08-09 07:53:38 -07:00
Phil Wang
6e2393de95
wrap up NaViT
2023-07-25 10:38:55 -07:00
Phil Wang
32974c33df
one can pass a callback to token_dropout_prob for NaViT that takes in the height and width and calculates the appropriate dropout rate
2023-07-24 14:52:40 -07:00
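A usage sketch of the callback described in this commit, under the assumption that the constructor and list-of-lists input format match the README of the time; scaling the drop rate with image area is one plausible choice.

```python
import torch
from vit_pytorch.na_vit import NaViT

v = NaViT(
    image_size = 256,
    patch_size = 32,
    num_classes = 1000,
    dim = 1024,
    depth = 6,
    heads = 16,
    mlp_dim = 2048,
    # callback receives each image's height and width, returns its dropout rate
    token_dropout_prob = lambda height, width: 0.1 * (height * width) / (256 * 256)
)

# NaViT packs variable-resolution images into sequences (a list of lists)
images = [
    [torch.randn(3, 256, 256), torch.randn(3, 128, 128)],
    [torch.randn(3, 64, 256)]
]
preds = v(images)  # (3, 1000)
```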
Phil Wang
17675e0de4
add constant token dropout for NaViT
2023-07-24 14:14:36 -07:00
Phil Wang
23820bc54a
begin work on NaViT (#273)
finish core idea of NaViT
2023-07-24 13:54:02 -07:00
roydenwa
d4daf7bd0f
Support SimpleViT as encoder in MAE (#272)
support SimpleViT in MAE
2023-07-24 06:43:01 -07:00
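A hedged usage sketch of what this PR enables, assuming the MAE wrapper's README-style constructor; the argument values are illustrative.

```python
import torch
from vit_pytorch import SimpleViT, MAE

v = SimpleViT(
    image_size = 256, patch_size = 32, num_classes = 1000,
    dim = 1024, depth = 6, heads = 16, mlp_dim = 2048
)

mae = MAE(
    encoder = v,          # a SimpleViT now works here, not only ViT
    masking_ratio = 0.75,
    decoder_dim = 512,
    decoder_depth = 6
)

loss = mae(torch.randn(8, 3, 256, 256))
loss.backward()
```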
Phil Wang
9e3fec2398
fix mpp
2023-06-28 08:02:43 -07:00
Phil Wang
ce4bcd08fb
address https://github.com/lucidrains/vit-pytorch/issues/266
2023-05-20 08:24:49 -07:00
Phil Wang
ad4ca19775
enforce latest einops
2023-05-08 09:34:14 -07:00
Phil Wang
c59843d7b8
add a version of simple vit using flash attention
2023-03-18 09:41:39 -07:00
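For context, a minimal sketch of the attention block in such a variant: PyTorch 2.x's F.scaled_dot_product_attention dispatches to a fused flash kernel when hardware and dtypes allow. This mirrors the idea, not the file verbatim.

```python
import torch
import torch.nn.functional as F
from torch import nn

class FlashAttention(nn.Module):
    def __init__(self, dim, heads = 8, dim_head = 64):
        super().__init__()
        inner = heads * dim_head
        self.heads = heads
        self.norm = nn.LayerNorm(dim)
        self.to_qkv = nn.Linear(dim, inner * 3, bias = False)
        self.to_out = nn.Linear(inner, dim, bias = False)

    def forward(self, x):
        x = self.norm(x)
        q, k, v = self.to_qkv(x).chunk(3, dim = -1)
        q, k, v = (t.unflatten(-1, (self.heads, -1)).transpose(1, 2) for t in (q, k, v))
        # fused (flash) kernel when available, math fallback otherwise
        out = F.scaled_dot_product_attention(q, k, v)
        return self.to_out(out.transpose(1, 2).flatten(-2))
```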
lucidrains
9a8e509b27
separate a simple vit from mp3, so that simple vit can be used after being pretrained
2023-03-07 19:31:10 -08:00
Srikumar Sastry
4218556acd
Add Masked Position Prediction (#260)
* Create mp3.py
* Implementation: Position Prediction as an Effective Pretraining Strategy
* Added description for Masked Position Prediction
* MP3 image added
2023-03-07 14:28:40 -08:00
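A conceptual sketch of the pretraining objective, simplified to the fully masked case: patch tokens are encoded without positional embeddings and the model classifies which grid position each token came from. The real mp3.py masks only a fraction of positions; the helper names here are hypothetical.

```python
import torch
import torch.nn.functional as F

def mp3_loss(encoder, to_position_logits, patch_tokens):
    # patch_tokens: (batch, n, dim) content embeddings WITHOUT positional embeddings
    b, n, _ = patch_tokens.shape
    logits = to_position_logits(encoder(patch_tokens))  # (batch, n, n) scores over grid slots
    # with the canonical row-major ordering, the correct position of token i is simply i
    target = torch.arange(n, device = logits.device).expand(b, -1)
    return F.cross_entropy(logits.flatten(0, 1), target.flatten())
```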
Phil Wang
5699ed7d13
double down on dual patchnorm, fix MAE and SimMIM to be compatible with dual patchnorm
2023-02-10 10:39:50 -08:00
Phil Wang
46dcaf23d8
seeing a signal with dual patchnorm in another repository, fully incorporate
2023-02-06 09:45:12 -08:00
Phil Wang
bdaf2d1491
adopt the dual patchnorm paper for as many ViTs as applicable, release 1.0.0
2023-02-03 08:11:29 -08:00
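The dual patchnorm recipe is small enough to sketch in full: a LayerNorm on the raw flattened patch before the projection and another on the embedding after it. A minimal sketch, assuming flattened patches as input:

```python
import torch
from torch import nn

def dual_patchnorm_embedding(patch_dim, dim):
    # LayerNorm on raw flattened patches, then again on the projected embeddings
    return nn.Sequential(
        nn.LayerNorm(patch_dim),
        nn.Linear(patch_dim, dim),
        nn.LayerNorm(dim)
    )

embed = dual_patchnorm_embedding(patch_dim = 3 * 16 * 16, dim = 1024)
tokens = embed(torch.randn(8, 196, 3 * 16 * 16))  # (8, 196, 1024)
```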
Phil Wang
500e23105a
need simple vit with patch dropout for another project
2022-12-05 10:47:36 -08:00
Phil Wang
89e1996c8b
add vit with patch dropout, fully embrace structured dropout as multiple papers are now corroborating each other
2022-12-02 11:28:11 -08:00
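A minimal sketch of the structured (patch) dropout this commit adds: at train time only a random subset of patch tokens is kept per batch element, and the rest are discarded outright. Close in spirit to the repository's module, though not copied from it.

```python
import torch
from torch import nn

class PatchDropout(nn.Module):
    def __init__(self, prob):
        super().__init__()
        assert 0. <= prob < 1.
        self.prob = prob

    def forward(self, x):  # x: (batch, n, dim)
        if not self.training or self.prob == 0.:
            return x
        b, n, _ = x.shape
        num_keep = max(1, int(n * (1 - self.prob)))
        # independently sample which token indices survive for each batch element
        keep = torch.rand(b, n, device = x.device).topk(num_keep, dim = -1).indices
        batch_indices = torch.arange(b, device = x.device).unsqueeze(-1)
        return x[batch_indices, keep]  # (batch, num_keep, dim)
```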
Phil Wang
2f87c0cf8f
offer 1d versions, in light of https://arxiv.org/abs/2211.14730
2022-12-01 10:31:05 -08:00
Phil Wang
cb6d749821
add a 3d version of cct, addressing https://github.com/lucidrains/vit-pytorch/issues/238
2022-10-29 11:35:06 -07:00
Phil Wang
6ec8fdaa6d
make sure global average pool can be used for vivit in place of cls token
2022-10-24 19:59:48 -07:00
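The change amounts to the usual pooling switch, sketched here for clarity; `pool` is an illustrative flag name.

```python
import torch

def pool_tokens(x, pool = 'cls'):  # x: (batch, tokens, dim)
    # 'mean' averages all tokens; 'cls' reads out the prepended class token
    return x.mean(dim = 1) if pool == 'mean' else x[:, 0]
```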
Phil Wang
13fabf901e
add vivit
2022-10-24 09:34:04 -07:00
Ryan Russell
c0eb4c0150
Improving Readability (#220)
Signed-off-by: Ryan Russell <git@ryanrussell.org>
2022-10-17 10:42:45 -07:00
Srikumar Sastry
9a95e7904e
Update mae.py (#242)
update mae so decoded tokens can be easily reshaped back to visualize the reconstruction
2022-10-17 10:41:10 -07:00
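A sketch of the visualization step this update enables, assuming square patches in row-major order; shapes and names are illustrative.

```python
import torch
from einops import rearrange

def decoded_tokens_to_image(pred, image_size = 224, patch_size = 16, channels = 3):
    # pred: (batch, num_patches, patch_size * patch_size * channels) pixel predictions
    h = w = image_size // patch_size
    return rearrange(
        pred, 'b (h w) (p1 p2 c) -> b c (h p1) (w p2)',
        h = h, w = w, p1 = patch_size, p2 = patch_size, c = channels
    )
```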
Phil Wang
b4853d39c2
add the 3d simple vit
2022-10-16 20:45:30 -07:00
Phil Wang
29fbf0aff4
begin extending some of the architectures over to 3d, starting with basic ViT
2022-10-16 15:31:59 -07:00
Phil Wang
f86e052c05
offer way for extractor to return latents without detaching them
2022-07-16 16:22:40 -07:00
Phil Wang
2fa2b62def
slightly clearer einops rearrange for the cls token, for https://github.com/lucidrains/vit-pytorch/issues/224
2022-06-30 08:11:17 -07:00
Phil Wang
9f87d1c43b
follow @arquolo feedback and advice for MaxViT
2022-06-29 08:53:09 -07:00
Phil Wang
2c6dd7010a
fix hidden dimension in MaxViT thanks to @arquolo
2022-06-24 23:28:35 -07:00
Phil Wang
6460119f65
accept a reference to a layer within the model for forward hooking and extracting the embedding output, so that regionvit works with the extractor
2022-06-19 08:22:18 -07:00
Phil Wang
4e62e5f05e
make extractor flexible for layers that output multiple tensors, show CrossViT example
2022-06-19 08:11:41 -07:00
Phil Wang
b3e90a2652
add simple vit, from https://arxiv.org/abs/2205.01580
2022-05-03 20:24:14 -07:00
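Simple ViT's defining pieces are a fixed 2d sin-cos positional embedding and global average pooling. The embedding is compact enough to sketch, close to what the repository's simple_vit.py does:

```python
import torch

def posemb_sincos_2d(h, w, dim, temperature = 10000):
    assert dim % 4 == 0, 'feature dimension must be a multiple of 4'
    y, x = torch.meshgrid(torch.arange(h), torch.arange(w), indexing = 'ij')
    omega = torch.arange(dim // 4) / (dim // 4 - 1)
    omega = 1. / (temperature ** omega)
    y = y.flatten()[:, None] * omega[None, :]
    x = x.flatten()[:, None] * omega[None, :]
    return torch.cat((x.sin(), x.cos(), y.sin(), y.cos()), dim = 1)  # (h * w, dim)
```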
Phil Wang
4ef72fc4dc
add EsViT, by popular request, an alternative to Dino that is compatible with efficient ViTs by accounting for regional self-supervised loss
2022-05-03 10:29:29 -07:00
Phil Wang
81661e3966
fix mbconv residual block
2022-04-06 16:43:06 -07:00
Phil Wang
13f8e123bb
fix maxvit - need feedforwards after attention
2022-04-06 16:34:40 -07:00
Phil Wang
c7bb5fc43f
maxvit intent to build (#211)
complete hybrid MBConv + block / grid efficient self-attention MaxViT
2022-04-06 16:12:17 -07:00
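The two attention layouts named in the PR body can be sketched as einops rearranges: block attention windows tokens locally, while grid attention strides across the feature map so each group spans it globally. The repository keeps w1/w2 as separate axes; they are merged here for brevity.

```python
import torch
from einops import rearrange

def block_partition(x, w):  # x: (batch, dim, height, width) feature map
    # contiguous w x w windows -> local (block) attention
    return rearrange(x, 'b d (x w1) (y w2) -> b x y (w1 w2) d', w1 = w, w2 = w)

def grid_partition(x, w):
    # strided w x w grid -> global (grid) attention
    return rearrange(x, 'b d (w1 x) (w2 y) -> b x y (w1 w2) d', w1 = w, w2 = w)
```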
Phil Wang
d93cd84ccd
let windowed tokens exchange information across heads a la talking heads prior to pointwise attention in sep-vit
2022-03-31 15:22:24 -07:00
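A hedged sketch of the talking-heads mixing referred to here: a learned 1x1 convolution over the head dimension lets attention logits exchange information across heads before the softmax. Where exactly sep-vit applies it may differ from this sketch.

```python
import torch
from torch import nn

class TalkingHeads(nn.Module):
    def __init__(self, heads):
        super().__init__()
        # 1x1 conv over the head axis mixes logits across heads
        self.mix = nn.Conv2d(heads, heads, 1, bias = False)

    def forward(self, sim):  # sim: (batch, heads, queries, keys) pre-softmax logits
        return self.mix(sim).softmax(dim = -1)
```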
Phil Wang
5d4c798949
cleanup sepvit
2022-03-31 14:35:11 -07:00