Ryan Russell | c0eb4c0150 | Improving Readability (#220); Signed-off-by: Ryan Russell <git@ryanrussell.org> | 2022-10-17 10:42:45 -07:00
Phil Wang | 5f1a6a05e9 | release updated MAE, where reconstructions can be more easily visualized, thanks to @Vishu26 [0.36.2] | 2022-10-17 10:41:46 -07:00
Srikumar Sastry | 9a95e7904e | Update mae.py (#242): update MAE so decoded tokens can be easily reshaped back to visualize the reconstruction | 2022-10-17 10:41:10 -07:00
Phil Wang | b4853d39c2 | add the 3d simple vit [0.36.1] | 2022-10-16 20:45:30 -07:00
Phil Wang | 29fbf0aff4 | begin extending some of the architectures over to 3d, starting with the basic ViT [0.36.0] | 2022-10-16 15:31:59 -07:00
Phil Wang | 4b8f5bc900 | add link to Flax translation by @conceptofmind | 2022-07-27 08:58:18 -07:00
Phil Wang | f86e052c05 | offer a way for the extractor to return latents without detaching them [v0.35.8] | 2022-07-16 16:22:40 -07:00
Phil Wang | 2fa2b62def | slightly clearer einops rearrange for the cls token, for https://github.com/lucidrains/vit-pytorch/issues/224 [v0.35.7] | 2022-06-30 08:11:17 -07:00
Phil Wang | 9f87d1c43b | follow @arquolo's feedback and advice for MaxViT [v0.35.6] | 2022-06-29 08:53:09 -07:00
Phil Wang | 2c6dd7010a | fix hidden dimension in MaxViT, thanks to @arquolo [v0.35.5] | 2022-06-24 23:28:35 -07:00
Phil Wang | 6460119f65 | accept a reference to a layer within the model for forward hooking and extracting the embedding output, so RegionViT works with the extractor [v0.35.4] | 2022-06-19 08:22:18 -07:00
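The forward-hook mechanism referred to above can be sketched with a plain PyTorch hook; the `Extractor` class here is a simplified stand-in for illustration, not the library's actual implementation:

```python
import torch
from torch import nn

class Extractor(nn.Module):
    """Wrap a model and record one layer's output via a forward hook."""
    def __init__(self, model, layer):
        super().__init__()
        self.model = model
        self.latent = None
        layer.register_forward_hook(self._hook)

    def _hook(self, module, inputs, output):
        self.latent = output  # could .detach() here, depending on use case

    def forward(self, x):
        out = self.model(x)
        return out, self.latent

mlp = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
ex = Extractor(mlp, mlp[0])  # hook the first linear layer
out, latent = ex(torch.randn(2, 8))
print(out.shape, latent.shape)  # torch.Size([2, 4]) torch.Size([2, 16])
```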
Phil Wang | 4e62e5f05e | make the extractor flexible for layers that output multiple tensors; show a CrossViT example [v0.35.3] | 2022-06-19 08:11:41 -07:00
Phil Wang | b3e90a2652 | add simple vit, from https://arxiv.org/abs/2205.01580 | 2022-05-03 20:24:14 -07:00
Phil Wang | 4ef72fc4dc | add EsViT, by popular request: an alternative to Dino compatible with efficient ViTs, accounting for a regional self-supervised loss | 2022-05-03 10:29:29 -07:00
Zhengzhong Tu | c2aab05ebf | fix bibtex typo (#212) | 2022-04-06 22:15:05 -07:00
Phil Wang | 81661e3966 | fix mbconv residual block [0.33.2] | 2022-04-06 16:43:06 -07:00
Phil Wang | 13f8e123bb | fix maxvit: feedforwards are needed after attention [0.33.1] | 2022-04-06 16:34:40 -07:00
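That fix reflects the standard transformer block layout: attention must be followed by a feedforward, each wrapped in a residual. A minimal pre-norm sketch of the pattern (not MaxViT's actual block, which also involves MBConv and block/grid attention):

```python
import torch
from torch import nn

class Block(nn.Module):
    """Attention followed by a feedforward, each with its own residual."""
    def __init__(self, dim, heads=4, mlp_mult=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.ff = nn.Sequential(
            nn.Linear(dim, dim * mlp_mult),
            nn.GELU(),
            nn.Linear(dim * mlp_mult, dim),
        )

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # attention + residual
        x = x + self.ff(self.norm2(x))                     # feedforward + residual
        return x

x = torch.randn(2, 49, 32)
print(Block(32)(x).shape)  # (2, 49, 32)
```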
Phil Wang | 2d4089c88e | link to maxvit in readme | 2022-04-06 16:24:12 -07:00
Phil Wang | c7bb5fc43f | maxvit intent to build (#211): complete hybrid mbconv + block / grid efficient self-attention MaxViT [0.33.0] | 2022-04-06 16:12:17 -07:00
Phil Wang | 946b19be64 | sponsor button | 2022-04-06 14:12:11 -07:00
Phil Wang | d93cd84ccd | let windowed tokens exchange information across heads, a la talking heads, prior to pointwise attention in SepViT [0.32.2] | 2022-03-31 15:22:24 -07:00
Phil Wang | 5d4c798949 | clean up sepvit [0.32.1] | 2022-03-31 14:35:11 -07:00
Phil Wang | d65a742efe | intent to build (#210): complete SepViT, from ByteDance AI labs [0.32.0] | 2022-03-31 14:30:23 -07:00
Phil Wang | 8c54e01492 | do not layernorm on the last transformer block for scalable vit, as there is already one in the mlp head [0.31.1] | 2022-03-31 13:25:21 -07:00
Phil Wang | df656fe7c7 | complete learnable-memory ViT, for efficient fine-tuning and potentially continual learning [0.30.1] | 2022-03-31 09:51:12 -07:00
Phil Wang | 4e6a42a0ca | correct the need for post-attention dropout | 2022-03-30 10:50:57 -07:00
Phil Wang | 6d7298d8ad | link to tensorflow2 translation by @taki0112 | 2022-03-28 09:05:34 -07:00
Phil Wang | 9cd56ff29b | allow rectangular images in CCT [0.29.1] | 2022-03-26 14:02:49 -07:00
Phil Wang | 2aae406ce8 | add the proposed parallel vit from facebook ai, for exploration purposes | 2022-03-23 10:42:35 -07:00
Phil Wang | c2b2db2a54 | fix window size of None in scalable vit for rectangular images [0.28.2] | 2022-03-22 17:37:59 -07:00
Phil Wang | 719048d1bd | some better defaults for scalable vit [0.28.1] | 2022-03-22 17:19:58 -07:00
Phil Wang | d27721a85a | add scalable vit, from ByteDance AI [0.28.0] | 2022-03-22 17:02:47 -07:00
Phil Wang | cb22cbbd19 | update to einops 0.4, which is torchscript jit friendly [0.27.1] | 2022-03-22 13:58:00 -07:00
Phil Wang | 6db20debb4 | add patch merger [0.27.0] | 2022-03-01 16:50:17 -08:00
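A patch merger reduces N tokens down to a fixed M by letting M learned queries attend over the token sequence. A simplified sketch of the idea, not the package's exact module:

```python
import torch
from torch import nn

class PatchMerger(nn.Module):
    """Merge n tokens down to m via learned queries attending over them."""
    def __init__(self, dim, num_out_tokens):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.queries = nn.Parameter(torch.randn(num_out_tokens, dim))
        self.scale = dim ** -0.5

    def forward(self, x):                                       # x: (batch, n, dim)
        x = self.norm(x)
        attn = (self.queries @ x.transpose(1, 2)) * self.scale  # (batch, m, n)
        return attn.softmax(dim=-1) @ x                         # (batch, m, dim)

tokens = torch.randn(2, 64, 32)
print(PatchMerger(32, 8)(tokens).shape)  # (2, 8, 32): 64 tokens merged to 8
```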
Phil Wang | 1bae5d3cc5 | allow rectangular images for the efficient adapter [0.26.7] | 2022-01-31 08:55:31 -08:00
Phil Wang | 25b384297d | return None from the extractor if there are no attention layers [0.26.6] | 2022-01-28 17:49:58 -08:00
Phil Wang | 64a07f50e6 | epsilon should be inside the square root [0.26.5] | 2022-01-24 17:24:41 -08:00
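The bug class behind that last fix: the numerical-stability epsilon belongs inside the square root, `sqrt(var + eps)`, not outside as `sqrt(var) + eps`, so the denominator stays well-behaved even at zero variance. An illustrative normalization (not the repo's actual code):

```python
import torch

def stable_norm(x, eps=1e-5):
    # epsilon inside the square root: sqrt(var + eps) never degenerates,
    # even when the variance is exactly zero
    mean = x.mean(dim=-1, keepdim=True)
    var = x.var(dim=-1, keepdim=True, unbiased=False)
    return (x - mean) / torch.sqrt(var + eps)

x = torch.zeros(2, 4)      # zero-variance input, the worst case
print(stable_norm(x))      # finite zeros, no NaNs
```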
Phil Wang | 126d204ff2 | fix block repeats in the readme example for NesT | 2022-01-22 21:32:53 -08:00
Phil Wang | c1528acd46 | fix feature maps in NesT, thanks to @MarkYangjiayi [0.26.4] | 2022-01-22 13:17:30 -08:00
Phil Wang | 1cc0f182a6 | decoder positional embedding needs to be reapplied, per https://twitter.com/giffmana/status/1479195631587631104 | 2022-01-06 13:14:41 -08:00
Phil Wang | 28eaba6115 | 0.26.2 | 2022-01-03 12:56:34 -08:00
Phil Wang | 0082301f9e | build @jrounds' suggestion | 2022-01-03 12:56:25 -08:00
Phil Wang | 91ed738731 | 0.26.1 | 2021-12-30 19:31:26 -08:00
Phil Wang | 1b58daa20a | Merge pull request #186 from chinhsuanwu/mobilevit: Update MobileViT | 2021-12-30 19:31:01 -08:00
chinhsuanwu | f2414b2c1b | Update MobileViT | 2021-12-30 05:52:23 +08:00
Phil Wang | 891b92eb74 | readme | 2021-12-28 16:00:00 -08:00
Phil Wang | 70ba532599 | add ViT for small datasets, https://arxiv.org/abs/2112.13492 | 2021-12-28 10:58:21 -08:00
Phil Wang | e52ac41955 | allow the extractor to return only embeddings, readying vision transformers for use in x-clip [0.25.6] | 2021-12-25 12:31:21 -08:00
Phil Wang | 0891885485 | include tests in the package for conda [0.25.5] | 2021-12-22 12:44:29 -08:00
Phil Wang | 976f489230 | add some tests [0.25.3] | 2021-12-22 09:13:31 -08:00