Commit Graph

75 Commits

Author SHA1 Message Date
Phil Wang
e42e9876bc offer a way to turn off ds conv in rotary vision transformer for ablation 2021-04-20 10:12:03 -07:00
Phil Wang
566365978d add ability to turn off rotary, for ablation 2021-04-20 09:00:27 -07:00
Phil Wang
34f78294d3 fix pooling bugs across a few new archs 2021-04-19 22:36:23 -07:00
Phil Wang
4c29328363 fix frequency in rotary vision transformer 2021-04-15 16:06:32 -07:00
Phil Wang
fa216c45ea tweak 2021-04-14 16:52:53 -07:00
Phil Wang
53b3af05f6 use convolution on query with padding to give the network absolute spatial awareness in addition to relative encoding from rotary embeddings 2021-04-14 15:56:02 -07:00
shabie
dc6622c05c Fix alpha coefficient multiplication in the loss 2021-04-14 11:36:43 +02:00
Phil Wang
30b37c4028 add LocalViT 2021-04-12 19:17:32 -07:00
Phil Wang
4497f1e90f add rotary vision transformer 2021-04-10 22:59:15 -07:00
Phil Wang
b50d3e1334 cleanup levit 2021-04-06 13:46:19 -07:00
Phil Wang
e075460937 stray print 2021-04-06 13:38:52 -07:00
Phil Wang
2cb6b35030 complete levit 2021-04-06 13:36:11 -07:00
Phil Wang
2ec9161a98 levit without pos emb 2021-04-06 12:58:05 -07:00
Phil Wang
3a3038c702 add layer dropout for CaiT 2021-04-01 20:30:37 -07:00
Phil Wang
b1f1044c8e offer hard distillation as well 2021-04-01 16:56:14 -07:00
Phil Wang
05b47cc070 make sure layerscale epsilon is a function of depth 2021-03-31 22:53:04 -07:00
Phil Wang
9ef8da4759 add CaiT, new vision transformer out of facebook AI, complete with layerscale, talking heads, and cls -> patch cross attention 2021-03-31 22:42:16 -07:00
Phil Wang
506fcf83a6 add documentation for three recent vision transformer follow-up papers 2021-03-31 09:22:15 -07:00
Phil Wang
6fb360a1ff add arxiv links for now, document in readme later 2021-03-30 22:26:44 -07:00
Phil Wang
da950e6d2c add working PiT 2021-03-30 22:15:19 -07:00
Phil Wang
4b9a02d89c use depthwise conv for CvT projections 2021-03-30 18:18:35 -07:00
Phil Wang
518924eac5 add CvT 2021-03-30 14:42:39 -07:00
Phil Wang
e712003dfb add CrossViT 2021-03-30 00:53:27 -07:00
Phil Wang
d04ce06a30 make recorder work for t2t and deepvit 2021-03-29 18:16:34 -07:00
Phil Wang
8135d70e4e use hooks to retrieve attention maps for user without modifying ViT 2021-03-29 15:10:12 -07:00
Phil Wang
3067155cea add recorder class, for recording attention across layers, for researchers 2021-03-29 11:08:19 -07:00
Phil Wang
15294c304e remove masking, as it complicates with little benefit 2021-03-23 12:18:47 -07:00
Phil Wang
b900850144 add deep vit 2021-03-23 11:57:13 -07:00
Phil Wang
173e07e02e cleanup and release 0.8.0 2021-03-08 07:28:31 -08:00
Phil Wang
0e63766e54 Merge pull request #66 from zankner/masked_patch_pred: Masked Patch Prediction "Suggested in #63" Work in Progress 2021-03-08 07:21:52 -08:00
Zack Ankner
73de1e8a73 converting bin targets to hard labels 2021-03-07 12:19:30 -05:00
Phil Wang
1698b7bef8 make it so one can plug performer into t2tvit 2021-02-25 20:55:34 -08:00
Phil Wang
6760d554aa no need to do projection to combine attention heads for T2Ts initial one-headed attention layers 2021-02-24 12:23:39 -08:00
Phil Wang
a82894846d add DistillableT2TViT 2021-02-21 19:54:45 -08:00
Phil Wang
3744ac691a remove patch size from T2TViT 2021-02-21 19:15:19 -08:00
Phil Wang
6af7bbcd11 make sure distillation still works 2021-02-21 19:08:18 -08:00
Phil Wang
05edfff33c cleanup 2021-02-20 11:32:38 -08:00
Phil Wang
e3205c0a4f add token to token ViT 2021-02-19 22:28:53 -08:00
Phil Wang
3f2cbc6e23 fix for ambiguity in broadcasting mask 2021-02-17 07:38:11 -08:00
Zack Ankner
fc14561de7 made bit boundaries a function of output bits and max pixel val, fixed spelling error and reset vit_pytorch to og file 2021-02-13 18:19:21 -07:00
Zack Ankner
be5d560821 mpp loss is now based on descritized average pixels, vit forward unchanged 2021-02-12 18:30:56 -07:00
Zack Ankner
77703ae1fc moving mpp loss into wrapper 2021-02-10 21:47:49 -07:00
Zack Ankner
a0a4fa5e7d Working implementation of masked patch prediction as a wrapper. Need to clean code up 2021-02-09 22:55:06 -07:00
Zack Ankner
174e71cf53 Wrapper for masked patch prediction. Built handling of input and masking of patches. Need to work on integrating into vit forward call and mpp loss function 2021-02-07 16:49:06 -05:00
Zack Ankner
e14bd14a8f Prelim work on masked patch prediction for self supervision 2021-02-04 22:00:02 -05:00
Phil Wang
85314cf0b6 patch for scaling factor, thanks to @urkax 2021-01-21 09:39:42 -08:00
Phil Wang
e8ca6038c9 allow for DistillableVit to still run predictions 2021-01-11 10:49:14 -08:00
Phil Wang
2263b7396f allow distillable efficient vit to restore efficient vit as well 2020-12-25 19:31:25 -08:00
Phil Wang
74074e2b6c offer easy way to turn DistillableViT to ViT at the end of training 2020-12-25 11:16:52 -08:00
Phil Wang
5918f301a2 cleanup 2020-12-25 09:30:38 -08:00