1 intro
code: https://github.com/ChengHan111/E2VPT
task: parameter-efficient learning
method: effective and efficient visual prompt tuning (E^2VPT)
three types of existing parameter-efficient learning methods:
partial tuning: fine-tune part of the backbone, e.g., the classification head or the last few layers
extra module: insert learnable bias terms or additional adapter modules into the backbone
prompt tuning: add learnable prompt tokens without changing or fine-tuning the backbone (see the sketch after this list)
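A minimal sketch of the third category (prompt tuning): the pretrained backbone is frozen and only a small set of prompt tokens prepended to the input sequence is learned. This is the general idea, not the E^2VPT codebase; names such as `PromptedEncoder` and the toy backbone are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    """Illustrative prompt tuning: frozen backbone + learnable prompt tokens."""
    def __init__(self, backbone: nn.Module, embed_dim: int = 768, num_prompts: int = 10):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():      # freeze the pretrained backbone
            p.requires_grad = False
        # learnable prompt tokens, shared across all inputs
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, embed_dim) * 0.02)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, seq_len, embed_dim) patch embeddings (+ cls token)
        prompts = self.prompts.expand(tokens.size(0), -1, -1)
        return self.backbone(torch.cat([prompts, tokens], dim=1))

# toy usage: a small TransformerEncoder stands in for a pretrained ViT
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=768, nhead=12, batch_first=True), num_layers=2
)
model = PromptedEncoder(backbone)
out = model(torch.randn(4, 197, 768))  # only model.prompts receives gradients
print(out.shape)                       # torch.Size([4, 207, 768])
```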
limitations of existing work:
1) existing methods leave the Transformer's core key-value attention operation untouched;
2) existing methods still do not push the savings in learnable parameters far enough
2 this paper
main idea:
1) prompt: keep learnable visual prompt tokens at the input, and additionally insert learnable prompts into the keys and values of self-attention
2) prune: reduce the number of learnable parameters by pruning unnecessary prompts
- approach: efficient tuning is applied to both the visual prompts and the key-value prompts (a sketch of both pieces follows below);
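Hedged sketch of the two ideas above, not the authors' implementation: (a) learnable key-value prompts concatenated to K and V inside self-attention, and (b) a learnable mask used to score and zero out redundant prompt positions as a simple stand-in for the paper's pruning procedure. All names (`KVPromptAttention`, `kv_prompt_len`, `prompt_mask`) are made up for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KVPromptAttention(nn.Module):
    """Self-attention with extra learnable key/value prompts (illustrative only)."""
    def __init__(self, dim: int = 768, num_heads: int = 12, kv_prompt_len: int = 5):
        super().__init__()
        self.num_heads, self.head_dim = num_heads, dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)
        # learnable key/value prompts, shared across the batch
        self.k_prompt = nn.Parameter(torch.randn(1, kv_prompt_len, dim) * 0.02)
        self.v_prompt = nn.Parameter(torch.randn(1, kv_prompt_len, dim) * 0.02)
        # soft mask over prompt positions; low-scoring prompts can be pruned
        self.prompt_mask = nn.Parameter(torch.ones(kv_prompt_len))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, D = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        m = self.prompt_mask.view(1, -1, 1)               # (1, P, 1)
        # prepend masked prompts to keys and values only; queries stay unchanged
        k = torch.cat([self.k_prompt.expand(B, -1, -1) * m, k], dim=1)
        v = torch.cat([self.v_prompt.expand(B, -1, -1) * m, v], dim=1)

        def split(t):  # (B, L, D) -> (B, heads, L, head_dim)
            return t.view(B, t.size(1), self.num_heads, self.head_dim).transpose(1, 2)

        attn = F.scaled_dot_product_attention(split(q), split(k), split(v))
        return self.proj(attn.transpose(1, 2).reshape(B, N, D))

layer = KVPromptAttention()
y = layer(torch.randn(2, 197, 768))   # output keeps the input length: (2, 197, 768)

# pruning step (simplified): zero out the two lowest-scoring prompt positions
with torch.no_grad():
    drop = layer.prompt_mask.abs().argsort()[:2]
    layer.prompt_mask[drop] = 0.0
```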
baselines compared against & experiments