However, the picture changes if the models are trained on larger datasets (14M-300M images). We find that large scale training trumps inductive bias
- AN IMAGE IS WORTH 16X16 WORDS:TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE
transformer非常好的解释
However, the picture changes if the models are trained on larger datasets (14M-300M images). We find that large scale training trumps inductive bias