We introduce Progressive Visual Token Compression (PVC) for large vision-language models (VLMs), which unifies visual inputs as videos and progressively compresses vision tokens across video ...
Abstract: The rich spectral information in hyperspectral images (HSIs) results in large data volumes. Thus, finding a compact representation for HSIs while maintaining reconstruction quality is a ...