The encoder.plan file was generated successfully with export_tensorrt.sh, and it is approximately 1.45 GB in size. When I serve it with NVIDIA Triton Inference Server, it uses over 13 GB of VRAM after startup, whereas I expected VRAM usage to stay below 10 GB. How can I reduce VRAM usage?
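
To make the VRAM figure easier to reproduce and compare against the 10 GB target, here is a minimal sketch of how the reading could be taken. It assumes `nvidia-smi` is on the PATH of the machine running the server; the helper names (`parse_used_mib`, `over_budget`, `query_used_mib`) are hypothetical and not part of Triton or TensorRT. Note that it reports total memory used per GPU, not memory used by the server process alone.

```python
from __future__ import annotations

import shutil
import subprocess

# The 10 GB target from the question, expressed in MiB (nvidia-smi reports MiB).
BUDGET_MIB = 10 * 1024


def parse_used_mib(csv_text: str) -> list[int]:
    """Parse `memory.used` values, one line per GPU, skipping non-numeric lines."""
    values = []
    for line in csv_text.splitlines():
        line = line.strip()
        if line.isdigit():  # ignores blanks and entries such as "[N/A]"
            values.append(int(line))
    return values


def over_budget(used_mib: int, budget_mib: int = BUDGET_MIB) -> bool:
    """Return True when the measured usage exceeds the budget."""
    return used_mib > budget_mib


def query_used_mib() -> list[int]:
    """Ask nvidia-smi for the memory currently used on each GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_used_mib(result.stdout)


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found; cannot measure VRAM on this machine")
    else:
        for index, used in enumerate(query_used_mib()):
            status = "over" if over_budget(used) else "within"
            print(f"GPU {index}: {used} MiB used ({status} the {BUDGET_MIB} MiB budget)")
```

Taking one reading before the server starts and another after the model has loaded would show how much of the 13 GB is attributable to the server, as opposed to other processes on the same GPU.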