Efficient Vision And Language Models For Autonomous Systems