Affective Human-Machine Interfaces: Towards Multi-Lingual, Environment-Robust Emotion Detection from Speech