Automated Discovery And Modeling Of Modalities In Large-Scale Heterogeneous Data