Evaluating Large Language Models for Document Question Answering