Is there a good benchmark of how well LLMs can read images? E.g., for a task of counting cars, identifying their shapes, models etc.