ArtMentor: AI-Assisted Evaluation of Artworks to Explore Multimodal Large Language Models Capabilities

Notice: This document and its contents are prepared for blind peer review for submission to CHI 2025.

Figure: ArtMentor system interface and operation process.

Abstract

Evaluating Multimodal Large Language Models (MLLMs) on artwork assessment faces three challenges: subjective human assessments, the limitations of result-oriented methods, and a lack of modularity. In this paper, we propose that designing and analyzing HCI spaces with process-oriented data can evaluate MLLM capabilities more effectively and drive improvements. Applying this methodology, we introduce ArtMentor, a space that combines a dataset and three systems to enhance MLLM evaluation. ArtMentor documents 380 sessions with five art teachers in which artworks are assessed across nine critical dimensions. The modular system comprises entity recognition, review generation, and suggestion generation agents, enabling iterative upgrades. Process-based analysis of the results integrates machine learning and natural language processing to ensure reliable evaluations. Finally, we highlight two findings: the MLLM focuses on details at the expense of the bigger picture, and review generation outperforms suggestion generation. We encourage further collaboration to enhance MLLM capabilities cost-effectively. Our contributions are available at https://artmentor.github.io.

HCI File Structure Analysis

Entities Folder

This folder contains 20 JSON files, each holding the entity data for one image: the entities recognized in it and the user's edits to them.


{
  "original": ["Face", "Black hair", "Open mouth", "Green shirt", "Blue shorts", "Black shoes", "Monkey", "Cat", "Dog", "Bird", "Insect", "Exclamation mark", "Yellow platform", "Books"],
  "added": ["Yellow balances", "schoolbag"],
  "removed": ["Yellow platform"],
  "style": {
    "original": ["Style: Cartoon"],
    "added": [],
    "removed": []
  }
}
        

Field Explanations:

  • original: Elements recognized in the original image
  • added: New elements added by the user
  • removed: Elements removed by the user
  • style: Style-related information, including original style, added styles, and removed styles
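
As a minimal sketch of how these fields could be combined, the snippet below rebuilds the entity list after the user's edits from one record. The function name `final_entities` and the merge rule (original items minus removed ones, followed by added ones) are assumptions for illustration, not part of the ArtMentor code.

```python
import json

# Abbreviated copy of the example record above; real files live in the Entities folder.
RECORD = json.loads("""
{
  "original": ["Face", "Yellow platform", "Books"],
  "added": ["Yellow balances", "schoolbag"],
  "removed": ["Yellow platform"],
  "style": {"original": ["Style: Cartoon"], "added": [], "removed": []}
}
""")


def final_entities(group):
    """Return original items minus removed ones, followed by added ones.

    Assumed merge rule (hypothetical): order is preserved and each item appears once.
    """
    removed = set(group.get("removed", []))
    kept = [item for item in group.get("original", []) if item not in removed]
    for item in group.get("added", []):
        if item not in kept:
            kept.append(item)
    return kept


print(final_entities(RECORD))           # entity list after user edits
print(final_entities(RECORD["style"]))  # style list after user edits
```

The same helper works for the nested style object because it uses the same three keys as the top level.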

score_Review Folder

This folder contains 180 files, each representing scores and reviews for a photo across 9 dimensions.


[
  {
    "round": 1,
    "data": {
      "scores": {
        "original": 0,
        "current": 0,
        "initGPTscore": null
      },
      "Reviews": {
        "original": "",
        "current": "",
        "added": "",
        "removed": ""
      }
    }
  },
  {
    "round": 2,
    "data": {
      "scores": {
        "original": 4,
        "current": 4,
        "initGPTscore": 4
      },
      "Reviews": {
        "original": "The artwork effectively uses contrasting colors to enhance visual interest...",
        "current": "The artwork effectively uses contrasting colors to enhance visual interest...",
        "added": "",
        "removed": ""
      }
    }
  }
]
        

Field Explanations:

  • round: Scoring round
  • scores: Contains original score, current score, and initial GPT score
  • Reviews: Contains original review, current review, added review, and removed review
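
To illustrate how the scores could be read, the sketch below computes, for each round, the difference between the current score and the initial GPT score. The function name `score_adjustments` is hypothetical, and treating a null initGPTscore as "no model score recorded yet" is an assumption based on the round 1 example.

```python
import json

# Trimmed version of the example file above (review text shortened).
ROUNDS = json.loads("""
[
  {"round": 1, "data": {"scores": {"original": 0, "current": 0, "initGPTscore": null},
                        "Reviews": {"original": "", "current": "", "added": "", "removed": ""}}},
  {"round": 2, "data": {"scores": {"original": 4, "current": 4, "initGPTscore": 4},
                        "Reviews": {"original": "The artwork...", "current": "The artwork...",
                                    "added": "", "removed": ""}}}
]
""")


def score_adjustments(rounds):
    """Yield (round, current - initGPTscore) for rounds that have a GPT score."""
    for entry in rounds:
        scores = entry["data"]["scores"]
        if scores["initGPTscore"] is None:  # JSON null; assumed to mean no model score yet
            continue
        yield entry["round"], scores["current"] - scores["initGPTscore"]


print(list(score_adjustments(ROUNDS)))
```

A nonzero difference would indicate that the score was adjusted away from the model's initial score in that round.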

suggestion Folder

This folder contains 180 files, each representing suggestions for a photo across 9 dimensions.


[
  {
    "round": 1,
    "data": {
      "suggestions": {
        "original": "",
        "current": "",
        "added": "",
        "removed": ""
      }
    }
  },
  {
    "round": 2,
    "data": {
      "suggestions": {
        "original": "To improve the color contrast in the artwork, consider using more vibrant and varied background colors...",
        "current": "To improve the color contrast in the artwork, consider using more vibrant and varied background colors...",
        "added": "",
        "removed": ""
      }
    }
  }
]
        

Field Explanations:

  • round: Suggestion round
  • suggestions: Contains original suggestion, current suggestion, added suggestion, and removed suggestion
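
A rough sketch of one way to tell whether a suggestion was edited in a given round follows. The function name `edit_status` and the labels "empty", "unchanged", and "edited" are hypothetical; the rule that a round counts as edited when the current text differs from the original, or when added or removed is non-empty, is an assumption about how these fields are used.

```python
import json

# Trimmed version of the example file above (suggestion text shortened).
ROUNDS = json.loads("""
[
  {"round": 1, "data": {"suggestions": {"original": "", "current": "", "added": "", "removed": ""}}},
  {"round": 2, "data": {"suggestions": {"original": "To improve the color contrast...",
                                        "current": "To improve the color contrast...",
                                        "added": "", "removed": ""}}}
]
""")


def edit_status(entry):
    """Classify one round as 'empty', 'unchanged', or 'edited' (hypothetical labels)."""
    s = entry["data"]["suggestions"]
    if not s["original"] and not s["current"]:
        return "empty"
    if s["original"] == s["current"] and not s["added"] and not s["removed"]:
        return "unchanged"
    return "edited"


statuses = {entry["round"]: edit_status(entry) for entry in ROUNDS}
print(statuses)
```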
