manga109/MangaVQA_LMM

MangaVQA and MangaLMM

This repository contains the source code for the MangaVQA and MangaLMM project website.

Project Description

We present MangaVQA, a benchmark of 526 manually constructed question–answer pairs designed to evaluate how accurately a large multimodal model (LMM) answers targeted, factual questions grounded in both visual and textual context. We also develop MangaLMM, a manga-specialized version of Qwen2.5-VL, finetuned to jointly address both VQA and OCR tasks.

Citation

If you find MangaVQA and MangaLMM useful for your work, please cite:

@inproceedings{baek2025mangavqa,
  author    = {Baek, Jeonghun and Egashira, Kazuki and Onohara, Shota and Miyai, Atsuyuki and Imajuku, Yuki and Ikuta, Hikaru and Aizawa, Kiyoharu},
  title     = {MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding},
  booktitle = {Findings of the Association for Computational Linguistics: EACL 2026},
  year      = {2026},
}

Links

Website License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

This website is inspired by and references Nerfies.
