RA-QA: Towards Respiratory Audio-based Health Question Answering

• RA-QA dataset: 7.5M QA pairs, 60+ attributes, 3 question types.
• Curates 11 respiratory audio datasets into the first multimodal QA resource bridging audio and natural language.
• Benchmark compares audio-text generation models with traditional audio classifiers on QA tasks.
• Experiments show varied performance across attributes and question types, establishing a baseline.
• Highlights the gap in real-time, natural-language interactive systems for respiratory diagnosis.
• Opens a path toward intelligent, accessible diagnostic tools in respiratory healthcare.

Article Summaries:

Researchers have released RA-QA, the first large-scale respiratory audio question-answering dataset and benchmark. The resource compiles 11 existing respiratory sound collections into about 7.5 million QA pairs covering more than 60 clinical attributes and three question formats (verification, multiple-choice, open-ended). A new benchmark evaluates audio-text generation models against conventional audio classifiers, revealing performance differences across attributes and question types. The study establishes baseline scores and demonstrates the feasibility of integrating natural-language dialogue with clinical audio, paving the way for more interactive, accessible diagnostic tools in respiratory care.
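
To make the three question formats concrete, here is a minimal Python sketch of how one labelled attribute of a recording could be turned into a verification, a multiple-choice, and an open-ended question. It is an illustration only, not the authors' pipeline: the `QAPair` fields, the `build_qa_pairs` helper, the question templates, and the sample record are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class QAPair:
    """Hypothetical record layout; the real RA-QA schema may differ."""
    audio_id: str
    question_type: str  # "verification", "multiple_choice", or "open_ended"
    attribute: str
    question: str
    answer: str


def build_qa_pairs(audio_id, attribute, value, choices):
    """Build one QA pair per question format for a single labelled attribute."""
    if value not in choices:
        raise ValueError(f"{value!r} is not among the choices {choices!r}")

    # Verification: ask whether the attribute equals a candidate value.
    # The first choice is used as the candidate to keep the sketch deterministic.
    candidate = choices[0]
    verification = QAPair(
        audio_id, "verification", attribute,
        f"Is the {attribute} of this recording {candidate}?",
        "yes" if candidate == value else "no",
    )

    # Multiple-choice: list every option and expect the true value back.
    options = ", ".join(choices)
    multiple_choice = QAPair(
        audio_id, "multiple_choice", attribute,
        f"What is the {attribute} of this recording? Options: {options}.",
        value,
    )

    # Open-ended: same question without options.
    open_ended = QAPair(
        audio_id, "open_ended", attribute,
        f"What is the {attribute} of this recording?",
        value,
    )
    return [verification, multiple_choice, open_ended]


# Made-up example record, not taken from the dataset.
pairs = build_qa_pairs(
    "rec_0001", "sound type", "cough", ["breathing", "cough", "speech"]
)
for pair in pairs:
    print(pair.question_type, "|", pair.question, "->", pair.answer)
```

Templating of this kind suggests how a modest number of recordings and attributes can expand into millions of QA pairs, though the summary does not describe the actual generation procedure.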

Sources:

https://arxiv.org/abs/2602.18452