Publications

A collection of my research work.

Observable Patterns Are Not Explanations: A Causal-Geometric Analysis of Latent Reasoning Models

Darpan Aswal, Thomas Palmeira Ferraz, Yongxin Zhou, Maxime Peyrard

arXiv preprint 2026

Causal and geometric analysis showing that observable latent-state patterns in latent reasoning models are not sufficient evidence of internal reasoning mechanisms.

TempPerturb-Eval: On the Joint Effects of Internal Temperature and External Perturbations in RAG Robustness

Yongxin Zhou, Philippe Mulhem, Didier Schwab

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026) 2026

Systematic evaluation of how LLM temperature and external perturbations jointly affect RAG robustness.

SciCiteVal: A Multi-Domain Dataset for Scientific Citation Verification

Qinyue Liu, Yongxin Zhou, Cyril Labbe

Proceedings of the Fifteenth Language Resources and Evaluation Conference (LREC 2026) 2026

A manually annotated multi-domain dataset for automated scientific citation verification, with LLM baselines.

Automated Clinical Report Generation for Remote Cognitive Remediation: Comparing Knowledge-Engineered Templates and LLMs in Low-Resource Settings

Yongxin Zhou, Fabien Ringeval, François Portet

arXiv preprint 2026

Comparative study of template-based vs. LLM approaches for automated clinical report generation in cognitive remediation, evaluated by speech therapists.

What Matters to an LLM? Behavioral and Computational Evidences from Summarization

Yongxin Zhou, Changshun Wu, Philippe Mulhem, Didier Schwab, Maxime Peyrard

Findings of the Association for Computational Linguistics: EACL 2026 2026

Behavioral and computational study of LLM informational preferences in summarization, revealing divergence from pre-LLM baselines.

Can GPT models Follow Human Summarization Guidelines? A Study for Targeted Communication Goals

Yongxin Zhou, Fabien Ringeval, François Portet

Proceedings of the 18th International Natural Language Generation Conference (INLG 2025) 2025

Evaluation of GPT models' ability to follow expert-crafted summarization guidelines for targeted communication goals.

GETALP@AutoMin 2025: Leveraging RAG to Answer Questions based on Meeting Transcripts

Jeongwoo Kang, Markarit Vartampetian, Felix Herron, Yongxin Zhou, Diandra Fabre, Gabriela Gonzalez-Saez

Proceedings of the Third Run of the Automatic Minuting Shared Task @ SIGDial 2025 2025

RAG + AMR system for meeting transcript question-answering, submitted to the AutoMin shared task at SIGDial 2025.

Explicabilité par Perturbations pour les Systèmes RAG

Yongxin Zhou, Philippe Mulhem, Didier Schwab

DIAG-LLM Workshop @ CORIA-TALN 2025 2025

Perturbation-based explainability for RAG systems, analyzing the impact of retrieval and generation-stage perturbations.

PSentScore: Evaluating Sentiment Polarity in Dialogue Summarization

Yongxin Zhou, Fabien Ringeval, François Portet

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) 2024

PSentScore: an automatic evaluation metric for sentiment polarity preservation in dialogue summarization.

Jargon: A Suite of Language Models and Evaluation Tasks for French Specialized Domains

Vincent Segonne, Aidan Mannion, Laura Cristina Alonzo Canul, Alexandre Daniel Audibert, Xingyu Liu, Cécile Macaire, Adrien Pupier, Yongxin Zhou, Mathilde Aguiar, Felix E. Herron, Magali Norré, Massih R Amini, Pierrette Bouillon, Iris Eshkol-Taravella, Emmanuelle Esperança-Rodier, Thomas François, Lorraine Goeuriot, Jérôme Goulian, Mathieu Lafourcade, Benjamin Lecouteux, François Portet, Fabien Ringeval, Vincent Vandeghinste, Maximin Coavoux, Marco Dinarelli, Didier Schwab

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) 2024

A suite of French specialized language models and evaluation tasks across transcribed speech, medical, and legal domains.

A Survey of Evaluation Methods of Generated Medical Textual Reports

Yongxin Zhou, Fabien Ringeval, François Portet

Proceedings of the 5th Clinical Natural Language Processing Workshop (ClinicalNLP 2023) 2023

A systematic survey of automatic and human evaluation methods for generated medical textual reports.

Exploration de caractéristiques linguistiques et acoustiques pour la génération automatique de rapports de séances de remédiation cognitive avec un assistant virtuel

Yongxin Zhou, Fabien Ringeval, François Portet

JPC 2023 — 9èmes Journées de Phonétique Clinique 2023

Exploration of linguistic and acoustic features for automatic report generation in cognitive remediation sessions.

MLLabs-LIG at TempoWiC 2022: A Generative Approach for Examining Temporal Meaning Shift

Chenyang Lyu, Yongxin Zhou, Tianbo Ji

Proceedings of the First Workshop on Ever Evolving NLP (EvoNLP) 2022

Generative seq2seq approach for the TempoWiC shared task on temporal meaning shift detection (EvoNLP @ EMNLP 2022).

Effectiveness of French Language Models on Abstractive Dialogue Summarization Task

Yongxin Zhou, François Portet, Fabien Ringeval

Proceedings of the Thirteenth Language Resources and Evaluation Conference (LREC 2022) 2022

Benchmarking French pre-trained language models for abstractive dialogue summarization on the DECODA call center corpus.

THERADIA: Digital Therapies Augmented by Artificial Intelligence

Franck Tarpin-Bernard, Joan Fruitet, Jean-Philippe Vigne, Patrick Constant, Hanna Chainay, Olivier Koenig, Fabien Ringeval, Béatrice Bouchot, Gérard Bailly, François Portet, Sina Alisamir, Yongxin Zhou, Jean Serre, Vincent Delerue, Hippolyte Fournier, Kévin Berenger, Isabella Zsoldos, Olivier Perrotin, Frédéric Elisei, Martin Lenglet, Charles Puaux, Léo Pacheco, Mélodie Fouillen, Didier Ghenassia

Proceedings of the AHFE 2021 Virtual Conference on Human Factors in Robots, Drones and Unmanned Systems 2021

Overview of the THERADIA project: AI-augmented digital therapies for cognitive remediation using virtual avatars and NLP.

Towards an XAI-Assisted Third-Party Evaluation of AI Systems: Illustration on Decision Trees

Yongxin Zhou, Matthieu Boussard, Agnes Delaborde

Proceedings of the AAMAS 2021 Workshop on Explainable and Transparent AI and Multi-Agent Systems (EXTRAAMAS 2021) 2021

An XAI-assisted framework for third-party AI system evaluation, illustrated on explainable-by-design decision trees over health and financial data.
