Yongxin Zhou
Email: yongxin [dot] zhou [at] univ-grenoble-alpes [dot] fr

I am currently a fourth-year (final-year) PhD student in Natural Language Processing in the GETALP team at Université Grenoble Alpes, supervised by François Portet and Fabien Ringeval. After studying Phonetics and Phonology at Université Sorbonne Nouvelle - Paris 3, I received my second master's degree, in Language and Computer Science, from Sorbonne Université, France. My current focus is Medical Report Generation: automatically generating reports for clinicians that summarize cognitive remediation sessions performed at home by patients. I am also interested in Dialogue Summarization and Explainable Artificial Intelligence.

LinkedIn | twitter | github

News
  • 20/02/2024 “PSentScore: Evaluating Sentiment Polarity in Dialogue Summarization” accepted by LREC-COLING 2024.
  • 20/02/2024 “Jargon: A Suite of Language Models and Evaluation Tasks for French Specialized Domains” accepted by LREC-COLING 2024.
  • 25/10/2023 Check out our new preprint "Can GPT models Follow Human Summarization Guidelines? Evaluating ChatGPT and GPT-4 for Dialogue Summarization" - arXiv link
  • 23/07/2023 Check out our new preprint "Evaluating Emotional Nuances in Dialogue Summarization" - arXiv link
  • 14/07/2023 Presented a poster for our paper at the Clinical NLP Workshop 2023, via GatherTown.
  • 15/06/2023 Presented a poster for our abstract at JPC 2023, Toulouse.
  • 01/06/2023 “A Survey of Evaluation Methods of Generated Medical Textual Reports” accepted by ACL 2023 Workshop Clinical NLP.
  • 23/03/2023 Abstract “Exploration de caractéristiques linguistiques et acoustiques pour la génération automatique de rapports de séances de remédiation cognitive avec un assistant virtuel” accepted at JPC 2023 (9èmes Journées de Phonétique Clinique).
  • 31/10/2022 Shared Task paper “MLLabs-LIG at TempoWiC 2022: A Generative Approach for Examining Temporal Meaning Shift” accepted by EMNLP 2022 Workshop EvoNLP.
  • 22/06/2022 Presented a poster of our paper at LREC 2022.
  • 29/04/2022 Presented a poster at LIG PhD Day (Journée des doctorants 2ème année, the day for second-year PhD students).
  • 04/2022 “Effectiveness of French Language Models on Abstractive Dialogue Summarization Task” accepted by LREC 2022.
  • 05/2021 Gave a talk on our paper at the AAMAS 2021 Workshop on Explainable and Transparent AI and Multi-Agent Systems.
  • 03/2021 “Towards an XAI-Assisted Third-Party Evaluation of AI Systems: Illustration on Decision Trees” accepted by AAMAS 2021 Workshop on Explainable and Transparent AI and Multi-Agent Systems.
Talks
23/05/2024 PSentScore: Evaluating Sentiment Polarity in Dialogue Summarization
@LREC-COLING 2024, Torino, Italia
21/07/2020 XAI pour l'évaluation de l'IA [XAI for AI Evaluation] (internship seminar, Craft AI)
@Craft AI, Paris, France
Research Experience
05/2020 - 10/2020 Research Intern in the Department of Artificial Intelligence Evaluation
@Laboratoire national de métrologie et d’essais (LNE), Trappes, France
Teaching
10/2021 - 12/2021 Advanced models of machine learning (exercise classes, M2)
(and 09/2022 - 12/2022)
@ Master INDUSTRIES DE LA LANGUE - Université Grenoble Alpes | GitHub

10/2021 - 12/2021 Automatic text generation (exercise classes, M2)
(and 09/2022 - 12/2022)
@ Master INDUSTRIES DE LA LANGUE - Université Grenoble Alpes | GitHub
Publications

PSentScore: Evaluating Sentiment Polarity in Dialogue Summarization
Yongxin Zhou, Fabien Ringeval, François Portet
LREC-COLING, 2024

pdf | abstract | bibtex

Automatic dialogue summarization is a well-established task with the goal of distilling the most crucial information from human conversations into concise textual summaries. However, most existing research has predominantly focused on summarizing factual information, neglecting the affective content, which can hold valuable insights for analyzing, monitoring, or facilitating human interactions. In this paper, we introduce and assess a set of measures PSentScore, aimed at quantifying the preservation of affective content in dialogue summaries. Our findings indicate that state-of-the-art summarization models do not preserve well the affective content within their summaries. Moreover, we demonstrate that a careful selection of the training set for dialogue samples can lead to improved preservation of affective content in the generated summaries, albeit with a minor reduction in content-related metrics.

	@inproceedings{zhou-etal-2024-psentscore-evaluating,
	    title = "{PS}ent{S}core: Evaluating Sentiment Polarity in Dialogue Summarization",
	    author = "Zhou, Yongxin  and
	      Ringeval, Fabien  and
	      Portet, Fran{\c{c}}ois",
	    editor = "Calzolari, Nicoletta  and
	      Kan, Min-Yen  and
	      Hoste, Veronique  and
	      Lenci, Alessandro  and
	      Sakti, Sakriani  and
	      Xue, Nianwen",
	    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
	    month = may,
	    year = "2024",
	    address = "Torino, Italia",
	    publisher = "ELRA and ICCL",
	    url = "https://aclanthology.org/2024.lrec-main.1163",
	    pages = "13290--13302",
	    abstract = "",
	}
        

A Survey of Evaluation Methods of Generated Medical Textual Reports
Yongxin Zhou, Fabien Ringeval, François Portet
ACL - ClinicalNLP, 2023

pdf | abstract | bibtex

Medical Report Generation (MRG) is a sub-task of Natural Language Generation (NLG) and aims to present information from various sources in textual form and synthesize salient information, with the goal of reducing the time spent by domain experts in writing medical reports and providing support information for decision-making. Given the specificity of the medical domain, the evaluation of automatically generated medical reports is of paramount importance to the validity of these systems. Therefore, in this paper, we focus on the evaluation of automatically generated medical reports from the perspective of automatic and human evaluation. We present evaluation methods for general NLG evaluation and how they have been applied to domain-specific medical tasks. The study shows that MRG evaluation methods are very diverse, and that further work is needed to build shared evaluation methods. The state of the art also emphasizes that such an evaluation must be task specific and include human assessments, requesting the participation of experts in the field.

	@inproceedings{zhou-etal-2023-survey,
	    title = "A Survey of Evaluation Methods of Generated Medical Textual Reports",
	    author = "Zhou, Yongxin  and
	      Ringeval, Fabien  and
	      Portet, Fran{\c{c}}ois",
	    booktitle = "Proceedings of the 5th Clinical Natural Language Processing Workshop",
	    month = jul,
	    year = "2023",
	    address = "Toronto, Canada",
	    publisher = "Association for Computational Linguistics",
	    url = "https://aclanthology.org/2023.clinicalnlp-1.48",
	    doi = "10.18653/v1/2023.clinicalnlp-1.48",
	    pages = "447--459",
	    abstract = "",
	}
        

Effectiveness of French Language Models on Abstractive Dialogue Summarization Task
Yongxin Zhou, François Portet, Fabien Ringeval
LREC, 2022

pdf | abstract | bibtex

Pre-trained language models have established the state-of-the-art on various natural language processing tasks, including dialogue summarization, which allows the reader to quickly access key information from long conversations in meetings, interviews or phone calls. However, such dialogues are still difficult to handle with current models because the spontaneity of the language involves expressions that are rarely present in the corpora used for pre-training the language models. Moreover, the vast majority of the work accomplished in this field has been focused on English. In this work, we present a study on the summarization of spontaneous oral dialogues in French using several language specific pre-trained models: BARThez, and BelGPT-2, as well as multilingual pre-trained models: mBART, mBARThez, and mT5. Experiments were performed on the DECODA (Call Center) dialogue corpus whose task is to generate abstractive synopses from call center conversations between a caller and one or several agents depending on the situation. Results show that the BARThez models offer the best performance far above the previous state-of-the-art on DECODA. We further discuss the limits of such pre-trained models and the challenges that must be addressed for summarizing spontaneous dialogues.

	@InProceedings{zhou-portet-ringeval:2022:LREC,
	  author    = {Zhou, Yongxin  and  Portet, François  and  Ringeval, Fabien},
	  title     = {Effectiveness of French Language Models on Abstractive Dialogue Summarization Task},
	  booktitle      = {Proceedings of the Language Resources and Evaluation Conference},
	  month          = {June},
	  year           = {2022},
	  address        = {Marseille, France},
	  publisher      = {European Language Resources Association},
	  pages     = {3571--3581},
	  abstract  = {},
	  url       = {https://aclanthology.org/2022.lrec-1.382}
	}
        

THERADIA: Digital Therapies Augmented by Artificial Intelligence
Franck Tarpin-Bernard, Joan Fruitet, Jean-Philippe Vigne, Patrick Constant, Hanna Chainay, Olivier Koenig, Fabien Ringeval, Béatrice Bouchot, Gérard Bailly, François Portet, Sina Alisamir, Yongxin Zhou, Jean Serre, Vincent Delerue, Hippolyte Fournier, Kévin Berenger, Isabella Zsoldos, Olivier Perrotin, Frédéric Elisei, Martin Lenglet, Charles Puaux, Léo Pacheco, Mélodie Fouillen, Didier Ghenassia
AHFE, 2021

pdf | abstract | bibtex

Digital plays a key role in the transformation of medicine. Beyond the simple computerisation of healthcare systems, many non-drug treatments are now possible thanks to digital technology. Thus, interactive stimulation exercises can be offered to people suffering from cognitive disorders, such as developmental disorders, neurodegenerative diseases, stroke or traumas. The efficiency of these new treatments, which are still primarily offered face-to-face by therapists, can be greatly improved if patients can pursue them at home. However, patients are left to their own devices which can be problematic. We introduce THERADIA, a 5-year project that aims to develop an empathic virtual agent that accompanies patients while receiving digital therapies at home, and that provides feedback to therapists and caregivers. We detail the architecture of our agent as well as the framework of our Wizard-of-Oz protocol, designed to collect a large corpus of interactions between people and our virtual assistant in order to train our models and improve our dialogues.

	@inproceedings{tarpin2021theradia,
	  title={THERADIA: Digital Therapies Augmented by Artificial Intelligence},
	  author={Tarpin-Bernard, Franck and Fruitet, Joan and Vigne, Jean-Philippe and Constant, Patrick and Chainay, Hanna and Koenig, Olivier and Ringeval, Fabien and Bouchot, B{\'e}atrice and Bailly, G{\'e}rard and Portet, Fran{\c{c}}ois and others},
	  booktitle={International Conference on Applied Human Factors and Ergonomics},
	  pages={478--485},
	  year={2021},
	  organization={Springer}
	}
        

Towards an XAI-Assisted Third-Party Evaluation of AI Systems: Illustration on Decision Trees
Yongxin Zhou, Matthieu Boussard, Agnes Delaborde
AAMAS - EXTRAAMAS, 2021

pdf | abstract | bibtex

We explored the potential contribution of eXplainable Artificial Intelligence (XAI) for the evaluation of Artificial Intelligence (AI), in a context where such an evaluation is performed by independent third-party evaluators, for example in the objective of certification. The experimental approach of this paper is based on “explainable by design” decision trees that produce predictions on health data and bank data. Results presented in this paper show that the explanations could be used by the evaluators to identify the parameters used in decision making and their levels of importance. The explanations would thus make it possible to orient the constitution of the evaluation corpus, to explore the rules followed for decision-making and to identify potentially critical relationships between different parameters. In addition, the explanations make it possible to inspect the presence of bias in the database and in the algorithm. These first results lay the groundwork for further additional research in order to generalize the conclusions of this paper to different XAI methods.

	@InProceedings{10.1007/978-3-030-82017-6_10,
	author="Zhou, Yongxin
	and Boussard, Matthieu
	and Delaborde, Agnes",
	editor="Calvaresi, Davide
	and Najjar, Amro
	and Winikoff, Michael
	and Fr{\"a}mling, Kary",
	title="Towards an XAI-Assisted Third-Party Evaluation of AI Systems: Illustration on Decision Trees",
	booktitle="Explainable and Transparent AI and Multi-Agent Systems",
	year="2021",
	publisher="Springer International Publishing",
	address="Cham",
	pages="158--172",
	abstract="",
	isbn="978-3-030-82017-6"
	}
        
Communication
  • Exploration de caractéristiques linguistiques et acoustiques pour la génération automatique de rapports de séances de remédiation cognitive avec un assistant virtuel
    Yongxin Zhou, Fabien Ringeval, François Portet
    JPC, 2023 - 9èmes Journées de Phonétique Clinique
    pdf | poster | bibtex
    	@article{zhouexploration,
    	  title={Exploration de caract{\'e}ristiques linguistiques et acoustiques pour la g{\'e}n{\'e}ration automatique de rapports de s{\'e}ances de rem{\'e}diation cognitive avec un assistant virtuel},
    	  author={ZHOU, Yongxin and RINGEVAL, Fabien and PORTET, Fran{\c{c}}ois},
    	  journal={9{\`e}mes Journ{\'e}es de Phon{\'e}tique Clinique},
    	  pages={117}
    	} 
Professional Activities
  • Conference Reviewer
    • The 13th Edition of the Language Resources and Evaluation Conference, LREC 2022
    • Generation, Evaluation & Metrics (GEM) Workshop, GEM 2022, GEM 2023
    • International Conference on Affective Computing + Intelligent Interaction, ACII 2023
    • The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC-COLING 2024
  • Volunteer
    • Organisation of social activities at the first Advanced Language Processing School, ALPS 2021
    • PhD volunteer at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2022
    • Virtual volunteer at the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
  • Others
    • 02/2021 - 07/2021, Supervision of a Master's internship on Natural Language Grounding through Dense Video Captioning, Multi3Generation
    • Teaching Assistant at the second Advanced Language Processing School, ALPS 2022