Hoyeon Lee

I am a Senior Research Scientist and Tech Lead in the Voice team at NAVER Cloud.

My research focuses on Language, Speech AI, and Multimodal AI, with an emphasis on large language models (LLMs). I work on multilingual and cross-lingual representation learning, as well as LLM-driven data generation and evaluation methods for robust speech and Text-to-Speech (TTS) pipelines.

I have developed large-scale multilingual language models for Korean, French, and cross-lingual settings, and deployed them in AI products used across NAVER’s services. My work has been presented at top-tier conferences, including EMNLP and INTERSPEECH.

Since April 2026, I have also served as an AI Technical Mentor at AI·SW Maestro, a national talent-development program organized by the Ministry of Science and ICT (MSIT), guiding two teams in building production-grade AI systems across LLM applications for language, speech AI, and multimodal intelligence.

If you are interested in collaboration or would like to get in touch, please feel free to contact me.

news

Sep 2026	Excited to share that our work “LLM-Based Multi-Reference Evaluation for Efficient and Robust Assessment of Phrase Break Annotations” will appear at INTERSPEECH 2026. Looking forward to presenting it in Sydney, Australia 🇦🇺!
Apr 2026	I joined AI·SW Maestro as a Technical Mentor. Starting this year, I am mentoring two teams building production-grade AI systems across language modeling, speech AI, and multimodal intelligence.
Aug 2025	Our paper “Synthetic Data Generation for Phrase Break Prediction with Large Language Model” was accepted to INTERSPEECH 2025 in Rotterdam 🇳🇱.
Nov 2024	We presented our work “A Two-Step Approach for Data-Efficient French Pronunciation Learning” at EMNLP 2024 in Miami 🇺🇸.
Jun 2023	I will give a talk on the accepted paper “Lightweight Grapheme-to-Phoneme Conversion Based on Knowledge Distilled BERT” at the Summer Annual Conference of IEIE, to be held in Jeju 🇰🇷.