About Noé Tits

I am an AI Research Engineer with a deep foundation in Machine Learning and Deep Learning, specializing in the development and application of Foundational Transformer Models, Large Language Models (LLMs), Multimodal AI, and Agentic AI Workflows. My expertise lies in transforming complex data challenges into intelligent, actionable solutions, particularly within speech, language, and clinical domains.

Currently, I leverage advanced Machine Learning models and algorithms at CluePoints, focusing on detecting subtle inconsistencies in clinical data. My project goes beyond traditional detection systems by ingesting and utilizing clinical trial protocols to drive the data issue detection process, implemented as an innovative agentic workflow. This approach highlights my capability in building autonomous, goal-driven AI systems for critical applications.

My academic journey at the University of the Basque Country (Aholab laboratory) and the Numediart Institute of UMONS (ISIA Lab) provided the bedrock for my specialization. My Master’s thesis involved developing tools for pathological voice analysis, hinting at my early interest in complex signal interpretation. During my PhD, I pushed the boundaries of Text-to-Speech Synthesis, focusing on emotional expressiveness. My approach involved training a Deep Learning architecture for TTS that used an attention-based decoder, alongside a text encoder and a dedicated emotional expressiveness encoder. This foundational work established my proficiency in designing and training sophisticated neural architectures, highly relevant to today’s transformer-based sequence-to-sequence models.

My contributions to Flowchase involved spearheading the R&D for their speech technology. Here, I implemented a transformer-based foundational audio model (wav2vec 2.0), which I fine-tuned to create a duration-aware phoneme sequence predictor. This cutting-edge approach enabled robust phonetic alignment and classification, facilitating multilingual text phonetization, precise speech alignment, and efficient model training for tasks like speech-to-phoneme inference and nuanced intonation assessment. The robust Speech & Language Tools I developed offer comprehensive solutions for linguistic data extraction, documentation, dataset parsing, and automatic transcription, all areas now profoundly enhanced by LLMs and advanced natural language understanding techniques.
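To give a concrete flavour of what duration-aware phoneme prediction involves (this is a minimal illustrative sketch, not the actual Flowchase implementation): once a fine-tuned frame classifier such as wav2vec 2.0 emits one phoneme label per ~20 ms frame, phoneme durations can be recovered by collapsing runs of identical labels, CTC-style. The label set, blank token, and example frames below are hypothetical.

```python
from itertools import groupby

FRAME_MS = 20  # wav2vec 2.0 produces roughly one feature frame every 20 ms


def frames_to_segments(frame_labels, blank="<pad>"):
    """Collapse per-frame phoneme labels into (phoneme, duration_ms) segments.

    Consecutive identical labels are merged and their frame count is
    converted to a duration in milliseconds; blank/padding frames are
    dropped, as in CTC-style decoding.
    """
    segments = []
    for label, run in groupby(frame_labels):
        n_frames = sum(1 for _ in run)
        if label != blank:
            segments.append((label, n_frames * FRAME_MS))
    return segments


# Hypothetical frame-level output for the word "see" (/s iː/)
frames = ["<pad>", "s", "s", "s", "iː", "iː", "iː", "iː", "iː", "<pad>"]
print(frames_to_segments(frames))  # [('s', 60), ('iː', 100)]
```

Aligned (phoneme, duration) pairs like these are exactly what downstream tasks such as intonation and pronunciation assessment consume.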

My diverse project experience spans from electromagnetic field simulations to voice analysis/generation, medical image processing, and clinical data analysis, demonstrating my versatility and ability to apply advanced analytical methods across various scientific and engineering challenges. This broad exposure is key to my approach in multimodal AI, where integrating diverse data types unlocks deeper insights.

I am passionate about building intelligent systems that learn, adapt, and reason, driving innovation from foundational research to real-world applications.

Introduction to Speech Technologies for the European Commission

I was invited by the Interinstitutional Task Force on Speech Recognition of the European Commission to give the introduction to a workshop entitled “Speech synthesis: driver of multilingualism”.
The task force groups representatives of the various EU institutions and bodies; its main objective is to offer a forum for exchanging views and ideas and finding synergies in the area of speech recognition. The scope of the Task Force has recently been expanded to all AI-powered speech technologies, including speech synthesis and speech-to-speech translation.
The audience consisted of members of the task force, who represent the language services of all EU institutions. In addition, there were colleagues from the artificial intelligence network and the emerging technologies committee of the EU institutions, as well as representatives from academia, industry, and other stakeholders.

AI R&D @ Flowchase.app

PhD research subject

This presentation comes from the “Ma thèse en 180 secondes” (“My Thesis in 180 Seconds”) competition, which challenges participants to explain the subject of a PhD thesis in simple terms in 2–3 minutes.