Viacheslav Meshchaninov

Hello, I’m a PhD student in computer science at Constructor University, Bremen, where my research focuses on generative and diffusion models for discrete data such as proteins and text.

I completed my Bachelor’s and Master’s degrees in computer science at Lomonosov Moscow State University in 2024.

Here, I share my professional notes and personal insights from my ongoing research. My hope is that they will be a valuable resource for other researchers and enthusiasts in the field.

News

Feb 01, 2025
Hello from Bremen! My PhD Journey Starts Now.
Jun 22, 2024
I completed my Master’s degree at MSU with highest honors.

Publications

  1. ICML
    Diffusion on language model encodings for protein sequence generation
    Viacheslav Meshchaninov, Pavel Strashnov, Andrey Shevtsov, and 4 more authors
    arXiv preprint arXiv:2403.03726, 2025
    Protein sequence design has seen significant advances through discrete diffusion and autoregressive approaches, yet the potential of continuous diffusion remains underexplored. Here, we present DiMA, a latent diffusion framework that operates on protein language model representations. Through systematic exploration of architectural choices and diffusion components, we develop a robust methodology that generalizes across multiple protein encoders ranging from 8M to 3B parameters. We demonstrate that our framework achieves consistently high performance across sequence-only (ESM-2, ESMc), dual-decodable (CHEAP), and multimodal (SaProt) representations using the same architecture and training approach. We extensively evaluate existing methods alongside DiMA using multiple metrics across two protein modalities, covering quality, diversity, novelty, and distribution matching of generated proteins. DiMA consistently produces novel, high-quality, and diverse protein sequences and achieves strong results compared to baselines such as autoregressive, discrete diffusion, and flow matching language models. The model demonstrates versatile functionality, supporting conditional generation tasks including protein family generation, motif scaffolding and infilling, and fold-specific sequence design. This work provides a universal continuous diffusion framework for protein sequence generation, offering both architectural insights and practical applicability across various protein design scenarios.
  2. AAAI
    TEncDM: Understanding the properties of the diffusion model in the space of language model encodings
    Alexander Shabalin, Viacheslav Meshchaninov, Egor Chimbulatov, and 6 more authors
    In Proceedings of the AAAI Conference on Artificial Intelligence, 2025
    This paper presents the Text Encoding Diffusion Model (TEncDM), a novel approach to diffusion modeling that operates in the space of pre-trained language model encodings. In contrast to traditionally used embeddings, encodings integrate contextual information. In our approach, we also employ a transformer-based decoder, specifically designed to incorporate context in the token prediction process. We conduct a comprehensive examination of the influence of the encoder, decoder, noise scheduler, and self-conditioning on zero-shot generation. Furthermore, we compare TEncDM with previous approaches on three conditional text generation tasks: QQP, XSum, and Wiki-Auto. The results show that TEncDM exhibits superior performance compared to existing non-autoregressive diffusion models.
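
Since this site is meant as a place for research notes, here is a concrete picture of the latent diffusion recipe behind DiMA. The sketch below is a minimal training step, assuming latents precomputed by a frozen protein language model encoder; cosine_alpha_bar, denoiser, and training_step are illustrative names using a standard cosine noise schedule, not the paper's actual code.

    import torch
    import torch.nn as nn

    def cosine_alpha_bar(t):
        # Cumulative signal level at continuous time t in [0, 1]
        # (a standard cosine schedule; the paper's exact schedule may differ).
        return torch.cos(t * torch.pi / 2) ** 2

    def training_step(denoiser, latents):
        # One denoising step on precomputed, frozen encoder latents
        # of shape (batch, seq_len, dim).
        b = latents.size(0)
        t = torch.rand(b, device=latents.device)           # random timesteps
        a_bar = cosine_alpha_bar(t).view(b, 1, 1)
        noise = torch.randn_like(latents)
        noisy = a_bar.sqrt() * latents + (1 - a_bar).sqrt() * noise
        pred = denoiser(noisy, t)                          # predict clean latents
        return nn.functional.mse_loss(pred, latents)

At sampling time, one would start from pure Gaussian noise, iterate the learned denoiser down to t = 0, and map the final latents back to an amino-acid sequence with a decoder for the chosen representation.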
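
Similarly, the contrast TEncDM draws between embeddings and contextual encodings shows up most clearly on the decoding side. The sketch below illustrates a context-aware decoding head of the kind the abstract describes: denoised encodings pass through self-attention layers before token prediction. The class name, dimensions, and layer counts here are assumptions for illustration, not the paper's configuration.

    import torch
    import torch.nn as nn

    class ContextualDecoder(nn.Module):
        # Maps denoised encodings back to tokens through self-attention,
        # so each token is predicted with awareness of its neighbours
        # rather than through an independent per-position projection.
        def __init__(self, dim=768, vocab_size=32000, n_layers=3, n_heads=12):
            super().__init__()
            layer = nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
            self.blocks = nn.TransformerEncoder(layer, n_layers)
            self.head = nn.Linear(dim, vocab_size)

        def forward(self, denoised_encodings):             # (batch, seq, dim)
            return self.head(self.blocks(denoised_encodings))

    logits = ContextualDecoder()(torch.randn(2, 16, 768))  # (2, 16, 32000)
    tokens = logits.argmax(dim=-1)                         # greedy readout

A per-position linear head would predict each token independently; letting the decoder attend across positions is what allows it to incorporate context when reading tokens out of the denoised encodings.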