|
Hayk Poghosyan
I'm a senior ML research engineer at Pipio, where I work on generative AI for audio-conditioned
video generation. Check out our latest project: EditYourself.
Previously, I was a senior ML scientist
at PicsArt, focusing on large-scale controllable image and video generation using GANs and diffusion
models, low-light enhancement, and person segmentation. I've also worked on AI
website builders, time-series forecasting, and anomaly detection. I have 7+ years of experience in
both research and engineering.
I hold a PhD in theoretical physics from the Yerevan Physics Institute, with research spanning
classical and quantum spin systems, and a master's degree in informatics and computer engineering.
My work bridges computer vision, generative models, and large-scale data systems, with publications
at venues including CVPR, JHEP, ICASSP, ACM Multimedia, and Physica A.
Email /
LinkedIn /
Scholar /
GitHub
📍 Alicante, Spain
|
|
-
Senior ML Research Engineer — Pipio
October 2025 - Present
- Generative AI for audio-conditioned video generation.
- Worked on EditYourself, a diffusion-based video editing model for talking heads.
- Deployed and optimized large transformer-based models.
-
Senior ML Scientist — PicsArt
September 2021 - September 2025
- Low-light image enhancement with emphasis on controllability.
- Person semantic segmentation.
- Image editing using GANs and diffusion models.
- Image and video generation with generative AI.
- Processed and managed petabyte-scale image and video datasets.
-
Data Scientist — Noble Scripts
April 2021 - September 2021
- Time-series forecasting.
- Anomaly detection.
-
Senior ML Engineer — 10Web
December 2018 - April 2021
- Worked on an end-to-end AI Website Builder.
-
Researcher — YerPhI (Yerevan Physics Institute)
2016 - 2025
- Research in theoretical and computational physics.
- Collaboration with the Demokritos Institute (Athens) on random number generator
research.
|
Research
I'm interested in computer vision, deep learning, generative AI, and image processing. Most of my
work focuses on image and video generation, controllable image editing, and inferring structure and
semantics from visual data using modern generative models, as well as optimizing those models for
real-world workloads and use cases.
|
|
|
EditYourself: Audio-Driven Generation and Manipulation of Talking Head
Videos with Diffusion Transformers
NEW
John Flynn,
Wolfgang Paier,
Dimitar Dinev,
Sam Nhut Nguyen,
Hayk Poghosyan,
Manuel Toribio,
Sandipan Banerjee,
Guy Gafni
ArXiv 2026
project page
·
arXiv
EditYourself is a diffusion-based video editing model for talking heads, enabling transcript-driven
lip-syncing, insertion, removal and retiming of speech while preserving identity and visual
fidelity.
|
|
|
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation
from Text
Roberto Henschel,
Levon Khachatryan,
Daniil Hayrapetyan,
Hayk Poghosyan,
Vahram Tadevosyan,
Zhangyang Wang,
Shant Navasardyan,
Humphrey Shi
CVPR 2025
project page
·
arXiv
An autoregressive approach for long video generation (80 to 1200+ frames) with temporal consistency,
high motion dynamics, and smooth transitions. Uses a conditional attention module (CAM) and
appearance preservation module (APM); includes StreamingSVD (image-to-video) and StreamingModelscope
(up to 2 minutes).
|
|
|
Grounded-Instruct-Pix2Pix: Improving Instruction Based Image Editing with
Automatic Target Grounding
Artur Shagidanov,
Hayk Poghosyan,
Xinyu Gong,
Zhangyang Wang,
Shant Navasardyan,
Humphrey Shi
code
A robust framework for localized instruction-based image editing. Two stages: (1) grounding mask
extraction via CLIP-Score Filtering and Grounded-SAM, (2) localized editing with Instruct-Pix2Pix
and latent blending. Incurs little overhead and requires no additional user inputs.
|
|
|
ReCoRo: Region-Controllable Robust Light Enhancement with User-Specified
Imprecise Masks
Dejia Xu,
Hayk Poghosyan,
Shant Navasardyan,
Yifan Jiang,
Humphrey Shi,
Zhangyang Wang
ACM Multimedia 2022
code
·
paper
(PDF)
A low-light enhancement method that lets users specify where and how much to enhance an input image.
Robust to roughly-supplied user masks; supports both imprecise and fine matting masks.
|
-
J. Flynn, W. Paier, D. Dinev, S. N. Nguyen, H. Poghosyan, M.
Toribio, S. Banerjee, G. Gafni
EditYourself: Audio-Driven Generation and Manipulation of Talking
Head Videos with Diffusion Transformers
-
G. Amatuni, Č. Burdík, H. Poghosyan, L. Ananikyan, N. Ananikian
Spin-1 Antiferromagnetic Diamond Chains with Biquadratic
Nodal-nodal Interactions: Magnetization Plateaus, Super-Stable Points and Cycles
-
R. Henschel, L. Khachatryan, H. Poghosyan, D. Hayrapetyan, V.
Tadevosyan, Z. Wang, S. Navasardyan, H. Shi
StreamingT2V: Consistent, Dynamic, and Extendable Long Video
Generation from Text
2025. IEEE/CVF CVPR.
-
A. Shagidanov, H. Poghosyan, X. Gong, Z. Wang, S. Navasardyan,
H. Shi
Grounded-Instruct-Pix2Pix: Improving Instruction-Based Image
Editing with Automatic Target Grounding
2024. IEEE ICASSP.
-
D. Xu, H. Poghosyan, S. Navasardyan, Y. Jiang, H. Shi, Z. Wang
ReCoRo: Region-Controllable Robust Light Enhancement with
User-Specified Imprecise Masks
2022. ACM Multimedia (MM '22).
-
V. Abgaryan, N. Ananikian, L. Ananikyan, H. Poghosyan
Magnetic Properties and Entanglement of Nickel Containing Polymer
2019. Armenian Journal of Physics.
-
H. Poghosyan, K. Savvidy, G. Savvidy
Classical limit theorems and high entropy MIX-MAX random number
generator
-
N. Ananikian, R. Artusov, H. Poghosyan
Superstable cycles and magnetization plateau for
antiferromagnetic spin-1 Ising and Ising-Heisenberg models on diamond chains
-
H. Poghosyan
Super stable cycles and magnetization plateau for spin-1 Ising
model on diamond-like decorated Bethe lattice
2017. Armenian Journal of Physics, 10(3), 92-98.
-
N. Ananikian, Č. Burdík, L. Ananikyan, H. Poghosyan
Magnetization Plateaus and Thermal Entanglement of Spin Systems
-
H. Poghosyan, V. Poghosyan
Frontal Cellular Automata for the Study of Non-Equilibrium
Lattice Models
-
A. Poghosyan, H. Poghosyan
Mixing with descendant fields in perturbed minimal CFT models
|
|