aigc-acceleration-arxiv-daily

AIGC Acceleration for Video Generation

Automatically Updated on 2026.04.04

Current Search Keywords: Video Generation, Text-to-Video, Image-to-Video, Video Editing, Diffusion Models, Real-time Generation, Video Diffusion, Video Synthesis, Latent Diffusion, Video Generation Acceleration

If you would like other keywords added, please feel free to let us know :)
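As a rough illustration of how a keyword from the list above could be turned into a search, the sketch below builds a query URL for the public arXiv API. This is not the scrape code used for this page, which is not shown here; the helper name `build_query_url` and the default of 10 results are assumptions made for the example.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint; the function below only builds a URL and makes no request.
ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(keyword: str, max_results: int = 10) -> str:
    """Build an arXiv API URL that lists the newest submissions matching a phrase.

    `build_query_url` and its defaults are hypothetical, for illustration only.
    """
    params = {
        # Quote the keyword so it is matched as an exact phrase in all fields.
        "search_query": f'all:"{keyword}"',
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    for kw in ["Video Generation", "Text-to-Video"]:
        print(build_query_url(kw))
```

The response to such a URL is an Atom feed, so a real scraper would still need to fetch and parse it before filling in the tables.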

Web Page (Scrape Code)

Table of Contents

<a href=#video-generation>Video Generation</a>
<a href=#image-to-video>Image-to-Video</a>
<a href=#video-editing>Video Editing</a>
<a href=#diffusion-models>Diffusion Models</a>
<a href=#real-time-generation>Real-time Generation</a>
<a href=#dit-acceleration>DiT Acceleration</a>

Video Generation

Publish Date	Title	Authors	PDF	Code
2026-04-02	ActionParty: Multi-Subject Action Binding in Generative Video Games	Alexander Pondaven et.al.	2604.02330	null
2026-04-02	Generative World Renderer	Zheng-Hui Huang et.al.	2604.02329	null
2026-04-02	VOID: Video Object and Interaction Deletion	Saman Motamed et.al.	2604.02296	null
2026-04-02	Resonance4D: Frequency-Domain Motion Supervision for Preset-Free Physical Parameter Learning in 4D Dynamic Physical Scene Simulation	Changshe Zhang et.al.	2604.01994	null
2026-04-02	DriveDreamer-Policy: A Geometry-Grounded World-Action Model for Unified Generation and Planning	Yang Zhou et.al.	2604.01765	null
2026-04-02	Control-DINO: Feature Space Conditioning for Controllable Image-to-Video Diffusion	Edoardo A. Dominici et.al.	2604.01761	null
2026-04-02	Can Video Diffusion Models Predict Past Frames? Bidirectional Cycle Consistency for Reversible Interpolation	Lingyu Liu et.al.	2604.01700	null
2026-04-02	From Understanding to Erasing: Towards Complete and Stable Video Object Removal	Dingming Liu et.al.	2604.01693	null
2026-04-02	DynaVid: Learning to Generate Highly Dynamic Videos using Synthetic Motion Data	Wonjoon Jin et.al.	2604.01666	null
2026-04-02	Moiré Video Authentication: A Physical Signature Against AI Video Generation	Yuan Qing et.al.	2604.01654	null
2026-04-02	ZEUS: Accelerating Diffusion Models with Only Second-Order Predictor	Yixiao Wang et.al.	2604.01552	null
2026-04-01	Reinforcing Consistency in Video MLLMs with Structured Rewards	Yihao Quan et.al.	2604.01460	null
2026-04-01	GRAZE: Grounded Refinement and Motion-Aware Zero-Shot Event Localization	Syed Ahsan Masud Zaidi et.al.	2604.01383	null
2026-04-01	TRACE: High-Fidelity 3D Scene Editing via Tangible Reconstruction and Geometry-Aligned Contextual Video Masking	Jiyuan Hu et.al.	2604.01207	null
2026-04-01	ReinDriveGen: Reinforcement Post-Training for Out-of-Distribution Driving Scene Generation	Hao Zhang et.al.	2604.01129	null
2026-04-01	PHASOR: Anatomy- and Phase-Consistent Volumetric Diffusion for CT Virtual Contrast Enhancement	Zilong Li et.al.	2604.01053	null
2026-04-01	ONE-SHOT: Compositional Human-Environment Video Synthesis via Spatial-Decoupled Motion Injection and Hybrid Context Integration	Fengyuan Yang et.al.	2604.01043	null
2026-04-01	MotionGrounder: Grounded Multi-Object Motion Transfer via Diffusion Transformer	Samuel Teodoro et.al.	2604.00853	null
2026-04-01	HICT: High-precision 3D CBCT reconstruction from a single X-ray	Wen Ma et.al.	2604.00792	null
2026-04-01	CL-VISTA: Benchmarking Continual Learning in Video Large Language Models	Haiyang Guo et.al.	2604.00677	null
2026-03-31	Collaborative AI Agents and Critics for Fault Detection and Cause Analysis in Network Telemetry	Syed Eqbal Alam et.al.	2604.00319	null
2026-03-31	OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation	Yuheng Liu et.al.	2603.30045	null
2026-03-31	Video Models Reason Early: Exploiting Plan Commitment for Maze Solving	Kaleb Newman et.al.	2603.30043	null
2026-03-31	Gloria: Consistent Character Video Generation via Content Anchors	Yuhang Yang et.al.	2603.29931	null
2026-03-31	SLVMEval: Synthetic Meta Evaluation Benchmark for Text-to-Long Video Generation	Ryosuke Matsuda et.al.	2603.29186	null
2026-03-31	TrajectoryMover: Generative Movement of Object Trajectories in Videos	Kiran Chhatre et.al.	2603.29092	null
2026-03-30	Generating Humanless Environment Walkthroughs from Egocentric Walking Tour Videos	Yujin Ham et.al.	2603.29036	null
2026-03-30	Stepper: Stepwise Immersive Scene Generation with Multiview Panoramas	Felix Wimbauer et.al.	2603.28980	null
2026-03-30	Video Generation Models as World Models: Efficient Paradigms, Architectures and Algorithms	Muyang He et.al.	2603.28489	null
2026-03-30	VistaGEN: Consistent Driving Video Generation with Fine-Grained Control Using Multiview Visual-Language Reasoning	Li-Heng Chen et.al.	2603.28353	null
2026-03-30	LogiStory: A Logic-Aware Framework for Multi-Image Story Visualization	Chutian Meng et.al.	2603.28082	null
2026-03-30	FlashSign: Pose-Free Guidance for Efficient Sign Language Video Generation	Liuzhou Zhang et.al.	2603.27915	null
2026-03-29	Wan-R1: Verifiable-Reinforcement Learning for Video Reasoning	Ming Liu et.al.	2603.27866	null
2026-03-29	TokenDial: Continuous Attribute Control in Text-to-Video via Spatiotemporal Token Offsets	Zhixuan Liu et.al.	2603.27520	null
2026-03-29	KV Cache Quantization for Self-Forcing Video Generation: A 33-Method Empirical Study	Suraj Ranganath et.al.	2603.27469	null
2026-03-28	LOME: Learning Human-Object Manipulation with Action-Conditioned Egocentric World Model	Quankai Gao et.al.	2603.27449	null
2026-03-28	TrackMAE: Video Representation Learning via Track Mask and Predict	Renaud Vandeghen et.al.	2603.27268	null
2026-03-28	LightMover: Generative Light Movement with Color and Intensity Controls	Gengze Zhou et.al.	2603.27209	null
2026-03-28	EFlow: Fast Few-Step Video Generator Training from Scratch via Efficient Solution Flow	Dogyun Park et.al.	2603.27086	null
2026-03-28	LightCtrl: Training-free Controllable Video Relighting	Yizuo Peng et.al.	2603.27083	null
2026-03-27	Think over Trajectories: Leveraging Video Generation to Reconstruct GPS Trajectories from Cellular Signaling	Ruixing Zhang et.al.	2603.26610	null
2026-03-27	VGGRPO: Towards World-Consistent Video Generation with 4D Latent Reward	Zhaochong An et.al.	2603.26599	null
2026-03-27	Generation Is Compression: Zero-Shot Video Coding via Stochastic Rectified Flow	Ziyue Zeng et.al.	2603.26571	null
2026-03-26	ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling	Yawen Luo et.al.	2603.25746	null
2026-03-26	RefAlign: Representation Alignment for Reference-to-Video Generation	Lei Wang et.al.	2603.25743	null
2026-03-26	PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference	Xiaofeng Mao et.al.	2603.25730	null
2026-03-26	Beyond the Golden Data: Resolving the Motion-Vision Quality Dilemma via Timestep Selective Training	Xiangyang Luo et.al.	2603.25527	null
2026-03-26	EagleNet: Energy-Aware Fine-Grained Relationship Learning Network for Text-Video Retrieval	Yuhan Chen et.al.	2603.25267	null
2026-03-26	Free-Lunch Long Video Generation via Layer-Adaptive O.O.D Correction	Jiahao Tian et.al.	2603.25209	null
2026-03-26	AnyID: Ultra-Fidelity Universal Identity-Preserving Video Generation from Any Visual References	Jiahao Wang et.al.	2603.25188	null
2026-03-26	GaussFusion: Improving 3D Reconstruction in the Wild with A Geometry-Informed Video Generator	Liyuan Zhu et.al.	2603.25053	null
2026-03-26	ScrollScape: Unlocking 32K Image Generation With Video Diffusion Priors	Haodong Yu et.al.	2603.24270	null
2026-03-25	DCARL: A Divide-and-Conquer Framework for Autoregressive Long-Trajectory Video Generation	Junyi Ouyang et.al.	2603.24835	null
2026-03-25	DreamerAD: Efficient Reinforcement Learning via Latent World Model for Autonomous Driving	Pengxuan Yang et.al.	2603.24587	null
2026-03-25	Anti-I2V: Safeguarding your photos from malicious image-to-video generation	Duc Vu et.al.	2603.24570	null
2026-03-25	Toward Physically Consistent Driving Video World Models under Challenging Trajectories	Jiawei Zhou et.al.	2603.24506	null
2026-03-25	OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning	Kaihang Pan et.al.	2603.24458	null
2026-03-25	Accelerating Diffusion-based Video Editing via Heterogeneous Caching: Beyond Full Computing at Sampled Denoising Timestep	Tianyi Liu et.al.	2603.24260	null
2026-03-25	Leave No Stone Unturned: Uncovering Holistic Audio-Visual Intrinsic Coherence for Deepfake Detection	Jielun Peng et.al.	2603.23960	null
2026-03-25	Knowledge-Refined Dual Context-Aware Network for Partially Relevant Video Retrieval	Junkai Yang et.al.	2603.23902	null
2026-03-24	WildWorld: A Large-Scale Dataset for Dynamic World Modeling with Actions and Explicit State toward Generative ARPG	Zhen Li et.al.	2603.23497	null
2026-03-24	Foveated Diffusion: Efficient Spatially Adaptive Image and Video Generation	Brian Chao et.al.	2603.23491	null
2026-03-24	TETO: Tracking Events with Teacher Observation for Motion Estimation and Frame Interpolation	Jini Yang et.al.	2603.23487	null
2026-03-24	RealMaster: Lifting Rendered Scenes into Photorealistic Video	Dana Cohen-Bar et.al.	2603.23462	null
2026-03-24	I3DM: Implicit 3D-aware Memory Retrieval and Injection for Consistent Video Scene Generation	Jia Li et.al.	2603.23413	null
2026-03-24	ABot-PhysWorld: Interactive World Foundation Model for Robotic Manipulation with Physics Alignment	Yuzhi Chen et.al.	2603.23376	null
2026-03-24	ViBe: Ultra-High-Resolution Video Synthesis Born from Pure Images	Yunfeng Wu et.al.	2603.23326	null
2026-03-24	GO-Renderer: Generative Object Rendering with 3D-aware Controllable Video Diffusion Models	Zekai Gu et.al.	2603.23246	null
2026-03-24	InterDyad: Interactive Dyadic Speech-to-Video Generation by Querying Intermediate Visual Guidance	Dongwei Pan et.al.	2603.23132	null
2026-03-24	WorldMesh: Generating Navigable Multi-Room 3D Scenes via Mesh-Conditioned Image Diffusion	Manuel-Andreas Schneider et.al.	2603.22972	null
2026-03-24	Cluster-Wise Spatio-Temporal Masking for Efficient Video-Language Pretraining	Weijun Zhuang et.al.	2603.22953	null
2026-03-23	TrajLoom: Dense Future Trajectory Generation from Video	Zewei Zhang et.al.	2603.22606	null
2026-03-23	Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models	Meiqi Wu et.al.	2603.22212	null
2026-03-23	PAM: A Pose-Appearance-Motion Engine for Sim-to-Real HOI Video Generation	Mingju Gao et.al.	2603.22193	null
2026-03-23	Mamba-VMR: Multimodal Query Augmentation via Generated Videos for Precise Temporal Grounding	Yunzhuo Sun et.al.	2603.22121	null
2026-03-23	P-Flow: Prompting Visual Effects Generation	Rui Zhao et.al.	2603.22091	null
2026-03-23	Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model	SII-GAIR et.al.	2603.21986	null
2026-03-23	Manifold-Aware Exploration for Reinforcement Learning in Video Generation	Mingzhe Zheng et.al.	2603.21872	null
2026-03-23	Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation	Yuyang You et.al.	2603.21864	null
2026-03-23	Climate Prompting: Generating the Madden-Julian Oscillation using Video Diffusion and Low-Dimensional Conditioning	Sulian Thual et.al.	2603.21856	null
2026-03-23	PROBE: Diagnosing Residual Concept Capacity in Erased Text-to-Video Diffusion Models	Yiwei Xie et.al.	2603.21547	null
2026-03-22	Relax Forcing: Relaxed KV-Memory for Consistent Long Video Generation	Zengqun Zhao et.al.	2603.21366	null
2026-03-22	Identity-Consistent Video Generation under Large Facial-Angle Variations	Bin Hu et.al.	2603.21299	null
2026-03-22	Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows	Janne Perini et.al.	2603.21210	null
2026-03-20	Uni-Classifier: Leveraging Video Diffusion Priors for Universal Guidance Classifier	Yujie Zhou et.al.	2603.20382	null
2026-03-20	MME-CoF-Pro: Evaluating Reasoning Coherence in Video Generative Models with Text and Visual Hints	Yu Qi et.al.	2603.20194	null
2026-03-20	LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation	Jiazheng Xing et.al.	2603.20192	null
2026-03-20	X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving	Chaoda Zheng et.al.	2603.19979	null
2026-03-20	Morphology-Consistent Humanoid Interaction through Robot-Centric Video Synthesis	Weisheng Xu et.al.	2603.19709	null
2026-03-20	Making Video Models Adhere to User Intent with Minor Adjustments	Daniel Ajisafe et.al.	2603.19672	null
2026-03-20	OrbitNVS: Harnessing Video Diffusion Priors for Novel View Synthesis	Jinglin Liang et.al.	2603.19613	null
2026-03-20	Physion-Eval: Evaluating Physical Realism in Generated Video via Human Reasoning	Qin Zhang et.al.	2603.19607	null
2026-03-19	Depictions of Depression in Generative AI Video Models: A Preliminary Study of OpenAI’s Sora 2	Matthew Flathers et.al.	2603.19527	null
2026-03-19	Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding	Xianjin Wu et.al.	2603.19235	null
2026-03-19	MonoArt: Progressive Structural Reasoning for Monocular Articulated 3D Reconstruction	Haitian Li et.al.	2603.19231	null
2026-03-19	Spectrally-Guided Diffusion Noise Schedules	Carlos Esteves et.al.	2603.19222	null
2026-03-19	Measuring 3D Spatial Geometric Consistency in Dynamic Generated Videos	Weijia Dou et.al.	2603.19048	null
2026-03-19	V-Dreamer: Automating Robotic Simulation and Trajectory Synthesis via Video Generation Priors	Songjia He et.al.	2603.18811	null
2026-03-19	6Bit-Diffusion: Inference-Time Mixed-Precision Quantization for Video Diffusion Models	Rundong Su et.al.	2603.18742	null
2026-03-19	PhysVideo: Physically Plausible Video Generation with Cross-View Geometry Guidance	Cong Wang et.al.	2603.18639	null
2026-03-19	Training-Free Sparse Attention for Fast Video Generation via Offline Layer-Wise Sparsity Profiling and Online Bidirectional Co-Clustering	Jiayi Luo et.al.	2603.18636	null
2026-03-19	GenVideoLens: Where LVLMs Fall Short in AI-Generated Video Detection?	Yueying Zou et.al.	2603.18625	null
2026-03-19	Improving Joint Audio-Video Generation with Cross-Modal Context Learning	Bingqi Ma et.al.	2603.18600	null
2026-03-19	3DreamBooth: High-Fidelity 3D Subject-Driven Video Generation Model	Hyun-kyu Ko et.al.	2603.18524	null
2026-03-19	Efficient Video Diffusion with Sparse Information Transmission for Video Compression	Mingde Zhou et.al.	2603.18501	null
2026-03-18	The Unreasonable Effectiveness of Text Embedding Interpolation for Continuous Image Steering	Yigit Ekin et.al.	2603.17998	null
2026-03-18	Versatile Editing of Video Content, Actions, and Dynamics without Training	Vladimir Kulikov et.al.	2603.17989	null
2026-03-18	AHOY! Animatable Humans under Occlusion from YouTube Videos with Gaussian Splatting and Video Diffusion Priors	Aymen Mir et.al.	2603.17975	null
2026-03-18	Identity as Presence: Towards Appearance and Voice Personalized Joint Audio-Video Generation	Yingjie Chen et.al.	2603.17889	null
2026-03-18	Steering Video Diffusion Transformers with Massive Activations	Xianhang Cheng et.al.	2603.17825	null
2026-03-18	ChopGrad: Pixel-Wise Losses for Latent Video Diffusion via Truncated Backpropagation	Dmitriy Rivkin et.al.	2603.17812	null
2026-03-18	EVA: Aligning Video World Models with Executable Robot Actions via Inverse Dynamics Rewards	Ruixiang Wang et.al.	2603.17808	null
2026-03-18	TAPESTRY: From Geometry to Appearance via Consistent Turntable Videos	Yan Zeng et.al.	2603.17735	null
2026-03-18	Learning Transferable Temporal Primitives for Video Reasoning via Synthetic Videos	Songtao Jiang et.al.	2603.17693	null
2026-03-18	FrescoDiffusion: 4K Image-to-Video with Prior-Regularized Tiled Diffusion	Hugo Caselles-Dupré et.al.	2603.17555	null
2026-03-18	ProGVC: Progressive-based Generative Video Compression via Auto-Regressive Context Modeling	Daowen Li et.al.	2603.17546	null
2026-03-18	AR-CoPO: Align Autoregressive Video Generation with Contrastive Policy Optimization	Dailan He et.al.	2603.17461	null
2026-03-18	SHIFT: Motion Alignment in Video Diffusion Models with Adversarial Hybrid Fine-Tuning	Xi Ye et.al.	2603.17426	null
2026-03-18	Motion-Adaptive Temporal Attention for Lightweight Video Generation with Stable Diffusion	Rui Hong et.al.	2603.17398	null
2026-03-18	Stereo World Model: Camera-Guided Stereo Video Generation	Yang-Tian Sun et.al.	2603.17375	null
2026-03-17	WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation	Jisu Nam et.al.	2603.16871	null
2026-03-17	Demystifing Video Reasoning	Ruisi Wang et.al.	2603.16870	null
2026-03-17	DreamPlan: Efficient Reinforcement Fine-Tuning of Vision-Language Planners via Video World Models	Emily Yue-Ting Jia et.al.	2603.16860	null
2026-03-17	World Reconstruction From Inconsistent Views	Lukas Höllein et.al.	2603.16736	null
2026-03-17	Search2Motion: Training-Free Object-Level Motion Control via Attention-Consensus Search	Sainan Liu et.al.	2603.16711	null
2026-03-17	Kinema4D: Kinematic 4D World Modeling for Spatiotemporal Embodied Simulation	Mutian Xu et.al.	2603.16669	null
2026-03-17	VideoMatGen: PBR Materials through Joint Generative Modeling	Jon Hasselgren et.al.	2603.16566	null
2026-03-17	VIGOR: VIdeo Geometry-Oriented Reward for Temporal Generative Alignment	Tengjiao Yin et.al.	2603.16271	null
2026-03-17	S-VAM: Shortcut Video-Action Model by Self-Distilling Geometric and Semantic Foresight	Haodong Yan et.al.	2603.16195	null
2026-03-17	Diffusion Models for Joint Audio-Video Generation	Alejandro Paredes La Torre et.al.	2603.16093	null
2026-03-16	Tri-Prompting: Video Diffusion with Unified Control over Scene, Subject, and Motion	Zhenghong Zhou et.al.	2603.15614	null
2026-03-16	Grounding World Simulation Models in a Real-World Metropolis	Junyoung Seo et.al.	2603.15583	null
2026-03-16	iDaVIE v1.0: A virtual reality tool for interactive analysis of astronomical data cubes	Alexander Sivitilli et.al.	2603.15490	null
2026-03-16	ViFeEdit: A Video-Free Tuner of Your Video Diffusion Transformer	Ruonan Yu et.al.	2603.15478	null
2026-03-16	AnyCrowd: Instance-Isolated Identity-Pose Binding for Arbitrary Multi-Character Animation	Zhenyu Xie et.al.	2603.15415	null
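
The rows above can be read programmatically. The following is a minimal sketch, assuming each row is tab-separated in the column order Publish Date, Title, Authors, PDF, Code, and that `null` in the Code column means no code link is listed. The names `PaperRow`, `parse_row`, `abs_url`, and `pdf_url` are hypothetical helpers, not part of this repository.

```python
from typing import NamedTuple, Optional


class PaperRow(NamedTuple):
    publish_date: str
    title: str
    authors: str
    arxiv_id: str
    code: Optional[str]  # None when the table shows "null"


def parse_row(line: str) -> PaperRow:
    """Split one tab-separated table row into its five columns."""
    date, title, authors, arxiv_id, code = line.rstrip("\n").split("\t")
    return PaperRow(date, title, authors, arxiv_id, None if code == "null" else code)


def abs_url(row: PaperRow) -> str:
    """Abstract page for the arXiv identifier in the PDF column."""
    return f"https://arxiv.org/abs/{row.arxiv_id}"


def pdf_url(row: PaperRow) -> str:
    """Direct PDF link for the same identifier."""
    return f"https://arxiv.org/pdf/{row.arxiv_id}"


if __name__ == "__main__":
    sample = (
        "2026-04-02\tVOID: Video Object and Interaction Deletion\t"
        "Saman Motamed et.al.\t2604.02296\tnull"
    )
    row = parse_row(sample)
    print(row.title, abs_url(row), pdf_url(row), row.code)
```

A row with more or fewer than five tab-separated fields raises `ValueError` in this sketch, which makes malformed rows easy to spot.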

(<a href=#updated-on-20260404>back to top</a>)

Image-to-Video

Publish Date	Title	Authors	PDF	Code
2026-04-02	DenOiS: Dual-Domain Denoising of Observation and Solution in Ultrasound Image Reconstruction	Can Deniz Bezek et.al.	2604.02105	null
2026-04-02	Control-DINO: Feature Space Conditioning for Controllable Image-to-Video Diffusion	Edoardo A. Dominici et.al.	2604.01761	null
2026-04-02	ZEUS: Accelerating Diffusion Models with Only Second-Order Predictor	Yixiao Wang et.al.	2604.01552	null
2026-04-01	AffordTissue: Dense Affordance Prediction for Tool-Action Specific Tissue Interaction	Aiza Maksutova et.al.	2604.01371	null
2026-04-01	OkanNet: A Lightweight Deep Learning Architecture for Classification of Brain Tumor from MRI Images	Okan Uçar et.al.	2604.01264	null
2026-04-01	Simulating Realistic LiDAR Data Under Adverse Weather for Autonomous Vehicles: A Physics-Informed Learning Approach	Vivek Anand et.al.	2604.01254	null
2026-04-01	Camouflage-aware Image-Text Retrieval via Expert Collaboration	Yao Jiang et.al.	2604.01251	null
2026-04-01	AdaLoRA-QAT: Adaptive Low-Rank and Quantization-Aware Segmentation	Prantik Deb et.al.	2604.01167	null
2026-04-01	Looking into a Pixel by Nonlinear Unmixing – A Generative Approach	Maofeng Tang et.al.	2604.01141	null
2026-04-01	VRUD: A Drone Dataset for Complex Vehicle-VRU Interactions within Mixed Traffic	Ziyu Wang et.al.	2604.01134	null
2026-04-01	Region-Adaptive Generative Compression with Spatially Varying Diffusion Models	Lucas Relic et.al.	2604.01122	null
2026-04-01	ProOOD: Prototype-Guided Out-of-Distribution 3D Occupancy Prediction	Yuheng Zhang et.al.	2604.01081	null
2026-04-01	IWP: Token Pruning as Implicit Weight Pruning in Large Vision Language Models	Dong-Jae Lee et.al.	2604.00757	null
2026-03-31	Collaborative AI Agents and Critics for Fault Detection and Cause Analysis in Network Telemetry	Syed Eqbal Alam et.al.	2604.00319	null
2026-03-31	Prompt-Guided Prefiltering for VLM Image Compression	Bardia Azizian et.al.	2604.00314	null
2026-03-31	Feature-level Site Leakage Reduction for Cross-Hospital Chest X-ray Transfer via Self-Supervised Learning	Ayoub Louaye Bouaziz et.al.	2604.00263	null
2026-03-31	Evaluation of neuroCombat and deep learning harmonization for multi-site magnetic resonance neuroimaging in youth with prenatal alcohol exposure	Chloe Scholten et.al.	2604.00251	null
2026-03-31	Harmonization mitigates diffusion MRI scanner effects in infancy: insights from the HEALthy Brain and Childhood Development (HBCD) study	Elyssa M. McMaster et.al.	2604.00246	null
2026-03-31	Pupil Design for Computational Wavefront Estimation	Ali Almuallem et.al.	2604.00225	null
2026-03-31	Brain MR Image Synthesis with Multi-contrast Self-attention GAN	Zaid A. Abod et.al.	2604.00070	null
2026-03-31	OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation	Yuheng Liu et.al.	2603.30045	null
2026-03-31	Polyhedral Unmixing: Bridging Semantic Segmentation with Hyperspectral Unmixing via Polyhedral-Cone Partitioning	Antoine Bottenmuller et.al.	2603.29438	null
2026-03-31	Rich-U-Net: A medical image segmentation model for fusing spatial depth features and capturing minute structural details	Zhuoyi Fang et.al.	2603.29404	null
2026-03-31	Retinal Malady Classification using AI: A novel ViT-SVM combination architecture	Shashwat Jha et.al.	2603.29181	null
2026-03-30	The Surprising Effectiveness of Noise Pretraining for Implicit Neural Representations	Kushal Vyas et.al.	2603.29034	null
2026-03-30	End-to-end optimization of sparse ultrasound linear probes	Sergio Urrea et.al.	2603.29014	null
2026-03-30	Hybrid Quantum-Classical AI for Industrial Defect Classification in Welding Images	Akshaya Srinivasan et.al.	2603.28995	null
2026-03-30	Learning a dynamic four-chamber shape model of the human heart for 95,695 UK Biobank participants	Qiang Ma et.al.	2603.28711	null
2026-03-30	MRI-to-CT synthesis using drifting models	Qing Lyu et.al.	2603.28498	null
2026-03-30	Video Generation Models as World Models: Efficient Paradigms, Architectures and Algorithms	Muyang He et.al.	2603.28489	null
2026-03-30	Deep Learning Based Site-Specific Channel Inference Using Satellite Images	Junzhe Song et.al.	2603.28083	null
2026-03-30	MolmoPoint: Better Pointing for VLMs with Grounding Tokens	Christopher Clark et.al.	2603.28069	null
2026-03-30	Physics-Embedded Feature Learning for AI in Medical Imaging	Pulock Das et.al.	2603.28057	null
2026-03-29	Towards Emotion Recognition with 3D Pointclouds Obtained from Facial Expression Images	Laura Rayón Ropero et.al.	2603.27798	null
2026-03-28	Guided Lensless Polarization Imaging	Noa Kraicer et.al.	2603.27357	null
2026-03-28	DeepBayesFlow: A Bayesian Structured Variational Framework for Generalizable Prostate Segmentation via Expressive Posteriors and SDE-Girsanov Uncertainty Modeling	Zhuoyi Fang et.al.	2603.27263	null
2026-03-28	MD-RWKV-UNet: Scale-Aware Anatomical Encoding with Cross-Stage Fusion for Multi-Organ Segmentation	Zhuoyi Fang et.al.	2603.27261	null
2026-03-28	Quantitative measurements of biological/chemical concentrations using smartphone cameras	Zhendong Cao et.al.	2603.27118	null
2026-03-27	On-Device Super Resolution Imaging Using Low-Cost SPAD Array and Embedded Lightweight Deep Learning	Zhenya Zang et.al.	2603.27018	null
2026-03-27	Make Geometry Matter for Spatial Reasoning	Shihua Zhang et.al.	2603.26639	null
2026-03-27	Think over Trajectories: Leveraging Video Generation to Reconstruct GPS Trajectories from Cellular Signaling	Ruixing Zhang et.al.	2603.26610	null
2026-03-27	From Static to Dynamic: Exploring Self-supervised Image-to-Video Representation Transfer Learning	Yang Liu et.al.	2603.26597	null
2026-03-27	Generation Is Compression: Zero-Shot Video Coding via Stochastic Rectified Flow	Ziyue Zeng et.al.	2603.26571	null
2026-03-26	TRACE: Object Motion Editing in Videos with First-Frame Trajectory Guidance	Quynh Phung et.al.	2603.25707	null
2026-03-26	Colon-Bench: An Agentic Workflow for Scalable Dense Lesion Annotation in Full-Procedure Colonoscopy Videos	Abdullah Hamdi et.al.	2603.25645	null
2026-03-26	A Mamba-based Perceptual Loss Function for Learning-based UGC Transcoding	Zihao Qi et.al.	2603.25566	null
2026-03-26	Challenges in Hyperspectral Imaging for Autonomous Driving: The HSI-Drive Case	Koldo Basterretxea et.al.	2603.25510	null
2026-03-26	Language-Free Generative Editing from One Visual Example	Omar Elezabi et.al.	2603.25441	null
2026-03-26	PMT: Plain Mask Transformer for Image and Video Segmentation with Frozen Vision Encoders	Niccolò Cavagnero et.al.	2603.25398	null
2026-03-26	Underdetermined Blind Source Separation via Weighted Simplex Shrinkage Regularization and Quantum Deep Image Prior	Chia-Hsiang Lin et.al.	2603.25384	null
2026-03-26	Image Rotation Angle Estimation: Comparing Circular-Aware Methods	Maximilian Woehrer et.al.	2603.25351	null
2026-03-26	Pixelis: Reasoning in Pixels, from Seeing to Acting	Yunpeng Zhou et.al.	2603.25091	null
2026-03-26	MoE-GRPO: Optimizing Mixture-of-Experts via Reinforcement Learning in Vision-Language Models	Dohwan Ko et.al.	2603.24984	null
2026-03-26	Subject-Specific Low-Field MRI Synthesis via a Neural Operator	Ziqi Gao et.al.	2603.24968	null
2026-03-25	OpenCap Monocular: 3D Human Kinematics and Musculoskeletal Dynamics from a Single Smartphone Video	Selim Gilon et.al.	2603.24733	null
2026-03-25	Vision-Language Models vs Human: Perceptual Image Quality Assessment	Imran Mehmood et.al.	2603.24578	null
2026-03-25	Anti-I2V: Safeguarding your photos from malicious image-to-video generation	Duc Vu et.al.	2603.24570	null
2026-03-25	OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning	Kaihang Pan et.al.	2603.24458	null
2026-03-25	Modeling Spatiotemporal Neural Frames for High Resolution Brain Dynamic	Wanying Qu et.al.	2603.24176	null
2026-03-25	Comparative analysis of dual-form networks for live land monitoring using multi-modal satellite image time series	Iris Dumeur et.al.	2603.24109	null
2026-03-25	Blind Quality Enhancement for G-PCC Compressed Dynamic Point Clouds	Tian Guo et.al.	2603.24026	null
2026-03-25	MonoSIM: An open source SIL framework for Ackermann Vehicular Systems with Monocular Vision	Shantanu Rahman et.al.	2603.23965	null
2026-03-25	Leave No Stone Unturned: Uncovering Holistic Audio-Visual Intrinsic Coherence for Deepfake Detection	Jielun Peng et.al.	2603.23960	null
2026-03-25	Attention-aware Inference Optimizations for Large Vision-Language Models with Memory-efficient Decoding	Fatih Ilhan et.al.	2603.23914	null
2026-03-25	Joint Source-Channel-Check Coding with HARQ for Reliable Semantic Communications	Boyuan Li et.al.	2603.23869	null
2026-03-24	Sentinel-2 for Crop Yield Estimation: A Systematic Review	Mohammadreza Narimani et.al.	2603.23779	null
2026-03-24	Foveated Diffusion: Efficient Spatially Adaptive Image and Video Generation	Brian Chao et.al.	2603.23491	null
2026-03-24	Harnessing Lightweight Transformer with Contextual Synergic Enhancement for Efficient 3D Medical Image Segmentation	Xinyu Liu et.al.	2603.23390	null
2026-03-24	GO-Renderer: Generative Object Rendering with 3D-aware Controllable Video Diffusion Models	Zekai Gu et.al.	2603.23246	null
2026-03-24	Rigid Motion Estimation using Accelerated Iterative Coordinate Descent (REACT) for MR Imaging	Kwang Eun Jang et.al.	2603.23096	null
2026-03-24	WorldMesh: Generating Navigable Multi-Room 3D Scenes via Mesh-Conditioned Image Diffusion	Manuel-Andreas Schneider et.al.	2603.22972	null
2026-03-24	Retrieval-Guided Photovoltaic Inventory Estimation from Satellite Imagery for Distribution Grid Planning	Muhao Guo et.al.	2603.22856	null
2026-03-24	L-UNet: An LSTM Network for Remote Sensing Image Change Detection	Shuting Sun et.al.	2603.22842	null
2026-03-24	Viewport-based Neural 360° Image Compression	Jingwei Liao et.al.	2603.22776	null
2026-03-23	Drop-In Perceptual Optimization for 3D Gaussian Splatting	Ezgi Ozyilkan et.al.	2603.23297	null
2026-03-23	Single-Subject Multi-View MRI Super-Resolution via Implicit Neural Representations	Heejong Kim et.al.	2603.22627	null
2026-03-23	Far-field compressive ultrasound beamforming	Nikunj Khetan et.al.	2603.22496	null
2026-03-23	P-Flow: Prompting Visual Effects Generation	Rui Zhao et.al.	2603.22091	null
2026-03-23	A Latent Representation Learning Framework for Hyperspectral Image Emulation in Remote Sensing	Chedly Ben Azizi et.al.	2603.21911	null
2026-03-23	HMS-VesselNet: Hierarchical Multi-Scale Attention Network with Topology-Preserving Loss for Retinal Vessel Segmentation	Amarnath R et.al.	2603.21891	null
2026-03-23	The Universal Normal Embedding	Chen Tasker et.al.	2603.21786	null
2026-03-23	Cycle Inverse-Consistent TransMorph: A Balanced Deep Learning Framework for Brain MRI Registration	Jiaqi Shang et.al.	2603.21760	null
2026-03-23	Unregistered Spectral Image Fusion: Unmixing, Adversarial Learning, and Recoverability	Jiahui Song et.al.	2603.21510	null
2026-03-22	OrbitStream: Training-Free Adaptive 360-degree Video Streaming via Semantic Potential Fields	Aizierjiang Aiersilan et.al.	2603.20999	null
2026-03-21	Underwater imaging without color distortions requires RAW capture	Derya Akkaynak et.al.	2603.20823	null
2026-03-21	mmWave-Diffusion: A Novel Framework for Respiration Sensing Using Observation-Anchored Conditional Diffusion Model	Yong Wang et.al.	2603.20700	null
2026-03-21	Seed1.8 Model Card: Towards Generalized Real-World Agency	Bytedance Seed et.al.	2603.20633	null
2026-03-20	Thermal is Always Wild: Characterizing and Addressing Challenges in Thermal-Only Novel View Synthesis	M. Kerem Aydin et.al.	2603.20448	null
2026-03-20	CaroTo: A Tool for Fast Comprehensive Analysis of Carotid Artery Stenosis in 4D PC- and 3D BB-MRI Data	Hinrich Rahlfs et.al.	2603.20355	null
2026-03-20	A Unified Platform and Quality Assurance Framework for 3D Ultrasound Reconstruction with Robotic, Optical, and Electromagnetic Tracking	Lewis Howell et.al.	2603.20077	null
2026-03-20	Investigating a Policy-Based Formulation for Endoscopic Camera Pose Recovery	Jan Emily Mangulabnan et.al.	2603.20045	null
2026-03-20	Goal-Oriented Framework for Optical Flow-based Multi-User Multi-Task Video Transmission	Yujie Xu et.al.	2603.19995	null
2026-03-20	Evaluating Test-Time Adaptation For Facial Expression Recognition Under Natural Cross-Dataset Distribution Shifts	John Turnbull et.al.	2603.19994	null
2026-03-20	ReconMIL: Synergizing Latent Space Reconstruction with Bi-Stream Mamba for Whole Slide Image Analysis	Lubin Gan et.al.	2603.19925	null
2026-03-20	Offshore oil and gas platform dynamics in the North Sea, Gulf of Mexico, and Persian Gulf: Exploiting the Sentinel-1 archive	Robin Spanier et.al.	2603.19801	null
2026-03-19	TuLaBM: Tumor-Biased Latent Bridge Matching for Contrast-Enhanced MRI Synthesis	Atharva Rege et.al.	2603.19386	null
2026-03-19	Spectrally-Guided Diffusion Noise Schedules	Carlos Esteves et.al.	2603.19222	null
2026-03-19	GenMFSR: Generative Multi-Frame Image Restoration and Super-Resolution	Harshana Weligampola et.al.	2603.19187	null
2026-03-19	Student views in AI Ethics and Social Impact	Tudor-Dan Mihoc et.al.	2603.18827	null
2026-03-19	A Hybrid Physical–Digital Framework for Annotated Fracture Reduction Data Evaluated using Clinically Relevant 3D metrics	Basile Longo et.al.	2603.18723	null
2026-03-19	UEPS: Robust and Efficient MRI Reconstruction	Xiang Zhou et.al.	2603.18572	null
2026-03-19	SCISSR: Scribble-Conditioned Interactive Surgical Segmentation and Refinement	Haonan Ping et.al.	2603.18544	null
2026-03-19	TransText: Alpha-as-RGB Representation for Transparent Text Animation	Fei Zhang et.al.	2603.17944	null
2026-03-18	Energy-Aware Frame Rate Selection for Video Coding	Geetha Ramasubbu et.al.	2603.18305	null
2026-03-18	Understanding Task Aggregation for Generalizable Ultrasound Foundation Models	Fangyijie Wang et.al.	2603.18123	null
2026-03-18	Dual Agreement Consistency Learning with Foundation Models for Semi-Supervised Fetal Heart Ultrasound Segmentation and Diagnosis	Fangyijie Wang et.al.	2603.18119	null
2026-03-18	Insight-V++: Towards Advanced Long-Chain Visual Reasoning with Multimodal Large Language Models	Yuhao Dong et.al.	2603.18118	null
2026-03-18	The Unreasonable Effectiveness of Text Embedding Interpolation for Continuous Image Steering	Yigit Ekin et.al.	2603.17998	null
2026-03-18	Video Understanding: From Geometry and Semantics to Unified Models	Zhaochong An et.al.	2603.17840	null
2026-03-18	Cache-enabled Generative Joint Source-Channel Coding for Evolving Semantic Communications	Shunpu Tang et.al.	2603.17702	null
2026-03-18	Learning Transferable Temporal Primitives for Video Reasoning via Synthetic Videos	Songtao Jiang et.al.	2603.17693	null
2026-03-18	Few-Step Diffusion Sampling Through Instance-Aware Discretizations	Liangyu Yuan et.al.	2603.17671	null
2026-03-18	FrescoDiffusion: 4K Image-to-Video with Prior-Regularized Tiled Diffusion	Hugo Caselles-Dupré et.al.	2603.17555	null
2026-03-18	Deep Learning-Based Airway Segmentation in Systemic Lupus Erythematosus Patients with Interstitial Lung Disease (SLE-ILD): A Comparative High-Resolution CT Analysis	Sirong Piao et.al.	2603.17547	null
2026-03-18	SHIFT: Motion Alignment in Video Diffusion Models with Adversarial Hybrid Fine-Tuning	Xi Ye et.al.	2603.17426	null
2026-03-18	Structured SIR: Efficient and Expressive Importance-Weighted Inference for High-Dimensional Image Registration	Ivor J. A. Simpson et.al.	2603.17415	null
2026-03-18	A 3D Reconstruction Benchmark for Asset Inspection	James L. Gray et.al.	2603.17358	null
2026-03-17	A Lensless Polarization Camera	Noa Kraicer et.al.	2603.17156	null
2026-03-17	Topology-Preserving Deep Joint Source-Channel Coding for Semantic Communication	Omar Erak et.al.	2603.17126	null
2026-03-17	Surg$\Sigma$: A Spectrum of Large-Scale Multimodal Data and Foundation Models for Surgical Intelligence	Zhitao Zeng et.al.	2603.16822	null
2026-03-17	Preserving Vertical Structure in 3D-to-2D Projection for Permafrost Thaw Mapping	Justin McMillen et.al.	2603.16788	null
2026-03-17	Search2Motion: Training-Free Object-Level Motion Control via Attention-Consensus Search	Sainan Liu et.al.	2603.16711	null
2026-03-17	vAccSOL: Efficient and Transparent AI Vision Offloading for Mobile Robots	Adam Zahir et.al.	2603.16685	null
2026-03-17	HistoAtlas: A Pan-Cancer Morphology Atlas Linking Histomics to Molecular Programs and Clinical Outcomes	Pierre-Antoine Bannier et.al.	2603.16587	null
2026-03-17	Fanar 2.0: Arabic Generative AI Stack	FANAR TEAM et.al.	2603.16397	null
2026-03-17	The Era of End-to-End Autonomy: Transitioning from Rule-Based Driving to Large Driving Models	Eduardo Nebot et.al.	2603.16050	null
2026-03-17	Clinical Priors Guided Lung Disease Detection in 3D CT Scans	Kejin Lu et.al.	2603.15143	null
2026-03-16	FlatLands: Generative Floormap Completion From a Single Egocentric View	Subhransu S. Bhattacharjee et.al.	2603.16016	null
2026-03-16	Standardizing Medical Images at Scale for AI	Callen MacPhee et.al.	2603.15980	null
2026-03-16	GLANCE: Gaze-Led Attention Network for Compressed Edge-inference	Neeraj Solanki et.al.	2603.15717	null
2026-03-16	ViFeEdit: A Video-Free Tuner of Your Video Diffusion Transformer	Ruonan Yu et.al.	2603.15478	null
2026-03-16	Seeing Beyond: Extrapolative Domain Adaptive Panoramic Segmentation	Yuanfan Zheng et.al.	2603.15475	null
2026-03-16	Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling	Aram Davtyan et.al.	2603.15279	null
2026-03-16	CATFormer: When Continual Learning Meets Spiking Transformers With Dynamic Thresholds	Vaishnavi Nagabhushana et.al.	2603.15184	null

(<a href=#updated-on-20260404>back to top</a>)

Video Editing

Publish Date	Title	Authors	PDF	Code
2026-04-02	VOID: Video Object and Interaction Deletion	Saman Motamed et.al.	2604.02296	null
2026-03-31	CutClaw: Agentic Hours-Long Video Editing via Music Synchronization	Shifang Zhao et.al.	2603.29664	null
2026-03-31	TrajectoryMover: Generative Movement of Object Trajectories in Videos	Kiran Chhatre et.al.	2603.29092	null
2026-03-31	X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving	Chaoda Zheng et.al.	2603.19979	null
2026-03-30	AutoCut: End-to-end advertisement video editing based on multimodal discretization and controllable generation	Milton Zhou et.al.	2603.28366	null
2026-03-26	TRACE: Object Motion Editing in Videos with First-Frame Trajectory Guidance	Quynh Phung et.al.	2603.25707	null
2026-03-25	AVControl: Efficient Framework for Training Audio-Visual Controls	Matan Ben-Yosef et.al.	2603.24793	null
2026-03-25	Accelerating Diffusion-based Video Editing via Heterogeneous Caching: Beyond Full Computing at Sampled Denoising Timestep	Tianyi Liu et.al.	2603.24260	null
2026-03-24	RealMaster: Lifting Rendered Scenes into Photorealistic Video	Dana Cohen-Bar et.al.	2603.23462	null
2026-03-20	PerformRecast: Expression and Head Pose Disentanglement for Portrait Video Editing	Jiadong Liang et.al.	2603.19731	null
2026-03-19	SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Editing	Xinyao Zhang et.al.	2603.19228	null
2026-03-19	EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing	Yang Fu et.al.	2603.19224	null
2026-03-18	Versatile Editing of Video Content, Actions, and Dynamics without Training	Vladimir Kulikov et.al.	2603.17989	null
2026-03-18	ChopGrad: Pixel-Wise Losses for Latent Video Diffusion via Truncated Backpropagation	Dmitriy Rivkin et.al.	2603.17812	null
2026-03-18	SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model	Guibin Chen et.al.	2602.21818	null
2026-03-17	SparkVSR: Interactive Video Super-Resolution via Sparse Keyframe Propagation	Jiongze Yu et.al.	2603.16864	null
2026-03-14	Script-to-Slide Grounding: Grounding Script Sentences to Slide Objects for Automatic Instructional Video Generation	Rena Suzuki et.al.	2603.16931	null
2026-03-13	GA-Drive: Geometry-Appearance Decoupled Modeling for Free-viewpoint Driving Scene Generation	Hao Zhang et.al.	2602.20673	null
2026-03-10	When to Lock Attention: Training-Free KV Control in Video Diffusion	Tianyi Zeng et.al.	2603.09657	null
2026-03-10	From Ideal to Real: Stable Video Object Removal under Imperfect Conditions	Jiagao Hu et.al.	2603.09283	null
2026-03-06	Place-it-R1: Unlocking Environment-aware Reasoning Potential of MLLM for Video Object Insertion	Bohai Gu et.al.	2603.06140	null
2026-03-06	GenHOI: Towards Object-Consistent Hand-Object Interaction with Temporally Balanced and Spatially Selective Object Injection	Xuan Huang et.al.	2603.06048	null
2026-03-06	Training-free Latent Inter-Frame Pruning with Attention Recovery	Dennis Menn et.al.	2603.05811	null
2026-03-06	Kiwi-Edit: Versatile Video Editing via Instruction and Reference Guidance	Yiqi Lin et.al.	2603.02175	null
2026-03-06	UniVBench: Towards Unified Evaluation for Video Foundation Models	Jianhui Wei et.al.	2602.21835	null
2026-03-03	NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing	Tianlin Pan et.al.	2603.02802	null
2026-03-01	FREE-Edit: Using Editing-aware Injection in Rectified Flow Models for Zero-shot Image-Driven Video Editing	Maomao Li et.al.	2603.01164	null
2026-02-25	StoryComposerAI: Supporting Human-AI Story Co-Creation Through Decomposition and Linking	Shuo Niu et.al.	2602.21486	null
2026-02-24	PropFly: Learning to Propagate via On-the-Fly Supervision from Pre-trained Video Diffusion Models	Wonyong Seo et.al.	2602.20583	null
2026-02-16	EditCtrl: Disentangled Local and Global Control for Real-Time Generative Video Editing	Yehonathan Litman et.al.	2602.15031	null

(<a href=#updated-on-20260404>back to top</a>)

Diffusion Models

Publish Date	Title	Authors	PDF	Code
2026-04-02	ActionParty: Multi-Subject Action Binding in Generative Video Games	Alexander Pondaven et.al.	2604.02330	null
2026-04-02	VOID: Video Object and Interaction Deletion	Saman Motamed et.al.	2604.02296	null
2026-04-02	Smoothing the Landscape: Causal Structure Learning via Diffusion Denoising Objectives	Hao Zhu et.al.	2604.02250	null
2026-04-02	Reflection Generation for Composite Image Using Diffusion Model	Haonan Zhao et.al.	2604.02168	null
2026-04-02	Why Gaussian Diffusion Models Fail on Discrete Data?	Alexander Shabalin et.al.	2604.02028	null
2026-04-02	Multiphase cross-diffusion models for tissue structures: modeling, analysis, numerics	Ansgar Jüngel et.al.	2604.01827	null
2026-04-02	SafeRoPE: Risk-specific Head-wise Embedding Rotation for Safe Generation in Rectified Flow Transformers	Xiang Yang et.al.	2604.01826	null
2026-04-02	Control-DINO: Feature Space Conditioning for Controllable Image-to-Video Diffusion	Edoardo A. Dominici et.al.	2604.01761	null
2026-04-02	SteerFlow: Steering Rectified Flows for Faithful Inversion-Based Image Editing	Thinh Dao et.al.	2604.01715	null
2026-04-02	Bias mitigation in graph diffusion models	Meng Yu et.al.	2604.01709	null
2026-04-02	Can Video Diffusion Models Predict Past Frames? Bidirectional Cycle Consistency for Reversible Interpolation	Lingyu Liu et.al.	2604.01700	null
2026-04-02	From Understanding to Erasing: Towards Complete and Stable Video Object Removal	Dingming Liu et.al.	2604.01693	null
2026-04-02	DynaVid: Learning to Generate Highly Dynamic Videos using Synthetic Motion Data	Wonjoon Jin et.al.	2604.01666	null
2026-04-02	Diffusion-Guided Adversarial Perturbation Injection for Generalizable Defense Against Facial Manipulations	Yue Li et.al.	2604.01635	null
2026-04-02	Cross-Domain Vessel Segmentation via Latent Similarity Mining and Iterative Co-Optimization	Zhanqiang Guo et.al.	2604.01553	null
2026-04-01	Learning and Generating Mixed States Prepared by Shallow Channel Circuits	Fangjun Hu et.al.	2604.01197	null
2026-04-01	ReinDriveGen: Reinforcement Post-Training for Out-of-Distribution Driving Scene Generation	Hao Zhang et.al.	2604.01129	null
2026-04-01	Region-Adaptive Generative Compression with Spatially Varying Diffusion Models	Lucas Relic et.al.	2604.01122	null
2026-04-01	Diff-VS: Efficient Audio-Aware Diffusion U-Net for Vocals Separation	Yun-Ning et.al.	2604.01120	null
2026-04-01	Inverse Design of Optical Multilayer Thin Films using Robust Masked Diffusion Models	Jonas Schaible et.al.	2604.01106	null
2026-04-01	PHASOR: Anatomy- and Phase-Consistent Volumetric Diffusion for CT Virtual Contrast Enhancement	Zilong Li et.al.	2604.01053	null
2026-04-01	EmoScene: A Dual-space Dataset for Controllable Affective Image Generation	Li He et.al.	2604.00933	null
2026-04-01	IDDM: Identity-Decoupled Personalized Diffusion Models with a Tunable Privacy-Utility Trade-off	Linyan Dai et.al.	2604.00903	null
2026-04-01	HICT: High-precision 3D CBCT reconstruction from a single X-ray	Wen Ma et.al.	2604.00792	null
2026-04-01	Learnability-Guided Diffusion for Dataset Distillation	Jeffrey A. Chan-Santiago et.al.	2604.00519	null
2026-04-01	Tucker Diffusion Model for High-dimensional Tensor Generation	Jianhua Guo et.al.	2604.00481	null
2026-04-01	Learning Humanoid Navigation from Human Data	Weizhuo Wang et.al.	2604.00416	null
2026-04-01	Deep Networks Favor Simple Data	Weyl Lu et.al.	2604.00394	null
2026-04-01	Behavioral Score Diffusion: Model-Free Trajectory Planning via Kernel-Based Score Estimation from Data	Shihao Li et.al.	2604.00391	null
2026-04-01	mmAnomaly: Leveraging Visual Context for Robust Anomaly Detection in the Non-Visual World with mmWave Radar	Tarik Reza Toha et.al.	2604.00382	null
2026-03-31	Video Models Reason Early: Exploiting Plan Commitment for Maze Solving	Kaleb Newman et.al.	2603.30043	null
2026-03-31	Conditional Diffusion-Based Point Cloud Imaging for UAV Position and Attitude Sensing	Xinhong Dai et.al.	2603.29822	null
2026-03-31	Emotion Diffusion Classifier with Adaptive Margin Discrepancy Training for Facial Expression Recognition	Rongkang Dong et.al.	2603.29578	null
2026-03-31	Total Variation Guarantees for Sampling with Stochastic Localization	Jakob Kellermann et.al.	2603.29555	null
2026-03-31	iPoster: Content-Aware Layout Generation for Interactive Poster Design via Graph-Enhanced Diffusion Models	Xudong Zhou et.al.	2603.29469	null
2026-03-31	NeoNet: An End-to-End 3D MRI-Based Deep Learning Framework for Non-Invasive Prediction of Perineural Invasion via Generation-Driven Classification	Youngung Han et.al.	2603.29449	null
2026-03-31	Ultra-short-term volatility surfaces	Federico M. Bandi et.al.	2603.29430	null
2026-03-31	Multi-AUV Cooperative Target Tracking Based on Supervised Diffusion-Aided Multi-Agent Reinforcement Learning	Jiaao Ma et.al.	2603.29426	null
2026-03-31	Pathogen diversity emerging from coevolutionary dynamics in interconnected systems	Davide Zanchetta et.al.	2603.29398	null
2026-03-31	CIPHER: Counterfeit Image Pattern High-level Examination via Representation	Kyeonghun Kim et.al.	2603.29356	null
2026-03-31	FOSCU: Feasibility of Synthetic MRI Generation via Duo-Diffusion Models for Enhancement of 3D U-Nets in Hepatic Segmentation	Youngung Han et.al.	2603.29343	null
2026-03-31	Differentiable Normative Guidance for Nash Bargaining Solution Recovery	Moirangthem Tiken Singh et.al.	2603.29297	null
2026-03-31	Diffusion Mental Averages	Phonphrm Thawatdamrongkit et.al.	2603.29239	null
2026-03-30	Generating Humanless Environment Walkthroughs from Egocentric Walking Tour Videos	Yujin Ham et.al.	2603.29036	null
2026-03-30	MMFace-DiT: A Dual-Stream Diffusion Transformer for High-Fidelity Multimodal Face Generation	Bharath Krishnamurthy et.al.	2603.29029	null
2026-03-30	Geometry-aware similarity metrics for neural representations on Riemannian and statistical manifolds	N Alex Cayco Gajic et.al.	2603.28764	null
2026-03-30	PoseDreamer: Scalable and Photorealistic Human Data Generation Pipeline with Diffusion Models	Lorenza Prospero et.al.	2603.28763	null
2026-03-30	On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers	Omer Dahary et.al.	2603.28762	null
2026-03-30	DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing	Kailai Feng et.al.	2603.28713	null
2026-03-30	Front Location for Go or Grow Models of Aerotaxis	Mete Demircigil et.al.	2603.28663	null
2026-03-30	$R_{dm}$: Re-conceptualizing Distribution Matching as a Reward for Diffusion Distillation	Linqian Fan et.al.	2603.28460	null
2026-03-30	Deep Research of Deep Research: From Transformer to Agent, From AI to AI for Science	Yipeng Yu et.al.	2603.28361	null
2026-03-30	Intrinsically ultralow thermal conductivity in all-inorganic superatomic bulk crystals	Mingzhang Yang et.al.	2603.28267	null
2026-03-30	ColorFLUX: A Structure-Color Decoupling Framework for Old Photo Colorization	Bingchen Li et.al.	2603.28162	null
2026-03-30	SVGS: Single-View to 3D Object Editing via Gaussian Splatting	Pengcheng Xue et.al.	2603.28126	null
2026-03-30	Attention Frequency Modulation: Training-Free Spectral Modulation of Diffusion Cross-Attention	Seunghun Oh et.al.	2603.28114	null
2026-03-30	Physics-Embedded Feature Learning for AI in Medical Imaging	Pulock Das et.al.	2603.28057	null
2026-03-30	Self-Organizing Score-based Data Assimilation	Yuma Yamaoka et.al.	2603.28048	null
2026-03-30	From Independent to Correlated Diffusion: Generalized Generative Modeling with Probabilistic Computers	Nihal Sanjay Singh et.al.	2603.27996	null
2026-03-30	Beyond Dataset Distillation: Lossless Dataset Concentration via Diffusion-Assisted Distribution Alignment	Tongfei Liu et.al.	2603.27987	null
2026-03-29	Diversity Matters: Dataset Diversification and Dual-Branch Network for Generalized AI-Generated Image Detection	Nusrat Tasnim et.al.	2603.27800	null
2026-03-29	Heracles: Bridging Precise Tracking and Generative Synthesis for General Humanoid Control	Zelin Tao et.al.	2603.27756	null
2026-03-29	Bridging Schrödinger and Bass: A Semimartingale Optimal Transport Problem with Diffusion Control	Pierre Henry-Labordere et.al.	2603.27712	null
2026-03-29	Gated Condition Injection without Multimodal Attention: Towards Controllable Linear-Attention Transformers	Yuhe Liu et.al.	2603.27666	null
2026-03-26	PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference	Xiaofeng Mao et.al.	2603.25730	null
2026-03-26	S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation	Ligong Han et.al.	2603.25702	null
2026-03-26	Persistent Robot World Models: Stabilizing Multi-Step Rollouts via Reinforcement Learning	Jai Bardhan et.al.	2603.25685	null
2026-03-26	Beyond the Golden Data: Resolving the Motion-Vision Quality Dilemma via Timestep Selective Training	Xiangyang Luo et.al.	2603.25527	null
2026-03-26	Lightweight GenAI for Network Traffic Synthesis: Fidelity, Augmentation, and Classification	Giampaolo Bovenzi et.al.	2603.25507	null
2026-03-26	Temporally Decoupled Diffusion Planning for Autonomous Driving	Xiang Li et.al.	2603.25462	null
2026-03-26	Language-Free Generative Editing from One Visual Example	Omar Elezabi et.al.	2603.25441	null
2026-03-26	Lingshu-Cell: A generative cellular world model for transcriptome modeling toward virtual cells	Han Zhang et.al.	2603.25240	null
2026-03-26	Free-Lunch Long Video Generation via Layer-Adaptive O.O.D Correction	Jiahao Tian et.al.	2603.25209	null
2026-03-26	CardioDiT: Latent Diffusion Transformers for 4D Cardiac MRI Synthesis	Marvin Seyfarth et.al.	2603.25194	null
2026-03-26	VolDiT: Controllable Volumetric Medical Image Synthesis with Diffusion Transformers	Marvin Seyfarth et.al.	2603.25181	null
2026-03-26	Bilingual Text-to-Motion Generation: A New Benchmark and Baselines	Wanjiang Weng et.al.	2603.25178	null
2026-03-26	A Reaction-Advection-Diffusion Model to describe Non-Uniformities in Colorimetric Sensing using Thin Porous Substrates	Kulkarni Namratha et.al.	2603.25124	null
2026-03-26	Learning Explicit Continuous Motion Representation for Dynamic Gaussian Splatting from Monocular Videos	Xuankai Zhang et.al.	2603.25058	null
2026-03-26	BiFM: Bidirectional Flow Matching for Few-Step Image Editing and Generation	Yasong Dai et.al.	2603.24942	null
2026-03-25	Polynomial Speedup in Diffusion Models with the Multilevel Euler-Maruyama Method	Arthur Jacot et.al.	2603.24594	null
2026-03-25	Anti-I2V: Safeguarding your photos from malicious image-to-video generation	Duc Vu et.al.	2603.24570	null
2026-03-25	Reflected diffusion models adapt to low-dimensional data	Asbjørn Holk et.al.	2603.24495	null
2026-03-25	Analysis and numerical simulation of a spatio-temporal Ricker-type model for the control of Aedes aegypti mosquitoes with Sterile Insect Techniques	Oscar Eduardo Escobar-Lasso et.al.	2603.24460	null
2026-03-25	Teacher-Student Diffusion Model for Text-Driven 3D Hand Motion Generation	Ching-Lam Cheng et.al.	2603.24407	null
2026-03-25	ViHOI: Human-Object Interaction Synthesis with Visual Priors	Songjin Cai et.al.	2603.24383	null
2026-03-25	ScrollScape: Unlocking 32K Image Generation With Video Diffusion Priors	Haodong Yu et.al.	2603.24270	null
2026-03-25	LGTM: Training-Free Light-Guided Text-to-Image Diffusion Model via Initial Noise Manipulation	Ryugo Morita et.al.	2603.24086	null
2026-03-25	When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm	Ye Leng et.al.	2603.24079	null
2026-03-25	HAM: A Training-Free Style Transfer Approach via Heterogeneous Attention Modulation for Diffusion Models	Yeqi He et.al.	2603.24043	null
2026-03-25	Lagrangian Relaxation Score-based Generation for Mixed Integer linear Programming	Ruobing Wang et.al.	2603.24033	null
2026-03-25	DepthArb: Training-Free Depth-Arbitrated Generation for Occlusion-Robust Image Synthesis	Hongjin Niu et.al.	2603.23924	null
2026-03-25	Latent Bias Alignment for High-Fidelity Diffusion Inversion in Real-World Image Reconstruction and Manipulation	Weiming Chen et.al.	2603.23903	null
2026-03-25	A simple model for conserved intracellular dynamics exhibits multiscale pattern formation, traveling protein domains and arrested coarsening of lipids in the membrane	Benjamin Winkler et.al.	2603.23856	null
2026-03-25	3D-LLDM: Label-Guided 3D Latent Diffusion Model for Improving High-Resolution Synthetic MR Imaging in Hepatic Structure Segmentation	Kyeonghun Kim et.al.	2603.23845	null
2026-03-24	DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models	Jaewon Min et.al.	2603.23499	null
2026-03-24	Foveated Diffusion: Efficient Spatially Adaptive Image and Video Generation	Brian Chao et.al.	2603.23491	null
2026-03-24	RealMaster: Lifting Rendered Scenes into Photorealistic Video	Dana Cohen-Bar et.al.	2603.23462	null
2026-03-24	Graph Energy Matching: Transport-Aligned Energy-Based Modeling for Graph Generation	Michal Balcerak et.al.	2603.23398	null
2026-03-24	Markov State–Space Modeling and Channel Characterization for DNA-Based Molecular Communication	Ruifeng Zheng et.al.	2603.23394	null
2026-03-24	FG-Portrait: 3D Flow Guided Editable Portrait Animation	Yating Xu et.al.	2603.23381	null
2026-03-24	ViBe: Ultra-High-Resolution Video Synthesis Born from Pure Images	Yunfeng Wu et.al.	2603.23326	null
2026-03-24	Permutation-Symmetrized Diffusion for Unconditional Molecular Generation	Gyeonghoon Ko et.al.	2603.23255	null
2026-03-24	GO-Renderer: Generative Object Rendering with 3D-aware Controllable Video Diffusion Models	Zekai Gu et.al.	2603.23246	null
2026-03-24	AeroScene: Progressive Scene Synthesis for Aerial Robotics	Nghia Vu et.al.	2603.23224	null
2026-03-24	Gimbal360: Differentiable Auto-Leveling for Canonicalized $360^\circ$ Panoramic Image Completion	Yuqin Lu et.al.	2603.23179	null
2026-03-24	Policy-based Tuning of Autoregressive Image Models with Instance- and Distribution-Level Rewards	Orhun Buğra Baran et.al.	2603.23086	null
2026-03-24	Zero-Shot Personalization of Objects via Textual Inversion	Aniket Roy et.al.	2603.23010	null
2026-03-24	Markov-Enforced Discrete Diffusion Model for Digital Semantic Symbol Error Correction	Yoon Huh et.al.	2603.22983	null
2026-03-24	Asymptotic Learning Curves for Diffusion Models with Random Features Score and Manifold Data	Anand Jerry George et.al.	2603.22962	null
2026-03-23	End-to-End Training for Unified Tokenization and Latent Denoising	Shivam Duggal et.al.	2603.22283	null
2026-03-23	Repurposing Geometric Foundation Models for Multi-view Diffusion	Wooseok Jang et.al.	2603.22275	null
2026-03-23	DUO-VSR: Dual-Stream Distillation for One-Step Video Super-Resolution	Zhengyao Lv et.al.	2603.22271	null
2026-03-23	SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation	Sashuai Zhou et.al.	2603.22228	null
2026-03-23	DA-VAE: Plug-in Latent Compression for Diffusion via Detail Alignment	Xin Cai et.al.	2603.22125	null
2026-03-23	DTVI: Dual-Stage Textual and Visual Intervention for Safe Text-to-Image Generation	Binhong Tan et.al.	2603.22041	null
2026-03-23	APEG: Adaptive Physical Layer Authentication with Channel Extrapolation and Generative AI	Xiqi Cheng et.al.	2603.21923	null
2026-03-23	CLEAR: Context-Aware Learning with End-to-End Mask-Free Inference for Adaptive Video Subtitle Removal	Qingdong He et.al.	2603.21901	null
2026-03-23	ADaFuSE: Adaptive Diffusion-generated Image and Text Fusion for Interactive Text-to-Image Retrieval	Zhuocheng Zhang et.al.	2603.21886	null
2026-03-23	Not All Layers Are Created Equal: Adaptive LoRA Ranks for Personalized Image Generation	Donald Shenaj et.al.	2603.21884	null
2026-03-23	Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation	Yuyang You et.al.	2603.21864	null
2026-03-23	Climate Prompting: Generating the Madden-Julian Oscillation using Video Diffusion and Low-Dimensional Conditioning	Sulian Thual et.al.	2603.21856	null
2026-03-23	A hybrid wavelet-based physics-informed neural network for portfolio management	Bahadur Yadav et.al.	2603.21834	null
2026-03-23	Cognitive Agency Surrender: Defending Epistemic Sovereignty via Scaffolded AI Friction	Kuangzhe Xu et.al.	2603.21735	null
2026-03-23	Unimodular Diffusion and Interacting Vacuum Cosmology	Gopal Kashyap et.al.	2603.21675	null
2026-03-23	DiT-Flow: Speech Enhancement Robust to Multiple Distortions based on Flow Matching in Latent Space and Diffusion Transformers	Tianyu Cao et.al.	2603.21608	null
2026-03-23	PROBE: Diagnosing Residual Concept Capacity in Erased Text-to-Video Diffusion Models	Yiwei Xie et.al.	2603.21547	null
2026-03-23	Empirical Evaluation of Link Deletion Methods for Limiting Information Diffusion on Social Media	Shiori Furukawa et.al.	2603.21470	null
2026-03-22	Is the future of AI green? What can innovation diffusion models say about generative AI’s environmental impact?	Robert Viseur et.al.	2603.21419	null
2026-03-22	An InSAR Phase Unwrapping Framework for Large-scale and Complex Events	Yijia Song et.al.	2603.21378	null
2026-03-22	Efficient Coarse-to-Fine Diffusion Models with Time Step Sequence Redistribution	Yu-Shan Tai et.al.	2603.21348	null
2026-03-20	LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation	Jiazheng Xing et.al.	2603.20192	null
2026-03-20	Wildfire Spread Scenarios: Increasing Sample Diversity of Segmentation Diffusion Models with Training-Free Methods	Sebastian Gerard et.al.	2603.20188	null
2026-03-20	Beyond Single Tokens: Distilling Discrete Diffusion Models via Discrete MMD	Emiel Hoogeboom et.al.	2603.20155	null
2026-03-20	How Out-of-Equilibrium Phase Transitions can Seed Pattern Formation in Trained Diffusion Models	Luca Ambrogioni et.al.	2603.20092	null
2026-03-20	Timestep-Aware Block Masking for Efficient Diffusion Model Inference	Haodong He et.al.	2603.19939	null
2026-03-20	A distribution-free lattice Boltzmann method for compartmental reaction-diffusion systems with application to epidemic modelling	Alessandro De Rosis et.al.	2603.19789	null
2026-03-20	Diminishing Returns in Expanding Generative Models and Godel-Tarski-Lob Limits	Angshul Majumdar et.al.	2603.19687	null
2026-03-20	ATHENA: Adaptive Test-Time Steering for Improving Count Fidelity in Diffusion Models	Mohammad Shahab Sepehri et.al.	2603.19676	null
2026-03-20	Making Video Models Adhere to User Intent with Minor Adjustments	Daniel Ajisafe et.al.	2603.19672	null
2026-03-20	OmniDiT: Extending Diffusion Transformer to Omni-VTON Framework	Weixuan Zeng et.al.	2603.19643	null
2026-03-20	On the role of memorization in learned priors for geophysical inverse problems	Ali Siahkoohi et.al.	2603.19629	null
2026-03-20	MagicSeg: Open-World Segmentation Pretraining via Counterfactural Diffusion-Based Auto-Generation	Kaixin Cai et.al.	2603.19575	null
2026-03-20	Accelerating Diffusion Decoders via Multi-Scale Sampling and One-Step Distillation	Chuhan Wang et.al.	2603.19570	null
2026-03-19	TRACE: Trajectory Recovery with State Propagation Diffusion for Urban Mobility	Jinming Wang et.al.	2603.19474	null
2026-03-19	TuLaBM: Tumor-Biased Latent Bridge Matching for Contrast-Enhanced MRI Synthesis	Atharva Rege et.al.	2603.19386	null
2026-03-19	Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding	Xianjin Wu et.al.	2603.19235	null
2026-03-19	Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer	Chenyang Gu et.al.	2603.19227	null
2026-03-19	Spectrally-Guided Diffusion Noise Schedules	Carlos Esteves et.al.	2603.19222	null
2026-03-19	Rethinking Vector Field Learning for Generative Segmentation	Chaoyang Wang et.al.	2603.19218	null
2026-03-19	RPiAE: A Representation-Pivoted Autoencoder Enhancing Both Image Generation and Editing	Yue Gong et.al.	2603.19206	null
2026-03-19	MIDST Challenge at SaTML 2025: Membership Inference over Diffusion-models-based Synthetic Tabular data	Masoumeh Shafieinejad et.al.	2603.19185	null
2026-03-19	ADAPT: Attention Driven Adaptive Prompt Scheduling and InTerpolating Orthogonal Complements for Rare Concepts Generation	Kwanyoung Lee et.al.	2603.19157	null
2026-03-19	D5P4: Partition Determinantal Point Process for Diversity in Parallel Discrete Diffusion Decoding	Jonathan Lys et.al.	2603.19146	null
2026-03-19	Revisiting Autoregressive Models for Generative Image Classification	Ilia Sudakov et.al.	2603.19122	null
2026-03-19	FUMO: Prior-Modulated Diffusion for Single Image Reflection Removal	Telang Xu et.al.	2603.19036	null
2026-03-19	Foundations of Schrödinger Bridges for Generative Modeling	Sophia Tang et.al.	2603.18992	null
2026-03-19	CRAFT: Aligning Diffusion Models with Fine-Tuning Is Easier Than You Think	Zening Sun et.al.	2603.18991	null
2026-03-19	Neural Galerkin Normalizing Flow for Transition Probability Density Functions of Diffusion Models	Riccardo Saporiti et.al.	2603.18907	null
2026-03-19	Translating MRI to PET through Conditional Diffusion Models with Enhanced Pathology Awareness	Yitong Li et.al.	2603.18896	null
2026-03-19	RadioDiff-FS: Physics-Informed Manifold Alignment in Few-Shot Diffusion Models for High-Fidelity Radio Map Construction	Xiucheng Wang et.al.	2603.18865	null
2026-03-18	AHOY! Animatable Humans under Occlusion from YouTube Videos with Gaussian Splatting and Video Diffusion Priors	Aymen Mir et.al.	2603.17975	null
2026-03-18	LaDe: Unified Multi-Layered Graphic Media Generation and Decomposition	Vlad-Constantin Lungu-Stan et.al.	2603.17965	null
2026-03-18	Generative Control as Optimization: Time Unconditional Flow Matching for Adaptive and Robust Robotic Control	Zunzhe Zhang et.al.	2603.17834	null
2026-03-18	TINA: Text-Free Inversion Attack for Unlearned Text-to-Image Diffusion Models	Qianlong Xiang et.al.	2603.17828	null
2026-03-18	ChopGrad: Pixel-Wise Losses for Latent Video Diffusion via Truncated Backpropagation	Dmitriy Rivkin et.al.	2603.17812	null
2026-03-18	CrowdGaussian: Reconstructing High-Fidelity 3D Gaussians for Human Crowd from a Single Image	Yizheng Song et.al.	2603.17779	null
2026-03-18	Towards Infinitely Long Neural Simulations: Self-Refining Neural Surrogate Models for Dynamical Systems	Qi Liu et.al.	2603.17750	null
2026-03-18	TAPESTRY: From Geometry to Appearance via Consistent Turntable Videos	Yan Zeng et.al.	2603.17735	null
2026-03-18	Adaptive Guidance for Retrieval-Augmented Masked Diffusion Models	Jaemin Kim et.al.	2603.17677	null
2026-03-18	Proof-of-Authorship for Diffusion-based AI Generated Content	De Zhang Lee et.al.	2603.17513	null
2026-03-18	A Tutorial on Learning-Based Radio Map Construction: Data, Paradigms, and Physics-Awareness	Xiucheng Wang et.al.	2603.17499	null
2026-03-18	SHIFT: Motion Alignment in Video Diffusion Models with Adversarial Hybrid Fine-Tuning	Xi Ye et.al.	2603.17426	null
2026-03-18	Joint Degradation-Aware Arbitrary-Scale Super-Resolution for Variable-Rate Extreme Image Compression	Xinning Chai et.al.	2603.17408	null
2026-03-18	Motion-Adaptive Temporal Attention for Lightweight Video Generation with Stable Diffusion	Rui Hong et.al.	2603.17398	null
2026-03-18	Toward Phonology-Guided Sign Language Motion Generation: A Diffusion Baseline and Conditioning Analysis	Rui Hong et.al.	2603.17388	null
2026-03-17	V-Co: A Closer Look at Visual Representation Alignment via Co-Denoising	Han Lin et.al.	2603.16792	null
2026-03-17	Semi-supervised Latent Disentangled Diffusion Model for Textile Pattern Generation	Chenggong Hu et.al.	2603.16747	null
2026-03-17	World Reconstruction From Inconsistent Views	Lukas Höllein et.al.	2603.16736	null
2026-03-17	Self-Aware Markov Models for Discrete Reasoning	Gregor Kornhardt et.al.	2603.16661	null
2026-03-17	Face2Scene: Using Facial Degradation as an Oracle for Diffusion-Based Scene Restoration	Amirhossein Kazerouni et.al.	2603.16570	null
2026-03-17	Robust Physics-Guided Diffusion for Full-Waveform Inversion	Jishen Peng et.al.	2603.16393	null
2026-03-17	Encoding Predictability and Legibility for Style-Conditioned Diffusion Policy	Adrien Jacquet Crétides et.al.	2603.16368	null
2026-03-17	$D^3$-RSMDE: 40$\times$ Faster and High-Fidelity Remote Sensing Monocular Depth Estimation	Ruizhi Wang et.al.	2603.16362	null
2026-03-17	Iris: Bringing Real-World Priors into Diffusion Model for Monocular Depth Estimation	Xinhao Cai et.al.	2603.16340	null
2026-03-17	Probabilistic reconstruction of global sea surface temperature using generative diffusion models	Haijie Li et.al.	2603.16272	null
2026-03-17	VIGOR: VIdeo Geometry-Oriented Reward for Temporal Generative Alignment	Tengjiao Yin et.al.	2603.16271	null
2026-03-17	Leveling3D: Leveling Up 3D Reconstruction with Feed-Forward 3D Gaussian Splatting and Geometry-Aware Generation	Yiming Huang et.al.	2603.16211	null
2026-03-17	Physics-guided diffusion models for inverse design of disordered metamaterials	Ziyuan Xie et.al.	2603.16209	null
2026-03-17	S-VAM: Shortcut Video-Action Model by Self-Distilling Geometric and Semantic Foresight	Haodong Yan et.al.	2603.16195	null
2026-03-17	When Generative Augmentation Hurts: A Benchmark Study of GAN and Diffusion Models for Bias Correction in AI Classification Systems	Shesh Narayan Gupta et.al.	2603.16134	null

(<a href=#updated-on-20260404>back to top</a>)

Real-time Generation

Publish Date	Title	Authors	PDF	Code
2026-04-02	AromaGen: Interactive Generation of Rich Olfactory Experiences with Multimodal Language Models	Yunge Wen et.al.	2604.01650	null
2026-03-31	OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation	Yuheng Liu et.al.	2603.30045	null
2026-03-31	From Skeletons to Semantics: Design and Deployment of a Hybrid Edge-Based Action Detection System for Public Safety	Ganen Sethupathy et.al.	2603.29777	null
2026-03-31	$R_\text{dm}$ : Re-conceptualizing Distribution Matching as a Reward for Diffusion Distillation	Linqian Fan et.al.	2603.28460	null
2026-03-28	Fair Benchmarking of Emerging One-Step Generative Models Against Multistep Diffusion and Flow Models	Advaith Ravishankar et.al.	2603.14186	null
2026-03-27	LagerNVS: Latent Geometry for Fully Neural Real-time Novel View Synthesis	Stanislaw Szymanowicz et.al.	2603.20176	null
2026-03-23	DUO-VSR: Dual-Stream Distillation for One-Step Video Super-Resolution	Zhengyao Lv et.al.	2603.22271	null
2026-03-23	Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation	Yuyang You et.al.	2603.21864	null
2026-03-22	Implicit Maximum Likelihood Estimation for Real-time Generative Model Predictive Control	Grayson Lee et.al.	2603.13733	null
2026-03-21	Smart Operation Theatre: An AI-based System for Surgical Gauze Counting	Saraf Krish et.al.	2603.20752	null
2026-03-19	cuGenOpt: A GPU-Accelerated General-Purpose Metaheuristic Framework for Combinatorial Optimization	Yuyang Liu et.al.	2603.19163	null
2026-03-19	Training-Free Sparse Attention for Fast Video Generation via Offline Layer-Wise Sparsity Profiling and Online Bidirectional Co-Clustering	Jiayi Luo et.al.	2603.18636	null
2026-03-18	Fast Beam-Brainstorm: Few-Step Generative Site-Specific Beamforming with Flexible Probing	Zihao Zhou et.al.	2603.17622	null
2026-03-18	Motion-Adaptive Temporal Attention for Lightweight Video Generation with Stable Diffusion	Rui Hong et.al.	2603.17398	null
2026-03-17	Unlearning for One-Step Generative Models via Unbalanced Optimal Transport	Hyundo Choi et.al.	2603.16489	null
2026-03-16	GDPO-SR: Group Direct Preference Optimization for One-Step Generative Image Super-Resolution	Qiaosi Yi et.al.	2603.16769	null
2026-03-16	Preconditioned One-Step Generative Modeling for Bayesian Inverse Problems in Function Spaces	Zilan Cheng et.al.	2603.14798	null
2026-03-15	GoldenStart: Q-Guided Priors and Entropy Control for Distilling Flow Policies	He Zhang et.al.	2603.14245	null
2026-03-12	Sinkhorn-Drifting Generative Models	Ping He et.al.	2603.12366	null
2026-03-12	FlashMotion: Few-Step Controllable Video Generation with Trajectory Guidance	Quanhao Li et.al.	2603.12146	null
2026-03-12	InSpatio-WorldFM: An Open-Source Real-Time Generative Frame Model	InSpatio Team et.al.	2603.11911	null
2026-03-11	Auroral Acceleration Generates Electron Beams in Jupiter’s Middle Magnetosphere	June Piasecki et.al.	2603.10760	null
2026-03-11	Riemannian MeanFlow for One-Step Generation on Manifolds	Zichen Zhong et.al.	2603.10718	null
2026-03-11	AlphaFlowTSE: One-Step Generative Target Speaker Extraction via Conditional AlphaFlow	Duojia Li et.al.	2603.10701	null
2026-03-10	FrameDiT: Diffusion Transformer with Frame-Level Matrix Attention for Efficient Video Generation	Minh Khoa Le et.al.	2603.09721	null
2026-03-09	WaDi: Weight Direction-aware Distillation for One-step Image Synthesis	Lei Wang et.al.	2603.08258	null
2026-03-08	TDM-R1: Reinforcing Few-Step Diffusion Models with Non-Differentiable Reward	Yihong Luo et.al.	2603.07700	null

(<a href=#updated-on-20260404>back to top</a>)

DiT Acceleration

Publish Date	Title	Authors	PDF	Code
2026-03-27	FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation	Dong Liu et.al.	2505.20353	null
2026-03-13	AccelAes: Accelerating Diffusion Transformers for Training-Free Aesthetic-Enhanced Image Generation	Xuanhua Yin et.al.	2603.12575	null
2026-03-05	Frequency-Aware Error-Bounded Caching for Accelerating Diffusion Transformers	Guandong Li et.al.	2603.05315	null
2026-02-28	Plug-and-Play Fidelity Optimization for Diffusion Transformer Acceleration via Cumulative Error Minimization	Tong Shao et.al.	2512.23258	null
2026-02-28	BWCache: Accelerating Video Diffusion Transformers through Block-Wise Caching	Hanshuai Cui et.al.	2509.13789	null
2026-02-13	ProCache: Constraint-Aware Feature Caching with Selective Computation for Diffusion Transformer Acceleration	Fanpu Cao et.al.	2512.17298	null
2026-02-11	SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices	Dongting Hu et.al.	2601.08303	null
2026-01-28	StreamFusion: Scalable Sequence Parallelism for Distributed Inference of Diffusion Transformers on GPUs	Jiacheng Yang et.al.	2601.20273	null
2026-01-15	TetriServe: Efficient DiT Serving for Heterogeneous Image Generation	Runyu Lu et.al.	2510.01565	null
2026-01-09	Sprint: Sparse-Dense Residual Fusion for Efficient Diffusion Transformers	Dogyun Park et.al.	2510.21986	null
2025-12-30	Bidirectional Sparse Attention for Faster Video Diffusion Training	Chenlu Zhan et.al.	2509.01085	null
2025-12-16	OUSAC: Optimized Guidance Scheduling with Adaptive Caching for DiT Acceleration	Ruitong Sun et.al.	2512.14096	null
2025-09-23	Optimizing Inference in Transformer-Based Models: A Multi-Method Benchmark	Siu Hang Ho et.al.	2509.17894	null
2025-08-26	Direction Informed Trees (DIT): Optimal Path Planning via Direction Filter and Direction Cost Heuristic	Liding Zhang et.al.	2508.19168	null
2025-05-16	Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration	Haipeng Fang et.al.	2505.11707	null

(<a href=#updated-on-20260404>back to top</a>)

Notes:

The tables above are now sorted by the date of each paper's latest update rather than its initial publication date. A paper that has been revised recently therefore appears earlier in the list.
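
The sorting rule described above can be sketched in a few lines. This is a minimal illustration and not the site's actual scrape code: the `Paper` record, its `published` and `updated` fields, and the sample entries are all hypothetical, and breaking ties by publication date is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Paper:
    # Hypothetical record; the field names are assumptions, not the site's schema.
    title: str
    published: date  # initial publication date
    updated: date    # date of the latest revision


def sort_by_latest_update(papers):
    """Return papers newest-first by latest update; ties fall back to publication date."""
    return sorted(papers, key=lambda p: (p.updated, p.published), reverse=True)


# Made-up sample data: "A" was published first but revised most recently.
papers = [
    Paper("B", date(2026, 3, 13), date(2026, 3, 13)),
    Paper("C", date(2026, 3, 5), date(2026, 3, 5)),
    Paper("A", date(2025, 5, 27), date(2026, 3, 27)),
]
ordered_titles = [p.title for p in sort_by_latest_update(papers)]
print(ordered_titles)
```

Under this rule an older paper with a recent revision is listed ahead of newer papers that have not been revised since.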

Features added:

Support for a more reliable text parser. Link
Support for rich markdown format (better at parsing experimental tables). Link
