Adapting SynSin for Enhanced Video Frame Interpolation via Temporal Feature Fusion and Depth Consistency


Ghasaq M Hameed

Abstract

Frame interpolation seeks to synthesize temporally consistent in-between frames that enhance video smoothness and visual continuity. We revisit SynSin, originally designed for single-image novel view synthesis, and reformulate it for interpolation by introducing (i) dual-frame input handling, (ii) temporal feature fusion with explicit temporal encoding, (iii) a dedicated interpolation module, (iv) depth-consistency constraints across the inputs and the synthesized frame, and (v) a refinement stage to suppress artifacts. Training uses a reconstruction-oriented loss, while evaluation reports MSE, SSIM, and PSNR. We assess the approach on the indoor 7Scenes benchmark. The modified model yields a validation MSE of 0.0011 and an SSIM of 0.943, improving over an unmodified baseline (MSE = 0.0033, SSIM = 0.9327): a 66.7% error reduction and a +0.0103 absolute SSIM gain. These results indicate that a view-synthesis backbone can be effectively adapted to temporal synthesis, offering a simple, data-efficient route to competitive interpolation quality useful for video editing, animation, and streaming. Beyond raw performance, the study highlights a practical pathway for repurposing view-synthesis architectures for broader video tasks, encouraging unified designs that share geometry-aware depth reasoning and temporal modeling.
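
To make modifications (ii) and (iv) concrete, the sketch below shows one plausible way to fuse features from two input frames under an explicit temporal encoding, and a simple depth-consistency penalty tying the synthesized frame's depth to the inputs. This is a minimal PyTorch sketch under stated assumptions: the module names, channel sizes, concatenation-based fusion, and the linear depth blend are illustrative choices, not the paper's actual implementation.

```python
# Illustrative sketch only: fusion design and depth prior are assumptions,
# not taken from the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TemporalFeatureFusion(nn.Module):
    """Fuse per-frame features conditioned on the target time t in [0, 1]."""

    def __init__(self, channels: int):
        super().__init__()
        # The extra input channel carries the broadcast temporal encoding.
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels + 1, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, f0: torch.Tensor, f1: torch.Tensor, t: float) -> torch.Tensor:
        b, _, h, w = f0.shape
        # Encode the interpolation time as a constant spatial map.
        t_map = torch.full((b, 1, h, w), t, device=f0.device, dtype=f0.dtype)
        return self.fuse(torch.cat([f0, f1, t_map], dim=1))


def depth_consistency_loss(d0: torch.Tensor, d1: torch.Tensor,
                           d_t: torch.Tensor, t: float) -> torch.Tensor:
    """Penalize the synthesized depth for straying from a linear blend of the
    input depths (a simple stand-in for the paper's consistency constraint)."""
    return F.l1_loss(d_t, (1.0 - t) * d0 + t * d1)


# Toy usage on random tensors standing in for encoder features and depths.
fusion = TemporalFeatureFusion(channels=64)
f0, f1 = torch.randn(2, 64, 32, 32), torch.randn(2, 64, 32, 32)
fused = fusion(f0, f1, t=0.5)  # -> shape (2, 64, 32, 32)

d0 = torch.rand(2, 1, 32, 32)
d1 = torch.rand(2, 1, 32, 32)
d_t = torch.rand(2, 1, 32, 32)
loss = depth_consistency_loss(d0, d1, d_t, t=0.5)
```

Conditioning the fusion on t lets a single network target arbitrary intermediate times, and a linear blend is the simplest depth prior one could impose; the paper's actual constraint may well be more elaborate.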


How to Cite

Adapting SynSin for Enhanced Video Frame Interpolation via Temporal Feature Fusion and Depth Consistency. (2026). Pharaonic Journal of Science, 2(1), 28-36. https://doi.org/10.71428/PJS.2026.0103