Henrique Morimitsu (恩瑞) is a member of the PRIR lab, where he works with Prof. Xiaobin Zhu (祝晓斌) and Prof. Xu-Cheng Yin (殷绪成). His research focuses on video motion estimation, especially the development of efficient and cost-effective methods.
CVPR 2025 Outstanding Reviewer. I am honored to have been selected as an Outstanding Reviewer for CVPR 2025, a distinction awarded to the top 5% of the 9872 reviewers for providing high-quality feedback.
DPFlow: Adaptive Optical Flow Estimation with a Dual-Pyramid Framework
Optical flow estimation is essential for video processing tasks, such as restoration and action recognition. The quality of videos is constantly increasing, with current standards reaching 8K resolution. However, optical flow methods are usually designed for low resolution and do not generalize to large inputs due to their rigid architectures. They adopt downscaling or input tiling to reduce the input size, causing a loss of details and global information. There is also a lack of optical flow benchmarks to judge the actual performance of existing methods on high-resolution samples. Previous works only conducted qualitative high-resolution evaluations on hand-picked samples. This paper fills this gap in optical flow estimation in two ways. We propose DPFlow, an adaptive optical flow architecture capable of generalizing up to 8K resolution inputs while trained with only low-resolution samples. We also introduce Kubric-NK, a new benchmark for evaluating optical flow methods with input resolutions ranging from 1K to 8K. Our high-resolution evaluation pushes the boundaries of existing methods and reveals new insights about their generalization capabilities. Extensive experimental results show that DPFlow achieves state-of-the-art results on the MPI-Sintel, KITTI 2015, Spring, and other high-resolution benchmarks.
@inproceedings{Morimitsu2025DPFlowAdaptiveOptical,
  author={Morimitsu, Henrique and Zhu, Xiaobin and Cesar-Jr., Roberto M. and Ji, Xiangyang and Yin, Xu-Cheng},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  title={{DPFlow}: Adaptive Optical Flow Estimation with a Dual-Pyramid Framework},
  year={2025},
  doi={10.1109/CVPR52734.2025.01659},
}
RAPIDFlow: Recurrent Adaptable Pyramids with Iterative Decoding for Efficient Optical Flow Estimation
Extracting motion information from videos with optical flow estimation is vital in multiple practical robot applications. Current optical flow approaches show remarkable accuracy, but top-performing methods have high computational costs and are unsuitable for embedded devices. Although some previous works have focused on developing low-cost optical flow strategies, their estimation quality has a noticeable gap with more robust methods. In this paper, we develop a novel method to efficiently estimate high-quality optical flow in embedded devices. Our proposed RAPIDFlow model combines efficient NeXt1D convolution blocks with a fully recurrent structure based on feature pyramids to decrease computational costs without significantly impacting estimation accuracy. The adaptable recurrent encoder produces multi-scale features with a single shared block, which allows us to adjust the pyramid length at inference time and make it more robust to changes in input size. Also, it enables our model to offer multiple tradeoffs between accuracy and speed to suit different applications. Experiments using a Jetson Orin NX embedded system on the MPI-Sintel and KITTI public benchmarks show that RAPIDFlow outperforms previous approaches by significant margins at faster speeds.
@inproceedings{Morimitsu2024RAPIDFlowRecurrentAdaptable,
  author={Morimitsu, Henrique and Zhu, Xiaobin and Cesar-Jr., Roberto M. and Ji, Xiangyang and Yin, Xu-Cheng},
  booktitle={International Conference on Robotics and Automation},
  title={{RAPIDFlow}: {Recurrent Adaptable Pyramids with Iterative Decoding} for Efficient Optical Flow Estimation},
  year={2024},
  doi={10.1109/ICRA57147.2024.10610277},
}
Recurrent Partial Kernel Network for Efficient Optical Flow Estimation
Optical flow estimation is a challenging task consisting of predicting per-pixel motion vectors between images. Recent methods have employed larger and more complex models to improve the estimation accuracy. However, this impacts the widespread adoption of optical flow methods and makes it harder to train more general models since the optical flow data is hard to obtain. This paper proposes a small and efficient model for optical flow estimation. We design a new spatial recurrent encoder that extracts discriminative features at a significantly reduced size. Unlike standard recurrent units, we utilize Partial Kernel Convolution (PKConv) layers to produce variable multi-scale features with a single shared block. We also design efficient Separable Large Kernels (SLK) to capture large context information with low computational cost. Experiments on public benchmarks show that we achieve state-of-the-art generalization performance while requiring significantly fewer parameters and memory than competing methods. Our model ranks first in the Spring benchmark without finetuning, improving the results by over 10% while requiring an order of magnitude fewer FLOPs and over four times less memory than the following published method without finetuning.
@inproceedings{Morimitsu2024RecurrentPartialKernel,
  author={Morimitsu, Henrique and Zhu, Xiaobin and Ji, Xiangyang and Yin, Xu-Cheng},
  booktitle={AAAI Conference on Artificial Intelligence},
  title={Recurrent Partial Kernel Network for Efficient Optical Flow Estimation},
  year={2024},
  doi={10.1609/aaai.v38i5.28224},
}