

About
I am an AI Research Engineer specializing in Multi-Modal AI, Vision-Language Models (VLM), and 3D Computer Vision. I am currently pursuing my Ph.D. at DGIST, where I focus on developing robust AI systems that bridge the gap between complex visual and textual data. I have extensive R&D experience collaborating with domestic and international industry leaders such as Hyundai Motor Company, Honda, Elith, and ETRI. I am always open to new opportunities and welcome industry collaborations, research discussions, and job offers. Please feel free to contact me anytime!
News
- Mar 2026: Paper accepted at CVPR 2026: "CVA: Context-aware Video-text Alignment for Video Temporal Grounding"
- Sep 2025: Won 1st Place in the ICCV 2025 Amazon Grocery Vision Challenge (TAL & STAL tracks)
- Jul 2025: Completed projects with Huvitz on real-time 3D reconstruction using Loop Closure Detection (LCD) and Pose Graph Optimization (PGO)
- Dec 2024: Completed projects with ETRI on pill detection and recognition
- Nov 2024: Completed projects with HD Korea Shipbuilding & Offshore Engineering on camera calibration
Major Projects
CVA: Context-aware Video-text Alignment for Video Temporal Grounding
Sep 2025 - Dec 2025 | DGIST (CVPR 2026)
Proposed a novel framework for Video Temporal Grounding that achieves temporally sensitive video-text alignment robust to irrelevant background context. The framework is built on three key components: Query-aware Context Diversification (QCD), a Context-invariant Boundary Discrimination (CBD) loss, and a Context-enhanced Transformer Encoder (CTE).
Achievement: State-of-the-art performance on three major benchmarks: QVHighlights, Charades-STA, and TACoS. Paper accepted to CVPR 2026.
Amazon Grocery Vision Challenge (ICCV 2025)
Aug 2025 - Sep 2025 | Amazon & DGIST
Developed a multi-modal AI model for Temporal Action Localization (TAL) and Spatio-Temporal Action Localization (STAL) in grocery shopping scenarios.
Achievement: Took 1st place in both the TAL and STAL tracks within one month of development. Deployed a multi-modal model that achieved state-of-the-art performance on the Amazon grocery dataset.
Real-time 3D Reconstruction Using a Dental Scanner
Jun 2024 - Jul 2025 | Huvitz
Developed a real-time 3D reconstruction system utilizing Iterative Closest Point (ICP), Loop Closure Detection (LCD), and Pose Graph Optimization (PGO) for robust and accurate mapping.
Achievement: Improved speed by up to 80% compared to the existing algorithm without performance degradation.
Development of a 3D Pose Estimation and Shape Reconstruction Program for Solid Pharmaceuticals
Sep 2024 - Dec 2024 | ETRI
Developed a prototype system to estimate 3D pose and reconstruct shapes of solid pharmaceuticals, enabling automatic pill detection, recognition, and counting without additional training.
Achievement: Demonstrated accurate pill classification and counting, showcasing potential for automated pharmaceutical management.
Algorithm Development for Automated Image Processing of Stereo Cameras
Sep 2024 - Nov 2024 | HD Korea Shipbuilding & Offshore Engineering
Designed and implemented core algorithms enabling automated image processing for stereo camera systems.
Achievement: Delivered a prototype calibration module and contributed to automation pipeline design. Further technical details remain confidential due to project agreements.
R&D of AI Test and Evaluation Standard Model
Oct 2023 - Jun 2024 | ROKA Headquarters
Created a standard military training/test dataset and built a baseline AI model to support the introduction of various AI weapon systems in the Army.
Achievement: Established initial standards for the Military Performance Certification Center (including dataset construction, baseline model development, and formulation of various strategies).
Establishment of Test and Evaluation Standards for AI Weapon Systems
Mar 2023 - Jun 2024 | ROKA Headquarters, U.S. Department of Defense
Developed new testing and evaluation standards for AI weapon systems, which differ significantly from traditional weapon systems.
Achievement: Established initial standards for the Military Performance Certification Center (including dataset construction, baseline model development, and formulation of various strategies).
Military Scientific Surveillance System
Mar 2023 - Sep 2023 | ROKA Headquarters
Built an AI-based surveillance system to reduce false and missed detections and improve true detections.
Achievement: Reduced false positives by 10% compared to the existing system.
Development of Car Location and Speed Estimation Module Using CCTV Footage
Aug 2022 - Dec 2022 | ETRI
Developed a module capable of estimating vehicle position and speed solely from CCTV video data.
Achievement: Achieved over 90% accuracy in vehicle speed estimation on the target dataset.
Robust Monocular Camera 3D Object Detection in Various Camera Environments
Mar 2021 - Jun 2022 | Hyundai
Improved the robustness of monocular camera-based 3D object detection, addressing significant performance degradation caused by varying camera environments.
Achievement: Diagnosed key factors affecting model accuracy and significantly improved performance: accuracy increased from 20% to 80% for a 3-degree angle variation and from 1% to 50% for a 5-degree angle variation. Research findings contributed to international patents and publications (CVPRW 2024).
3D Building Exterior Reconstruction
Aug 2020 - Dec 2020 | KETI
Developed a 3D reconstruction module using monocular images.
Achievement: Successfully built a 3D reconstruction module that processes monocular images to generate 3D structures.
Publications
CVA: Context-aware Video-text Alignment for Video Temporal Grounding
Sungho Moon*, Seunghun Lee*, Jiwan Seo, Sunghoon Im†
CVPR 2026
Proposed a context-aware framework for video temporal grounding with Query-aware Context Diversification, Context-invariant Boundary Discrimination loss, and Context-enhanced Transformer Encoder. Achieved SOTA on QVHighlights, Charades-STA, and TACoS.
Rotation Matters: Generalized Monocular 3D Object Detection for Various Camera Systems
SungHo Moon, JinWoo Bae, SungHoon Im
CVPR Workshop 2023
Proposed a generalized approach for monocular 3D object detection that addresses performance degradation caused by varying camera orientations and systems.









