SungHo Moon

Ph.D. Candidate, DGIST · AI Research Engineer

Recent Achievements

CVPR 2026 · ICCV 2025 1st Place · 10+ Industry Projects

About

AI Research Engineer with a track record spanning top-tier publications (CVPR, AAAI) and real-world deployments across 10+ industry collaborations. Pursuing my Ph.D. at DGIST, I specialize in Multi-Modal AI, Vision-Language Models, and 3D Computer Vision — building systems that bridge research and production. I have worked closely with Hyundai Motor Company, Honda, Elith, ETRI, and more. Always open to industry collaborations, research discussions, and new opportunities.

News

Mar 2026: Paper accepted at CVPR 2026: "CVA: Context-aware Video-text Alignment for Video Temporal Grounding"
Sep 2025: Won 1st Place in ICCV 2025 Amazon Grocery Vision Challenge (TAL & STAL tracks)
Jul 2025: Completed projects with Huvitz on real-time 3D reconstruction using Loop Closure Detection (LCD) and Pose Graph Optimization (PGO)
Dec 2024: Completed projects with ETRI on pill detection and recognition
Nov 2024: Completed projects with HD Korea Shipbuilding & Offshore Engineering on camera calibration

Major Projects

Research

CVA: Context-aware Video-text Alignment for Video Temporal Grounding

Sep 2025 - Dec 2025

DGIST (CVPR 2026)

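Proposed a novel framework for Video Temporal Grounding that achieves temporally sensitive video-text alignment robust to irrelevant background context, built on three key components: Query-aware Context Diversification (QCD), Context-invariant Boundary Discrimination (CBD) loss, and a Context-enhanced Transformer Encoder (CTE).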

Achievement: Achieved state-of-the-art performance on three major benchmarks (QVHighlights, Charades-STA, and TACoS). Paper accepted to CVPR 2026.

[Details][Code: Coming Soon]

Amazon Grocery Vision Challenge (ICCV 2025)

Aug 2025 - Sep 2025

Amazon & DGIST

Develop a multi-modal AI model for Temporal Action Localization (TAL) and Spatio-Temporal Action Localization (STAL) in grocery shopping scenarios.

Achievement: Achieved 1st place in both the TAL and STAL tracks within just 1 month of development. Successfully deployed a multi-modal model achieving state-of-the-art performance on the Amazon grocery dataset.

[Details][Code: Coming Soon][Reference Site]

Industry

Real-Time 3D Reconstruction Using a Dental Scanner

Jun 2024 - Jul 2025

Huvitz

Develop a real-time 3D reconstruction system utilizing Iterative Closest Point (ICP), Loop Closure Detection (LCD), and Pose Graph Optimization (PGO) for robust and accurate mapping.

Achievement: Improved speed by up to 80% compared to the existing algorithm without performance degradation.

[Details: Confidential][Code: Confidential][Reference Video]

Development of a 3D Pose Estimation and Shape Reconstruction Program for Solid Pharmaceuticals

Sep 2024 - Dec 2024

ETRI

Developed a prototype system to estimate 3D pose and reconstruct shapes of solid pharmaceuticals, enabling automatic pill detection, recognition, and counting without additional training.

Achievement: Demonstrated accurate pill classification and counting, showcasing potential for automated pharmaceutical management.

[Details: Confidential][Code: Confidential]

Algorithm Development for Automated Image Processing of Stereo Cameras

Sep 2024 - Nov 2024

HD Korea Shipbuilding & Offshore Engineering

Design and implement core algorithms that enable automated image processing for stereo camera systems.

Achievement: Delivered a prototype calibration module and contributed to automation pipeline design. Further technical details remain confidential due to project agreements.

[Details: Confidential][Code: Confidential]

R&D of AI Test and Evaluation Standard Model

Oct 2023 - Jun 2024

ROKA Headquarters

Create a standard military training/test dataset and build a baseline AI model for introducing various AI weapon systems in the Army.

Achievement: Established initial standards for the Military Performance Certification Center (including dataset construction, baseline model development, and formulation of various strategies).

[Details: Confidential][Code: Confidential]

Establishment of Test and Evaluation Standards for AI Weapon Systems

Mar 2023 - Jun 2024

ROKA Headquarters, U.S. Department of Defense

Develop new testing and evaluation standards for AI weapon systems, which differ significantly from traditional weapon systems.

Achievement: Established initial standards for the Military Performance Certification Center (including dataset construction, baseline model development, and formulation of various strategies).

[Details: Confidential][Code: Confidential]

Military Scientific Surveillance System

Mar 2023 - Sep 2023

ROKA Headquarters

Reduce false/missed detections and improve true detections by building an AI-based surveillance system.

Achievement: Reduced false positives by 10% compared to the existing system.

[Details: Confidential][Code: Confidential]

Development of Car Location and Speed Estimation Module Using CCTV Footage

Aug 2022 - Dec 2022

ETRI

Develop a module capable of estimating vehicle position and speed solely from CCTV video data.

Achievement: Achieved over 90% accuracy in vehicle speed estimation on the target dataset.

[Details: Confidential][Code: Confidential]

Robust Monocular Camera 3D Object Detection in Various Camera Environments

Mar 2021 - Jun 2022

Hyundai

Improve the robustness of monocular camera-based 3D object detection, addressing significant performance degradation caused by varying camera environments.

Achievement: Diagnosed the key factors affecting model accuracy and significantly improved performance: accuracy increased from 20% to 80% for a 3-degree angle variation, and from 1% to 50% for a 5-degree angle variation. The research findings contributed to international patents and publications (CVPRw 2024).

[Details: Confidential][Code: Confidential]

3D Building Exterior Reconstruction

Aug 2020 - Dec 2020

KETI

Develop a 3D reconstruction module using monocular images.

Achievement: Successfully built a 3D reconstruction module that processes monocular images to generate 3D structures.

[Details][Code: Confidential]

Publications

CVA: Context-aware Video-text Alignment for Video Temporal Grounding

Sungho Moon*, Seunghun Lee*, Jiwan Seo, Sunghoon Im†

CVPR 2026

Proposed a context-aware framework for video temporal grounding with Query-aware Context Diversification, Context-invariant Boundary Discrimination loss, and Context-enhanced Transformer Encoder. Achieved SOTA on QVHighlights, Charades-STA, and TACoS.

[Paper: Coming Soon][GitHub: Coming Soon]

Rotation Matters: Generalized Monocular 3D Object Detection for Various Camera Systems

SungHo Moon, JinWoo Bae, SungHoon Im

CVPR Workshop 2023

Proposed a generalized approach for monocular 3D object detection that addresses performance degradation caused by varying camera orientations and systems.

[Paper][GitHub: Confidential]

Deep Digging into the Generalization of Self-Supervised Monocular Depth Estimation

Jinwoo Bae, Sungho Moon, Sunghoon Im

AAAI 2023

Investigated the generalization capabilities of self-supervised monocular depth estimation methods across different domains and datasets.

[Paper][GitHub]

Skills

AI & Computer Vision: Multi-Modal AI, Vision-Language Models (VLM), 3D Computer Vision, Temporal Action Localization, Pose Graph Optimization

Languages: Python, C++, C, MATLAB

Frameworks: PyTorch, OpenCV, Scikit-learn, TensorFlow

Specialties: 3D Reconstruction, Bundle Adjustment, Object Detection, Multi-Modal AI, Camera Calibration

Tools: Git, Linux, Docker, CUDA