Global Tracking based Multi-Object Tracking in Complex Environments
- Publication Type: Thesis
- Issue Date: 2025
This item is open access.
Multi-object tracking (MOT) is a core task in computer vision, widely applied in autonomous driving, surveillance, and human-computer interaction. It remains challenging in real-world scenarios because of occlusion, motion blur, and missed detections. To address these challenges, we propose a novel transformer-based MOT framework that improves temporal continuity, attention modeling, and robustness. Specifically, we incorporate a Kalman filter prediction module that supplies predicted states when detections are unreliable or missing, so that trajectories are generated without interruption. We further integrate the Convolutional Block Attention Module (CBAM) into the transformer to emphasize key spatial and channel-wise features. To better handle dynamic target motion, we introduce MQTFormer, a multi-query tracking strategy that leverages multiple trajectory queries across frames. In addition, a multi-weight fusion mechanism combines feature similarity, detection confidence, and adaptive weighting for more robust association. Extensive experiments on standard benchmarks, including MOT17 and TAO, demonstrate that our method outperforms state-of-the-art approaches in accuracy, robustness, and stability, especially under challenging conditions.
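The idea of using a Kalman filter prediction to bridge missing detections can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes a constant-velocity motion model on a single scalar coordinate, and the class name `ConstantVelocityKalman1D`, the function `fill_track`, and the noise values `q` and `r` are all hypothetical choices made for illustration. A real tracker would typically filter a full bounding-box state.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ConstantVelocityKalman1D:
    """Scalar constant-velocity Kalman filter (illustrative, not the thesis code)."""
    pos: float
    vel: float = 0.0
    # 2x2 state covariance, stored entry by entry.
    p00: float = 1.0
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 1.0
    q: float = 1e-2  # assumed process noise added to the covariance diagonal
    r: float = 1.0   # assumed measurement noise variance

    def predict(self, dt: float = 1.0) -> float:
        # State transition: x' = F x with F = [[1, dt], [0, 1]].
        self.pos += dt * self.vel
        # Covariance: P' = F P F^T + q * I.
        a = self.p00 + dt * self.p10
        b = self.p01 + dt * self.p11
        self.p00 = a + dt * b + self.q
        self.p01 = b
        self.p10 = self.p10 + dt * self.p11
        self.p11 = self.p11 + self.q
        return self.pos

    def update(self, z: float) -> None:
        # Measurement model observes position only: H = [1, 0].
        innovation = z - self.pos
        s = self.p00 + self.r
        k0 = self.p00 / s
        k1 = self.p10 / s
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        # Covariance: P = (I - K H) P, using the pre-update entries.
        p00, p01 = self.p00, self.p01
        self.p00 = (1.0 - k0) * p00
        self.p01 = (1.0 - k0) * p01
        self.p10 -= k1 * p00
        self.p11 -= k1 * p01


def fill_track(measurements: List[Optional[float]]) -> List[float]:
    """Return one position per frame; None marks a missed detection.

    On a missed detection the filter coasts on its prediction, so the
    trajectory stays continuous. The first entry must be a real detection.
    """
    if not measurements or measurements[0] is None:
        raise ValueError("the first frame needs a detection to initialise the filter")
    kf = ConstantVelocityKalman1D(pos=measurements[0])
    track = [kf.pos]
    for z in measurements[1:]:
        kf.predict()
        if z is not None:
            kf.update(z)
        track.append(kf.pos)
    return track
```

For example, `fill_track([0.0, 1.0, 2.0, 3.0, None, None, 6.0])` returns seven positions: the two frames without detections are filled by extrapolating along the estimated velocity, and the filter is corrected again once a detection reappears.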