A Robust Technique for Motion-based Temporal Alignment of Video Sequences
Department of Electrical and Computer Engineering, University of Alberta
Abstract: In this paper, we propose a novel technique for the temporal alignment of video sequences containing similar planar motions acquired with uncalibrated cameras. We model the motion-based video temporal alignment problem as a spatio-temporal alignment problem between discrete trajectory point sets. First, the trajectory of the object of interest is tracked throughout the videos. A probabilistic method is then developed to calculate the spatial correspondence, i.e., the homography, between the trajectory point sets. Next, dynamic time warping (DTW) is applied to the spatial correspondence information to compute the temporal alignment of the videos. Experimental results show that the proposed technique outperforms existing techniques by approximately 42% to 74% on videos with similar trajectory patterns.
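The final alignment step above can be illustrated with a minimal sketch. The code below is not the paper's implementation: it assumes the homography H has already been estimated (the paper's probabilistic method is not reproduced here), and it uses the Euclidean transfer error between the mapped trajectories as an illustrative local cost for a standard DTW recursion.

```python
import numpy as np

def dtw_align(traj_a, traj_b, H):
    """Temporally align two 2-D trajectories with dynamic time warping.

    traj_a: (N, 2) tracked points from video A.
    traj_b: (M, 2) tracked points from video B.
    H: 3x3 homography mapping points of A into the image plane of B
       (assumed known, e.g. from a prior spatial-correspondence step).
    Returns the optimal warping path as a list of (i, j) frame pairs.
    """
    # Map trajectory A into B's image plane via homogeneous coordinates.
    ones = np.ones((traj_a.shape[0], 1))
    pts = (H @ np.hstack([traj_a, ones]).T).T
    pts = pts[:, :2] / pts[:, 2:3]

    n, m = len(pts), len(traj_b)
    # Local cost: Euclidean transfer error between mapped A and B points.
    cost = np.linalg.norm(pts[:, None, :] - traj_b[None, :, :], axis=2)

    # Accumulated cost with the standard DTW recursion.
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = cost[i - 1, j - 1] + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])

    # Backtrack the minimum-cost warping path from (n, m) to (0, 0).
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]
```

With an identity homography, aligning a trajectory against a slowed-down copy of itself (e.g., one with a repeated first frame) yields a path that pairs each frame of the shorter sequence with its match in the longer one.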
The experimental results for real videos are presented below.
Please download the sample results (approximately 5 to 15 MB per file) by clicking the underlined titles, and play them with Windows Media Player or another media player. For any questions, please send an email to email@example.com.
The proposed technique is compared with three other techniques (RCB, STE, UBD) on the UCF video and the coffee cup lifting videos.
The proposed technique is evaluated on one pair of videos of Tai Chi Quan performance. Note that the videos are captured from different views and in different scenes.
The proposed technique is evaluated on three pairs of videos of a ball-throwing motion. Note that the videos are captured from different views and in different scenes.
The proposed technique is evaluated on four pairs of videos of coffee cup lifting and putting. Note that the same action, i.e., lifting and putting down the coffee cup, is performed by two different people in different scenes at different speeds.
1. C. Rao, A. Yilmaz, and M. Shah, "View-invariant representation and recognition of actions," International Journal of Computer Vision, vol. 50, no. 2, pp. 203-226, Nov. 2002.
2. C. Rao, A. Gritai, M. Shah, and T. F. S. Mahmood, "View-invariant alignment and matching of video sequences," in Proc. ICCV, pp. 939-945, 2003.
3. M. Singh et al., "Optimization of symmetric transfer error for sub-frame video synchronization," in Proc. ECCV, LNCS vol. 5303, pp. 554-567, 2008.
4. C. Lu and M. Mandal, "Efficient temporal alignment of video sequences using unbiased bidirectional dynamic time warping," Journal of Electronic Imaging, vol. 19, no. 4, pp. 0501-0504, Aug. 2010.
5. C. Rao, "View-Invariant Representations for Human Activity Recognition," http://server.cs.ucf.edu/~vision/projects/ViewInvariance/ViewInvariance.html.