
vision.PointTracker

Track points in video using Kanade-Lucas-Tomasi (KLT) algorithm

Description

The point tracker object tracks a set of points using the Kanade-Lucas-Tomasi (KLT) feature-tracking algorithm. You can use the point tracker for video stabilization, camera motion estimation, and object tracking. It works particularly well for tracking objects that do not change shape and for those that exhibit visual texture. The point tracker is often used for short-term tracking as part of a larger tracking framework.

As tracking progresses over time, points can be lost due to lighting variation, out-of-plane rotation, or articulated motion. To track an object over a long period of time, you may need to reacquire points periodically, as sketched below.
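For example, a minimal sketch of one reacquisition strategy, assuming pointTracker is an initialized tracker and frame is the current video frame. The minPoints threshold and the choice of detector are illustrative assumptions, not part of the vision.PointTracker API:

% Reacquire points when too many have been lost.
minPoints = 20;                          % assumed application-specific threshold
[points,validity] = pointTracker(frame);
if nnz(validity) < minPoints
    % Redetect features in the current frame and reset the tracked points.
    newPoints = detectMinEigenFeatures(im2gray(frame));
    setPoints(pointTracker,newPoints.Location);
end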

To track a set of points:

  1. Create the vision.PointTracker object and set its properties.

  2. Call the object with arguments, as if it were a function.

To learn more about how System objects work, see What Are System Objects?

Creation

Description


pointTracker = vision.PointTracker returns a point tracker object that tracks a set of points in a video.

pointTracker = vision.PointTracker(Name,Value) sets properties using one or more name-value pairs. Enclose each property name in quotes. For example, pointTracker = vision.PointTracker('NumPyramidLevels',3).
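Both syntaxes in a short sketch; the property values shown are illustrative choices:

% Default tracker, then configure via property assignment (before first call).
pointTracker = vision.PointTracker;
pointTracker.MaxBidirectionalError = 2;

% Equivalent one-line creation using name-value pairs.
pointTracker = vision.PointTracker('MaxBidirectionalError',2);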

Initialize Tracking Process:

To initialize the tracking process, you must use initialize to specify the initial locations of the points and the initial video frame.

initialize(pointTracker,points,I) initializes points to track and sets the initial video frame. The initial locations, points, must be an M-by-2 array of [x y] coordinates. The initial video frame, I, must be a 2-D grayscale or RGB image and must be the same size and data type as the video frames passed to the step method.

The detectFASTFeatures, detectSURFFeatures, detectHarrisFeatures, and detectMinEigenFeatures functions are a few of the many ways to obtain the initial points for tracking.
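For example, a typical initialization sequence, assuming videoFrame holds the first frame of the video; the MaxBidirectionalError value is an illustrative choice:

% Detect corners to track, then hand their locations to the tracker.
points = detectMinEigenFeatures(im2gray(videoFrame));
pointTracker = vision.PointTracker('MaxBidirectionalError',2);
initialize(pointTracker,points.Location,videoFrame);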

Properties


Unless otherwise indicated, properties are nontunable, which means you cannot change their values after calling the object. Objects lock when you call them, and the release function unlocks them.

If a property is tunable, you can change its value at any time.

For more information on changing property values, see System Design in MATLAB Using System Objects.

Number of pyramid levels (the NumPyramidLevels property), specified as an integer. The point tracker implementation of the KLT algorithm uses image pyramids. The tracker generates an image pyramid, where each level is reduced in resolution by a factor of two compared to the previous level. Selecting a pyramid level greater than 1 enables the algorithm to track the points at multiple levels of resolution, starting at the lowest level. Increasing the number of pyramid levels allows the algorithm to handle larger displacements of points between frames, but the computation cost also increases. Recommended values are between 1 and 4.

Each pyramid level is formed by down-sampling the previous level by a factor of two in width and height. The point tracker begins tracking each point in the lowest resolution level, and continues tracking until convergence. The object propagates the result of that level to the next level as the initial guess of the point locations. In this way, the tracking is refined with each level, up to the original image. Using pyramid levels allows the point tracker to handle large pixel motions, which can be distances greater than the neighborhood size.
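As a rough rule of thumb (an approximation for intuition, not a documented guarantee), the displacement the tracker can absorb roughly doubles with each added level, because each level halves the apparent motion:

% Per-level search is limited by the neighborhood size; each added
% pyramid level roughly doubles the trackable displacement.
blockSize = [31 31];                                      % illustrative neighborhood
numLevels = 4;                                            % illustrative level count
approxMaxDisp = floor(blockSize(1)/2) * 2^(numLevels-1)   % roughly 120 pixels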

Forward-backward error threshold (the MaxBidirectionalError property), specified as a scalar. If you set the value to less than inf, the tracker tracks each point from the previous to the current frame. It then tracks the same points back to the previous frame. The object calculates the bidirectional error, which is the distance in pixels from the original location of the points to the final location after the backward tracking. The corresponding points are considered invalid when the error is greater than the value set for this property. Recommended values are between 0 and 3 pixels.

Using the bidirectional error is an effective way to eliminate points that could not be reliably tracked. However, the bidirectional error requires additional computation. When you set the MaxBidirectionalError property to inf, the object does not compute the bidirectional error.
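A short sketch of the trade-off; the 1-pixel threshold is an illustrative choice:

% Check disabled: no extra computation, but unreliable tracks are
% not filtered by forward-backward error.
fastTracker = vision.PointTracker('MaxBidirectionalError',inf);

% Check enabled: points whose forward-backward error exceeds
% 1 pixel are marked invalid in the point_validity output.
robustTracker = vision.PointTracker('MaxBidirectionalError',1);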

Size of the neighborhood around each point being tracked (the BlockSize property), specified as a two-element vector, [height, width]. The height and width must be odd integers. This neighborhood defines the area for the spatial gradient matrix computation. The minimum value for BlockSize is [5 5]. Increasing the size of the neighborhood increases the computation time.

Maximum number of search iterations for each point (the MaxIterations property), specified as an integer. The KLT algorithm performs an iterative search for the new location of each point until convergence. Typically, the algorithm converges within 10 iterations. This property sets the limit on the number of search iterations. Recommended values are between 10 and 50.
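A sketch setting both the neighborhood and iteration properties together; the specific values are illustrative:

tracker = vision.PointTracker( ...
    'BlockSize',[31 31], ...   % odd height and width, minimum [5 5]
    'MaxIterations',30);       % cap on the per-point iterative search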

Usage

Description


[points,point_validity] = pointTracker(I) tracks the points in the input frame, I.

[points,point_validity,scores] = pointTracker(I) additionally returns the confidence score for each point.

setPoints(pointTracker,points) sets the points for tracking. The function sets the M-by-2 points array of [x y] coordinates with the points to track. You can use this function if the points need to be redetected because too many of them have been lost during tracking.

setPoints(pointTracker,points,point_validity) additionally lets you mark points as either valid or invalid. The input logical vector point_validity, of length M, contains a true or false value corresponding to the validity of each point to be tracked. The length M corresponds to the number of points. A false value indicates an invalid point that should not be tracked. For example, you can use this function with the estimateGeometricTransform function to determine the transformation between the point locations in the previous and current frames, and then mark the outliers as invalid, as sketched below.
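A minimal sketch of that outlier-marking workflow, assuming oldPoints holds the point locations from the previous frame:

[points,validity] = pointTracker(frame);
visibleNew = points(validity,:);
visibleOld = oldPoints(validity,:);

% Fit a similarity transform; the function also returns the inlier points.
[tform,inlierOld,inlierNew] = estimateGeometricTransform( ...
    visibleOld,visibleNew,'similarity');

% Mark the outliers invalid when resetting the tracked points.
isInlier = ismember(visibleNew,inlierNew,'rows');
setPoints(pointTracker,visibleNew,isInlier);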

Input Arguments


Video frame, I, specified as a grayscale or truecolor (RGB) image.

Output Arguments


Tracked points, returned as an M-by-2 array of [x,y] coordinates that correspond to the new locations of the points in the input frame, I.

Reliability of the track for each point, returned as an M-by-1 logical array. A point can become invalid for several reasons: it can fall outside of the image, the spatial gradient matrix computed in its neighborhood can be singular, or its bidirectional error can exceed the MaxBidirectionalError threshold.

Confidence score between 0 and 1, returned as an M-by-1 array. The values correspond to the degree of similarity between the neighborhood around the previous location and the new location of each point. These values are computed as a function of the sum of squared differences between the previous and new neighborhoods. The greatest tracking confidence corresponds to a perfect match score of 1.
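A sketch of using the scores as a stricter filter than validity alone; the 0.9 cutoff is an illustrative assumption:

[points,validity,scores] = pointTracker(frame);
confident = validity & (scores > 0.9);   % assumed confidence cutoff
strongPoints = points(confident,:);      % keep only high-confidence tracks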

Object Functions

To use an object function, specify the System object™ as the first input argument. For example, to release system resources of a System object named obj, use this syntax:

release(obj)


initialize  Initialize video frame and points to track
step        Run System object algorithm
release     Release resources and allow changes to System object property values and input characteristics
reset       Reset internal states of System object

Examples


Create objects for reading and displaying the video and for drawing a bounding box around the object.

videoReader = VideoReader('visionface.avi');
videoPlayer = vision.VideoPlayer('Position',[100,100,680,520]);

Read the first video frame, which contains the object, and define the region.

objectFrame = readFrame(videoReader);
objectRegion = [264,122,93,93];

As an alternative, you can use the following commands to select the object region using a mouse. The object must occupy the majority of the region:

figure;
imshow(objectFrame);
objectRegion = round(getPosition(imrect))

Show initial frame with a red bounding box.

objectImage = insertShape(objectFrame,'Rectangle',objectRegion,'Color','red');
figure;
imshow(objectImage);
title('Red box shows object region');

Detect interest points in the object region.

points = detectMinEigenFeatures(im2gray(objectFrame),'ROI',objectRegion);

Display the detected points.

pointImage = insertMarker(objectFrame,points.Location,'+','Color','white');
figure;
imshow(pointImage);
title('Detected interest points');

Create a tracker object.

tracker = vision.PointTracker('MaxBidirectionalError',1);

Initialize the tracker.

initialize(tracker,points.Location,objectFrame);

Read each video frame, track the points, and display the results.

while hasFrame(videoReader)
    frame = readFrame(videoReader);
    [points,validity] = tracker(frame);
    out = insertMarker(frame,points(validity, :),'+');
    videoPlayer(out);
end

Release the video player.

release(videoPlayer);

References

[1] Lucas, Bruce D., and Takeo Kanade. “An Iterative Image Registration Technique with an Application to Stereo Vision.” Proceedings of the 7th International Joint Conference on Artificial Intelligence, April 1981, pp. 674–679.

[2] Tomasi, Carlo, and Takeo Kanade. Detection and Tracking of Point Features. Computer Science Department, Carnegie Mellon University, April 1991.

[3] Shi, Jianbo, and Carlo Tomasi. “Good Features to Track.” IEEE Conference on Computer Vision and Pattern Recognition, 1994, pp. 593–600.

[4] Kalal, Zdenek, Krystian Mikolajczyk, and Jiri Matas. “Forward-Backward Error: Automatic Detection of Tracking Failures.” Proceedings of the 20th International Conference on Pattern Recognition, 2010, pp. 2756–2759.


Version History

Introduced in R2012b