Algorithms 

Viola-Jones
KLT

The Kanade-Lucas-Tomasi (KLT) algorithm is a feature tracker that let us extend our Viola-Jones-based approach to videos. While the Viola-Jones algorithm focuses on detecting and locating features, the KLT algorithm focuses on tracking features once they have been detected. It finds good points to track on a detected feature and uses the image gradients of sequential frames to estimate how the feature has moved.
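The gradient-based motion estimate described above can be sketched in a few lines. This is a minimal pure-Python illustration of a single Lucas-Kanade step on one window, not the MATLAB implementation used in this project. The function names (`window_sums`, `trackability`, `estimate_shift`), the window half-width, and the `min_score` cutoff are all assumptions made for the example. Practical KLT trackers repeat this step iteratively and usually across several image scales.

```python
import math


def window_sums(prev, curr, cx, cy, half=2):
    """Accumulate gradient products over the square window centred at (cx, cy)."""
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # horizontal gradient
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0  # vertical gradient
            it = curr[y][x] - prev[y][x]                  # change between frames
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    return sxx, sxy, syy, sxt, syt


def trackability(sxx, sxy, syy):
    """Smaller eigenvalue of the 2x2 gradient matrix; larger means easier to track."""
    mean = (sxx + syy) / 2.0
    spread = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    return mean - spread


def estimate_shift(prev, curr, cx, cy, half=2, min_score=1e-6):
    """Return an estimated (dx, dy) for the window, or None if it is not trackable."""
    sxx, sxy, syy, sxt, syt = window_sums(prev, curr, cx, cy, half)
    if trackability(sxx, sxy, syy) < min_score:
        return None  # flat or edge-only window: the motion is ambiguous
    det = sxx * syy - sxy * sxy
    # Least-squares solution of ix*dx + iy*dy = -it over the whole window.
    dx = (-syy * sxt + sxy * syt) / det
    dy = (sxy * sxt - sxx * syt) / det
    return dx, dy


def demo_frame(dx, dy, n=11):
    """Hypothetical test pattern: a smooth bowl shifted by (dx, dy) pixels."""
    return [[(x - dx) ** 2 + (y - dy) ** 2 for x in range(n)] for y in range(n)]


shift = estimate_shift(demo_frame(0, 0), demo_frame(0.1, 0.05), cx=5, cy=5)
print(shift)  # close to (0.1, 0.05); a single step is only approximate
```

The `trackability` check reflects the idea of choosing "good points to track": a window whose gradients all point the same way (or vanish) cannot pin down motion in both directions, so it is skipped rather than tracked badly.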

The KLT algorithm had its own pros and cons, which we ran into while implementing our filters on video files:

Pros:

-Based on the same concept as our filters for still images, so it was relatively simple to extend our code to work for videos

-We were able to use all the same filters as those for still images, making it easier to judge how our results looked

Cons:

-Fairly long run time. To run in real time like Snapchat, we would have to find a way to significantly reduce the time our program takes to execute

-Difficulty with multiple faces. Filtering videos with multiple faces present was much harder to implement: it required more computation time, which further slowed the process, and it required us to keep checking for new features in successive frames

-The KLT algorithm sometimes struggled to track features like eyes, which were often in shadow in some frames of a video

The Viola-Jones algorithm is what MATLAB uses to identify a given feature with the CascadeObjectDetector. It is based on the detection of Haar features.

Haar features share many similarities with Haar wavelets and the Haar basis, but were created much later.

Haar features are patterns of adjacent light and dark rectangles that match brightness contrasts common to faces. For example, a person's eyes are darker than their cheeks. By relying on these regularities shared across faces, we can begin to identify the faces within an image. For the output to be accurate, a large number of features must be used. The algorithm runs as follows:

-The image is split into windows of a desired size. Since we don't interface with the algorithm directly, we don't know what this size is, but we know that it is consistent.

-Each of these subwindows is tested for a specific Haar feature and, using a threshold, gets either a definite "no" or a "maybe". If the result is "no", the region is ignored in the next iteration of the algorithm.

-If the result is a "maybe", the region is tested for another set of Haar features common to faces. This is called cascading, and it continues until a bounding box is created around the detected object.
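The steps above can be sketched as follows. This is a simplified pure-Python illustration, not MATLAB's implementation: each stage here checks a single two-rectangle feature against a hand-picked threshold, whereas a trained cascade combines many learned features per stage. The integral image is the standard Viola-Jones trick for summing rectangles quickly. All function names, the window size, and the thresholds are assumptions for the example.

```python
def integral_image(img):
    """ii[y][x] holds the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y), in four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Lower half minus upper half: large when a dark band sits above a bright one."""
    half = h // 2
    top = rect_sum(ii, x, y, w, half)
    bottom = rect_sum(ii, x, y + half, w, half)
    return bottom - top


def run_cascade(ii, x, y, size, stages):
    """Test one window against each (feature, threshold) stage in order."""
    for feature, threshold in stages:
        if feature(ii, x, y, size, size) < threshold:
            return False  # definite "no": stop testing this window
    return True  # every stage answered "maybe", so report a detection


def detect(img, size, step, stages):
    """Slide a size-by-size window over img and collect the surviving boxes."""
    ii = integral_image(img)
    boxes = []
    for y in range(0, len(img) - size + 1, step):
        for x in range(0, len(img[0]) - size + 1, step):
            if run_cascade(ii, x, y, size, stages):
                boxes.append((x, y, size, size))
    return boxes


# Toy image: dark rows (like eyes) above bright rows (like cheeks).
toy = [[0] * 4, [0] * 4, [10] * 4, [10] * 4]
# A loose test first, then a stricter one, mimicking successive cascade stages.
toy_stages = [(two_rect_feature, 10), (two_rect_feature, 40)]
print(detect(toy, size=4, step=1, stages=toy_stages))  # [(0, 0, 4, 4)]
```

Rejecting a window at the first failed stage is what keeps the cascade fast: most windows contain no face and are discarded after very little work.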

Pros:

-Efficient, even for large images. When running the code in MATLAB, detecting faces never took more than a second or two.

-The detector is both scale and location invariant, which is very important because the faces in the images we used all differ in size and location.

-This detection can be used for any feature, not just faces. We also detected noses and eyes using the algorithm.
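One common way to get scale and location invariance is to slide the detection window over every position and then repeat with progressively larger windows. The sketch below only enumerates those windows. The function name `sliding_windows`, the base size of 24 pixels, the growth factor, and the step fraction are illustrative assumptions, not the values MATLAB uses.

```python
def sliding_windows(img_w, img_h, base=24, scale=1.25, step_frac=0.1):
    """Yield (x, y, size) for square windows at every position and scale."""
    size = float(base)
    while size <= min(img_w, img_h):
        s = int(size)
        step = max(1, int(s * step_frac))  # larger windows take larger steps
        for y in range(0, img_h - s + 1, step):
            for x in range(0, img_w - s + 1, step):
                yield x, y, s
        size *= scale  # grow the window and scan the image again


# Count how many candidate windows a small 64x64 image produces.
count = sum(1 for _ in sliding_windows(64, 64))
print(count)
```

Each of these windows would be handed to the detector in turn, which is why a face is found regardless of where it sits in the image or how large it is.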

Cons:

-Can detect faces that aren't there. Since it is size and location invariant, the algorithm would sometimes detect small faces in the corner due to hair or rumpled clothes. 

-Does not work on turned or unlit faces. All the images we used and tested are basic front-facing headshots.
