Implement Face Recognition with OpenCV: A Beginner Guide

Face recognition with OpenCV

OpenCV provides support to perform face recognition (https://docs.opencv.org/4.0.1/dd/d65/classcv_1_1face_1_1FaceRecognizer.html). Indeed, OpenCV provides three different implementations to use:

Eigenfaces

Fisherfaces

Local Binary Patterns Histograms (LBPH)

These implementations perform the recognition in different ways. However, you can use any of them by changing only the way the recognizers are created. More specifically, to create these recognizers, the following code is necessary:

face_recognizer = cv2.face.LBPHFaceRecognizer_create()

face_recognizer = cv2.face.EigenFaceRecognizer_create()

face_recognizer = cv2.face.FisherFaceRecognizer_create()

Once created, and independently of the specific internal algorithm OpenCV uses to perform the face recognition, the two key methods, train() and predict(), should be used to train and test the face recognition system. It should be noted that the way we use these methods is independent of the recognizer created.

Therefore, it is very easy to try all three recognizers and select the one that offers the best performance for a specific task. Having said that, LBPH should provide better results than the other two methods when recognizing images in the wild, where varying environments and lighting conditions are usually involved. Additionally, the LBPH face recognizer supports the update() method, with which you can update the recognizer given new data. The Eigenfaces and Fisherfaces methods do not support this functionality.

In order to train the recognizer, the train() method should be called:

face_recognizer.train(faces, labels)

The cv2.face_FaceRecognizer.train(src, labels) method trains the specific face recognizer, where src corresponds to the training set of images (faces), and the labels parameter sets the corresponding label for each image in the training set.

To recognize a new face, the predict() method should be called:

label, confidence = face_recognizer.predict(face)

The cv2.face_FaceRecognizer.predict(src) method performs the recognition of the new src image, outputting the predicted label and the associated confidence.

Finally, OpenCV also provides the write() and read() methods to save the created model and to load a previously created model, respectively. For both methods, the filename parameter sets the name of the model to save or load:

cv2.face_FaceRecognizer.write(filename)

cv2.face_FaceRecognizer.read(filename)

As mentioned, the LBPH face recognizer can be updated using the update() method:

cv2.face_FaceRecognizer.update(src, labels)

Here, src and labels set the new training examples that are going to be used to update the LBPH recognizer.

Related

Getting All the Properties of Video Using OpenCV: A Simple Guide

First, we create the read_video_file_all_properties.py script to show all the properties. Some of these properties only work when we're working with cameras (not with video files). In these cases, a 0 value is returned.

Additionally, we have created the decode_fourcc() function, which converts the value returned by capture.get(cv2.CAP_PROP_FOURCC) (a float that contains the int representation of the codec) into a readable codec name. This value should be converted into a four-byte char representation to output the codec properly, and the decode_fourcc() function copes with this. The code of this function is given as follows:

def decode_fourcc(fourcc):
    """Decodes the fourcc value to get the four chars identifying it"""
    fourcc_int = int(fourcc)

    # We print the int value of fourcc
    print("int value of fourcc: '{}'".format(fourcc_int))

    # We can also perform this in one line:
    # return "".join([chr((fourcc_int >> 8 * i) & 0xFF) for i in range(4)])

    fourcc_decode = ""
    for i in range(4):
        int_value = fourcc_int >> 8 * i & 0xFF
        print("int_value: '{}'".format(int_value))
        fourcc_decode += chr(int_value)
    return fourcc_decode

To explain how it works, the following diagram summarizes the main steps. As you can see, the first step is to obtain the int representation of the value returned by capture.get(cv2.CAP_PROP_FOURCC), which is a float. Then, we iterate four times, extracting eight bits at a time and converting them into an int. Finally, each int value is converted into a char using the chr() function. It should be noted that we can perform this function in only one line of code, as follows:

return "".join([chr((fourcc_int >> 8 * i) & 0xFF) for i in range(4)])

The CAP_PROP_POS_FRAMES property gives you the current frame of the video file, and the CAP_PROP_POS_MSEC property gives you the timestamp of the current frame. We can also get the number of fps with the CAP_PROP_FPS property.
The CAP_PROP_FRAME_COUNT property gives you the total number of frames of the video file. To get and print all the properties, use the following code:

# Get and print these values:
print("CAP_PROP_FPS : '{}'".format(capture.get(cv2.CAP_PROP_FPS)))
print("CAP_PROP_POS_MSEC : '{}'".format(capture.get(cv2.CAP_PROP_POS_MSEC)))
print("CAP_PROP_POS_FRAMES : '{}'".format(capture.get(cv2.CAP_PROP_POS_FRAMES)))
print("CAP_PROP_FOURCC : '{}'".format(decode_fourcc(capture.get(cv2.CAP_PROP_FOURCC))))
print("CAP_PROP_FRAME_COUNT : '{}'".format(capture.get(cv2.CAP_PROP_FRAME_COUNT)))
print("CAP_PROP_MODE : '{}'".format(capture.get(cv2.CAP_PROP_MODE)))
print("CAP_PROP_BRIGHTNESS : '{}'".format(capture.get(cv2.CAP_PROP_BRIGHTNESS)))
print("CAP_PROP_CONTRAST : '{}'".format(capture.get(cv2.CAP_PROP_CONTRAST)))
print("CAP_PROP_SATURATION : '{}'".format(capture.get(cv2.CAP_PROP_SATURATION)))
print("CAP_PROP_HUE : '{}'".format(capture.get(cv2.CAP_PROP_HUE)))
print("CAP_PROP_GAIN : '{}'".format(capture.get(cv2.CAP_PROP_GAIN)))
print("CAP_PROP_EXPOSURE : '{}'".format(capture.get(cv2.CAP_PROP_EXPOSURE)))
print("CAP_PROP_CONVERT_RGB : '{}'".format(capture.get(cv2.CAP_PROP_CONVERT_RGB)))
print("CAP_PROP_RECTIFICATION : '{}'".format(capture.get(cv2.CAP_PROP_RECTIFICATION)))
print("CAP_PROP_ISO_SPEED : '{}'".format(capture.get(cv2.CAP_PROP_ISO_SPEED)))
print("CAP_PROP_BUFFERSIZE : '{}'".format(capture.get(cv2.CAP_PROP_BUFFERSIZE)))

You can view the full code of this script in the read_video_file_all_properties.py file.
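As a quick check of the byte layout that decode_fourcc() undoes, you can pack a four-character codec string into its int representation yourself and decode it back. This round-trip sketch is independent of OpenCV; encode_fourcc() is a helper written here for illustration (in OpenCV itself, cv2.VideoWriter_fourcc plays this role):

```python
def encode_fourcc(code):
    """Pack a four-character codec string into its int representation,
    least-significant byte first (the layout that decode_fourcc undoes)."""
    return sum(ord(ch) << 8 * i for i, ch in enumerate(code))

def decode_fourcc(fourcc):
    """Decode the (possibly float) fourcc value back into four chars."""
    fourcc_int = int(fourcc)
    return "".join([chr((fourcc_int >> 8 * i) & 0xFF) for i in range(4)])

value = float(encode_fourcc("MJPG"))  # capture.get() returns a float
print(decode_fourcc(value))  # MJPG
```

Because each character occupies one byte, shifting by 8 * i extracts the i-th character, which is why the loop in decode_fourcc() runs exactly four times.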

Implement Simple Thresholding to Images in OpenCV: A Beginner Guide

In order to perform simple thresholding, OpenCV provides the cv2.threshold() function, which was briefly introduced in the previous section. The signature for this method is as follows:

cv2.threshold(src, thresh, maxval, type, dst=None) -> retval, dst

The cv2.threshold() function applies a fixed-level thresholding to the src input array (multiple-channel, 8-bit or 32-bit floating point). The fixed level is adjusted by the thresh parameter, which sets the threshold value. The type parameter sets the thresholding type, which will be further explained in the next subsection. The different types are as follows:

cv2.THRESH_BINARY

cv2.THRESH_BINARY_INV

cv2.THRESH_TRUNC

cv2.THRESH_TOZERO

cv2.THRESH_TOZERO_INV

cv2.THRESH_OTSU

cv2.THRESH_TRIANGLE

Additionally, the maxval parameter sets the maximum value, used only with the cv2.THRESH_BINARY and cv2.THRESH_BINARY_INV thresholding types. Finally, the input image should be single channel only when using the cv2.THRESH_OTSU and cv2.THRESH_TRIANGLE thresholding types. In this section, we will examine all the possible configurations to understand all of these parameters.

Thresholding types

The types of thresholding operation are described according to their formulation. Take into account that src is the source (original) image and dst corresponds to the destination (result) image after thresholding. In this sense, src(x, y) will correspond to the intensity of the pixel (x, y) of the source image, and dst(x, y) will correspond to the intensity of the pixel (x, y) of the destination image.

Here is the formula for cv2.THRESH_BINARY:

dst(x, y) = maxval if src(x, y) > thresh, otherwise 0

So, if the intensity of the pixel src(x, y) is higher than thresh, then the new pixel intensity is set to the maxval parameter. Otherwise, the pixel is set to 0.

Here is the formula for cv2.THRESH_BINARY_INV:

dst(x, y) = 0 if src(x, y) > thresh, otherwise maxval

So, if the intensity of the pixel src(x, y) is higher than thresh, then the new pixel intensity is set to 0. Otherwise, it is set to maxval.
Here is the formula for cv2.THRESH_TRUNC:

dst(x, y) = thresh if src(x, y) > thresh, otherwise src(x, y)

So, if the intensity of the pixel src(x, y) is higher than thresh, then the new pixel intensity is set to thresh. Otherwise, it is set to src(x, y).

Here is the formula for cv2.THRESH_TOZERO:

dst(x, y) = src(x, y) if src(x, y) > thresh, otherwise 0

So, if the intensity of the pixel src(x, y) is higher than thresh, the new pixel value will be set to src(x, y). Otherwise, it is set to 0.

Here is the formula for cv2.THRESH_TOZERO_INV:

dst(x, y) = 0 if src(x, y) > thresh, otherwise src(x, y)

So, if the intensity of the pixel src(x, y) is greater than thresh, the new pixel value will be set to 0. Otherwise, it is set to src(x, y).

Also, the special cv2.THRESH_OTSU and cv2.THRESH_TRIANGLE values can be combined with one of the values previously introduced (cv2.THRESH_BINARY, cv2.THRESH_BINARY_INV, cv2.THRESH_TRUNC, cv2.THRESH_TOZERO, and cv2.THRESH_TOZERO_INV). In these cases (cv2.THRESH_OTSU and cv2.THRESH_TRIANGLE), the thresholding operation (implemented only for 8-bit images) computes the optimal threshold value instead of using the specified thresh value. It should be noted that the thresholding operation then returns the computed optimal threshold value.

The thresholding_simple_types.py script helps you understand the aforementioned types. We use the same sample image introduced in the previous section, and we perform a thresholding operation with a fixed threshold value (thresh = 100) with all the previous types. […]
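The per-pixel rules behind these types can be expressed in a few lines of plain Python on a single intensity value. This is an illustration of the formulas only, not OpenCV's implementation, and the string type names are made up for the sketch:

```python
def threshold_pixel(src, thresh, maxval, ttype):
    """Apply one of the simple thresholding formulas to a single intensity
    value (illustrative re-statement of the cv2.threshold() rules)."""
    if ttype == "binary":        # dst = maxval if src > thresh else 0
        return maxval if src > thresh else 0
    if ttype == "binary_inv":    # dst = 0 if src > thresh else maxval
        return 0 if src > thresh else maxval
    if ttype == "trunc":         # dst = thresh if src > thresh else src
        return thresh if src > thresh else src
    if ttype == "tozero":        # dst = src if src > thresh else 0
        return src if src > thresh else 0
    if ttype == "tozero_inv":    # dst = 0 if src > thresh else src
        return 0 if src > thresh else src
    raise ValueError("unknown thresholding type: {}".format(ttype))

# A tiny 'image' row thresholded with thresh = 100 and maxval = 255:
row = [30, 100, 101, 200]
print([threshold_pixel(p, 100, 255, "binary") for p in row])  # [0, 0, 255, 255]
```

Note that the comparison is strict (src > thresh), which is why an intensity exactly equal to 100 falls on the "otherwise" side of every formula.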

Translating an Image in OpenCV: A Simple Introduction

Translating an image

In order to translate an object, you need to create the 2 x 3 transformation matrix by using a NumPy array with float values providing the translation, in pixels, in both the x and y directions, as shown in the following code:

M = np.float32([[1, 0, x], [0, 1, y]])

This gives the following M transformation matrix:

M = [[1, 0, x],
     [0, 1, y]]

Once this matrix has been created, the cv2.warpAffine() function is called, as shown in the following code:

dst_image = cv2.warpAffine(image, M, (width, height))

The cv2.warpAffine() function transforms the source image using the M matrix provided. The third argument, (width, height), establishes the size of the output image. Remember that image.shape returns (height, width). For example, if we want to translate an image by 200 pixels in the x direction and 30 pixels in the y direction, we use the following:

height, width = image.shape[:2]
M = np.float32([[1, 0, 200], [0, 1, 30]])
dst_image = cv2.warpAffine(image, M, (width, height))

Note that the translation can also be negative, as shown in the following code:

M = np.float32([[1, 0, -200], [0, 1, -30]])
dst_image = cv2.warpAffine(image, M, (width, height))
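To see numerically what the matrix does, you can apply M to a single pixel coordinate by hand: an affine transform maps (x, y) to (M[0][0]*x + M[0][1]*y + M[0][2], M[1][0]*x + M[1][1]*y + M[1][2]). A minimal NumPy sketch follows; apply_affine() is a helper written for illustration, not an OpenCV function:

```python
import numpy as np

# The 2 x 3 translation matrix for a shift of 200 px in x and 30 px in y
M = np.float32([[1, 0, 200], [0, 1, 30]])

def apply_affine(M, x, y):
    """Map the point (x, y) through a 2 x 3 affine matrix, as warpAffine
    conceptually does for every pixel (illustration only)."""
    src = np.array([x, y, 1.0], dtype=np.float32)  # homogeneous coordinate
    return M @ src  # -> array([x', y'])

print(apply_affine(M, 10, 20))  # a pixel at (10, 20) moves to (210, 50)
```

Because the upper-left 2 x 2 block of M is the identity, only the last column acts, which is exactly what makes this matrix a pure translation.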

Understanding Adaptive Thresholding in OpenCV: A Beginner Tutorial

Adaptive thresholding

In the previous section, we applied cv2.threshold() using a global threshold value. As we could see, the results obtained were not very good due to the different illumination conditions in the different areas of the image. In these cases, you can try adaptive thresholding. In OpenCV, adaptive thresholding is performed by the cv2.adaptiveThreshold() function. The signature for this method is as follows:

adaptiveThreshold(src, maxValue, adaptiveMethod, thresholdType, blockSize, C[, dst]) -> dst

This function applies an adaptive threshold to the src array (8-bit single-channel image). The maxValue parameter sets the value for the pixels in the dst image for which the condition is satisfied. The adaptiveMethod parameter sets the adaptive thresholding algorithm to use:

cv2.ADAPTIVE_THRESH_MEAN_C: The T(x, y) threshold value is calculated as the mean of the blockSize x blockSize neighborhood of (x, y), minus the C parameter

cv2.ADAPTIVE_THRESH_GAUSSIAN_C: The T(x, y) threshold value is calculated as the Gaussian-weighted sum of the blockSize x blockSize neighborhood of (x, y), minus the C parameter

The blockSize parameter sets the size of the neighborhood area used to calculate a threshold value for the pixel, and it can take the odd values 3, 5, 7, and so forth. The C parameter is just a constant subtracted from the means or weighted means (depending on the adaptive method set by the adaptiveMethod parameter). Commonly, this value is positive, but it can be zero or negative. Finally, the thresholdType parameter sets the thresholding type: cv2.THRESH_BINARY or cv2.THRESH_BINARY_INV.
The thresholding_adaptive.py script applies adaptive thresholding to a test image using both the cv2.ADAPTIVE_THRESH_MEAN_C and cv2.ADAPTIVE_THRESH_GAUSSIAN_C methods, according to the following formulas, where T(x, y) is the threshold calculated for each pixel.

Here is the formula for cv2.THRESH_BINARY:

dst(x, y) = maxValue if src(x, y) > T(x, y), otherwise 0

Here is the formula for cv2.THRESH_BINARY_INV:

dst(x, y) = 0 if src(x, y) > T(x, y), otherwise maxValue

The output of this script can be seen in the following screenshot, which shows the output after applying cv2.adaptiveThreshold() with different parameters. As previously mentioned, if your task is to recognize the digits, adaptive thresholding can give you better thresholded images. However, as you can also see, a lot of noise appears in the image. In order to deal with it, you can apply some smoothing operations (see Chapter 5, Image Processing Techniques). In this case, we can apply a bilateral filter, because it is highly useful for noise removal while keeping edges sharp. In order to apply a bilateral filter, OpenCV provides the cv2.bilateralFilter() function. Therefore, we can apply the function before thresholding the image as follows:

gray_image = cv2.bilateralFilter(gray_image, 15, 25, 25)

The code for this example can be seen in the thresholding_adaptive_filter_noise.py script. The output can be seen in the following screenshot. You can see that applying a smoothing filter is a good solution to deal with noise; in this case, a bilateral filter is applied because we want to keep the edges sharp.
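The mean variant can be sketched in plain Python to make T(x, y) concrete. This is an illustration only: OpenCV's implementation is optimized and replicates border pixels, whereas this sketch simply clips the neighborhood at the image edges.

```python
def adaptive_threshold_mean(img, max_value, block_size, c):
    """Illustrative mean adaptive thresholding (THRESH_BINARY condition):
    each pixel is compared against the mean of its block_size x block_size
    neighborhood minus c. Borders are clipped, unlike OpenCV's replication."""
    h, w = len(img), len(img[0])
    half = block_size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Gather the (clipped) neighborhood around (x, y)
            neighborhood = [
                img[j][i]
                for j in range(max(0, y - half), min(h, y + half + 1))
                for i in range(max(0, x - half), min(w, x + half + 1))
            ]
            t = sum(neighborhood) / len(neighborhood) - c  # T(x, y)
            out[y][x] = max_value if img[y][x] > t else 0
    return out

# A 3 x 3 'image' with one bright pixel against a dark background:
image = [[10, 10, 10],
         [10, 200, 10],
         [10, 10, 10]]
result = adaptive_threshold_mean(image, 255, 3, 2)
```

The bright center pixel exceeds its local mean and becomes max_value, while the dark background pixels do not, which is the behavior that makes the method robust to uneven illumination.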

Compressing Contours in OpenCV: A Beginner Guide

Compressing contours

Detected contours can be compressed to reduce the number of points. In this sense, OpenCV provides several methods to reduce the number of points, which can be set with the method parameter. Additionally, this compression can be disabled by setting the flag to cv2.CHAIN_APPROX_NONE, where all boundary points are stored and, hence, no compression is performed.

The cv2.CHAIN_APPROX_SIMPLE method can be used to compress the detected contours because it compresses horizontal, vertical, and diagonal segments of the contour, preserving only their endpoints. For example, if we use cv2.CHAIN_APPROX_SIMPLE to compress the contour of a rectangle, it will only be composed of four points.

Finally, OpenCV provides two more flags for compressing contours based on the Teh-Chin algorithm, which is a non-parametric method. The first step of this algorithm determines the region of support (ROS) for each point based on its local properties. Next, the algorithm computes measures of the relative significance of each point. Finally, dominant points are detected by a process of non-maxima suppression. The algorithm uses three different measures of significance, corresponding to different degrees of accuracy of discrete curvature measures:

K-cosine measure

K-curvature measure

1-curvature measure

Therefore, in connection with the discrete curvature measures, OpenCV provides two flags: cv2.CHAIN_APPROX_TC89_L1 and cv2.CHAIN_APPROX_TC89_KCOS. For a deeper explanation of this algorithm, you can see the publication On the Detection of Dominant Points on Digital Curves (1989). Just for clarification, TC89 encodes the initials of the authors' names (Teh and Chin) and the year of publication (1989).

In contours_approximation_method.py, the four aforementioned flags (cv2.CHAIN_APPROX_NONE, cv2.CHAIN_APPROX_SIMPLE, cv2.CHAIN_APPROX_TC89_L1, and cv2.CHAIN_APPROX_TC89_KCOS) for the method parameter are used to encode the two detected contours in the image.
The output of this script can be seen in the next screenshot: As can be seen, the points defining the contour are shown in white, showing how the four methods (cv2.CHAIN_APPROX_NONE, cv2.CHAIN_APPROX_SIMPLE, cv2.CHAIN_APPROX_TC89_L1, and cv2.CHAIN_APPROX_TC89_KCOS) compress the detected contours for the two provided shapes.
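The idea behind cv2.CHAIN_APPROX_SIMPLE (keep only the endpoints of straight runs of boundary points) can be sketched in plain Python. compress_contour() is written here for illustration and is not how OpenCV implements the flag internally:

```python
def compress_contour(points):
    """Drop intermediate points that continue in the same direction as the
    previous step, keeping only segment endpoints (illustration of the
    CHAIN_APPROX_SIMPLE idea for unit-step boundary points)."""
    if len(points) < 3:
        return list(points)
    compressed = [points[0]]
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        # Keep the point only where the step direction changes
        if (x1 - x0, y1 - y0) != (x2 - x1, y2 - y1):
            compressed.append(points[i])
    compressed.append(points[-1])
    return compressed

# Boundary points along the top edge and right edge of a small rectangle:
edge = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2)]
print(compress_contour(edge))  # [(0, 0), (3, 0), (3, 2)]
```

The six boundary points collapse to the three corner endpoints, mirroring how a full rectangle contour collapses to four points under cv2.CHAIN_APPROX_SIMPLE.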