Face Mask Detection using OpenCV and TensorFlow



In this blog, we'll discuss our two-phase face mask detector, detailing how our computer vision and deep learning pipeline will be implemented.


From there, we'll review the dataset we'll be using to train our face mask detector.


Then I'll show you how to train a face mask detector on our dataset using Keras and TensorFlow.


We'll use a Jupyter notebook to review our dataset, train the face mask detector, and review the results.


Given the trained face mask detector, we'll proceed to implement two additional Python scripts used to:

  1. Detect face masks in images

  2. Detect face masks in real-time video streams

We'll wrap up the post by looking at the results of applying our face mask detector.


Two-phase face mask detector


In order to train the face mask detector, we need to break our project into two distinct phases, each with its own respective sub-steps:

  • Training: Here we’ll focus on loading our face mask detection dataset from disk, training a model (using Keras/TensorFlow) on this dataset, and then serializing the face mask detector to disk

  • Deployment: Once the face mask detector is trained, we can then move on to loading the mask detector, performing face detection, and then classifying each face as with_mask or without_mask

We'll review each of these phases and their associated sub-steps in detail in the remainder of this blog, but in the meantime, let's take a look at the dataset we'll be using to train our face mask detector.


Our face mask detection dataset


The dataset we'll be using here today was created by Wobot Intelligence and is available on Kaggle.


The dataset consists of 6000+ images belonging to two classes:

  • with_mask: 4000 images

  • without_mask: 2000 images

You can download the dataset from Kaggle here.


Our goal is to train a deep learning model to detect whether or not a person is wearing a mask.


Project Structure


Once you've downloaded the dataset, extract it into your project directory and you will be presented with the following directory structure:

.
├── face-mask-detection-dataset
│   ├── Medical mask
│   │   └── Medical mask
│   │       └── Medical Mask
│   │           ├── annotations
│   │           └── images
│   ├── submission.csv
│   └── train.csv
├── EDA_and_train_model.ipynb
├── detect_mask_image.py
├── detect_mask_video.py
├── model.h5
└── test_image.jpg

The face-mask-detection-dataset/ directory contains the data described in the "Our face mask detection dataset" section.


We also include one sample image (test_image.jpg) so that you can test the static-image face mask detector.


We'll be reviewing one Jupyter notebook and two Python scripts in this blog:

  • EDA_and_train_model.ipynb

  • detect_mask_image.py

  • detect_mask_video.py

Now, let's get started.


Visualizing our face mask detector dataset using NumPy, Pandas, and Matplotlib

Let's begin with our first step: importing packages, loading the data, and analyzing it.


Import visualization libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import cv2
import os

Import all the necessary libraries and packages, like NumPy, pandas, and matplotlib, for data analysis and visualization. pandas is used to load, explore, and manipulate the CSV data in the dataset, and matplotlib is used to plot and visualize the data. OpenCV (cv2) will be used later to read and crop the face images.



Loading datasets

images = 'face-mask-detection-dataset/Medical mask/Medical mask/Medical Mask/images'
train = pd.read_csv('face-mask-detection-dataset/train.csv')

Load the image directory we'll be using for training our detector model, and load the CSV file containing the labels needed for supervised learning.


train.csv

train.head()








Check the CSV file using the pandas head() function and analyze the table: whether the file contains null values, what format the data is in, how many classes there are, which classes are useful and which are not, and whether each feature is useful. Then proceed with the data for further processing.
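The checks described above can be sketched with pandas on a tiny stand-in frame. The column names below follow the bounding-box convention used later in this post, but they are an illustration, not the exact dataset schema:

```python
import pandas as pd

# Tiny stand-in for train.csv; column names and values are assumed for illustration
train = pd.DataFrame({
    "name": ["0001.jpg", "0002.jpg", "0003.jpg"],
    "x1": [10, 5, 7], "x2": [12, 9, 8],
    "y1": [60, 55, 57], "y2": [62, 59, 58],
    "classname": ["face_with_mask", "face_no_mask", "face_with_mask"],
})

print(train.isnull().sum().sum())         # total number of missing values
print(train["classname"].value_counts())  # how many images per class
print(train["classname"].nunique())       # number of distinct classes
```

Running the same three calls on the real train.csv quickly reveals missing values, the class balance, and the full set of class names.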


We need only faces with and without masks

options = ['face_with_mask', 'face_no_mask']
train = train[train['classname'].isin(options)]
train.sort_values('name', axis=0, inplace=True)

After analyzing the data, we found that some classes aren't needed for our model, since we are building a model that detects whether or not a person is wearing a face mask. So we keep only the two classes we actually need: 'face_with_mask' and 'face_no_mask'.



Creating training data

Now let's convert our raw data into the form we'll feed to our model for training.


Define variables

data = []
img_size = 50

Arrange data

for i in range(len(train)):
    arr = list(train.iloc[i])
    # read the full image in grayscale
    img_array = cv2.imread(os.path.join(images, arr[0]), cv2.IMREAD_GRAYSCALE)
    # crop the face using the bounding-box columns from the CSV
    crop_image = img_array[arr[2]:arr[4], arr[1]:arr[3]]
    # resize every face crop to a fixed 50x50 input size
    new_img_array = cv2.resize(crop_image, (img_size, img_size))
    data.append([new_img_array, arr[5]])

Here we gather everything into a single list: each entry holds a cropped, resized face image together with its label. The bounding-box coordinates from the CSV are used to crop the faces out of the full images for training.
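Since this section is about visualizing the dataset, it's worth eyeballing a few of the cropped faces with matplotlib before training. Here is a minimal sketch; the random arrays stand in for the real 50x50 grayscale crops in `data` so the snippet is self-contained:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Stand-in for the `data` list built above: (50x50 grayscale crop, label) pairs.
# Random pixels are used here purely for illustration.
rng = np.random.default_rng(0)
data = [(rng.integers(0, 256, (50, 50), dtype=np.uint8), label)
        for label in ("face_with_mask", "face_no_mask") * 2]

# show the first four crops side by side with their labels
fig, axes = plt.subplots(1, 4, figsize=(8, 2))
for ax, (img, label) in zip(axes, data[:4]):
    ax.imshow(img, cmap="gray")
    ax.set_title(label, fontsize=8)
    ax.axis("off")
fig.savefig("sample_faces.png")
```

With the real `data` list in the notebook, the same loop lets you verify that the bounding boxes actually contain faces and that the labels line up.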


Separate training data and labels

x=[]
y=[]
for features, labels in data:
    x.append(features)
    y.append(labels)
from sklearn.preprocessing import LabelEncoder
lbl=LabelEncoder()
y=lbl.fit_transform(y)

Here we separate the independent and dependent variables and label-encode the class names using the LabelEncoder class.

The independent variable (x) consists of the image data, and the dependent variable (y) consists of the labels for training our face mask detector model.
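It's worth checking which integer LabelEncoder assigns to each class, because the detection scripts later map 0 and 1 back to text. LabelEncoder sorts class names alphabetically:

```python
from sklearn.preprocessing import LabelEncoder

lbl = LabelEncoder()
y = lbl.fit_transform(["face_with_mask", "face_no_mask", "face_with_mask"])

# classes_ is sorted alphabetically: 'face_no_mask' -> 0, 'face_with_mask' -> 1
print(list(lbl.classes_))  # ['face_no_mask', 'face_with_mask']
print(list(y))             # [1, 0, 1]
```

So a prediction of 1 means "mask" and 0 means "no mask", which is the mapping the inference code must use.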


Convert data into numpy array

import tensorflow as tf
from tensorflow.keras.utils import to_categorical

x = np.array(x).reshape(-1, 50, 50, 1)
x = tf.keras.utils.normalize(x, axis=1)
y = to_categorical(y)

Now we scale the image data by normalizing it, which helps the model train more reliably and improves accuracy. The labels are also converted to one-hot categorical form, as required by the categorical_crossentropy loss we'll use.
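to_categorical simply one-hot encodes the integer labels. A NumPy equivalent makes the transformation concrete:

```python
import numpy as np

y = np.array([0, 1, 1, 0])
one_hot = np.eye(2)[y]  # same result as tf.keras.utils.to_categorical(y, num_classes=2)
print(one_hot)
# [[1. 0.]
#  [0. 1.]
#  [0. 1.]
#  [1. 0.]]
```

Each label becomes a row with a 1 in its class position, matching the two-unit softmax output layer we build next.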



Training Face Mask Detection Model using TensorFlow/Keras

Now let's come to the most interesting part: training our model and making it ready to produce predictions on new inputs.


Import TensorFlow libraries

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dropout
from tensorflow.keras.optimizers import Adam

We are using TensorFlow to train this model, so import all the required packages and proceed to model building.


Build Model

model = Sequential([
    Conv2D(100, (3, 3), input_shape=x.shape[1:], activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(50, activation='relu'),
    Dropout(0.2),
    Dense(2, activation='softmax')
])

Our model stacks two Conv2D + MaxPooling2D blocks, where the first Conv2D layer takes inputs of shape (50, 50, 1) with 'relu' activation. These are followed by a Flatten layer, a Dense hidden layer with Dropout, and finally a Dense output layer with 'softmax' activation and an output dimension of two, since we have two classes to predict.


Train Model

opt = Adam(learning_rate=1e-3)  # the 'lr' and 'decay' argument names are deprecated in recent TensorFlow
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])

history = model.fit(x, y, epochs=30, batch_size=5)

We define the Adam optimizer and compile the model with the categorical_crossentropy loss function, tracking accuracy as the evaluation metric during training.

Finally, we train the model with the fit function, passing the training data and labels, with 30 epochs and a batch size of 5 images per step.
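Note that we train on all the data with no held-out set, so the accuracy reported by fit is training accuracy only. A common refinement, sketched below with hypothetical stand-in arrays rather than the real dataset, is to hold out a portion of the data so overfitting can be spotted:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for the preprocessed arrays: 100 normalized
# 50x50x1 images with one-hot labels (random data, for illustration only)
rng = np.random.default_rng(42)
x = rng.random((100, 50, 50, 1), dtype=np.float32)
y = np.eye(2)[rng.integers(0, 2, 100)]

# Hold out 20% of the data; the model could then be trained with e.g.
# model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=30, batch_size=5)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
print(x_train.shape, x_test.shape)  # (80, 50, 50, 1) (20, 50, 50, 1)
```

Comparing training and validation curves makes the overfitting check discussed below far more reliable than the training curve alone.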


Plot training loss and accuracy

N = 30
H = history
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, N), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, N), H.history["accuracy"], label="train_acc")
plt.title("Training Loss and Accuracy")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")
plt.show()










By plotting the training loss and accuracy, we can see how our model is training: whether the loss is decreasing, how high the accuracy is, and whether the model is overfitting or underfitting. We can then make changes to the model to improve accuracy.


Save Model

model.save('model.h5')

Save our trained model so it can be used in platform-based applications, such as Android or web apps. We'll also use this saved model for real-time face mask detection.



Implementing our face mask detector for images with OpenCV


Now that our face mask detector is trained, let’s learn how we can:

  1. Load an input image from disk

  2. Detect faces in the image

  3. Apply our face mask detector to classify the face as either with_mask or without_mask

Open up the detect_mask_image.py file in your directory structure, and let's get started:

The complete source code is available here.


Import packages

import cv2
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import load_model

Import pre-trained models

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
classifier = load_model('model.h5')

We need to locate the face before we can detect a mask on it, so load the haarcascade_frontalface_default.xml file for face detection.

To download the Haar cascade XML file, click here.

Load the Haar cascade file using OpenCV's CascadeClassifier class by passing the path to the file.

Also, load our trained face mask detection model using the load_model function from TensorFlow/Keras.


Recognition Function

def recognize(img):
    img = cv2.resize(img, (50, 50))
    x = np.array(img).reshape(-1, 50, 50, 1)
    x = tf.keras.utils.normalize(x, axis=1)
    pred = classifier.predict(x)
    pred = np.argmax(pred, axis=1)
    # LabelEncoder sorts classes alphabetically:
    # 0 = 'face_no_mask', 1 = 'face_with_mask'
    sol = None
    if pred == 1:
        sol = 'Mask'
    elif pred == 0:
        sol = 'No Mask'

    return sol

This function takes a cropped face image as input and returns the model's prediction, as the text 'Mask' or 'No Mask', of whether or not the person is wearing a mask.


Make Predictions

# take a sample image for testing the implementation
frame = cv2.imread('test_image.jpg')
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, 1.3, 5)
for face in faces:
    x1, y1, width, height = face
    x2, y2 = x1 + width, y1 + height
    crop_image = gray[y1:y2, x1:x2]
    res = recognize(crop_image)
    cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)
    cv2.putText(frame, str(res), (x1 + 5, y1 - 5), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)


cv2.imshow('face detection- Haarcascade', frame)
cv2.waitKey(0)
cv2.destroyAllWindows()



Implementing our face mask detector in real-time video streams with OpenCV


At this point, we know we can apply face mask detection to static images — but what about real-time video streams?

Let’s find out.


Open up the detect_mask_video.py file in your directory structure, and insert the following code:

The complete source code is available here.


import cv2
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import load_model

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
classifier = load_model('model.h5')
vc = cv2.VideoCapture(0)


def recognize(img):
    img = cv2.resize(img, (50, 50))
    x = np.array(img).reshape(-1, 50, 50, 1)
    x = tf.keras.utils.normalize(x, axis=1)
    pred = classifier.predict(x)
    pred = np.argmax(pred, axis=1)
    sol = None
    if pred == 1:
        sol = 'Mask'
    elif pred == 0:
        sol = 'No Mask'

    return sol


while True:
    _, frame = vc.read()
    frame = cv2.flip(frame, 1)
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)
    for face in faces:
        x1, y1, width, height = face
        x2, y2 = x1 + width, y1 + height
        crop_image = gray[y1:y2, x1:x2]
        res = recognize(crop_image)
        if res == 'Mask':
            cv2.rectangle(frame, (x1,y1), (x2,y2), (0,255,0), 2)
        elif res == 'No Mask':
            cv2.rectangle(frame, (x1,y1), (x2,y2), (0,0,255), 2)

        cv2.putText(frame, str(res), (x1 + 5, y1 - 5), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)

    cv2.imshow('face detection- Haarcascade', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
vc.release()
cv2.destroyAllWindows()

Live video stream:













Here, you can see that our face mask detector is capable of running in real-time (and is correct in its predictions as well).


SOURCE CODE



Suggestions for improvement


As you can see from the results section above, our face mask detector is working quite well despite:

  1. Having limited training data

  2. Some of the with_mask images being artificially generated

To improve our face mask detection model further, you should gather actual images (rather than artificially generated images) of people wearing masks.

While our artificial dataset worked well in this case, there’s no substitute for the real thing.

Secondly, you should also gather images of faces that may "confuse" our classifier into thinking the person is wearing a mask when in fact they are not — potential examples include shirts wrapped around faces, bandanas over the mouth, etc.

All of these are examples of something that could be confused as a face mask by our face mask detector.

Finally, you should consider training a dedicated two-class object detector rather than a simple image classifier.

Our current method of detecting whether a person is wearing a mask or not is a two-step process:

  1. Step #1: Perform face detection

  2. Step #2: Apply our face mask detector to each face

The problem with this approach is that a face mask, by definition, obscures part of the face. If enough of the face is obscured, the face cannot be detected, and therefore, the face mask detector will not be applied.

To circumvent that issue, you should train a two-class object detector that consists of a with_mask class and without_mask class.

Combining an object detector with a dedicated with_mask class will improve the model in two respects.

First, the object detector will be able to naturally detect people wearing masks that otherwise would have been impossible for the face detector to detect due to too much of the face being obscured.

Secondly, this approach reduces our computer vision pipeline to a single step: rather than applying face detection and then our face mask detector model, all we need to do is apply the object detector, which gives us bounding boxes for people both with_mask and without_mask in a single forward pass of the network.


Not only is such a method more computationally efficient, but it’s also more “elegant” and end-to-end.
