Blog: Using a Neural Network for Face Tracking on Android

In one of our previous blog posts, we talked about face tracking basics and demonstrated a simple mobile face tracking application based on native iOS frameworks. However, that approach had some serious limitations. We will now take a step forward and talk about a more complex face tracking mobile application, in which a neural network detects users’ faces.

We should note that implementing neural networks in mobile applications proved to be quite a challenging task. Delivering a decent result across a wide range of mobile devices requires nontrivial solutions and careful optimization. Check the following video to learn more about using neural networks for image recognition in Android applications.

We can use this neural network or train a new one to solve specific tasks according to your business needs. After that, we can implement this neural network in the development of a mobile application for you. Feel free to contact us now to propel your mobile application development project! Find more details about this state-of-the-art application below.

Background on using neural networks for image recognition

Neural networks are non-linear models, loosely inspired by the way a human brain solves tasks. Thus, on some well-defined tasks, neural networks can deliver results similar to or even better than those of a human.

You train an artificial neural network on examples, much as a human brain learns. Moreover, the learning process can be directed toward a specific goal. In terms of image processing, this means that you can train the neural network to detect various objects (cats, dogs, weapons, you name it), not only human faces.

Training a neural network requires some time and, more importantly, serious computing power. Training can take 2-3 days or even around a week, depending on the complexity of the task and the size of the sample data set. Alternatively, we can use a previously trained neural network (plenty of them are already available) if it fits our requirements.

In practice, neural networks have fairly high hardware requirements. For mobile applications based on neural networks, the best results are usually achieved on top-class smartphones, and even mid-range devices can produce weak results in image recognition. So, various optimization tricks become a must-have for this type of application.

Face tracking neural network landmarks

In this Android face tracking application, we used a previously created neural network trained to find 68 landmarks on the user’s face. The network locates faces very precisely. It can also be used to recognize people, particularly if we assume a limited number of users. In this case, the biggest issue relates to adding new faces: every time this happens, you have to retrain the neural network.

Optimization best practices

Let us review the optimization best practices that you can apply in applications that use neural networks as their core feature.

  • There is no need to analyze the original high-resolution image from the smartphone’s camera. The neural network can precisely detect a face in a lower-resolution image, and will do it even faster. Therefore, we decided to reduce the image size before sending it to the neural network for face detection.
  • The neural network cannot process 30 images per second even on top-class smartphones (30 fps is a typical frame rate of a video stream). With this in mind, we developed a solution that keeps the desired frame rate of the video output: the neural network processes up to 10 images per second, and for the remaining frames we reuse the last detected position of the user’s face (or users’ faces).
  • After detecting a face, the application can simply track its movement instead of running the neural network on every frame. In fact, computing all 68 face landmarks for every frame in the video is excessive, because a face cannot disappear from the video in a fraction of a second.
  • Also, we can use the pattern of an already detected face to find it in subsequent frames instead of searching for a new face every time. The search for new faces can be performed just once or twice per second, which drastically decreases the number of requests to the neural network.
  • Another effective hack is to use 2 or 3 smaller neural networks instead of one big one. You can train them separately to detect the eyes, the mouth, and the facial contour, for example. This trick lets the work run in parallel on multiple processor cores, whereas one big neural network, in our case, could use only one core. This is critical for mobile devices with mid-range and entry-level processors.
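
The frame-skipping idea from the list above can be sketched as follows. This is a minimal illustration in plain Python, not the application's actual Android code; the class name `ThrottledDetector`, the `detect` callback, and the fixed-stride policy are assumptions made for the example.

```python
class ThrottledDetector:
    """Runs an expensive detector on some frames and reuses the last result on the rest."""

    def __init__(self, detect, camera_fps=30, max_detections_per_second=10):
        self.detect = detect  # hypothetical callback: frame -> detection result
        # Ceiling division: with 30 fps and 10 detections/s, detect on every 3rd frame.
        self.stride = max(1, -(-camera_fps // max_detections_per_second))
        self.frame_index = 0
        self.last_result = None

    def process(self, frame):
        """Return a detection for every frame, calling the detector only when due."""
        if self.frame_index % self.stride == 0:
            self.last_result = self.detect(frame)
        self.frame_index += 1
        return self.last_result


# Demo with a fake detector that records which frames it was asked to analyze.
calls = []

def fake_detect(frame):
    calls.append(frame)
    return ("face", frame)

detector = ThrottledDetector(fake_detect)
results = [detector.process(frame) for frame in range(9)]
```

Every frame still gets a result, so the video output keeps its frame rate, while the detector itself runs on only a third of the frames. A real application would likely base the decision on timestamps rather than a fixed stride, since camera frame rates vary.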

There are many other optimization strategies. However, in this blog post we decided to cover only those that deliver the most noticeable results.

Application structure

Let’s examine the internal structure of the application.

[Figure: Neural Network Face Tracking Android Application Structure]

We use the live picture from the smartphone’s camera as the data source in our application. The application extracts the image frame from the video stream and reduces its size. Then the app uses the OpenCV library to detect the position of the user’s face.

If the current frame includes the user’s face, the application crops away the redundant part of the image and passes the remaining face region to the neural network for further analysis.
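
The downscale and crop steps come down to simple rectangle arithmetic, sketched below in plain Python. The function names, the 480-pixel target size, and the 20% margin around the face box are assumptions for illustration; the original application relies on OpenCV for the actual image operations.

```python
def downscale(width, height, max_side=480):
    """Return (new_width, new_height, scale) so the longer side is at most max_side."""
    scale = min(1.0, max_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale)), scale


def expand_and_clamp(box, frame_w, frame_h, margin_percent=20):
    """Grow a face box (left, top, right, bottom) by a margin and keep it inside the frame."""
    left, top, right, bottom = box
    pad_x = (right - left) * margin_percent // 100
    pad_y = (bottom - top) * margin_percent // 100
    return (max(0, left - pad_x), max(0, top - pad_y),
            min(frame_w, right + pad_x), min(frame_h, bottom + pad_y))


def crop(pixels, box):
    """Cut the box out of an image stored as a list of pixel rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]


# Demo: a Full HD camera frame and a face box reported on the reduced image.
small_w, small_h, scale = downscale(1920, 1080)
face_box = expand_and_clamp((100, 50, 200, 150), small_w, small_h)
```

The margin is a hedge against a detector box that is slightly too tight; without it, landmarks on the jaw line could fall outside the cropped region.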

We use a customized dlib library to interact with the neural network. The dlib module receives the image and forwards it to the input of the neural network, and then processes the network’s output. This output consists of the coordinates of the 68 face landmarks detected by the neural network.
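
To make the shape of that output concrete, here is a small sketch that turns 68 landmarks into named facial regions. It assumes the network returns a flat list `[x0, y0, x1, y1, ...]` and that the points follow the common 68-point annotation order used by dlib's stock shape predictor; neither detail is confirmed for the custom module described here.

```python
# Index ranges of the common 68-point layout (assumed, see the note above).
LANDMARK_GROUPS = {
    "jaw": range(0, 17),
    "right_eyebrow": range(17, 22),
    "left_eyebrow": range(22, 27),
    "nose": range(27, 36),
    "right_eye": range(36, 42),
    "left_eye": range(42, 48),
    "mouth": range(48, 68),
}


def to_points(flat):
    """Convert a flat [x0, y0, x1, y1, ...] list into 68 (x, y) tuples."""
    if len(flat) != 68 * 2:
        raise ValueError("expected 136 values, got %d" % len(flat))
    return [(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]


def group_landmarks(points):
    """Split the 68 points into named facial regions."""
    return {name: [points[i] for i in indices]
            for name, indices in LANDMARK_GROUPS.items()}


# Demo with placeholder values standing in for a real network output.
points = to_points(list(range(136)))
groups = group_landmarks(points)
```

Grouping the points this way makes it easy to draw or analyze one region, such as the mouth, without touching the rest.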

The application draws these landmarks on the corresponding frame of the live camera stream and moves on to the subsequent frames. Meanwhile, the users see the modified video on the smartphone’s screen.
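
Because the network sees a reduced and cropped image, landmark coordinates have to be mapped back before they can be drawn on the full-size camera frame. The sketch below shows that arithmetic in plain Python; the function name and parameters are hypothetical, and it assumes the frame was first scaled by a single factor and then cropped at a known origin.

```python
def to_frame_coords(points, crop_origin, scale):
    """Map (x, y) points from the cropped, downscaled image back to the original frame.

    crop_origin: (left, top) of the crop inside the downscaled image.
    scale: factor applied when the original frame was reduced (e.g. 0.25).
    """
    origin_x, origin_y = crop_origin
    return [((x + origin_x) / scale, (y + origin_y) / scale) for x, y in points]


# Demo: one landmark at (10, 20) inside a crop that starts at (80, 30)
# in a frame that was reduced to a quarter of its original size.
mapped = to_frame_coords([(10, 20)], crop_origin=(80, 30), scale=0.25)
```

The crop offset is added first because it is expressed in the downscaled image's coordinates; dividing by the scale then restores the original resolution.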

While developing this face tracking Android application, we discovered the huge potential of neural networks. They can be used in various industries and trained to solve different tasks, such as handwriting recognition. At the same time, we can definitely state that applications based on artificial neural networks, and the neural networks themselves, require serious optimization.

We can apply our practical experience of implementing neural networks in mobile apps to develop various applications for your needs. If you are planning to build a mobile application that uses a neural network, contact us to get a free consultation.