Project 05 / Computer Vision

Lens

A small experiment in turning live webcam frames into emotion labels.

Role

Solo build — exploration

Stack

Python · OpenCV · DeepFace · MTCNN

FIG. 01 — Pipeline Each frame moves through four steps. The dominant emotion comes out the other end.

The Idea

Webcams and pretrained models are everywhere, but stitching them together into something that reads emotions in real time is its own small puzzle. Lens is the experiment that ties them together.

OpenCV opens the webcam and reads frames in a loop. MTCNN locates the face inside each frame. DeepFace then runs emotion classification on the cropped region and returns the dominant label — happy, sad, neutral, and so on.

Lens was built mainly as a hands-on way to learn computer vision and the OpenCV + DeepFace stack, with an eye toward future projects like emotion-aware media players or sentiment analysis during user interactions.

The core loop

emotion_detection.py

import cv2
from deepface import DeepFace

def check_emotion(frame):
    # run DeepFace with MTCNN backend on the captured frame
    analysis = DeepFace.analyze(
        frame,
        actions=['emotion'],
        enforce_detection=False,
        detector_backend='mtcnn'
    )
    # DeepFace may return a single dict or a list with one dict per face,
    # depending on the version; in the list case, report the first face
    if isinstance(analysis, list):
        dominant = analysis[0]['dominant_emotion']
    else:
        dominant = analysis['dominant_emotion']
    print(f"Detected emotion: {dominant}")
    return dominant

def main():
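    capture = cv2.VideoCapture(0)  # 0 selects the default webcam
    try:
        while True:
            captured, frame = capture.read()
            if not captured:
                # stop if the camera is unavailable or the stream has ended
                break
            check_emotion(frame)
    finally:
        # release the camera even if the loop is interrupted (e.g. Ctrl+C)
        capture.release()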

if __name__ == "__main__":
    main()

FIG. 02 The whole thing fits on one screen. MTCNN handles tough lighting; DeepFace handles the rest.

What's happening under the hood

Three libraries, one tight loop.

/ 01

Frame capture OpenCV · cv2.VideoCapture

OpenCV opens the default webcam and reads raw frames in a loop. Each frame is a NumPy array passed straight into the analysis function — no intermediate file writes, no extra buffers.
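The capture step can be sketched as a small generator, assuming the loop should stop when a read fails and release the camera afterwards. The name `iter_frames` and the `max_frames` parameter are hypothetical and not part of the project. The function accepts any object with `cv2.VideoCapture`-style `read()` and `release()` methods.

```python
def iter_frames(capture, max_frames=None):
    """Yield frames from a capture object until a read fails.

    `capture` is anything with cv2.VideoCapture-style read() and release()
    methods. `max_frames` is a hypothetical cap, handy for short test runs.
    """
    count = 0
    try:
        while max_frames is None or count < max_frames:
            ok, frame = capture.read()
            if not ok:
                # Camera unavailable or stream ended: stop instead of
                # passing an empty frame on to the analysis step.
                break
            yield frame
            count += 1
    finally:
        # Runs when the generator finishes or is closed.
        capture.release()
```

With OpenCV this might be used as `for frame in iter_frames(cv2.VideoCapture(0)): check_emotion(frame)`, which keeps the success check and the cleanup out of the main loop.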

/ 02

Face localization MTCNN

MTCNN is configured as the detector backend because it holds up better than the default Haar cascades — especially in low light or when faces aren't perfectly centered. More reliable detection means fewer false negatives.

/ 03

Emotion classification DeepFace

DeepFace runs the pretrained emotion model and returns a score for each emotion along with the dominant label. The code reads the dominant emotion and accepts either a single result dictionary or a list of per-face results, so a list response doesn't break the loop. When several people are in frame, only the first detected face is reported.
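To report every face rather than one, the result handling could be sketched as below. It assumes each result is a dictionary with the `dominant_emotion` and `emotion` keys that `DeepFace.analyze` returns for the `emotion` action. The helper name `summarize_faces` is hypothetical.

```python
def summarize_faces(analysis):
    """Return a (label, score) pair for each face in a DeepFace result.

    Accepts either a single result dict or a list of result dicts, so the
    caller does not need to care which shape came back.
    """
    results = analysis if isinstance(analysis, list) else [analysis]
    summary = []
    for face in results:
        label = face["dominant_emotion"]
        # "emotion" maps every emotion name to its score; look up the
        # score that belongs to the dominant label.
        score = face["emotion"][label]
        summary.append((label, score))
    return summary
```

Because the function only touches plain dictionaries, it can be tried with hand-written sample results before wiring it to the webcam loop.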

NOTE

Honest status: the emotion detection loop runs and prints results to the console correctly. The cv2.imshow call that should display the live video frame has a known issue and is currently commented out, so analysis works but visual playback doesn't.

What's next: fix the display issue, then build on top of it with an emotion counter, an emotion-aware music player, or a sentiment overlay during conversations.
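The emotion counter listed as a next step might look something like this. It is only a sketch: the class name `EmotionCounter`, the `window` parameter, and the smoothing over recent frames are assumptions for illustration, not existing project code.

```python
from collections import Counter, deque


class EmotionCounter:
    """Tally emotion labels and report the most common recent one."""

    def __init__(self, window=30):
        self.totals = Counter()             # counts over the whole session
        self.recent = deque(maxlen=window)  # last `window` labels only

    def update(self, label):
        """Record one label, e.g. the value returned by check_emotion."""
        self.totals[label] += 1
        self.recent.append(label)

    def smoothed(self):
        """Most frequent label in the recent window, or None if empty."""
        if not self.recent:
            return None
        return Counter(self.recent).most_common(1)[0][0]
```

Calling `update` with each label from the loop and reading `smoothed()` would damp frame-to-frame flicker, which could matter for something like a music player that should not switch tracks on a single misread frame.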
