Project 05 / Computer Vision

Lens

A small experiment in turning live webcam frames into emotion labels.

Role
Solo build — exploration
Stack
Python, OpenCV, DeepFace, MTCNN
FRAME: cv2.VideoCapture(0) → DETECT: MTCNN (face detected) → ANALYZE: DeepFace.analyze, actions=['emotion'], detector_backend='mtcnn', enforce_detection=False → LABEL: dominant_emotion (e.g. happy)
A loop reads webcam frames → MTCNN locates faces → DeepFace classifies emotion → result printed to console.
FIG. 01 — Pipeline Each frame moves through four steps. The dominant emotion comes out the other end.

Webcams and pretrained models are everywhere, but stitching them together into something that reads emotions in real time is its own small puzzle. Lens is the experiment that ties them together.

OpenCV opens the webcam and reads frames in a loop. MTCNN, configured as DeepFace's detector backend, locates the face inside each frame. DeepFace then runs emotion classification on the detected face region and returns the dominant label: happy, sad, neutral, and so on.

Built mainly as a hands-on way to learn computer vision and the OpenCV + DeepFace stack, with an eye toward future projects like emotion-aware media players or sentiment analysis during user interactions.

The core loop
emotion_detection.py
import cv2
from deepface import DeepFace


def check_emotion(frame):
    # Run DeepFace with the MTCNN detector backend on the captured frame.
    analysis = DeepFace.analyze(
        frame,
        actions=['emotion'],
        enforce_detection=False,
        detector_backend='mtcnn'
    )

    # Handle both return types: a list of per-face results or a single dict.
    if isinstance(analysis, list):
        dominant = analysis[0]['dominant_emotion']
    else:
        dominant = analysis['dominant_emotion']

    print(f"Detected emotion: {dominant}")
    return dominant


def main():
    capture = cv2.VideoCapture(0)
    try:
        while True:
            captured, frame = capture.read()
            if not captured:
                # Stop when the camera returns no frame instead of passing None on.
                break
            check_emotion(frame)
    finally:
        # Release the webcam even if the loop exits through Ctrl+C or an error.
        capture.release()


if __name__ == "__main__":
    main()
FIG. 02 The whole thing fits on one screen. MTCNN handles tough lighting; DeepFace handles the rest.
What's happening under the hood

Three libraries, one tight loop.

/ 01
Frame capture OpenCV · cv2.VideoCapture

OpenCV opens the default webcam and reads raw frames in a loop. Each frame is a NumPy array passed straight into the analysis function — no intermediate file writes, no extra buffers.
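The capture step can also be wrapped in a generator, which keeps the "read until the camera stops delivering frames" logic in one place. This is a minimal sketch, not the code Lens ships: `read_frames` and `FakeCapture` are hypothetical names, and the fake class exists only so the example runs without a webcam. It relies on nothing beyond the `read()` and `release()` methods that `cv2.VideoCapture` provides.

```python
def read_frames(capture):
    """Yield frames from an OpenCV-style capture until a read fails.

    `capture` only needs read() -> (ok, frame) and release(), which is the
    interface cv2.VideoCapture provides.
    """
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            yield frame
    finally:
        # Runs when the generator finishes or is closed early.
        capture.release()


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture, used only for this demo."""

    def __init__(self, frames):
        self._frames = list(frames)
        self.released = False

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None

    def release(self):
        self.released = True


fake = FakeCapture(["frame-1", "frame-2"])
frames = list(read_frames(fake))
```

With a real camera the loop would read `for frame in read_frames(cv2.VideoCapture(0)): check_emotion(frame)`.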

/ 02
Face localization MTCNN

MTCNN is configured as the detector backend because it holds up better than DeepFace's default OpenCV (Haar cascade) detector, especially in low light or when faces aren't perfectly centered. More reliable detection means fewer missed faces.
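Because the loop passes enforce_detection=False, a frame in which the detector finds no face does not raise an error, so a missed face can pass silently as a label. One way to spot those frames is sketched below. It assumes each per-face result carries a `face_confidence` value, which recent DeepFace releases appear to include and set to 0 when no face was found; the helper name `face_was_found` and the 0.5 threshold are illustrative choices, not part of Lens or DeepFace.

```python
def face_was_found(result, min_confidence=0.5):
    """Return True if a DeepFace-style result looks like a real detection.

    Assumption: `result` is one per-face dict from DeepFace.analyze and may
    carry a 'face_confidence' key. Results without the key are treated as
    detections so nothing is dropped silently.
    """
    confidence = result.get("face_confidence")
    if confidence is None:
        return True
    return confidence >= min_confidence


# Hand-written sample results, not real model output.
detected = {"dominant_emotion": "happy", "face_confidence": 0.97}
no_face = {"dominant_emotion": "neutral", "face_confidence": 0.0}
legacy = {"dominant_emotion": "sad"}
```

Skipping the print when `face_was_found` returns False would keep empty frames from being reported as an emotion.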

/ 03
Emotion classification DeepFace

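DeepFace runs the pretrained emotion model and returns, for each detected face, a dictionary that includes per-emotion scores and a dominant_emotion field. The code reads dominant_emotion and accepts both return shapes, a single dictionary or a list with one entry per face, so several people in frame don't break the loop. When the result is a list, only the first face's label is used.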
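The return-type handling can be generalized so every face in the frame gets a label. The sketch below assumes the result layout described above: each face is a dict with a `dominant_emotion` string and an `emotion` dict of scores. The helper names and the sample data are made up for illustration, and the numbers are not real model output.

```python
def dominant_emotions(analysis):
    """Return the dominant emotion for every face in a DeepFace-style result."""
    # Normalise the two return shapes: a single dict becomes a one-item list.
    faces = analysis if isinstance(analysis, list) else [analysis]
    return [face["dominant_emotion"] for face in faces]


def top_emotion(scores):
    """Pick the highest-scoring label from an emotion score dictionary."""
    return max(scores, key=scores.get)


# Hand-written sample shaped like a two-face response, not real model output.
sample = [
    {"dominant_emotion": "happy",
     "emotion": {"happy": 91.2, "neutral": 6.5, "sad": 2.3}},
    {"dominant_emotion": "neutral",
     "emotion": {"happy": 10.0, "neutral": 80.0, "sad": 10.0}},
]
```

Here `dominant_emotions(sample)` yields one label per face, while `dominant_emotions(sample[0])` accepts the single-dict shape unchanged.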

NOTE

Honest status: the emotion detection loop runs and prints results to the console correctly. The cv2.imshow call for displaying the live video frame is a known issue and is currently commented out: analysis works, the visual playback doesn't.

What's next: fix the display issue, then build on top of it with an emotion counter, an emotion-aware music player, or a sentiment overlay during conversations.
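As a rough idea of what the emotion counter could look like, here is a small sketch built on the standard library's `collections.Counter`. `EmotionCounter` is a hypothetical helper, not something that exists in Lens yet; it assumes it is fed the label string that check_emotion returns for each frame.

```python
from collections import Counter


class EmotionCounter:
    """Tally dominant-emotion labels across frames (hypothetical helper)."""

    def __init__(self):
        self.counts = Counter()

    def update(self, label):
        """Record one frame's dominant emotion."""
        self.counts[label] += 1

    def most_common(self):
        """Return the most frequent label so far, or None before any update."""
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]


counter = EmotionCounter()
for label in ["happy", "neutral", "happy", "sad", "happy"]:
    counter.update(label)
```

In the main loop this would become `counter.update(check_emotion(frame))`, with `counter.counts` available for a summary when the loop ends.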
