An open-source, automated lecture recording system
that tracks the presenter in 4K video streams

Aimed at providing a cost-effective solution to enhance learning

Overview

The Center for Innovation in Learning and Teaching (CILT) implements an automated lecture recording system at the University of Cape Town (UCT). The current software solutions are proprietary and use expensive Pan-Tilt-Zoom (PTZ) cameras.

The Problem

CILT is experimenting with cheaper 4K (3840x2160 pixels) wide-angle cameras to record lectures. Since the resultant videos are very large (~2 GB), we were asked to create an open-source automated lecture recording system capable of 1) extracting a reduced-resolution stream from the 4K video input, and 2) ensuring the board content is legible and in the frame at all times. Our solution uses computer vision algorithms to determine a suitable smaller crop region to extract from each 4K frame.

Proposed Solution

We implemented an automated lecture tracking system in C++ using the OpenCV library. The system takes in a video produced by a 4K camera and processes it in three stages. The first stage detects the boards and their usage. The second stage detects and tracks the position of the lecturer. The third stage uses the information from the previous stages to frame the lecturer and the boards in a smaller window, and saves the result to a video file.

Board Detection

We find candidate boards by their edges and save them as rectangles. These rectangles are then evaluated to determine which of them are really boards. The amount of content on a board is used to determine when it was last used. All the boards are then captured in an enclosing frame.
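
The filtering and enclosing steps can be illustrated with a minimal sketch in standard C++. The text does not state how candidates are evaluated, so the area and aspect-ratio thresholds below are assumptions chosen for illustration, and the names Rect, looksLikeBoard, keepBoards and enclosingFrame are hypothetical rather than taken from the project.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical axis-aligned rectangle in pixel coordinates.
struct Rect {
    int x, y, width, height;
};

// Assumed heuristic (not from the source): a board covers a moderate share
// of the frame and is at least as wide as it is tall.
bool looksLikeBoard(const Rect& r, int frameWidth, int frameHeight) {
    if (r.width <= 0 || r.height <= 0) return false;
    const double frameArea = static_cast<double>(frameWidth) * frameHeight;
    const double areaShare = static_cast<double>(r.width) * r.height / frameArea;
    const double aspect = static_cast<double>(r.width) / r.height;
    return areaShare >= 0.02 && areaShare <= 0.5 && aspect >= 1.0 && aspect <= 4.0;
}

// Keep only the candidates that pass the heuristic.
std::vector<Rect> keepBoards(const std::vector<Rect>& candidates,
                             int frameWidth, int frameHeight) {
    std::vector<Rect> boards;
    for (const Rect& r : candidates) {
        if (looksLikeBoard(r, frameWidth, frameHeight)) boards.push_back(r);
    }
    return boards;
}

// Smallest rectangle that contains every board.
// Returns a zero-sized rectangle when there are no boards.
Rect enclosingFrame(const std::vector<Rect>& boards) {
    if (boards.empty()) return {0, 0, 0, 0};
    int left = boards[0].x;
    int top = boards[0].y;
    int right = boards[0].x + boards[0].width;
    int bottom = boards[0].y + boards[0].height;
    for (const Rect& b : boards) {
        left = std::min(left, b.x);
        top = std::min(top, b.y);
        right = std::max(right, b.x + b.width);
        bottom = std::max(bottom, b.y + b.height);
    }
    return {left, top, right - left, bottom - top};
}
```

In a real pipeline the candidate rectangles would come from an edge-detection step on the 4K frame; here they are simply passed in as a list.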

Lecturer Tracking

We detect movement by comparing the differences between frames and storing them in rectangles. The lecturer is then found in one of these rectangles based on the time spent on screen.
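
As a rough illustration of frame differencing, the sketch below compares two grayscale frames and returns the bounding rectangle of the pixels that changed, then picks the motion track that has been on screen longest. It is a simplification under stated assumptions: it produces a single rectangle per frame pair rather than several, the threshold value is up to the caller, and the names MotionRect, Track, motionBounds and pickLecturer are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical rectangle describing a region of motion, in pixels.
struct MotionRect {
    int x, y, width, height;
};

// Bounding rectangle of all pixels whose intensity changed by more than
// `threshold` between two same-sized, row-major grayscale frames.
// Returns a zero-sized rectangle when nothing changed.
MotionRect motionBounds(const std::vector<std::uint8_t>& prev,
                        const std::vector<std::uint8_t>& curr,
                        int width, int height, int threshold) {
    int left = width, top = height, right = -1, bottom = -1;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * width + x;
            const int diff = std::abs(static_cast<int>(curr[i]) -
                                      static_cast<int>(prev[i]));
            if (diff > threshold) {
                left = std::min(left, x);
                top = std::min(top, y);
                right = std::max(right, x);
                bottom = std::max(bottom, y);
            }
        }
    }
    if (right < 0) return {0, 0, 0, 0};
    return {left, top, right - left + 1, bottom - top + 1};
}

// Hypothetical record of a moving region and how many frames it was seen in.
struct Track {
    MotionRect box;
    int framesSeen;
};

// Index of the track seen in the most frames, or -1 if there are none.
// Assumption: the longest-lived moving region is the lecturer.
int pickLecturer(const std::vector<Track>& tracks) {
    int best = -1;
    for (std::size_t i = 0; i < tracks.size(); ++i) {
        if (best < 0 || tracks[i].framesSeen > tracks[best].framesSeen) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

Selecting by time on screen helps reject short-lived motion, such as a student walking past the camera.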

Virtual Cinematographer

We use the lecturer's positions to decide whether (and how) to pan the virtual camera so that both the lecturer and the board being referred to are kept in frame.
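
One common way to make such a panning decision is a dead zone with a speed limit, sketched below for horizontal panning only. This is an assumed approach, not necessarily the one used in the system: the crop window stays still while the lecturer is inside its central region, moves by at most maxStep pixels per frame otherwise, and is clamped to the 4K frame. Keeping the active board in frame is not covered, and the function name nextCropX and all parameters are hypothetical.

```cpp
#include <algorithm>

// Returns the next left edge of the crop window.
//   cropX      current left edge of the crop window, in pixels
//   lecturerX  horizontal centre of the lecturer, in pixels
//   cropWidth  width of the crop window
//   frameWidth width of the full 4K frame
//   margin     distance from each crop edge that triggers a pan
//   maxStep    maximum pan distance per frame, to keep motion smooth
int nextCropX(int cropX, int lecturerX, int cropWidth, int frameWidth,
              int margin, int maxStep) {
    const int leftLimit = cropX + margin;
    const int rightLimit = cropX + cropWidth - margin;

    // Zero while the lecturer is inside the dead zone.
    int shift = 0;
    if (lecturerX < leftLimit) {
        shift = lecturerX - leftLimit;   // negative: pan left
    } else if (lecturerX > rightLimit) {
        shift = lecturerX - rightLimit;  // positive: pan right
    }

    // Limit the pan speed, then keep the window inside the frame.
    shift = std::max(-maxStep, std::min(maxStep, shift));
    return std::max(0, std::min(frameWidth - cropWidth, cropX + shift));
}
```

Calling this once per frame with the latest lecturer position moves the window gradually toward the lecturer instead of jumping, which avoids jittery output when the lecturer makes small movements.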