1. Case Overview
1.1 Background
The goal was to implement a face recognition unlocking function for a real-life escape room business. The requirement itself is simple, but the many constraints meant that the architecture and component selection needed careful thought.
1.2 Deployment Effect

Since the game is still online, specific operation videos will not be released here.
1.3 Player Experience
- After players discover and enter the space, they see their real-time image on the display screen.
- When a player leans in to look closely, the current frame is captured for face recognition, and the real-time image shows the watermark caption "Authenticating".
- If face authentication fails, the watermark caption changes to "Authentication Failed", remains for 2 seconds, then disappears and the system returns to its initial state. Players continue searching for game clues and re-authenticate.
- If face authentication succeeds, the watermark caption changes to "Authentication Successful" and the safe door opens, leading to the subsequent game stages.
2. Product Requirements
2.1 Requirement Description
The requirements were quite clear when proposed, and the core logic is not complicated.
- Face Recognition: Authenticate players through face recognition.
- Lock Management: Open the box door when authentication succeeds; keep it locked otherwise.
- Feedback Prompt: Provide real-time video feedback with clear guidance to optimize the player experience.
2.2 Constraint Description
After all, this is a business, so practicality and cost matter a great deal. The key is to guarantee the player experience without disrupting the game flow.
- Low Cost: Construction and maintenance costs must be low.
- Easy Maintenance: Low technical requirements for maintenance staff; any staff member should be able to restore the system quickly after a hardware or software failure.
- High Reliability: High recognition accuracy, strong fault tolerance, and a low failure rate during continuous operation.
- Limited Space: The entire system, excluding the display, electromagnetic lock, and safe, must fit within a volume of 20cm x 15cm x 15cm.
- Insufficient Lighting: The scene is small, with overhead light but no side light, which results in longer exposure times.
- Universal Power Supply: Only 5V and 12V DC interfaces are available.
- Parallel Processing: Authentication and feedback run in parallel; during authentication, the feedback loop must not be interrupted or blocked, so players do not notice interruptions or freezes.
- Weak Network Environment: Because of the many room partitions and the shared network, internet speed is limited, with occasional delays.
2.3 Function Design
There are several possible architecture solutions (a comparison of the alternatives is given at the end); below is an explanation of the solution that finally went live.
2.3.1 Process Definition
Refer to 1.3 Player Experience for the process and effect.
2.3.2 Configurable Content
- Tencent Cloud Key Pair: Modify the configuration file to switch between Tencent Cloud accounts (test account/official account).
- Personnel Library ID: Modify the configuration file to specify different personnel libraries (test library/official library).
- Watermark Prompt: Replace the corresponding image to change the watermark. Images are used instead of text configuration because the image approach needs no font library support, requires no display-size configuration, and makes it easy to embed graphics. Since it is WYSIWYG, the demands on maintenance staff are low.
- Shutdown Option: Configure whether to shut down automatically after the task completes, preparing for game environment reset and reducing the reset workload.
2.3.3 Operation and Maintenance
- System Operation Management: When the scene is started, power is supplied centrally. After authentication, the system shuts down automatically to complete the reset.
- Fault Handling:
  - Hardware and software failures (cannot start; starts but no display; starts with system exceptions or unknown exceptions; etc.): replace the Raspberry Pi or other hardware.
  - Network failure (runs normally but cannot authenticate): check the network and the cloud logs to resolve the network issue.
  - Cloud product exceptions: none occurred during 4 months of operation, so these can be ignored; if one occurs, contact cloud after-sales support.
2.3.4 Cost Analysis
- Hardware Cost: 500 to 600 yuan.
- Spare Parts Cost: 1:1 spare parts, another 500 to 600 yuan.
- Operating Cost: 0 yuan on the cloud (within the free quota); electricity and network costs are negligible.
3. Technical Implementation
3.1 System Architecture

3.1.1 Hardware Composition

- Raspberry Pi: terminal main controller
- Camera: video input
- Sensor: ultrasonic distance measurement
- Display: video output
- Relay: controls the electromagnetic lock
- Electromagnetic Lock: controls the safe door
3.1.2 Key Features
- Image Recognition: Uses still-image recognition instead of video-stream recognition to reduce network bandwidth requirements.
- Low Recognition Requirements: Even underexposed photos achieve a high recognition rate.
- Trigger Recognition: Players stay in the scene for a long time; the trigger mode avoids high-frequency authentication and false unlocking while also reducing authentication costs.
- Distance Measurement Selection: Ultrasonic sensors are a mature technology at low cost (about 3 yuan); laser sensors cost more (about 30 yuan).
- Multi-Process: Video processing and authentication monitoring run in two separate processes to avoid blocking, using inter-process communication for reliable interaction.
3.2 System Setup
3.2.1 Tencent Cloud Configuration
- Register Account: Follow the documentation to obtain the API key.
- Configure Face Recognition: In the official website console, create a personnel library, create a person, and upload photos to establish the authentication basis. The "Personnel Library ID" is the key piece of information used to specify the personnel library in subsequent recognition API calls; a minimal call sketch follows below.
  Note: Since this case only recognizes one person, there is no need to match a person ID, so no person ID is specified.
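The sketch below uses the tencentcloud-sdk-python package installed in 3.2.2 and follows its standard credential/request calling pattern. The exact API version namespace (v20200303 here), the region, and the parameter names are assumptions to verify against the current Face Recognition documentation, and search_face is an illustrative name rather than the project's actual code.

import base64
import json

from tencentcloud.common import credential
from tencentcloud.iai.v20200303 import iai_client, models

def search_face(image_path, sid, skey, group_id, region="ap-guangzhou"):
    """Send one captured frame to the cloud and return the parsed search result."""
    with open(image_path, "rb") as f:
        img_b64 = base64.b64encode(f.read()).decode()

    cred = credential.Credential(sid, skey)        # key pair from config.json
    client = iai_client.IaiClient(cred, region)

    req = models.SearchFacesRequest()
    req.from_json_string(json.dumps({
        "GroupIds": [group_id],                    # personnel library ID from the console
        "Image": img_b64,                          # base64-encoded frame
        "MaxFaceNum": 1,
    }))
    resp = client.SearchFaces(req)
    return json.loads(resp.to_json_string())       # contains candidates and their scores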
3.2.2 Raspberry Pi Configuration
- Install System: Visit www.raspberrypi.org to download the image and install it. Note that the desktop version must be installed; otherwise HDMI output has to be handled separately.
- Configure Network: Open a terminal, run "sudo raspi-config", select "Network Options", and configure the WiFi access point. To use a fixed IP, edit /etc/dhcpcd.conf and add configuration along the following lines.
# Please refer to your local network planning for specific content
interface wlan0
static ip_address=192.168.0.xx/24
static routers=192.168.0.1
static domain_name_servers=192.168.0.1 192.168.0.2
- Install Tencent Cloud SDK: Refer to the guide document to install the dependency library for calling the Tencent Cloud API.
sudo apt-get install python-pip -y
pip install tencentcloud-sdk-python
- Install Image Processing Library: The system ships with Python 2.7 by default; the OpenCV library still needs to be installed. (The packages are large and the default source is hosted abroad, so downloads are slow; search for how to switch the Raspberry Pi to a domestic mirror and pick a nearby source.)
sudo apt-get install libopencv-dev -y
sudo apt-get install python-opencv -y
- Deploy Code: Get the source code from GitHub and copy the contents of the src folder to /home/pi/faceid. In /home/pi/faceid/config.json, fill in your cloud API key (sid/skey) and personnel library ID (facegroupid), and adjust the other options as needed; a sample layout is sketched after this list.
- Configure Autostart: The graphical interface must auto-start so that video output reaches the display via HDMI. Edit /home/pi/.config/autostart/faceid.desktop and write the following content.
[Desktop Entry]
Type=Application
Name=faceid
Exec=python /home/pi/faceid/main.py
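For reference, a hypothetical sketch of what /home/pi/faceid/config.json could look like, based only on the field names mentioned in this article (sid, skey, facegroupid, and the su2halt shutdown switch described in 3.3.1). The values shown are placeholders; the authoritative key names and formats are defined by the project's source code.

{
    "sid": "AKIDxxxxxxxxxxxxxxxxxxxx",
    "skey": "xxxxxxxxxxxxxxxxxxxxxxxx",
    "facegroupid": "your_personnel_library_id",
    "su2halt": true
}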
3.2.3 Hardware Wiring
Raspberry Pi GPIO Diagram

Camera
- CSI Interface

Ultrasonic Sensor
- TrigPin: BCM-24 / GPIO24
- EchoPin: BCM-23 / GPIO23
- VCC: Connect to 5V
- GND: Connect to GND

Relay
The 4-pin side connects to Raspberry Pi GPIO pins.
- VCC: Connect to 5V
- GND/RGND: Connect to GND
- CH1: BCM-12 / GPIO12
The 3-port side connects to the electromagnetic lock.
- In the initial state, the electromagnetic lock is wired to the normally closed terminal.
- For the relay principle, refer to section 3.3.4 Hardware Related.
3.2.4 Test Run
After completing the above work, power on the system, check the display screen for local feedback, and view system logs for cloud recognition results.
3.3 Code Logic and Involved Technologies
3.3.1 Process Pseudocode
# Monitoring and authentication process - main process
Get the application configuration (API ID/Key, etc.)
Initialize the GPIO pins (prepare to control the sensor and relay)
Start the video management process (auxiliary process)
Start loop:
    if the measured distance does not meet the trigger criterion:
        continue
    Communicate with the auxiliary process (capture the current frame, save it to the specified path, and add the "Authenticating" watermark)
    Call the cloud API to run face recognition on the frame image
    if recognition succeeds:
        Communicate with the auxiliary process (change the watermark to "Authentication Successful")
        Wait 5 seconds
        Shut down or keep running (as specified by the su2halt field in config.json)
    else:
        Communicate with the auxiliary process (change the watermark to "Authentication Failed")
        Wait 2 seconds
        Communicate with the auxiliary process (clear the watermark)

# Video management process - auxiliary process
Initialize the camera
Start loop:
    Capture a frame
    Check the inter-process shared queue
    Perform different operations based on the message (save the frame image / add a watermark / do nothing)
    Output the frame
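To make the two-process structure concrete, here is a minimal, illustrative sketch (not the project's actual source) using Python's multiprocessing module: the main process pushes message constants onto a shared Queue, and the video process drains the queue between frames. The names (video_worker, MSG_CAPTURE, and so on) are assumptions for illustration only.

import multiprocessing as mp
import time

# Illustrative message constants for the shared queue
MSG_CAPTURE = "capture"    # save the current frame and show "Authenticating"
MSG_SUCCESS = "success"    # show "Authentication Successful"
MSG_FAIL = "fail"          # show "Authentication Failed"
MSG_CLEAR = "clear"        # remove the watermark

def video_worker(queue):
    """Auxiliary process: grab frames, apply the requested watermark, output them."""
    state = None
    while True:
        # frame = camera.read()           # camera handling omitted in this sketch
        while not queue.empty():
            state = queue.get_nowait()    # keep only the most recent command
        # save the frame / apply the watermark / do nothing, according to `state`
        time.sleep(1.0 / 30)              # stand-in for the per-frame work

def main():
    queue = mp.Queue()
    mp.Process(target=video_worker, args=(queue,), daemon=True).start()
    while True:
        # 1. wait for the distance trigger (see 3.3.2)
        # 2. queue.put(MSG_CAPTURE), then call the cloud API on the saved frame
        # 3. queue.put(MSG_SUCCESS), or queue.put(MSG_FAIL) followed by queue.put(MSG_CLEAR)
        time.sleep(0.1)

if __name__ == "__main__":
    main()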
3.3.2 Video and Recognition
- Real-time Video: As shown in the pseudocode above, real-time video is produced by processing frames one by one and outputting them continuously.
- Trigger Recognition: The distance sensor confirms that an object is close; if the distance changes by less than 2cm within 0.3 seconds, the state is confirmed as waiting for authentication. After a further 0.3-second delay, the image frame is captured. The extra delay exists because an object that has just stopped may still twist or make fine adjustments; capturing immediately could produce a blurred frame because of the insufficient lighting mentioned in the constraints above, so the additional delay ensures a stable capture. A sketch of this trigger check follows after this list.
- Face Recognition: Please refer to the documentation introduction.
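A minimal, illustrative sketch of that trigger check (not the project's actual source): measure_distance is assumed to return the current distance in centimetres (section 3.3.4 sketches how such a reading can be taken), and the 40cm proximity threshold is an assumption; the 2cm/0.3s values come from the description above.

import time

TRIGGER_CM = 40      # assumed "object is close" threshold
STABLE_CM = 2        # maximum allowed change over the window (from 3.3.2)
WINDOW_S = 0.3       # stability window (from 3.3.2)
SETTLE_S = 0.3       # extra settling delay before capturing (from 3.3.2)

def wait_for_trigger(measure_distance):
    """Block until an object is close and has stayed steady for WINDOW_S seconds."""
    while True:
        d1 = measure_distance()
        if d1 > TRIGGER_CM:
            continue                 # nothing close enough yet
        time.sleep(WINDOW_S)
        d2 = measure_distance()
        if abs(d1 - d2) < STABLE_CM:
            time.sleep(SETTLE_S)     # let small movements settle before capture
            return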
3.3.3 Image Watermark
- Watermark Principle: OpenCV provides a range of image processing functions, such as image-and-text operations (drawing text on an image) and image-and-image operations (addition, subtraction, multiplication, division, and bitwise operations between images). Combining them yields different effects, such as adding text to a background image, overlaying one image on another, and masking. This case uses a masking approach based on bitwise operations.
- Watermark Image: To simplify maintenance and updates, this case uses images as the watermark source. This avoids font library constraints, increases flexibility, makes it easy to include graphics in the watermark, defines the watermark size directly by resolution, and is WYSIWYG. The default watermark images have a white background with black text.
- Watermark Processing Logic: To highlight the floating effect of the watermark, the black areas of the watermark image are made transparent and the result is overlaid onto the original image. Because the lettering is transparent, its apparent color changes with the underlying video, which makes the effect more pronounced.
Source code explanation
import cv2

# img1 is the current video frame (background); img2 is the watermark image that was read in
def addpic(img1, img2):
    # Region of interest (ROI) - the part of the background that the watermark will cover
    rows, cols = img2.shape[:2]
    roi = img1[:rows, :cols]
    # Grayscale - tolerate watermark images that are not pure black and white
    img2gray = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
    # Generate the mask - light pixels become white, the dark lettering becomes black;
    # inverting it leaves the lettering as the only non-zero region
    ret, mask = cv2.threshold(img2gray, 220, 255, cv2.THRESH_BINARY)
    mask_inv = cv2.bitwise_not(mask)
    # Build the watermark area - keep the background video only where the lettering is,
    # then add the white watermark image so the letters show the video through them
    img1_bg = cv2.bitwise_and(roi, roi, mask=mask_inv)
    dst = cv2.add(img1_bg, img2)
    img1[:rows, :cols] = dst
    return img1
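A hypothetical usage example (file names are illustrative, not the project's actual layout): the watermark image is read once and applied to each frame before output.

frame = cv2.imread("frame.jpg")            # in the real flow this is a camera frame
mark = cv2.imread("authenticating.png")    # white background, black text
marked = addpic(frame, mark)
cv2.imwrite("frame_marked.jpg", marked)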
Watermark effect illustration (the illustration enlarges the watermark area to highlight the effect; in the actual application scheme, the watermark area is smaller).

3.3.4 Hardware Related
- Ultrasonic Distance Measurement: The ultrasonic sensor has 4 pins (VCC, Trig, Echo, GND). Driving the Trig pin high for more than 10μs triggers the emission of an ultrasonic pulse; once the reflected wave is received, the Echo pin outputs a high level whose duration equals the time from emission to reception.
  That is: distance (meters) = Echo high-level duration (seconds) × 340 m/s ÷ 2
- Relay: The 5V relay module used here has wiring on both sides: one side for power and signal (4 pins, compatible with 3.3V signals), the other side for opening and closing the circuit (3 ports). On the circuit side the relay acts as a single-pole double-throw switch; the position of the pole is controlled by the high or low level on the CH1 pin of the power and signal side. During installation, the electromagnetic lock is powered through the normally closed terminal by default; when a signal is sent to the relay, it switches to the normally open terminal, the electromagnetic lock loses power, and the door unlocks.
- GPIO: GPIO (general-purpose input/output) exposes hardware connections as pins. The Raspberry Pi 3B+ has 40 GPIO pins (see the diagram referenced in 3.2.3 Hardware Wiring), and on the official Raspbian operating system they can be driven from Python with the pre-installed RPi.GPIO library. A combined measurement-and-unlock sketch follows below.
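A minimal, illustrative RPi.GPIO sketch (not the project's actual source) of the measurement and unlock steps described above, using the BCM pin numbers wired in 3.2.3. Whether the relay channel is switched by a HIGH or LOW level depends on the specific module, so that part is an assumption to verify.

import time

import RPi.GPIO as GPIO

TRIG, ECHO, RELAY = 24, 23, 12        # BCM numbering, as wired in 3.2.3

GPIO.setmode(GPIO.BCM)
GPIO.setup(TRIG, GPIO.OUT, initial=GPIO.LOW)
GPIO.setup(ECHO, GPIO.IN)
GPIO.setup(RELAY, GPIO.OUT, initial=GPIO.LOW)

def measure_distance():
    """Return the measured distance in centimetres."""
    GPIO.output(TRIG, GPIO.HIGH)      # >10us pulse starts one measurement
    time.sleep(0.000015)
    GPIO.output(TRIG, GPIO.LOW)
    while GPIO.input(ECHO) == 0:      # wait for the echo pulse to begin
        pass
    start = time.time()
    while GPIO.input(ECHO) == 1:      # echo stays high for the round-trip time
        pass
    elapsed = time.time() - start
    return elapsed * 34000 / 2        # 340 m/s = 34000 cm/s, halved for one way

def unlock():
    """Switch the relay so the electromagnetic lock loses power and opens."""
    GPIO.output(RELAY, GPIO.HIGH)     # assumes a HIGH-triggered relay channel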
4. Others
4.1 Solution Selection Comparison
The core of the design is the face authentication module, which directly affects cost and stability; the solution above was ultimately chosen as the best balance of cost, maintainability, and reliability. Several alternative face recognition solutions were also considered:
4.1.1 Local Recognition Solution A
Uses the ESP-EYE chip; everything runs on the chip, relying on ESP-IDF and ESP-WHO, developed in C.
Low hardware cost (module cost 189 yuan x 2), high development and maintenance costs (C development).
Problem: configurations are hard to update and faults are hard to analyze. Suitable for large-scale deployment scenarios.
4.1.2 Local Recognition Solution B
Uses the Raspberry Pi for local face recognition directly; a mature solution with abundant open-source code.
Medium hardware cost, low development cost, high maintenance cost.
Problem: high load on the Raspberry Pi; even with a frame-skipping algorithm it stays below 20fps, with noticeable stuttering. Further optimization is limited by personal experience, so long-term stable operation would be hard to maintain.
4.1.3 Local Recognition Solution C
Uses the BM1880 edge computing development board or another image processing board, with a good community reputation and framework support.
Problem: high hardware cost (module cost 1000 yuan x 2), high development and maintenance costs (C development). Using a compute stick instead requires an x86_64 host platform, which limits the cost reduction and leaves the complexity unchanged. Suitable for scenarios that need expansion capability.
4.1.4 Cloud Recognition Solution A
Uses Tencent Cloud's intelligent video analysis product to simplify the terminal architecture: a Raspberry Pi Zero streams video to the cloud (the implementation will be published later) and retrieves the recognition results, while also supporting high-frequency, repeated retrieval.
Low deployment cost (terminal video modules about 150 yuan), low operating cost (currently 0.28 yuan/minute; at 20 minutes per run in this scenario, a single game costs 5.6 yuan).
Problem: highly dependent on network stability; disconnections and similar issues affect the experience. Under the network constraints of this case it degrades the result, so it is better suited to scenarios with good network conditions and high-frequency retrieval.