Babymonitor #1

Hampus Londögård

Hi 👋

I’m building a babymonitor. It’s not gonna be anything novel, neither the first nor the last in history. But it’s a relevant project to me, and it makes me happy! 🤓

In this blog I'll walk through different ways to stream video from the Raspberry Pi over the network and capture it in a client browser.

Background

It all started when I talked to an old friend and he said that his in-laws gifted them the "Rolls-Royce" of babymonitors. The monitor has:

  • Bidirectional audio.
  • Unidirectional video.
    1. Night Vision.
    2. Rotation Horizontally.
    3. Zoom.
  • Temperature
  • Play soothing bedtime songs.

Incredibly cool!  

This led to a simple equation:
awesome + expensive + programmer = Do It Yourself (DIY)

Obviously I need the same specifications, or a little better, while being "cheaper" (my hours are "free" 🤓).

The greatest part? I have a natural deadline which forces me to finish the project!

Goals

KISS - Keep It Simple Stupid

I've collected the following equipment:

  • Raspberry Pi 3B (had one already)
  • 5 MP camera with mechanical IR filter
  • 2 servos (rotating camera)
  • Temperature / Humidity Sensor
  • Microphone Array
  • Speaker

My bottleneck is really the Raspberry Pi's performance. And with performance constraints come optimisations, which I love! It does make following the KISS principle a tad harder, though! 😅

I have narrowed the video streaming down to three candidate languages: golang, Rust or Python.
My initial idea is that the simpler parts, like reading the temperature and moving the servos, will be a FastAPI (Python) server. Python really is the lingua franca on the Raspberry Pi and the support is amazing.

From my initial experimentation I found that Python requires a shit ton of CPU power to livestream video, so I believe Rust or golang will be the way to go. 🚀

Live Streaming: Initial Experimentation

I've tried multiple things: HTTP, HLS, Websocket & WebRTC. Each step proved a more complex, albeit more optimal, solution, each with its trade-offs.

Another solution worth mentioning is the Real Time Streaming Protocol (RTSP).

Protocols / Variants

Here I describe each protocol and how it's implemented, in a very general way.

The Hypertext Transfer Protocol (HTTP), the lingua franca protocol of the internet, is one way to stream both video and audio. It's easy but not efficient.

How: Stream chunks using HTTP messages and let your <video> element handle the consumption of the stream.
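
To make the "stream chunks" idea concrete, here is a minimal sketch of HTTP/1.1 chunked transfer encoding, one possible way to send a body whose total length isn't known up front. Whether the experiments here used exactly this mechanism is an assumption; the helper names iter_chunks, encode_chunk and chunked_body are made up for illustration, and the fake byte string stands in for real video data.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunks(source: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield successive pieces of `source` until it is exhausted."""
    while True:
        data = source.read(chunk_size)
        if not data:
            break
        yield data


def encode_chunk(data: bytes) -> bytes:
    """Frame one piece as an HTTP/1.1 chunk: hex length, CRLF, payload, CRLF."""
    return f"{len(data):X}".encode("ascii") + b"\r\n" + data + b"\r\n"


# A zero-length chunk tells the client that the body is complete.
LAST_CHUNK = b"0\r\n\r\n"


def chunked_body(source: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield the full chunked-encoded body for `source`, terminator included."""
    for data in iter_chunks(source, chunk_size):
        yield encode_chunk(data)
    yield LAST_CHUNK


if __name__ == "__main__":
    fake_video = io.BytesIO(b"x" * 10)  # placeholder for real video bytes
    for piece in chunked_body(fake_video, chunk_size=4):
        print(piece)
```

A server would send these pieces after a response header of Transfer-Encoding: chunked. For a live camera the source never runs dry, so the terminating chunk is only sent when the stream is shut down.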

Each protocol comes with positives and negatives.

| | HTTP | HLS | Websocket | WebRTC |
|---|---|---|---|---|
| Pros | + Easy to implement. | + Simple protocol. + Easy to implement. + CPU efficient. | + Easy to do "live" streams. + Low latency. + CPU efficient. | + Supports all my use-cases. + Low latency. + CPU efficient. |
| Cons | - CPU inefficient (HTTP header overhead). | - High latency (5-10s+). | - Hard to consume on client. - Bi-directional streaming is also hard. | - Not straightforward implementation. - Less documentation than HLS/HTTP. |

Implementations

The MJPEG server example provided with picamera2 is an excellent showcase of how to stream the video. It sets up a simple HTML page with an <img> element which streams new frames using MJPEG (Motion JPEG).

The performance is pretty OK, considering it's Python & MJPEG, although H.264 is much more efficient.
We see the CPU hovering around 130-150%, but the largest drawback is the network bandwidth: ~50Mb/s, compared to ~3.5Mb/s for H.264.
This is because MJPEG sends the full frame each time, whereas H.264 sends a full frame followed by delta frames (which only encode what changed) until it sends a full frame again. This is a trade-off: the bandwidth is low, but quality can suffer.
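
To put the two bitrates in perspective, here is a small back-of-the-envelope calculation. It assumes "Mb/s" means megabits per second and uses decimal units (1 GB = 10^9 bytes); the function name gigabytes_per_hour is just for illustration.

```python
def gigabytes_per_hour(megabits_per_second: float) -> float:
    """Convert a bitrate in Mb/s into a data volume in GB per hour."""
    bits_per_hour = megabits_per_second * 1_000_000 * 3600
    return bits_per_hour / 8 / 1_000_000_000


MJPEG_MBPS = 50.0  # approximate MJPEG figure quoted above
H264_MBPS = 3.5    # approximate H.264 figure quoted above

if __name__ == "__main__":
    print(f"MJPEG: {gigabytes_per_hour(MJPEG_MBPS):.1f} GB/hour")
    print(f"H.264: {gigabytes_per_hour(H264_MBPS):.1f} GB/hour")
    print(f"Ratio: {MJPEG_MBPS / H264_MBPS:.1f}x")
```

Under these assumptions MJPEG moves about 22.5 GB per hour against roughly 1.6 GB for H.264, which is around 14 times more traffic for a monitor that is meant to run all night.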

Code
#!/usr/bin/python3

# Mostly copied from https://picamera.readthedocs.io/en/release-1.13/recipes2.html
# Run this script, then point a web browser at http://<this-ip-address>:8000
# Note: needs simplejpeg to be installed (pip3 install simplejpeg).

import io
import logging
import socketserver
from http import server
from threading import Condition, Thread

from picamera2 import Picamera2
from picamera2.encoders import JpegEncoder
from picamera2.outputs import FileOutput

PAGE = """\
<html>
<head>
<title>picamera2 MJPEG streaming demo</title>
</head>
<body>
<h1>Picamera2 MJPEG Streaming Demo</h1>
<img src="stream.mjpg" width="640" height="480" />
</body>
</html>
"""


class StreamingOutput(io.BufferedIOBase):
    """Holds the latest encoded frame and wakes up waiting clients."""

    def __init__(self):
        self.frame = None
        self.condition = Condition()

    def write(self, buf):
        # Called by the encoder with one complete JPEG per call.
        with self.condition:
            self.frame = buf
            self.condition.notify_all()


class StreamingHandler(server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/':
            # Redirect the root to the demo page.
            self.send_response(301)
            self.send_header('Location', '/index.html')
            self.end_headers()
        elif self.path == '/index.html':
            content = PAGE.encode('utf-8')
            self.send_response(200)
            self.send_header('Content-Type', 'text/html')
            self.send_header('Content-Length', len(content))
            self.end_headers()
            self.wfile.write(content)
        elif self.path == '/stream.mjpg':
            self.send_response(200)
            self.send_header('Age', 0)
            self.send_header('Cache-Control', 'no-cache, private')
            self.send_header('Pragma', 'no-cache')
            self.send_header('Content-Type', 'multipart/x-mixed-replace; boundary=FRAME')
            self.end_headers()
            try:
                while True:
                    # Block until the encoder has written a new frame.
                    with output.condition:
                        output.condition.wait()
                        frame = output.frame
                    # Send the frame as one part of the multipart response.
                    self.wfile.write(b'--FRAME\r\n')
                    self.send_header('Content-Type', 'image/jpeg')
                    self.send_header('Content-Length', len(frame))
                    self.end_headers()
                    self.wfile.write(frame)
                    self.wfile.write(b'\r\n')
            except Exception as e:
                # Usually means the browser closed the connection.
                logging.warning(
                    'Removed streaming client %s: %s',
                    self.client_address, str(e))
        else:
            self.send_error(404)
            self.end_headers()


class StreamingServer(socketserver.ThreadingMixIn, server.HTTPServer):
    # One thread per client, so several browsers can watch at once.
    allow_reuse_address = True
    daemon_threads = True


picam2 = Picamera2()
picam2.configure(picam2.create_video_configuration(main={"size": (640, 480)}))
output = StreamingOutput()
picam2.start_recording(JpegEncoder(), FileOutput(output))

try:
    address = ('', 8000)
    server = StreamingServer(address, StreamingHandler)
    server.serve_forever()
finally:
    picam2.stop_recording()
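
The wire format that the /stream.mjpg handler produces is easier to see in isolation. The sketch below builds a single part of the multipart/x-mixed-replace body as one byte string. Note that mjpeg_part is a hypothetical helper written for illustration, not part of picamera2 or the script above, and the fake JPEG bytes are placeholders.

```python
BOUNDARY = b"FRAME"  # must match the boundary announced in the Content-Type header


def mjpeg_part(jpeg: bytes) -> bytes:
    """Wrap one JPEG image as a single part of a multipart/x-mixed-replace body."""
    return b"".join([
        b"--" + BOUNDARY + b"\r\n",                    # part delimiter
        b"Content-Type: image/jpeg\r\n",               # per-part headers
        b"Content-Length: " + str(len(jpeg)).encode("ascii") + b"\r\n",
        b"\r\n",                                       # blank line ends the headers
        jpeg,                                          # the image itself
        b"\r\n",                                       # separates this part from the next
    ])


if __name__ == "__main__":
    fake_jpeg = b"\xff\xd8 not a real image \xff\xd9"  # placeholder bytes
    print(mjpeg_part(fake_jpeg))
```

The browser replaces the image in the <img> element every time a new part arrives, which is why each frame has to be a complete JPEG and the bandwidth grows with every frame sent.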

Performance

Stats taken from top.

| Hardware | HTTP - MJPEG | HLS | Websocket no connection | Websocket | WebRTC | WebRTC (aiortc) |
|---|---|---|---|---|---|---|
| CPU | 150% | 40% | <=0.2% | 40% | 170% | 250-350% |
| RAM | 6% | 4% | <=0.4% | 6% | 5% | |
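
Values above 100% can look odd at first. Assuming top is running in its default mode, where %CPU is relative to a single core, and given that the Raspberry Pi 3B has four cores, the ceiling is 400%. The sketch below converts a top reading into a share of the whole CPU; share_of_whole_cpu is an illustrative helper name.

```python
PI_3B_CORES = 4  # the Raspberry Pi 3B has a quad-core CPU


def share_of_whole_cpu(top_percent: float, cores: int = PI_3B_CORES) -> float:
    """Convert a per-core percentage from top into a share of all cores."""
    return top_percent / cores


if __name__ == "__main__":
    readings = [("HTTP - MJPEG", 150), ("WebRTC (aiortc), worst case", 350)]
    for label, top_percent in readings:
        print(f"{label}: {share_of_whole_cpu(top_percent):.1f}% of the whole CPU")
```

Read this way, the MJPEG server uses a bit over a third of the Pi, while the aiortc variant at its worst gets close to saturating all four cores.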

Ending Notes

This is what I have currently. In the next blog I'll go through how we'll set up a backend which will allow us to use the sensors, move the servos and stream audio/video.

I think the bidirectional communication will require a third blog, and then manufacturing a 3D-printed case as a fourth!

Until next time,
Hampus Londögård