BOP Hearing Recorder

The Problem

Pennsylvania Board of Pardons hearings are public and livestreamed, but staff and legal volunteers at Do Moore Good needed a reliable recording of each one for case review and documentation. The previous approach involved people logging into the PAcast stream from their own computers and using screen recording software.

That created two problems. First, multiple people joining the same limited-capacity stream put load on the broadcast, causing buffering and quality issues for everyone watching. Second, screen recording is fragile: it requires a personal computer to stay on, unlocked, and connected for the full 4 to 6 hour hearing. A Windows update, a dead battery, or a dropped wifi connection ends the recording silently, with no warning and no backup.

Multiple viewers on a capacity-limited stream caused buffering and quality degradation
Screen recordings fail silently: no alert if someone's laptop dies or disconnects mid-hearing
Someone had to be physically present at a computer for the full duration of every hearing
Output quality was inconsistent: mouse cursors, desktop notifications, and variable resolution
No centralized archive: recordings lived on personal hard drives with no consistent naming or storage

Before and After

Before

Screen recording on personal laptops

Volunteer joins the PAcast stream as a viewer, adding to broadcast load
Screen recording software captures whatever is visible on-screen
Computer must stay on and connected for the full 4 to 6 hour hearing
Any interruption (update, sleep, disconnect) silently ends the recording
No notification if something goes wrong during the hearing
Files stored locally with no consistent archive

After

Server-side HLS capture, automated

Server pulls raw video data directly from the stream source, not as a viewer
Recording starts automatically at the scheduled time, no one needs to be present
Two parallel processes run simultaneously: if one fails, the other keeps going
Email alerts on start, every hour, and immediately if anything goes wrong
Files upload directly to a shared Google Drive folder after the hearing ends
Retries automatically for 30 minutes if the stream is late going live

How It Works

The recording process is fundamentally different from screen recording. Instead of capturing pixels off a display, the server pulls the raw HLS video segments directly from the broadcast source. This is the same data a browser receives when watching the stream, but saved straight to disk rather than rendered on screen.

Direct HLS Capture

yt-dlp extracts the raw HLS manifest URL from the PAcast broadcast page, then ffmpeg records the video and audio streams directly to a fragmented MP4 file. No browser, no display, no screen involved. The server is not a viewer on the stream.
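
A minimal sketch of the capture step, assuming the yt-dlp and ffmpeg command-line tools are installed and on the PATH. The function names, the stream-copy setting, and the exact ffmpeg flags are illustrative assumptions, not the project's actual code.

```python
import subprocess


def resolve_stream_url(page_url: str) -> str:
    """Ask the yt-dlp CLI for the direct media URL behind a broadcast page."""
    result = subprocess.run(
        ["yt-dlp", "-g", page_url],  # -g prints the media URL instead of downloading
        capture_output=True,
        text=True,
        check=True,
    )
    urls = [line for line in result.stdout.splitlines() if line.strip()]
    if not urls:
        raise RuntimeError("yt-dlp returned no stream URL")
    return urls[0]


def build_ffmpeg_command(stream_url: str, output_path: str) -> list[str]:
    """Build an ffmpeg command that copies the stream into a fragmented MP4."""
    return [
        "ffmpeg",
        "-nostdin",              # never wait for keyboard input on a server
        "-i", stream_url,        # read the HLS playlist directly
        "-c", "copy",            # assumption: copy streams without re-encoding
        "-movflags", "+frag_keyframe+empty_moov",  # fragmented MP4 output
        "-f", "mp4",
        output_path,
    ]
```

The command list can be passed to subprocess.Popen once the stream URL has been resolved.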

Redundant Recording

Two independent ffmpeg processes run simultaneously against the same stream URL. Each writes to its own file. If one process crashes mid-hearing, the other continues uninterrupted. Both files are available for download after the hearing ends.
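
The redundancy idea can be sketched as two child processes that share nothing but the input URL. The helper names here are hypothetical, and each command is expected to name its own output file.

```python
import subprocess


def start_redundant(commands):
    """Start one child process per command; each command writes its own file."""
    return [subprocess.Popen(cmd) for cmd in commands]


def survivors(procs):
    """Return the processes that are still running."""
    # poll() returns None while a process is alive, and its exit code afterwards.
    return [p for p in procs if p.poll() is None]
```

Because the processes are independent, a crash in one leaves the other's exit status and output file untouched. A monitoring loop only needs to confirm that survivors() is non-empty.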

Scheduled and Self-Starting

Hearings are scheduled via the web UI. A cron job checks every minute for recordings due to start. If the stream is not yet live at the scheduled time, the system retries automatically every two minutes for up to 30 minutes before marking the job as failed and sending an alert.
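
The retry window can be sketched as one initial check followed by up to 15 retries at two-minute intervals, which adds up to the 30 minutes described above. The function name and the injectable is_live and sleep callables are assumptions made so the logic can be exercised without a real stream.

```python
import time
from typing import Callable

RETRY_INTERVAL_SECONDS = 120  # every two minutes
MAX_RETRIES = 15              # 15 retries x 2 minutes = 30 minutes


def wait_for_stream(
    is_live: Callable[[], bool],
    sleep: Callable[[float], None] = time.sleep,
    interval: float = RETRY_INTERVAL_SECONDS,
    max_retries: int = MAX_RETRIES,
) -> bool:
    """Return True once the stream is live, or False after all retries fail."""
    if is_live():
        return True
    for _ in range(max_retries):
        sleep(interval)
        if is_live():
            return True
    return False  # caller marks the job as failed and sends the alert
```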

Playable Mid-Recording

Files are saved as fragmented MP4, which writes its header at the start and then appends self-contained fragments, rather than writing a single index at the end. This means the recording is a valid, playable file at any moment during capture. A standard MP4, by contrast, writes its index only when the recording finishes, so it becomes unplayable if the process is interrupted.
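
One way to see the difference is to list the top-level boxes of an MP4 file: in the MP4 container the header lives in a box named moov and the media data in boxes named mdat. The sketch below is an illustrative checker, not part of the project, and its function names are hypothetical.

```python
import io
import struct


def top_level_boxes(f) -> list[str]:
    """Return the four-character types of the top-level MP4 boxes, in order."""
    types = []
    while True:
        header = f.read(8)
        if len(header) < 8:
            break
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:  # a 64-bit size follows the type field
            ext = f.read(8)
            if len(ext) < 8:
                break
            (size,) = struct.unpack(">Q", ext)
            header_len = 16
        if size != 0 and size < header_len:
            break  # corrupt header, stop scanning
        types.append(box_type.decode("latin-1"))
        if size == 0:  # box runs to the end of the file
            break
        f.seek(size - header_len, io.SEEK_CUR)  # skip the payload
    return types


def index_is_up_front(f) -> bool:
    """True if a moov box exists and appears before any mdat box."""
    boxes = top_level_boxes(f)
    if "moov" not in boxes:
        return False
    return "mdat" not in boxes or boxes.index("moov") < boxes.index("mdat")
```

An interrupted standard MP4 typically has media data but no moov box at all, which is why players cannot open it.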

Google Drive Archive

After the hearing ends, files upload to a shared Google Drive folder via service account. The upload supports both personal Drive and Google Workspace Shared Drives. After a successful upload, local files can be deleted to free server disk space.
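
A rough sketch of the upload step, assuming the google-api-python-client and google-auth packages are installed. The function names, the key file path, and the folder ID are placeholders, and the broad Drive scope is an assumption rather than the project's confirmed configuration.

```python
import os


def build_file_metadata(filename: str, folder_id: str) -> dict:
    """Drive metadata that places the file inside the shared folder."""
    return {"name": filename, "parents": [folder_id]}


def upload_recording(local_path: str, folder_id: str, key_path: str) -> str:
    """Upload one recording with a service account and return its Drive file ID."""
    # Imported here so the module still loads where the Google libraries are absent.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build
    from googleapiclient.http import MediaFileUpload

    creds = service_account.Credentials.from_service_account_file(
        key_path,
        scopes=["https://www.googleapis.com/auth/drive"],  # assumed scope
    )
    service = build("drive", "v3", credentials=creds)
    # resumable=True sends the file in chunks, which suits multi-gigabyte videos.
    media = MediaFileUpload(local_path, mimetype="video/mp4", resumable=True)
    created = service.files().create(
        body=build_file_metadata(os.path.basename(local_path), folder_id),
        media_body=media,
        fields="id",
        supportsAllDrives=True,  # needed when the folder is on a Shared Drive
    ).execute()
    return created["id"]
```

Deleting the local file only after upload_recording returns an ID keeps the local copy as the fallback if the upload fails.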

Whisper Transcription

Recordings can be transcribed using faster-whisper running locally on the server (large-v3 model). This produces a searchable text transcript of the full hearing without sending audio to any third-party service.
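
A minimal sketch of local transcription with faster-whisper, assuming the package is installed. The CPU device, the int8 compute type, and the timestamped line format are illustrative choices, not the project's confirmed settings.

```python
def format_timestamp(seconds: float) -> str:
    """Render a position in the recording as HH:MM:SS."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def transcribe(path: str, model_size: str = "large-v3") -> list[str]:
    """Return one timestamped transcript line per recognized segment."""
    # Imported here so the helper above works without faster-whisper installed.
    from faster_whisper import WhisperModel

    # Assumption: CPU inference with int8 quantization to limit memory use.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(path)
    # segments is a generator; transcription runs as it is consumed.
    return [f"[{format_timestamp(seg.start)}] {seg.text.strip()}" for seg in segments]
```

Timestamped lines make it possible to jump from a search hit in the transcript to the matching point in the video.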

System Architecture

PAcast HLS: Public livestream
yt-dlp: Extracts stream URL
Django App: Scheduler + controls
ffmpeg: x2 parallel processes
MP4 Files: Local server storage
Google Drive: Shared archive
Email Alerts: Start / hourly / error
Whisper: Local transcription

A cron job fires every minute to start any scheduled recordings whose time has arrived. Auth uses magic-link email: no passwords, no OAuth flow required.
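
The per-minute check can be sketched as a filter over scheduled jobs. The ScheduledRecording class and its status values are hypothetical stand-ins for whatever the Django model actually stores.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ScheduledRecording:
    label: str
    start_at: datetime
    status: str = "scheduled"  # hypothetical status value


def due_recordings(recordings, now: datetime):
    """Return recordings that are still scheduled and whose start time has arrived."""
    return [
        r for r in recordings
        if r.status == "scheduled" and r.start_at <= now
    ]
```

Filtering on status as well as time means a job that is already recording is not started a second time on the next one-minute tick.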

Notification Pipeline

A hearing can run for four to six hours with no one watching it. The notification system is designed so that if something goes wrong, there is always an email that explains what happened and includes a direct link to retry.

Recording started: confirms that ffmpeg connected to the stream and is writing to disk. This fires as soon as the first byte is captured.
Hourly heartbeat: fires every hour while the recording is live, including the current file size and the last 20 lines of the ffmpeg log. This makes it easy to confirm the recording is still running during a long hearing.
Stopped early: fires if the recording ends before the expected duration. This indicates the stream may have dropped and prompts a manual check.
Failed to start: fires after all 15 retries are exhausted (30 minutes of attempts), with a direct link to retry manually. Triggered when the stream never goes live.
Worker crash: fires if any unhandled exception occurs inside the recording process, with the full Python traceback. This covers any error that surfaces as a Python exception.
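
The hourly heartbeat body can be sketched as the current file size plus the tail of the ffmpeg log. The function names and the message layout are assumptions for illustration.

```python
import os
from collections import deque


def tail_lines(path: str, count: int = 20) -> list[str]:
    """Return the last `count` lines of a text file."""
    with open(path, "r", errors="replace") as f:
        # A bounded deque keeps only the most recent lines while reading.
        return [line.rstrip("\n") for line in deque(f, maxlen=count)]


def format_size(num_bytes: int) -> str:
    """Render a byte count in gigabytes with one decimal place."""
    return f"{num_bytes / 1024 ** 3:.1f} GB"


def heartbeat_body(label: str, video_path: str, log_path: str) -> str:
    """Build the plain-text body of an hourly heartbeat email."""
    lines = [
        f"Recording: {label}",
        f"File size: {format_size(os.path.getsize(video_path))}",
        "",
        "Last ffmpeg log lines:",
        *tail_lines(log_path),
    ]
    return "\n".join(lines)
```

A file size that stops growing between two heartbeats is a quick sign that the recording has stalled even though the process is still alive.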

Application Mockups

Dashboard

Disk free: 48.3 GB

Test recording + New recording

	Label	Status	Started	Duration	File size	Drive
View	Jun 10 AM Session	Recording	9:00 AM	2h 14m	4.2 GB	Pending upload
View	Jun 10 AM Session (backup)	Recording	9:00 AM	2h 14m	4.1 GB	Pending upload
View	May 13 Full Day	Uploaded	May 13	5h 48m	deleted	Drive ↗
View	Apr 8 PM Session	Uploaded	Apr 8	3h 22m	deleted	Drive ↗

Recording Detail

Jun 10 AM Session

Recording Started 9:00 AM · running 2h 14m

Primary process

Status Active (PID 18432)

File size 4.2 GB

Output recording_12_jun10_am.mp4

Backup process

Status Active (PID 18433)

File size 4.1 GB

Output recording_12_jun10_am_backup.mp4

Spot-check (last 30s) · View log · Stop recording

Pipeline Test

Run a short test recording against a known-good public stream before a hearing to verify the full pipeline is working.

Stream	Description	Action
BBC News HD	Video + audio. Best all-around pipeline test.	Run test
BBC World Service	Audio only. Tests audio recording path.	Run test
PAcast (BOP)	The real stream. Only live during hearings.	Run test

After a test completes, use the Spot-check button on the recording detail page to download the last 30 seconds and confirm both audio and video are present before the hearing starts.
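
A spot-check like this could be built from two standard tools: ffmpeg to cut the final seconds of the file and ffprobe to list the stream types in the clip. This is a sketch under that assumption; the function names are hypothetical and the project's actual implementation may differ.

```python
def build_spot_check_command(recording_path: str, clip_path: str, seconds: int = 30) -> list[str]:
    """Build an ffmpeg command that copies the last `seconds` of a recording."""
    return [
        "ffmpeg",
        "-y",                      # overwrite an earlier clip
        "-sseof", f"-{seconds}",   # seek relative to the end of the input
        "-i", recording_path,
        "-c", "copy",              # no re-encoding, so the clip is produced quickly
        clip_path,
    ]


def build_probe_command(clip_path: str) -> list[str]:
    """Build an ffprobe command that prints one codec type per stream."""
    return [
        "ffprobe", "-v", "error",
        "-show_entries", "stream=codec_type",
        "-of", "csv=p=0",
        clip_path,
    ]


def has_audio_and_video(probe_output: str) -> bool:
    """True if the ffprobe output lists both a video and an audio stream."""
    kinds = {line.strip() for line in probe_output.splitlines() if line.strip()}
    return {"video", "audio"} <= kinds
```

For the audio-only test stream, the check would be expected to report audio without video, so the pass condition depends on which stream was tested.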

Impact

No one needs to be at a computer for the duration of a hearing. The server starts and stops the recording automatically on schedule.
Reduced streaming load on the broadcast by replacing multiple volunteer viewers with server-side HLS capture, improving stream quality for others watching live.
Redundant processes mean no single point of failure. If one recording crashes mid-hearing, the backup file captures the full session.
Every failure sends an immediate email with a direct link to retry, so nothing fails silently during a long hearing.
Recordings upload directly to a shared Drive folder, creating a consistent, accessible archive without manual file transfers.
The pipeline is testable before every hearing using pre-configured public streams, with a spot-check tool to verify audio and video are both present.
Local Whisper transcription produces a searchable text record of each hearing without sending audio data to any external service.

Tech Stack

Django, Python, ffmpeg, yt-dlp, faster-whisper, SQLite, Google Drive API, Magic Link Auth, Apache + Gunicorn, systemd + cron