Early in November, I was fortunate to be able to contribute to and attend Linux App Summit in Barcelona. I was able to work with some excellent people and develop our Voctomix-based streaming solution.
This is a great opportunity to write a short article outlining our Voctomix configuration, some of the lessons we learned during Linux App Summit, and some of the plans we have going forward.
Tuxedo Computers have been very supportive of our efforts towards free software live streaming and have loaned us two of their laptops to use during events and to drive our solution forward.
Traditionally, one of the primary constraints we've worked around is CPU power, leading us to look into solutions that delegate CPU- and GPU-intensive workloads to post-processing. Having access to these laptops allowed us to live-stream the event.
Linux App Summit was a great opportunity to test and develop our current configuration. We’ve found that free software events are great milestones to work towards.
Voctomix is a free software tool developed by the team behind the Chaos Communication Congress event. It is a collection of components, written in Python, that coordinate GStreamer pipelines across multiple devices. From a live streaming point of view, Voctomix provides a video mixer capable of mixing two live cameras and one screen-grabbing input.
Some of the key capabilities of Voctomix that matter to us:
- Voctomix can receive video streams over a TCP socket
- The Voctomix control panel can run on a separate device from the video mixer itself, allowing us to limit the blast radius of a failure
- The interfaces for sources (audio/video input) and sinks are TCP sockets with a common GStreamer-based protocol.
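To illustrate that socket interface, a test source can be pushed into the Voctomix core with a plain GStreamer pipeline. This is only a sketch: the port (10000) and the raw caps below are assumptions and must match your voctocore configuration.

```shell
# Sketch: feed a synthetic audio/video source into a Voctomix source port.
# Voctomix expects raw audio and video wrapped in Matroska over TCP;
# port 10000 and the caps here are assumptions, not our exact config.
gst-launch-1.0 \
  videotestsrc is-live=true ! \
    video/x-raw,format=I420,width=1280,height=720,framerate=25/1 ! mux. \
  audiotestsrc is-live=true ! \
    audio/x-raw,format=S16LE,channels=2,rate=48000 ! mux. \
  matroskamux name=mux ! \
  tcpclientsink host=127.0.0.1 port=10000
```

Any real camera pipeline follows the same shape: decode to raw, wrap in Matroska, push to the source port.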
For Linux App Summit, we decided to configure one camera to track the presenter and another to show a wide view of the conference hall. Alongside the two camera inputs, we also configured an HDMI grabber to capture the output of the presenter's laptop.
- We used a Canon HDV30 video camera as a zoom camera, capturing its output over FireWire.
- We used a Logitech C920 webcam as a wide camera. The C920 is capable of providing a 720p H264 feed.
- We used the sender component of a Lenkeng HDMI extender to capture HDMI output from the presenter's laptop; I'll detail more on this in a future entry.
As for output feeds, we capture the Voctomix output and split it into two feeds:
- A YouTube feed
- Timestamped H264 files, split hourly.
A common feature of most of our video systems is that we normally capture video or audio on different nodes on the same LAN. While it's possible to send raw video directly to the Voctomix server, the bandwidth required for raw 1080p video at 30 frames per second is prohibitive on most commodity hardware.
As we normally don't work with broadcast-quality cameras, we can take a small hit on perceived video quality to solve this problem: we encode video to H264 with a low-CPU preset and send it to a shared RTMP server on the same LAN.
In this configuration, we run a pre-configured Docker container to manage live transport of video from the source hosts.
This container runs as a service on the same machine as Voctomix to limit the impact on latency.
docker run -d -p 1935:1935 --name nginx-rtmp tiangolo/nginx-rtmp
FireWire as a technology is still impressive by today's standards; we can rely on it as a transport mechanism for 1080i video. It is, however, unlikely that we will find modern hardware with a FireWire port.
For this event, we used the FireWire output of a Canon HDV30 camera, connected to a Lenovo X200 laptop via a FireWire ExpressCard. We take the video from the FireWire input, convert it to H264 using FFmpeg, and send it to the RTMP media bus running on the video mixer.
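The capture side on the X200 looked something like the following sketch. The RTMP host name and stream name are placeholders, and the preset is a low-CPU choice as described earlier.

```shell
# Sketch: grab DV/HDV from the FireWire port with FFmpeg's iec61883 input
# device and push it to the RTMP bus on the mixer.
# "mixer.lan" and "zoomcam" are placeholders, not our exact values.
ffmpeg -f iec61883 -i auto \
  -c:v libx264 -preset ultrafast -tune zerolatency \
  -c:a aac -b:a 128k \
  -f flv rtmp://mixer.lan/live/zoomcam
```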
On the mixer laptop, we can consume the RTMP feed, convert it back to raw video and pass this directly on to the Voctomix Server.
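That hop back to raw video can be done with a single FFmpeg invocation on the mixer. This is a sketch; the voctocore source port is an assumption and must match your configuration.

```shell
# Sketch: pull the camera feed off the RTMP bus, decode it back to raw
# audio/video, and hand it to the local voctocore source port as Matroska.
# Port 10000 and the stream name are assumptions.
ffmpeg -i rtmp://127.0.0.1/live/zoomcam \
  -c:v rawvideo -pix_fmt yuv420p \
  -c:a pcm_s16le \
  -f matroska tcp://127.0.0.1:10000
```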
As a second camera input, we used a C920 webcam. The C920 is commonly used by game streamers on Twitch; it is designed around assumptions about lighting and focus that maximise quality in the majority of cases.
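Because the C920 already encodes H264 on-board, its feed can be forwarded to the RTMP bus without a CPU-heavy re-encode. In this sketch, the device path and stream name are assumptions.

```shell
# Sketch: take the C920's on-board 720p H264 feed straight from V4L2 and
# remux it to the RTMP bus without re-encoding.
# /dev/video0, "mixer.lan", and "widecam" are assumptions.
ffmpeg -f v4l2 -input_format h264 -video_size 1280x720 -i /dev/video0 \
  -c:v copy -an \
  -f flv rtmp://mixer.lan/live/widecam
```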
Similar to the way we ingest video feeds into Voctomix, we can consume the output stream: using either FFmpeg or GStreamer, we can take the feed, process it, then either stream it to another location or save it to a local file. In this case, we worked with two consumers.
while true; do
  …   # Consumer 1: the YouTube feed
done

while true; do
  …   # Consumer 2: hourly timestamped H264 files
done
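The two consumer loops can be sketched as follows; each loop restarts its FFmpeg process if the connection drops. The Voctomix output port, the encoder settings, and the YouTube stream key are all assumptions, not our exact values.

```shell
# Consumer 1: encode the programme output and push it to YouTube.
# Port 11000 and STREAM_KEY are placeholders.
while true; do
  ffmpeg -i tcp://127.0.0.1:11000 \
    -c:v libx264 -preset veryfast -c:a aac -b:a 128k \
    -f flv "rtmp://a.rtmp.youtube.com/live2/STREAM_KEY"
  sleep 1
done &

# Consumer 2: encode the same output into timestamped files, split hourly.
while true; do
  ffmpeg -i tcp://127.0.0.1:11000 \
    -c:v libx264 -preset veryfast -c:a aac -b:a 128k \
    -f segment -segment_time 3600 -strftime 1 \
    "recording-%Y%m%d-%H%M%S.mkv"
  sleep 1
done
```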
Following the event, we're able to focus on three primary challenges:
We had some challenges around audio input, so we're going to try a few strategies to improve audio capture across multiple devices. We would traditionally deploy several boundary microphones as a backup; I'm interested in finding a more practical way to utilise these on a live stream.
Voctomix has been built following a microservice-style approach, which gives us some flexibility around the deployment of components. I would like to apply some DevOps principles to how we develop our video pipeline, using tools such as Ansible to deploy changes across each device. It would also be sensible to build system packages for some of our components.
Post-processing is currently a very manual process. This year, we looked into storing all mixer events in a time-series database while retaining all raw footage for later processing. I’m planning to investigate how we can automatically generate a Kdenlive file from this data, allowing us to “Re-Master” our live events.