Mark Oliver's World

Posted: 20/01/2022

Working With FFmpeg

I have been working with audio and video files a fair bit recently in my day job. We build video and audio call recording software, which has given me the chance to play around with FFmpeg.

I have had to:

  • Create test videos in MP4 and H264
  • Create test audio files in WAV & MP3
  • Crop a video down to a smaller view (e.g. remove monitors 1 and 2 from a video that shows 3 monitors)
  • Convert an H264 stream to MP4
  • Convert WAV to MP3
  • Resample video frame rates
  • Combine multiple videos into a single video to present a "zoom like" combined view.
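As a rough illustration of a few of the simpler tasks in that list, here is a minimal Python sketch that builds FFmpeg argument lists. These are not the exact commands I used at work. The file names, the 1280x720 test size, the three-monitor 1920x1080 layout and the helper function names are all assumptions made up for the example. It relies on the standard testsrc source and the crop and fps filters, and on an FFmpeg build that includes libmp3lame.

```python
import shlex


def make_test_video_cmd(out="test.mp4", seconds=5):
    # testsrc is FFmpeg's built-in test pattern, read through the lavfi virtual input.
    src = f"testsrc=duration={seconds}:size=1280x720:rate=25"
    return ["ffmpeg", "-f", "lavfi", "-i", src, "-pix_fmt", "yuv420p", out]


def crop_cmd(src, out, width, height, x, y):
    # crop=w:h:x:y keeps a width x height window whose top-left corner is (x, y).
    return ["ffmpeg", "-i", src, "-vf", f"crop={width}:{height}:{x}:{y}", out]


def wav_to_mp3_cmd(src, out):
    # Assumes an FFmpeg build that includes the libmp3lame encoder.
    return ["ffmpeg", "-i", src, "-c:a", "libmp3lame", "-q:a", "2", out]


def resample_fps_cmd(src, out, fps):
    # The fps filter drops or duplicates frames to reach the target frame rate.
    return ["ffmpeg", "-i", src, "-vf", f"fps={fps}", out]


if __name__ == "__main__":
    # Hypothetical layout: three 1920x1080 monitors side by side, keep only monitor 3.
    print(shlex.join(crop_cmd("three_monitors.mp4", "monitor3.mp4", 1920, 1080, 3840, 0)))
    print(shlex.join(wav_to_mp3_cmd("call.wav", "call.mp3")))
    print(shlex.join(resample_fps_cmd("call.mp4", "call_15fps.mp4", 15)))
```

Each helper only returns the argument list, so you can print it to check it, or hand it to subprocess.run once you are happy with it.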

Combining multiple videos has been the hardest so far, as it requires the "filter_complex" option. I will take you through how I generated it. This is the complete command:

ffmpeg -i vid1.h264 -i vid2.h264 ... -i vid100.h264 -filter_complex "hstack=8,format=yuv420p,scale=1024:-1" -c:v libx264 -crf 18 output.mp4

  • ffmpeg is the tool we are using. An awesome audio and video manipulation and generation tool.
  • -i <filename> An input file. Specify this once for each input video.
  • -filter_complex The type of filter we are going to use, and uh oh, it's a complex one.
  • hstack=X Stacks the inputs side by side in a single row, where X is the number of input files above (hstack=8 expects 8 inputs). All of the inputs need to have the same height.
  • ,format=yuv420p Defines the pixel format to use.
  • ,scale=1024:-1 Sets the combined video's total width to 1024 pixels. The -1 tells FFmpeg to work out the height so the aspect ratio is kept. If the X sub videos are the same size, each one ends up with 1/X of that width.
  • -c:v libx264 This is the H.264/AVC encoder to use: https://ffmpeg.org/ffmpeg-codecs.html#libx264_002c-libx264rgb
  • -crf 18 The Constant Rate Factor. The range of the quantizer scale is 0-51, where 0 is lossless, 23 is the default, and 51 is the worst possible. A lower value gives higher quality, and a subjectively sane range is 18-28. Consider 18 to be visually lossless or nearly so: it should look the same or nearly the same as the input, but it isn't technically lossless. (Source: https://trac.ffmpeg.org/wiki/Encode/H.264)
  • output.mp4 The filename to output the combined video as.
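Typing out one -i per file and keeping the hstack count in step gets tedious, so here is a minimal Python sketch that assembles the same kind of command from a list of files. The function name and its defaults are made up for illustration. One deliberate difference from the command above: it uses scale=WIDTH:-2 rather than -1, because -2 also rounds the calculated height to an even number, which libx264 needs when the pixel format is yuv420p.

```python
import shlex


def hstack_cmd(inputs, out="output.mp4", width=1024, crf=18):
    """Build an ffmpeg argument list that stacks `inputs` in a single row."""
    if len(inputs) < 2:
        raise ValueError("hstack needs at least two inputs")
    args = ["ffmpeg"]
    for path in inputs:
        args += ["-i", path]
    # inputs=N must match the number of -i files; hstack also expects equal heights.
    graph = f"hstack=inputs={len(inputs)},format=yuv420p,scale={width}:-2"
    args += ["-filter_complex", graph, "-c:v", "libx264", "-crf", str(crf), out]
    return args


if __name__ == "__main__":
    files = [f"vid{i}.h264" for i in range(1, 9)]  # hypothetical file names
    print(shlex.join(hstack_cmd(files)))
```

Because the count comes from len(inputs), the filter and the input list cannot drift apart.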

This Stack Overflow answer was really helpful: https://stackoverflow.com/a/33764934/15722683

You will have noticed that this command only places the videos in a single row, which is not what I wanted. To solve this, in comes xstack, which makes you define an x/y layout for your videos.

Optionally, you may need to resize your videos to match, to give an even view. This is what I came up with:

ffmpeg -i 1.h264 -i 2.h264 -i 3.h264 -i 4.h264 -i 5.h264 -i 6.h264 -i 7.h264 -i 8.h264 -i 9.h264 -i 10.h264 -i 11.h264 -i 12.h264 -i 13.h264 -i 14.h264 -i 15.h264 -i 16.h264 -filter_complex "[0:v]scale=iw/4:-1[v0];[1:v]scale=iw/4:-1[v1];[2:v]scale=iw/4:-1[v2];[3:v]scale=iw/4:-1[v3];[4:v]scale=iw/4:-1[v4];[5:v]scale=iw/4:-1[v5];[6:v]scale=iw/4:-1[v6];[7:v]scale=iw/4:-1[v7];[8:v]scale=iw/4:-1[v8];[9:v]scale=iw/4:-1[v9];[10:v]scale=iw/4:-1[v10];[11:v]scale=iw/4:-1[v11];[12:v]scale=iw/4:-1[v12];[13:v]scale=iw/4:-1[v13];[14:v]scale=iw/4:-1[v14];[15:v]scale=iw/4:-1[v15];[v0][v1][v2][v3][v4][v5][v6][v7][v8][v9][v10][v11][v12][v13][v14][v15]xstack=inputs=16:layout=0_0|w0_0|w0+w1_0|w0+w1+w2_0|0_h0|w4_h0|w4+w5_h0|w4+w5+w6_h0|0_h0+h4|w8_h0+h4|w8+w9_h0+h4|w8+w9+w10_h0+h4|0_h0+h4+h8|w12_h0+h4+h8|w12+w13_h0+h4+h8|w12+w13+w14_h0+h4+h8" output.mp4

I will figure out what that command breaks down to another day, but for now, it takes 16 different input files, resizes them all, and then stitches them together in a 4x4 grid.
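In the meantime, the pattern in that filter string is regular enough to generate. This is a minimal Python sketch (the function name is my own invention) that rebuilds the same filter graph for any rows x cols grid. Each input is scaled to 1/cols of its width and labelled [vN]. Each cell's x position is the sum of the widths of the cells to its left in the same row, and its y position is the sum of the heights of the first cell in each row above it. It assumes all the inputs are the same size, as in my case.

```python
def xstack_graph(rows, cols):
    """Return a filter_complex string that tiles rows*cols inputs into a grid."""
    n = rows * cols
    # Step 1: shrink every input and give it a label, e.g. [0:v]scale=iw/4:-1[v0]
    scales = [f"[{i}:v]scale=iw/{cols}:-1[v{i}]" for i in range(n)]
    # Step 2: work out the x_y position of every cell.
    cells = []
    for r in range(rows):
        first_in_row = r * cols
        for c in range(cols):
            # x = widths of the cells to the left in this row, or 0 for the first column
            x = "+".join(f"w{first_in_row + k}" for k in range(c)) or "0"
            # y = heights of the first cell of each row above, or 0 for the first row
            y = "+".join(f"h{k * cols}" for k in range(r)) or "0"
            cells.append(f"{x}_{y}")
    labels = "".join(f"[v{i}]" for i in range(n))
    layout = "|".join(cells)
    return ";".join(scales) + f";{labels}xstack=inputs={n}:layout={layout}"


if __name__ == "__main__":
    print(xstack_graph(4, 4))
```

Calling xstack_graph(4, 4) gives the filter string used in the command above, and xstack_graph(2, 2) gives the layout 0_0|w0_0|0_h0|w2_h0 for a four-video grid.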

Immediately afterwards, I was asked to handle variable-length files and pad them so they start and finish at different times, which again can be achieved with the filter_complex option in FFmpeg. Did I say how awesome it is?
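I have not settled on the final approach yet, so treat this as one possible sketch rather than what I ended up shipping. It assumes FFmpeg's tpad filter, which can add extra frames before and after a video, and it assumes you already know each clip's start offset and length in seconds. The function name and the [pN] labels are made up for the example.

```python
def tpad_filter(index, start_offset, clip_length, total_length):
    """Pad input `index` so it starts at start_offset and lasts total_length seconds."""
    tail = total_length - start_offset - clip_length
    if start_offset < 0 or tail < 0:
        raise ValueError("clip does not fit inside the total length")
    # start_duration / stop_duration are the seconds of padding before and after the clip.
    return f"[{index}:v]tpad=start_duration={start_offset}:stop_duration={tail}[p{index}]"


if __name__ == "__main__":
    # Hypothetical call: clip 1 joins 2 seconds in, runs for 5 seconds, in a 10 second recording.
    print(tpad_filter(1, 2, 5, 10))
```

The padded [pN] outputs would then be scaled and fed into xstack in place of the raw inputs.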

I'll write more about this soon, but for now, I am dealing with some memory reduction issues at work, so not focussed on this.

Enjoy your day, and thanks for reading.


Thanks for reading this post.

If you want to reach out, catch me on Twitter!

I am always open to mentoring people, so get in touch.