Encoding with x264

H.264 has been the de facto standard video format across the internet for the past decade. It is widely supported for playback in all modern browsers and many hardware devices such as gaming consoles and phones. It provides better video quality at smaller file sizes compared to its predecessors.

x264 is a mature, free, open-source encoder for the H.264 video format.

Prerequisites

To get started, you’ll need two things:

A video to encode. For the examples, we will pipe in a video from VapourSynth, which you should be able to do if you’ve been following the previous sections of this guide.
The x264 encoder.

Here’s how we get a copy of the x264 encoder:

Windows

Official Windows builds are available here.

Linux/macOS

Generally, x264 will be available through your distribution’s package manager. Here are a few examples:

Ubuntu/Debian: sudo apt install x264
Arch Linux: sudo pacman -S x264
macOS: brew install x264

Getting Started

x264 is very configurable, and the options may seem overwhelming. But you can get started encoding by using the presets x264 provides and understanding a few basic concepts. We’ll walk through those concepts with the following examples.

Example 1: General-Purpose Encoding

Open up a terminal window, and navigate to the folder where your VapourSynth script lives. Let’s run the following command:

vspipe --y4m myvideo.vpy - | x264 --demuxer y4m --preset veryfast --tune animation --crf 24 -o x264output.mkv -

Let’s run through what each of these options means:

`vspipe --y4m myvideo.vpy -`

This portion loads your VapourSynth script and pipes it to stdout, adding y4m headers that x264 can decode. If you use Linux, you’re probably familiar with how piping works. If you’re not, it’s basically a way of chaining two commands together. In this case, we want to chain vspipe, the program that reads VapourSynth scripts, with x264, our encoder.

`--demuxer y4m`

This tells x264 that we’re providing it with a y4m file. This matches up with the --y4m flag that we gave to the vspipe command.

`--preset veryfast`

x264 has a set of presets to switch between faster encoding, or higher quality. The full list of presets, from fastest to slowest, is:

ultrafast
superfast
veryfast
faster
fast
medium
slow
slower
veryslow
placebo

You will almost never want to use the extreme settings, but generally, if you want good quality and don’t care about how long the encode takes, slower or veryslow are recommended. In this example, because we are just demonstrating how x264 works, we want a fast encode and have chosen veryfast.

For the curious, you can see a full list of the settings enabled by each preset by running x264 --fullhelp | less (Linux/Mac) or x264 --fullhelp | more (Windows). However, this probably won’t mean much at the moment. Don’t worry, this page will explain later what all of those settings mean.

Disclaimer: x264’s fullhelp is not guaranteed to be up-to-date.

`--tune animation`

Beyond the preset chosen, x264 allows us to further tune the encoding settings for the type of content we’re working with. The following tunings are generally the most useful:

film: Recommended for live action videos.
animation: Recommended for anime or cartoons with flat textures. For 3D animation (e.g. Pixar movies), you may find better results with film.
grain: Recommended for particularly grainy films.

You don’t need to use a tuning, but it generally helps to produce a better-looking video.

`--crf 24`

Constant Rate Factor (CRF) is a constant-quality, 1-pass encoding mode. In layman’s terms, this means that we don’t need the output to meet a specific filesize, we just want the output to meet a certain quality level. CRF ranges from 0 to 51 (for 8-bit encoding), with 0 being the best quality and 51 being the smallest filesize, but there is a certain range of CRF settings that are generally most useful. Here are some guidelines:

CRF 13: This is considered visually lossless to videophiles. This can produce rather large files, but is a good choice if you want high quality videos. Some fansubbing groups use this for Blu-ray encodes.
CRF 16-18: This is considered visually lossless to most viewers, and leans toward high quality while still providing a reasonable filesize. This is a typical range for fansub encodes.
CRF 21-24: This provides a good balance between quality and filesize. Some quality loss becomes visible, but this is generally a good choice where filesize becomes a concern, such as for videos viewed over the internet.
CRF 26-30: This prioritizes filesize, and quality loss becomes more obvious. It is generally not recommended to go higher than CRF 30 in any real-world encoding scenario, unless you want your videos to look like they were made for dial-up.

`-o x264output.mkv -`

This last portion tells which files to use for the input and output. We use -o to tell which filename to write the encoded file to. In this case, x264 will write a file at x264output.mkv in the current directory.

The last argument we are passing to x264 is the input file. In this case, we pass - for the input file, which tells x264 to use the piped output from vspipe. The input argument is the only positional argument, so it does not need to be last; x264 will recognize it as the only argument without a -- flag before it.

Example 2: Targeted File Size

For the next example, let’s say we want to make sure our encode fits onto a single 4.7GB DVD¹. How would we do that in x264?

First, we’ll need to figure out what bitrate our encode should be, in kilobits per second. This means we’ll need to know a couple of things:

The length of our video, in seconds. For this example, let’s say our movie is 2 hours (120 minutes) long. We’ll convert that to seconds: 120 minutes * 60 minutes/second = 7200 seconds.
Our target filesize. We know that this is 4.7GB, but we need to convert it to kilobits. We can do this with the following steps:

$$ \begin{aligned} 4.7~\mathrm{GB}\times \frac{1000~\mathrm{MB}}{\mathrm{GB}} &= 4700~\mathrm{MB}\\ 4700~\mathrm{MB}\times \frac{1000~\mathrm{KB}}{\mathrm{MB}} &= 4,700,000~\mathrm{KB}\\ 4,700,000~\mathrm{KB}\times \frac{8~\mathrm{Kbit}}{\mathrm{KB}} &= 37,600,000~\mathrm{Kbit} \end{aligned} $$

Now we divide the kilobit size we calculated by our video length, to find our kilobit per second target bitrate:

$$ 37,600,000~\mathrm{Kbit}\div 7200~\mathrm{seconds} \approx 5222~\mathrm{Kbps} $$

There is also a python script that can handle this calculation for us:

>>> from bitrate_filesize import *
>>> find_bitrate('4.7 GB', seconds=7200)
bitrate should be 5,222 kbps

And here’s how we could add that to our x264 command:

vspipe --y4m myvideo.vpy - | x264 --demuxer y4m --preset veryfast --bitrate 5222 -o x264output.mkv -

The --bitrate option, by itself, says that we want to do a 1-pass, average-bitrate encode. In other words, the encoder will still give more bits to sections of the video that have more detail or motion, but the average bitrate of the video will be close to what we requested.

Example 3: 2-Pass Encoding

So far, we’ve only done 1-pass encodes. While using CRF 1-pass is great when you don’t have a target bitrate, it’s recommended not to use 1-pass for targeted-bitrate encodes, because the encoder can’t know what’s coming ahead of the current section of video. This means it can’t make good decisions about what parts of the video need the most bitrate.

How do we fix this? x264 supports what is known as 2-pass encoding. In 2-pass mode, x264 runs through the video twice, the first time analyzing it to determine where to place keyframes and which sections of video need the most bitrate, and the second time performing the actual encode. 2-pass mode is highly recommended if you need to target a certain bitrate.

Here’s how we would run our first pass:

vspipe --y4m myvideo.vpy - | x264 --demuxer y4m --preset veryfast --pass 1 --bitrate 5222 -o x264output.mkv -

This creates a stats file in our current directory, which x264 will use in the second pass:

vspipe --y4m myvideo.vpy - | x264 --demuxer y4m --preset veryfast --pass 2 --bitrate 5222 -o x264output.mkv -

You’ll notice all we had to change was --pass 1 to --pass 2. Simple!

Although x264 will automatically use faster settings for the first pass, it should be no surprise that 2-pass encoding is slower than 1-pass encoding. Therefore, there are still certain use cases where 1-pass, bitrate-targeted video is a good fit, such as streaming.

Recap

We covered the basics of how to encode in x264, including speed presets, tunings, and three different encoding modes.

Here is a summary of when to use each encoding mode:

1-pass Constant Quality (CRF):
- Good for: General-purpose encoding
- Bad for: Streaming; obtaining a certain file size
1-pass Average Bitrate:
- Good for: Streaming
- Bad for: Everything else
2-pass Average Bitrate:
- Good for: Obtaining a certain file size
- Bad for: Streaming

Advanced Configuration

Coming Soon

Source: https://web.archive.org/web/20190203114601/http://www.mpeg.org/MPEG/DVD/Book_A/Specs.html