*********************************
Application Programming Interface
*********************************

Introduction
============

x265 is written primarily in C++ and x86 assembly language, but the
public-facing programming interface is C for the widest possible
portability. This C interface is wholly defined within :file:`x265.h`
in the source/ folder of our source tree. All of the functions,
variables, and enumerations meant to be used by the end-user are present
in this header.

x265 keeps its public API as close as possible to x264's public API,
so those familiar with using x264 through its C interface will find
x265 quite familiar.

This file is meant to be read in order; the narrative follows linearly
through the various sections.

Build Considerations
====================

The choice of Main or Main10 profile encodes is made at compile time;
the internal pixel depth influences a great many variable sizes, and
thus 8-bit and 10-bit pixels are handled as different build options
(primarily to maintain the performance of the 8-bit builds). libx265
exports a variable **x265_max_bit_depth** which indicates how the
library was compiled (it will contain a value of 8 or 10). Further,
**x265_version_str** is a pointer to a string indicating the version of
x265 which was compiled, and **x265_build_info_str** is a pointer to a
string identifying the compiler and build options.

.. Note::

    **x265_version_str** is only updated when **cmake** runs. If you are
    making binaries for others to use, it is recommended to run
    **cmake** prior to **make** in your build scripts.

x265 will accept input pixels of any depth between 8 and 16 bits
regardless of the depth of its internal pixels (8 or 10). It will shift
and mask input pixels as required to reach the internal depth. If
downshifting is being performed using our CLI application, the
:option:`--dither` option may be enabled to reduce banding. This feature
is not available through the C interface.
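
The shift-and-mask step can be illustrated with a small stand-alone
sketch. This is not x265's implementation: the helper name
``convert_sample_depth`` is hypothetical, and plain truncation when
downshifting (no rounding, no dithering) is an assumption made for
illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of x265.h): convert one sample from
 * in_depth bits to out_depth bits by shifting, then mask the result to
 * out_depth bits. Truncation on downshift is an assumption. */
static uint16_t convert_sample_depth(uint16_t sample, int in_depth, int out_depth)
{
    assert(in_depth >= 8 && in_depth <= 16);
    assert(out_depth >= 8 && out_depth <= 16);

    uint32_t value = sample;
    if (in_depth > out_depth)
        value >>= (in_depth - out_depth);   /* downshift: drop low bits */
    else
        value <<= (out_depth - in_depth);   /* upshift: append zero bits */

    uint32_t mask = (1u << out_depth) - 1u; /* keep only out_depth bits */
    return (uint16_t)(value & mask);
}
```

Under these assumptions a 10-bit sample of 1023 becomes 255 at 8 bits,
and an 8-bit sample of 255 becomes 1020 at 10 bits.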

Encoder
=======

The primary object in x265 is the encoder object, and this is
represented in the public API as an opaque typedef **x265_encoder**.
Pointers of this type are passed to most encoder functions.

A single encoder generates a single output bitstream from a sequence of
raw input pictures. Thus if you need multiple output bitstreams you
must allocate multiple encoders. You may pass the same input pictures
to multiple encoders; the encode function does not modify the input
picture structures (the pictures are copied into the encoder as the
first step of encode).

Encoder allocation is a reentrant function, so multiple encoders may be
safely allocated in a single process. The encoder access functions are
not reentrant for a single encoder, so the recommended use case is to
allocate one client thread per encoder instance (one thread for all
encoder instances is possible, but some encoder access functions are
blocking and thus this would be less efficient).

.. Note::

    There is one caveat to having multiple encoders within a single
    process. All of the encoders must use the same maximum CTU size
    because many global variables are configured based on this size.
    Encoder allocation will fail if a mismatched CTU size is attempted.

An encoder is allocated by calling **x265_encoder_open()**::

    /* x265_encoder_open:
     * create a new encoder handler, all parameters from x265_param are copied */
    x265_encoder* x265_encoder_open(x265_param *);

The returned pointer is then passed to all of the functions pertaining
to this encode. A large amount of memory is allocated during this
function call, but the encoder will continue to allocate memory as the
first pictures are passed to the encoder, until its pool of picture
structures is large enough to handle all of the pictures it must keep
internally. The pool size is determined by the lookahead depth, the
number of frame threads, and the maximum number of references.

As indicated in the comment, **x265_param** is copied internally so the user
may release their copy after allocating the encoder. Changes made to
their copy of the param structure have no effect on the encoder after it
has been allocated.

Param
=====

The **x265_param** structure describes everything the encoder needs to
know about the input pictures and the output bitstream and almost
everything in between.

The recommended way to handle these param structures is to allocate them
from libx265 via::

    /* x265_param_alloc:
     * Allocates an x265_param instance. The returned param structure is not
     * special in any way, but using this method together with x265_param_free()
     * and x265_param_parse() to set values by name allows the application to treat
     * x265_param as an opaque data struct for version safety */
    x265_param *x265_param_alloc();

In this way, your application does not need to know the exact size of
the param structure (the build of x265 could potentially be a bit newer
than the copy of :file:`x265.h` that your application compiled against).

Next you perform the initial *rough cut* configuration of the encoder by
choosing a performance preset and optional tune factor.
**x265_preset_names** and **x265_tune_names** respectively hold the
string names of the presets and tune factors (see :ref:`presets
<preset-tune-ref>` for more detail on presets and tune factors)::

    /* returns 0 on success, negative on failure (e.g. invalid preset/tune name). */
    int x265_param_default_preset(x265_param *, const char *preset, const char *tune);

Now you may optionally specify a profile. **x265_profile_names**
contains the string names this function accepts::

    /* (can be NULL, in which case the function will do nothing)
     * returns 0 on success, negative on failure (e.g. invalid profile name). */
    int x265_param_apply_profile(x265_param *, const char *profile);

Finally you configure any remaining options by name using repeated calls to::

    /* x265_param_parse:
     * set one parameter by name.
     * returns 0 on success, or returns one of the following errors.
     * note: BAD_VALUE occurs only if it can't even parse the value,
     * numerical range is not checked until x265_encoder_open().
     * value=NULL means "true" for boolean options, but is a BAD_VALUE for non-booleans. */
    #define X265_PARAM_BAD_NAME (-1)
    #define X265_PARAM_BAD_VALUE (-2)
    int x265_param_parse(x265_param *p, const char *name, const char *value);

See :ref:`string options <string-options-ref>` for the list of options (and their
descriptions) which can be set by **x265_param_parse()**.
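
Because **x265_param_parse()** distinguishes a bad name from a bad
value, an application may want to report the two cases differently. The
following is a minimal sketch of such reporting; ``parse_result_text``
and ``check_parse_result`` are hypothetical helpers, not part of
:file:`x265.h`, and the two error values are repeated from the listing
above only so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stdio.h>

/* Values as documented for x265_param_parse(); guarded so this sketch
 * also compiles when x265.h has already defined them. */
#ifndef X265_PARAM_BAD_NAME
#define X265_PARAM_BAD_NAME  (-1)
#endif
#ifndef X265_PARAM_BAD_VALUE
#define X265_PARAM_BAD_VALUE (-2)
#endif

/* Hypothetical helper: map an x265_param_parse() return code to text. */
static const char *parse_result_text(int rc)
{
    switch (rc) {
    case 0:                    return "ok";
    case X265_PARAM_BAD_NAME:  return "unknown option name";
    case X265_PARAM_BAD_VALUE: return "value could not be parsed";
    default:                   return "unexpected return code";
    }
}

/* Hypothetical helper: print a diagnostic for a failed parse.
 * Returns 0 when rc indicates success, -1 otherwise. */
static int check_parse_result(const char *name, const char *value, int rc)
{
    assert(name != NULL);
    if (rc == 0)
        return 0;
    fprintf(stderr, "option %s=%s rejected: %s\n",
            name, value ? value : "(none)", parse_result_text(rc));
    return -1;
}
```

A call site might then read
``check_parse_result("crf", "28", x265_param_parse(param, "crf", "28"))``.
Note that a value which parses but is out of range is not reported here;
as stated above, ranges are only checked by **x265_encoder_open()**.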

After the encoder has been created, you may release the param structure::

    /* x265_param_free:
     * Use x265_param_free() to release storage for an x265_param instance
     * allocated by x265_param_alloc() */
    void x265_param_free(x265_param *);

.. Note::

    Using these methods to allocate and release the param structures
    helps future-proof your code in many ways, but the x265 API is
    versioned in such a way that we prevent linkage against a build of
    x265 that does not match the version of the header you are compiling
    against. This is the function of the X265_BUILD macro.

**x265_encoder_parameters()** may be used to get a copy of the param
structure from the encoder after it has been opened, in order to see the
changes made to the parameters for auto-detection and other reasons::

    /* x265_encoder_parameters:
     * copies the current internal set of parameters to the pointer provided
     * by the caller. useful when the calling application needs to know
     * how x265_encoder_open has changed the parameters.
     * note that the data accessible through pointers in the returned param struct
     * (e.g. filenames) should not be modified by the calling application. */
    void x265_encoder_parameters(x265_encoder *, x265_param *);

Pictures
========

Raw pictures are passed to the encoder via the **x265_picture** structure.
Just like the param structure, we recommend you allocate this structure
from the library to avoid potential size mismatches::

    /* x265_picture_alloc:
     * Allocates an x265_picture instance. The returned picture structure is not
     * special in any way, but using this method together with x265_picture_free()
     * and x265_picture_init() allows some version safety. New picture fields will
     * always be added to the end of x265_picture */
    x265_picture *x265_picture_alloc();

Regardless of whether you allocate your picture structure this way or
whether you simply declare it on the stack, your next step is to
initialize the structure via::

    /***
     * Initialize an x265_picture structure to default values. It sets the pixel
     * depth and color space to the encoder's internal values and sets the slice
     * type to auto - so the lookahead will determine slice type.
     */
    void x265_picture_init(x265_param *param, x265_picture *pic);

x265 does not perform any color space conversions, so the raw picture's
color space (chroma sampling) must match the color space specified in
the param structure used to allocate the encoder. **x265_picture_init**
initializes this field to the internal color space and it is best to
leave it unmodified.

The picture bit depth is initialized to be the encoder's internal bit
depth, but this value should be changed to the actual depth of the pixels
being passed into the encoder. If the picture bit depth is more than 8,
the encoder assumes two bytes are used to represent each sample
(little-endian shorts).

The user is responsible for setting the plane pointers and plane strides
(in units of bytes, not pixels). The presentation time stamp (**pts**)
is optional, depending on whether you need accurate decode time stamps
(**dts**) on output.
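
Since strides are expressed in bytes, they depend on the picture bit
depth as well as on the width. The sketch below computes strides and
plane sizes for a tightly packed planar 4:2:0 buffer. The
``plane_layout`` type and ``layout_420`` function are hypothetical and
not part of :file:`x265.h`; 4:2:0 sampling and the absence of row padding
are assumptions of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a planar 4:2:0 buffer; not an x265 type. */
typedef struct plane_layout {
    int    stride[3];   /* bytes per row of the Y, Cb and Cr planes */
    size_t size[3];     /* total bytes in each plane */
} plane_layout;

/* Compute byte strides and plane sizes for a tightly packed 4:2:0
 * picture. Samples deeper than 8 bits occupy two bytes each. */
static plane_layout layout_420(int width, int height, int bit_depth)
{
    assert(width > 0 && height > 0);
    assert(bit_depth >= 8 && bit_depth <= 16);

    int bytes_per_sample = bit_depth > 8 ? 2 : 1;
    int chroma_width  = (width + 1) / 2;    /* chroma is halved, rounded up */
    int chroma_height = (height + 1) / 2;

    plane_layout l;
    l.stride[0] = width * bytes_per_sample;
    l.stride[1] = l.stride[2] = chroma_width * bytes_per_sample;
    l.size[0] = (size_t)l.stride[0] * (size_t)height;
    l.size[1] = l.size[2] = (size_t)l.stride[1] * (size_t)chroma_height;
    return l;
}
```

The resulting stride values are what an application would copy into the
stride members of **x265_picture**, with the plane pointers set to
buffers of at least the computed sizes.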

If you wish to override the lookahead or rate control for a given
picture you may specify a slicetype other than X265_TYPE_AUTO, or a
forceQP value other than 0.

x265 does not modify the picture structure provided as input, so you may
reuse a single **x265_picture** for all pictures passed to a single
encoder, or even all pictures passed to multiple encoders.

Structures allocated from the library should eventually be released::

    /* x265_picture_free:
     * Use x265_picture_free() to release storage for an x265_picture instance
     * allocated by x265_picture_alloc() */
    void x265_picture_free(x265_picture *);


Analysis Buffers
================

Analysis information can be saved and reused between encodes of the
same video sequence (generally for multiple bitrate encodes). The best
results are attained by saving the analysis information of the highest
bitrate encode and reusing it in lower bitrate encodes.

When saving or loading analysis data, buffers must be allocated for
every picture passed into the encoder using::

    /* x265_alloc_analysis_data:
     * Allocate memory to hold analysis meta data, returns 1 on success else 0 */
    int x265_alloc_analysis_data(x265_picture*);

Note that this is very different from the typical semantics of
**x265_picture**, which can be reused many times. The analysis buffers must
be re-allocated for every input picture.

Analysis buffers passed to the encoder are owned by the encoder until
it passes the buffers back via an output **x265_picture**. The user is
responsible for releasing the buffers when they are finished with them
via::

    /* x265_free_analysis_data:
     * Use x265_free_analysis_data to release storage of members allocated by
     * x265_alloc_analysis_data */
    void x265_free_analysis_data(x265_picture*);


Encode Process
==============

The output of the encoder is a series of NAL packets, which are always
returned concatenated in consecutive memory. HEVC streams have SPS and
PPS and VPS headers which describe how the following packets are to be
decoded. If you specified :option:`--repeat-headers` then those headers
will be output with every keyframe. Otherwise you must explicitly query
those headers using::

    /* x265_encoder_headers:
     * return the SPS and PPS that will be used for the whole stream.
     * *pi_nal is the number of NAL units outputted in pp_nal.
     * returns negative on error, total byte size of payload data on success
     * the payloads of all output NALs are guaranteed to be sequential in memory. */
    int x265_encoder_headers(x265_encoder *, x265_nal **pp_nal, uint32_t *pi_nal);
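
Because the payloads are sequential in memory, all returned NAL units can
be written with a single call that starts at the first payload. The
sketch below shows the idea using a stand-in type so that it compiles
without the library: ``demo_nal`` is hypothetical and only mirrors the
members this example assumes **x265_nal** to have (``type``,
``sizeBytes`` and ``payload``), and ``total_nal_bytes`` and
``write_nals`` are illustrative helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for x265_nal (assumed member names); not the library type. */
typedef struct demo_nal {
    uint32_t type;        /* NAL unit type */
    uint32_t sizeBytes;   /* payload size in bytes */
    uint8_t *payload;     /* start of this NAL's payload */
} demo_nal;

/* Sum the payload sizes of all NAL units in the array. */
static size_t total_nal_bytes(const demo_nal *nal, uint32_t count)
{
    size_t total = 0;
    for (uint32_t i = 0; i < count; i++)
        total += nal[i].sizeBytes;
    return total;
}

/* Write every payload with one fwrite(), relying on the guarantee that
 * the payloads are contiguous. Returns the number of bytes written. */
static size_t write_nals(FILE *out, const demo_nal *nal, uint32_t count)
{
    assert(out != NULL);
    if (count == 0)
        return 0;
    return fwrite(nal[0].payload, 1, total_nal_bytes(nal, count), out);
}
```

With the real API the summation is often unnecessary, since the header
function above already returns the total payload size on success.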

Now we get to the main encode loop. Raw input pictures are passed to the
encoder in display order via::

    /* x265_encoder_encode:
     * encode one picture.
     * *pi_nal is the number of NAL units outputted in pp_nal.
     * returns negative on error, zero if no NAL units returned.
     * the payloads of all output NALs are guaranteed to be sequential in memory. */
    int x265_encoder_encode(x265_encoder *encoder, x265_nal **pp_nal, uint32_t *pi_nal, x265_picture *pic_in, x265_picture *pic_out);

These pictures are queued up until the lookahead is full, and then the
frame encoders in turn are filled, and then finally you begin receiving
output NALs (corresponding to a single output picture) with each input
picture you pass into the encoder.

Once the pipeline is completely full, **x265_encoder_encode()** will
block until the next output picture is complete.

.. note::

    Optionally, if the pointer of a second **x265_picture** structure is
    provided, the encoder will fill it with data pertaining to the
    output picture corresponding to the output NALs, including the
    reconstructed image, POC and decode timestamp. These pictures will be
    in encode (or decode) order.

When the last of the raw input pictures has been sent to the encoder,
**x265_encoder_encode()** must still be called repeatedly with a
*pic_in* argument of 0 (NULL), indicating a pipeline flush, until the
function returns a value less than or equal to 0 (indicating the output
bitstream is complete).
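
The feed-then-flush control flow can be tried out without the library
using a toy model. Everything in the sketch below is hypothetical:
``toy_encoder`` and ``toy_encode`` merely imitate an encoder with a
fixed pipeline delay that emits at most one output per call, and they
are not x265 code. Only the shape of the two loops in ``encode_all`` is
meant to carry over to **x265_encoder_encode()**.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an encoder with a fixed pipeline delay; not x265. */
typedef struct toy_encoder {
    int delay;    /* pictures buffered before the first output appears */
    int queued;   /* pictures currently held inside the pipeline */
} toy_encoder;

/* Imitates the calling convention described above: a NULL pic_in
 * requests a flush. Returns the number of outputs produced (0 or 1). */
static int toy_encode(toy_encoder *enc, const int *pic_in)
{
    assert(enc != NULL);
    if (pic_in != NULL) {
        enc->queued++;
        if (enc->queued <= enc->delay)
            return 0;              /* pipeline is still filling */
    }
    if (enc->queued == 0)
        return 0;                  /* nothing left to emit */
    enc->queued--;
    return 1;
}

/* Feed num_pictures inputs, then flush until no more output arrives.
 * Returns the total number of outputs received. */
static int encode_all(toy_encoder *enc, int num_pictures)
{
    int outputs = 0;
    for (int i = 0; i < num_pictures; i++)
        outputs += toy_encode(enc, &i);     /* inputs in display order */
    while (toy_encode(enc, NULL) > 0)       /* flush: pic_in is NULL */
        outputs++;
    return outputs;
}
```

In this model, skipping the flush loop would lose as many outputs as the
pipeline delay, which is why the flush calls are required even after the
last input has been submitted.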

At any time during this process, the application may query running
statistics from the encoder::

    /* x265_encoder_get_stats:
     * returns encoder statistics */
    void x265_encoder_get_stats(x265_encoder *encoder, x265_stats *, uint32_t statsSizeBytes);

Cleanup
=======

At the end of the encode, the application will want to trigger logging
of the final encode statistics, if :option:`--csv` has been specified::

    /* x265_encoder_log:
     * write a line to the configured CSV file. If a CSV filename was not
     * configured, or file open failed, or the log level indicated frame level
     * logging, this function will perform no write. */
    void x265_encoder_log(x265_encoder *encoder, int argc, char **argv);

Finally, the encoder must be closed in order to free all of its
resources. An encoder that has been flushed cannot be restarted and
reused. Once **x265_encoder_close()** has been called, the encoder
handle must be discarded::

    /* x265_encoder_close:
     * close an encoder handler */
    void x265_encoder_close(x265_encoder *);

When the application has completed all encodes, it should call
**x265_cleanup()** to free process-global resources like the thread pool,
particularly if a memory-leak detection tool is being used::

    /***
     * Release library static allocations
     */
    void x265_cleanup(void);