Commit | Line | Data |
---|---|---|
72b9787e JB |
1 | ********************************* |
2 | Application Programming Interface | |
3 | ********************************* | |
4 | ||
5 | Introduction | |
6 | ============ | |
7 | ||
8 | x265 is written primarily in C++ and x86 assembly language but the | |
9 | public facing programming interface is C for the widest possible | |
10 | portability. This C interface is wholly defined within :file:`x265.h` | |
11 | in the source/ folder of our source tree. All of the functions and | |
12 | variables and enumerations meant to be used by the end-user are present | |
13 | in this header. | |
14 | ||
15 | Where possible, x265 has tried to keep its public API as close as | |
16 | possible to x264's public API. So those familiar with using x264 through | |
17 | its C interface will find x265 quite familiar. | |
18 | ||
19 | This file is meant to be read in-order; the narrative follows linearly | |
20 | through the various sections. | |
21 | ||
22 | Build Considerations | |
23 | ==================== | |
24 | ||
25 | The choice of Main or Main10 profile encodes is made at compile time; | |
26 | the internal pixel depth influences a great deal of variable sizes and | |
27 | thus 8 and 10bit pixels are handled as different build options | |
28 | (primarily to maintain the performance of the 8bit builds). libx265 | |
29 | exports a variable **x265_max_bit_depth** which indicates how the | |
30 | library was compiled (it will contain a value of 8 or 10). Further, | |
31 | **x265_version_str** is a pointer to a string indicating the version of | |
32 | x265 which was compiled, and **x265_build_info_str** is a pointer to a | |
33 | string identifying the compiler and build options. | |
34 | ||
b53f7c52 JB |
35 | .. Note:: |
36 | ||
37 | **x265_version_str** is only updated when **cmake** runs. If you are | |
38 | making binaries for others to use, it is recommended to run | |
39 | **cmake** prior to **make** in your build scripts. | |
40 | ||
72b9787e JB |
41 | x265 will accept input pixels of any depth between 8 and 16 bits |
42 | regardless of the depth of its internal pixels (8 or 10). It will shift | |
43 | and mask input pixels as required to reach the internal depth. If | |
44 | downshifting is being performed using our CLI application, the | |
45 | :option:`--dither` option may be enabled to reduce banding. This feature | |
46 | is not available through the C interface. | |
47 | ||
48 | Encoder | |
49 | ======= | |
50 | ||
51 | The primary object in x265 is the encoder object, and this is | |
52 | represented in the public API as an opaque typedef **x265_encoder**. | |
53 | Pointers of this type are passed to most encoder functions. | |
54 | ||
55 | A single encoder generates a single output bitstream from a sequence of | |
56 | raw input pictures. Thus if you need multiple output bitstreams you | |
57 | must allocate multiple encoders. You may pass the same input pictures | |
58 | to multiple encoders; the encode function does not modify the input | |
59 | picture structures (the pictures are copied into the encoder as the | |
60 | first step of encode). | |
61 | ||
62 | Encoder allocation is a reentrant function, so multiple encoders may be | |
63 | safely allocated in a single process. The encoder access functions are | |
64 | not reentrant for a single encoder, so the recommended use case is to | |
65 | allocate one client thread per encoder instance (one thread for all | |
66 | encoder instances is possible, but some encoder access functions are | |
67 | blocking and thus this would be less efficient). | |
68 | ||
69 | .. Note:: | |
70 | ||
71 | There is one caveat to having multiple encoders within a single | |
72 | process. All of the encoders must use the same maximum CTU size | |
73 | because many global variables are configured based on this size. | |
74 | Encoder allocation will fail if a mis-matched CTU size is attempted. | |
75 | ||
76 | An encoder is allocated by calling **x265_encoder_open()**:: | |
77 | ||
78 | /* x265_encoder_open: | |
79 | * create a new encoder handler, all parameters from x265_param are copied */ | |
80 | x265_encoder* x265_encoder_open(x265_param *); | |
81 | ||
82 | The returned pointer is then passed to all of the functions pertaining | |
83 | to this encode. A large amount of memory is allocated during this | |
84 | function call, but the encoder will continue to allocate memory as the | |
85 | first pictures are passed to the encoder, until its pool of picture | |
86 | structures is large enough to handle all of the pictures it must keep | |
87 | internally. The pool size is determined by the lookahead depth, the | |
88 | number of frame threads, and the maximum number of references. | |
89 | ||
90 | As indicated in the comment, **x265_param** is copied internally so the user | |
91 | may release their copy after allocating the encoder. Changes made to | |
92 | their copy of the param structure have no effect on the encoder after it | |
93 | has been allocated. | |
94 | ||
95 | Param | |
96 | ===== | |
97 | ||
98 | The **x265_param** structure describes everything the encoder needs to | |
99 | know about the input pictures and the output bitstream and most | |
100 | everything in between. | |
101 | ||
102 | The recommended way to handle these param structures is to allocate them | |
103 | from libx265 via:: | |
104 | ||
105 | /* x265_param_alloc: | |
106 | * Allocates an x265_param instance. The returned param structure is not | |
107 | * special in any way, but using this method together with x265_param_free() | |
108 | * and x265_param_parse() to set values by name allows the application to treat | |
109 | * x265_param as an opaque data struct for version safety */ | |
110 | x265_param *x265_param_alloc(); | |
111 | ||
112 | In this way, your application does not need to know the exact size of | |
113 | the param structure (the build of x265 could potentially be a bit newer | |
114 | than the copy of :file:`x265.h` that your application compiled against). | |
115 | ||
116 | Next you perform the initial *rough cut* configuration of the encoder by | |
117 | choosing a performance preset and optional tune factor. | |
118 | **x265_preset_names** and **x265_tune_names** respectively hold the | |
119 | string names of the presets and tune factors (see :ref:`presets | |
120 | <preset-tune-ref>` for more detail on presets and tune factors):: | |
121 | ||
122 | /* returns 0 on success, negative on failure (e.g. invalid preset/tune name). */ | |
123 | int x265_param_default_preset(x265_param *, const char *preset, const char *tune); | |
124 | ||
125 | Now you may optionally specify a profile. **x265_profile_names** | |
126 | contains the string names this function accepts:: | |
127 | ||
128 | /* (can be NULL, in which case the function will do nothing) | |
129 | * returns 0 on success, negative on failure (e.g. invalid profile name). */ | |
130 | int x265_param_apply_profile(x265_param *, const char *profile); | |
131 | ||
132 | Finally you configure any remaining options by name using repeated calls to:: | |
133 | ||
134 | /* x265_param_parse: | |
135 | * set one parameter by name. | |
136 | * returns 0 on success, or returns one of the following errors. | |
137 | * note: BAD_VALUE occurs only if it can't even parse the value, | |
138 | * numerical range is not checked until x265_encoder_open(). | |
139 | * value=NULL means "true" for boolean options, but is a BAD_VALUE for non-booleans. */ | |
140 | #define X265_PARAM_BAD_NAME (-1) | |
141 | #define X265_PARAM_BAD_VALUE (-2) | |
142 | int x265_param_parse(x265_param *p, const char *name, const char *value); | |
143 | ||
144 | See :ref:`string options <string-options-ref>` for the list of options (and their | |
145 | descriptions) which can be set by **x265_param_parse()**. | |
146 | ||
147 | After the encoder has been created, you may release the param structure:: | |
148 | ||
149 | /* x265_param_free: | |
150 | * Use x265_param_free() to release storage for an x265_param instance | |
151 | * allocated by x265_param_alloc() */ | |
152 | void x265_param_free(x265_param *); | |
153 | ||
154 | .. Note:: | |
155 | ||
156 | Using these methods to allocate and release the param structures | |
157 | helps future-proof your code in many ways, but the x265 API is | |
158 | versioned in such a way that we prevent linkage against a build of | |
159 | x265 that does not match the version of the header you are compiling | |
160 | against. This is a function of the X265_BUILD macro. | |
161 | ||
162 | **x265_encoder_parameters()** may be used to get a copy of the param | |
163 | structure from the encoder after it has been opened, in order to see the | |
164 | changes made to the parameters for auto-detection and other reasons:: | |
165 | ||
166 | /* x265_encoder_parameters: | |
167 | * copies the current internal set of parameters to the pointer provided | |
168 | * by the caller. useful when the calling application needs to know | |
169 | * how x265_encoder_open has changed the parameters. | |
170 | * note that the data accessible through pointers in the returned param struct | |
171 | * (e.g. filenames) should not be modified by the calling application. */ | |
172 | void x265_encoder_parameters(x265_encoder *, x265_param *); | |
173 | ||
174 | Pictures | |
175 | ======== | |
176 | ||
177 | Raw pictures are passed to the encoder via the **x265_picture** structure. | |
178 | Just like the param structure, we recommend you allocate this structure | |
179 | from the library to avoid potential size mismatches:: | |
180 | ||
181 | /* x265_picture_alloc: | |
182 | * Allocates an x265_picture instance. The returned picture structure is not | |
183 | * special in any way, but using this method together with x265_picture_free() | |
184 | * and x265_picture_init() allows some version safety. New picture fields will | |
185 | * always be added to the end of x265_picture */ | |
186 | x265_picture *x265_picture_alloc(); | |
187 | ||
188 | Regardless of whether you allocate your picture structure this way or | |
189 | whether you simply declare it on the stack, your next step is to | |
190 | initialize the structure via:: | |
191 | ||
192 | /*** | |
193 | * Initialize an x265_picture structure to default values. It sets the pixel | |
194 | * depth and color space to the encoder's internal values and sets the slice | |
195 | * type to auto - so the lookahead will determine slice type. | |
196 | */ | |
197 | void x265_picture_init(x265_param *param, x265_picture *pic); | |
198 | ||
199 | x265 does not perform any color space conversions, so the raw picture's | |
200 | color space (chroma sampling) must match the color space specified in | |
201 | the param structure used to allocate the encoder. **x265_picture_init** | |
202 | initializes this field to the internal color space and it is best to | |
203 | leave it unmodified. | |
204 | ||
205 | The picture bit depth is initialized to be the encoder's internal bit | |
206 | depth but this value should be changed to the actual depth of the pixels | |
207 | being passed into the encoder. If the picture bit depth is more than 8, | |
208 | the encoder assumes two bytes are used to represent each sample | |
209 | (little-endian shorts). | |
210 | ||
211 | The user is responsible for setting the plane pointers and plane strides | |
212 | (in units of bytes, not pixels). The presentation time stamp (**pts**) | |
213 | is optional, depending on whether you need accurate decode time stamps | |
214 | (**dts**) on output. | |
215 | ||
216 | If you wish to override the lookahead or rate control for a given | |
217 | picture you may specify a slicetype other than X265_TYPE_AUTO, or a | |
218 | forceQP value other than 0. | |
219 | ||
220 | x265 does not modify the picture structure provided as input, so you may | |
221 | reuse a single **x265_picture** for all pictures passed to a single | |
222 | encoder, or even all pictures passed to multiple encoders. | |
223 | ||
224 | Structures allocated from the library should eventually be released:: | |
225 | ||
226 | /* x265_picture_free: | |
227 | * Use x265_picture_free() to release storage for an x265_picture instance | |
228 | * allocated by x265_picture_alloc() */ | |
229 | void x265_picture_free(x265_picture *); | |
230 | ||
231 | ||
232 | Analysis Buffers | |
233 | ================ | |
234 | ||
235 | Analysis information can be saved and reused between encodes of the | |
236 | same video sequence (generally for multiple bitrate encodes). The best | |
237 | results are attained by saving the analysis information of the highest | |
238 | bitrate encode and reusing it in lower bitrate encodes. | |
239 | ||
240 | When saving or loading analysis data, buffers must be allocated for | |
241 | every picture passed into the encoder using:: | |
242 | ||
243 | /* x265_alloc_analysis_data: | |
244 | * Allocate memory to hold analysis meta data, returns 1 on success else 0 */ | |
245 | int x265_alloc_analysis_data(x265_picture*); | |
246 | ||
247 | Note that this is very different from the typical semantics of | |
248 | **x265_picture**, which can be reused many times. The analysis buffers must | |
249 | be re-allocated for every input picture. | |
250 | ||
251 | Analysis buffers passed to the encoder are owned by the encoder until | |
252 | it passes the buffers back via an output **x265_picture**. The user is | |
253 | responsible for releasing the buffers when they are finished with them | |
254 | via:: | |
255 | ||
256 | /* x265_free_analysis_data: | |
257 | * Use x265_free_analysis_data to release storage of members allocated by | |
258 | * x265_alloc_analysis_data */ | |
259 | void x265_free_analysis_data(x265_picture*); | |
260 | ||
261 | ||
262 | Encode Process | |
263 | ============== | |
264 | ||
265 | The output of the encoder is a series of NAL packets, which are always | |
266 | returned concatenated in consecutive memory. HEVC streams have SPS and | |
267 | PPS and VPS headers which describe how the following packets are to be | |
268 | decoded. If you specified :option:`--repeat-headers` then those headers | |
269 | will be output with every keyframe. Otherwise you must explicitly query | |
270 | those headers using:: | |
271 | ||
272 | /* x265_encoder_headers: | |
273 | * return the SPS and PPS that will be used for the whole stream. | |
274 | * *pi_nal is the number of NAL units outputted in pp_nal. | |
275 | * returns negative on error, total byte size of payload data on success | |
276 | * the payloads of all output NALs are guaranteed to be sequential in memory. */ | |
277 | int x265_encoder_headers(x265_encoder *, x265_nal **pp_nal, uint32_t *pi_nal); | |
278 | ||
279 | Now we get to the main encode loop. Raw input pictures are passed to the | |
280 | encoder in display order via:: | |
281 | ||
282 | /* x265_encoder_encode: | |
283 | * encode one picture. | |
284 | * *pi_nal is the number of NAL units outputted in pp_nal. | |
285 | * returns negative on error, zero if no NAL units returned. | |
286 | * the payloads of all output NALs are guaranteed to be sequential in memory. */ | |
287 | int x265_encoder_encode(x265_encoder *encoder, x265_nal **pp_nal, uint32_t *pi_nal, x265_picture *pic_in, x265_picture *pic_out); | |
288 | ||
289 | These pictures are queued up until the lookahead is full, and then the | |
290 | frame encoders in turn are filled, and then finally you begin receiving | |
291 | output NALs (corresponding to a single output picture) with each input | |
292 | picture you pass into the encoder. | |
293 | ||
294 | Once the pipeline is completely full, **x265_encoder_encode()** will | |
295 | block until the next output picture is complete. | |
296 | ||
297 | .. note:: | |
298 | ||
299 | Optionally, if the pointer of a second **x265_picture** structure is | |
300 | provided, the encoder will fill it with data pertaining to the | |
301 | output picture corresponding to the output NALs, including the | |
302 | reconstructed image, POC and decode timestamp. These pictures will be | |
303 | in encode (or decode) order. | |
304 | ||
305 | When the last of the raw input pictures has been sent to the encoder, | |
306 | **x265_encoder_encode()** must still be called repeatedly with a | |
307 | *pic_in* argument of 0, indicating a pipeline flush, until the function | |
308 | returns a value less than or equal to 0 (indicating the output bitstream | |
309 | is complete). | |
310 | ||
311 | At any time during this process, the application may query running | |
312 | statistics from the encoder:: | |
313 | ||
314 | /* x265_encoder_get_stats: | |
315 | * returns encoder statistics */ | |
316 | void x265_encoder_get_stats(x265_encoder *encoder, x265_stats *, uint32_t statsSizeBytes); | |
317 | ||
318 | Cleanup | |
319 | ======= | |
320 | ||
321 | At the end of the encode, the application will want to trigger logging | |
322 | of the final encode statistics, if :option:`--csv` has been specified:: | |
323 | ||
324 | /* x265_encoder_log: | |
325 | * write a line to the configured CSV file. If a CSV filename was not | |
326 | * configured, or file open failed, or the log level indicated frame level | |
327 | * logging, this function will perform no write. */ | |
328 | void x265_encoder_log(x265_encoder *encoder, int argc, char **argv); | |
329 | ||
330 | Finally, the encoder must be closed in order to free all of its | |
331 | resources. An encoder that has been flushed cannot be restarted and | |
332 | reused. Once **x265_encoder_close()** has been called, the encoder | |
333 | handle must be discarded:: | |
334 | ||
335 | /* x265_encoder_close: | |
336 | * close an encoder handler */ | |
337 | void x265_encoder_close(x265_encoder *); | |
338 | ||
339 | When the application has completed all encodes, it should call | |
340 | **x265_cleanup()** to free process-global resources like the thread pool, | |
341 | particularly if a memory-leak detection tool is being used:: | |
342 | ||
343 | /*** | |
344 | * Release library static allocations | |
345 | */ | |
346 | void x265_cleanup(void); |