Commit | Line | Data |
---|---|---|
72b9787e JB |
1 | ********************************* |
2 | Application Programming Interface | |
3 | ********************************* | |
4 | ||
5 | Introduction | |
6 | ============ | |
7 | ||
8 | x265 is written primarily in C++ and x86 assembly language but the | |
9 | public facing programming interface is C for the widest possible | |
10 | portability. This C interface is wholly defined within :file:`x265.h` | |
11 | in the source/ folder of our source tree. All of the functions, | |
12 | variables, and enumerations meant to be used by the end user are present | |
13 | in this header. | |
14 | ||
15 | x265 has tried to keep its public API as close as possible to x264's | |
16 | public API, so those familiar with using x264 through its C interface | |
17 | will find x265 quite familiar. | |
18 | ||
19 | This file is meant to be read in order; the narrative follows linearly | |
20 | through the various sections. | |
21 | ||
22 | Build Considerations | |
23 | ==================== | |
24 | ||
25 | The choice of Main or Main10 profile encodes is made at compile time; | |
26 | the internal pixel depth influences the size of a great many variables, | |
27 | so 8-bit and 10-bit pixels are handled as different build options | |
28 | (primarily to maintain the performance of the 8-bit builds). libx265 | |
29 | exports a variable **x265_max_bit_depth** which indicates how the | |
30 | library was compiled (it will contain a value of 8 or 10). Further, | |
31 | **x265_version_str** is a pointer to a string indicating the version of | |
32 | x265 which was compiled, and **x265_build_info_str** is a pointer to a | |
33 | string identifying the compiler and build options. | |
34 | ||
35 | x265 will accept input pixels of any depth between 8 and 16 bits | |
36 | regardless of the depth of its internal pixels (8 or 10). It will shift | |
37 | and mask input pixels as required to reach the internal depth. If | |
38 | downshifting is being performed using our CLI application, the | |
39 | :option:`--dither` option may be enabled to reduce banding. This feature | |
40 | is not available through the C interface. | |
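As a rough illustration of the shift-and-mask step described above, the sketch below converts a single sample from its input depth to an internal depth of 8 or 10 bits. The function name `convert_sample_depth` is hypothetical and this is not x265's actual implementation; it only assumes that downshifting discards low-order bits and that upshifting pads with zeros.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the x265 API: convert one sample from
 * in_depth bits to out_depth bits by masking and shifting. */
static uint16_t convert_sample_depth(uint16_t sample, int in_depth, int out_depth)
{
    /* Mask off any bits above the declared input depth. */
    uint32_t mask = (1u << in_depth) - 1u;
    uint32_t v = sample & mask;

    if (in_depth > out_depth)
        return (uint16_t)(v >> (in_depth - out_depth)); /* downshift: drop low bits */
    return (uint16_t)(v << (out_depth - in_depth));     /* upshift: pad with zeros */
}
```

A plain downshift like this is where banding can appear, which is why the CLI offers dithering for that case.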
41 | ||
42 | Encoder | |
43 | ======= | |
44 | ||
45 | The primary object in x265 is the encoder object, and this is | |
46 | represented in the public API as an opaque typedef **x265_encoder**. | |
47 | Pointers of this type are passed to most encoder functions. | |
48 | ||
49 | A single encoder generates a single output bitstream from a sequence of | |
50 | raw input pictures. Thus if you need multiple output bitstreams you | |
51 | must allocate multiple encoders. You may pass the same input pictures | |
52 | to multiple encoders because the encode function does not modify the | |
53 | input picture structures (the pictures are copied into the encoder as | |
54 | the first step of encode). | |
55 | ||
56 | Encoder allocation is a reentrant function, so multiple encoders may be | |
57 | safely allocated in a single process. The encoder access functions are | |
58 | not reentrant for a single encoder, so the recommended use case is to | |
59 | allocate one client thread per encoder instance (one thread for all | |
60 | encoder instances is possible, but some encoder access functions are | |
61 | blocking and thus this would be less efficient). | |
62 | ||
63 | .. Note:: | |
64 | ||
65 | There is one caveat to having multiple encoders within a single | |
66 | process. All of the encoders must use the same maximum CTU size | |
67 | because many global variables are configured based on this size. | |
68 | Encoder allocation will fail if a mismatched CTU size is attempted. | |
69 | ||
70 | An encoder is allocated by calling **x265_encoder_open()**:: | |
71 | ||
72 | /* x265_encoder_open: | |
73 | * create a new encoder handler, all parameters from x265_param are copied */ | |
74 | x265_encoder* x265_encoder_open(x265_param *); | |
75 | ||
76 | The returned pointer is then passed to all of the functions pertaining | |
77 | to this encode. A large amount of memory is allocated during this | |
78 | function call, but the encoder will continue to allocate memory as the | |
79 | first pictures are passed to it, until its pool of picture | |
80 | structures is large enough to handle all of the pictures it must keep | |
81 | internally. The pool size is determined by the lookahead depth, the | |
82 | number of frame threads, and the maximum number of references. | |
83 | ||
84 | As indicated in the comment, **x265_param** is copied internally so the user | |
85 | may release their copy after allocating the encoder. Changes made to | |
86 | their copy of the param structure have no effect on the encoder after it | |
87 | has been allocated. | |
88 | ||
89 | Param | |
90 | ===== | |
91 | ||
92 | The **x265_param** structure describes everything the encoder needs to | |
93 | know about the input pictures and the output bitstream and most | |
94 | everything in between. | |
95 | ||
96 | The recommended way to handle these param structures is to allocate them | |
97 | from libx265 via:: | |
98 | ||
99 | /* x265_param_alloc: | |
100 | * Allocates an x265_param instance. The returned param structure is not | |
101 | * special in any way, but using this method together with x265_param_free() | |
102 | * and x265_param_parse() to set values by name allows the application to treat | |
103 | * x265_param as an opaque data struct for version safety */ | |
104 | x265_param *x265_param_alloc(); | |
105 | ||
106 | In this way, your application does not need to know the exact size of | |
107 | the param structure (the build of x265 could potentially be a bit newer | |
108 | than the copy of :file:`x265.h` that your application compiled against). | |
109 | ||
110 | Next you perform the initial *rough cut* configuration of the encoder by | |
111 | choosing a performance preset and optional tune factor. | |
112 | **x265_preset_names** and **x265_tune_names** respectively hold the | |
113 | string names of the presets and tune factors (see :ref:`presets | |
114 | <preset-tune-ref>` for more detail on presets and tune factors):: | |
115 | ||
116 | /* returns 0 on success, negative on failure (e.g. invalid preset/tune name). */ | |
117 | int x265_param_default_preset(x265_param *, const char *preset, const char *tune); | |
118 | ||
119 | Now you may optionally specify a profile. **x265_profile_names** | |
120 | contains the string names this function accepts:: | |
121 | ||
122 | /* (can be NULL, in which case the function will do nothing) | |
123 | * returns 0 on success, negative on failure (e.g. invalid profile name). */ | |
124 | int x265_param_apply_profile(x265_param *, const char *profile); | |
125 | ||
126 | Finally you configure any remaining options by name using repeated calls to:: | |
127 | ||
128 | /* x265_param_parse: | |
129 | * set one parameter by name. | |
130 | * returns 0 on success, or returns one of the following errors. | |
131 | * note: BAD_VALUE occurs only if it can't even parse the value, | |
132 | * numerical range is not checked until x265_encoder_open(). | |
133 | * value=NULL means "true" for boolean options, but is a BAD_VALUE for non-booleans. */ | |
134 | #define X265_PARAM_BAD_NAME (-1) | |
135 | #define X265_PARAM_BAD_VALUE (-2) | |
136 | int x265_param_parse(x265_param *p, const char *name, const char *value); | |
137 | ||
138 | See :ref:`string options <string-options-ref>` for the list of options (and their | |
139 | descriptions) which can be set by **x265_param_parse()**. | |
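Since **x265_param_parse()** reports problems only through its return code, an application will usually want to check every call. The sketch below shows one way to do that. The helpers `describe_parse_result` and `check_parse_result` are hypothetical names, and the two error macros are redefined here only so the fragment compiles on its own; a real application would take them from :file:`x265.h`.

```c
#include <stdio.h>

/* Error codes as listed in the header excerpt above; a real application
 * would get them from x265.h instead of redefining them. */
#ifndef X265_PARAM_BAD_NAME
#define X265_PARAM_BAD_NAME  (-1)
#define X265_PARAM_BAD_VALUE (-2)
#endif

/* Hypothetical helper: turn an x265_param_parse() return code into a message. */
static const char *describe_parse_result(int rc)
{
    switch (rc)
    {
    case 0:                    return "ok";
    case X265_PARAM_BAD_NAME:  return "unknown option name";
    case X265_PARAM_BAD_VALUE: return "value could not be parsed";
    default:                   return "unexpected return code";
    }
}

/* Hypothetical helper: report a rejected option; returns 1 on success, 0 on failure. */
static int check_parse_result(const char *name, const char *value, int rc)
{
    if (rc == 0)
        return 1;
    fprintf(stderr, "option %s=%s rejected: %s\n",
            name, value ? value : "(null)", describe_parse_result(rc));
    return 0;
}
```

With the real library this would be used as `check_parse_result(name, value, x265_param_parse(param, name, value))`. A return of 0 from the parser does not mean the value is in range, since numerical ranges are not checked until **x265_encoder_open()**.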
140 | ||
141 | After the encoder has been created, you may release the param structure:: | |
142 | ||
143 | /* x265_param_free: | |
144 | * Use x265_param_free() to release storage for an x265_param instance | |
145 | * allocated by x265_param_alloc() */ | |
146 | void x265_param_free(x265_param *); | |
147 | ||
148 | .. Note:: | |
149 | ||
150 | Using these methods to allocate and release the param structures | |
151 | helps future-proof your code in many ways, but the x265 API is | |
152 | versioned in such a way that we prevent linkage against a build of | |
153 | x265 that does not match the version of the header you are compiling | |
154 | against. This is a function of the X265_BUILD macro. | |
155 | ||
156 | **x265_encoder_parameters()** may be used to get a copy of the param | |
157 | structure from the encoder after it has been opened, in order to see the | |
158 | changes made to the parameters for auto-detection and other reasons:: | |
159 | ||
160 | /* x265_encoder_parameters: | |
161 | * copies the current internal set of parameters to the pointer provided | |
162 | * by the caller. useful when the calling application needs to know | |
163 | * how x265_encoder_open has changed the parameters. | |
164 | * note that the data accessible through pointers in the returned param struct | |
165 | * (e.g. filenames) should not be modified by the calling application. */ | |
166 | void x265_encoder_parameters(x265_encoder *, x265_param *); | |
167 | ||
168 | Pictures | |
169 | ======== | |
170 | ||
171 | Raw pictures are passed to the encoder via the **x265_picture** structure. | |
172 | Just like the param structure, we recommend you allocate this structure | |
173 | from the library to avoid potential size mismatches:: | |
174 | ||
175 | /* x265_picture_alloc: | |
176 | * Allocates an x265_picture instance. The returned picture structure is not | |
177 | * special in any way, but using this method together with x265_picture_free() | |
178 | * and x265_picture_init() allows some version safety. New picture fields will | |
179 | * always be added to the end of x265_picture */ | |
180 | x265_picture *x265_picture_alloc(); | |
181 | ||
182 | Regardless of whether you allocate your picture structure this way or | |
183 | whether you simply declare it on the stack, your next step is to | |
184 | initialize the structure via:: | |
185 | ||
186 | /*** | |
187 | * Initialize an x265_picture structure to default values. It sets the pixel | |
188 | * depth and color space to the encoder's internal values and sets the slice | |
189 | * type to auto - so the lookahead will determine slice type. | |
190 | */ | |
191 | void x265_picture_init(x265_param *param, x265_picture *pic); | |
192 | ||
193 | x265 does not perform any color space conversions, so the raw picture's | |
194 | color space (chroma sampling) must match the color space specified in | |
195 | the param structure used to allocate the encoder. **x265_picture_init** | |
196 | initializes this field to the internal color space and it is best to | |
197 | leave it unmodified. | |
198 | ||
199 | The picture bit depth is initialized to be the encoder's internal bit | |
200 | depth but this value should be changed to the actual depth of the pixels | |
201 | being passed into the encoder. If the picture bit depth is more than 8, | |
202 | the encoder assumes two bytes are used to represent each sample | |
203 | (little-endian shorts). | |
204 | ||
205 | The user is responsible for setting the plane pointers and plane strides | |
206 | (in units of bytes, not pixels). The presentation time stamp (**pts**) | |
207 | is optional, depending on whether you need accurate decode time stamps | |
208 | (**dts**) on output. | |
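Because strides are given in bytes and samples deeper than 8 bits occupy two bytes, the stride of a tightly packed plane depends on both its width and the picture bit depth. The helpers below are a minimal sketch of that arithmetic. Their names are hypothetical and not part of the x265 API, and the 4:2:0 case is included only as an example of a subsampled chroma plane.

```c
#include <stddef.h>

/* Hypothetical helpers, not part of the x265 API. */

/* Samples deeper than 8 bits occupy two bytes each, as described above. */
static size_t bytes_per_sample(int bit_depth)
{
    return bit_depth > 8 ? 2 : 1;
}

/* Minimum stride in bytes for a tightly packed plane of the given width. */
static size_t packed_stride_bytes(size_t width_in_samples, int bit_depth)
{
    return width_in_samples * bytes_per_sample(bit_depth);
}

/* Chroma plane width for 4:2:0 input (chroma is subsampled by two horizontally). */
static size_t chroma_width_420(size_t luma_width)
{
    return (luma_width + 1) / 2;
}
```

For a 1920-sample-wide 10-bit 4:2:0 picture this gives a luma stride of 3840 bytes and a chroma stride of 1920 bytes. If the application pads its rows, the stride is the padded row length in bytes rather than this minimum.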
209 | ||
210 | If you wish to override the lookahead or rate control for a given | |
211 | picture you may specify a slicetype other than X265_TYPE_AUTO, or a | |
212 | forceQP value other than 0. | |
213 | ||
214 | x265 does not modify the picture structure provided as input, so you may | |
215 | reuse a single **x265_picture** for all pictures passed to a single | |
216 | encoder, or even all pictures passed to multiple encoders. | |
217 | ||
218 | Structures allocated from the library should eventually be released:: | |
219 | ||
220 | /* x265_picture_free: | |
221 | * Use x265_picture_free() to release storage for an x265_picture instance | |
222 | * allocated by x265_picture_alloc() */ | |
223 | void x265_picture_free(x265_picture *); | |
224 | ||
225 | ||
226 | Analysis Buffers | |
227 | ================ | |
228 | ||
229 | Analysis information can be saved and reused between encodes of the | |
230 | same video sequence (generally for multiple bitrate encodes). The best | |
231 | results are attained by saving the analysis information of the highest | |
232 | bitrate encode and reusing it in lower bitrate encodes. | |
233 | ||
234 | When saving or loading analysis data, buffers must be allocated for | |
235 | every picture passed into the encoder using:: | |
236 | ||
237 | /* x265_alloc_analysis_data: | |
238 | * Allocate memory to hold analysis meta data, returns 1 on success else 0 */ | |
239 | int x265_alloc_analysis_data(x265_picture*); | |
240 | ||
241 | Note that this is very different from the typical semantics of | |
242 | **x265_picture**, which can be reused many times. The analysis buffers must | |
243 | be re-allocated for every input picture. | |
244 | ||
245 | Analysis buffers passed to the encoder are owned by the encoder until | |
246 | it passes the buffers back via an output **x265_picture**. The user is | |
247 | responsible for releasing the buffers when they are finished with them | |
248 | via:: | |
249 | ||
250 | /* x265_free_analysis_data: | |
251 | * Use x265_free_analysis_data to release storage of members allocated by | |
252 | * x265_alloc_analysis_data */ | |
253 | void x265_free_analysis_data(x265_picture*); | |
254 | ||
255 | ||
256 | Encode Process | |
257 | ============== | |
258 | ||
259 | The output of the encoder is a series of NAL packets, which are always | |
260 | returned concatenated in consecutive memory. HEVC streams have SPS, | |
261 | PPS, and VPS headers which describe how the following packets are to be | |
262 | decoded. If you specified :option:`--repeat-headers` then those headers | |
263 | will be output with every keyframe. Otherwise you must explicitly query | |
264 | those headers using:: | |
265 | ||
266 | /* x265_encoder_headers: | |
267 | * return the SPS and PPS that will be used for the whole stream. | |
268 | * *pi_nal is the number of NAL units outputted in pp_nal. | |
269 | * returns negative on error, total byte size of payload data on success | |
270 | * the payloads of all output NALs are guaranteed to be sequential in memory. */ | |
271 | int x265_encoder_headers(x265_encoder *, x265_nal **pp_nal, uint32_t *pi_nal); | |
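The guarantee that NAL payloads are sequential in memory means a batch of NALs can be treated as one buffer. The sketch below illustrates this with a stand-in structure, `sketch_nal`, which is hypothetical: it assumes the real **x265_nal** exposes a payload pointer and a payload size, so check :file:`x265.h` for the actual field names before relying on them.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for x265_nal, reduced to the two fields this sketch needs.
 * The field names are assumptions; check x265.h for the real definition. */
typedef struct
{
    uint32_t sizeBytes; /* size of this NAL's payload in bytes */
    uint8_t *payload;   /* pointer to the payload */
} sketch_nal;

/* Total payload size of a NAL array. */
static size_t total_nal_bytes(const sketch_nal *nal, uint32_t count)
{
    size_t total = 0;
    for (uint32_t i = 0; i < count; i++)
        total += nal[i].sizeBytes;
    return total;
}

/* Returns 1 if each payload starts where the previous one ended. */
static int nals_are_contiguous(const sketch_nal *nal, uint32_t count)
{
    for (uint32_t i = 1; i < count; i++)
        if (nal[i].payload != nal[i - 1].payload + nal[i - 1].sizeBytes)
            return 0;
    return 1;
}
```

When the payloads are contiguous, the whole batch can be written with a single `fwrite()` starting at the first payload and covering the total size, instead of one write per NAL.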
272 | ||
273 | Now we get to the main encode loop. Raw input pictures are passed to the | |
274 | encoder in display order via:: | |
275 | ||
276 | /* x265_encoder_encode: | |
277 | * encode one picture. | |
278 | * *pi_nal is the number of NAL units outputted in pp_nal. | |
279 | * returns negative on error, zero if no NAL units returned. | |
280 | * the payloads of all output NALs are guaranteed to be sequential in memory. */ | |
281 | int x265_encoder_encode(x265_encoder *encoder, x265_nal **pp_nal, uint32_t *pi_nal, x265_picture *pic_in, x265_picture *pic_out); | |
282 | ||
283 | These pictures are queued up until the lookahead is full, and then the | |
284 | frame encoders in turn are filled, and then finally you begin receiving | |
285 | output NALs (corresponding to a single output picture) with each input | |
286 | picture you pass into the encoder. | |
287 | ||
288 | Once the pipeline is completely full, **x265_encoder_encode()** will | |
289 | block until the next output picture is complete. | |
290 | ||
291 | .. note:: | |
292 | ||
293 | Optionally, if the pointer of a second **x265_picture** structure is | |
294 | provided, the encoder will fill it with data pertaining to the | |
295 | output picture corresponding to the output NALs, including the | |
296 | reconstructed image, POC and decode timestamp. These pictures will be | |
297 | in encode (or decode) order. | |
298 | ||
299 | When the last of the raw input pictures has been sent to the encoder, | |
300 | **x265_encoder_encode()** must still be called repeatedly with a | |
301 | *pic_in* argument of 0, indicating a pipeline flush, until the function | |
302 | returns a value less than or equal to 0 (indicating the output bitstream | |
303 | is complete). | |
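The feed-then-flush calling pattern can be sketched without linking against libx265 by using a toy model of the pipeline. Everything below is a hypothetical stand-in: `toy_encoder` and `toy_encode` only imitate the latency behaviour described above (no output while the pipeline fills, then one output per input, then the remaining outputs during the flush) and perform no encoding.

```c
#include <stddef.h>

/* Toy stand-in for an encoder, NOT libx265: it holds `latency` pictures
 * before producing output, mimicking lookahead and frame-thread delay. */
typedef struct
{
    int latency; /* pictures buffered before the first output */
    int queued;  /* pictures currently held inside the "encoder" */
} toy_encoder;

/* Mimics the calling convention described above: pass a picture to encode,
 * or NULL to flush. Returns the number of output pictures produced (0 or 1). */
static int toy_encode(toy_encoder *enc, const int *pic_in)
{
    if (pic_in)
    {
        enc->queued++;
        if (enc->queued <= enc->latency)
            return 0; /* pipeline still filling */
    }
    else if (enc->queued == 0)
        return 0;     /* flush complete */
    enc->queued--;
    return 1;
}

/* Feed `count` pictures, then flush; returns the total number of outputs. */
static int encode_all(toy_encoder *enc, const int *pics, int count)
{
    int outputs = 0;

    for (int i = 0; i < count; i++)
        outputs += toy_encode(enc, &pics[i]);

    /* Flush: keep calling with NULL until nothing more comes out. */
    while (toy_encode(enc, NULL) > 0)
        outputs++;

    return outputs;
}
```

With the real API the loop body would call **x265_encoder_encode()** and write out the returned NALs. A negative return value would also have to be handled as an error, which this toy model never produces.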
304 | ||
305 | At any time during this process, the application may query running | |
306 | statistics from the encoder:: | |
307 | ||
308 | /* x265_encoder_get_stats: | |
309 | * returns encoder statistics */ | |
310 | void x265_encoder_get_stats(x265_encoder *encoder, x265_stats *, uint32_t statsSizeBytes); | |
311 | ||
312 | Cleanup | |
313 | ======= | |
314 | ||
315 | At the end of the encode, the application will want to trigger logging | |
316 | of the final encode statistics, if :option:`--csv` has been specified:: | |
317 | ||
318 | /* x265_encoder_log: | |
319 | * write a line to the configured CSV file. If a CSV filename was not | |
320 | * configured, or file open failed, or the log level indicated frame level | |
321 | * logging, this function will perform no write. */ | |
322 | void x265_encoder_log(x265_encoder *encoder, int argc, char **argv); | |
323 | ||
324 | Finally, the encoder must be closed in order to free all of its | |
325 | resources. An encoder that has been flushed cannot be restarted and | |
326 | reused. Once **x265_encoder_close()** has been called, the encoder | |
327 | handle must be discarded:: | |
328 | ||
329 | /* x265_encoder_close: | |
330 | * close an encoder handler */ | |
331 | void x265_encoder_close(x265_encoder *); | |
332 | ||
333 | When the application has completed all encodes, it should call | |
334 | **x265_cleanup()** to free process-global resources like the thread pool, | |
335 | particularly if a memory-leak detection tool is being used:: | |
336 | ||
337 | /*** | |
338 | * Release library static allocations | |
339 | */ | |
340 | void x265_cleanup(void); |