*********************************
Application Programming Interface
*********************************

Introduction
============

x265 is written primarily in C++ and x86 assembly language, but the
public-facing programming interface is C for the widest possible
portability. This C interface is wholly defined within :file:`x265.h`
in the source/ folder of our source tree. All of the functions,
variables, and enumerations meant to be used by the end-user are present
in this header.

x265 keeps its public API as close as possible to x264's public API, so
those familiar with using x264 through its C interface will find x265
quite familiar.

This file is meant to be read in order; the narrative follows linearly
through the various sections.

Build Considerations
====================

The choice of Main or Main10 profile encodes is made at compile time;
the internal pixel depth influences a great many variable sizes, and
thus 8-bit and 10-bit pixels are handled as different build options
(primarily to maintain the performance of the 8-bit builds). libx265
exports a variable **x265_max_bit_depth** which indicates how the
library was compiled (it will contain a value of 8 or 10). Further,
**x265_version_str** is a pointer to a string indicating the version of
x265 which was compiled, and **x265_build_info_str** is a pointer to a
string identifying the compiler and build options.

x265 will accept input pixels of any depth between 8 and 16 bits
regardless of the depth of its internal pixels (8 or 10). It will shift
and mask input pixels as required to reach the internal depth. If
downshifting is being performed using our CLI application, the
:option:`--dither` option may be enabled to reduce banding. This feature
is not available through the C interface.

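As a rough illustration of the shift-and-mask step described above, the
following sketch converts one sample from an input depth to an internal
depth. The function name ``to_internal_depth`` is hypothetical and this
is not x265's actual code; in particular, it simply truncates when
down-shifting (no rounding or dithering) and pads with zero bits when
up-shifting, both of which are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the x265 API: convert one sample from
 * input_depth bits to internal_depth bits by shifting, then mask the result
 * to the internal range. Both depths are assumed to lie in 8..16. */
uint16_t to_internal_depth(uint16_t sample, int input_depth, int internal_depth)
{
    uint32_t value = sample;
    uint32_t mask = (1u << internal_depth) - 1u;

    if (input_depth > internal_depth)
        value >>= (input_depth - internal_depth);   /* drop low-order bits */
    else
        value <<= (internal_depth - input_depth);   /* pad low-order bits with zeros */

    return (uint16_t)(value & mask);
}
```

For instance, a full-scale 10-bit sample (1023) maps to 255 in an 8-bit
build under this truncating scheme.
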
Encoder
=======

The primary object in x265 is the encoder object, and this is
represented in the public API as an opaque typedef **x265_encoder**.
Pointers of this type are passed to most encoder functions.

A single encoder generates a single output bitstream from a sequence of
raw input pictures. Thus if you need multiple output bitstreams you
must allocate multiple encoders. You may pass the same input pictures
to multiple encoders because the encode function does not modify the
input picture structures (the pictures are copied into the encoder as
the first step of encode).

Encoder allocation is a reentrant function, so multiple encoders may be
safely allocated in a single process. The encoder access functions are
not reentrant for a single encoder, so the recommended use case is to
allocate one client thread per encoder instance (one thread for all
encoder instances is possible, but some encoder access functions are
blocking and thus this would be less efficient).

.. Note::

    There is one caveat to having multiple encoders within a single
    process. All of the encoders must use the same maximum CTU size
    because many global variables are configured based on this size.
    Encoder allocation will fail if a mismatched CTU size is attempted.

An encoder is allocated by calling **x265_encoder_open()**::

    /* x265_encoder_open:
     * create a new encoder handler, all parameters from x265_param are copied */
    x265_encoder* x265_encoder_open(x265_param *);

The returned pointer is then passed to all of the functions pertaining
to this encode. A large amount of memory is allocated during this
function call, but the encoder will continue to allocate memory as the
first pictures are passed to it, until its pool of picture structures
is large enough to handle all of the pictures it must keep internally.
The pool size is determined by the lookahead depth, the number of frame
threads, and the maximum number of references.

As indicated in the comment, **x265_param** is copied internally so the user
may release their copy after allocating the encoder. Changes made to
their copy of the param structure have no effect on the encoder after it
has been allocated.

Param
=====

The **x265_param** structure describes everything the encoder needs to
know about the input pictures and the output bitstream, and almost
everything in between.

The recommended way to handle these param structures is to allocate them
from libx265 via::

    /* x265_param_alloc:
     * Allocates an x265_param instance. The returned param structure is not
     * special in any way, but using this method together with x265_param_free()
     * and x265_param_parse() to set values by name allows the application to treat
     * x265_param as an opaque data struct for version safety */
    x265_param *x265_param_alloc();

In this way, your application does not need to know the exact size of
the param structure (the build of x265 could potentially be a bit newer
than the copy of :file:`x265.h` that your application compiled against).

Next you perform the initial *rough cut* configuration of the encoder by
choosing a performance preset and optional tune factor.
**x265_preset_names** and **x265_tune_names** respectively hold the
string names of the presets and tune factors (see :ref:`presets
<preset-tune-ref>` for more detail on presets and tune factors)::

    /* returns 0 on success, negative on failure (e.g. invalid preset/tune name). */
    int x265_param_default_preset(x265_param *, const char *preset, const char *tune);

Now you may optionally specify a profile. **x265_profile_names**
contains the string names this function accepts::

    /* (can be NULL, in which case the function will do nothing)
     * returns 0 on success, negative on failure (e.g. invalid profile name). */
    int x265_param_apply_profile(x265_param *, const char *profile);

Finally you configure any remaining options by name using repeated calls to::

    /* x265_param_parse:
     * set one parameter by name.
     * returns 0 on success, or returns one of the following errors.
     * note: BAD_VALUE occurs only if it can't even parse the value,
     * numerical range is not checked until x265_encoder_open().
     * value=NULL means "true" for boolean options, but is a BAD_VALUE for non-booleans. */
    #define X265_PARAM_BAD_NAME  (-1)
    #define X265_PARAM_BAD_VALUE (-2)
    int x265_param_parse(x265_param *p, const char *name, const char *value);

See :ref:`string options <string-options-ref>` for the list of options (and their
descriptions) which can be set by **x265_param_parse()**.

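The return codes of **x265_param_parse()** might be reported along the
lines of the sketch below. The helper name ``describe_parse_result`` is
made up for this example, and the two error macros are redefined locally
(with the values shown in the header excerpt) only so that the sketch
compiles without :file:`x265.h`.

```c
/* Values copied from the header excerpt; guarded so that including
 * x265.h first does not cause a redefinition. */
#ifndef X265_PARAM_BAD_NAME
#define X265_PARAM_BAD_NAME  (-1)
#endif
#ifndef X265_PARAM_BAD_VALUE
#define X265_PARAM_BAD_VALUE (-2)
#endif

/* Hypothetical helper: map an x265_param_parse() return code to a message. */
const char *describe_parse_result(int rc)
{
    switch (rc)
    {
    case 0:                    return "ok";
    case X265_PARAM_BAD_NAME:  return "unknown option name";
    case X265_PARAM_BAD_VALUE: return "value could not be parsed";
    default:                   return "unexpected return code";
    }
}
```

A result of "ok" only means the value could be parsed; as the header
comment notes, numerical ranges are not checked until
**x265_encoder_open()**.
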
After the encoder has been created, you may release the param structure::

    /* x265_param_free:
     * Use x265_param_free() to release storage for an x265_param instance
     * allocated by x265_param_alloc() */
    void x265_param_free(x265_param *);

.. Note::

    Using these methods to allocate and release the param structures
    helps future-proof your code in many ways, but the x265 API is
    versioned in such a way that we prevent linkage against a build of
    x265 that does not match the version of the header you are compiling
    against. This is a function of the X265_BUILD macro.

**x265_encoder_parameters()** may be used to get a copy of the param
structure from the encoder after it has been opened, in order to see the
changes made to the parameters for auto-detection and other reasons::

    /* x265_encoder_parameters:
     * copies the current internal set of parameters to the pointer provided
     * by the caller. useful when the calling application needs to know
     * how x265_encoder_open has changed the parameters.
     * note that the data accessible through pointers in the returned param struct
     * (e.g. filenames) should not be modified by the calling application. */
    void x265_encoder_parameters(x265_encoder *, x265_param *);

Pictures
========

Raw pictures are passed to the encoder via the **x265_picture** structure.
Just like the param structure, we recommend you allocate this structure
from libx265 to avoid potential size mismatches::

    /* x265_picture_alloc:
     * Allocates an x265_picture instance. The returned picture structure is not
     * special in any way, but using this method together with x265_picture_free()
     * and x265_picture_init() allows some version safety. New picture fields will
     * always be added to the end of x265_picture */
    x265_picture *x265_picture_alloc();

Regardless of whether you allocate your picture structure this way or
whether you simply declare it on the stack, your next step is to
initialize the structure via::

    /***
     * Initialize an x265_picture structure to default values. It sets the pixel
     * depth and color space to the encoder's internal values and sets the slice
     * type to auto - so the lookahead will determine slice type.
     */
    void x265_picture_init(x265_param *param, x265_picture *pic);

x265 does not perform any color space conversions, so the raw picture's
color space (chroma sampling) must match the color space specified in
the param structure used to allocate the encoder. **x265_picture_init**
initializes this field to the internal color space and it is best to
leave it unmodified.

The picture bit depth is initialized to be the encoder's internal bit
depth but this value should be changed to the actual depth of the pixels
being passed into the encoder. If the picture bit depth is more than 8,
the encoder assumes two bytes are used to represent each sample
(little-endian shorts).

The user is responsible for setting the plane pointers and plane strides
(in units of bytes, not pixels). The presentation time stamp (**pts**)
is optional, depending on whether you need accurate decode time stamps
(**dts**) on output.

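Since strides are given in bytes and samples deeper than 8 bits occupy
two bytes each, the stride of a tightly packed plane can be computed as
in the sketch below. The helper names are hypothetical, and the chroma
width helper assumes horizontally subsampled chroma (4:2:0 or 4:2:2,
with odd luma widths rounded up); padded or aligned buffers would need a
larger stride.

```c
/* Hypothetical helper: bytes used by one sample at the given picture bit depth. */
int bytes_per_sample(int bit_depth)
{
    return bit_depth > 8 ? 2 : 1;
}

/* Hypothetical helper: stride in bytes of a tightly packed plane.
 * width_in_samples is the width of that plane (luma or chroma). */
int packed_stride_bytes(int width_in_samples, int bit_depth)
{
    return width_in_samples * bytes_per_sample(bit_depth);
}

/* Chroma plane width for horizontally subsampled formats (4:2:0, 4:2:2). */
int chroma_width_subsampled(int luma_width)
{
    return (luma_width + 1) / 2;
}
```

For a 1920-sample-wide 10-bit picture this gives a luma stride of 3840
bytes and, for 4:2:0 input, a chroma stride of 1920 bytes.
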
If you wish to override the lookahead or rate control for a given
picture you may specify a slicetype other than X265_TYPE_AUTO, or a
forceQP value other than 0.

x265 does not modify the picture structure provided as input, so you may
reuse a single **x265_picture** for all pictures passed to a single
encoder, or even all pictures passed to multiple encoders.

Structures allocated from the library should eventually be released::

    /* x265_picture_free:
     * Use x265_picture_free() to release storage for an x265_picture instance
     * allocated by x265_picture_alloc() */
    void x265_picture_free(x265_picture *);

Analysis Buffers
================

Analysis information can be saved and reused between encodes of the
same video sequence (generally for multiple bitrate encodes). The best
results are attained by saving the analysis information of the highest
bitrate encode and reusing it in lower bitrate encodes.

When saving or loading analysis data, buffers must be allocated for
every picture passed into the encoder using::

    /* x265_alloc_analysis_data:
     * Allocate memory to hold analysis meta data, returns 1 on success else 0 */
    int x265_alloc_analysis_data(x265_picture*);

Note that this is very different from the typical semantics of
**x265_picture**, which can be reused many times. The analysis buffers must
be re-allocated for every input picture.

Analysis buffers passed to the encoder are owned by the encoder until
it passes the buffers back via an output **x265_picture**. The user is
responsible for releasing the buffers when they are finished with them
via::

    /* x265_free_analysis_data:
     * Use x265_free_analysis_data to release storage of members allocated by
     * x265_alloc_analysis_data */
    void x265_free_analysis_data(x265_picture*);

Encode Process
==============

The output of the encoder is a series of NAL packets, which are always
returned concatenated in consecutive memory. HEVC streams have VPS, SPS,
and PPS headers which describe how the following packets are to be
decoded. If you specified :option:`--repeat-headers`, those headers
will be output with every keyframe. Otherwise you must explicitly query
those headers using::

    /* x265_encoder_headers:
     * return the SPS and PPS that will be used for the whole stream.
     * *pi_nal is the number of NAL units outputted in pp_nal.
     * returns negative on error, total byte size of payload data on success
     * the payloads of all output NALs are guaranteed to be sequential in memory. */
    int x265_encoder_headers(x265_encoder *, x265_nal **pp_nal, uint32_t *pi_nal);

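Because the payloads are sequential in memory, all returned NAL units
can be written with a single call starting at the first payload. The
sketch below uses a stand-in struct, ``nal_view``, with ``sizeBytes``
and ``payload`` members so that it compiles without libx265; the
assumption is that **x265_nal** exposes members with these names, which
should be checked against your copy of :file:`x265.h`. The function
names are likewise hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for x265_nal, declared here so the sketch is self-contained. */
typedef struct nal_view
{
    uint32_t type;       /* NAL unit type */
    uint32_t sizeBytes;  /* size of the payload in bytes */
    uint8_t *payload;    /* start of this NAL's payload */
} nal_view;

/* Sum the payload sizes of count NAL units. */
size_t total_nal_bytes(const nal_view *nals, uint32_t count)
{
    size_t total = 0;
    for (uint32_t i = 0; i < count; i++)
        total += nals[i].sizeBytes;
    return total;
}

/* Write all payloads with one fwrite(), relying on them being contiguous.
 * Returns the number of bytes written. */
size_t write_nals(FILE *out, const nal_view *nals, uint32_t count)
{
    if (count == 0)
        return 0;
    return fwrite(nals[0].payload, 1, total_nal_bytes(nals, count), out);
}
```

According to the header comment, **x265_encoder_headers()** already
returns the total byte size on success, so ``total_nal_bytes`` is mainly
useful as a cross-check.
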
Now we get to the main encode loop. Raw input pictures are passed to the
encoder in display order via::

    /* x265_encoder_encode:
     * encode one picture.
     * *pi_nal is the number of NAL units outputted in pp_nal.
     * returns negative on error, zero if no NAL units returned.
     * the payloads of all output NALs are guaranteed to be sequential in memory. */
    int x265_encoder_encode(x265_encoder *encoder, x265_nal **pp_nal, uint32_t *pi_nal, x265_picture *pic_in, x265_picture *pic_out);

These pictures are queued up until the lookahead is full, then the
frame encoders are filled in turn, and finally you begin receiving
output NALs (corresponding to a single output picture) with each input
picture you pass into the encoder.

Once the pipeline is completely full, **x265_encoder_encode()** will
block until the next output picture is complete.

.. note::

    Optionally, if the pointer of a second **x265_picture** structure is
    provided, the encoder will fill it with data pertaining to the
    output picture corresponding to the output NALs, including the
    reconstructed image, POC and decode timestamp. These pictures will be
    in encode (or decode) order.

When the last of the raw input pictures has been sent to the encoder,
**x265_encoder_encode()** must still be called repeatedly with a
*pic_in* argument of 0, indicating a pipeline flush, until the function
returns a value less than or equal to 0 (indicating the output bitstream
is complete).

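The feed-then-flush protocol can be sketched as below. To keep the
example compilable without libx265, the encoder is abstracted behind a
hypothetical callback type ``encode_fn`` that follows the return
convention described above (positive when output was produced, zero when
none, negative on error), and ``mock_encode`` is a toy stand-in with a
fixed pipeline delay; neither is part of the x265 API.

```c
#include <stddef.h>

/* Hypothetical callback standing in for x265_encoder_encode(): pic is the
 * input picture, or NULL to flush. Returns >0 if output was produced,
 * 0 if none, <0 on error. */
typedef int (*encode_fn)(void *state, const void *pic);

/* Feed count pictures, then flush with NULL until the return value is <= 0.
 * Returns the number of calls that produced output, or -1 on error. */
int encode_all(encode_fn encode, void *state, const void *const *pics, int count)
{
    int outputs = 0;
    int ret;

    for (int i = 0; i < count; i++)
    {
        ret = encode(state, pics[i]);
        if (ret < 0)
            return -1;
        if (ret > 0)
            outputs++;          /* zero is normal while the pipeline fills */
    }

    while ((ret = encode(state, NULL)) > 0)
        outputs++;              /* drain pictures still held by the encoder */

    return ret < 0 ? -1 : outputs;
}

/* Toy encoder state: holds up to delay pictures before emitting any. */
typedef struct mock_state
{
    int queued;
    int delay;
} mock_state;

int mock_encode(void *state, const void *pic)
{
    mock_state *m = (mock_state *)state;

    if (pic)
    {
        if (m->queued < m->delay)
        {
            m->queued++;        /* pipeline still filling: no output yet */
            return 0;
        }
        return 1;               /* one picture in, one picture out */
    }
    if (m->queued > 0)
    {
        m->queued--;            /* flushing: emit one buffered picture */
        return 1;
    }
    return 0;                   /* nothing left: bitstream complete */
}
```

With the real library, the body of the callback would call
**x265_encoder_encode()** and write the returned NAL units before
returning its result.
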
At any time during this process, the application may query running
statistics from the encoder::

    /* x265_encoder_get_stats:
     * returns encoder statistics */
    void x265_encoder_get_stats(x265_encoder *encoder, x265_stats *, uint32_t statsSizeBytes);

Cleanup
=======

At the end of the encode, the application will want to trigger logging
of the final encode statistics, if :option:`--csv` was specified::

    /* x265_encoder_log:
     * write a line to the configured CSV file. If a CSV filename was not
     * configured, or file open failed, or the log level indicated frame level
     * logging, this function will perform no write. */
    void x265_encoder_log(x265_encoder *encoder, int argc, char **argv);

Finally, the encoder must be closed in order to free all of its
resources. An encoder that has been flushed cannot be restarted and
reused. Once **x265_encoder_close()** has been called, the encoder
handle must be discarded::

    /* x265_encoder_close:
     * close an encoder handler */
    void x265_encoder_close(x265_encoder *);

When the application has completed all encodes, it should call
**x265_cleanup()** to free process-global resources like the thread pool,
particularly if a memory-leak detection tool is being used::

    /***
     * Release library static allocations
     */
    void x265_cleanup(void);