| 1 | Filter design |
| 2 | ============= |
| 3 | |
| 4 | This document explains guidelines that should be observed (or ignored with |
| 5 | good reason) when writing filters for libavfilter. |
| 6 | |
| 7 | In this document, the word “frame” indicates either a video frame or a group |
| 8 | of audio samples, as stored in an AVFilterBuffer structure. |
| 9 | |
| 10 | |
| 11 | Format negotiation |
| 12 | ================== |
| 13 | |
| 14 | The query_formats method should set, for each input and each output link, |
| 15 | the list of supported formats. |
| 16 | |
| 17 | For video links, that means pixel format. For audio links, that means |
| 18 | channel layout, sample format (the sample packing is implied by the sample |
| 19 | format) and sample rate. |
| 20 | |
| 21 | The lists are not just lists; they are references to shared objects. When |
| 22 | the negotiation mechanism computes the intersection of the formats |
| 23 | supported at each end of a link, all references to both lists are replaced |
| 24 | with a reference to the intersection. And when a single format is |
| 25 | eventually chosen for a link amongst the remaining list, again, all |
| 26 | references to the list are updated. |
| 27 | |
| 28 | That means that if a filter requires that its input and output have the |
| 29 | same format amongst a supported list, all it has to do is use a reference |
| 30 | to the same list of formats. |
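
The shared-list behaviour described above can be illustrated with a small
self-contained model. This is only a sketch: FormatList, format_list_ref
and format_list_merge are hypothetical names invented for the
illustration, not the libavfilter types or functions, and the fixed-size
arrays are a simplification.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FORMATS 8
#define MAX_REFS    8

/* Hypothetical stand-in for a shared format list: it records the address
 * of every pointer that refers to it, so that all of them can be
 * redirected at once. */
typedef struct FormatList {
    int formats[MAX_FORMATS];
    int nb_formats;
    struct FormatList **refs[MAX_REFS];
    int nb_refs;
} FormatList;

/* Make *slot refer to list and remember where that reference lives. */
static void format_list_ref(FormatList *list, FormatList **slot)
{
    assert(list->nb_refs < MAX_REFS);
    *slot = list;
    list->refs[list->nb_refs++] = slot;
}

/* Reduce a to the formats also present in b, then redirect every
 * reference to b so that it points to a. Returns the number of formats
 * left in the intersection. */
static int format_list_merge(FormatList *a, FormatList *b)
{
    int kept = 0;

    if (a == b)
        return a->nb_formats;
    for (int i = 0; i < a->nb_formats; i++) {
        for (int j = 0; j < b->nb_formats; j++) {
            if (a->formats[i] == b->formats[j]) {
                a->formats[kept++] = a->formats[i];
                break;
            }
        }
    }
    a->nb_formats = kept;
    for (int i = 0; i < b->nb_refs; i++)
        format_list_ref(a, b->refs[i]);
    b->nb_refs = 0;
    return kept;
}
```

In this model, a filter that points both its input and its output at the
same FormatList sees both pointers move together when either link is
merged with a neighbour, which is why sharing the list is enough to force
the same format on both sides.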
| 31 | |
| 32 | query_formats can leave some formats unset and return AVERROR(EAGAIN) to |
| 33 | cause the negotiation mechanism to try again later. That can be used by |
| 34 | filters with complex requirements to use the format negotiated on one link |
| 35 | to set the formats supported on another. |
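
As a rough illustration of the retry mechanism, the sketch below models a
hypothetical filter that adds an alpha channel, so its output format
depends on the format negotiated on its input. The SketchLink type, the
SKETCH_FMT_* values and SKETCH_ERROR (a stand-in for AVERROR) are
assumptions made for the example; a real query_formats sets format lists
on links rather than a single format.

```c
#include <assert.h>
#include <errno.h>

#define SKETCH_ERROR(e) (-(e))   /* stand-in for AVERROR(), assumed here */

enum SketchFormat {
    SKETCH_FMT_UNSET = -1,       /* not negotiated yet */
    SKETCH_FMT_GRAY,
    SKETCH_FMT_RGB,
    SKETCH_FMT_GRAYA,
    SKETCH_FMT_RGBA,
};

typedef struct SketchLink {
    enum SketchFormat format;
} SketchLink;

/* Returns 0 once the output format could be set, or SKETCH_ERROR(EAGAIN)
 * while the input format is still unknown, asking to be called again. */
static int sketch_query_formats(const SketchLink *in, SketchLink *out)
{
    assert(in && out);
    switch (in->format) {
    case SKETCH_FMT_GRAY:
        out->format = SKETCH_FMT_GRAYA;
        return 0;
    case SKETCH_FMT_RGB:
        out->format = SKETCH_FMT_RGBA;
        return 0;
    case SKETCH_FMT_UNSET:
        return SKETCH_ERROR(EAGAIN);   /* leave out->format unset for now */
    default:
        return SKETCH_ERROR(EINVAL);   /* input already has alpha */
    }
}
```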
| 36 | |
| 37 | |
| 38 | Buffer references ownership and permissions |
| 39 | =========================================== |
| 40 | |
| 41 | Principle |
| 42 | --------- |
| 43 | |
| 44 | Audio and video data are voluminous; the buffer and buffer reference |
| 45 | mechanism is intended to avoid, as much as possible, expensive copies of |
| 46 | that data while still allowing the filters to produce correct results. |
| 47 | |
| 48 | The data is stored in buffers represented by AVFilterBuffer structures. |
| 49 | They must not be accessed directly, but through references stored in |
| 50 | AVFilterBufferRef structures. Several references can point to the |
| 51 | same buffer; the buffer is automatically deallocated once all |
| 52 | corresponding references have been destroyed. |
| 53 | |
| 54 | The characteristics of the data (resolution, sample rate, etc.) are |
| 55 | stored in the reference; different references for the same buffer can |
| 56 | show different characteristics. In particular, a video reference can |
| 57 | point to only a part of a video buffer. |
| 58 | |
| 59 | A reference is usually obtained as input to the start_frame or |
| 60 | filter_frame method or requested using the ff_get_video_buffer or |
| 61 | ff_get_audio_buffer functions. A new reference on an existing buffer can |
| 62 | be created with the avfilter_ref_buffer function. A reference is destroyed |
| 63 | using the avfilter_unref_bufferp function. |
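
The relationship between buffers and references can be modelled in a few
lines of plain C. The sketch below is not the libavfilter implementation:
SketchBuffer, SketchRef and the sketch_* functions are hypothetical names,
and sketch_live_buffers exists only so that deallocation can be observed.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct SketchBuffer {
    unsigned char *data;
    int w, h;
    int refcount;                /* number of references still alive */
} SketchBuffer;

typedef struct SketchRef {
    SketchBuffer *buf;
    int x, y, w, h;              /* window seen through this reference */
} SketchRef;

static int sketch_live_buffers;  /* buffers currently allocated */

/* Allocate a buffer and return the first reference to it. */
static SketchRef *sketch_get_buffer(int w, int h)
{
    SketchBuffer *buf;
    SketchRef *ref;

    assert(w > 0 && h > 0);
    buf = malloc(sizeof(*buf));
    ref = malloc(sizeof(*ref));
    if (buf)
        buf->data = calloc((size_t)w * h, 1);
    if (!buf || !ref || !buf->data) {
        if (buf)
            free(buf->data);
        free(buf);
        free(ref);
        return NULL;
    }
    buf->w = w; buf->h = h; buf->refcount = 1;
    ref->buf = buf; ref->x = 0; ref->y = 0; ref->w = w; ref->h = h;
    sketch_live_buffers++;
    return ref;
}

/* Create a new reference to the same buffer; no pixel data is copied. */
static SketchRef *sketch_ref(const SketchRef *src)
{
    SketchRef *ref = malloc(sizeof(*ref));

    if (!ref)
        return NULL;
    *ref = *src;
    ref->buf->refcount++;
    return ref;
}

/* Restrict one reference to a part of the buffer. */
static void sketch_crop(SketchRef *ref, int x, int y, int w, int h)
{
    assert(x >= 0 && y >= 0 && w > 0 && h > 0);
    assert(x + w <= ref->w && y + h <= ref->h);
    ref->x += x; ref->y += y; ref->w = w; ref->h = h;
}

/* Destroy a reference and reset the pointer; the buffer itself is freed
 * when its last reference goes away. */
static void sketch_unrefp(SketchRef **refp)
{
    SketchRef *ref = *refp;

    if (!ref)
        return;
    if (--ref->buf->refcount == 0) {
        free(ref->buf->data);
        free(ref->buf);
        sketch_live_buffers--;
    }
    free(ref);
    *refp = NULL;
}
```

Keeping the window in the reference rather than in the buffer is what
lets two references to the same data report different sizes.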
| 64 | |
| 65 | Reference ownership |
| 66 | ------------------- |
| 67 | |
| 68 | At any time, a reference “belongs” to a particular piece of code, |
| 69 | usually a filter. With a few caveats that will be explained below, only |
| 70 | that piece of code is allowed to access it. It is also responsible for |
| 71 | destroying it, although this is sometimes done automatically (see the |
| 72 | section on link reference fields). |
| 73 | |
| 74 | Here are the (fairly obvious) rules for reference ownership: |
| 75 | |
| 76 | * A reference received by the filter_frame method (or its deprecated |
| 77 | start_frame version) belongs to the corresponding filter. |
| 78 | |
| 79 | Special exception for video references: the reference may be used |
| 80 | internally for automatic copying and must not be destroyed before |
| 81 | end_frame; it can be given away to ff_start_frame. |
| 82 | |
| 83 | * A reference passed to ff_filter_frame (or the deprecated |
| 84 | ff_start_frame) is given away and must no longer be used. |
| 85 | |
| 86 | * A reference created with avfilter_ref_buffer belongs to the code that |
| 87 | created it. |
| 88 | |
| 89 | * A reference obtained with ff_get_video_buffer or ff_get_audio_buffer |
| 90 | belongs to the code that requested it. |
| 91 | |
| 92 | * A reference given as return value by the get_video_buffer or |
| 93 | get_audio_buffer method is given away and must no longer be used. |
| 94 | |
| 95 | Link reference fields |
| 96 | --------------------- |
| 97 | |
| 98 | The AVFilterLink structure has a few AVFilterBufferRef fields. The |
| 99 | cur_buf and out_buf fields were used with the deprecated |
| 100 | start_frame/draw_slice/end_frame API and should no longer be used. |
| 101 | src_buf, cur_buf_copy and partial_buf are used by libavfilter internally |
| 102 | and must not be accessed by filters. |
| 103 | |
| 104 | Reference permissions |
| 105 | --------------------- |
| 106 | |
| 107 | The AVFilterBufferRef structure has a perms field that describes what |
| 108 | the code that owns the reference is allowed to do to the buffer data. |
| 109 | Different references for the same buffer can have different permissions. |
| 110 | |
| 111 | For video filters that implement the deprecated |
| 112 | start_frame/draw_slice/end_frame API, the permissions only apply to the |
| 113 | parts of the buffer that have already been covered by the draw_slice |
| 114 | method. |
| 115 | |
| 116 | The value is a binary OR of the following constants: |
| 117 | |
| 118 | * AV_PERM_READ: the owner can read the buffer data; this is essentially |
| 119 | always true and is there for self-documentation. |
| 120 | |
| 121 | * AV_PERM_WRITE: the owner can modify the buffer data. |
| 122 | |
| 123 | * AV_PERM_PRESERVE: the owner can rely on the fact that the buffer data |
| 124 | will not be modified by previous filters. |
| 125 | |
| 126 | * AV_PERM_REUSE: the owner can output the buffer several times, without |
| 127 | modifying the data in between. |
| 128 | |
| 129 | * AV_PERM_REUSE2: the owner can output the buffer several times and |
| 130 | modify the data in between (useless without the WRITE permissions). |
| 131 | |
| 132 | * AV_PERM_ALIGN: the owner can access the data using fast operations |
| 133 | that require data alignment. |
| 134 | |
| 135 | The READ, WRITE and PRESERVE permissions are about sharing the same |
| 136 | buffer between several filters to avoid expensive copies without them |
| 137 | making conflicting changes to the data. |
| 138 | |
| 139 | The REUSE and REUSE2 permissions are about special memory for direct |
| 140 | rendering. For example, a buffer directly allocated in video memory must |
| 141 | not be modified once it is displayed on screen, or it will cause tearing; |
| 142 | it will therefore not have the REUSE2 permission. |
| 143 | |
| 144 | The ALIGN permission is about extracting part of the buffer, for |
| 145 | copy-less padding or cropping for example. |
| 146 | |
| 147 | |
| 148 | References received on input pads are guaranteed to have all the |
| 149 | permissions stated in the min_perms field and none of the permissions |
| 150 | stated in the rej_perms. |
| 151 | |
| 152 | References obtained by ff_get_video_buffer and ff_get_audio_buffer are |
| 153 | guaranteed to have at least all the permissions requested as argument. |
| 154 | |
| 155 | References created by avfilter_ref_buffer have the same permissions as |
| 156 | the original reference minus the ones explicitly masked; the mask is |
| 157 | usually ~0 to keep the same permissions. |
| 158 | |
| 159 | Filters should remove permissions on references they give to output |
| 160 | whenever necessary. This can be done automatically by setting the |
| 161 | rej_perms field on the output pad. |
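
The permission rules above come down to simple bit operations. The sketch
below uses hypothetical SKETCH_PERM_* flags whose numeric values are
arbitrary and are not claimed to match the AV_PERM_* constants; the two
helper functions are likewise invented for the illustration.

```c
#include <assert.h>

enum {
    SKETCH_PERM_READ     = 1 << 0,
    SKETCH_PERM_WRITE    = 1 << 1,
    SKETCH_PERM_PRESERVE = 1 << 2,
    SKETCH_PERM_REUSE    = 1 << 3,
    SKETCH_PERM_REUSE2   = 1 << 4,
    SKETCH_PERM_ALIGN    = 1 << 5,
};

/* True if perms satisfies an input pad: every bit of min_perms is
 * present and no bit of rej_perms is. */
static int sketch_input_perms_ok(int perms, int min_perms, int rej_perms)
{
    /* Requiring and rejecting the same permission would be contradictory. */
    assert((min_perms & rej_perms) == 0);
    return (perms & min_perms) == min_perms && !(perms & rej_perms);
}

/* Permissions of a new reference: those of the original reference minus
 * the bits cleared in mask. A mask of ~0 keeps everything. */
static int sketch_new_ref_perms(int perms, int mask)
{
    return perms & mask;
}
```

For instance, a filter that keeps a reference and hands out a second one
could pass ~SKETCH_PERM_WRITE as the mask, so that the next filter cannot
modify data the first one still relies on.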
| 162 | |
| 163 | Here are a few guidelines corresponding to common situations: |
| 164 | |
| 165 | * Filters that modify and forward their frame (like drawtext) need the |
| 166 | WRITE permission. |
| 167 | |
| 168 | * Filters that read their input to produce a new frame on output (like |
| 169 | scale) need the READ permission on input and must request a buffer |
| 170 | with the WRITE permission. |
| 171 | |
| 172 | * Filters that intend to keep a reference after the filtering process |
| 173 | is finished (after filter_frame returns) must have the PRESERVE |
| 174 | permission on it and remove the WRITE permission if they create a new |
| 175 | reference to give it away. |
| 176 | |
| 177 | * Filters that intend to modify a reference they have kept after the end |
| 178 | of the filtering process need the REUSE2 permission and must remove |
| 179 | the PRESERVE permission if they create a new reference to give it |
| 180 | away. |
| 181 | |
| 182 | |
| 183 | Frame scheduling |
| 184 | ================ |
| 185 | |
| 186 | The purpose of these rules is to ensure that frames flow in the filter |
| 187 | graph without getting stuck and accumulating somewhere. |
| 188 | |
| 189 | Simple filters that output one frame for each input frame should not have |
| 190 | to worry about it. |
| 191 | |
| 192 | filter_frame |
| 193 | ------------ |
| 194 | |
| 195 | This method is called when a frame is pushed to the filter's input. It |
| 196 | can be called at any time except in a reentrant way. |
| 197 | |
| 198 | If the input frame is enough to produce output, then the filter should |
| 199 | push the output frames on the output link immediately. |
| 200 | |
| 201 | As an exception to the previous rule, if the input frame is enough to |
| 202 | produce several output frames, then the filter only needs to output at |
| 203 | least one frame per link. The additional frames can be left buffered in |
| 204 | the filter; these buffered frames must be flushed immediately if a new |
| 205 | input produces new output. |
| 206 | |
| 207 | (Example: frame rate-doubling filter: filter_frame must (1) flush the |
| 208 | second copy of the previous frame, if it is still there, (2) push the |
| 209 | first copy of the incoming frame, (3) keep the second copy for later.) |
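
The three steps of the frame rate-doubling example can be written out as
follows. This is a minimal model, not a real filter: frames are plain
integers, the output link is an array, and Doubler, doubler_filter_frame
and doubler_flush are hypothetical names.

```c
#include <assert.h>

#define SKETCH_MAX_OUT 16

typedef struct Doubler {
    int pending;                 /* second copy kept for later */
    int has_pending;
    int out[SKETCH_MAX_OUT];     /* stand-in for the output link */
    int nb_out;
} Doubler;

static void doubler_push(Doubler *s, int frame)
{
    assert(s->nb_out < SKETCH_MAX_OUT);
    s->out[s->nb_out++] = frame;
}

static int doubler_filter_frame(Doubler *s, int frame)
{
    /* (1) flush the second copy of the previous frame, if still there */
    if (s->has_pending) {
        doubler_push(s, s->pending);
        s->has_pending = 0;
    }
    /* (2) push the first copy of the incoming frame */
    doubler_push(s, frame);
    /* (3) keep the second copy for later */
    s->pending = frame;
    s->has_pending = 1;
    return 0;
}

/* What a request on the output could do first: push the buffered copy.
 * Returns 1 if a frame was pushed, 0 if nothing was buffered. */
static int doubler_flush(Doubler *s)
{
    if (!s->has_pending)
        return 0;
    doubler_push(s, s->pending);
    s->has_pending = 0;
    return 1;
}
```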
| 210 | |
| 211 | If the input frame is not enough to produce output, the filter must not |
| 212 | call request_frame to get more. It must just process the frame or queue |
| 213 | it. The task of requesting more frames is left to the filter's |
| 214 | request_frame method or the application. |
| 215 | |
| 216 | If a filter has several inputs, the filter must be ready for frames |
| 217 | arriving randomly on any input. Therefore, any filter with several inputs |
| 218 | will most likely require some kind of queuing mechanism. It is perfectly |
| 219 | acceptable to have a limited queue and to drop frames when the inputs |
| 220 | are too unbalanced. |
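
One possible shape for such a limited queue is a fixed-size ring buffer
kept per input. The sketch below drops the oldest frame when the queue is
full; that policy, the capacity and the FrameQueue names are assumptions
made for the example, and frames are again plain integers.

```c
#include <assert.h>

#define SKETCH_QUEUE_CAP 4

typedef struct FrameQueue {
    int frames[SKETCH_QUEUE_CAP];
    int head;                    /* index of the oldest queued frame */
    int count;                   /* number of queued frames */
    int dropped;                 /* frames discarded because of overflow */
} FrameQueue;

/* Queue a frame; when the queue is full, discard the oldest one first. */
static void queue_add(FrameQueue *q, int frame)
{
    if (q->count == SKETCH_QUEUE_CAP) {
        q->head = (q->head + 1) % SKETCH_QUEUE_CAP;
        q->count--;
        q->dropped++;
    }
    q->frames[(q->head + q->count) % SKETCH_QUEUE_CAP] = frame;
    q->count++;
}

/* Remove and return the oldest queued frame; the queue must not be empty. */
static int queue_take(FrameQueue *q)
{
    int frame;

    assert(q->count > 0);
    frame = q->frames[q->head];
    q->head = (q->head + 1) % SKETCH_QUEUE_CAP;
    q->count--;
    return frame;
}
```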
| 221 | |
| 222 | request_frame |
| 223 | ------------- |
| 224 | |
| 225 | This method is called when a frame is wanted on an output. |
| 226 | |
| 227 | For a source (an input to the graph), it should directly call |
| 228 | filter_frame on the corresponding output. |
| 229 | |
| 230 | For a filter, if there are queued frames already ready, one of these |
| 231 | frames should be pushed. If not, the filter should request a frame on |
| 232 | one of its inputs, repeatedly until at least one frame has been pushed. |
| 233 | |
| 234 | Return values: |
| 235 | if request_frame could produce a frame, it should return 0; |
| 236 | if it could not for temporary reasons, it should return AVERROR(EAGAIN); |
| 237 | if it could not because there are no more frames, it should return |
| 238 | AVERROR_EOF. |
| 239 | |
| 240 | The typical implementation of request_frame for a filter with several |
| 241 | inputs will look like this: |
| 242 | |
| 243 |     if (frames_queued) { |
| 244 |         push_one_frame(); |
| 245 |         return 0; |
| 246 |     } |
| 247 |     while (!frame_pushed) { |
| 248 |         input = input_where_a_frame_is_most_needed(); |
| 249 |         ret = ff_request_frame(input); |
| 250 |         if (ret == AVERROR_EOF) { |
| 251 |             process_eof_on_input(); |
| 252 |         } else if (ret < 0) { |
| 253 |             return ret; |
| 254 |         } |
| 255 |     } |
| 256 |     return 0; |
| 257 | |
| 258 | Note that, except for filters that can have queued frames, request_frame |
| 259 | does not push frames: it requests them from its input, and as a reaction, |
| 260 | the filter_frame method will be called and do the work. |
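
The call chain can be traced in a toy model with one source and one
filter. Everything here is hypothetical: Source and Negate are invented
types, frames are integers, and SKETCH_EOF is an arbitrary stand-in for
AVERROR_EOF.

```c
#include <assert.h>

#define SKETCH_EOF     (-1000)   /* stand-in for AVERROR_EOF, value arbitrary */
#define SKETCH_MAX_OUT 8

typedef struct Negate Negate;

typedef struct Source {
    const int *frames;           /* frames the source can still deliver */
    int nb_frames;
    int pos;
    Negate *dst;                 /* filter connected to the source output */
} Source;

struct Negate {
    Source *src;                 /* what feeds the filter input */
    int out[SKETCH_MAX_OUT];     /* stand-in for the filter output link */
    int nb_out;
};

/* The filter does its work here, when a frame arrives on its input. */
static int negate_filter_frame(Negate *f, int frame)
{
    assert(f->nb_out < SKETCH_MAX_OUT);
    f->out[f->nb_out++] = -frame;
    return 0;
}

/* A source answers a request by calling filter_frame on its output,
 * or reports that there are no more frames. */
static int source_request_frame(Source *s)
{
    if (s->pos >= s->nb_frames)
        return SKETCH_EOF;
    return negate_filter_frame(s->dst, s->frames[s->pos++]);
}

/* The filter pushes nothing itself: it forwards the request to its
 * input, and the frame arrives through negate_filter_frame. */
static int negate_request_frame(Negate *f)
{
    return source_request_frame(f->src);
}
```

By the time negate_request_frame returns 0, the output already holds the
new frame, although the function never touched the output itself.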
| 261 | |
| 262 | Legacy API |
| 263 | ========== |
| 264 | |
| 265 | Until libavfilter 3.23, the filter_frame method was split: |
| 266 | |
| 267 | - for video filters, it was made of start_frame, draw_slice (that could be |
| 268 | called several times on distinct parts of the frame) and end_frame; |
| 269 | |
| 270 | - for audio filters, it was called filter_samples. |