| 1 | This document is a tutorial/initiation for writing simple filters in |
| 2 | libavfilter. |
| 3 | |
| 4 | Foreword: just like everything else in FFmpeg, libavfilter is monolithic, which |
| 5 | means that it is highly recommended that you submit your filters to the FFmpeg |
| 6 | development mailing-list and make sure they are applied. Otherwise, your filter is |
| 7 | likely to have a very short lifetime due to more or less regular internal API |
| 8 | changes, and limited distribution, review, and testing. |
| 9 | |
| 10 | Bootstrap |
| 11 | ========= |
| 12 | |
| 13 | Let's say you want to write a new simple video filter called "foobar" which |
| 14 | takes one frame as input, changes the pixels in whatever fashion you fancy, and |
| 15 | outputs the modified frame. The simplest way of doing this is to start from a |
| 16 | similar filter. We'll pick edgedetect, but any other should do. You can look |
| 17 | for others using the `./ffmpeg -v 0 -filters|grep ' V->V '` command. |
| 18 | |
| 19 | - sed 's/edgedetect/foobar/g;s/EdgeDetect/Foobar/g' libavfilter/vf_edgedetect.c > libavfilter/vf_foobar.c |
| 20 | - edit libavfilter/Makefile, and add an entry for "foobar" following the |
| 21 |   pattern of the other filters. |
| 22 | - edit libavfilter/allfilters.c, and add an entry for "foobar" following the |
| 23 |   pattern of the other filters. |
| 24 | - ./configure ... |
| 25 | - make -j<whatever> ffmpeg |
| 26 | - ./ffmpeg -i http://samples.ffmpeg.org/image-samples/lena.pnm -vf foobar foobar.png |
| 27 |   Note here: you can obviously use a random local image instead of a remote URL. |
| 28 | |
| 29 | If everything went right, you should get a foobar.png with Lena edge-detected. |
| 30 | |
| 31 | That's it, your new playground is ready. |
| 32 | |
| 33 | Some little details about what's going on: |
| 34 | libavfilter/allfilters.c:avfilter_register_all() is called at runtime to create |
| 35 | a list of the available filters, but it's important to know that this file is |
| 36 | also parsed by the configure script, which in turn will define variables for |
| 37 | the build system and the C code: |
| 38 | |
| 39 | --- after running configure --- |
| 40 | |
| 41 | $ grep FOOBAR config.mak |
| 42 | CONFIG_FOOBAR_FILTER=yes |
| 43 | $ grep FOOBAR config.h |
| 44 | #define CONFIG_FOOBAR_FILTER 1 |
| 45 | |
| 46 | CONFIG_FOOBAR_FILTER=yes from the config.mak is later used to enable the filter in |
| 47 | libavfilter/Makefile and CONFIG_FOOBAR_FILTER=1 from the config.h will be used |
| 48 | for registering the filter in libavfilter/allfilters.c. |
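
To make the role of the config.h define more concrete, here is a small
self-contained sketch of a registration macro gated by CONFIG_*_FILTER. It
only imitates the idea: REGISTER_DEMO_FILTER, demo_register_all and the "baz"
filter are made-up names for illustration, not the actual content of
libavfilter/allfilters.c.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for what configure would write into config.h. */
#define CONFIG_FOOBAR_FILTER 1
#define CONFIG_BAZ_FILTER    0   /* hypothetical filter that was disabled */

#define MAX_DEMO_FILTERS 8
static const char *registered[MAX_DEMO_FILTERS];
static int nb_registered;

/* Hypothetical registration macro: the filter name is only recorded when
 * the matching CONFIG_<NAME>_FILTER value is non-zero. */
#define REGISTER_DEMO_FILTER(X, x)                    \
    do {                                              \
        if (CONFIG_##X##_FILTER)                      \
            registered[nb_registered++] = #x;         \
    } while (0)

/* Builds the list of available filters at runtime. */
static void demo_register_all(void)
{
    nb_registered = 0;
    REGISTER_DEMO_FILTER(FOOBAR, foobar);
    REGISTER_DEMO_FILTER(BAZ,    baz);
}
```

After demo_register_all(), only "foobar" is in the list, because the
disabled filter's branch is compiled with a constant false condition.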
| 49 | |
| 50 | Filter code layout |
| 51 | ================== |
| 52 | |
| 53 | You now need some theory about the general code layout of a filter. Open your |
| 54 | libavfilter/vf_foobar.c. This section will detail the important parts of the |
| 55 | code you need to understand before messing with it. |
| 56 | |
| 57 | Copyright |
| 58 | --------- |
| 59 | |
| 60 | First chunk is the copyright. Most filters are LGPL, and we are assuming |
| 61 | vf_foobar is as well. We are also assuming vf_foobar is not an edge detector |
| 62 | filter, so you can update the boilerplate with your credits. |
| 63 | |
| 64 | Doxy |
| 65 | ---- |
| 66 | |
| 67 | Next chunk is the Doxygen about the file. See http://ffmpeg.org/doxygen/trunk/. |
| 68 | Detail here what the filter is, does, and add some references if you feel like |
| 69 | it. |
| 70 | |
| 71 | Context |
| 72 | ------- |
| 73 | |
| 74 | Skip the headers and scroll down to the definition of FoobarContext. This is |
| 75 | your local state context. It is already filled with 0 when you get it, so do not |
| 76 | worry about uninitialized reads from this context. This is where you put all the |
| 77 | "global" information you need, typically the variables storing the user options. |
| 78 | You'll notice the first field "const AVClass *class"; it's the only field you |
| 79 | need to keep, assuming you have a context. There is some magic you don't need to |
| 80 | care about around this field; just let it be (in first position) for now. |
| 81 | |
| 82 | Options |
| 83 | ------- |
| 84 | |
| 85 | Then comes the options array. This is what defines the user-accessible |
| 86 | options. For example, -vf foobar=mode=colormix:high=0.4:low=0.1. Most options |
| 87 | have the following pattern: |
| 88 | name, description, offset, type, default value, minimum value, maximum value, flags |
| 89 | |
| 90 | - name is the option name; keep it simple and lowercase |
| 91 | - description is short, in lowercase, without a period, and describes what the |
| 92 |   option does, for example "set the foo of the bar" |
| 93 | - offset is the offset of the field in your local context, see the OFFSET() |
| 94 |   macro; the option parser will use that information to fill the fields |
| 95 |   according to the user input |
| 96 | - type is any of AV_OPT_TYPE_* defined in libavutil/opt.h |
| 97 | - default value is a union where you pick the appropriate type; "{.dbl=0.3}", |
| 98 |   "{.i64=0x234}", "{.str=NULL}", ... |
| 99 | - min and max values define the range of available values, inclusive |
| 100 | - flags are AVOption generic flags. See AV_OPT_FLAG_* definitions |
| 101 | |
| 102 | When in doubt, just look at the other AVOption definitions all around the codebase; |
| 103 | there are tons of examples. |
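
If the role of the offset is not obvious, the following self-contained sketch
may help. It is a much-reduced model of the idea, not the real AVOption API:
DemoContext, DemoOption, demo_set_defaults and demo_set are made-up names, and
only double-valued options are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical private context, loosely modelled on FoobarContext. */
typedef struct DemoContext {
    double high;
    double low;
} DemoContext;

/* Hypothetical, reduced option descriptor (NOT the real AVOption). */
typedef struct DemoOption {
    const char *name;
    size_t offset;          /* where the field lives inside DemoContext */
    double def, min, max;   /* default value and inclusive range */
} DemoOption;

static const DemoOption demo_options[] = {
    { "high", offsetof(DemoContext, high), 0.4, 0.0, 1.0 },
    { "low",  offsetof(DemoContext, low),  0.1, 0.0, 1.0 },
    { NULL, 0, 0, 0, 0 }
};

/* Write every default value into the field found at its offset. */
static void demo_set_defaults(DemoContext *ctx)
{
    for (const DemoOption *o = demo_options; o->name; o++)
        *(double *)((char *)ctx + o->offset) = o->def;
}

/* Store a user value; returns 0 on success, -1 if the option is unknown
 * or the value is outside [min, max]. */
static int demo_set(DemoContext *ctx, const char *name, double value)
{
    for (const DemoOption *o = demo_options; o->name; o++) {
        if (strcmp(o->name, name))
            continue;
        if (value < o->min || value > o->max)
            return -1;
        *(double *)((char *)ctx + o->offset) = value;
        return 0;
    }
    return -1;
}
```

The parser never needs to know the layout of the context: the table alone
tells it where each value goes, which is why a wrong offset silently corrupts
another field.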
| 104 | |
| 105 | Class |
| 106 | ----- |
| 107 | |
| 108 | AVFILTER_DEFINE_CLASS(foobar) will define a unique foobar_class with some kind |
| 109 | of signature referencing the options, etc. which will be referenced in the |
| 110 | definition of the AVFilter. |
| 111 | |
| 112 | Filter definition |
| 113 | ----------------- |
| 114 | |
| 115 | At the end of the file, you will find foobar_inputs, foobar_outputs and |
| 116 | the AVFilter ff_vf_foobar. Don't forget to update the AVFilter.description with |
| 117 | a description of what the filter does, starting with a capitalized letter and |
| 118 | ending with a period. You'd better drop the AVFilter.flags entry for now, and |
| 119 | re-add it later depending on the capabilities of your filter. |
| 120 | |
| 121 | Callbacks |
| 122 | --------- |
| 123 | |
| 124 | Let's now study the common callbacks. Before going into details, note that all |
| 125 | these callbacks are explained in detail in libavfilter/avfilter.h, so when in |
| 126 | doubt, refer to the doxy in that file. |
| 127 | |
| 128 | init() |
| 129 | ~~~~~~ |
| 130 | |
| 131 | First one to be called is init(). It's flagged as cold because it is not called |
| 132 | often. Look for "cold" on |
| 133 | http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html for more |
| 134 | information. |
| 135 | |
| 136 | As the name suggests, init() is where you can initialize and allocate |
| 137 | your buffers, pre-compute your data, etc. Note that at this point, your local |
| 138 | context already has the user options initialized, but you still have no |
| 139 | clue about the kind of data input you will get, so this function is often |
| 140 | mainly used to sanitize the user options. |
| 141 | |
| 142 | Some init()s will also define the number of inputs or outputs dynamically |
| 143 | according to the user options. A good example of this is the split filter, but |
| 144 | we won't cover this here since vf_foobar is just a simple 1:1 filter. |
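
As an illustration of the "sanitize the user options" part, here is a
self-contained sketch of what such an init() might do. Everything in it is an
assumption made for the example: DemoContext and demo_init are made-up names,
the low/high options are assumed to be in the [0, 1] range, and -EINVAL stands
in for the AVERROR(EINVAL) a real filter would return.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical private context: low/high are user options (assumed to be
 * already filled in and within [0, 1]), the *_u8 fields are derived data. */
typedef struct DemoContext {
    double low, high;
    uint8_t low_u8, high_u8;
} DemoContext;

/* Validate the options and pre-compute 8-bit thresholds from them. */
static int demo_init(DemoContext *s)
{
    if (s->low > s->high)
        return -EINVAL;   /* a real filter would return AVERROR(EINVAL) */

    /* Scale to 0..255 with rounding to nearest. */
    s->low_u8  = (uint8_t)(s->low  * 255.0 + 0.5);
    s->high_u8 = (uint8_t)(s->high * 255.0 + 0.5);
    return 0;
}
```

Nothing here depends on the input frames, which is exactly why it can live in
init(): anything that needs the pixel format or the dimensions has to wait for
a later callback.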
| 145 | |
| 146 | uninit() |
| 147 | ~~~~~~~~ |
| 148 | |
| 149 | Similarly, there is the uninit() callback, doing what the name suggests. Free |
| 150 | everything you allocated here. |
| 151 | |
| 152 | query_formats() |
| 153 | ~~~~~~~~~~~~~~~ |
| 154 | |
| 155 | This follows init() and is used for the format negotiation, basically |
| 156 | where you say what pixel format(s) (gray, rgb 32, yuv 4:2:0, ...) you accept |
| 157 | for your inputs, and what you can output. All pixel formats are defined in |
| 158 | libavutil/pixfmt.h. If you don't change the pixel format between the input and |
| 159 | the output, you just have to define a pixel formats array and call |
| 160 | ff_set_common_formats(). For more complex negotiation, you can refer to other |
| 161 | filters such as vf_scale. |
| 162 | |
| 163 | config_props() |
| 164 | ~~~~~~~~~~~~~~ |
| 165 | |
| 166 | This callback is not necessary, but you will probably have one or more |
| 167 | config_props() anyway. It's not a callback for the filter itself but for its |
| 168 | inputs or outputs (they're called "pads" - AVFilterPad - in libavfilter's |
| 169 | lexicon). |
| 170 | |
| 171 | Inside the input config_props(), you are at a point where you know which pixel |
| 172 | format has been picked after query_formats(), and more information such as the |
| 173 | video width and height (inlink->{w,h}). So if you need to update your internal |
| 174 | context state depending on your input, you can do it here. In edgedetect you can |
| 175 | see that this callback is used to allocate buffers depending on this |
| 176 | information. They will be destroyed in uninit(). |
| 177 | |
| 178 | Inside the output config_props(), you can define what you want to change in the |
| 179 | output. Typically, if your filter is going to double the size of the video, you |
| 180 | will update outlink->w and outlink->h. |
| 181 | |
| 182 | filter_frame() |
| 183 | ~~~~~~~~~~~~~~ |
| 184 | |
| 185 | This is the callback you have been waiting for since the beginning: it is where you |
| 186 | process the received frames. Along with the frame, you get the input link |
| 187 | the frame comes from. |
| 188 | |
| 189 | static int filter_frame(AVFilterLink *inlink, AVFrame *in) { ... } |
| 190 | |
| 191 | You can get the filter context through that input link: |
| 192 | |
| 193 | AVFilterContext *ctx = inlink->dst; |
| 194 | |
| 195 | Then access your internal state context: |
| 196 | |
| 197 | FoobarContext *foobar = ctx->priv; |
| 198 | |
| 199 | And also the output link where you will send your frame when you are done: |
| 200 | |
| 201 | AVFilterLink *outlink = ctx->outputs[0]; |
| 202 | |
| 203 | Here, we are picking the first output. You can have several, but in our case we |
| 204 | only have one since we are in a 1:1 input-output situation. |
| 205 | |
| 206 | If you want to define a simple pass-through filter, you can just do: |
| 207 | |
| 208 | return ff_filter_frame(outlink, in); |
| 209 | |
| 210 | But of course, you probably want to change the data of that frame. |
| 211 | |
| 212 | This can be done by accessing frame->data[] and frame->linesize[]. Important |
| 213 | note here: the width does NOT match the linesize. The linesize is always |
| 214 | greater than or equal to the width. The padding created should not be changed or |
| 215 | even read. Typically, keep in mind that a previous filter in your chain might |
| 216 | have altered the frame dimensions but not the linesize. Imagine a crop filter |
| 217 | that halves the video size: the linesizes won't be changed, just the width. |
| 218 | |
| 219 | <-------------- linesize ------------------------> |
| 220 | +-------------------------------+----------------+ ^ |
| 221 | |                               |                | | |
| 222 | |                               |                | | |
| 223 | |           picture             |    padding     | | height |
| 224 | |                               |                | | |
| 225 | |                               |                | | |
| 226 | +-------------------------------+----------------+ v |
| 227 | <----------- width -------------> |
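
In code, this means stepping from one row to the next by linesize while only
touching width pixels in each row. The sketch below assumes a single 8-bit
plane (one byte per pixel, as in a gray frame) and a positive linesize;
demo_invert_plane is a made-up helper, not a libavfilter function.

```c
#include <assert.h>
#include <stdint.h>

/* Invert every visible pixel of one 8-bit plane, leaving the padding alone.
 * Assumes one byte per pixel and a positive linesize. */
static void demo_invert_plane(uint8_t *data, int linesize,
                              int width, int height)
{
    assert(linesize >= width);

    for (int y = 0; y < height; y++) {
        uint8_t *row = data + y * linesize;  /* rows are linesize apart */
        for (int x = 0; x < width; x++)      /* only width pixels are ours */
            row[x] = 255 - row[x];
    }
}
```

A classic mistake is to write data[y * width + x], which only works by
accident when there happens to be no padding.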
| 228 | |
| 229 | Before modifying the "in" frame, you have to make sure it is writable, or get a |
| 230 | new one. Multiple scenarios are possible here depending on the kind of |
| 231 | processing you are doing. |
| 232 | |
| 233 | Let's say you want to change one pixel depending on multiple pixels (typically |
| 234 | the surrounding ones) of the input. In that case, you can't process the input |
| 235 | in place, so you will need to allocate a new frame, with the same |
| 236 | properties as the input one, and send that new frame to the next filter: |
| 237 | |
| 238 | AVFrame *out = ff_get_video_buffer(outlink, outlink->w, outlink->h); |
| 239 | if (!out) { |
| 240 |     av_frame_free(&in); |
| 241 |     return AVERROR(ENOMEM); |
| 242 | } |
| 243 | av_frame_copy_props(out, in); |
| 244 | |
| 245 | // out->data[...] = foobar(in->data[...]) |
| 246 | |
| 247 | av_frame_free(&in); |
| 248 | return ff_filter_frame(outlink, out); |
| 249 | |
| 250 | In-place processing |
| 251 | ~~~~~~~~~~~~~~~~~~~ |
| 252 | |
| 253 | If you can just alter the input frame, you probably just want to do that |
| 254 | instead: |
| 255 | |
| 256 | av_frame_make_writable(in); |
| 257 | // in->data[...] = foobar(in->data[...]) |
| 258 | return ff_filter_frame(outlink, in); |
| 259 | |
| 260 | You may wonder why a frame might not be writable. The answer is that for |
| 261 | example a previous filter might still own the frame data: imagine a filter |
| 262 | prior to yours in the filtergraph that needs to cache the frame. You must not |
| 263 | alter that frame, otherwise it will make that previous filter buggy. This is |
| 264 | where av_frame_make_writable() helps (it won't have any effect if the frame |
| 265 | already is writable). |
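
To picture what "not writable" means, here is a self-contained model of a
frame whose pixel buffer is reference-counted. It is only a sketch of the
concept: DemoBuffer, DemoFrame and the demo_* functions are made-up stand-ins,
not the real AVFrame/AVBufferRef code. Note that the real
av_frame_make_writable() also returns a negative error code on failure, so its
return value is worth checking in actual filter code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted pixel buffer. */
typedef struct DemoBuffer {
    int refcount;
    size_t size;
    uint8_t *data;
} DemoBuffer;

/* Hypothetical frame: just a handle on a (possibly shared) buffer. */
typedef struct DemoFrame {
    DemoBuffer *buf;
} DemoFrame;

static DemoBuffer *demo_buffer_alloc(size_t size)
{
    DemoBuffer *b = malloc(sizeof(*b));
    if (!b)
        return NULL;
    b->data = calloc(1, size);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->refcount = 1;
    b->size = size;
    return b;
}

static void demo_buffer_unref(DemoBuffer *b)
{
    if (b && --b->refcount == 0) {
        free(b->data);
        free(b);
    }
}

/* A frame is writable when nobody else holds a reference to its buffer. */
static int demo_frame_is_writable(const DemoFrame *f)
{
    return f->buf->refcount == 1;
}

/* Give the frame a private copy if the buffer is shared.
 * Returns 0 on success, -1 on allocation failure (frame left untouched). */
static int demo_frame_make_writable(DemoFrame *f)
{
    DemoBuffer *copy;

    if (demo_frame_is_writable(f))
        return 0;                        /* nothing to do */
    copy = demo_buffer_alloc(f->buf->size);
    if (!copy)
        return -1;
    memcpy(copy->data, f->buf->data, f->buf->size);
    demo_buffer_unref(f->buf);           /* drop our share of the old buffer */
    f->buf = copy;
    return 0;
}
```

In this model, the filter that cached the frame keeps its reference on the old
buffer, so whatever you write into your private copy cannot affect it.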
| 266 | |
| 267 | The problem with using av_frame_make_writable() is that in the worst case it |
| 268 | will copy the whole input frame before you change it all over again with your |
| 269 | filter: if the frame is not writable, av_frame_make_writable() will allocate |
| 270 | new buffers, and copy the input frame data. You don't want that, and you can |
| 271 | avoid it by just allocating a new buffer if necessary, and processing from in to |
| 272 | out in your filter, saving the memcpy. Generally, this is done following this |
| 273 | scheme: |
| 274 | |
| 275 | int direct = 0; |
| 276 | AVFrame *out; |
| 277 | |
| 278 | if (av_frame_is_writable(in)) { |
| 279 |     direct = 1; |
| 280 |     out = in; |
| 281 | } else { |
| 282 |     out = ff_get_video_buffer(outlink, outlink->w, outlink->h); |
| 283 |     if (!out) { |
| 284 |         av_frame_free(&in); |
| 285 |         return AVERROR(ENOMEM); |
| 286 |     } |
| 287 |     av_frame_copy_props(out, in); |
| 288 | } |
| 289 | |
| 290 | // out->data[...] = foobar(in->data[...]) |
| 291 | |
| 292 | if (!direct) |
| 293 |     av_frame_free(&in); |
| 294 | return ff_filter_frame(outlink, out); |
| 295 | |
| 296 | Of course, this will only work if you can do in-place processing. To test whether |
| 297 | your filter handles the permissions well, you can use the perms filter. For |
| 298 | example with: |
| 299 | |
| 300 | -vf perms=random,foobar |
| 301 | |
| 302 | Make sure no automatic pixel conversion is inserted between perms and foobar, |
| 303 | otherwise the frame permissions might change again and the test will be |
| 304 | meaningless: add av_log(0,0,"direct=%d\n",direct) in your code to check that. |
| 305 | You can avoid the issue with something like: |
| 306 | |
| 307 | -vf format=rgb24,perms=random,foobar |
| 308 | |
| 309 | ...assuming your filter accepts rgb24 of course. This will make sure the |
| 310 | necessary conversion is inserted before the perms filter. |
| 311 | |
| 312 | Timeline |
| 313 | ~~~~~~~~ |
| 314 | |
| 315 | Timeline support |
| 316 | (http://ffmpeg.org/ffmpeg-filters.html#Timeline-editing) is often an easy |
| 317 | feature to add. In the simplest case, you just have to add |
| 318 | AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC to the AVFilter.flags. You can typically |
| 319 | do this when your filter does not need to save the previous context frames, or |
| 320 | basically if your filter just alters whatever goes in and doesn't need |
| 321 | previous/future information. See for instance commit 86cb986ce that adds |
| 322 | timeline support to the fieldorder filter. |
| 323 | |
| 324 | In some cases, you might need to reset your context somehow. This is handled by |
| 325 | the AVFILTER_FLAG_SUPPORT_TIMELINE_INTERNAL flag which is used if the filter |
| 326 | must not process the frames but still wants to keep track of the frames going |
| 327 | through (to keep them in cache for when it's enabled again). See for example |
| 328 | commit 69d72140a that adds timeline support to the phase filter. |
| 329 | |
| 330 | Threading |
| 331 | ~~~~~~~~~ |
| 332 | |
| 333 | libavfilter does not yet support frame threading, but you can add slice |
| 334 | threading to your filters. |
| 335 | |
| 336 | Let's say the foobar filter has the following frame processing function: |
| 337 | |
| 338 | dst = out->data[0]; |
| 339 | src = in ->data[0]; |
| 340 | |
| 341 | for (y = 0; y < inlink->h; y++) { |
| 342 |     for (x = 0; x < inlink->w; x++) |
| 343 |         dst[x] = foobar(src[x]); |
| 344 |     dst += out->linesize[0]; |
| 345 |     src += in ->linesize[0]; |
| 346 | } |
| 347 | |
| 348 | The first thing is to make this function work on slices. The new code will |
| 349 | look like this: |
| 350 | |
| 351 | for (y = slice_start; y < slice_end; y++) { |
| 352 |     for (x = 0; x < inlink->w; x++) |
| 353 |         dst[x] = foobar(src[x]); |
| 354 |     dst += out->linesize[0]; |
| 355 |     src += in ->linesize[0]; |
| 356 | } |
| 357 | |
| 358 | The source and destination pointers, and slice_start/slice_end will be defined |
| 359 | according to the number of jobs. Generally, it looks like this: |
| 360 | |
| 361 | const int slice_start = (in->height *  jobnr   ) / nb_jobs; |
| 362 | const int slice_end   = (in->height * (jobnr+1)) / nb_jobs; |
| 363 | uint8_t       *dst = out->data[0] + slice_start * out->linesize[0]; |
| 364 | const uint8_t *src =  in->data[0] + slice_start *  in->linesize[0]; |
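
It may not be obvious that this integer arithmetic covers every row exactly
once when the height is not a multiple of the number of jobs. The following
self-contained sketch isolates the computation so it can be checked;
DemoSlice and demo_slice_bounds are made-up names for the example.

```c
#include <assert.h>

/* Row range [start, end) handled by one job. */
typedef struct DemoSlice {
    int start, end;
} DemoSlice;

/* Same formula as above: job number jobnr out of nb_jobs gets the rows
 * from height*jobnr/nb_jobs up to (but excluding) height*(jobnr+1)/nb_jobs. */
static DemoSlice demo_slice_bounds(int height, int jobnr, int nb_jobs)
{
    DemoSlice s;

    assert(nb_jobs > 0 && jobnr >= 0 && jobnr < nb_jobs);
    s.start = (height *  jobnr     ) / nb_jobs;
    s.end   = (height * (jobnr + 1)) / nb_jobs;
    return s;
}
```

With a height of 10 and 3 jobs, the slices are rows 0-2, 3-5 and 6-9: the end
of one slice is by construction the start of the next, so there is neither a
gap nor an overlap, and the leftover rows simply make some slices one row
taller.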
| 365 | |
| 366 | This new code will be isolated in a new filter_slice(): |
| 367 | |
| 368 | static int filter_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs) { ... } |
| 369 | |
| 370 | Note that we need our input and output frames to define slice_{start,end} and |
| 371 | dst/src, which are not available in that callback. They will be transmitted |
| 372 | through the opaque void *arg. You have to define a structure which contains |
| 373 | everything you need: |
| 374 | |
| 375 | typedef struct ThreadData { |
| 376 |     AVFrame *in, *out; |
| 377 | } ThreadData; |
| 378 | |
| 379 | If you need some more information from your local context, put it here. |
| 380 | |
| 381 | In your filter_slice function, you access it like this: |
| 382 | |
| 383 | const ThreadData *td = arg; |
| 384 | |
| 385 | Then in your filter_frame() callback, you need to call the threading |
| 386 | distributor with something like this: |
| 387 | |
| 388 | ThreadData td; |
| 389 | |
| 390 | // ... |
| 391 | |
| 392 | td.in = in; |
| 393 | td.out = out; |
| 394 | ctx->internal->execute(ctx, filter_slice, &td, NULL, FFMIN(outlink->h, ctx->graph->nb_threads)); |
| 395 | |
| 396 | // ... |
| 397 | |
| 398 | return ff_filter_frame(outlink, out); |
| 399 | |
| 400 | Last step is to add AVFILTER_FLAG_SLICE_THREADS flag to AVFilter.flags. |
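
To see all the pieces working together without building FFmpeg, here is a
self-contained model of the scheme. It is a sketch under simplifying
assumptions: DemoFrame, DemoThreadData, demo_filter_slice and demo_execute are
made-up names, frames are a single 8-bit plane, the "filter" just inverts
pixels, and demo_execute runs the jobs one after the other instead of
dispatching them to threads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-plane 8-bit frame. */
typedef struct DemoFrame {
    uint8_t *data;
    int linesize, width, height;
} DemoFrame;

/* Everything a slice job needs, passed through the opaque pointer. */
typedef struct DemoThreadData {
    const DemoFrame *in;
    DemoFrame *out;
} DemoThreadData;

/* Process only the rows belonging to job number jobnr. */
static int demo_filter_slice(void *arg, int jobnr, int nb_jobs)
{
    const DemoThreadData *td = arg;
    const int slice_start = (td->in->height *  jobnr     ) / nb_jobs;
    const int slice_end   = (td->in->height * (jobnr + 1)) / nb_jobs;
    const uint8_t *src = td->in->data  + slice_start * td->in->linesize;
    uint8_t       *dst = td->out->data + slice_start * td->out->linesize;

    for (int y = slice_start; y < slice_end; y++) {
        for (int x = 0; x < td->in->width; x++)
            dst[x] = 255 - src[x];       /* placeholder for foobar() */
        dst += td->out->linesize;
        src += td->in->linesize;
    }
    return 0;
}

/* Sequential stand-in for the threaded executor: calls the slice function
 * once per job and stops at the first error. */
static int demo_execute(int (*func)(void *arg, int jobnr, int nb_jobs),
                        void *arg, int nb_jobs)
{
    for (int jobnr = 0; jobnr < nb_jobs; jobnr++) {
        int ret = func(arg, jobnr, nb_jobs);
        if (ret < 0)
            return ret;
    }
    return 0;
}
```

Since each job only writes rows inside its own slice, the jobs never touch the
same output bytes, which is what makes it safe to run them in parallel in the
real thing.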
| 401 | |
| 402 | For more examples of slice threading additions, you can try running git log -p |
| 403 | --grep 'slice threading' libavfilter/ |
| 404 | |
| 405 | Finalization |
| 406 | ~~~~~~~~~~~~ |
| 407 | |
| 408 | When your awesome filter is finished, you have a few more steps before you're |
| 409 | done: |
| 410 | |
| 411 | - write its documentation in doc/filters.texi, and test the output with make |
| 412 |   doc/ffmpeg-filters.html. |
| 413 | - add a FATE test, generally by adding an entry in |
| 414 |   tests/fate/filter-video.mak, and run make fate-filter-foobar GEN=1 to |
| 415 |   generate the data. |
| 416 | - add an entry in the Changelog |
| 417 | - edit libavfilter/version.h and increase LIBAVFILTER_VERSION_MINOR by one |
| 418 |   (and reset LIBAVFILTER_VERSION_MICRO to 100) |
| 419 | - git add ... && git commit -m "avfilter: add foobar filter." && git format-patch -1 |
| 420 | |
| 421 | When all of this is done, you can submit your patch to the ffmpeg-devel |
| 422 | mailing-list for review. If you need any help, feel free to join our IRC |
| 423 | channel, #ffmpeg-devel on irc.freenode.net. |