ffmpeg/doc/writing_filters.txt

This document is a tutorial/initiation for writing simple filters in
libavfilter.

Foreword: just like everything else in FFmpeg, libavfilter is monolithic, which
means that it is highly recommended that you submit your filters to the FFmpeg
development mailing-list and make sure they are applied. Otherwise, your filter
is likely to have a very short lifetime due to more or less regular internal
API changes, and limited distribution, review, and testing.

Bootstrap
=========

Let's say you want to write a new simple video filter called "foobar" which
takes one frame as input, changes the pixels in whatever fashion you fancy, and
outputs the modified frame. The simplest way of doing this is to start from a
similar filter. We'll pick edgedetect, but any other should do. You can look
for others using the `./ffmpeg -v 0 -filters|grep ' V->V '` command.

 - cp libavfilter/vf_{edgedetect,foobar}.c
 - sed -i s/edgedetect/foobar/g libavfilter/vf_foobar.c
 - sed -i s/EdgeDetect/Foobar/g libavfilter/vf_foobar.c
 - edit libavfilter/Makefile, and add an entry for "foobar" following the
   pattern of the other filters.
 - edit libavfilter/allfilters.c, and add an entry for "foobar" following the
   pattern of the other filters.
 - ./configure ...
 - make -j<whatever> ffmpeg
 - ./ffmpeg -i tests/lena.pnm -vf foobar foobar.png

If everything went right, you should get a foobar.png with Lena edge-detected.

That's it, your new playground is ready.

Some little details about what's going on:
libavfilter/allfilters.c:avfilter_register_all() is called at runtime to create
a list of the available filters, but it's important to know that this file is
also parsed by the configure script, which in turn will define variables for
the build system and the C code:

    --- after running configure ---

    $ grep FOOBAR config.mak
    CONFIG_FOOBAR_FILTER=yes
    $ grep FOOBAR config.h
    #define CONFIG_FOOBAR_FILTER 1

CONFIG_FOOBAR_FILTER=yes from config.mak is later used to enable the filter in
libavfilter/Makefile, and CONFIG_FOOBAR_FILTER=1 from config.h is used for
registering the filter in libavfilter/allfilters.c.

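To picture how a value from config.h can switch a registration on or off, here
is a small standalone sketch. It does not reproduce the real allfilters.c: the
REGISTER_FILTER macro, the registry array, register_filter(), register_all()
and is_registered() are hypothetical stand-ins, and the CONFIG_*_FILTER values
are hard-coded where configure would normally generate them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: in a real build these come from config.h, written by configure. */
#define CONFIG_EDGEDETECT_FILTER 1
#define CONFIG_FOOBAR_FILTER     1
#define CONFIG_DISABLED_FILTER   0   /* hypothetical filter left out of the build */

#define MAX_FILTERS 16

static const char *registry[MAX_FILTERS];   /* hypothetical list of filter names */
static size_t nb_registered;

static void register_filter(const char *name)
{
    assert(nb_registered < MAX_FILTERS);
    registry[nb_registered++] = name;
}

/* The config value is a constant 0 or 1, so a disabled filter is never added. */
#define REGISTER_FILTER(X, x)            \
    do {                                 \
        if (CONFIG_##X##_FILTER)         \
            register_filter(#x);         \
    } while (0)

static void register_all(void)
{
    REGISTER_FILTER(EDGEDETECT, edgedetect);
    REGISTER_FILTER(FOOBAR,     foobar);
    REGISTER_FILTER(DISABLED,   disabled);
}

/* Returns 1 if a filter with that name was registered, 0 otherwise. */
static int is_registered(const char *name)
{
    for (size_t i = 0; i < nb_registered; i++)
        if (!strcmp(registry[i], name))
            return 1;
    return 0;
}
```

After one call to register_all(), "edgedetect" and "foobar" are in the list and
"disabled" is not, which is the effect a CONFIG_FOOBAR_FILTER of 0 or 1 has in
this simplified model.
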
Filter code layout
==================

You now need some theory about the general code layout of a filter. Open your
libavfilter/vf_foobar.c. This section details the important parts of the
code you need to understand before messing with it.

Copyright
---------

The first chunk is the copyright notice. Most filters are LGPL, and we are
assuming vf_foobar is as well. We are also assuming vf_foobar is not an edge
detector filter, so you can update the boilerplate with your credits.

Doxy
----

The next chunk is the Doxygen comment about the file. See
http://ffmpeg.org/doxygen/trunk/. Detail here what the filter is and does, and
add some references if you feel like it.

Context
-------

Skip the headers and scroll down to the definition of FoobarContext. This is
your local state context. It is already filled with 0 when you get it, so do
not worry about uninitialized reads from this context. This is where you put
all the "global" information you need, typically the variables storing the user
options. You'll notice the first field "const AVClass *class"; it's the only
field you need to keep, assuming you have a context. There is some magic around
this field that you don't need to care about; just leave it in first position
for now.

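If you are curious why the class pointer has to come first, the following
standalone sketch gives the general idea. It is not libavfilter code: ToyClass
is a hypothetical stand-in for AVClass, the low, high and tmpbuf fields are
made up, and alloc_context() only imitates the zero-filled allocation that the
framework does for you.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for AVClass: just enough to carry a name. */
typedef struct ToyClass {
    const char *class_name;
} ToyClass;

typedef struct FoobarContext {
    const ToyClass *class;   /* must stay the first field */
    double low, high;        /* hypothetical user options */
    unsigned char *tmpbuf;   /* hypothetical working buffer */
} FoobarContext;

static const ToyClass foobar_class = { "foobar" };

/* Imitates the framework: zero-filled allocation, then the class is set. */
static FoobarContext *alloc_context(void)
{
    FoobarContext *s = calloc(1, sizeof(*s));
    if (s)
        s->class = &foobar_class;
    return s;
}

/* Generic code that knows nothing about FoobarContext can still find the
 * class, because a pointer to a struct also points to its first member. */
static const char *context_class_name(const void *ctx)
{
    const ToyClass *const *pclass = ctx;
    assert(*pclass);
    return (*pclass)->class_name;
}
```

Here context_class_name() receives an opaque pointer and still reaches the
class; that only works as long as the class pointer is the first field.
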
Options
-------

Then comes the options array. This is what defines the user-accessible
options. For example, -vf foobar=mode=colormix:high=0.4:low=0.1. Most options
have the following pattern:
  name, description, offset, type, default value, minimum value, maximum value, flags

 - name is the option name; keep it simple and lowercase
 - description is short, in lowercase, without a period, and describes what
   the option does, for example "set the foo of the bar"
 - offset is the offset of the field in your local context, see the OFFSET()
   macro; the option parser will use that information to fill the fields
   according to the user input
 - type is any of the AV_OPT_TYPE_* values defined in libavutil/opt.h
 - default value is a union where you pick the appropriate member: "{.dbl=0.3}",
   "{.i64=0x234}", "{.str=NULL}", ...
 - min and max values define the range of available values, inclusive
 - flags are AVOption generic flags. See the AV_OPT_FLAG_* definitions

When in doubt, just look at the other AVOption definitions all around the
codebase; there are tons of examples.

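The role of the offset can be illustrated with a toy option table. This is a
standalone sketch, not the AVOption API: ToyOption, ToyType, store(),
set_defaults() and set_option() are hypothetical, the default is a plain double
instead of a union, and the flags field is left out. Only the idea is the same:
the parser knows nothing about FoobarContext and writes through the offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct FoobarContext {
    double low, high;   /* hypothetical options */
    int    mode;
} FoobarContext;

enum ToyType { TOY_TYPE_INT, TOY_TYPE_DOUBLE };

/* Simplified stand-in for AVOption: name, help, offset, type, default, min, max. */
typedef struct ToyOption {
    const char  *name;
    const char  *help;
    size_t       offset;
    enum ToyType type;
    double       def, min, max;
} ToyOption;

#define OFFSET(x) offsetof(FoobarContext, x)

static const ToyOption foobar_options[] = {
    { "low",  "set low threshold",  OFFSET(low),  TOY_TYPE_DOUBLE, 0.1, 0, 1 },
    { "high", "set high threshold", OFFSET(high), TOY_TYPE_DOUBLE, 0.4, 0, 1 },
    { "mode", "set mode",           OFFSET(mode), TOY_TYPE_INT,    0,   0, 2 },
    { NULL }
};

/* Writes v into the field located o->offset bytes into the context. */
static void store(void *ctx, const ToyOption *o, double v)
{
    void *field = (unsigned char *)ctx + o->offset;
    assert(ctx && o);
    if (o->type == TOY_TYPE_INT)
        *(int *)field = (int)v;
    else
        *(double *)field = v;
}

static void set_defaults(void *ctx, const ToyOption *opts)
{
    for (const ToyOption *o = opts; o->name; o++)
        store(ctx, o, o->def);
}

/* Returns 0 on success, -1 for an unknown option or an out-of-range value. */
static int set_option(void *ctx, const ToyOption *opts, const char *name, double v)
{
    for (const ToyOption *o = opts; o->name; o++) {
        if (strcmp(o->name, name))
            continue;
        if (v < o->min || v > o->max)
            return -1;
        store(ctx, o, v);
        return 0;
    }
    return -1;
}
```

A call such as set_option(&s, foobar_options, "high", 0.75) ends up writing
s.high without the parser ever naming that field, while a value outside the
inclusive [min, max] range is rejected and leaves the field untouched.
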
Class
-----

AVFILTER_DEFINE_CLASS(foobar) will define a unique foobar_class with some kind
of signature referencing the options, etc. which will be referenced in the
definition of the AVFilter.

Filter definition
-----------------

At the end of the file, you will find foobar_inputs, foobar_outputs and
the AVFilter ff_vf_foobar. Don't forget to update the AVFilter.description with
a description of what the filter does, starting with a capitalized letter and
ending with a period. You'd better drop the AVFilter.flags entry for now, and
re-add it later depending on the capabilities of your filter.

Callbacks
---------

Let's now study the common callbacks. Before going into details, note that all
these callbacks are explained in detail in libavfilter/avfilter.h, so when in
doubt, refer to the doxy in that file.

init()
~~~~~~

The first one to be called is init(). It's flagged as cold because it is not
called often. Look for "cold" on
http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html for more
information.

As the name suggests, init() is where you may initialize and allocate
your buffers, pre-compute your data, etc. Note that at this point, your local
context already has the user options initialized, but you still have no
clue about the kind of data input you will get, so this function is often
mainly used to sanitize the user options.

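As an idea of what sanitizing the user options can look like, here is a
minimal standalone sketch. It assumes a hypothetical pair of low/high options
(borrowed from the command-line example in the Options section) that only make
sense when low does not exceed high. The FoobarContext layout and foobar_init()
signature are made up for the illustration; a real init() receives an
AVFilterContext and would typically log a message and return an AVERROR code.

```c
#include <assert.h>
#include <errno.h>

typedef struct FoobarContext {
    double low, high;   /* hypothetical options, already filled by the parser */
} FoobarContext;

/* Returns 0 if the options are consistent, a negative errno value otherwise. */
static int foobar_init(FoobarContext *s)
{
    assert(s);
    if (s->low > s->high)
        return -EINVAL;   /* reject early, before any frame is processed */
    return 0;
}
```

Rejecting a bad combination here means the later callbacks can rely on the
options being consistent.
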
Some init()s will also define the number of inputs or outputs dynamically
according to the user options. A good example of this is the split filter, but
we won't cover this here since vf_foobar is just a simple 1:1 filter.

uninit()
~~~~~~~~

Similarly, there is the uninit() callback, doing what the name suggests. This
is where you free everything you allocated.

query_formats()
~~~~~~~~~~~~~~~

This follows init() and is used for the format negotiation, basically
where you say what pixel format(s) (gray, rgb 32, yuv 4:2:0, ...) you accept
for your inputs, and what you can output. All pixel formats are defined in
libavutil/pixfmt.h. If you don't change the pixel format between the input and
the output, you just have to define a pixel formats array and call
ff_set_common_formats(). For more complex negotiation, you can refer to other
filters such as vf_scale.

config_props()
~~~~~~~~~~~~~~

This callback is not necessary, but you will probably have one or more
config_props() anyway. It's not a callback for the filter itself but for its
inputs or outputs (they're called "pads" - AVFilterPad - in libavfilter's
lexicon).

Inside the input config_props(), you are at a point where you know which pixel
format has been picked after query_formats(), and more information such as the
video width and height (inlink->{w,h}). So if you need to update your internal
context state depending on your input you can do it here. In edgedetect you can
see that this callback is used to allocate buffers depending on this
information. They will be destroyed in uninit().

Inside the output config_props(), you can define what you want to change in the
output. Typically, if your filter is going to double the size of the video, you
will update outlink->w and outlink->h.

filter_frame()
~~~~~~~~~~~~~~

This is the callback you have been waiting for since the beginning: it is where
you process the received frames. Along with the frame, you get the input link
the frame comes from.

    static int filter_frame(AVFilterLink *inlink, AVFrame *in) { ... }

You can get the filter context through that input link:

    AVFilterContext *ctx = inlink->dst;

Then access your internal state context:

    FoobarContext *foobar = ctx->priv;

And also the output link where you will send your frame when you are done:

    AVFilterLink *outlink = ctx->outputs[0];

Here, we are picking the first output. You can have several, but in our case we
only have one since we are in a 1:1 input-output situation.

If you want to define a simple pass-through filter, you can just do:

    return ff_filter_frame(outlink, in);

But of course, you probably want to change the data of that frame.

This can be done by accessing frame->data[] and frame->linesize[].  Important
note here: the width does NOT match the linesize. The linesize is always
greater than or equal to the width. The padding created should not be changed
or even read. Typically, keep in mind that a previous filter in your chain
might have altered the frame dimensions but not the linesize. Imagine a crop
filter that halves the video size: the linesizes won't be changed, just the
width.

    <-------------- linesize ------------------------>
    +-------------------------------+----------------+ ^
    |                               |                | |
    |                               |                | |
    |           picture             |    padding     | | height
    |                               |                | |
    |                               |                | |
    +-------------------------------+----------------+ v
    <----------- width ------------->

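In code, the picture above translates into stepping from one row to the next
by linesize while only touching width samples per row. The following
standalone sketch assumes the simplest possible case, a single plane with one
byte per pixel; invert_plane() and the inversion itself are hypothetical and
only there to have something to compute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inverts an 8-bit single-component plane in place. Only `width` bytes are
 * touched on each row; the padding up to `linesize` is neither read nor
 * written. */
static void invert_plane(uint8_t *data, int linesize, int width, int height)
{
    assert(linesize >= width);
    for (int y = 0; y < height; y++) {
        uint8_t *row = data + (ptrdiff_t)y * linesize;
        for (int x = 0; x < width; x++)
            row[x] = 255 - row[x];
    }
}
```

With a real frame you would pass something like frame->data[0] and
frame->linesize[0], and take the width and height from the link or the frame
rather than deriving them from the linesize.
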
Before modifying the "in" frame, you have to make sure it is writable, or get a
new one. Multiple scenarios are possible here depending on the kind of
processing you are doing.

Let's say you want to change one pixel depending on multiple pixels (typically
the surrounding ones) of the input. In that case, you can't do an in-place
processing of the input so you will need to allocate a new frame, with the same
properties as the input one, and send that new frame to the next filter:

    AVFrame *out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
    if (!out) {
        av_frame_free(&in);
        return AVERROR(ENOMEM);
    }
    av_frame_copy_props(out, in);

    // out->data[...] = foobar(in->data[...])

    av_frame_free(&in);
    return ff_filter_frame(outlink, out);

In-place processing
~~~~~~~~~~~~~~~~~~~

If you can just alter the input frame, you probably just want to do that
instead:

    av_frame_make_writable(in);
    // in->data[...] = foobar(in->data[...])
    return ff_filter_frame(outlink, in);

Error handling is omitted here for brevity: av_frame_make_writable() returns a
negative error code on failure, which real code should check.

You may wonder why a frame might not be writable. The answer is that, for
example, a previous filter might still own the frame data: imagine a filter
prior to yours in the filtergraph that needs to cache the frame. You must not
alter that frame, otherwise it will make that previous filter buggy. This is
where av_frame_make_writable() helps (it won't have any effect if the frame
is already writable).

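The ownership question can be pictured with a toy reference-counting model.
This is a standalone sketch, not libavfilter code: ToyBuffer, ToyFrame and all
the toy_* helpers are hypothetical. It relies on the rule documented for
av_frame_is_writable(), namely that a frame is writable only when it holds the
sole reference to its data, and it models make-writable as a copy into a
private buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct ToyBuffer {   /* hypothetical reference-counted data */
    int      refcount;
    size_t   size;
    uint8_t *data;
} ToyBuffer;

typedef struct ToyFrame {    /* hypothetical stand-in for AVFrame */
    ToyBuffer *buf;
} ToyFrame;

static ToyFrame *toy_frame_alloc(size_t size)
{
    ToyFrame  *f = malloc(sizeof(*f));
    ToyBuffer *b = malloc(sizeof(*b));
    uint8_t   *d = calloc(size ? size : 1, 1);
    if (!f || !b || !d) {
        free(f); free(b); free(d);
        return NULL;
    }
    b->refcount = 1; b->size = size; b->data = d;
    f->buf = b;
    return f;
}

/* New frame sharing the same data, like a filter caching its input. */
static ToyFrame *toy_frame_ref(ToyFrame *src)
{
    ToyFrame *f = malloc(sizeof(*f));
    if (!f)
        return NULL;
    f->buf = src->buf;
    f->buf->refcount++;
    return f;
}

static void toy_frame_free(ToyFrame **pf)
{
    ToyFrame *f = *pf;
    if (!f)
        return;
    if (--f->buf->refcount == 0) {
        free(f->buf->data);
        free(f->buf);
    }
    free(f);
    *pf = NULL;
}

static int toy_frame_is_writable(const ToyFrame *f)
{
    return f->buf->refcount == 1;
}

/* No-op if already writable, otherwise switch to a private copy of the data.
 * Returns 0 on success, -1 on allocation failure. */
static int toy_frame_make_writable(ToyFrame *f)
{
    ToyBuffer *b;
    uint8_t *d;

    assert(f && f->buf);
    if (toy_frame_is_writable(f))
        return 0;
    b = malloc(sizeof(*b));
    d = malloc(f->buf->size ? f->buf->size : 1);
    if (!b || !d) {
        free(b); free(d);
        return -1;
    }
    memcpy(d, f->buf->data, f->buf->size);
    b->refcount = 1; b->size = f->buf->size; b->data = d;
    f->buf->refcount--;   /* was > 1, so the other owners keep the old data */
    f->buf = b;
    return 0;
}
```

Once another frame references the same buffer, neither of them is writable;
after toy_frame_make_writable() the caller owns a private copy and the cached
frame keeps the untouched original.
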
The problem with using av_frame_make_writable() is that in the worst case it
will copy the whole input frame before you change it all over again with your
filter: if the frame is not writable, av_frame_make_writable() will allocate
new buffers, and copy the input frame data. You don't want that, and you can
avoid it by just allocating a new buffer if necessary, and processing from in
to out in your filter, saving the memcpy. Generally, this is done following
this scheme:

    int direct = 0;
    AVFrame *out;

    if (av_frame_is_writable(in)) {
        direct = 1;
        out = in;
    } else {
        out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
        if (!out) {
            av_frame_free(&in);
            return AVERROR(ENOMEM);
        }
        av_frame_copy_props(out, in);
    }

    // out->data[...] = foobar(in->data[...])

    if (!direct)
        av_frame_free(&in);
    return ff_filter_frame(outlink, out);

Of course, this will only work if you can do in-place processing. To test
whether your filter handles permissions correctly, you can use the perms
filter. For example with:

    -vf perms=random,foobar

Make sure no automatic pixel conversion is inserted between perms and foobar,
otherwise the frame permissions might change again and the test will be
meaningless: add av_log(0,0,"direct=%d\n",direct) in your code to check that.
You can avoid the issue with something like:

    -vf format=rgb24,perms=random,foobar

...assuming your filter accepts rgb24 of course. This will make sure the
necessary conversion is inserted before the perms filter.

Timeline
~~~~~~~~

Adding timeline support
(http://ffmpeg.org/ffmpeg-filters.html#Timeline-editing) is often easy. In the
simplest case, you just have to add AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC to
the AVFilter.flags. You can typically do this when your filter does not need to
save the previous context frames, or basically if your filter just alters
whatever goes in and doesn't need previous/future information. See for instance
commit 86cb986ce that adds timeline support to the fieldorder filter.

In some cases, you might need to reset your context somehow. This is handled by
the AVFILTER_FLAG_SUPPORT_TIMELINE_INTERNAL flag which is used if the filter
must not process the frames but still wants to keep track of the frames going
through (to keep them in cache for when it's enabled again). See for example
commit 69d72140a that adds timeline support to the phase filter.

Threading
~~~~~~~~~

libavfilter does not yet support frame threading, but you can add slice
threading to your filters.

Let's say the foobar filter has the following frame processing function:

    dst = out->data[0];
    src = in ->data[0];

    for (y = 0; y < inlink->h; y++) {
        for (x = 0; x < inlink->w; x++)
            dst[x] = foobar(src[x]);
        dst += out->linesize[0];
        src += in ->linesize[0];
    }

The first thing is to make this function work on slices. The new code will
look like this:

    for (y = slice_start; y < slice_end; y++) {
        for (x = 0; x < inlink->w; x++)
            dst[x] = foobar(src[x]);
        dst += out->linesize[0];
        src += in ->linesize[0];
    }

The source and destination pointers, and slice_start/slice_end will be defined
according to the number of jobs. Generally, it looks like this:

    const int slice_start = (in->height *  jobnr   ) / nb_jobs;
    const int slice_end   = (in->height * (jobnr+1)) / nb_jobs;
    uint8_t       *dst = out->data[0] + slice_start * out->linesize[0];
    const uint8_t *src =  in->data[0] + slice_start *  in->linesize[0];

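To convince yourself that these bounds cover every row exactly once, you can
play with the arithmetic in isolation. The sketch below is standalone; the
slice_start(), slice_end() and slices_tile_exactly() helpers are hypothetical
and only wrap the two formulas above.

```c
#include <assert.h>

/* First row handled by job `jobnr` out of `nb_jobs`, for `height` rows. */
static int slice_start(int height, int jobnr, int nb_jobs)
{
    assert(nb_jobs > 0 && jobnr >= 0 && jobnr <= nb_jobs);
    return (height * jobnr) / nb_jobs;
}

/* One past the last row of job `jobnr`; this is also where job jobnr+1 starts. */
static int slice_end(int height, int jobnr, int nb_jobs)
{
    return slice_start(height, jobnr + 1, nb_jobs);
}

/* Returns 1 if the slices cover [0, height) with no gap and no overlap. */
static int slices_tile_exactly(int height, int nb_jobs)
{
    int expected = 0;
    for (int j = 0; j < nb_jobs; j++) {
        if (slice_start(height, j, nb_jobs) != expected)
            return 0;
        expected = slice_end(height, j, nb_jobs);
    }
    return expected == height;
}
```

For 10 rows and 4 jobs this gives the row ranges [0,2), [2,5), [5,7) and
[7,10). With more jobs than rows, some slices come out empty, which still
works but gives those jobs nothing to do.
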
The per-slice code is isolated in a new filter_slice():

    static int filter_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs) { ... }

Note that we need our input and output frame to define slice_{start,end} and
dst/src, which are not available in that callback. They will be transmitted
through the opaque void *arg. You have to define a structure which contains
everything you need:

    typedef struct ThreadData {
        AVFrame *in, *out;
    } ThreadData;

If you need some more information from your local context, put it here.

In your filter_slice function, you access it like this:

    const ThreadData *td = arg;

Then in your filter_frame() callback, you need to call the threading
distributor with something like this:

    ThreadData td;

    // ...

    td.in  = in;
    td.out = out;
    ctx->internal->execute(ctx, filter_slice, &td, NULL, FFMIN(outlink->h, ctx->graph->nb_threads));

    // ...

    return ff_filter_frame(outlink, out);

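The pieces above can be put together in a standalone sketch that you can
compile without libavfilter. Everything specific to it is hypothetical:
ToyFrame is a one-plane stand-in for AVFrame, foobar() is an arbitrary
per-pixel operation, filter_slice() drops the AVFilterContext parameter, and
run_jobs() is a serial stand-in for the threading distributor (the real one
runs the jobs on worker threads).

```c
#include <assert.h>
#include <stdint.h>

typedef struct ToyFrame {      /* hypothetical one-plane stand-in for AVFrame */
    uint8_t *data;
    int linesize, width, height;
} ToyFrame;

typedef struct ThreadData {
    const ToyFrame *in;
    ToyFrame *out;
} ThreadData;

/* Hypothetical per-pixel operation. */
static uint8_t foobar(uint8_t v)
{
    return 255 - v;
}

/* Processes only the rows belonging to job `jobnr`. */
static int filter_slice(void *arg, int jobnr, int nb_jobs)
{
    const ThreadData *td = arg;
    const ToyFrame *in = td->in;
    ToyFrame *out = td->out;
    const int slice_start = (in->height *  jobnr     ) / nb_jobs;
    const int slice_end   = (in->height * (jobnr + 1)) / nb_jobs;
    uint8_t       *dst = out->data + slice_start * out->linesize;
    const uint8_t *src =  in->data + slice_start *  in->linesize;

    for (int y = slice_start; y < slice_end; y++) {
        for (int x = 0; x < in->width; x++)
            dst[x] = foobar(src[x]);
        dst += out->linesize;
        src += in ->linesize;
    }
    return 0;
}

/* Serial stand-in for the distributor: calls every job in turn and stops at
 * the first negative return value. */
static int run_jobs(int (*func)(void *arg, int jobnr, int nb_jobs),
                    void *arg, int nb_jobs)
{
    assert(nb_jobs > 0);
    for (int j = 0; j < nb_jobs; j++) {
        int ret = func(arg, j, nb_jobs);
        if (ret < 0)
            return ret;
    }
    return 0;
}
```

Since each job writes a disjoint range of rows and only reads the input, the
jobs do not depend on each other, which is what makes it safe for a real
distributor to run them concurrently.
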
The last step is to add the AVFILTER_FLAG_SLICE_THREADS flag to AVFilter.flags.

For more examples of slice threading additions, you can try to run git log -p
--grep 'slice threading' libavfilter/

Finalization
~~~~~~~~~~~~

When your awesome filter is finished, you have a few more steps before you're
done:

 - write its documentation in doc/filters.texi, and test the output with make
   doc/ffmpeg-filters.html.
 - add a FATE test, generally by adding an entry in
   tests/fate/filter-video.mak, and running make fate-filter-foobar GEN=1 to
   generate the data.
 - add an entry in the Changelog
 - edit libavfilter/version.h and increase LIBAVFILTER_VERSION_MINOR by one
   (and reset LIBAVFILTER_VERSION_MICRO to 100)
 - git add ... && git commit -m "avfilter: add foobar filter." && git format-patch -1

When all of this is done, you can submit your patch to the ffmpeg-devel
mailing-list for review.  If you need any help, feel free to come to our IRC
channel, #ffmpeg-devel on irc.freenode.net.