This document is a tutorial/initiation for writing simple filters in
libavfilter.

Foreword: just like everything else in FFmpeg, libavfilter is monolithic, which
means that it is highly recommended that you submit your filters to the FFmpeg
development mailing-list and make sure they are applied. Otherwise, your filter
is likely to have a very short lifetime due to more or less regular internal
API changes, and to get only limited distribution, review, and testing.
 
Bootstrap
=========

Let's say you want to write a new simple video filter called "foobar" which
takes one frame as input, changes the pixels in whatever fashion you fancy, and
outputs the modified frame. The simplest way of doing this is to start from a
similar filter. We'll pick edgedetect, but any other should do. You can look
for others using the `./ffmpeg -v 0 -filters|grep ' V->V '` command.

 - sed 's/edgedetect/foobar/g;s/EdgeDetect/Foobar/g' libavfilter/vf_edgedetect.c > libavfilter/vf_foobar.c
 - edit libavfilter/Makefile, and add an entry for "foobar" following the
   pattern of the other filters.
 - edit libavfilter/allfilters.c, and add an entry for "foobar" following the
   pattern of the other filters.
 - ./configure ...
 - make -j<whatever> ffmpeg
 - ./ffmpeg -i http://samples.ffmpeg.org/image-samples/lena.pnm -vf foobar foobar.png
   Note here: you can obviously use any local image instead of a remote URL.

If everything went right, you should get a foobar.png with Lena edge-detected.

That's it, your new playground is ready.
 
Some little details about what's going on:
libavfilter/allfilters.c:avfilter_register_all() is called at runtime to create
a list of the available filters, but it's important to know that this file is
also parsed by the configure script, which in turn will define variables for
the build system and the C code:

    --- after running configure ---

    $ grep FOOBAR config.mak
    CONFIG_FOOBAR_FILTER=yes
    $ grep FOOBAR config.h
    #define CONFIG_FOOBAR_FILTER 1

CONFIG_FOOBAR_FILTER=yes from config.mak is later used to enable the filter in
libavfilter/Makefile, and the CONFIG_FOOBAR_FILTER value of 1 from config.h will
be used for registering the filter in libavfilter/allfilters.c.
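 
To make the role of the config.h value more concrete, here is a minimal
standalone sketch of registration gated by a CONFIG_ macro. It is not the code
of allfilters.c: register_filter(), REGISTER_FILTER(), the registered[] table
and the disabled "baz" filter are all made up for the illustration, and the two
#define lines stand in for what configure would write to config.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: these stand in for the values configure writes to config.h. */
#define CONFIG_FOOBAR_FILTER 1
#define CONFIG_BAZ_FILTER    0   /* hypothetical filter that was disabled */

#define MAX_FILTERS 8
static const char *registered[MAX_FILTERS];
static size_t nb_registered;

static void register_filter(const char *name)
{
    assert(nb_registered < MAX_FILTERS);
    registered[nb_registered++] = name;
}

/* Register a filter only when its CONFIG_ macro evaluates to 1. */
#define REGISTER_FILTER(X, x) \
    do { if (CONFIG_##X##_FILTER) register_filter(#x); } while (0)

static void register_all(void)
{
    nb_registered = 0;
    REGISTER_FILTER(FOOBAR, foobar);
    REGISTER_FILTER(BAZ,    baz);
}

/* Return 1 if a filter with that name has been registered, 0 otherwise. */
static int is_registered(const char *name)
{
    for (size_t i = 0; i < nb_registered; i++)
        if (!strcmp(registered[i], name))
            return 1;
    return 0;
}
```

After register_all(), only "foobar" is in the table: the disabled filter costs
nothing at runtime because its condition is a compile-time constant.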
 
Filter code layout
==================

You now need some theory about the general code layout of a filter. Open your
libavfilter/vf_foobar.c. This section will detail the important parts of the
code you need to understand before messing with it.

Copyright
---------

First chunk is the copyright. Most filters are LGPL, and we are assuming
vf_foobar is as well. We are also assuming vf_foobar is not an edge detector
filter, so you can update the boilerplate with your credits.

Doxy
----

Next chunk is the Doxygen about the file. See http://ffmpeg.org/doxygen/trunk/.
Detail here what the filter is, does, and add some references if you feel like
it.
 
Context
-------

Skip the headers and scroll down to the definition of FoobarContext. This is
your local state context. It is already filled with 0 when you get it so do not
worry about uninitialized reads from this context. This is where you put all
the "global" information you need, typically the variables storing the user
options. You'll notice the first field "const AVClass *class"; it's the only
field you need to keep assuming you have a context. There is some magic you
don't need to care about around this field, just let it be (in first position)
for now.

Options
-------

Then comes the options array. This is what will define the user accessible
options. For example, -vf foobar=mode=colormix:high=0.4:low=0.1. Most options
have the following pattern:
  name, description, offset, type, default value, minimum value, maximum value, flags

 - name is the option name, keep it simple, lowercase
 - descriptions are short, in lowercase, without period, and describe what the
   option does, for example "set the foo of the bar"
 - offset is the offset of the field in your local context, see the OFFSET()
   macro; the option parser will use that information to fill the fields
   according to the user input
 - type is any of AV_OPT_TYPE_* defined in libavutil/opt.h
 - default value is a union where you pick the appropriate type; "{.dbl=0.3}",
   "{.i64=0x234}", "{.str=NULL}", ...
 - min and max values define the range of available values, inclusive
 - flags are AVOption generic flags. See AV_OPT_FLAG_* definitions

When in doubt, just look at the other AVOption definitions all around the
codebase, there are tons of examples.
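 
If you wonder how an offset is enough for a generic parser to fill your
context, the following standalone sketch may help. It does not use the AVOption
API: DemoContext, DemoOption and demo_set() are hypothetical, reduced stand-ins
that only handle double values, but the offsetof() mechanism is the same idea as
the OFFSET() macro mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real types are FoobarContext and AVOption. */
typedef struct DemoContext {
    const void *class;   /* kept in first position, as in a real context */
    double high, low;
} DemoContext;

typedef struct DemoOption {
    const char *name;
    size_t offset;       /* where the value lives inside the context */
    double def, min, max;
} DemoOption;

#define OFFSET(x) offsetof(DemoContext, x)
static const DemoOption demo_options[] = {
    { "high", OFFSET(high), 0.3, 0.0, 1.0 },
    { "low",  OFFSET(low),  0.1, 0.0, 1.0 },
    { NULL }
};

/* Store val in the field designated by the option's offset.
 * Returns 0 on success, -1 for an unknown name or an out-of-range value. */
static int demo_set(DemoContext *ctx, const char *name, double val)
{
    for (const DemoOption *o = demo_options; o->name; o++) {
        if (strcmp(o->name, name))
            continue;
        if (val < o->min || val > o->max)
            return -1;
        *(double *)((char *)ctx + o->offset) = val;
        return 0;
    }
    return -1;
}
```

The parser never needs to know the layout of the context: the table tells it
where each value goes and which range is acceptable.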
 
Class
-----

AVFILTER_DEFINE_CLASS(foobar) will define a unique foobar_class with some kind
of signature referencing the options, etc. which will be referenced in the
definition of the AVFilter.

Filter definition
-----------------

At the end of the file, you will find foobar_inputs, foobar_outputs and
the AVFilter ff_vf_foobar. Don't forget to update the AVFilter.description with
a description of what the filter does, starting with a capitalized letter and
ending with a period. You'd better drop the AVFilter.flags entry for now, and
re-add it later depending on the capabilities of your filter.
 
Callbacks
---------

Let's now study the common callbacks. Before going into details, note that all
these callbacks are explained in detail in libavfilter/avfilter.h, so when in
doubt, refer to the doxy in that file.

init()
~~~~~~

The first one to be called is init(). It's flagged as cold because it is not
called often. Look for "cold" on
http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html for more
information.

As the name suggests, init() is where you can initialize and allocate
your buffers, pre-compute your data, etc. Note that at this point, your local
context already has the user options initialized, but you still don't have any
clue about the kind of data input you will get, so this function is often
mainly used to sanitize the user options.
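 
As an illustration of "sanitizing the user options", here is a small standalone
sketch. DemoContext and demo_init() are hypothetical and do not use the
libavfilter types; the negative errno return value only loosely mirrors the
AVERROR() convention of the real callbacks.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical context; a real one would start with "const AVClass *class". */
typedef struct DemoContext {
    double low, high;      /* thresholds already filled from the user options */
    int    low_u8, high_u8;
} DemoContext;

/* Returns 0 on success or a negative errno-style code on invalid options. */
static int demo_init(DemoContext *s)
{
    /* Each option is within its own range, but the pair is inconsistent. */
    if (s->low > s->high)
        return -EINVAL;

    /* Pre-compute values that do not depend on the input. */
    s->low_u8  = (int)(s->low  * 255.0 + 0.5);
    s->high_u8 = (int)(s->high * 255.0 + 0.5);
    return 0;
}
```

The per-option min and max already reject out-of-range values, so init() only
has to deal with constraints that involve several options at once.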
 
Some init()s will also define the number of inputs or outputs dynamically
according to the user options. A good example of this is the split filter, but
we won't cover this here since vf_foobar is just a simple 1:1 filter.

uninit()
~~~~~~~~

Similarly, there is the uninit() callback, doing what the name suggests. This
is where you free everything you allocated.

query_formats()
~~~~~~~~~~~~~~~

This follows init() and is used for the format negotiation, basically
where you say what pixel format(s) (gray, rgb 32, yuv 4:2:0, ...) you accept
for your inputs, and what you can output. All pixel formats are defined in
libavutil/pixfmt.h. If you don't change the pixel format between the input and
the output, you just have to define a pixel formats array and call
ff_set_common_formats(). For more complex negotiation, you can refer to other
filters such as vf_scale.
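 
The sketch below only models the idea behind the negotiation: each side
declares a list of formats and a common one is picked. The actual negotiation
is done by libavfilter over the whole graph and is more involved. DemoPixFmt and
demo_pick_format() are hypothetical; the one convention borrowed from the real
code is that the list ends with a "none" value equal to -1, like
AV_PIX_FMT_NONE.

```c
#include <assert.h>

/* Hypothetical subset of pixel formats; the real enum is AVPixelFormat. */
enum DemoPixFmt {
    DEMO_PIX_FMT_NONE = -1,   /* list terminator, like AV_PIX_FMT_NONE */
    DEMO_PIX_FMT_GRAY8,
    DEMO_PIX_FMT_RGB24,
    DEMO_PIX_FMT_YUV420P,
};

/* Formats the hypothetical filter accepts on input and produces on output. */
static const enum DemoPixFmt demo_pix_fmts[] = {
    DEMO_PIX_FMT_GRAY8, DEMO_PIX_FMT_RGB24, DEMO_PIX_FMT_NONE
};

/* Return the first format of "offered" that "accepted" also contains,
 * or DEMO_PIX_FMT_NONE when the two lists have nothing in common. */
static enum DemoPixFmt demo_pick_format(const enum DemoPixFmt *offered,
                                        const enum DemoPixFmt *accepted)
{
    for (; *offered != DEMO_PIX_FMT_NONE; offered++)
        for (const enum DemoPixFmt *a = accepted; *a != DEMO_PIX_FMT_NONE; a++)
            if (*a == *offered)
                return *offered;
    return DEMO_PIX_FMT_NONE;
}
```

When no common format exists, a real filtergraph would have to insert a
conversion, which is the automatic pixel conversion you may see around your
filter.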
 
config_props()
~~~~~~~~~~~~~~

This callback is not necessary, but you will probably have one or more
config_props() anyway. It's not a callback for the filter itself but for its
inputs or outputs (they're called "pads" - AVFilterPad - in libavfilter's
lexicon).

Inside the input config_props(), you are at a point where you know which pixel
format has been picked after query_formats(), and more information such as the
video width and height (inlink->{w,h}). So if you need to update your internal
context state depending on your input you can do it here. In edgedetect you can
see that this callback is used to allocate buffers depending on this
information. They will be destroyed in uninit().

Inside the output config_props(), you can define what you want to change in the
output. Typically, if your filter is going to double the size of the video, you
will update outlink->w and outlink->h.
 
filter_frame()
~~~~~~~~~~~~~~

This is the callback you have been waiting for from the beginning: it is where
you process the received frames. Along with the frame, you get the input link
the frame comes from.

    static int filter_frame(AVFilterLink *inlink, AVFrame *in) { ... }

You can get the filter context through that input link:

    AVFilterContext *ctx = inlink->dst;

Then access your internal state context:

    FoobarContext *foobar = ctx->priv;

And also the output link where you will send your frame when you are done:

    AVFilterLink *outlink = ctx->outputs[0];

Here, we are picking the first output. You can have several, but in our case we
only have one since we are in a 1:1 input-output situation.

If you want to define a simple pass-through filter, you can just do:

    return ff_filter_frame(outlink, in);

But of course, you probably want to change the data of that frame.

This can be done by accessing frame->data[] and frame->linesize[]. Important
note here: the width does NOT match the linesize. The linesize, expressed in
bytes, is always greater than or equal to the space occupied by the pixels of
a row. The padding created should not be changed or even read. Typically, keep
in mind that a previous filter in your chain might have altered the frame
dimension but not the linesize. Imagine a crop filter that halves the video
size: the linesizes won't be changed, just the width.

    <-------------- linesize ------------------------>
    +-------------------------------+----------------+ ^
    |                               |                | |
    |                               |                | |
    |           picture             |    padding     | | height
    |                               |                | |
    |                               |                | |
    +-------------------------------+----------------+ v
    <----------- width ------------->
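 
In code, this layout translates to stepping from one row to the next by
linesize while only touching width pixels per row. Here is a minimal standalone
sketch for an 8-bit single-plane picture; demo_invert() is a hypothetical
helper working on a raw buffer rather than on an AVFrame.

```c
#include <assert.h>
#include <stdint.h>

/* Invert an 8-bit single-plane picture in place, one row at a time.
 * Only the first "width" bytes of each row are touched; the remaining
 * (linesize - width) bytes are padding and are neither read nor written. */
static void demo_invert(uint8_t *data, int linesize, int width, int height)
{
    assert(linesize >= width);
    for (int y = 0; y < height; y++) {
        uint8_t *row = data + y * linesize;   /* step by linesize, not width */
        for (int x = 0; x < width; x++)
            row[x] = 255 - row[x];
    }
}
```

Computing the row address as data + y * width instead would read and write the
wrong bytes as soon as the frame has padding.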
 
Before modifying the "in" frame, you have to make sure it is writable, or get a
new one. Multiple scenarios are possible here depending on the kind of
processing you are doing.

Let's say you want to change one pixel depending on multiple pixels (typically
the surrounding ones) of the input. In that case, you can't process the input
in place, so you will need to allocate a new frame, with the same properties as
the input one, and send that new frame to the next filter:

    AVFrame *out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
    if (!out) {
        av_frame_free(&in);
        return AVERROR(ENOMEM);
    }
    av_frame_copy_props(out, in);

    // out->data[...] = foobar(in->data[...])

    av_frame_free(&in);
    return ff_filter_frame(outlink, out);
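 
To see why such processing needs a separate output, consider a 3-tap horizontal
average. The standalone sketch below uses a hypothetical demo_blur3() on raw
8-bit buffers: each output pixel reads its left neighbour, which an in-place
version would already have overwritten.

```c
#include <assert.h>
#include <stdint.h>

/* Replace each pixel with the average of itself and its horizontal neighbours
 * (edges are clamped). Every output pixel depends on unmodified input pixels,
 * so src and dst must be distinct buffers. */
static void demo_blur3(uint8_t *dst, int dst_linesize,
                       const uint8_t *src, int src_linesize,
                       int width, int height)
{
    assert(dst != src);
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            const int l = src[x > 0 ? x - 1 : x];
            const int r = src[x < width - 1 ? x + 1 : x];
            dst[x] = (uint8_t)((l + src[x] + r) / 3);
        }
        dst += dst_linesize;
        src += src_linesize;
    }
}
```

Note that the input and output each have their own linesize, since nothing
guarantees that two frames share the same padding.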
 
In-place processing
~~~~~~~~~~~~~~~~~~~

If you can just alter the input frame, you probably just want to do that
instead. av_frame_make_writable() may have to allocate and can therefore fail,
so check its return value (stored here in an int named ret):

    if ((ret = av_frame_make_writable(in)) < 0) {
        av_frame_free(&in);
        return ret;
    }
    // in->data[...] = foobar(in->data[...])
    return ff_filter_frame(outlink, in);

You may wonder why a frame might not be writable. The answer is that, for
example, a previous filter might still own the frame data: imagine a filter
prior to yours in the filtergraph that needs to cache the frame. You must not
alter that frame, otherwise it will make that previous filter buggy. This is
where av_frame_make_writable() helps (it won't have any effect if the frame
already is writable).
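 
The ownership problem can be modelled with a reference count. The following
standalone sketch is a much simplified, hypothetical stand-in for the buffers
behind an AVFrame: DemoBuf, demo_is_writable() and demo_make_writable() are
made up, and unlike av_frame_make_writable() the helper takes a pointer to a
pointer so that it can hand back the private copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted buffer. */
typedef struct DemoBuf {
    unsigned char *data;
    size_t size;
    int refcount;          /* number of owners sharing "data" */
} DemoBuf;

/* Writable means nobody else holds a reference. */
static int demo_is_writable(const DemoBuf *b)
{
    return b->refcount == 1;
}

/* Make *pb writable. If the buffer is shared, drop our reference and continue
 * with a private copy; otherwise do nothing. Returns 0 on success or -1 on
 * allocation failure. */
static int demo_make_writable(DemoBuf **pb)
{
    DemoBuf *old = *pb, *copy;

    if (demo_is_writable(old))
        return 0;
    copy = malloc(sizeof(*copy));
    if (!copy)
        return -1;
    copy->data = malloc(old->size);
    if (!copy->data) {
        free(copy);
        return -1;
    }
    memcpy(copy->data, old->data, old->size);
    copy->size = old->size;
    copy->refcount = 1;
    old->refcount--;       /* the other owner keeps the original */
    *pb = copy;
    return 0;
}
```

The memcpy() in the shared case is the cost of making a frame writable: it is
paid only when someone else still references the data.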
 
The problem with using av_frame_make_writable() is that in the worst case it
will copy the whole input frame before you change it all over again with your
filter: if the frame is not writable, av_frame_make_writable() will allocate
new buffers, and copy the input frame data. You don't want that, and you can
avoid it by just allocating a new buffer if necessary, and processing from in
to out in your filter, saving the memcpy. Generally, this is done following
this scheme:

    int direct = 0;
    AVFrame *out;

    if (av_frame_is_writable(in)) {
        direct = 1;
        out = in;
    } else {
        out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
        if (!out) {
            av_frame_free(&in);
            return AVERROR(ENOMEM);
        }
        av_frame_copy_props(out, in);
    }

    // out->data[...] = foobar(in->data[...])

    if (!direct)
        av_frame_free(&in);
    return ff_filter_frame(outlink, out);

Of course, this will only work if you can do in-place processing. To test
whether your filter handles permissions well, you can use the perms filter. For
example with:

    -vf perms=random,foobar

Make sure no automatic pixel conversion is inserted between perms and foobar,
otherwise the frame permissions might change again and the test will be
meaningless: add av_log(0,0,"direct=%d\n",direct) in your code to check that.
You can avoid the issue with something like:

    -vf format=rgb24,perms=random,foobar

...assuming your filter accepts rgb24 of course. This will make sure the
necessary conversion is inserted before the perms filter.
 
Timeline
~~~~~~~~

Adding timeline support
(http://ffmpeg.org/ffmpeg-filters.html#Timeline-editing) is often an easy
feature to add. In the simplest case, you just have to add
AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC to the AVFilter.flags. You can typically
do this when your filter does not need to save the previous context frames, or
basically if your filter just alters whatever goes in and doesn't need
previous/future information. See for instance commit 86cb986ce that adds
timeline support to the fieldorder filter.

In some cases, you might need to reset your context somehow. This is handled by
the AVFILTER_FLAG_SUPPORT_TIMELINE_INTERNAL flag which is used if the filter
must not process the frames but still wants to keep track of the frames going
through (to keep them in cache for when it's enabled again). See for example
commit 69d72140a that adds timeline support to the phase filter.
 
Threading
~~~~~~~~~

libavfilter does not yet support frame threading, but you can add slice
threading to your filters.

Let's say the foobar filter has the following frame processing function:

    dst = out->data[0];
    src = in ->data[0];

    for (y = 0; y < inlink->h; y++) {
        for (x = 0; x < inlink->w; x++)
            dst[x] = foobar(src[x]);
        dst += out->linesize[0];
        src += in ->linesize[0];
    }

The first thing is to make this function work on slices. The new code will
look like this:

    for (y = slice_start; y < slice_end; y++) {
        for (x = 0; x < inlink->w; x++)
            dst[x] = foobar(src[x]);
        dst += out->linesize[0];
        src += in ->linesize[0];
    }

The source and destination pointers, and slice_start/slice_end will be defined
according to the number of jobs. Generally, it looks like this:

    const int slice_start = (in->height *  jobnr   ) / nb_jobs;
    const int slice_end   = (in->height * (jobnr+1)) / nb_jobs;
    uint8_t       *dst = out->data[0] + slice_start * out->linesize[0];
    const uint8_t *src =  in->data[0] + slice_start *  in->linesize[0];
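 
The integer arithmetic above has a convenient property: the slices cover every
row exactly once, even when the height is not a multiple of the number of jobs.
The standalone sketch below isolates the computation in a hypothetical
demo_slice_bounds() helper so that this can be checked.

```c
#include <assert.h>

/* Compute the half-open row range [*start, *end) handled by job "jobnr"
 * out of "nb_jobs", using the same integer arithmetic as above. */
static void demo_slice_bounds(int height, int jobnr, int nb_jobs,
                              int *start, int *end)
{
    assert(nb_jobs > 0 && jobnr >= 0 && jobnr < nb_jobs);
    *start = (height *  jobnr     ) / nb_jobs;
    *end   = (height * (jobnr + 1)) / nb_jobs;
}
```

With a height of 10 and 3 jobs, this yields the row ranges [0,3), [3,6) and
[6,10): the end of one slice is always the start of the next one.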
 
This new code will be isolated in a new filter_slice():

    static int filter_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs) { ... }

Note that we need our input and output frame to define slice_{start,end} and
dst/src, which are not available in that callback. They will be transmitted
through the opaque void *arg. You have to define a structure which contains
everything you need:

    typedef struct ThreadData {
        AVFrame *in, *out;
    } ThreadData;

If you need some more information from your local context, put it here.

In your filter_slice function, you access it like this:

    const ThreadData *td = arg;

Then in your filter_frame() callback, you need to call the threading
distributor with something like this:

    ThreadData td;

    // ...

    td.in  = in;
    td.out = out;
    ctx->internal->execute(ctx, filter_slice, &td, NULL, FFMIN(outlink->h, ctx->graph->nb_threads));

    // ...

    return ff_filter_frame(outlink, out);

The last step is to add the AVFILTER_FLAG_SLICE_THREADS flag to AVFilter.flags.

For more examples of slice threading additions, you can try to run git log -p
--grep 'slice threading' libavfilter/
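 
To tie the pieces of this section together, here is a standalone model of the
whole scheme. Everything in it is a simplification: DemoFrame replaces AVFrame,
filter_slice() takes no filter context, demo_execute() is a hypothetical
sequential stand-in for the threading distributor, and the pixel inversion
stands in for foobar().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-plane frame; AVFrame has data[] and linesize[] arrays. */
typedef struct DemoFrame {
    uint8_t *data;
    int linesize, width, height;
} DemoFrame;

typedef struct ThreadData {
    const DemoFrame *in;
    DemoFrame *out;
} ThreadData;

typedef int (*demo_job_func)(void *arg, int jobnr, int nb_jobs);

/* Process only the rows belonging to job "jobnr". */
static int filter_slice(void *arg, int jobnr, int nb_jobs)
{
    const ThreadData *td = arg;
    const int slice_start = (td->in->height *  jobnr     ) / nb_jobs;
    const int slice_end   = (td->in->height * (jobnr + 1)) / nb_jobs;
    uint8_t       *dst = td->out->data + slice_start * td->out->linesize;
    const uint8_t *src = td->in ->data + slice_start * td->in ->linesize;

    for (int y = slice_start; y < slice_end; y++) {
        for (int x = 0; x < td->in->width; x++)
            dst[x] = 255 - src[x];          /* stand-in for foobar() */
        dst += td->out->linesize;
        src += td->in ->linesize;
    }
    return 0;
}

/* Sequential stand-in for the threading distributor: a real one would run
 * the jobs concurrently, which is safe here because slices do not overlap. */
static int demo_execute(demo_job_func func, void *arg, int nb_jobs)
{
    for (int jobnr = 0; jobnr < nb_jobs; jobnr++) {
        int ret = func(arg, jobnr, nb_jobs);
        if (ret < 0)
            return ret;
    }
    return 0;
}
```

Since each job writes only to its own rows of the output, no locking is needed
as long as filter_slice() does not modify shared state.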
 
Finalization
~~~~~~~~~~~~

When your awesome filter is finished, you have a few more steps before you're
done:

 - write its documentation in doc/filters.texi, and test the output with make
   doc/ffmpeg-filters.html.
 - add a FATE test, generally by adding an entry in
   tests/fate/filter-video.mak, and running make fate-filter-foobar GEN=1 to
   generate the data.
 - add an entry in the Changelog
 - edit libavfilter/version.h and increase LIBAVFILTER_VERSION_MINOR by one
   (and reset LIBAVFILTER_VERSION_MICRO to 100)
 - git add ... && git commit -m "avfilter: add foobar filter." && git format-patch -1

When all of this is done, you can submit your patch to the ffmpeg-devel
mailing-list for review. If you need any help, feel free to join our IRC
channel, #ffmpeg-devel on irc.freenode.net.