How to make a custom input filter

Submitted by Benjamin Melançon on 2010, July 9 - 06:46
  • [The source code of the module developed in this section can be had at gitorious.org/remarkup. It will be on Drupal.org but this author is Waiting For Git.]

    Drupal's input filters are a fairly simple and powerful way to change how our content is displayed.

    Along the way, we will see how easy, non-scary, and useful making a module can be.

    Scenario

    The Definitive Guide and other Apress books give slight emphasis to tips, notes, and other commentary on the main text by setting them apart in a box.

    For the chapters and extended parts of the book presented online, we can easily make a similar look with HTML and CSS. However, we don't want the authors to have to type the HTML code in every time they want a highlighted tip, note, caution, reality check or anything else. Entering anything like the HTML tags surrounding the tip below, apart from the tedium of the task, means a very good chance of making a minor mistake that makes the display inconsistent.

    <div class="featured-element tip"><span class="featured-name"><span class="leading-square">T</span>ip</span> Hand-entering HTML code that involves divs or spans and classes or IDs is a strong sign we're doing it wrong.</div>

    We know what we want to do. Now where do we start?

    Having learned about hooks, we might be tempted to intercept nodes when they are saved, using hook_node_insert() or hook_node_update(), and make our changes then. We should resist this temptation. One of Drupal's distinguishing characteristics is that it does not lay a finger on our content. What we see before we save is exactly what we see when we edit it again. This means our data is never corrupted.

    Accepting that as the Very Good Thing it is, we will want to replace our placeholders with our cool styling only when the content is displayed. But that means Drupal would have to do the work of processing the text every time it showed a node. Fortunately (before we try to make a caching system ourselves), changing the way user-entered text looks is a common problem in Drupal, and one that has long been solved by Drupal core itself. A way of managing modifications to content when it is displayed has lived in its own module, Filter module, since Drupal 5. We can find it right in the administration pages of Drupal 7 at Administer » Configuration » Content Authoring » Text formats (admin/config/content/formats).

    So there is an example in core. Ten files (filter.admin.inc, filter.css, filter.js, filter.test, filter.admin.js, filter.info, filter.module, filter.api.php, filter.install, and filter.pages.inc) seem a little intimidating. We'll take a look at it, but it would be nice to find a module that implemented just the provision of an input format, not the entire filter system.

    Where can we find a good example?

    A project initiated by Randy Fay (rfay) while Drupal 7 was still in development provides the excellent (and now obvious) answer: the Examples suite of modules. We can download it just like any other project at http://drupal.org/project/examples

    If we named the module "tip", decided to contribute it to drupal.org, and then found that "tip" was too presumptuous a name, find-and-replace on those three common letters would not be pretty.

    Instead, if we namespace our custom modules with the name of our site project, it's easy to fix all function names with find-and-replace. Our custom modules are also always easy to spot this way.

    So let's have at it! We make a directory named whatever we choose to name the module, in this case dgd7_tip, and start making the basic necessary module files, also named after the module, starting with dgd7_tip.info.

    Command-line instructions


    mkdir dgd7_tip
    cd dgd7_tip/
    vi dgd7_tip.info

    ; $Id$

    name = Tip formatter
    description = [dgd7_tip] Text format filter for tips, notes, hints and other emphasized paragraphs of text.
    core = 7.x
    files[] = dgd7_tip.module

    [@TODO some of these notes and hints should perhaps be redistributed to other chapters talking about making modules.]

    vi dgd7_tip.module

    <?php
    // $Id$

    /**
     * Implements hook_filter_info().
     */
    function dgd7_tip_filter_info() {
      $filters = array();
      $filters['dgd7_tip'] = array(
        'title' => t('Tip formatter'),
        'description' => t('Allows simple notation to indicate paragraphs of text to be emphasized as tips, notes, hints, or other specially featured interjections.'),
        'prepare callback' => '_dgd7_tip_prepare',
        'process callback' => '_dgd7_tip_process',
        'tips callback' => '_dgd7_tip_tips',
      );
      return $filters;
    }

    /**
     * Implements filter prepare callback.
     */
    function _dgd7_tip_prepare() {

    }

    /**
     * Implements filter process callback.
     */
    function _dgd7_tip_process() {

    }

    /**
     * Implements filter tips callback.
     */
    function _dgd7_tip_tips() {

    }

    ?>

    There! That looks neat and tidy. Our module won't even have any undefined function errors if we enable it, though all the filter callback functions are empty so it wouldn't do anything, either.

    From the PHP Manual, http://php.net/preg_replace: "The e modifier makes preg_replace() treat the replacement parameter as PHP code after the appropriate references substitution is done."

    Know When to Fold 'Em

    (For readers who cannot tell from the section title: We will not get any development done in this section.)

    Unfortunately, the "Add another item" functionality which we noticed for unlimited value fields is specific to the Field module. The code for the AJAX callback field_add_more_js() and related functionality in modules/field/field.form.inc may be instructive, but there's nothing in Drupal 7's Forms API to automate it.

    So what do we do, at this point in building a module? We punt. We make it as simple as possible.

    In fact, on first pass, it's best to make our module with no user interface at all. We're only breaking that rule here because we want to use the usual method of saving filter information, which unfortunately is not API-friendly.

    Using the power of custom markup for good, not evil

    It could also be used to add a class to blockquote tags, or transform

    Providing instructions on the filter setting form

    We wanted to provide instructions on filling in the tag, before, and after fields. The usual Drupal way of providing a '#description' in our form element array is not a good option because we want to describe all fields together, and only once, not for every set (there are at least three sets on every settings form). We want, therefore, some text that's simply above

    Strangely, declaring a form element with just a value (which makes it the default markup element type) was not working. Fieldsets displayed their description, but not prettily and without a title.

    The image_resize_filter module, which we were already using, happens to have this same sort of disembodied help text. So we steal how Nathan Haug (quicksketch) did it. He pasted it right into a theme function.

    (This author was tempted to try to make each form element have the key save with the value of . Fortunately, he stopped. That way lies madness. It would have made the data storage seem slightly saner, with one less nesting of an array, but it would have made form generation impossible.)

    A quick look at the page source and example code from core shows us we don't have to worry about namespace conflicts when naming our settings. Drupal builds an ID such as "edit-filters-filter-html-settings-allowed-html" by prefixing it with "filters", then the specific filter, "filter_html", then "settings", before finally using the name assigned to the setting, "allowed_html".
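    How the ID quoted above is assembled can be sketched in plain PHP. This is a simplified approximation for illustration only: sketch_element_id() is a hypothetical helper, not a Drupal function, and the real ID generation in core does more (for example, keeping IDs unique on a page).

```php
<?php
/**
 * Simplified approximation of how a form element's HTML ID follows from
 * its position in the form array. Hypothetical helper, not a Drupal API.
 */
function sketch_element_id(array $parents) {
  // Join the nesting keys, then turn underscores and spaces into hyphens.
  $id = 'edit-' . implode('-', $parents);
  return strtr(strtolower($id), array('_' => '-', ' ' => '-'));
}

// Prints: edit-filters-filter-html-settings-allowed-html
echo sketch_element_id(array('filters', 'filter_html', 'settings', 'allowed_html')), "\n";
```

    Because every setting sits beneath its own filter's key, two filters can each have a setting with the same name without colliding.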

    If we look at the form array we see the same thing:

    We can see this by looking at the variables in a debugger (see chapter {devenv}) or, for debugging without a debugger, by putting a drupal_set_message('<pre>' . var_export($form, TRUE) . '</pre>'); in an implementation of hook_form_alter().

    Evolution of a Module

    We warn people about what we know won't work on the project page and in the README.txt.
    * Tags are not meant to be nested.

    html tag
    drupal input filter tags
    drupal 7 make exportable
    drupal 7 input filters exportable

    Prior art

    http://drupal.org/project/input_formats
    "Input formats is an API that allows for the export and import of input formats like an object. This module makes it possible to export and import wysiwyg editor settings into Features."

    sorta similar to
    http://drupal.org/project/bbcode
    and Markdown Filter
    http://drupal.org/project/markdown
    and
    http://drupal.org/project/textile (which has a 7.x version)

    http://drupal.org/project/simplehtmldom (D7 version; it simply includes the http://simplehtmldom.sourceforge.net/ library.)

    Users of this module may also want to see Typogrify module
    http://drupal.org/project/typogrify

    There is a Drupal 6 version of what we're doing out there. http://drupal.org/project/reptag

    But it has a totally different architecture - it doesn't use input filters! - and the Drupal 6 version never hit a stable release. This eases our conscience a bit about making a duplicate module. So much so that we consider ripping off the name of the module and just adding an s: reptags.

    It's also very possible that we'll want to become a sort of submodule of Flexifilter module, which has the talented cwgordon as a maintainer
    http://drupal.org/project/flexifilter

    A project in the same vein as Flexifilter, http://drupal.org/project/customfilter - has actually been around longer and is more actively maintained.

    Neither had D7 branches as of September 21.

    Making our own hook

    Now the fun part. When developing Drupal, we get to implement other modules' hooks all the time. It's something of a rare treat to create our own hook!

    Creating a hook is very metaphysical

    The most common way of calling a Drupal hook.
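    In Drupal 7 the usual way to call a hook is module_invoke_all(). What such a call boils down to can be imitated in standalone PHP. The sketch below is only a stand-in for the naming convention: the modules "alpha", "beta" and "gamma" and the helper sketch_invoke_all() are hypothetical, and the hook name is borrowed from the hook_remarkup_defaults() used later in this section.

```php
<?php
// Two hypothetical modules implementing the hook "remarkup_defaults".
function alpha_remarkup_defaults() {
  return array('[/tip]' => array('before' => '<div class="tip">', 'after' => '</div>'));
}

function beta_remarkup_defaults() {
  return array('[/note]' => array('before' => '<div class="note">', 'after' => '</div>'));
}

/**
 * Calls {module}_{hook}() for every module that defines it and merges the
 * returned arrays. An imitation of a hook invocation, not Drupal's own code.
 */
function sketch_invoke_all(array $modules, $hook) {
  $results = array();
  foreach ($modules as $module) {
    $function = $module . '_' . $hook;
    // Modules that do not implement the hook are simply skipped.
    if (function_exists($function)) {
      $result = call_user_func($function);
      if (is_array($result)) {
        $results = array_merge($results, $result);
      }
    }
  }
  return $results;
}

// "gamma" implements nothing, so only alpha's and beta's tags come back.
$defaults = sketch_invoke_all(array('alpha', 'beta', 'gamma'), 'remarkup_defaults');
print_r(array_keys($defaults));
```

    Seen this way, creating a hook is little more than choosing a name and promising to call every function that follows the pattern.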

    (We should pass in formatter information to our hook, even though we aren't planning to use it. We always figure on someone else doing something stranger with our API than we would ever imagine.) @TODO

    Combining code-provided defaults and administrator-set overridden or new settings

    Then we use a common and handy trick to have any administrator-set settings .

    This wouldn't work if we hadn't gone to the effort of making the array key be a
    array_merge() would just add them all together.

    For something of the same reason, adding two blank replacement tag arrays to the total ($rt) array doesn't add two blank form elements in our foreach over $rt. Each has an empty string '' as its key, so they are combined into one.
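    The behavior of array_merge() described here can be checked in isolation. The sketch below assumes, as elsewhere in this section, that settings are keyed by closing tag. The markup values are made up for illustration.

```php
<?php
// Code-provided defaults and administrator settings, both keyed by closing tag.
$defaults = array(
  '[/tip]' => array('before' => '<div class="tip">', 'after' => '</div>'),
  '[/note]' => array('before' => '<div class="note">', 'after' => '</div>'),
);
$admin = array(
  '[/tip]' => array('before' => '<aside class="tip">', 'after' => '</aside>'),
);

// String keys: the later array wins, so the admin value overrides the default.
$merged = array_merge($defaults, $admin);

// Numeric keys: array_merge() renumbers and appends instead of overriding.
$appended = array_merge(array_values($defaults), array_values($admin));

// Two blank entries sharing the key '' collapse into a single entry.
$blank = array('' => array('before' => '', 'after' => ''));
$with_blanks = array_merge($merged, $blank, $blank);

printf("%d merged, %d appended, %d with blanks\n",
  count($merged), count($appended), count($with_blanks));
```

    With string keys we get two entries and the administrator's tip markup. With numeric keys we would get three entries and no override at all.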

    Instead, we factor out the creation of the form elements so we can just call it as many times as we want.

    Disturbing discovery: There's no validation of any input filter settings at all. Which means no model for the validation we need (inclusion of a /).

    We also want to save our replacement markup keyed to the closing tag, as we've set up all our arrays, and not to

    We could implement hook_form_alter() and add a form-wide validation function to

    But the apparently easier, gentler approach is to use the #element_validate form property on a specific form element.

    http://api.drupal.org/api/drupal/developer--topics--forms_api_reference.html#element_validate

    Searching for element_validate in the code base

    Command-line steps


    grep -nHR 'element_validate' modules/

    ($element, &$form_state)

    Searching the web for php count number characters in a string (and some clicking around) brought us to: http://php.net/substr_count
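    A minimal sketch of the check that substr_count() makes possible follows. It assumes that a valid closing tag contains exactly one slash, which is our reading of "inclusion of a /" rather than a stated rule, and sketch_closing_tag_is_valid() is a hypothetical helper, not the module's actual validation function.

```php
<?php
/**
 * Returns TRUE if the tag contains exactly one "/".
 * Assumption: "exactly one" is the rule; the text only says a "/" is needed.
 */
function sketch_closing_tag_is_valid($tag) {
  return substr_count($tag, '/') === 1;
}

var_dump(sketch_closing_tag_is_valid('[/tip]'));
var_dump(sketch_closing_tag_is_valid('[tip]'));
```

    In a real #element_validate function, a failed check like this would be the point at which to flag an error on the element.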

    Apparently, all the other parts of the form can be dispensed with.

    It's important for us to understand that the author did this without any deep understanding of the filter form saving system when he started, or necessarily when he finished. But it worked.

    An empty prepare (or process) callback will result in empty content anywhere that input format is applied!
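    The reason is visible in plain PHP: a function with no return statement returns NULL, so the text handed to it is lost. The safe starting point is a callback that passes the text through unchanged. The function names below are hypothetical, and Drupal 7 passes further arguments after the text (the filter, the format, the language code and cache details), which this sketch omits.

```php
<?php
// Pass-through callbacks: safe placeholders until real logic is written.
function sketch_filter_prepare($text) {
  return $text;
}

function sketch_filter_process($text) {
  return $text;
}

// An empty callback like our skeleton's: it returns NULL, so the text vanishes.
function sketch_filter_broken($text) {
}

var_dump(sketch_filter_process(sketch_filter_prepare('Hello')));
var_dump(sketch_filter_broken('Hello'));
```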

    regular expression do not interpret string
    php escape regex special characters
    We learned that things go crazily to hell if we don't use preg_quote() before searching for our strings.
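    What goes wrong is easy to reproduce outside Drupal. Square brackets mean "character class" in a regular expression, so an unquoted tag such as [/tip] matches any single "/", "t", "i" or "p". The sample sentence below is made up for the demonstration.

```php
<?php
$tag = '[/tip]';
$text = 'tip of the day [/tip]';

// Unquoted: the brackets form a character class, so every /, t, i and p
// anywhere in the text is removed.
$wrong = preg_replace('@' . $tag . '@', '', $text);

// Quoted: preg_quote() escapes the brackets (and our @ delimiter, if
// present), so only the literal tag is removed.
$right = preg_replace('@' . preg_quote($tag, '@') . '@', '', $text);

var_dump($wrong);
var_dump($right);
```

    Passing the delimiter as the second argument to preg_quote() matters whenever a tag might contain that character.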

    As we could see when we tried it directly: // $text = preg_replace('@

    @se',
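    A note for readers on newer PHP: the e modifier used in this attempt was deprecated in PHP 5.5 and removed in PHP 7, and preg_replace_callback() is its replacement. The sketch below is not the module's code. It assumes just two tags, tip and note, and markup loosely modeled on the tip example earlier in this section. sketch_wrap_featured() is a hypothetical name.

```php
<?php
/**
 * Wraps [tip]...[/tip] and [note]...[/note] in a div, using a callback
 * where older code might have used the e modifier.
 */
function sketch_wrap_featured($text) {
  // \1 requires the closing tag to match the opening one; the s modifier
  // lets the featured text span several lines.
  $pattern = '@\[(tip|note)\](.*?)\[/\1\]@s';
  return preg_replace_callback($pattern, function ($m) {
    return '<div class="featured-element ' . $m[1] . '">'
      . ucfirst($m[1]) . ': ' . $m[2] . '</div>';
  }, $text);
}

// Prints: <div class="featured-element note">Note: Order matters.</div>
echo sketch_wrap_featured('[note]Order matters.[/note]'), "\n";
```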

    Almost exactly as we want to use it, in the function for displaying an administrative form! Right at the top of the form:

      drupal_add_css(drupal_get_path('module', 'block') . '/block.css');

    Renaming our module

    We've put a ridiculous amount of work into this module now. We have to share it, and "dgd7_tip" is not a good name at all.

    After an embarrassing amount of time spent considering possible names... (tagfilter- it sort of indicates that it is a filter module. tagreplace? reptags? replacemarkup, repmark- remark. This author would like a cookie for refraining from taking the very tempting 'remark' project namespace. May something awesome that has to do with the English word remark, rather than "replacing markup", take that spot.) ... we decide on remarkup, for Re-markup or Replace markup.

    Some IDEs provide tools for replacing text in multiple files, and some may provide tools for renaming files, but we can also handle this ourselves, command-line style.

    With a little help from a Drupal handbook page (sed - replace text in single or multiple files, http://drupal.org/node/128513) and the shockingly non-Drupal site Debian Administration (easily renaming multiple files, http://www.debian-administration.org/articles/150), we rename our module with four lines typed into our terminal:

    Command-line steps


    sed -i 's/dgd7_tip/remarkup/g' *
    rename 's/dgd7_tip/remarkup/' *
    cd ../
    mv dgd7_tip remarkup

    This changes every function name and our API hook name, which incorporates our module name, per best practice to avoid namespace conflicts. (Our module name is guaranteed to be unique if hosted on drupal.org, so prefacing our hook name with our module name helps ensure that no one else is using the hook for some other purpose. Note that this means renaming a module is simply something we do not do once we're hosted on Drupal.org.)

    Sharing our module on Gitorious

    Sharing on Gitorious.org or GitHub requires no application process, unlike getting the ability to put code on drupal.org, so either can be a good place to start sharing our work early. (There's still nothing like putting the code on drupal.org, though, to get the attention of users and reviewers alike.)

    After signing up for the account on one of these free services, and providing our public key to them, we can create a project and push our repository for our module there.

    Command-line steps


    cd remarkup
    git checkout master
    git remote add origin git@gitorious.org:remarkup/remarkup.git
    git push origin master

    Conclusion

    We made plenty of compromises in making this module, but we got some essential things correct:

    • We gave it an API.
    • We gave it a UI.

    By going beyond our immediate needs - and by providing an API that allows our module to be extended without patching it - we made it much more likely that someone else will pick up where we left off.

    Even if we'd skipped the UI and the API, it would be a good idea to share this module, but not on Drupal.org. GitHub or Gitorious or our [@TODO: theoretical?] git.drupal.org sandbox would be the place to put it.

    Making a site-specific module that uses our API

    Wait, didn't we have some goal of our own, quite apart from making a module that other people might find useful?

    We'll write our site-specific code now. The cool thing, with all the work we've already done, our

    /**
     * Implements hook_remarkup_defaults().
     */
    function dgd7_remarkup_defaults() {
      return array(
        '[/tip]' => array(
          'before' => 'Tip',
          'after' => '',
        ),
        '[/reality]' => array(
          'before' => 'Reality',
          'after' => '',
        ),
      );
    }

    But look at that: we're introducing inconsistencies along with the redundant code. Even though we are in the very simple, supply-data submodule, we can still automate stuff!

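    function dgd7_remarkup_defaults() {
      $rm = array();
      // Define the simple tips-style replacements, machine and human-readable.
      $tips = array(
        'tip' => t('Tip'),
        'note' => t('Note'),
        'hint' => t('Hint'),
        'reality' => t('Reality'),
        'caution' => t('Caution'),
        'gotcha' => t('Gotcha'),
        'new' => t('New in 7'),
      );
      // Build one replacement per type, keyed by closing tag. The markup here
      // is an assumption, modeled on the tip example earlier in this section.
      foreach ($tips as $type => $name) {
        $rm['[/' . $type . ']'] = array(
          'before' => '<div class="featured-element ' . $type . '"><span class="featured-name"><span class="leading-square">' . drupal_substr($name, 0, 1) . '</span>' . drupal_substr($name, 1) . '</span> ',
          'after' => '</div>',
        );
      }
      return $rm;
    }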

    Let's not forget our custom module's .info file (unless we've already created it elsewhere in the chronology of this book...)

    dgd7.info

    ; $Id$
    name = DGD7 Custom Code
    description = [dgd7] Glue, or custom, code for DefinitiveDrupal.org
    core = 7.x
    files[] = dgd7.module
    files[] = dgd7.css
    dependencies[] = remarkup

    The Payoff

    We enable both modules. Now we have to edit the text formats we want to use, such as Filtered HTML at admin/config/content/formats/2 and Full HTML at admin/config/content/formats/3.

    Ordering input filters

    The order of input filters is very important for what works. or ones provided by modules contributed by others.

    Saving a text format invalidates the node cache.
