Monday, August 02, 2010

DGC V: Customizing and Patching Confluence

This blog post is part of the DevOps Guide to Confluence series. In this chapter of the guide, we’ll have a look at how to customize and patch Confluence.

Customizing Confluence

Before we talk about any customization at all, I need to warn you. Any kind of customization of Confluence (or any other software) comes with a maintenance and support cost. The problems usually arise during or after a Confluence upgrade, and if they catch you unprepared, you might get yourself in a lot of trouble. Keep this in mind and before you customize anything, justify your intent.

There are several ways how to customize Confluence. For some the maintenance and support cost is low, others give you lots of flexibility at a higher cost. So depending on your needs and requirements you can pick one of the following.

Confluence User Macros

I already mentioned these in the Confluence Configuration chapter — they are easy to create and usually don't break during upgrades, but they are a nightmare to maintain. Avoid them.

Confluence Themes, HTML Headers and Footers

You can easily inject html code in the header and footer by editing the appropriate sections of the Admin UI (described in the config chapter). If this html code contains visual elements, then it's possible that your code will break during upgrades. In general I would avoid editing headers and footers in this way as much as I could unless I was doing something very simple.

Confluence themes are the way to go. You can either pick a theme that was already built and published by someone else, or you can build our own. Building your own theme will give you the most flexibility, but the cost of maintaining and supporting it will be the highest. You can do some things to cut corners, but be prepared to do some Confluence plugin development (a Confluence theme, is really just a type of Confluence plugin).

What worked well for me and our heavily customized theme, is to create our theme as a patch for the Confluence default theme. I simply symlink all the relevant files from Confluence source code into a directory structure that can be built as a Confluence theme/plugin, add my atlassian-plugin.xml and patch the files with changes I need no matter how complex they are. The advantage of this approach is that my theme will always be compatible with my Confluence version (after rebase) and I get all the new features introduced in the new version. The downside is that I often need to rebase my patches during Confluence upgrades, but with a good patch management solution (see below) this headache can be greatly minimized.

Lastly there is Theme Builder from Adaptavist. I haven't personally used this Confluence plugin because it was not popular when we initially created our theme and it was not desirable for us to depend on yet another (unknown at that time) vendor during our Confluence upgrades. If I were about to start creating a theme from scratch I would compare it with my patching method and see what gives me the most benefits. The main concern with Theme Builder I have, is my ability to version control the theme, which if not easily possible might be the deal breaker for me and many others.

Confluence Plugins

I mentioned Confluence Plugins already in the previous chapter, so I'm not going to repeat myself here.

What I'm going to add is that you really can extend and customize Confluence in crazy ways via the plugins. You can either discover the existing plugins at Atlassian Plugin Exchange or you can build your own with Maven (or the Plugin SDK), Java (or another Java compatible language) and Atlassian Plugin Framework.

The nice thing about plugins is that they are encapsulated pieces of code that interact with the rest of Confluence via public API and additionally they are hot plugable. This means that in theory they should work after a Confluence upgrade and that you can install and uninstall them on the fly without a need for restart. While the latter is true in practice, the former is not always the case. Confluence's public apis sometimes change, plugins rely on behavior that was not considered to be part of the public api and the UI changes all the time, so any CSS/javascript code that relies on absolute or relative positioning or fixed DOM structure will need ocassional fixes during upgrades.

Patching Confluence (source) files

Lastly I'm going to mention that one can modify Confluence's behavior by modifying the Confluence core files, this is a large topic and deserves its own section. ;-)

Patching Confluence

Patching Confluence is definitely the most advanced way to customize Confluence, especially if you start changing the Java source code, recompiling and creating your own war files. On the other hand, this way you get the most flexibility and will be able to change anything you want, even those things that plugins can't, all at your own risk.

Issues to Be Aware of

There are several potential issues that you should be aware of before you head down this route:

  • you might break something unintentionally — you can mitigate this with testing
  • you might have a hard time preserving your changes during an upgrade — you can minimize the problems by using good patch management strategy (discussed below)
  • you might have a problem getting support — this was never a problem for me mainly because most of my changes have been very isolated, so I could quickly tell if an issue is caused by my patch or if there is a bug in Confluence.

Reasons for Patching

The reasons for which you might want to patch Confluence fall generally into these four categories:

  • config change - as I mentioned in the Confluence Configuration chapter, some of the configuration is done by modifying files that are part of the Confluence standalone or war distribution (usually those in WEB-INF/classes directory). This is quite unfortunate because it adds a significant overhead to upgrades. I would much prefer if this configuration could be done via files in Confluence Home directory, but until that happens, the best way to manage these changes is by treating them as patches.
  • security fix - Atlassian often releases patches that fix security vulnerabilities in older versions of Confluence. What they actually release is a binary or textual file that represents the fixed version of the affected code. This file can then be just dropped into the appropriate location in (typically) WEB-INF/classes directory and the issue is fixed. This is a nice quick hack, but if your site is bigger or you plan to be on an older version for an extended period, your situation will be a lot more maintainable if you transform the fix into a patch against your version of Confluence.
  • temporary bug fix - occasionally Atlassian releases a temporary fix for an issue in a form similar to a security fix that will later on be properly fixed. In the meantime, the temporary fix can be used to work around the problem. Again, for a bigger site things will be a lot more maintainable if you manage this change as a patch.
  • a ui/behavior change - and lastly if you run a bigger site with lots of requirements coming from different groups of users, you might need to add a feature or disable an existing feature, add or remove a UI element, or change some behavior of Confluence in a way that is not possible via a plugin or a theme, in this case you definitively want to maintain every single such a change as an isolated patch. If you don't then you'll be in a big trouble when a time to upgrade Confluence comes.

Patching Methods

Now that we know why we would be interested in patching Confluence, let's look at how to do it. Again, there are several ways, depending on what do you need to patch.

  • patch the non-compilable files - these typically include config files as well as, javascript, css and template files. If you are patching only these types of files, then you can just create patches against the Confluence standalone or war distributions (since no compilation for these is needed). More likely than not once you get down the patching path, you'll want or need to do a lot more though.
  • patch files that require re-compilation - the Java source code. Soon after I realized that I needed to patch Confluence, I ended up modifying the java source code in order to fix bug or modify behavior. Atlassian makes this relatively easy to do, because along with the binary releases they also offer source code releases which can be used to build Confluence from sources on your own. This is an huge benefit for their customers, especially those who are willing to get their hands dirty to get the most out of Confluence. Once you have access to buildable source code, you can patch it and create your own builds with relatively small effort. The benefit of creating patches against the source release is that you can patch anything and everything (though I'm not saying that you should), starting from config files, js, css and template files all the way to core java class files; and all of that in a consistent and reliable way.

Patch Management

As I mentioned already, whenever you modify the source code you want to create an isolated patch that is a logical grouping of changes needed for one bug fix, config change or feature. Once you have many smaller patches like these you can apply or omit them in a build or update them one at a time when needed.

If you were to use the standard command line tools like diff and patch to work with these patches, you would probably go nuts quickly. There are far better, higher level solutions that can be used. Distributed source code management tools that are becoming an unstoppable force in the SCM arena of software development offer features that make patch management a piece of cake.

Git, for example, offer's a feature called Stash which allows you to create and maintain patches against your git repository. I don't have a personal experience with git-stash, but from the docs it looks like it should do what we want.

The solution that I've been using and loving for the past 3 years is Mercurial and it's core plugin Mercurial Queues. Working with this plugin is also well documented here and here.

Here are some main points for how I do my patching:

  • I store Confluence sources in my main Mercurial repository. I simply grab the source zip from Atlassian's website, unzip it, rename the root directory to "confluence" and put it to my repository.
  • When a new version of Confluence is released, I delete the confluence directory in the working copy of my repo and replace it with the files from the new zip file and commit the files with --addremove flag, which will automatically add all the new files and remove all the deleted files to/from the repository. This allows me to track diffs between Confluence versions, which is very handy when I'm debugging a new issue and want to find out in which release it was introduced.
  • In addition to this main repository I have a versioned (stored as a real hg repo) Mercurial Queue associated with it. The patch repo is very easy to create, just by issuing hg qinit -c command.
  • Now every time I want to change something, I create a new patch with hg qnew mypatchname.patch, modify the confluence source and then just do hg qrefresh to move my changes to mypatchaname.patch. This doesn't commit the changes in the patch repo, you have to do that explicitly via hg qcommit or by changing your current directory to .hg/patches and issuing a regular hg commit there.
  • Once you have patches in your queue, you can now easily apply and unapply patches with commands like hg qpush, hg qpop and hg qgoto.
  • Additionally you can set "guards" on patches, so you can create collections of patches that should be applied only for certain builds. For example, if you have some patches that should be applied only in development environment, you can set guards on them via hg qguard and then switch between these collections via hg qselect followed by hg qpop -a and hg qpush -a.
  • If you have a need to modify an existing patch, just hg qgoto to it, modify the confluence source code and run hg qrefresh and finallyhg qcommit.
  • In order to store binary files in your patches (e.g. images), you'll need to tell MQ to use Git's extended diff format. This is done by adding the following lines into your ~/.hgrc:
    [defaults]
    diff = --git
    qrefresh = --git
    email = --git
  • When upgrading Confluence, you first need to hg qpop all the patches, replace and commit the new confluence sources as discussed above and then reapply your patches with hg qpush -a. If you are lucky all changes will apply smoothly, however sooner or later, especially once your patch collection grows into decent size you'll need to rebase your patches. The conflicts are resolved via a 3-way merge, which can be either done by hand, or by using a fancy 3-way merge tool. Once all the patches are applied, be sure to test that everything still works as originally intended.
  • To make conflict resolution less frequent, I strive to create patches that do as little as possible to get things done. I avoid any major refactorings, api changes and "forget" about some best practices, especially in those cases when I know that Atlassian won't be interested in accepting my patch upstream. For these patches the main focus should be on getting things done, robustness and maintainability.
  • Lastly, if there is a patch that is generally useful for all Confluence users, I usually attach it to relevant RFE/bug report on Confluence's bug tracker. The fewer patches I have to maintain the better. When a patch is accepted upstream, I simply remove it with hg qrm.

I could go on and on about what a life-saver Mercurial Queues are but the best way to get to know them is to do some experimentation on your own. I strongly encourage you to do that, it's a good tool to have in your toolbox.

Just to give you some inspiration, here is a list of some of patches that I created for our build:

  • bundle jdbc driver with the war (by specifying new maven dependency in confluence's pom file)
  • configure Seraph login/security framework
  • replace Confluence's favicon with ours
  • modify the default log4j config
  • customize error pages
  • turn Australian English language pack into US English
  • remove all lower() function calls from Hibernate mapping files and Java classes to get major boost in db performance (see CONF-10030 and this doc)
  • disable mail archiving UI
  • enforce that our custom theme is the default and only available theme
  • remote api security enhancements
  • allow only members of our employee group to become space admins
  • as I already mentioned, our whole theme plugin is implemented as a bunch of patches against the default Confluence theme
  • and the list goes on and on...

Conclusion

In this chapter we went through several possible ways to customize Confluence. Plugins and themes are definitely the safest and most manageable way to go, however patching if done right, will give you the most flexibility. If you use the right tools for patch management (like Mercurial Queues), you'll be able to manage big collections of patches with a very little maintenance overhead.

Next time we'll have a look at a non-technical aspect of running a large Confluence wiki site.

No comments: