Friday, July 30, 2010

DGC III: Confluence Configuration and Tuning

This blog post is part of the DevOps Guide to Confluence series. In this chapter of the guide, we’ll have a look at Confluence configuration and tuning.

There are four ways to modify Confluence's runtime behavior:

  • Config Files in Confluence Home directory
  • Config Files in WEB-INF/classes
  • JVM Options
  • Admin UI

Config Files in Confluence Home directory

The Confluence Home directory contains one or more config files that control the runtime behavior of Confluence. The most important file is confluence.cfg.xml, which must be present in order for Confluence to start. This file can be modified by hand while Confluence is shut down, but it also gets modified by Confluence occasionally (mostly during upgrades). Your changes will be preserved, as long as you made them while Confluence was offline.

Another relevant file is tangosol-coherence-override.xml which must unfortunately be used to override Confluence’s lame multicast configuration needed for cluster configuration (see below).

Lastly there is config/confluence-coherence-cache-config-clustered.xml which contains configuration of the Confluence cache. Generally you don't want to modify this file by hand. I’ll come back to talk about cache configuration later in the Admin UI section of this chapter.

In general it is advisable to be very consistent about your environment, so that you can then just have a single version of these files that you can distribute on all servers when needed. This includes the directory layout, network interface names, and so on.

A combination of the first two files will allow you to configure the following:

Clustering

As I mentioned, this configuration is split between two config files. confluence.cfg.xml contains confluence.cluster.* properties, which allow you to set multicast IP, interface and TTL, but not the port. Only tangosol-coherence-override.xml can do that.

The cluster IP is by default derived from a "cluster name" specified via the Admin UI or installation wizard. For some reason Atlassian believes that in an enterprise environment one can just let a piece of software pick a random IP and port to run multicast on. I don’t know of any serious datacenter where things work this way. You’ll likely want to explicitly set the IP, port, interface name and TTL, and the only way to do that is by modifying these files by hand and ignoring the "cluster name" setting in the UI. Make sure that the settings are consistent in both files.

DB Connection Pool

Confluence comes with an embedded connection pool. I believe that you can use your own too (if it comes with your servlet container), but I’d suggest sticking with the embedded one since it is widely used and Atlassian runs their tests with it too. The pool is configured via the hibernate.c3p0.* properties in confluence.cfg.xml. The most important property is the pool max_size, which prevents the pool from opening more than a defined number of connections at a time. You want this number to be higher than your typical peak concurrent request count (are you monitoring that?), but not higher than what your db can handle. We have ours set to 300, which is double our occasional peaks. Don’t forget that in order to take advantage of these connections, you’ll likely need to also increase the worker thread count in your servlet container.
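The sizing rule above can be written down as a tiny calculation. This is only a sketch: the function name, the headroom factor of 2 (taken from "double our occasional peaks") and the db limit of 1000 are illustrative assumptions, not anything Confluence or c3p0 provides.

```python
import math


def suggest_pool_max_size(peak_concurrent_requests: int,
                          db_connection_limit: int,
                          headroom: float = 2.0) -> int:
    """Return a candidate value for the pool max_size setting.

    The result sits above the observed peak (peak * headroom) but never
    exceeds what the database is known to handle. All three inputs are
    numbers you have to measure or decide yourself.
    """
    if peak_concurrent_requests < 0:
        raise ValueError("peak_concurrent_requests must not be negative")
    if db_connection_limit < 1:
        raise ValueError("db_connection_limit must be at least 1")
    if headroom < 1.0:
        raise ValueError("headroom must be at least 1.0")

    wanted = math.ceil(peak_concurrent_requests * headroom)
    return min(wanted, db_connection_limit)


if __name__ == "__main__":
    # A peak of 150 concurrent requests with 2x headroom gives 300,
    # provided the (hypothetical) db limit of 1000 is not the bottleneck.
    print(suggest_pool_max_size(150, 1000))
```

If the db limit is the smaller number, the function returns that limit instead, which is a hint that the database rather than the pool needs attention first.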

DB Connection

The connection is configured via hibernate.connection.* properties in confluence.cfg.xml. Depending on your db, you might need to specify several settings for the connection to work well and grok UTF-8. For our MySQL db, we need to set the connection url to something like

jdbc:mysql://server:3306/wikisdb?autoReconnect=true&useUnicode=true&characterEncoding=utf8

Note that if you are editing this file by hand, you must escape illegal xml characters (for example, each & in the url above has to be written as &amp;). More info about db connection can be found in the Confluence documentation.
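If you would rather not escape the url by hand, a few lines of Python can do it. This is a minimal sketch using the standard library; the property element printed at the end is an assumption about how the entry looks in confluence.cfg.xml, so compare it with your own file before pasting anything.

```python
from xml.sax.saxutils import escape

# The connection url exactly as the JDBC driver expects it (unescaped).
jdbc_url = (
    "jdbc:mysql://server:3306/wikisdb"
    "?autoReconnect=true&useUnicode=true&characterEncoding=utf8"
)


def to_xml_text(value: str) -> str:
    """Escape &, < and > so the value is safe as XML element text."""
    return escape(value)


escaped_url = to_xml_text(jdbc_url)

if __name__ == "__main__":
    # Assumed element layout; check it against your confluence.cfg.xml.
    print('<property name="hibernate.connection.url">%s</property>' % escaped_url)
```

Confluence reads the escaped value back as the original url, so the driver still sees plain & separators.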

Config Files in WEB-INF/classes

Just a side note: if you are building confluence from source then these files can be found at confluence/confluence-project/conf-webapp/src/main/resources/.

These files are the most cumbersome to work with because you need to apply your changes to them after each upgrade. I'll describe how we use our automated patching machinery to do this in a future chapter of this guide. For now let's just go over the available config files and what you can change here.

atlassian-user.xml - used to configure user provisioning, e.g. LDAP. For more info read the docs.

confluence-init.properties - this file allows you to specify the path to Confluence Home directory. There is a better way to set this; see the JVM Options section below.

log4j.properties - modify logging preferences. This can also be done via the UI, but AFAIK the changes are not preserved after a restart or upgrade.

seraph-config.xml - controls authentication framework. You'll likely need to modify this file if you have a custom authenticator and login page.

I should note that there are many other (usually xml) configuration files bundled with individual jars in WEB-INF/lib, but those rarely need to be modified.

JVM Options

Another way to configure certain settings is via JVM options. From the complete list of recognized options these are the ones we use:

-Dcom.atlassian.user.experimentalMapping=true - this is a critically important setting for us with 180k users. Without it, our cluster panics due to data overload (CONF-12319). Unfortunately, despite Atlassian’s claims that this experimental feature is production ready, it got broken soon after release, and then again recently, so you’ll have to patch the atlassian-user module to get it to work.

-Dconfluence.disable.peopledirectory.anonymous=true - for big public deployments the people directory is a privacy risk and generally useless for anonymous users, so we have it disabled for them.

-Dconfluence.disable.mailpolling=true - early on we decided that we don’t want people to build up mail archives on our site. While the feature is useful for small internal wikis, it’s too much of a risk with little reward to provide it on a public wiki. Unfortunately, this option only disables mail fetching. The UI for setting up mail archives will still be present in the wiki; you'll have to patch Confluence to remove it.

I didn't learn about -Dconfluence.home until recently. I would much prefer to use it than to mess with confluence-init.properties file in WEB-INF/classes.

Admin UI

Most of the Confluence settings can be configured via the Confluence admin interface. The downside is that the configuration is not versioned, and there is no easy way to see diffs or to roll back unless you want to hack the db and replace data from backups. With that in mind, let's look at the most important settings.

General Configuration

Server Base Url - make sure this is set up correctly, otherwise confluence and its plugins won’t work properly.

Users see Rich Text Editor by default - we have this set to off. In the past many RTE bugs were causing headaches to our writers especially those who did lots of editing. In Confluence 3.2 and 3.3 the editor has improved a lot and it might be the time for us to reconsider this decision.

CamelCase Links - this used to be one of THE wiki features in general a few years ago, but as wikis have matured and people started creating more and more content, the automatic linking started to cause more problems than help. We have it off.

Threaded Comments - very useful; make sure it’s on.

Remote API (XML-RPC & SOAP) - we have ours on, but I patched the remote api code to restrict access to it.

Compress HTTP Responses - OMG please turn this on if it isn't already. It’s a major performance booster. Alternatively you might want to do the compression in your webserver as Tim pointed out in comments below.

JavaScript served in header - we have this on, but for better performance it should be off. Unfortunately that breaks many plugins and legacy code that uses obtrusive javascript. Since this option has been around for a while, it might be worth it to just set it to off and deal with the remaining broken things as they are identified.

User email visibility - we have this set to visible to admins only, but our power users found it to be a collaboration barrier, so I patched the code and made emails visible to our global employees group in addition to the admin group. It would be nice if Confluence allowed such a configuration out of the box.

Anonymous Access to Remote API - No sane person will leave this on. If I were in charge, I would go as far as removing it from Confluence product.

Anti XSS Mode - This is a very handy feature. Not 100% bulletproof, but it helped to significantly decrease the number of XSS exploits in Confluence since its introduction.

Attachment Maximum Size (B) - I mentioned this one already in the first chapter when discussing the db configuration. If you are running a cluster (or think that you will eventually run it), set this to some low value. Ours is 5MB.

Connection Timeouts - these options are pretty handy when you have lots of feed macros, gadgets and other plugins that pull content from remote sites. In order to prevent worker thread pileup in your servlet container, don’t go beyond the default 10sec (which is already pretty high).

Daily Backup Administration

As I previously mentioned, this backup feature is useless for anything but tiny sites. Disable it.

Manage Referrers

Collecting referrers is ok, but don’t display them publicly if you run a site on the Internet. Otherwise you run a risk of exposing some internal only URIs that might contain confidential information.

Languages

Most of our documentation and content is written in American English, but unfortunately Atlassian doesn’t provide such a language pack. I just patch the default Australian English pack to get a US English pack. It works great and is almost no hassle to maintain.

User macros

I discourage their use in an enterprise environment. The lack of versioning, automated testing and documentation makes them a nightmare to maintain. Just create Confluence plugins for everything you need.

PDF Export Language Support

This is a tricky one. It took us quite a while to find the right single font that could be used to generate PDFs in almost all languages. Finally we found soui_zhs.ttf, which is distributed with OpenOffice. It’s a huge file, but it works like a charm for all kinds of non-Western languages.

Themes

For reasons I’ll discuss later, we disabled all the themes except for our custom one, which is the global and default space theme. To disable a theme you have to go to plugins view and disable the appropriate theme plugins.

Cache Statistics

The name of this section in the UI is misleading, because not only can you view cache statistics here, but more importantly you can fully control the cache size via the UI. And in this case, I’m really glad that there is a UI to manage the cache config xml file, which due to its size is really hard to work with by hand. The changes you make via the UI are persisted in the Confluence Home directory and propagated throughout the cluster.

Out of all the things you can tune via the admin UI, the cache tuning will have the biggest impact on your site’s performance. Confluence ships with cache settings optimized for smaller sites, so increasing the cache size is unavoidable for larger deployments.

Tuning the cache settings is a time-consuming process because you need to balance the memory consumption with performance improvements. Usually I revisit the cache stats once a month and look for caches that are performing badly because the number of objects allowed in that particular cache is low. Confluence caching system is composed of many caches that are controlled via this UI.

The best indicator of an overflowing cache is when the "Effectiveness" value is low (under 70-80%) AND “Percent Used” value is high (over 80%) AND usually the “Expired” value will be relatively high compared to “Hit” value in the same cell. This means that Confluence needs to go to the DB too often, even though it could cache the data in memory if the cache was bigger.
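The rule of thumb above is easy to turn into a small filter if you copy the numbers from the Cache Statistics page into a script. The following is only a sketch: the CacheStats fields, the 75% and 80% thresholds (picked from the ranges mentioned above) and the sample cache names and numbers are all made up for illustration, and nothing here talks to Confluence.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CacheStats:
    """One row of the Cache Statistics page, typed in by hand."""
    name: str
    effectiveness_pct: float  # "Effectiveness" column
    percent_used: float       # "Percent Used" column
    hits: int                 # "Hit" value
    expired: int              # "Expired" value


def expired_to_hit_ratio(stats: CacheStats) -> float:
    """Supporting signal: how many entries expired per cache hit."""
    if stats.hits == 0:
        return float("inf") if stats.expired else 0.0
    return stats.expired / stats.hits


def undersized_candidates(all_stats: List[CacheStats],
                          effectiveness_below: float = 75.0,
                          percent_used_above: float = 80.0) -> List[CacheStats]:
    """Return caches with low effectiveness AND high usage.

    The result is sorted so that caches with the highest
    expired-to-hit ratio (the likeliest overflow cases) come first.
    """
    flagged = [
        s for s in all_stats
        if s.effectiveness_pct < effectiveness_below
        and s.percent_used > percent_used_above
    ]
    return sorted(flagged, key=expired_to_hit_ratio, reverse=True)


# Hypothetical sample data, not real Confluence cache names or numbers.
sample = [
    CacheStats("cache-a", 95.0, 40.0, 9500, 100),
    CacheStats("cache-b", 60.0, 100.0, 6000, 3000),
    CacheStats("cache-c", 50.0, 95.0, 1000, 900),
    CacheStats("cache-d", 55.0, 10.0, 550, 20),
]

if __name__ == "__main__":
    for cache in undersized_candidates(sample):
        print(cache.name, round(expired_to_hit_ratio(cache), 2))
```

A cache like cache-d in the sample is ineffective but nearly empty, so making it bigger would not help; that is why both conditions are required before a cache is flagged.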

If you don’t understand what all the cache names and numbers mean, don’t worry about that too much. As long as you don’t make any dramatic changes too quickly and you monitor your JVM heap usage, you can’t break anything.

As you increase the cache sizes, you’ll eventually start running out of heap space. That’s why you need to monitor the JVM and increase the -Xmx value as needed. If the number of concurrent users increases, you might also need to slightly increase the -Xmn value (see the JVM Tuning chapter for more info).

I wish Atlassian would provide better descriptions for all the available caches, because unless you know Confluence internals well, you won’t know what you are doing and that doesn’t feel good. Additionally, I’d like to see a way to limit memory usage, not the number of objects, because their size varies. Ideally, I'd really like to be able to just say "Use 3GB of memory for cache and distribute it in the most efficient way. Oh and let me know if you need more or less memory to work effectively". It would be better if Atlassian moved away from an in-process cache which in my opinion is not a good fit for Confluence. Maybe we'll get there one day.

Plugins

This section of the Admin UI is where you can install, uninstall, enable and disable plugins and their modules. There is also a Plugin Repository, which additionally allows you to install plugins from Atlassian’s remote servers or user-specified URIs. The recently released Atlassian Universal Plugin Manager will eventually replace the latter one (or both?); I’m glad to see that happening.

I suggest that you disable plugins that you don’t use or don’t want your users to use as soon as possible. We disabled all the bundled themes because we wanted to provide users with only one custom theme developed and maintained by us (I’ll explain the reasoning in a future chapter). For security reasons the html and html-include macros should in my opinion be disabled on all but family Confluence deployments. And for performance reasons the Confluence Usage Stats plugin is not suitable for any bigger deployments.

Plugin installation is very easy to do. That’s both good and bad. The plugin framework provided by Confluence is a very sophisticated piece of software which allows you to install and uninstall plugins on the fly without any need to restart the server. Need to quickly install a fixed version of a buggy plugin without disturbing hundreds or thousands of users that are currently using your site? Done. That’s how easy it is.

On the other hand, it is tempting to install plugins just because they have cool names or promise great features. You can do that in your dev or test environment, but in production you should only install plugins that you picked after some serious consideration.

This is what I look for when deciding whether to install a plugin or not:

  • was the functionality provided by the plugin requested by a larger group of users or is the plugin needed for site administration purposes?
  • was the plugin developed and tested in-house? if not, is it supported by Atlassian? if not, can we or some respectable Atlassian partner support it should there be problems?
  • is the plugin compatible with our Confluence version? does it have a track record of being compatible, or was it made compatible with new Confluence versions as they were released?
  • are there no major unresolved bugs in the areas of performance, scalability, data integrity and security?
  • does the plugin have an automated test suite with good test coverage?

If you answer “yes” to all of these questions, then you may go ahead and do a trial before installing the plugin in production. Otherwise, you might provide your feedback to the plugin authors and wait to see if the pending issues get resolved before proceeding.

I don’t want to be harsh, but 2-3 years ago especially, most of the plugins created for Confluence were crap. But as the platform matures and Atlassian partners get more involved, the quality of available plugins has been slowly increasing. The main issue that I see is that the existing plugins are not developed and tested with large scale deployments in mind. Hopefully things will change as more and more deployments grow beyond small and medium sites. It’s unfortunate that even some commercial plugins suffer from the very same issues that plague plugins created by a bunch of volunteers and enthusiasts. So pick your plugins carefully, do a trial, check for unresolved bugs and existing user complaints, and then decide.

I've been reasonably active in the Atlassian development community and from these interactions, I'd like to highlight the work done by Dan Hardiker (Adaptavist) and Roberto Dominguez (Comalatech). And though I haven't worked with guys from CustomWare, they are also considered to be pretty sharp.

Be especially careful with plugins that provide new macros for the wiki content. Once you install such a plugin you won't be able to uninstall it without breaking wiki pages until all the references to that macro are removed (with tens of thousands of pages and no ability to track the references this might be a big challenge).
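Confluence itself won't list macro references for you, but if you can get the page bodies out as wiki markup by some other means (an export, a db dump; that part is an assumption and not shown here), a simple scan gives a rough picture. The sketch below assumes the {macro-name} and {macro-name:params} wiki markup syntax; the function name and the sample pages are made up for illustration.

```python
import re
from typing import Dict, List


def pages_using_macro(pages: Dict[str, str], macro_name: str) -> List[str]:
    """Return the sorted titles of pages whose body references the macro.

    `pages` maps a page title to its wiki markup body. The pattern
    matches {name} and {name:...}, and skips braces escaped with a
    backslash. It is an approximation, not a real wiki markup parser.
    """
    pattern = re.compile(r"(?<!\\)\{" + re.escape(macro_name) + r"(?=[:}])")
    return sorted(title for title, body in pages.items() if pattern.search(body))


# Hypothetical page bodies standing in for whatever export you have.
sample_pages = {
    "Home": "{toc}\nWelcome to the wiki",
    "Charts": "{chart:type=pie}\n|| a || b ||\n{chart}",
    "Escaped": "Type \\{chart} to insert a chart",
    "Other macro": "{chart-plus:type=bar}",
}

if __name__ == "__main__":
    print(pages_using_macro(sample_pages, "chart"))
```

Treat the result as a lower bound: macros used inside templates, comments or user macros will not show up unless those bodies are part of the input too.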

In general however, try to keep the number of plugins low. It’s better for performance and you won’t get in trouble as often when you need to upgrade Confluence but some of the plugins you use are not compatible with the new Confluence version.

Conclusion

You should now have a good idea about how to configure Confluence and where this configuration is done. In the next chapters we'll look at upgrading Confluence, patching and more.

9 comments:

Tim Moore said...

Hi again, Igor,

One specific question: you recommend the use of "Compress HTTP Responses". I'm curious about why you have Confluence do the compression instead of using mod_deflate in the Apache that sits in front of it.

Thanks
-- Tim

Igor Minar said...

Good point, Tim. Actually I don't care where the compression is being done, as long as it is done somewhere.

The built in compression filter blindly compresses all the responses (even png and other files that can't be efficiently compressed), so setting the compression in the webserver will likely perform better.

Thanks for bringing that up.

Scott Farquhar said...

Igor,

Which version of Confluence are you using? I believe that 2.7 and earlier had the problem of compressing images, however, this was fixed with the move to a better gzip filter in 2.8.

Cheers,
Scott

Igor Minar said...

Hi Scott,

We are at 3.2.1 and when I tested it the other day, it was sending png attachments with the compression on.

/i

snk said...

Hi Igor,

I REALLY appreciate your write-ups on Confluence - thank you for taking the time to share your experience with us running much smaller Confluence installs.

One question - where do I find soui_zhs.ttf? We've been using Code2000.ttf and getting complaints on appearance, but we need to support multiple languages on PDF export. I'd like to test soui_zhs.ttf but can't seem to find it.

Thanks, Scott

Igor Minar said...

Scott,

It's been a long time since we troubleshot our PDF issues, but I'm quite certain that the soui_zhs.ttf font we ended up using was part of Open Office and that's where we got it from.

hth,
Igor

Tarun Sapra said...

Hello Igor,

I am working on upgrading to confluence 3.2.1 from confluence 2.10 and I am having problem with Anti-Xss feature, I am using getting a feed from JIRA to display onto my confluence but that feed contains some html and i am not able to render the html i.e. the html is displayed along with tags onto the confluence I have tried using suffix "html" in the velocity templates but still it isn't workin'...any ideas would be very welcome :)

Irving Popovetsky said...

Igor,

soui_zhs.ttf isn't part of OpenOffice, it isn't in the distribution or the source repository.

Can you post this file and share?

Igor Minar said...

You are right. I wonder how I got the file renamed like this. In any case, I do get some hits when searching for soui.ttf in openoffice, my best guess is that that's the right file.