Sunday, August 08, 2010

Change of Status

$ sqlplus -s
SQL> connect hr@oracle.com/hr
SQL> UPDATE employees SET current = false WHERE email = 'Igor.Minar@oracle.com';
SQL> COMMIT;
SQL> disconnect
SQL> exit
$ curl -X POST -H "Content-Type: application/json" \
   -d '{ "firstName":"Igor", "lastName":"Minar"}' \
   http://google.com/employee/

Tuesday, August 03, 2010

Thanks for All the Fish

Hi guys,

Most of you don't know, but today is the 3rd birthday of wikis.sun.com. In 2007 a bunch of us decided that it was worth it to boldly go where no man has gone before and on August 3, 2007 we launched wikis.sun.com.

At that time very few corporations were actively using any kind of wiki internally, and there was no known significant public wiki deployment run by a corporation. We were astonished to see the uptake and user interest, and watched the project grow from a few power users and a few dozen wiki pages to tens of thousands of users and tens of thousands of wiki pages.

Thank you all for contributing, providing feedback and helping us to make the project successful.

Despite being a small team (did you know that officially there wasn't a single person working on wikis full-time?), we managed to get a lot done and I'm very proud of our accomplishments.

I do believe that there is still room for improvements, but I made a decision that these improvements will have to be implemented by someone else. I have found a new challenge that I'm going to pursue and unfortunately it's time for me to hand the wikis project over to a new group that will oversee the operations and development of the site. I did my best to make this transition as smooth as possible and I'm hopeful that the site will be in good hands.

By chance, my last day at Oracle coincides with the wikis' 3rd birthday; I'll take that as a good sign. I wish you all the best and I hope to see you around. The Internet is a small place.

Good luck to you all.

Cheers,
Igor

PS: If you want to stay in touch, you can find me on LinkedIn at: http://www.linkedin.com/in/igorminar

DGC VI: Wiki Organization and Working with the Community

This blog post is part of the DevOps Guide to Confluence series. In this chapter of the guide, we’ll have a look at wiki organization and working with the user community. This post is going to be more subjective than the others, because the recommendations I'm going to make apply to a wiki site with goals and a purpose similar to ours. I'm just going to share our experience, and hopefully some of it will be useful for others.

The Purpose

The first thing that should be clear when building a wiki site is the purpose it's going to serve. Confluence has been successfully used for many purposes, ranging from team collaboration and documentation writing to serving as a website CMS, to mention just a few. When our team set out to build a wiki site, the goal was to create a wiki platform that could be used by anyone in our company to publicly collaborate with external parties without having to deploy and maintain their own wiki.

It was a pleasant surprise when one of the first groups of users to join our pilot three years ago turned out to be technical writers eager to drop their heavyweight tools with lots of fancy features in exchange for a lightweight and, more importantly, inclusive collaboration tool. The main issue they were facing was that their processes and tools were very exclusive, and it was next to impossible for a non-writer to quickly join in order to make small edits. This resulted in lots of proxying of engineering feedback and inevitable delays. With a wiki, the barrier to entry is very low for almost everyone. There is nothing to install or configure; a browser is all one needs. A wiki allowed a relatively small and overloaded team of technical writers to gather feedback from subject matter experts more efficiently and, more importantly, to incorporate it into the documentation. Of course there were trade-offs, mainly in the area of post-processing the content for printable documentation (i.e. generating PDFs), but I'm hopeful that as the wiki system matures, more attention will be paid to making this area stronger (Atlassian: hint hint).

Anyway, with the tech writers on board, the purpose, goals and evolution of our site got heavily influenced by their feedback. In exchange we received a lot of high quality content that attracted new users who started using the wiki. This kind of bootstrap of the site greatly helped to speed up the viral adoption across our thirty-thousand-employee company.

Wiki Organization

When we launched our site three years ago, there were no other big corporations with a public-facing wiki site (many corporations didn't even have an internal wiki yet; boy, has that changed since then). This put us in a position where we had to be the first explorers in search of best practices, as well as of things that didn't work at all.

Fortunately, since our team successfully pioneered the area of corporate blogging before the wikis launch, we had some experience with building communities that we could leverage.

Some of the main principles that we reused from our blogs site were:

  • Make the rules and policies as simple as possible
  • It is a goal shared by all employees to create a good image of the company and make the company successful. We should trust their judgement and empower them to do the right thing.
  • The team running the site is small, so the employees should be able to do as much as possible on their own (self-provisioning FTW!)
  • Since we trust our employees, we should delegate as much decision making and as many responsibilities as possible, and let them delegate some to others, otherwise we won't be able to scale.
  • There should be very little (close to no) policing or content organization done by the core team. We don't have the manpower for that. Besides, the Internet is not being policed by anyone and things tend to just work out. The popular, well organized and valuable content bubbles up, in one way or another.

Implemented Actions

With our principles laid out, we took these actions:

  • We integrated Confluence with our single sign-on and user provisioning system, which made it super easy for employees and external users to log in using their existing accounts.
  • Based on the information in our identity systems, we enrolled accounts of all of our employees into an employee-specific Confluence group, which we utilized when setting up global permissions.
  • The global permissions were set up so that employees (and only employees) could create new wiki spaces on their own, whenever they had the need, for whatever purpose.
  • We also opened up all wiki spaces to be viewable by anyone on the Internet, but we left it up to the space admins to restrict permissions if they felt it was necessary.
  • In order to mitigate spam issues, we made it impossible for anonymous users to obtain any write permissions at either the global or the space level.
  • We created a single Confluence theme that was applied to all wiki spaces by default and disabled all the other themes. This was done mainly for technical reasons: the Confluence UI has been changing dramatically over the last few years, and these changes often required modifying a custom theme to keep it compatible. If we allowed anyone to create their own theme, we'd never be able to upgrade for fear of breaking someone's theme, or we would have to coordinate our updates with all the maintainers of custom themes.
  • We created an internal mailing list where space admins and other wiki users could share their experience, ask questions, and report issues.

Things We Need to Work on

Nobody's perfect and neither are we, so let's look at what we could improve.

I know I just said that popular content always bubbles up, but considering how hideous the default Confluence front page is, I'd much prefer to utilize that real estate better and highlight popular or interesting content there.

I also think that we could do a better job at highlighting hardworking community members. There were some elaborate attempts to do this, but in my opinion a more lightweight approach could be more suitable for most of the sites.

Lastly, I think that staying in touch with our community is very important, and we could have done a better job at it if we had e.g. quarterly internal mini-conferences on various topics during which we could better gather their feedback. Also some better organized training sessions for our novice users could help boost our growth even further.

Conclusion

The recommendations and practices that worked for us might not be suitable for all Confluence deployments, but in our case things have worked out. There are still many areas where we could have done a better job, but I guess it's good to always have some space for improvements.

In the next chapter of my guide, we'll discuss issues and solutions that are specific for Internet-facing Confluence deployments.

Monday, August 02, 2010

DGC V: Customizing and Patching Confluence

This blog post is part of the DevOps Guide to Confluence series. In this chapter of the guide, we’ll have a look at how to customize and patch Confluence.

Customizing Confluence

Before we talk about any customization at all, I need to warn you. Any kind of customization of Confluence (or any other software) comes with a maintenance and support cost. The problems usually arise during or after a Confluence upgrade, and if they catch you unprepared, you might get yourself in a lot of trouble. Keep this in mind and before you customize anything, justify your intent.

There are several ways to customize Confluence. For some, the maintenance and support cost is low; others give you lots of flexibility at a higher cost. Depending on your needs and requirements, you can pick one of the following.

Confluence User Macros

I already mentioned these in the Confluence Configuration chapter — they are easy to create and usually don't break during upgrades, but they are a nightmare to maintain. Avoid them.

Confluence Themes, HTML Headers and Footers

You can easily inject HTML code into the header and footer by editing the appropriate sections of the Admin UI (described in the config chapter). If this HTML code contains visual elements, then it's possible that your code will break during upgrades. In general I would avoid editing headers and footers this way as much as possible, unless I was doing something very simple.

Confluence themes are the way to go. You can either pick a theme that was already built and published by someone else, or you can build your own. Building your own theme will give you the most flexibility, but the cost of maintaining and supporting it will be the highest. You can do some things to cut corners, but be prepared to do some Confluence plugin development (a Confluence theme is really just a type of Confluence plugin).

What worked well for me and our heavily customized theme was to create the theme as a patch for the Confluence default theme. I simply symlink all the relevant files from the Confluence source code into a directory structure that can be built as a Confluence theme/plugin, add my atlassian-plugin.xml and patch the files with the changes I need, no matter how complex they are. The advantage of this approach is that my theme will always be compatible with my Confluence version (after a rebase) and I get all the new features introduced in the new version. The downside is that I often need to rebase my patches during Confluence upgrades, but with a good patch management solution (see below) this headache can be greatly minimized.
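
To make the symlink idea concrete, here is a toy sketch. Every path and file name in it is a hypothetical stand-in (the real Confluence source layout and theme descriptor are not reproduced here); it only shows that a theme directory built from links always picks up whatever the patched source tree currently contains.

```shell
#!/bin/sh
# Toy layout only: every path and file name here is a hypothetical stand-in.
set -eu
work=$(mktemp -d)

# Pretend this is a template inside the (patch-managed) Confluence source tree.
mkdir -p "$work/confluence/templates"
echo 'default theme template' > "$work/confluence/templates/main.vm"

# Theme plugin skeleton: its own descriptor, templates linked from the source tree.
mkdir -p "$work/my-theme/templates"
: > "$work/my-theme/atlassian-plugin.xml"   # empty placeholder for the real descriptor
ln -s "$work/confluence/templates/main.vm" "$work/my-theme/templates/main.vm"

# A change to the source file (normally made by applying a patch) is visible
# through the link, so the theme build sees the patched template.
echo 'patched theme template' > "$work/confluence/templates/main.vm"
seen=$(cat "$work/my-theme/templates/main.vm")
echo "theme build would see: $seen"
```

Because the theme directory holds links rather than copies, the only files that need maintaining by hand are the descriptor and the patches themselves.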

Lastly there is Theme Builder from Adaptavist. I haven't personally used this Confluence plugin because it was not popular when we initially created our theme, and it was not desirable for us to depend on yet another (unknown at that time) vendor during our Confluence upgrades. If I were about to start creating a theme from scratch, I would compare it with my patching method and see which gives me the most benefits. My main concern with Theme Builder is whether I could version control the theme; if that is not easily possible, it might be a deal breaker for me and many others.

Confluence Plugins

I mentioned Confluence Plugins already in the previous chapter, so I'm not going to repeat myself here.

What I'm going to add is that you really can extend and customize Confluence in crazy ways via the plugins. You can either discover the existing plugins at Atlassian Plugin Exchange or you can build your own with Maven (or the Plugin SDK), Java (or another Java compatible language) and Atlassian Plugin Framework.

The nice thing about plugins is that they are encapsulated pieces of code that interact with the rest of Confluence via a public API, and additionally they are hot-pluggable. This means that in theory they should work after a Confluence upgrade and that you can install and uninstall them on the fly without a restart. While the latter is true in practice, the former is not always the case. Confluence's public APIs sometimes change, plugins rely on behavior that was not considered to be part of the public API, and the UI changes all the time, so any CSS/JavaScript code that relies on absolute or relative positioning or a fixed DOM structure will need occasional fixes during upgrades.

Patching Confluence (source) files

Lastly, I'm going to mention that one can modify Confluence's behavior by modifying the Confluence core files. This is a large topic and deserves its own section. ;-)

Patching Confluence

Patching Confluence is definitely the most advanced way to customize Confluence, especially if you start changing the Java source code, recompiling and creating your own war files. On the other hand, this way you get the most flexibility and will be able to change anything you want, even those things that plugins can't, all at your own risk.

Issues to Be Aware of

There are several potential issues that you should be aware of before you head down this route:

  • you might break something unintentionally — you can mitigate this with testing
  • you might have a hard time preserving your changes during an upgrade — you can minimize the problems by using good patch management strategy (discussed below)
  • you might have a problem getting support — this was never a problem for me mainly because most of my changes have been very isolated, so I could quickly tell if an issue is caused by my patch or if there is a bug in Confluence.

Reasons for Patching

The reasons for which you might want to patch Confluence fall generally into these four categories:

  • config change - as I mentioned in the Confluence Configuration chapter, some of the configuration is done by modifying files that are part of the Confluence standalone or war distribution (usually those in WEB-INF/classes directory). This is quite unfortunate because it adds a significant overhead to upgrades. I would much prefer if this configuration could be done via files in Confluence Home directory, but until that happens, the best way to manage these changes is by treating them as patches.
  • security fix - Atlassian often releases patches that fix security vulnerabilities in older versions of Confluence. What they actually release is a binary or textual file that represents the fixed version of the affected code. This file can then be just dropped into the appropriate location in (typically) WEB-INF/classes directory and the issue is fixed. This is a nice quick hack, but if your site is bigger or you plan to be on an older version for an extended period, your situation will be a lot more maintainable if you transform the fix into a patch against your version of Confluence.
  • temporary bug fix - occasionally Atlassian releases a temporary fix for an issue in a form similar to a security fix that will later on be properly fixed. In the meantime, the temporary fix can be used to work around the problem. Again, for a bigger site things will be a lot more maintainable if you manage this change as a patch.
  • a ui/behavior change - lastly, if you run a bigger site with lots of requirements coming from different groups of users, you might need to add a feature or disable an existing feature, add or remove a UI element, or change some behavior of Confluence in a way that is not possible via a plugin or a theme. In this case you definitely want to maintain every such change as an isolated patch. If you don't, you'll be in big trouble when the time to upgrade Confluence comes.

Patching Methods

Now that we know why we would be interested in patching Confluence, let's look at how to do it. Again, there are several ways, depending on what you need to patch.

  • patch the non-compilable files - these typically include config files as well as JavaScript, CSS and template files. If you are patching only these types of files, then you can just create patches against the Confluence standalone or war distributions (since no compilation is needed for these). More likely than not, once you get down the patching path, you'll want or need to do a lot more though.
  • patch files that require re-compilation - the Java source code. Soon after I realized that I needed to patch Confluence, I ended up modifying the Java source code in order to fix a bug or modify behavior. Atlassian makes this relatively easy to do, because along with the binary releases they also offer source code releases which can be used to build Confluence from sources on your own. This is a huge benefit for their customers, especially those who are willing to get their hands dirty to get the most out of Confluence. Once you have access to buildable source code, you can patch it and create your own builds with relatively small effort. The benefit of creating patches against the source release is that you can patch anything and everything (though I'm not saying that you should), starting from config files, JS, CSS and template files all the way to core Java class files, and all of that in a consistent and reliable way.
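
For the first case (files that need no compilation), a minimal sketch with plain diff and patch might look like the following. The directory names and the one-line log4j.properties are stand-ins I made up for an unpacked distribution; only the diff/patch round trip is the point.

```shell
#!/bin/sh
# Sketch: capture a config change as a patch and replay it on a fresh unpack.
# Directory names and file contents are hypothetical stand-ins.
set -eu
command -v diff >/dev/null 2>&1 && command -v patch >/dev/null 2>&1 || {
  echo "diff or patch not found; skipping demo"; exit 0; }

work=$(mktemp -d)
cd "$work"

# Stand-in for an unpacked standalone/war distribution.
mkdir -p pristine/WEB-INF/classes
echo 'log4j.rootLogger=WARN' > pristine/WEB-INF/classes/log4j.properties

# Make the change in a copy, then record it as a unified diff.
cp -R pristine modified
echo 'log4j.rootLogger=INFO' > modified/WEB-INF/classes/log4j.properties
# diff exits with 1 when the trees differ, which is what we expect here.
diff -ru pristine modified > log-level.patch || [ $? -eq 1 ]

# Later: apply the recorded change to a fresh unpack of the same version.
cp -R pristine fresh-unpack
(cd fresh-unpack && patch -p1 < ../log-level.patch)

result=$(cat fresh-unpack/WEB-INF/classes/log4j.properties)
echo "after patching: $result"
```

The -p1 strips the leading pristine/ or modified/ directory from the paths recorded in the patch, so the same patch file applies to any unpack directory regardless of its name.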

Patch Management

As I mentioned already, whenever you modify the source code you want to create an isolated patch that is a logical grouping of changes needed for one bug fix, config change or feature. Once you have many smaller patches like these you can apply or omit them in a build or update them one at a time when needed.

If you were to use the standard command line tools like diff and patch to work with these patches, you would probably go nuts quickly. There are far better, higher level solutions that can be used. Distributed source code management tools that are becoming an unstoppable force in the SCM arena of software development offer features that make patch management a piece of cake.

Git, for example, offers a feature called Stash which allows you to create and maintain patches against your git repository. I don't have personal experience with git-stash, but from the docs it looks like it should do what we want.

The solution that I've been using and loving for the past 3 years is Mercurial and its core plugin Mercurial Queues. Working with this plugin is also well documented here and here.

Here are some main points for how I do my patching:

  • I store Confluence sources in my main Mercurial repository. I simply grab the source zip from Atlassian's website, unzip it, rename the root directory to "confluence" and add it to my repository.
  • When a new version of Confluence is released, I delete the confluence directory in the working copy of my repo and replace it with the files from the new zip file and commit the files with --addremove flag, which will automatically add all the new files and remove all the deleted files to/from the repository. This allows me to track diffs between Confluence versions, which is very handy when I'm debugging a new issue and want to find out in which release it was introduced.
  • In addition to this main repository I have a versioned (stored as a real hg repo) Mercurial Queue associated with it. The patch repo is very easy to create, just by issuing hg qinit -c command.
  • Now every time I want to change something, I create a new patch with hg qnew mypatchname.patch, modify the confluence source and then just do hg qrefresh to move my changes to mypatchname.patch. This doesn't commit the changes in the patch repo; you have to do that explicitly via hg qcommit or by changing your current directory to .hg/patches and issuing a regular hg commit there.
  • Once you have patches in your queue, you can now easily apply and unapply patches with commands like hg qpush, hg qpop and hg qgoto.
  • Additionally you can set "guards" on patches, so you can create collections of patches that should be applied only for certain builds. For example, if you have some patches that should be applied only in development environment, you can set guards on them via hg qguard and then switch between these collections via hg qselect followed by hg qpop -a and hg qpush -a.
  • If you need to modify an existing patch, just hg qgoto to it, modify the confluence source code, run hg qrefresh and finally hg qcommit.
  • In order to store binary files in your patches (e.g. images), you'll need to tell MQ to use Git's extended diff format. This is done by adding the following lines into your ~/.hgrc:
    [defaults]
    diff = --git
    qrefresh = --git
    email = --git
  • When upgrading Confluence, you first need to hg qpop all the patches, replace and commit the new confluence sources as discussed above and then reapply your patches with hg qpush -a. If you are lucky, all changes will apply smoothly. Sooner or later, however, especially once your patch collection grows to a decent size, you'll need to rebase your patches. The conflicts are resolved via a 3-way merge, which can be done either by hand or with a fancy 3-way merge tool. Once all the patches are applied, be sure to test that everything still works as originally intended.
  • To make conflict resolution less frequent, I strive to create patches that do as little as possible to get things done. I avoid any major refactorings, api changes and "forget" about some best practices, especially in those cases when I know that Atlassian won't be interested in accepting my patch upstream. For these patches the main focus should be on getting things done, robustness and maintainability.
  • Lastly, if there is a patch that is generally useful for all Confluence users, I usually attach it to the relevant RFE/bug report on Confluence's bug tracker. The fewer patches I have to maintain the better. When a patch is accepted upstream, I simply remove it with hg qrm.

I could go on and on about what a life-saver Mercurial Queues are but the best way to get to know them is to do some experimentation on your own. I strongly encourage you to do that, it's a good tool to have in your toolbox.

Just to give you some inspiration, here is a list of some of the patches that I created for our build:

  • bundle jdbc driver with the war (by specifying new maven dependency in confluence's pom file)
  • configure Seraph login/security framework
  • replace Confluence's favicon with ours
  • modify the default log4j config
  • customize error pages
  • turn Australian English language pack into US English
  • remove all lower() function calls from Hibernate mapping files and Java classes to get major boost in db performance (see CONF-10030 and this doc)
  • disable mail archiving UI
  • enforce that our custom theme is the default and only available theme
  • remote api security enhancements
  • allow only members of our employee group to become space admins
  • as I already mentioned, our whole theme plugin is implemented as a bunch of patches against the default Confluence theme
  • and the list goes on and on...

Conclusion

In this chapter we went through several possible ways to customize Confluence. Plugins and themes are definitely the safest and most manageable way to go; however, patching, if done right, will give you the most flexibility. If you use the right tools for patch management (like Mercurial Queues), you'll be able to manage big collections of patches with very little maintenance overhead.

Next time we'll have a look at a non-technical aspect of running a large Confluence wiki site.