The problem is not really a bug in Confluence (even though using overly complex regular expressions doesn't help), but a problem in the JDK. After a brief search at bugs.sun.com I found the root cause of our StackOverflowError documented as bug 6337993.
Originally I tried to mitigate the situation by increasing the memory reserved for each thread's stack via the -Xss JVM parameter. This helped a little, but wasn't good enough in most cases.
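For illustration only: -Xss changes the stack size for the JVM's threads in general, but Java also lets you request a bigger stack for a single thread through the four-argument Thread constructor. The sketch below is not what I did in Confluence; the class name BigStackMatcher, the method name and the thread name are made up for the example, and the JVM is free to treat the requested size as a mere hint.

```java
import java.util.regex.Pattern;

class BigStackMatcher {

    // Runs pattern.matcher(input).matches() on a dedicated thread that asks
    // for stackBytes of stack. The size is only a hint; some platforms ignore it.
    static boolean matchesWithStack(Pattern pattern, String input, long stackBytes) {
        boolean[] result = new boolean[1];
        Throwable[] failure = new Throwable[1];

        Thread worker = new Thread(null, () -> {
            try {
                result[0] = pattern.matcher(input).matches();
            } catch (StackOverflowError e) {
                // Even the larger stack was not enough for this input.
                failure[0] = e;
            }
        }, "regex-big-stack", stackBytes);

        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while matching", e);
        }

        if (failure[0] != null) {
            throw new IllegalStateException("stack still too small", failure[0]);
        }
        return result[0];
    }

    public static void main(String[] args) {
        // Alternation inside a repetition is the kind of pattern that recurses deeply.
        Pattern pattern = Pattern.compile("(a|b)*");
        System.out.println(matchesWithStack(pattern, "abababab", 16L * 1024 * 1024));
    }
}
```

Like -Xss, this only moves the limit: a long enough input will still overflow whatever stack size you pick.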
Last week I decided to go against everything I've been taught, and wrote a patch for Confluence that wraps the part of the code that results in the StackOverflowError in a try/catch block. I know that throwables that extend Error should not be caught by client code because they usually indicate a failure that only the JVM should try to recover from, but IMO in the case of StackOverflowError the situation is a bit different. That is mainly because the JVM unwinds the stack as the StackOverflowError propagates, so by the time execution reaches the catch block, the deep chain of calls that exhausted the stack is gone and the JVM has effectively recovered from the error.
I don't claim this to be a solution to the problem. It's just a workaround that works better than increasing the stack size in this particular case. The fact that Confluence doesn't find all the URLs in wiki pages (used mainly to list outgoing links in the page info view) is a small sacrifice compared to the inability to save or copy the page.
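The idea behind the workaround can be sketched roughly like this. This is not the actual Confluence patch: the class SafeLinkExtractor, the method extractLinks and the simplified URL pattern are all hypothetical, made up to show the shape of the try/catch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SafeLinkExtractor {

    // Hypothetical, deliberately simple URL pattern (not Confluence's real one).
    static final Pattern URL = Pattern.compile("https?://\\S+");

    // Collects every match of the pattern in the text. If the regex engine
    // blows the stack, the links found so far are returned instead of
    // letting the error abort the whole operation (e.g. saving the page).
    static List<String> extractLinks(Pattern pattern, String text) {
        List<String> links = new ArrayList<>();
        try {
            Matcher matcher = pattern.matcher(text);
            while (matcher.find()) {
                links.add(matcher.group());
            }
        } catch (StackOverflowError e) {
            // The stack has been unwound by the time we get here, so it is
            // safe to carry on. The price is an incomplete list of links.
        }
        return links;
    }

    public static void main(String[] args) {
        String page = "See http://example.com/a and https://example.org/b for details.";
        System.out.println(extractLinks(URL, page));
    }
}
```

The catch is kept as narrow as possible on purpose: only StackOverflowError, and only around the matching loop, so other errors such as OutOfMemoryError still propagate as usual.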
As for the solution, it seems that reimplementing Java's regular expression library would be the most suitable one. I took code that fails in Java and ran it in JRuby, which uses a Java port of the Oniguruma regex engine, and things worked flawlessly. From what I've read, this engine also gives JRuby a performance boost over java.util.regex.