Showing posts with label Mediacast. Show all posts
Showing posts with label Mediacast. Show all posts

Thursday, February 05, 2009

Announcing grizzly-sendfile!

It's my pleasure to finally announce grizzly-sendfile v0.2 - the first stable version of a project that I started after I got one of those "Sudden Burst of Ideas" last summer.

For people who follow the grizzly development or mediacast.sun.com, this is not exactly hot news. grizzly-sendfile has been used by mediacast since last September and mentioned on the grizzly mailing list several times since then, but I haven't had time to promote it and explain what it does and how it works, so here I go.

If you don't care about my diary notes, skip down to "What is grizzly-sendfile".

A bit of background: the whole story goes back to the end of 2007 when a bunch of us where finishing up the rewrite of mediacast.sun.com in JRuby on Rails. At that time we realized that one of the most painful parts of the rewrite would be implementing the file streaming functionality. Back then Rails was single-threaded (not any more, yay!), so sending the data from rails was not an option. Fortunately, my then-colleague Peter, came up with an idea to use a servlet filter to intercept empty download responses from rails and stream the files from this filter. That did the trick for us, but it was a pretty ugly solution that was unreliable from time to time and was PITA to extend and maintain.

At around this time, I learned about X-Sendfile - a not well known http header - that some webservers (e.g. apache and ligttpd) support. This header could be used to offload file transfers from an application to the web server. Rails supports it natively via the :x_sendfile option of send_file method.

I started looking for the X-Sendfile support in GlassFish, which we have been using at mediacast, but it was missing. After some emails with glassfish and grizzly folks, mainly Jean-Francois, I learned that the core component of glassfish called grizzly could be extended via custom filters, which could implement this functionality.

The idea stuck in my head for a few weeks. I looked up some info on grizzly and NIO and then during one overnight drive to San Diego, I designed grizzly-sendfile in my head. It took many nights and a few weekends to get it into reasonable shape and test it under load with some custom faban benchmarks that I had to write, but in late August I had version 0.1 and was able to "sell" it to Rama as a replacement of the servlet filter madness that we were using at mediacast.

Except for a few initial bugs that showed up under some unusual circumstances, the 0.1 version was very stable. A few minor 0.1.x releases were followed by 0.2 version, which was installed on mediacast servers some time in November. Since then I've worked on docs and setting up the project at kenai.com.

What is grizzly-sendfile?

From the wiki: grizzly-sendfile is an extension for grizzly - a NIO framework that among other things powers GlassFish application server.

The goal of this extension is to facilitate an efficient file transfer functionality, which would allow applications to delegate file transfers to the application server, while retaining control over which file to send, access control or any other application specific logic.

How does it work?

By mixing some NIO "magic" and leveraging code of the hard working grizzly team, I was able to come up with an ARP (asynchronous request processing) filter for grizzly. This filter can be easily plugged in to grizzly (and glassfish v2) and will intercept all the responses that contain X-Sendfile header. The value of this header is the path of the file that the application that processed the request wants to send to the client.

All that an application needs to do is to set the header. In Java a simple example of such a code looks like this:
 response.setHeader("X-Sendfile", "/path/to/file.avi");
In Rails, it looks even nicer:
send_file '/path/to.png', :x_sendfile => true
That's it, grizzly-sendfile will take care of the rest.



Why should you care?


For me it was all about keeping my code clean and solving problems at layers where they made the most sense. Then it was also about performance and scalability - the kind of stuff that one can do with NIO, can't be now done in JavaEE because of its synchronous nature. And then of course it was about having full control over downloads (like successful download notification and other customizations that are possible via grizzly-sendfile plugins). Oh, and I must mention JMX monitoring:



What's next?

There is a lot of stuff on my roadmap. Two of the main missing features are partial downloads and glassfish v3 (grizzly 1.9.x) support. Then there is better monitoring and tons of performance and scalability tuning, which I haven't really focus on yet. A lot of the API still needs to be polished and cleaned-up. Also needed is a solid test suite that is more fine grained than the system/integration tests that I created with faban.

Can you use grizzly-sendfile?

Yeah, go for it. This is my pet project that I developed in my free time. The project is licensed under GPL2, so you can even grab the code if you want.

Can you help?

Sure. Code reviews, patches, suggestions and help with testing and documentation are more than welcome!

Friday, February 08, 2008

Mediacast Deployment Diagram

During our last meeting, Arun suggested that I publish the high-level deployment diagram of mediacast.sun.com. So here it is:



I described everything in textual form in my previous blog entry about mediacast.

Sunday, January 27, 2008

JRuby on Rails Rewrite of mediacast.sun.com Launched

A few days ago, we finally released Mediacast 2.0 - a complete rewrite of the old Mediacast application.



The original application was based on a few servlets, filters, and a lot of JSPs. It was all put together in hurry a few of years ago, when one of those fire-drill requests came to provide Sun employees with a site where they could publish files. The old code was hard to maintain and extend, and that's why we decided to EOL the old code base and write something better from scratch.

I'm a fan of Rails and for the past year I've been amazed by the great progress of the JRuby project. This was one of the reasons why I suggested that we could try to rewrite the app in Rails and deploy it in a regular Java web container, thanks to JRuby and Goldspike. It took some time for Rama to give us a go-ahead, but finally in late September, two other colleagues (with no Rails or JRuby experience) and I started to work on the rewrite alongside our other projects (forums.sun.com and wikis.sun.com).

Many people asked us why JRuby on Rails was picked for this project. Here is an incomplete list of reasons:
  • we were starting from scratch, so we were not tied to any legacy code, and could pick any web framework that runs on Java
  • proof of concept project to evaluate the technology for other uses across our organization
  • verify that Rails really delivers the rapid development promise
  • something new/fun for the team to help balance out the not-so-fun stuff :)

Here is my experience:

Rails and JRuby Learning Curve


I worked on my first Rails project in the summer of 2006 and since then I worked on a few other (internal) Rails projects. But having to teach others and deploy an application externally using JRuby was a new experience for me.

As far as teaching goes, I can't critique myself, but what I can say is that I had a lot of good help thanks to some Rails, Ruby and JRuby books, and many websites and blogs that have sprung up on the Internet in the last couple of years.

Learning JRuby (with previous Ruby and Java experience) is easy, because most of the time "it just works" and very rarely is the developer aware that the C-based MRI Ruby interpreter is not being used. The ability to access and seamlessly integrate with existing or new Java code is a huge plus without which we would not have been able to launch this project successfully (more on this later).

The deployment part was a different story. The Ruby on Rails application can be deployed in many different ways and JRuby offers a new alternative approach to all of them. Instead of e.g. having Apache webserver reverse-proxy requests to an army of Mongrels, JRuby and Goldspike make it possible to deploy Rails applications in a JavaEE web container. IMO this is much cleaner than anything else that is available in the non-JRuby Rails world. The downside is that this deployment method is quite new and there isn't a lot of documentation and community knowledge about it.

Application Architecture


Our app is a simple database-driven Rails application consisting of a handful of models, 4 controllers, and a bunch of view templates. The only piece of data that is not stored in the database are the actual files, which are stored on the file system.

To be on the safe side when it comes to performance, we employed quite a lot of fragment caching, which sped up our application quite a bit.

Development Environment


Our development environment is based on a self-contained1 JRuby (1.0.3) on Rails (1.2.6) application stored in a Mercurial repository. The IDE we use is NetBeans 6 with Ruby support and Mercurial plugin. The DB of our choice is MySQL and servers we use during development are WEBrick and Glassfish v2ur1.

NetBeans makes it super easy to write the Ruby code and offers a lot of neat features for Rails application development. I have to say that Tor and the gang did a great job. In fact I switched from Eclipse to NetBeans thanks to its great Ruby support.

Because the application runs in the JVM we can use JConsole to monitor the app while load testing which comes in really handy!

So all is good here, except for one thing. I got used to using the amazing ruby-debug debugger for debugging my Ruby on Rails applications. This debugger is not yet available for JRuby on Rails application (unless you hack your way through). I read somewhere that jruby-debug should be available with NetBeans 6.1. Once that is done, the JRoR dev environment will be on par with RoR (in fact thanks to tools like JConsole it will be superior).

Production Environment


We use a pair of load-balanced T2000 with Solaris 10, JDK6 and SJSAS 9.1u1. These two servers share a nfs NAS drive used for file storage and fragment cache storage. The DB backend is a MySQL database server, which we access through a connection pool set up in the app server.

All of this was fairly easy to set up. As you can see there were no special requirements when comparing this environment to a usual JavaEE production environment.

The Good


Once my colleagues grasped some RoR basics, we got the core of the application up and running fairly fast. Sometimes it still amazes even me how much one can do with Rails in a short amount of time. Thanks to a small amount of code one needs to write, the code review process is fast as well, and fixing a bug often means changing only a few lines of code.

The Bad


Since most of the C-based gems are not compatible out-of-box with JRuby (at the moment), the jruby-extras project aims to deliver JRuby compatible versions of these libraries. One of these libraries - JRuby-OpenSSL - was needed for me to integrate our app with our authentication webservice. I soon found out that JRuby-OpenSSL was not completely implemented yet and parts of functionality that I needed were missing. I'm sure that it is just a matter of time when problems like this will go away (if it hasn't happened already) as these libraries will mature.

The Ugly


The Goldspike project provides a bridge between the JavaEE and RoR world. It does this by dispatching the incoming HTTP requests into Rails running in a JRuby runtime.

This works great for all requests that take little time to process, but if you have long-running requests like large file uploads or downloads, these requests will occupy your JRuby runtimes and you soon realize that you are running out of runtimes in your small pool, at which point your application becomes unresponsive for any new requests.

This was the biggest problem we hit, but thanks to the possibility to seamlessly integrate Java code into our JRoR app and a great idea that my colleague Peter had, we solved this issue quite elegantly.

JRuby and Java Come to the Rescue


The two main issues we faced and that consumed most of my time on this project were related to the immaturity of the libraries around JRuby and application characteristics specific to JRoR deployment. Both of these issues can be resolved thanks to years of long Java and JavaEE history, and their libraries.

The problem with reliable connection to our SOAP-over-HTTPS based authentication webservice was resolved by generating a client Java webservice stub and using that instead of SOAP4R which didn't work properly because of the problems with JRuby-OpenSSL.

The second problem with the long-running processes occupying our precious JRuby runtimes, we solved by using a servlet filter and a fake HttpServletResponse (that we sent to Goldspike instead of the original one) and streaming the data from the filter instead from the runtime. I'll write a separate blog entry on this.

Overall Impression


This project was/is fun to work on. We experimented with quite a few new (for us) technologies and learned a lot along the way. To be honest, I expected to deploy the app much sooner but at that time I was not aware of the two above mentioned issues which consumed a lot of my time.

I was a bit worried about the performance of the application, but that turned out to be a non issue once we had the download servlet filter in place, and with the performance improvements in JRuby 1.1 things will be even better in the future.

Overall I'm happy with the outcome of our project and I look forward to adding more functionality to the application.

Our Future Plans


From the infrastructure perspective:
  • upgrade to JRuby 1.1 and Rails 2.0
  • start using Warbler, which looks to be superior to Goldspike's Rails plugin, for building the war file
  • experiment with in-memory session state replication in Glassfish

From the feature perspective:
  • better categorization of media items
  • search functionality via integration with search.sun.com
  • previews of media items
  • new UI design
  • and the toughest one - audio and video streaming


JRuby/Goldspike/Glassfish Things I Would Like to See Improved


  • I'm not sure if it is us and our (mis)configuration, Goldspike, JRuby or Glassfish, but right now we need quite a big JVM heap to keep things running. With 8 JRuby instances in the pool and Http thread count set to 512, we need -Xmx set to 2-2.5GB. This is a bit too much I think. We'll have to look into this and find the culprit.
  • The application startup is quite slow, especially on a machine like T2000, which doesn't perform well for heavy-weighted single-threaded operations. I don't see a reason, why Goldspike couldn't initialize the JRuby runtime pool concurrently cutting down the startup time significantly.
  • Offload JRuby runtimes as much as possible. Currently operations like file upload or file download (via Rails' send_file) are handled by JRuby runtimes. I think that it should be possible to take care of these operations outside of the runtime, allowing the runtime to process other rails requests in the meantime.



1 - Rails and all the other gems and jars are frozen into the project