Tuesday, June 10, 2008

Speaking at Lone Star Ruby Conference

I will be speaking at the Lone Star Ruby Conference in September about how we use Ruby to deploy, monitor, and manage a cluster of servers running in the Amazon Web Services virtual cloud.   Below is a summary of what I'll be talking about.

In OtherInbox, almost every system administration task imaginable is carried out using Ruby, meaning we as developers can enjoy all of Ruby's expressive benefits and spend less time scripting the shell, writing cron tasks, or using other languages. Because we make fewer context switches from thinking in Ruby to thinking in other languages, we also reap a big productivity benefit.

Using Ruby throughout our cloud also means that porting the application to a different production environment is a trivial task. Ruby is the glue connecting the components together, so all we require to deploy is a Ruby interpreter.

Two key Ruby technologies have matured in the past 18 months, making Ruby well suited to almost every layer of managing a cluster of servers:
  • god.rb allows fine-grained process monitoring and daemon control (a la monit)
  • rufus-scheduler enables Ruby-based scheduling (replacing cron, and providing a great facility for running daemons that must be executed on a recurring basis)
With these Ruby workhorses in place, developers today can spend much more of their time writing Ruby code and less time struggling with the vagaries of their production environment.

The talk will also cover several different AWS gems that make cloud computing simple, illustrating how Amazon's S3 and SQS services can be used to distribute asynchronous work and handle communication between servers.

(cross-posted from the OtherInbox blog)

Friday, June 6, 2008

random_data v1.3.1 released

random_data v1.3.1 is out. Courtesy of stalwart contributor Hugh Sasse, this release includes more first names and two new methods: Random.firstname_male and Random.firstname_female.

Install it via:
sudo gem install random_data
or get it manually.

Sunday, May 25, 2008

random_data v1.3.0 released

random_data is a testing and seed data gem I wrote a few years back to help get Ruby projects up and running with semi-realistic fake data (the faker gem provides similar functionality).

I just released version 1.3.0 which includes a bunch of RDoc enhancements as well as some new features contributed by the tireless (and patient!) Hugh Sasse:
  • Added RandomData::Grammar, which lets you create simple random sentences from a grammar supplied as a hash, like so:
    Random::grammatical_construct({:story => [:man, " bites ", :dog], :man => { :bob => "Bob"}, :dog => {:a =>"Rex", :b =>"Rover"}}, :story)
    ==> "Bob bites Rex"
  • Added Random.bit and Random.bits
  • Added Random.uk_post_code
  • Bug fix: zipcodes should be strings, not integers
Thanks Hugh!  Open source is awesome!
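To make the grammar idea concrete, here is a rough sketch of how such an expansion could work. It is not the random_data implementation: the expand_grammar method is a hypothetical stand-in, and it assumes that arrays are sequences, hashes are sets of alternatives, symbols are rule names, and strings are literal text.

```ruby
# Hypothetical stand-in for illustration; not the random_data implementation.
# Assumed conventions: String = literal text, Symbol = rule name,
# Array = sequence of parts, Hash = pick one of the values at random.
def expand_grammar(grammar, node)
  case node
  when String
    node
  when Symbol
    raise ArgumentError, "unknown rule #{node.inspect}" unless grammar.key?(node)
    expand_grammar(grammar, grammar[node])
  when Array
    node.map { |part| expand_grammar(grammar, part) }.join
  when Hash
    expand_grammar(grammar, node.values.sample)
  else
    raise ArgumentError, "cannot expand #{node.inspect}"
  end
end

grammar = {
  :story => [:man, " bites ", :dog],
  :man   => { :bob => "Bob" },
  :dog   => { :a => "Rex", :b => "Rover" }
}
puts expand_grammar(grammar, :story)
```

With this grammar the result is either "Bob bites Rex" or "Bob bites Rover", depending on which dog is picked.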

Using Ruby's Autoload Method To Configure Your App Just-in-Time

Reading The Ruby Programming Language was a great experience — like revisiting a country I thought I knew intimately, but with expert tour guides who showed me whole new landscapes. It's also a good primer on what's changing in Ruby 1.9.

One of my favorite discoveries was Ruby's autoload method. Using autoload, you can associate a constant with a filename to be loaded the first time that constant is referenced, like so:
autoload :BCrypt, 'bcrypt'
autoload :Digest, 'digest/sha2'
The first time the interpreter encounters the constant BCrypt, it will load the file 'bcrypt' from Ruby's current load path, which it assumes will contain the definition of that constant. (Note that autoload takes the name of the constant, in symbol form, not the constant itself).

Here's an example of how useful it can be. OtherInbox uses beanstalkd in a few places where we haven't yet migrated to SQS. I was loading the beanstalk client with a Rails initializer, 'config/initializers/beanstalk.rb':
require 'beanstalk-client'
BEANSTALK = Beanstalk::Pool.new(['localhost:11300'])
Making this initial connection on our production server took five seconds or more each time I restarted the app or dropped into the console. That doesn't sound like much, but when you're doing it a few times a day, it starts to add up. So I moved the beanstalk code out of the initializer and into 'lib/etc/load_beanstalk.rb'. I placed all of my autoloads in a single initializer, 'config/initializers/autoload.rb'. For beanstalk, the statement is:
autoload :BEANSTALK, 'etc/load_beanstalk'
Now, the app starts more quickly, and even better, this library doesn't get loaded into memory by parts of the app that don't need it.
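The deferred loading can be observed with nothing but the standard library. The sketch below is an illustration only: lazy_constant_demo, the temporary file, and the LAZY_SETTINGS constant are made-up names, and a plain hash stands in for the slow beanstalk connection. It uses Module#autoload? to show that the file stays unloaded until the constant is first referenced.

```ruby
require 'tmpdir'

# Illustrative helper (hypothetical name): registers an autoload for
# const_name and reports the autoload state before and after first use.
def lazy_constant_demo(const_name)
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'load_settings.rb')
    # Stand-in for a slow initializer such as opening a connection pool.
    File.write(path, "#{const_name} = { :host => 'localhost:11300' }\n")

    Object.autoload(const_name, path)
    pending_before = Object.autoload?(const_name) # path string while still unloaded
    value          = Object.const_get(const_name) # first reference triggers the load
    pending_after  = Object.autoload?(const_name) # nil once the file has been loaded

    { :pending_before => pending_before, :value => value, :pending_after => pending_after }
  end
end

result = lazy_constant_demo(:LAZY_SETTINGS)
puts "pending before first use: #{!result[:pending_before].nil?}"
puts "value: #{result[:value].inspect}"
puts "pending after first use: #{!result[:pending_after].nil?}"
```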

Saturday, March 15, 2008

Using stunnel to wrap Ruby network operations on the fly

In my current project, we need to be able to connect to POP3 servers. Some POP3 servers, such as Gmail, only allow SSL connections. Unfortunately, the Ruby 1.8.x net/pop library doesn't support SSL (although the 1.9 library does, but that was not an option for us in this project).

The usual answer here is to wrap your connection using stunnel, which acts as an SSL proxy for whatever traffic you want to send over it. Usually you run stunnel as a separate service pointing at the server, but since we'll be connecting to many different POP servers, I needed to be able to set up and tear down stunnels on the fly. The first attempt looked something like this:
system("echo -e 'foreground = yes\npid =\n[mail]\nclient = yes\n \
accept = 127.0.0.1:2000\nconnect = #{server}:#{port}\n' \
| stunnel -fd 0")
Since stunnel doesn't accept command-line options, you have to pipe options to it. The "-fd 0" tells stunnel to read its configuration from file descriptor 0, better known as STDIN.

Since I needed to run that command in a child process, then have the parent resume and make use of the child service, I embarked on a fun foray into Ruby's forking and threading capabilities.

First, I tried forking, replacing the child process with a call to exec instead of system, then detaching the parent and killing the child process when the POP session was done. This partially worked, but I couldn't figure out how to kill the child process, so I'd end up with multiple copies of stunnel running after the script ran, or the parent process itself would hang.

Looking through the Pickaxe chapter on threads and processes, I discovered IO.popen, which works perfectly. I can pipe input to STDIN, avoiding the ugliness of the "echo -e" above, and I can more easily kill the child process when I'm done.
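The general shape of that pattern can be tried without stunnel installed. In this sketch, with_configured_child is a hypothetical helper name, and a small Ruby one-liner that echoes its STDIN stands in for the real child process. The rescue of Errno::ESRCH and the final close (which reaps the child) are additions for illustration.

```ruby
require 'rbconfig'

# Hypothetical helper: start a child, write config_text to its STDIN,
# yield the pipe, and always kill and reap the child afterwards.
def with_configured_child(command, config_text)
  child = IO.popen(command, 'w+')
  child.write(config_text)
  child.close_write # the child now sees EOF on its STDIN
  yield child
ensure
  if child
    begin
      Process.kill('KILL', child.pid)
    rescue Errno::ESRCH
      # The child has already exited and been reaped; nothing to kill.
    end
    child.close unless child.closed? # waits for the child, avoiding a zombie
  end
end

# Stand-in child: a Ruby process that copies its STDIN to STDOUT.
echo_child = [RbConfig.ruby, '-e', 'STDOUT.write(STDIN.read)']
output = with_configured_child(echo_child, "client = yes\n") { |io| io.read }
puts output
```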

This is what the final method looks like:
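def stunnel_wrap(server, port)
  # Start stunnel with its configuration read from STDIN (file descriptor 0).
  stunnel = IO.popen("stunnel -fd 0", 'w+')
  stunnel.puts("foreground = yes\npid =\n[mail]\nclient = yes\n" \
               "accept = 127.0.0.1:2000\nconnect = #{server}:#{port}\n")
  stunnel.close_write
  # Give the tunnel a moment to come up before using it.
  Kernel.sleep(1)
  yield
ensure
  # Always kill the stunnel child, even if the block raised.
  Process.kill(9, stunnel.pid) if stunnel
end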
I handle exceptions at a higher layer in this class, so here all I do is make sure the stunnel process gets killed no matter what. I'm not sure if the sleep call is needed, but when I was testing this with Gmail it seemed to help to wait one second for the tunnel to activate before trying to use it.
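One possible alternative to a fixed sleep is to poll the local port until it accepts a connection. The sketch below is untested against stunnel itself: wait_for_port is a hypothetical helper, and it only proves that something is listening locally, not that the SSL session to the remote server is up. Each probe connection may also cause stunnel to open and then drop a connection to the remote side.

```ruby
require 'socket'

# Hypothetical helper: returns true as soon as host:port accepts a TCP
# connection, or false if that has not happened within timeout seconds.
def wait_for_port(host, port, timeout = 5.0, interval = 0.1)
  deadline = Time.now + timeout
  loop do
    begin
      TCPSocket.new(host, port).close
      return true
    rescue Errno::ECONNREFUSED, Errno::EHOSTUNREACH, Errno::ETIMEDOUT
      return false if Time.now >= deadline
      sleep(interval)
    end
  end
end

# Demonstration against a throwaway local listener on an OS-assigned port.
server = TCPServer.new('127.0.0.1', 0)
puts wait_for_port('127.0.0.1', server.addr[1], 1.0)
server.close
```

In stunnel_wrap, a call such as wait_for_port('127.0.0.1', 2000) could take the place of Kernel.sleep(1).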

To make the above example work, you just need to point your POP client at the stunnel (in this case 127.0.0.1 port 2000) and you'll be talking SSL to the server.
stunnel_wrap('pop.gmail.com', 995) do
  Net::POP3.start('127.0.0.1', 2000, account, password) do |pop|
    # pop securely
  end
end

Tuesday, December 4, 2007

random_data v1.2.1 released

Thanks to Paul Barry and Hugh Sasse for some awesome patches to random_data! They're responsible for all of the new stuff added this release.

4 major enhancements
  • Added method_missing to Random class, which looks for a method_name.dat file and fetches a random line for you (see docs for details) (Hugh Sasse)
  • Added Random.date_between method to get a date between 2 given dates (Paul Barry)
  • Added Random.boolean method to get a random true or false (Paul Barry)
  • Added Random.number to get a random number less than a number or within a given range (Paul Barry)
1 minor enhancement
  • Enhanced Random.date method to handle a Range as well as a Fixnum, which allows you to get a date guaranteed to be in the past (Paul Barry)
3 minor fixes
  • Location sources organized into more understandable categories, for easier future expansion (Hugh Sasse)
  • Fixed path of require statements in random_data.rb (Paul Barry)
  • Make initial never return nil, because if it returns nil then ContactInfo#email can throw an error when it tries to call a method on nil. (Paul Barry)
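The method_missing lookup can be pictured roughly as follows. This is a sketch of the technique, not the gem's code: the DatLookup class, its data directory argument, and the colors.dat file are all hypothetical names chosen for the example.

```ruby
require 'tmpdir'

# Hypothetical stand-in; not the random_data implementation.
# Any unknown method name is treated as "<name>.dat" in data_dir,
# and one random non-blank line from that file is returned.
class DatLookup
  def initialize(data_dir)
    @data_dir = data_dir
  end

  def method_missing(name, *args, &block)
    path = dat_path(name)
    return super unless File.file?(path) # no data file: normal NoMethodError
    lines = File.readlines(path).map { |line| line.strip }
    lines.reject { |line| line.empty? }.sample
  end

  def respond_to_missing?(name, include_private = false)
    File.file?(dat_path(name)) || super
  end

  private

  def dat_path(name)
    File.join(@data_dir, "#{name}.dat")
  end
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'colors.dat'), "red\ngreen\nblue\n")
  puts DatLookup.new(dir).colors
end
```

Adding a new kind of random value then only requires dropping another .dat file into the directory.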

Saturday, October 20, 2007

random_data v1.1 released

I received a patch from Hugh Sasse for random_data which adds separate male and female first names and adds more names. I've released these changes as v1.1.
sudo gem install random_data
or get it manually from the rubyforge site.

Thanks, Hugh!

Thursday, September 20, 2007

random_data gem released

I just released my first Ruby gem. I have a library of functions that I use for generating realistic data so I can have more meaningful examples to work with during development. So I used the newgem generator and hoe to package it all into a gem called random_data.

It provides a Random singleton class with a series of methods for generating random test data including names, mailing addresses, dates, phone numbers, e-mail addresses, and text. This lets you quickly mock up realistic looking data for informal testing.

Instead of:
>> foo.name = "John Doe"
You get:
>> foo.name = "#{Random.firstname} #{Random.initial} #{Random.lastname}"
>> foo.name
=> "Miriam R. Xichuan"


The gem also includes code for phone numbers, e-mail addresses, physical addresses, and (primitive) text generation.

You can install it via:

sudo gem install random_data
For more details and full documentation, visit the rubyforge site.
