<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
	>

<channel>
	<title>Runa - Cognizant Transmutation</title>
	<atom:link href="https://www.ibd.com/category/runa/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.ibd.com</link>
	<description>Internet Bandwidth Development: Composting the Internet for over Two Decades</description>
	<lastBuildDate>Thu, 05 Aug 2021 06:03:13 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.1</generator>

<image>
	<url>https://i0.wp.com/www.ibd.com/wp-content/uploads/2019/01/fullsizeoutput_7ae8.jpeg?fit=32%2C32&#038;ssl=1</url>
	<title>Runa - Cognizant Transmutation</title>
	<link>https://www.ibd.com</link>
	<width>32</width>
	<height>32</height>
</image> 
<atom:link rel="hub" href="https://pubsubhubbub.appspot.com"/><atom:link rel="hub" href="https://pubsubhubbub.superfeedr.com"/><atom:link rel="hub" href="https://websubhub.com/hub"/><site xmlns="com-wordpress:feed-additions:1">156814061</site>	<item>
		<title>Creating an Amazon EC2 AMI for Opscode Chef 0.8 Client and Server</title>
		<link>https://www.ibd.com/howto/creating-an-amazon-ami-for-chef-0-8/</link>
					<comments>https://www.ibd.com/howto/creating-an-amazon-ami-for-chef-0-8/#comments</comments>
		
		<dc:creator><![CDATA[Robert J Berger]]></dc:creator>
		<pubDate>Tue, 12 Jan 2010 09:00:21 +0000</pubDate>
				<category><![CDATA[HowTo]]></category>
		<category><![CDATA[Opscode Chef]]></category>
		<category><![CDATA[Runa]]></category>
		<category><![CDATA[Scalable Deployment]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[AWS]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[EC2]]></category>
		<category><![CDATA[Git]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[ubuntu]]></category>
		<guid isPermaLink="false">http://blog2.ibd.com/?p=333</guid>

					<description><![CDATA[<p>Changes Since Original 1/13/10: Fixed various minor inaccuracies and improved the description of how to set up the chef-server. Also removed nanite as a requirement (it&#8217;s no longer used) 1/17/10: Added the requirement to build and install mixlib-authentication for the chef-client 1/21/10: Added a mkdir for /var/log/chef 1/22/10: Added a step to ensure that /tmp permissions are set Introduction Here&#8217;s my experience&#8230;</p>
<p>The post <a href="https://www.ibd.com/howto/creating-an-amazon-ami-for-chef-0-8/">Creating an Amazon EC2 AMI for Opscode Chef 0.8 Client and Server</a> first appeared on <a href="https://www.ibd.com">Cognizant Transmutation</a>.</p>]]></description>
										<content:encoded><![CDATA[<h2>Changes Since Original</h2>
<ul>
<li>1/13/10: Fixed various minor inaccuracies and improved the description of how to set up the chef-server. Also removed nanite as a requirement (it&#8217;s no longer used)</li>
<li>1/17/10: Added the requirement to build and install mixlib-authentication for the chef-client</li>
<li>1/21/10: Added a mkdir for /var/log/chef</li>
<li>1/22/10: Added a step to ensure that /tmp permissions are set</li>
</ul>
<h2>Introduction</h2>
<p>Here&#8217;s my experience setting up an Amazon EC2 AMI and Instance for a Chef Server and Client. It is based mostly on <a href="http://loftninjas.org/" target="_blank">Bryan Mclellan (btm)</a>&#8217;s post of Nov 24, 2009, <a href="http://blog.loftninjas.org/2009/11/24/installing-chef-08-alpha-on-ubuntu-karmic/" target="_blank">Installing Chef 0.8 alpha on Ubuntu Karmic</a>, and his more up-to-date <a href="http://gist.github.com/242523" target="_blank">GIST: chef 0.8 alpha installation</a>. It has a slightly different focus and is a bit stale if you are building your own 0.8 gems from the <a href="http://github.com/opscode/chef" target="_blank">source</a>.</p>
<h2>Instantiate an Amazon EC2 Instance</h2>
<p>We&#8217;ll start with the Canonical Ubuntu 9.10 Karmic AMI. I always go to <a href="http://alestic.com/" target="_blank">Eric Hammond&#8217;s site alestic.com</a> to get the pointers to the right AMIs. In this case we&#8217;re using the 32-bit image for the US-West region, ami-7d3c6d38 (the US-East 32-bit image is ami-1515f67c). You can also use the 64-bit images: ami-7b3c6d3e for US-West or ami-ab15f6c2 for US-East.</p>
<p>Start the instance from your local dev machine using the command line ec2-api-tools (available as a package or directly from <a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=351" target="_blank">Amazon</a>) or using something like the Firefox extension <a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=609" target="_blank">Elasticfox</a>, and then ssh into the instance so that you can do the following steps on the instance. For the sake of this example, let&#8217;s say that the Public DNS name for the instance you started is ec2-204-222-170-10.us-west-1.compute.amazonaws.com and the ssh keypair you associated with this new instance is on your local dev machine in ~/.ssh/gsg-keypair.</p>
<h2>Prerequisite preparation</h2>
<p>The first set of steps needs to be done on the instance you just created, so log in via ssh:</p>
<pre>ssh -i ~/.ssh/gsg-keypair ec2-204-222-170-10.us-west-1.compute.amazonaws.com</pre>
<h3>If on Amazon us-west</h3>
<p>There is a bug in the current us-west Canonical AMI where it does not use the us-west apt server, so you have to correct the apt sources.list:</p>
<pre><code>sudo sed -i.bak '1,$s/us.ec2.archive.ubuntu.com/us-west-1.ec2.archive.ubuntu.com/' \
/etc/apt/sources.list</code></pre>
<h3>For all cases</h3>
<pre><code>sudo sed -i.bak2 '1,$s/universe/universe multiverse/' /etc/apt/sources.list
sudo apt-get -y update
sudo apt-get -y upgrade
sudo apt-get -y install emacs23 # Of course this is the first package to install!</code></pre>
<pre><code># Will need these to manipulate ec2 images
sudo apt-get -y install ec2-api-tools ec2-ami-tools </code></pre>
<h3>Set up the ruby environment and install rubygems</h3>
<h4>Install Ruby and needed packages</h4>
<pre><code>sudo apt-get -y install ruby ruby1.8-dev libopenssl-ruby1.8 rdoc ri irb \
build-essential wget ssl-cert git-core rake librspec-ruby libxml-ruby \
thin couchdb zlib1g-dev libxml2-dev</code></pre>
<h4>Install Rubygems</h4>
<p>Rubygems will be installed from source since Debian/Ubuntu try to control rubygems upgrades. If you don&#8217;t care, you can install it via <code>apt-get install rubygems</code> instead.</p>
<pre><code>cd /tmp
wget http://rubyforge.org/frs/download.php/60718/rubygems-1.3.5.tgz
tar zxf rubygems-1.3.5.tgz
cd rubygems-1.3.5
sudo ruby setup.rb
sudo ln -sfv /usr/bin/gem1.8 /usr/bin/gem
sudo gem sources -a http://gems.opscode.com
sudo gem sources -a http://gemcutter.org</code></pre>
<h4>Install Prerequisite Gems</h4>
<pre><code>sudo gem install cucumber merb-core jeweler uuidtools \
json libxml-ruby --no-ri --no-rdoc</code></pre>
<h3>Building and Installing Chef Related Gems</h3>
<p>Until there are final 0.8.x Chef gems, you will have to build them on your local machine and upload them to this instance. On your dev machine (this example builds things in ~/src, but it could be anywhere appropriate) follow these instructions to build all the gems and install the ones you need on your local machine. You will use your local dev machine to develop and manage cookbooks and to manage a remote chef-server:</p>
<pre><code>mkdir ~/src
cd ~/src
git clone git://github.com/opscode/chef.git
git clone git://github.com/opscode/ohai.git
git clone git://github.com/opscode/mixlib-log
git clone git://github.com/opscode/mixlib-authentication.git
# Need to get mixlib-log for client &amp; server and
# mixlib-authentication for the client from git till the 1.1.0 update hits
# See http://tickets.opscode.com/browse/CHEF-823
cd mixlib-log
sudo rake install
cd ../mixlib-authentication
sudo rake install
cd ../ohai
sudo rake install
cd ../chef
rake gem
# Now cd into ~/src/chef/chef to install the chef client/dev gem on your local machine
cd chef
rake install </code></pre>
<p>Upload the gems needed for the client to your instance. From ~/src on your local dev machine do:</p>
<pre>scp -i ~/.ssh/gsg-keypair chef/chef/pkg/chef-0.8.0.gem  ohai/pkg/ohai-0.3.7.gem \
mixlib-authentication/pkg/mixlib-authentication-1.1.0.gem \
mixlib-log/pkg/mixlib-log-1.1.0.gem  ec2-204-222-170-10.us-west-1.compute.amazonaws.com:</pre>
<h2>Set up the Chef Client on the new Instance</h2>
<p>Now back in your home directory on the instance ec2-204-222-170-10.us-west-1.compute.amazonaws.com install the gems you just copied over:</p>
<pre><code>sudo gem install mixlib-log-1.1.0.gem mixlib-authentication-1.1.0.gem ohai-0.3.7.gem
sudo gem install chef-0.8.0.gem </code></pre>
<h3>Create the client config file</h3>
<pre><code>sudo mkdir /var/log/chef
sudo mkdir /etc/chef
sudo chown root:root /etc/chef
sudo chmod 755 /etc/chef
</code></pre>
<p>Put the following in /etc/chef/client.rb:</p>
<pre><code># Chef Client Config File

require 'ohai'
require 'json'

o = Ohai::System.new
o.all_plugins
chef_config = JSON.parse(o[:ec2][:userdata])
if chef_config.kind_of?(Array)
  chef_config = chef_config[o[:ec2][:ami_launch_index]]
end

log_level        :info
log_location     "/var/log/chef/client.log"
chef_server_url  chef_config["chef_server"]
registration_url chef_config["chef_server"]
openid_url       chef_config["chef_server"]
template_url     chef_config["chef_server"]
remotefile_url   chef_config["chef_server"]
search_url       chef_config["chef_server"]
role_url         chef_config["chef_server"]
client_url       chef_config["chef_server"]

node_name        o[:ec2][:instance_id]

unless File.exists?("/etc/chef/client.pem")
  File.open("/etc/chef/validation.pem", "w") do |f|
    f.print(chef_config["validation_key"])
  end
end

if chef_config.has_key?("attributes")
  File.open("/etc/chef/client-config.json", "w") do |f|
    f.print(JSON.pretty_generate(chef_config["attributes"]))
  end
  json_attribs "/etc/chef/client-config.json"
end

validation_key "/etc/chef/validation.pem"
validation_client_name chef_config["validation_client_name"]

Mixlib::Log::Formatter.show_time = true</code></pre>
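<p>The config file above expects the EC2 user data to be JSON. As a rough illustration only, here is a Ruby sketch that builds user data of the shape the config reads and then selects the entry for one instance the same way the config does. The key names (chef_server, validation_key, validation_client_name, attributes) come from the config file; the server URL, client name, key text, run list and launch index are placeholder assumptions, not values from a real setup.</p>
<pre><code>require 'json'

# Placeholder values for illustration only; substitute your own.
node_config = {}
node_config["chef_server"] = "http://ec2-204-203-51-20.us-west-1.compute.amazonaws.com:4000"
node_config["validation_client_name"] = "chef-validator"  # assumed name
node_config["validation_key"] = "PLACEHOLDER: contents of the validation .pem file"
node_config["attributes"] = {}
node_config["attributes"]["run_list"] = ["role[base]"]     # hypothetical role

# An array lets one launch request carry a different entry per instance.
user_data = JSON.generate([node_config])

# Mirror of the selection logic in client.rb. On a real instance the index
# comes from Ohai (o[:ec2][:ami_launch_index]); here it is hard-coded.
launch_index = 0
chef_config = JSON.parse(user_data)
if chef_config.kind_of?(Array)
  chef_config = chef_config[launch_index]
end

puts chef_config["chef_server"]
puts chef_config.has_key?("attributes")</code></pre>
<p>Saving the generated string to a file and passing that file as the user data when launching an instance is one way to feed it to the config; a single JSON object (not wrapped in an array) is also accepted by the config as written.</p>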
<h4>Set up the /etc/init.d/chef-client</h4>
<p>Copy the example init.d script. (You can also use runit instead, but we&#8217;re not going to describe that here.)</p>
<pre><code>sudo cp /usr/lib/ruby/gems/1.8/gems/chef-0.8.0/distro/debian/etc/init.d/chef-client /etc/init.d
cd /etc/init.d
sudo update-rc.d chef-client defaults</code></pre>
<h4>Create an init script to set /tmp to the proper permissions</h4>
<p>It looks like the Canonical images will not have /tmp with proper permissions if you exclude /tmp from your bundle process. Eric Hammond <a href="https://developer.amazonwebservices.com/connect/message.jspa?messageID=160098" target="_blank">recommends</a> doing the following.</p>
<p>Create a file /etc/init.d/ec2-mkdir-tmp with the following contents:</p>
<pre>#!/bin/sh
#
# ec2-mkdir-tmp Create /tmp if missing (as it's nice to bundle without it).
#
mkdir -p    /tmp
chmod 01777 /tmp</pre>
<p>Then set up the /etc/rc dirs to launch this on boot:</p>
<pre>sudo chmod a+x /etc/init.d/ec2-mkdir-tmp
sudo ln -s /etc/init.d/ec2-mkdir-tmp /etc/rcS.d/S36ec2-mkdir-tmp</pre>
<h2>Build the EC2 Image</h2>
<p>The always amazingly helpful <a href="http://www.anvilon.com/" target="_blank">Eric Hammond</a> has a post, <a href="http://alestic.com/2009/06/ec2-ami-bundle" target="_blank">Creating a New Image for EC2 by Rebundling a Running Instance</a>, that describes the basics of how to do this. The following is pretty much a direct synopsis with minimal explanation. See his blog post for more details.</p>
<h3>Clean up potential security holes</h3>
<p>Remove stuff you don&#8217;t want to freeze into your image.</p>
<pre><code>sudo rm -f /root/.*hist* $HOME/.*hist*
sudo rm -f /var/log/*.gz</code></pre>
<h3>Copy AWS Certs to Instance</h3>
<p>Back on your local development system, copy your Amazon certificates to the instance.</p>
<pre><code>
remotehost=&lt;ec2-instance-hostname&gt;
remoteuser=ubuntu
scp -i &lt;private-ssh-key&gt; \
  &lt;path-to-certs&gt;/{cert,pk}-*.pem \
  $remoteuser@$remotehost:/tmp
</code></pre>
<h3>Create the new Image on the Instance</h3>
<p>Back on the ec2 instance, you&#8217;ll do the following to create the image.</p>
<h4>Define where to store the image on S3</h4>
<p>This assumes you have an S3 account set up on AWS. You don&#8217;t have to have already created the bucket. Set some bash variables that will be used by the commands that follow. You should set the prefix to something meaningful. Below is what I used as an example; you&#8217;ll want to make it unique to your environment. The bucket name must be globally unique across all of Amazon S3.</p>
<pre><code>bucket=runa-west-amis
prefix=runa-ubuntu-9.10-i386-20100101-base</code></pre>
<h4>Define your AWS credentials and target processor</h4>
<pre><code>export AWS_USER_ID=&lt;your-value&gt;
export AWS_ACCESS_KEY_ID=&lt;your-value&gt;
export AWS_SECRET_ACCESS_KEY=&lt;your-value&gt;

if [ $(uname -m) = 'x86_64' ]; then
  arch=x86_64
else
  arch=i386
fi
</code></pre>
<h4>Bundle the files</h4>
<p>This also runs on the current instance. It bundles everything on the instance file system, except for the dirs specified with the -e flag, into a copy of the image under /mnt:</p>
<pre><code>sudo -E ec2-bundle-vol           \
  -r $arch                       \
  -d /mnt                        \
  -p $prefix                     \
  -u $AWS_USER_ID                \
  -k /tmp/pk-*.pem               \
  -c /tmp/cert-*.pem             \
  -s 10240                       \
  -e /mnt,/tmp,/root/.ssh,/home/ubuntu/.ssh
</code></pre>
<h5>If you are deploying to the US-West-1 AWS Region</h5>
<p>It looks like the Amazon EC2 AMI tools are not fully aware of us-west yet, so you have to do this extra step right now. You&#8217;ll have to change the <code>--kernel</code> and <code>--ramdisk</code> values to the ones appropriate for your kernel. You can inspect the values used for the AMI you used to boot the original instance, either with ElasticFox or with the following command (specify the AMI you want to check and the region it&#8217;s in):</p>
<pre>ec2-describe-images ami-7d3c6d38   -C /tmp/cert-*.pem -K /tmp/pk-*.pem --region us-west-1</pre>
<p>Then execute the following command, specifying the right kernel and ramdisk:</p>
<pre><code>sudo -E ec2-migrate-manifest        \
  -c /tmp/cert-*.pem             \
  -k /tmp/pk-*.pem               \
  -m /mnt/$prefix.manifest.xml   \
  --access-key $AWS_ACCESS_KEY_ID  \
  --secret-key $AWS_SECRET_ACCESS_KEY \
  --kernel aki-773c6d32          \
  --ramdisk ari-713c6d34         \
  --region us-west-1</code></pre>
<h4>Upload the bundle to a bucket on S3:</h4>
<pre><code>sudo -E ec2-upload-bundle        \
    -b $bucket                   \
    -m /mnt/$prefix.manifest.xml \
    -a $AWS_ACCESS_KEY_ID        \
    -s $AWS_SECRET_ACCESS_KEY    \
    --location us-west-1
</code></pre>
<p>You may be prompted with something like:</p>
<pre><code>You are bundling in one region, but uploading to another. If the kernel or ramdisk associated with this AMI are not in the target region, AMI registration will fail.
You can use the ec2-migrate-manifest tool to update your manifest file with a kernel and ramdisk that exist in the target region.
Are you sure you want to continue? [y/N]
</code></pre>
<p>Enter y and press Return to continue.</p>
<h4>Register the AMI</h4>
<p>Back on your local development machine (with bucket and prefix set to the same values you used on the instance):</p>
<pre><code>ec2-register $bucket/$prefix.manifest.xml --region us-west-1</code></pre>
<p>The output of this will be the ami-id of your new image. You can use this ID to launch instances of your new AMI.</p>
<p>You now have a private AMI you can start just like any other image. If you want to make it public, run the following with the ami-id you just received:</p>
<pre><code>ec2-modify-image-attribute &lt;your-ami-id&gt; -l -a all </code></pre>
<h2>Using the new AMI Image</h2>
<p>You can now use this image as the basis for chef clients and also as the basis to create a Chef Server. Use the Amazon EC2 tools, ElasticFox or whatever your favorite tool for managing EC2 instances is to make a new instance, first to create a Chef Server. After that you can create clients and have them load their roles and recipes from the chef server. Once you have a Chef Server, you can use the knife ec2 instance command to create user data that includes a run list, credentials and other JSON that can be passed to the general EC2 tools to build specific instances.</p>
<h3>Creating a Chef Server from your new Image</h3>
<p>Using an EC2 tool like ec2-tools or ElasticFox, create a new instance based on the AMI created earlier. You should use at least a c1.medium, as the m1.small is just too painfully wimpy to use. Assume the new instance has the Public DNS name: <code>ec2-204-203-51-20.us-west-1.compute.amazonaws.com</code><br />
From the ~/src directory in your local dev environment, copy the chef server gems to the new instance:</p>
<pre><code>scp -i ~/.ssh/gsg-keypair chef/*/pkg/*.gem \
ec2-204-203-51-20.us-west-1.compute.amazonaws.com:</code></pre>
<p>ssh to the new instance and do the following:</p>
<pre><code>sudo gem install chef-server-0.8.0.gem chef-server-api-0.8.0.gem \
chef-server-webui-0.8.0.gem chef-solr-0.8.0.gem</code></pre>
<h4>Set things up to bootstrap the server using chef-solo</h4>
<p>We&#8217;ll be using the last part of BTM&#8217;s GIST, and danielsdeleo (Dan DeLeo)&#8217;s <a href="http://github.com/danielsdeleo/cookbooks/tree/08boot/bootstrap" target="_blank">bootstrap cookbook</a> and chef-solo to set up this initial server.</p>
<pre><code>mkdir -p /tmp/chef-solo
cd /tmp/chef-solo
git clone git://github.com/danielsdeleo/cookbooks.git
cd cookbooks
git checkout 08boot
</code></pre>
<p>Create ~/chef.json:</p>
<pre><code>{
  "bootstrap": {
    "chef": {
      "url_type": "http",
      "init_style": "runit",
      "path": "/srv/chef",
      "serve_path": "/srv/chef",
      "server_fqdn": "localhost"
    }
  },
  "recipes": "bootstrap::server"
}
</code></pre>
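<p>If you would rather generate this file than type it by hand, the following is a minimal Ruby sketch that writes the same attributes out as JSON and reads them back to confirm the file parses. JSON has no comment syntax, so the generated file contains only the JSON object. The output location (the system temp directory) is an assumption made so the sketch can run anywhere; on the server you would write to ~/chef.json instead.</p>
<pre><code>require 'json'
require 'tmpdir'

# Same attributes as the chef.json shown above.
chef = {}
chef["url_type"] = "http"
chef["init_style"] = "runit"
chef["path"] = "/srv/chef"
chef["serve_path"] = "/srv/chef"
chef["server_fqdn"] = "localhost"

config = {}
config["bootstrap"] = {}
config["bootstrap"]["chef"] = chef
config["recipes"] = "bootstrap::server"

# Assumption: written to the temp dir for this sketch; on the server you
# would write to ~/chef.json instead.
path = File.join(Dir.tmpdir, "chef.json")
File.open(path, "w") do |f|
  f.puts(JSON.pretty_generate(config))
end

# Reading it back confirms the file parses as JSON.
reloaded = JSON.parse(File.read(path))
puts reloaded["recipes"]</code></pre>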
<p>Create ~/solo.rb with the following content:</p>
<pre><code>file_cache_path "/tmp/chef-solo"
cookbook_path "/tmp/chef-solo/cookbooks"
# End of ~/solo.rb file
</code></pre>
<p>Run chef-solo, which will execute the chef bootstrap recipes using the bootstrap params in ~/chef.json to actually set up and configure this chef server.</p>
<p>If you installed rubygems with the Ubuntu apt package, you may have to specify the path:</p>
<pre><code>/var/lib/gems/1.8/bin/</code></pre>
<p>instead of:</p>
<pre><code>/usr/bin</code></pre>
<p>for the knife and various chef commands in the following code.</p>
<pre><code>/usr/bin/chef-solo -j ~/chef.json -c ~/solo.rb -l debug</code></pre>
<p>You will see a lot of Debug statements go by and it will take several minutes to complete. It should complete with something like:</p>
<pre><code>[Thu, 14 Jan 2010 00:19:38 +0000] INFO: Chef Run complete in 38.59808 seconds
[Thu, 14 Jan 2010 00:19:38 +0000] DEBUG: Exiting</code></pre>
<h5>Set up basic cookbooks</h5>
<p>The following will install the standard cookbooks on the chef server:</p>
<pre><code>cd
git clone git://github.com/opscode/chef-repo.git
cd chef-repo
rm cookbooks/README
git clone git://github.com/opscode/cookbooks.git
</code></pre>
<p>Now upload the standard cookbooks using the credentials set up by the bootstrap process (user chef-webui):</p>
<pre><code>knife cookbook upload --all -u chef-webui \
-k /etc/chef/webui.pem -o cookbooks
</code></pre>
<h5>Start up the Chef Server Web UI</h5>
<p>Due to a bug (http://tickets.opscode.com/browse/CHEF-839) you have to run this twice; the first run will create the admin user:</p>
<pre><code>sudo /usr/bin/chef-server-webui -p 4002</code></pre>
<p>But the first time will abort with an error message like:</p>
<pre><code>Loading init file from /usr/lib/ruby/gems/1.8/gems/chef-server-0.8.0/config/init-webui.rb
Loading /usr/lib/ruby/gems/1.8/gems/chef-server-0.8.0/config/environments/development.rb
~ Loaded slice 'ChefServerWebui' ...
WARN: HTTP Request Returned 404 Not Found: Cannot load user admin
~ Compiling routes...
~ Could not find resource model Node
~ Could not find resource model Client
~ Could not find resource model Role
~ Could not find resource model Search
~ Could not find resource model Cookbook
~ Could not find resource model Client
~ Could not find resource model Databag
~ Could not find resource model DatabagItem
/usr/lib/ruby/gems/1.8/gems/chef-server-0.8.0/config/init-webui.rb:32: uninitialized constant OpenID (NameError)
from /usr/lib/ruby/gems/1.8/gems/merb-core-1.0.15/lib/merb-core/bootloader.rb:1258:in `call'
from /usr/lib/ruby/gems/1.8/gems/merb-core-1.0.15/lib/merb-core/bootloader.rb:1258:in `run'
from /usr/lib/ruby/gems/1.8/gems/merb-core-1.0.15/lib/merb-core/bootloader.rb:1258:in `each'
from /usr/lib/ruby/gems/1.8/gems/merb-core-1.0.15/lib/merb-core/bootloader.rb:1258:in `run'
from /usr/lib/ruby/gems/1.8/gems/merb-core-1.0.15/lib/merb-core/bootloader.rb:99:in `run'
from /usr/lib/ruby/gems/1.8/gems/merb-core-1.0.15/lib/merb-core/server.rb:172:in `bootup'
from /usr/lib/ruby/gems/1.8/gems/merb-core-1.0.15/lib/merb-core/server.rb:42:in `start'
from /usr/lib/ruby/gems/1.8/gems/merb-core-1.0.15/lib/merb-core.rb:173:in `start'
from /usr/lib/ruby/gems/1.8/gems/chef-server-0.8.0/bin/chef-server-webui:76
from /usr/bin/chef-server-webui:19:in `load'
from /usr/bin/chef-server-webui:19</code></pre>
<p>Then run it again to actually start the WebUI and have it run in the background. You might want to start it in <a href="http://www.gnu.org/software/screen/" target="_blank">screen</a> for now or possibly redirect its output to a log file. The following example sends the output of the command to a log file. You&#8217;ll want to check that log file after starting to make sure there were no errors.</p>
<pre><code>sudo sh -c '/usr/bin/chef-server-webui -p 4002 &gt; /var/log/chef-server-webui.log' &amp;</code></pre>
<p>If you look at the output of a ps, you&#8217;ll see the shell command above, but the real work is being done by a merb instance with the port you specified (4002):</p>
<pre><code>#ps ax | grep webui
5533 pts/0    S      0:00 sh -c /usr/bin/chef-server-webui -p 4002 &gt; /var/log/chef-server-webui.log
#ps ax | grep merb
3694 ?        Sl     0:55 merb : worker (port 4000)
5534 pts/0    Sl     0:07 merb : worker (port 4002)</code></pre>
<p>The first merb worker is the chef-server itself, the second is the WebUI server.</p>
<h5>Accessing the Chef Web UI</h5>
<p>You can access the Chef Web UI web server using a web browser at the IP address / Public DNS name of the server that was just set up. In this example the Public DNS name is:</p>
<pre><code>ec2-204-203-51-20.us-west-1.compute.amazonaws.com</code></pre>
<p>Assuming that you set up this instance to allow access to port 4002 from the IP address of your local dev machine, you should be able to access the Web UI at:</p>
<pre><code>http://ec2-204-203-51-20.us-west-1.compute.amazonaws.com:4002</code></pre>
<p>You can allow access to port 4002 from specific IP address ranges by updating your <a href="http://docs.amazonwebservices.com/AWSEC2/2007-08-29/DeveloperGuide/distributed-firewall-concepts.html" target="_blank">security group</a>. You can do that with ElasticFox (easy) or via the <a href="http://docs.amazonwebservices.com/AWSEC2/2007-08-29/DeveloperGuide/distributed-firewall-examples.html" target="_blank">command line tools</a> (a pain for a one-off). Eventually you (or hopefully Opscode) will set up an Apache or nginx reverse proxy, Passenger or equivalent to allow normal port 80 / 443 http/https access.</p>
<h2>Conclusion</h2>
<p>You should now be able to use knife in your local dev environment to develop cookbooks, upload roles and cookbooks to your new Chef Server, and spin up new chef cookbook driven instances. You should use the knife documentation from the Opscode main wiki <a href="http://wiki.opscode.com/display/chef/Knife" target="_blank">Knife Page</a>, <strong>NOT</strong> the docs in the Alpha Forums / Getting Started With Opscode / <a href="http://opscode.zendesk.com/forums/58858/entries/53988" target="_blank">Knife &#8211; Commandline API</a>, as the latter is actually more obsolete in terms of the version that you built from the opscode git repository. There is also a man page, and <code>knife --help</code> gives you pretty much the same correct info as the wiki.</p>
<p>I hope to have a follow-up post on how to do this in more detail.</p>
<p>Feel free to leave comments if you find problems or have questions.</p>]]></content:encoded>
					
					<wfw:commentRss>https://www.ibd.com/howto/creating-an-amazon-ami-for-chef-0-8/feed/</wfw:commentRss>
			<slash:comments>7</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">333</post-id>	</item>
		<item>
		<title>Building Opscode Chef 0.8.x from HEAD of the git repo</title>
		<link>https://www.ibd.com/howto/using-opscode-chef-0-8-x-alpha-from-head-of-the-git-repo/</link>
		
		<dc:creator><![CDATA[Robert J Berger]]></dc:creator>
		<pubDate>Wed, 23 Dec 2009 02:55:44 +0000</pubDate>
				<category><![CDATA[HowTo]]></category>
		<category><![CDATA[Opscode Chef]]></category>
		<category><![CDATA[Runa]]></category>
		<category><![CDATA[Scalable Deployment]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[Bleeding Edge]]></category>
		<category><![CDATA[Iaas]]></category>
		<category><![CDATA[Infrastructure]]></category>
		<category><![CDATA[Saas]]></category>
		<guid isPermaLink="false">http://blog2.ibd.com/?p=324</guid>

					<description><![CDATA[<p>Update: I am having problems using the chef dev tools/client from the HEAD of the chef git repository with the Opscode Alpha Server service. I&#8217;m not sure if it&#8217;s me or if the latest versions of the chef client from HEAD are compatible with the Alpha Server Service. So the following is still useful for understanding how to build from&#8230;</p>
<p>The post <a href="https://www.ibd.com/howto/using-opscode-chef-0-8-x-alpha-from-head-of-the-git-repo/">Building Opscode Chef 0.8.x from HEAD of the git repo</a> first appeared on <a href="https://www.ibd.com">Cognizant Transmutation</a>.</p>]]></description>
										<content:encoded><![CDATA[<h2>Update</h2>
<p>I am having problems using the chef dev tools/client from the HEAD of the chef git repository with the Opscode Alpha Server service. I&#8217;m not sure if it&#8217;s me or if the latest versions of the chef client from HEAD are compatible with the Alpha Server Service. So the following is still useful for understanding how to build from HEAD, but it will not work with the Opscode Alpha SaaS server. It will work with the server you build from HEAD. See the next article, <a href="http://blog2.ibd.com/scalable-deployment/creating-an-amazon-ami-for-chef-0-8/" target="_self">Creating an Amazon EC2 AMI for Opscode Chef 0.8</a>, for info on creating a Chef client and server on EC2.</p>
<h2>Introduction</h2>
<p><a href="http://www.opscode.com">Opscode</a> is introducing a pretty major set of changes in <a href="http://www.opscode.com/chef/">Chef</a> in the <a href="http://github.com/opscode/chef">0.8 release</a>. It&#8217;s a major step forward and changes how one interacts with Chef in significant ways (it also has some major bug fixes that alone make it worth the move). The <a href="http://www.opscode.com/blog/2009/10/07/preview-chef-0-8-and-the-opscode-platform/">Opscode Alpha Program introduces</a> a new service where Opscode runs the actual Chef Server as a service.</p>
<p>This post will describe setting up your User/Dev environment by building your own Chef Client / Dev Gems from the latest HEAD of the Chef repo from Github. It assumes that you did sign up for the Alpha program and have access to the Opscode Alpha Server, though much of it would be the same if you were running your own chef server also built from the latest source from github. This post does not show how to actually use Chef and the chef-client on a target node. I hope to have a post on that in the next few days.</p>
<p>The documentation on how to move to and use Chef 0.8 is still very sparse, so I figured I would jot down some of the things we are learning as we apply this to our infrastructure at <a href="http://www.runa.com">Runa</a>. If any of you OpsChefs out there see something wrong or something I left out, let me know in the comments or via email.</p>
<h2>The Opscode Chef Alpha Environment</h2>
<p>If you are in the Opscode Alpha program, you would have been given login[s] and some pem keys. I won&#8217;t go into the details of this since they do have pretty good docs on setting this up (if you have an alpha login you can get them at <a href="http://opscode.zendesk.com/forums/58858/entries/49336">http://opscode.zendesk.com/forums/58858/entries/49336</a>). It&#8217;s probably a good idea to follow these and start with their 0.8.0 gem to make sure you are talking with the Alpha Server before trying to use the Chef Git Repository to build your own gems.</p>
<p>The Alpha instructions use a Chef gem that is frozen at 0.8.0. But the Chef folks have already progressed much further than the Oct 29th release of 0.8.0 in the Chef Git Repository.</p>
<p>The HEAD of the Git Repository has many changes since 0.8.0. Some big ones include:</p>
<ul>
<li>The Knife sub commands are completely different</li>
<li>There is now a Chef Shell (A REPL like irb but for the chef client)</li>
<li>Lots of Bug Fixes</li>
</ul>
<p>And if we&#8217;re going to be on the bleeding edge, we might as well go all the way! So the rest of this blog will be about using the Chef HEAD branch from the Chef git repository. We&#8217;ll still use the Alpha Chef Server at least to start with.</p>
<h2>Configuring your Dev Environment</h2>
<h3>Prerequisites</h3>
<p>I&#8217;m using Mac OS X 10.6 (Snow Leopard). Our target environments are Ubuntu Linux on Amazon EC2. But assuming you have *nix, Ruby and RubyGems set up in your environment, it should generally be the same (I don&#8217;t know about people stuck in the Legacy Windows environment though).</p>
<p>So you will need to have installed and know how to use:</p>
<ul>
<li>Ruby</li>
<li>RubyGems</li>
<li>Git</li>
</ul>
<p>And the following Ruby Gems should be installed (I think this is the minimum you need; these will pull in their own dependencies):</p>
<ul>
<li>rake</li>
<li>rspec</li>
<li>cucumber</li>
<li>uuidtools</li>
<li>nanite</li>
<li>gemcutter</li>
<li>jeweler</li>
</ul>
<p>You will need http://gems.opscode.com as a gem source for the following. You can use the command:</p>
<pre><code>sudo gem sources -a http://gems.opscode.com</code></pre>
<ul>
<li>mixlib-authentication</li>
</ul>
<h3>Getting and building the code/GEMs for the Dev Environment</h3>
<p>The instructions that are in the README.rdoc of the Chef Git Repository are out of date as of now (Dec 20, 2009). The instructions on the wiki, <a href="http://wiki.opscode.com/display/chef/Installing+Chef+from+HEAD">Installing Chef from HEAD</a>, are more accurate. Even though it seems like one can use the mixlib gems from the gem repository, and the gems have the same version number, I found that I needed to install the mixlib libraries from source.</p>
<h4>Getting and building Ohai &amp; Mixlib Gems from Github</h4>
<p>We won&#8217;t be making any changes in these, so we&#8217;ll just git clone and build them:</p>
<pre><code>cd <em>to where you want to keep your local repositories</em>
git clone git://github.com/opscode/ohai.git
cd ohai
sudo rake install
cd ..
git clone git://github.com/opscode/mixlib-config.git
cd mixlib-config
sudo rake install
cd ..
git clone git://github.com/opscode/mixlib-log.git
cd mixlib-log
sudo rake install
cd ..
git clone git://github.com/opscode/mixlib-cli.git
cd mixlib-cli
sudo rake install
cd ..
</code></pre>
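<p>The same clone and build steps can also be written as a loop. This is only a sketch, assuming each repository builds with <code>sudo rake install</code> from its top-level directory. The names <code>repo_url</code> and <code>build_all</code> are made up for this example.</p>

```shell
# Repositories cloned in the steps above.
REPOS="ohai mixlib-config mixlib-log mixlib-cli"

# Builds the clone URL for one opscode repository (hypothetical helper).
repo_url() {
  echo "git://github.com/opscode/$1.git"
}

# Clones each repository into the current directory and installs its gem.
# Stops at the first failure. The subshell keeps the cd from leaking out.
build_all() {
  for repo in $REPOS; do
    git clone "$(repo_url "$repo")" || return 1
    (cd "$repo" && sudo rake install) || return 1
  done
}

# Usage: cd to where you keep your local repositories, then run: build_all
```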
<h4>Getting the Chef code from github</h4>
<p>You can get the <a href="http://github.com/opscode/chef">Chef repository from github</a>. The readme there has most of the info you need.</p>
<p>If you plan to submit any patches or other changes back to Opscode, or you would like to have your own repository of this, you can fork the Opscode repository into your own Github account. This is what I did and will demonstrate below. If you don&#8217;t want any hardcore forking action, you can just git clone the opscode repository as shown here (assuming your current working directory is where you want the local directory repository placed. It will be named using the default &#8220;chef&#8221;):</p>
<pre><code>git clone git://github.com/opscode/chef.git</code></pre>
<p>If you have forked into your own github account (mine is rberger), you would git clone using the &#8220;Your Clone URL&#8221;:</p>
<pre><code>git clone git@github.com:rberger/chef.git rberger-chef</code></pre>
<p>This assumes you want your local directory name for the repository to be rberger-chef, just so you can distinguish it from the official opscode one. (I will refer to the top of the local repository as rberger-chef from now on).</p>
<h4>What&#8217;s in the Chef Git Repository</h4>
<p>Change directory into the local repository and do an ls. You&#8217;ll see that there are several components here.</p>
<pre><code>
$ cd rberger-chef
$ ls
CHANGELOG         README.rdoc       chef-server       chef-solr         scripts
LICENSE           Rakefile          chef-server-api   cucumber.yml
NOTICE            chef              chef-server-webui features
</code></pre>
<p>There are 2 main trees:</p>
<ul>
<li><strong>chef</strong>: chef-client and dev gem</li>
<li><strong>chef-server</strong>: Chef Server gem. Used only if you build your own server
<ul>
<li><strong>chef-server-api</strong>: Implements the REST interface sub-system as part of the full Chef Server</li>
<li><strong>chef-server-webui</strong>: Implements the WebUI as part of the full Chef Server</li>
<li><strong>chef-solr</strong>: Implements the Solr Search sub-system as part of the full Chef Server</li>
<li><strong>features</strong>: Not 100% sure what all it&#8217;s used for, definitely for the cucumber tests. But it is part of the Server as far as I can tell</li>
</ul>
</li>
</ul>
<p>For now we are only interested in the chef tree. That will be used to set up the local dev environment. We&#8217;re not going to follow the outdated instructions that are in the README.rdoc in the root of the chef repository, which assume you are setting up the whole stack on the Dev machine. We&#8217;re going to just install the chef client and tools from the chef sub-tree on the dev machine.</p>
<p>This post will not describe how to build/use the chef-server, though you can pretty much build everything by running</p>
<pre><code>sudo rake install</code></pre>
<p>from the top of the distro. There are more gem dependencies that need to be installed before you can build the chef-server trees.</p>
<h4>Building and Installing the Chef Client / Dev tools</h4>
<p>Change directory to the chef subdirectory so you should be in rberger-chef/chef (or if you have a direct clone of the opscode chef repository: chef/chef)</p>
<pre><code>cd chef</code></pre>
<h4 style="text-decoration: line-through;">Some minor tweaks to the Source</h4>
<p>(shef is now included in the executables in the latest repository and setting my own sub-version number was lame)</p>
<p style="text-decoration: line-through;">I have done a few mods to the source. Mainly to set the version number to something that will not conflict with the official numbering now or when new releases come out and to have shef be installed by the gem.</p>
<ol style="text-decoration: line-through;">
<li>Changed line 30 in the Rakefile to <code>s.executables  = %w( chef-client chef-solo knife shef )</code> so the install puts shef in /usr/bin</li>
<li>Changed line 7 in the Rakefile to <code>CHEF_VERSION = "0.8.0.1"</code></li>
<li>Change line 30 in lib/chef.rb to <code>VERSION = '0.8.0.1'</code></li>
</ol>
<h5>Build and install</h5>
<pre><code>rake install</code></pre>
<p>It&#8217;s going to eventually ask for your sudo password as it needs to use sudo to do the gem install. The run should look something like:</p>
<pre><code>(in /Users/rberger/work/Chef/rberger-chef/chef)
mkdir -p pkg
WARNING:  no rubyforge_project specified
WARNING:  description and summary are identical
  Successfully built RubyGem
  Name: chef
  Version: 0.8.0.1
  File: chef-0.8.0.1.gem
mv chef-0.8.0.1.gem pkg/chef-0.8.0.1.gem
sudo gem install pkg/chef-0.8.0.1 --no-rdoc --no-ri
Password:
Building native extensions.  This could take a while...
Successfully installed eventmachine-0.12.10
Successfully installed amqp-0.6.5
Successfully installed thor-0.12.0
Successfully installed deep_merge-0.1.0
Successfully installed moneta-0.6.0
Successfully installed chef-0.8.0.1
6 gems installed
</code></pre>
<h3>Using Chef with the Opscode Alpha SaaS Server</h3>
<p>This just touches on some of the things that are described in <a href="http://opscode.zendesk.com/forums/58858/entries/49336">The Official Guide to Getting Started With Opscode</a></p>
<h4>Setting up your Dev Environment</h4>
<p>It&#8217;s not clear if you really have to do everything as described in the document if you are building the latest release from the chef repository and using the ~/.chef/knife.rb config described below. For instance, I didn&#8217;t have to set the environment variables for OPSCODE_USER and OPSCODE_KEY since they are now set in the knife.rb, nor did I have to create /etc/chef/client.rb. And even without the global Chef config, I was able to use most of the knife commands. But not all: some, like the ec2 instances data, seemed to need the organization validation key to be in /etc/chef/validation.pem.</p>
<h5>Copy your assigned validation key to /etc/chef</h5>
<p>When you got your Opscode Alpha welcome stuff, you should have gotten your user keys and a key for your organization. Copy your organization key (in our case runa.pem) to /etc/chef/validation.pem. You will probably have to create the /etc/chef directory first.</p>
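<p>As a sketch, the copy step might look like the following. The helper name <code>install_validation_key</code> and its optional second argument are made up for this example. Restricting the key with <code>chmod 600</code> is an extra precaution I am assuming is reasonable for a private key; nothing above requires it.</p>

```shell
# Copies an organization key to <dest_dir>/validation.pem, creating the directory first.
# dest_dir defaults to /etc/chef; writing there needs root.
# (install_validation_key is a hypothetical helper name.)
install_validation_key() {
  key_src=$1
  dest_dir=${2:-/etc/chef}
  mkdir -p "$dest_dir" || return 1
  cp "$key_src" "$dest_dir/validation.pem" || return 1
  # Assumption: keep the private key readable by its owner only.
  chmod 600 "$dest_dir/validation.pem"
}

# Usage (as root): install_validation_key runa.pem
```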
<h5>The User Chef/Knife config</h5>
<p>You must configure a knife config file in your home directory under ~/.chef/knife.rb and have your key that you got from Opscode somewhere pointed to by a line in ~/.chef/knife.rb. The configuration parameters are described on the <a href="http://wiki.opscode.com/display/chef/Knife">Knife Wiki Page</a>. For instance my config file:</p>
<pre><code>log_level        :info
log_location     STDOUT
node_name        'rberger'
client_key       '/Users/rberger/.chef/rberger.pem'
chef_server_url  "https://api.opscode.com/organizations/runa"
cache_type       'BasicFile'
cache_options( :path =&gt; '/Users/rberger/.chef/checksums' )
</code></pre>
<p>Once you have this set up you can now use knife and the chef rake commands. You can test things out by saying something like:</p>
<pre><code>knife client list</code></pre>
<p>Which should return an empty list assuming you haven&#8217;t set up any clients on this server yet.</p>
<p>The first real useful command you want to do is to upload your cookbooks to the Opscode Server:</p>
<pre><code>cd <em>to where your chef cookbook repository is</em>
rake upload_cookbooks</code></pre>
<p>You can also do it with just knife:</p>
<pre><code>knife cookbook upload -a</code></pre>
<p>This may take a while as it will upload all the cookbooks in cookbooks and site-cookbooks in your current repository.</p>
<p>After that you can upload single cookbooks:</p>
<pre><code>knife cookbook upload</code></pre>
<p>Just remember the knife documentation on the Alpha site no longer applies to the knife that you get from building from the HEAD of the chef git repository. Strangely enough, the <a href="http://wiki.opscode.com/display/chef/Knife">knife documentation on the wiki</a> is accurate.</p>
<h2>Conclusion</h2>
<p>Once you&#8217;ve been thru it, it&#8217;s all quite simple. I hope to post some more on using 0.8.0+ soon. See a more recent blog post for building your own Chef Server: <a href="http://blog2.ibd.com/scalable-deployment/creating-an-amazon-ami-for-chef-0-8/">Creating an Amazon EC2 AMI for Opscode Chef 0.8</a></p><p>The post <a href="https://www.ibd.com/howto/using-opscode-chef-0-8-x-alpha-from-head-of-the-git-repo/">Building Opscode Chef 0.8.x from HEAD of the git repo</a> first appeared on <a href="https://www.ibd.com">Cognizant Transmutation</a>.</p>]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">324</post-id>	</item>
		<item>
		<title>Experience installing Hbase 0.20.0 Cluster on Ubuntu 9.04 and EC2</title>
		<link>https://www.ibd.com/howto/experience-installing-hbase-0-20-0-cluster-on-ubuntu-9-04-and-ec2/</link>
					<comments>https://www.ibd.com/howto/experience-installing-hbase-0-20-0-cluster-on-ubuntu-9-04-and-ec2/#comments</comments>
		
		<dc:creator><![CDATA[Robert J Berger]]></dc:creator>
		<pubDate>Sat, 05 Sep 2009 01:34:41 +0000</pubDate>
				<category><![CDATA[HowTo]]></category>
		<category><![CDATA[Runa]]></category>
		<category><![CDATA[Scalable Deployment]]></category>
		<category><![CDATA[AWS]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[EC2]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[HBase]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[ubuntu]]></category>
		<guid isPermaLink="false">http://blog2.ibd.com/?p=237</guid>

					<description><![CDATA[<p>NOTE (Sep 7 2009): Updated info on need to use Amazon Private DNS Names and clarified the need for the masters, slaves and regionservers files. Also updated to use HBase 0.20.0 Release Candidate 3 Introduction As someone who has &#8220;skipped&#8221; Java and wants to learn as little as possible about it, and as one who has not had much experience&#8230;</p>
<p>The post <a href="https://www.ibd.com/howto/experience-installing-hbase-0-20-0-cluster-on-ubuntu-9-04-and-ec2/">Experience installing Hbase 0.20.0 Cluster on Ubuntu 9.04 and EC2</a> first appeared on <a href="https://www.ibd.com">Cognizant Transmutation</a>.</p>]]></description>
										<content:encoded><![CDATA[<p><strong>NOTE (Sep 7 2009):</strong> Updated info on need to use Amazon Private DNS Names and clarified the need for the masters, slaves and regionservers files. Also updated to use HBase 0.20.0 Release Candidate 3</p>
<h2>Introduction</h2>
<p>As someone who has &#8220;skipped&#8221; Java and wants to learn as little as possible about it, and as one who has not had much experience with Hadoop so far, HBase deployment has a big learning curve. So some of the things I describe below may be obvious to those who have had experience in those domains.</p>
<h2>Where&#8217;s the docs for HBase 0.20</h2>
<p>If you go to the HBase wiki, you will find that there is not much documentation on the 0.20 version. This puzzled me since all the twittering, blog posting and other buzz was talking about people using 0.20 even though it&#8217;s &#8220;pre-release&#8221;.</p>
<p>One of the great things about going to meetups such as the <a title="HBase Meetup" href="http://www.meetup.com/hbaseusergroup/" target="_blank">HBase Meetup</a> is you can talk to the folks who actually wrote the thing and ask them &#8220;Where is the documentation for HBase 0.20?&#8221;</p>
<p>Turns out it&#8217;s in the HBase 0.20.0 distribution in the docs directory. The easiest thing is to get the <a href="http://people.apache.org/~stack/hbase-0.20.0-candidate-3" target="_blank">pre-built 0.20.0 release candidate 3</a>. If you download the source from the version control repository you have to build the documentation using Ant. If you are a Java/Ant kind of person it might not be hard. But just to build the docs, you have to meet some dependencies like</p>
<h2>What we learnt with 0.19.x</h2>
<p>We have been learning a lot about making an HBase cluster work at a basic level. I had a lot of problems getting 0.19.x running beyond a single node in Pseudo Distributed mode. I think a lot of my problems were just not getting how it all fit together with Hadoop and what the different startup/shutdown scripts did.</p>
<p>Then we finally tried the <a href="http://issues.apache.org/jira/browse/HBASE-838" target="_blank">HBase EC2 Scripts</a> even though it uses an AMI based on Fedora 8 and seemed wired to 0.19.0. It&#8217;s a pretty nice script if you want to have an opinionated HBase cluster set up. But it did educate us on how to get a cluster to go. It has a bit of strangeness by having a script in /root/hbase_init that is called at boot time to configure all the hadoop and hbase conf scripts and then call the hadoop and hbase startup scripts. Something like this is kind of needed for Amazon EC2 since you don&#8217;t really know what the IP Address/FQDN is until boot time.</p>
<p>The scripts also set up an Amazon Security Group for the cluster master and one for the rest of the cluster. I believe it then uses this as a way to identify the group as well.</p>
<p>The main thing we did get was that, by going thru (mainly) the /root/hbase_init script, we were able to figure out what the process was for bringing up Hadoop/HBase as a cluster.</p>
<p>We did build a staging cluster with this script. We were able to pretty easily change the scripts to use 0.19.3 instead of 0.19.0. But its opinions were different than ours for many things. Plus after talking to the folks at the HBase Meetup, and having all sorts of weird problems with our app on 0.19.3, we were convinced that our future is in HBase 0.20. And 0.20 introduces some new things like using Zookeeper to manage the Master selection, so it seems like it&#8217;s not worth it for us to continue to use this script. Though it helped in our learning quite a bit!</p>
<h2>Building an HBase 0.20.0 Cluster</h2>
<p>This post will use the HBase pre-built Release Candidate 3 and the prebuilt standard Hadoop 0.20.0.</p>
<p>This post will show how to do all this &#8220;by hand&#8221;. Hopefully we&#8217;ll have an article on how to do all this with Chef sometime soon.</p>
<p>The HBase folks say that you really should have at least 5 regionservers and one master. The master and several of the regionservers can also run the zookeeper quorum. Of course the master server is also going to run the Hadoop Name Server and Secondary Name Server. Then the 5 other nodes are running the Hadoop HDFS Data nodes as well as the HBase region servers. When you build out larger clusters, you will probably want to dedicate machines to Zookeepers and hot-standby HBase Masters. Name Servers are still the Single Point of Failure (SPOF). Rumour has it that this will be fixed in Hadoop 0.21.</p>
<p>We&#8217;re not using Map / Reduce yet so won&#8217;t go into that, but it&#8217;s just a matter of different startup scripts to make the same nodes do Map/Reduce as HDFS and HBase.</p>
<p>In this example, we&#8217;re installing and running everything as Root. It can also be done as a special user like hadoop as described in the earlier blog post <a href="http://blog2.ibd.com/scalable-deployment/hadoop-hdfs-an…base-on-ubuntu/" target="_blank">Hadoop, HDFS and Hbase on Ubuntu &amp; Macintosh Leopard</a></p>
<h2 style="font-size: 1.17em;">Getting the pre-requisites in order</h2>
<p>We started with the vanilla <a href="http://alestic.com/" target="_blank">alestic</a> Ubuntu 9.04 Jaunty 64Bit Server AMI: <a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=1951&amp;categoryID=101" target="_blank">ami-5b46a732</a> and instantiated 6 High CPU Large Instances. You really want as much memory and cores as you can get. You can do the following by hand or combine it with the shell scripting described below in the section <em>Installing Hadoop and HBase.</em></p>
<pre>apt-get update
apt-get upgrade</pre>
<p>Then added via apt-get install:</p>
<pre>apt-get install sun-java6-jdk</pre>
<h3>Downloading Hadoop and HBase</h3>
<p>You can use the production Hadoop 0.20.0 release. You can find it at the mirrors listed at http://www.apache.org/dyn/closer.cgi/hadoop/core/. The example here uses one mirror:</p>
<pre>wget http://mirror.cloudera.com/apache/hadoop/core/hadoop-0.20.0/hadoop-0.20.0.tar.gz</pre>
<p>You can download the HBase 0.20.0 Release Candidate 3 in a prebuilt form from <a href="http://people.apache.org/~stack/hbase-0.20.0-candidate-3/" target="_blank">http://people.apache.org/~stack/hbase-0.20.0-candidate-3/</a> (You can get the source out of Version Control: <a href="http://hadoop.apache.org/hbase/version_control.html" target="_blank">http://hadoop.apache.org/hbase/version_control.html</a> but you&#8217;ll have to figure out how to build it.)</p>
<pre>wget http://people.apache.org/~stack/hbase-0.20.0-candidate-3/hbase-0.20.0.tar.gz</pre>
<h3>Installing Hadoop and HBase</h3>
<p>Assuming that you are running in your home directory on the master server and that the target for the versioned packages is in /mnt/pkgs and that there will be a link in /mnt for the path to the home for hadoop and hbase:</p>
<p>You can do some simple scripting to do the following on all the nodes at once:</p>
<p>Create a file named &#8220;servers&#8221; with the list of the fully qualified domain names of all your servers, including &#8220;localhost&#8221; for the master.</p>
<p>Make sure you can ssh to all the servers from the master. Ideally you are using ssh keys. On master:</p>
<pre>ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub &gt;&gt; ~/.ssh/authorized_keys</pre>
<p>On each of your region servers make sure that the id_dsa.pub is also in their authorized_keys (don&#8217;t delete any other keys you have in the authorized keys!)</p>
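<p>One way to add the key on a region server without touching the keys already there is to append it only when it is missing. This is a minimal sketch: <code>append_key_once</code> is a hypothetical helper name, and it assumes the public key file holds a single line and that the directory for the authorized keys file already exists.</p>

```shell
# Appends the public key in key_file to auth_file unless that exact line is already there.
# Existing lines in auth_file are left alone.
# (append_key_once is a hypothetical helper name.)
append_key_once() {
  key_file=$1
  auth_file=$2
  touch "$auth_file"
  grep -qxF -e "$(cat "$key_file")" "$auth_file" || cat "$key_file" >> "$auth_file"
}

# Usage on a region server, after copying id_dsa.pub over from the master:
#   append_key_once ~/id_dsa.pub ~/.ssh/authorized_keys
```

<p>Running it twice leaves only one copy of the key in the file.</p>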
<p>Now with a bit of shell command line scripting you can install on all your servers at once:</p>
<pre>for host in `cat servers`
 do
 echo $host
 ssh $host 'apt-get update; apt-get upgrade; apt-get install sun-java6-jdk'
 scp ~/hadoop-0.20.0.tar.gz ~/hbase-0.20.0.tar.gz $host:
 ssh $host 'mkdir -p /mnt/pkgs; cd /mnt/pkgs; tar xzf ~/hadoop-0.20.0.tar.gz; tar xzf ~/hbase-0.20.0.tar.gz; ln -s /mnt/pkgs/hadoop-0.20.0 /mnt/hadoop; ln -s /mnt/pkgs/hbase-0.20.0 /mnt/hbase'
done</pre>
<h4>Use Amazon Private DNS Names in Config files</h4>
<p>So far I have found that it&#8217;s best to use the Amazon Private DNS names in the hadoop and hbase config files. It looks like HBase uses the system hostname to determine various things at runtime. This is always the Private DNS name. It also means that it&#8217;s difficult to use the Web GUI interfaces to HBase from outside of the Amazon Cloud. I set up a &#8220;desktop&#8221; version of Ubuntu that is running in the Amazon Cloud that I VNC (or NX) into and use its browser to view the Web Interface.</p>
<p>In any case, Amazon instances normally have limited TCP/UDP access to the outside world due to the default security group settings. You would have to add the various ports used by HBase and Hadoop to the security group to allow outside access.</p>
<p>If you do use the Amazon Public DNS names in the config files, there will be startup errors like the following for each instance that is assigned to the zookeeper quorum (there may be other errors as well, but these are the most obvious):</p>
<pre>ec2-75-101-104-121.compute-1.amazonaws.com: java.io.IOException: Could not find my address: domU-12-31-39-06-9D-51.compute-1.internal in list of ZooKeeper quorum servers
ec2-75-101-104-121.compute-1.amazonaws.com:     at org.apache.hadoop.hbase.zookeeper.HQuorumPeer.writeMyID(HQuorumPeer.java:128)
ec2-75-101-104-121.compute-1.amazonaws.com:     at org.apache.hadoop.hbase.zookeeper.HQuorumPeer.main(HQuorumPeer.java:67)</pre>
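<p>Before starting anything, it can save some time to scan your host list files for names that are not Private DNS names. The following is only a sketch. It assumes private names end in <code>.internal</code>, as the ones in this post do, and the function names <code>is_private_dns</code> and <code>list_public_names</code> are made up for this example.</p>

```shell
# Succeeds if the name ends in ".internal", like the private DNS names shown in this post.
# (Heuristic only; is_private_dns is a hypothetical helper name.)
is_private_dns() {
  case $1 in
    *.internal) return 0 ;;
    *) return 1 ;;
  esac
}

# Prints every non-empty line of a host list file (one name per line)
# that does not look like a private DNS name.
list_public_names() {
  while IFS= read -r name || [ -n "$name" ]; do
    [ -z "$name" ] && continue
    is_private_dns "$name" || echo "$name"
  done < "$1"
}

# Usage: list_public_names /mnt/hbase/conf/regionservers
```

<p>If it prints nothing, every name in the file passed the check.</p>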
<h3>Configuring Hadoop</h3>
<p>Now you have to configure hadoop on the master in /mnt/hadoop/conf:</p>
<h4>hadoop-env.sh:</h4>
<p>The minimal things to change are:</p>
<p>Set your JAVA_HOME to where the java package is installed. On Ubuntu:</p>
<pre>export JAVA_HOME=/usr/lib/jvm/java-6-sun</pre>
<p>Add the hbase path to the HADOOP_CLASSPATH:</p>
<pre>export HADOOP_CLASSPATH=/mnt/hbase/hbase-0.20.0.jar:/mnt/hbase/hbase-0.20.0-test.jar:/mnt/hbase/conf</pre>
<h4>core-site.xml:</h4>
<p>Here is what we used. Primarily setting where the hadoop files are and the nameserver path and port:</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&gt;

&lt;configuration&gt;
   &lt;property&gt;
     &lt;name&gt;hadoop.tmp.dir&lt;/name&gt;
     &lt;value&gt;/mnt/hadoop&lt;/value&gt;
   &lt;/property&gt;

   &lt;property&gt;
     &lt;name&gt;fs.default.name&lt;/name&gt;
     &lt;value&gt;hdfs://domU-12-31-39-06-9D-51.compute-1.internal:50001&lt;/value&gt;
   &lt;/property&gt;

   &lt;property&gt;
     &lt;name&gt;tasktracker.http.threads&lt;/name&gt;
     &lt;value&gt;80&lt;/value&gt;
   &lt;/property&gt;
&lt;/configuration&gt;</pre>
<h4>mapred-site.xml:</h4>
<p>Even though we are not currently using Map/Reduce this is a basic config:</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&gt;

&lt;configuration&gt;
   &lt;property&gt;
     &lt;name&gt;mapred.job.tracker&lt;/name&gt;
     &lt;value&gt;domU-12-31-39-06-9D-51.compute-1.internal:50002&lt;/value&gt;
   &lt;/property&gt;

   &lt;property&gt;
     &lt;name&gt;mapred.tasktracker.map.tasks.maximum&lt;/name&gt;
     &lt;value&gt;4&lt;/value&gt;
   &lt;/property&gt;

   &lt;property&gt;
     &lt;name&gt;mapred.tasktracker.reduce.tasks.maximum&lt;/name&gt;
     &lt;value&gt;4&lt;/value&gt;
   &lt;/property&gt;

   &lt;property&gt;
     &lt;name&gt;mapred.output.compress&lt;/name&gt;
     &lt;value&gt;true&lt;/value&gt;
   &lt;/property&gt;

   &lt;property&gt;
     &lt;name&gt;mapred.output.compression.type&lt;/name&gt;
     &lt;value&gt;BLOCK&lt;/value&gt;
   &lt;/property&gt;
&lt;/configuration&gt;</pre>
<h4>hdfs-site.xml:</h4>
<p>The main thing to change based on your config is the dfs.replication. It should not be more than the total number of data-nodes / region-servers, since HDFS cannot place more replicas of a block than there are data-nodes.</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&gt;

&lt;configuration&gt;
   &lt;property&gt;
     &lt;name&gt;dfs.client.block.write.retries&lt;/name&gt;
     &lt;value&gt;3&lt;/value&gt;
   &lt;/property&gt;

   &lt;property&gt;
     &lt;name&gt;dfs.replication&lt;/name&gt;
     &lt;value&gt;3&lt;/value&gt;
   &lt;/property&gt;
&lt;/configuration&gt;</pre>
<p>Put the Fully qualified domain name of your master in the file <em>masters</em> and the names of the data-nodes in the file <em>slaves.</em></p>
<h4>masters:</h4>
<pre>domU-12-31-39-06-9D-51.compute-1.internal</pre>
<h4>slaves:</h4>
<pre>domU-12-31-39-06-9D-C1.compute-1.internal
domU-12-31-39-06-9D-51.compute-1.internal</pre>
<p>We did not change any of the other files so far.</p>
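<p>The dfs.replication value can be sanity-checked against the <em>slaves</em> file. Here is a minimal sketch, assuming one data-node hostname per line in <em>slaves</em>. HDFS cannot place more replicas of a block than there are data-nodes, so the sketch accepts any value up to the node count. The function name <code>check_replication</code> is made up for this example.</p>

```shell
# Compares a dfs.replication value with the number of hosts listed in a slaves file.
# Prints a short verdict and returns non-zero when the value is too high.
# (check_replication is a hypothetical helper name.)
check_replication() {
  replication=$1
  slaves_file=$2
  # Count non-empty lines, one data-node per line.
  nodes=$(grep -c . "$slaves_file")
  if [ "$replication" -le "$nodes" ]; then
    echo "ok: replication $replication, data-nodes $nodes"
  else
    echo "too high: replication $replication, data-nodes $nodes"
    return 1
  fi
}

# Usage: check_replication 3 /mnt/hadoop/conf/slaves
```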
<p>Now copy these files to the data-nodes:</p>
<pre>for host in `cat slaves`
do
  echo $host
  scp slaves masters hdfs-site.xml hadoop-env.sh core-site.xml ${host}:/mnt/hadoop/conf
done</pre>
<p>And also format the hdfs on the master</p>
<pre>/mnt/hadoop/bin/hadoop namenode -format</pre>
<h3>Configuring HBase</h3>
<h4>hbase-env.sh:</h4>
<p>Similar to the hadoop-env.sh, you must set the JAVA_HOME:</p>
<pre>export JAVA_HOME=/usr/lib/jvm/java-6-sun</pre>
<p>and add the hadoop conf directory to the HBASE_CLASSPATH:</p>
<pre>export HBASE_CLASSPATH=/mnt/hadoop/conf</pre>
<p>And for the master you will want to say:</p>
<pre>export HBASE_MANAGES_ZK=true</pre>
<h4>hbase-site.xml:</h4>
<p>Mainly need to define the hbase master, hbase rootdir and the list of zookeepers. We also had to bump up the hbase.zookeeper.property.maxClientCnxns from the default of 30 to 300.</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&gt;
&lt;configuration&gt;
   &lt;property&gt;
     &lt;name&gt;hbase.master&lt;/name&gt;
     &lt;value&gt;domU-12-31-39-06-9D-51.compute-1.internal:60000&lt;/value&gt;
   &lt;/property&gt;

   &lt;property&gt;
     &lt;name&gt;hbase.rootdir&lt;/name&gt;
     &lt;value&gt;hdfs://domU-12-31-39-06-9D-51.compute-1.internal:50001/hbase&lt;/value&gt;
   &lt;/property&gt;
   &lt;property&gt;
     &lt;name&gt;hbase.zookeeper.quorum&lt;/name&gt;
     &lt;value&gt;domU-12-31-39-06-9D-51.compute-1.internal,domU-12-31-39-06-9D-C1.compute-1.internal,domU-12-31-39-06-9D-51.compute-1.internal&lt;/value&gt;
   &lt;/property&gt;
   &lt;property&gt;
     &lt;name&gt;hbase.cluster.distributed&lt;/name&gt;
     &lt;value&gt;true&lt;/value&gt;
   &lt;/property&gt;
   &lt;property&gt;
     &lt;name&gt;hbase.zookeeper.property.maxClientCnxns&lt;/name&gt;
     &lt;value&gt;300&lt;/value&gt;
   &lt;/property&gt;
&lt;/configuration&gt;</pre>
<p>You will also need to have a file called regionservers. Normally it contains the same hostnames as the hadoop slaves:</p>
<h4>regionservers:</h4>
<pre>domU-12-31-39-06-9D-C1.compute-1.internal
domU-12-31-39-06-9D-51.compute-1.internal</pre>
<p>Copy the files to the region-servers:</p>
<pre>for host in `cat regionservers`
do
  echo $host
  scp hbase-env.sh hbase-site.xml regionservers ${host}:/mnt/hbase/conf
done</pre>
<h3>Starting Hadoop and HBase</h3>
<p>On the master:</p>
<p>(This just starts the Hadoop File System services, not Map/Reduce services)</p>
<pre>/mnt/hadoop/bin/start-dfs.sh</pre>
<p>Then start hbase:</p>
<pre>/mnt/hbase/bin/start-hbase.sh</pre>
<p>You can shut things down by doing the reverse:</p>
<pre>/mnt/hbase/bin/stop-hbase.sh
/mnt/hadoop/bin/stop-dfs.sh</pre>
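<p>After starting things up, one way to see whether the daemons actually came up is to look at the output of <code>jps</code>, which ships with the JDK and lists running Java processes. This is only a sketch: the function name <code>missing_daemons</code> is made up, and the daemon names in the usage line are what I would expect on a master that is also listed in <em>slaves</em> and <em>regionservers</em>, so treat that list as an assumption.</p>

```shell
# Reads `jps` output ("<pid> <name>" per line) on stdin and prints each expected
# daemon name, given as an argument, that is not running.
# (missing_daemons is a hypothetical helper name.)
missing_daemons() {
  running=$(cat)
  for daemon in "$@"; do
    # Exact match on the second column, so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$running" |
      awk -v n="$daemon" '$2 == n { found = 1 } END { exit !found }' || echo "$daemon"
  done
}

# Usage on the master (the expected names are an assumption for this layout):
#   jps | missing_daemons NameNode SecondaryNameNode DataNode HQuorumPeer HMaster HRegionServer
```

<p>No output means every daemon you asked about showed up in the list.</p>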
<p>It is advisable to set up init scripts. This is described in the <em>Ubuntu /etc/init.d style startup scripts</em> section of the earlier blog post:<a href="http://blog2.ibd.com/scalable-deployment/hadoop-hdfs-and-hbase-on-ubuntu/" target="_blank">Hadoop, HDFS and Hbase on Ubuntu &amp; Macintosh Leopard</a></p><p>The post <a href="https://www.ibd.com/howto/experience-installing-hbase-0-20-0-cluster-on-ubuntu-9-04-and-ec2/">Experience installing Hbase 0.20.0 Cluster on Ubuntu 9.04 and EC2</a> first appeared on <a href="https://www.ibd.com">Cognizant Transmutation</a>.</p>]]></content:encoded>
					
					<wfw:commentRss>https://www.ibd.com/howto/experience-installing-hbase-0-20-0-cluster-on-ubuntu-9-04-and-ec2/feed/</wfw:commentRss>
			<slash:comments>10</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">237</post-id>	</item>
		<item>
		<title>Want to work at a Startup with Cool Tech? (HBase, Clojure, Chef, Swarms, Javascript, Ruby &#038; Rails)</title>
		<link>https://www.ibd.com/macintosh/want-to-work-at-a-startup-with-cool-tech-hbase-clojure-chef-swarms-javascript-ruby-rails/</link>
		
		<dc:creator><![CDATA[Robert J Berger]]></dc:creator>
		<pubDate>Fri, 28 Aug 2009 18:15:01 +0000</pubDate>
				<category><![CDATA[Macintosh]]></category>
		<category><![CDATA[Opscode Chef]]></category>
		<category><![CDATA[Ruby / Rails]]></category>
		<category><![CDATA[Runa]]></category>
		<category><![CDATA[Scalable Deployment]]></category>
		<category><![CDATA[AWS]]></category>
		<category><![CDATA[Git]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[HBase]]></category>
		<category><![CDATA[rabbitmq]]></category>
		<category><![CDATA[tweekts]]></category>
		<category><![CDATA[ubuntu]]></category>
		<guid isPermaLink="false">http://blog2.ibd.com/?p=253</guid>

					<description><![CDATA[<p>Opportunity Knocks Runa.com, the startup where I am CTO, is looking for great developers to join our small agile team. We&#8217;re an early stage, pre-series-A startup (presently funded with strategic investments from two large corporations). Runa offers a SaaS to on-line merchant that allows them to offer dynamic product and consumer specific promotions embeded in their website. This will be&#8230;</p>
<p>The post <a href="https://www.ibd.com/macintosh/want-to-work-at-a-startup-with-cool-tech-hbase-clojure-chef-swarms-javascript-ruby-rails/">Want to work at a Startup with Cool Tech? (HBase, Clojure, Chef, Swarms, Javascript, Ruby & Rails)</a> first appeared on <a href="https://www.ibd.com">Cognizant Transmutation</a>.</p>]]></description>
										<content:encoded><![CDATA[<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>Opportunity Knocks</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">Runa.com, the startup where I am CTO, is looking for great developers to join our small agile team. We&#8217;re an early stage, pre-series-A startup (presently funded with strategic investments from two large corporations). Runa offers a SaaS to online merchants that allows them to offer dynamic product and consumer specific promotions embedded in their website. This will be a very large positive disruption to the online retailing world.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><span style="text-decoration: underline;">Techie keywords:</span> <strong>clojure, hadoop, hbase, rabbitmq, erlang, chef, swarm computing, ruby, rails, javascript, amazon EC2, emacs, Macintosh, Linux, selenium, test/behavior driven development, agile, lean, XP, scalability</strong></p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">If you&#8217;re interested, email  <a href="mailto:jobs@runa.com">jobs@runa.com</a></p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">If you want to know more, read on!</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>What do we do</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">Runa aims to provide the top of the long tail thru the middle of the top 500 online retailers with tools/services that companies like amazon.com use/provide. These smaller guys can&#8217;t afford or don&#8217;t have the resources to do anything on that scale, but by using our SaaS services, they can make more money while providing customers with greater value.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">The first service we&#8217;re building is what we call Dynamic Sale Price.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">It&#8217;s a simple concept &#8211; it allows the online-retailer to offer a sale price for each product on his site, personalized to the individual consumer who is browsing it. By using this service, merchants are able to &#8211;</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<ul>
<li>Increase conversion (get them to buy!) and</li>
<li>Offer consumers a special price which maximizes the merchant&#8217;s profit</li>
</ul>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">This is different from &#8220;dumb discounting&#8221;, where something is marked down and everyone sees the same price. This service is more like airline or hotel pricing, which varies from day to day, but much more dynamic and real-time. Further, it is based on broad statistical factors AND individual consumer behavior. After all, if you lower prices enough, consumers will buy. Instead, we dynamically lower the price only to the point where, statistically, that consumer is most likely to buy.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>How we do it</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">Runa does this by performing statistical analysis and pattern recognition of what consumers are doing on the merchant sites. This includes browsing products on various pages, adding and removing items from carts, and purchasing or abandoning the carts. We track consumers as they browse, and collect vast quantities of this click-stream data. By mining this data and applying algorithms to determine a price point per consumer based on their behavior, we&#8217;re able to  maximize both conversion (getting the consumer to buy) AND merchant profit.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We also offer the merchant comprehensive reports based on analysis of the mountains of data we collect. Since the data tracks consumer activity down to the individual product SKU level (for each individual consumer), we can provide very rich analytics.  This is a tool that merchants need today, but don&#8217;t have the resources to build for themselves.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>The business model</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">For reference, it is useful to understand the affiliate marketing space. Small-to-medium merchants (our target audience) pay affiliates up to 40% of a sale price. Yes, 40%. The average is in the 20% range.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We charge our merchants around 10% of the sales that Runa delivers. Our merchants are happy to pay it, because it is performance-based pay, lower than what they pay affiliates, and there is zero up-front cost to the service. In fact, the analytics reports mentioned above are free.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We&#8217;re targeting e-commerce PLATFORMS (as opposed to individual merchants); in this way, we&#8217;re able to scale up merchant-acquisition. We have 10 early-customer merchants right now, with about 100 more planned to go live in the next 2-3 months. By the end of next year, we&#8217;re targeting about 1,000 merchants and 10,000 merchants the following year. Our channel deployment model makes these goals achievable.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">At something like a 5 to 10% service charge, and with a typical merchant having between 500K and 1M in sales per year, this is a VERY profitable business model. That is, of course, if we&#8217;re successful&#8230; but we&#8217;re seeing very positive signs so far.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>Technology</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">Most of our front-end stuff (like the merchant dashboard, reports, campaign management) is built with Ruby on Rails. Our merchant integration requires browser-side JavaScript magic. All our analytics (batch processing) and real-time pricing services are written in Clojure. We use RabbitMQ for all our messaging needs. We store data in HBase. We&#8217;re deployed on Amazon&#8217;s EC2.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">Here are a few blog postings about what we&#8217;ve been up to &#8211;</p>
<p><a href="http://s-expressions.com/2009/05/02/startup-logbook-distributed-clojure-system-in-production-v02/" target="_blank">Distributed Clojure system in production</a><br />
<a href="http://s-expressions.com/2009/04/12/using-messaging-for-scalability/" target="_blank">Using messaging for scalability</a><br />
<a href="http://s-expressions.com/2009/03/31/capjure-a-simple-hbase-persistence-layer/" target="_blank">Capjure: a simple HBase persistence layer</a><br />
<a href="http://s-expressions.com/2009/01/28/startup-logbook-clojure-in-production-release-v01/" target="_blank">Clojure in production<br />
</a><span style="color: #0000ee; "><span style="text-decoration: underline;"><a href="http://blog2.ibd.com/scalable-deployment/experience-installing-hbase-0-20-0-cluster-on-ubuntu-9-04-and-ec2/" target="_blank">Experience installing Hbase 0.20.0 Cluster on Ubuntu 9.04 and EC2</a></span></span></p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We&#8217;ve also open-sourced a few of our projects &#8211;</p>
<p><a href="http://github.com/amitrathore/swarmiji/tree/master" target="_blank">swarmiji</a> &#8211; A distributed computing system to write and run Clojure code in parallel, across CPUs<br />
<a href="http://github.com/amitrathore/capjure/tree/master" target="_blank">capjure</a> &#8211; Clojure persistence for HBase</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>Culture at Runa</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We&#8217;re a small team, very passionate about what we do. We&#8217;re focused on delivering a ground-breaking, disruptive service that will allow merchants to really change the way they sell online. We work start-up hours, but we&#8217;re flexible and laid-back about it. We know that a healthy personal life is important for a good professional life. We work with each other to support it.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We use an agile process with a lot of influences from the <a href="http://en.wikipedia.org/wiki/Lean_software_development" target="_blank">Lean</a> and <a href="http://leansoftwareengineering.com/2007/08/29/kanban-systems-for-software-development/" target="_blank">Kanban</a> world. We use <a href="http://studios.thoughtworks.com/mingle-agile-project-management" target="_blank">Mingle</a> to run our development process. Everything, OK mostly everything <img src="https://s.w.org/images/core/emoji/14.0.0/72x72/1f642.png" alt="🙂" class="wp-smiley" style="height: 1em; max-height: 1em;" /> is covered by automated tests, so we can change things as needed.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We&#8217;re all Apple in the office &#8211; developers get a MacPro with a nice 30&#8243; screen, and a nice 17&#8243; MacBook Pro.  We deploy on Ubuntu servers.  Aeron chairs are cliché, yes; but, very comfy.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">The environment is chilled out&#8230; you can wear shorts and sandals to work&#8230;  Very flat organization, very non-bureaucratic&#8230; nice open spaces (no cubes!). Lunch is brought in on most days! Beer and snacks are always in the fridge.</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We&#8217;re walking distance to the San Antonio Caltrain station (biking distance from the Mountain View Caltrain/VTA lightrail station).</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>What&#8217;s in it for you</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<ul>
<li>Competitive salaries, and lots of stock-options</li>
<li>Cutting edge technology stack</li>
<li>Fantastic business opportunity, and early-stage (= great time to join!)</li>
<li>Developer #5 &#8211; means plenty of influence on foundational architecture and design</li>
<li>Smart, full bandwidth, fun people to work with</li>
<li>Very comfortable, nice office environment</li>
<li>We have a &#8220;No Assholes&#8221; policy</li>
</ul>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<h1 style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;"><strong>OK!</strong></h1>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana; min-height: 15.0px;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">So, if you&#8217;re interested, email us at <a href="mailto:jobs@runa.com">jobs@runa.com</a></p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">No recruiters please!</p>
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">
<p style="margin: 0.0px 0.0px 0.0px 0.0px; line-height: 14.0px; font: 12.0px Verdana;">We would prefer folks who are already in the Bay Area (but if you are not local and are really great, let&#8217;s talk!)</p>
<p>The post <a href="https://www.ibd.com/macintosh/want-to-work-at-a-startup-with-cool-tech-hbase-clojure-chef-swarms-javascript-ruby-rails/">Want to work at a Startup with Cool Tech? (HBase, Clojure, Chef, Swarms, Javascript, Ruby & Rails)</a> first appeared on <a href="https://www.ibd.com">Cognizant Transmutation</a>.</p>]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">253</post-id>	</item>
		<item>
		<title>Hadoop, HDFS and Hbase on Ubuntu &#038; Macintosh Leopard</title>
		<link>https://www.ibd.com/runa/hadoop-hdfs-and-hbase-on-ubuntu/</link>
					<comments>https://www.ibd.com/runa/hadoop-hdfs-and-hbase-on-ubuntu/#comments</comments>
		
		<dc:creator><![CDATA[Robert J Berger]]></dc:creator>
		<pubDate>Tue, 06 Jan 2009 02:19:16 +0000</pubDate>
				<category><![CDATA[Runa]]></category>
		<category><![CDATA[Scalable Deployment]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[HBase]]></category>
		<category><![CDATA[ubuntu]]></category>
		<guid isPermaLink="false">http://blog2.ibd.com/?p=95</guid>

					<description><![CDATA[<p>UPDATE: This has been replaced by a newer post Experience installing Hbase 0.20.0 Cluster on Ubuntu 9.04 and EC2 . I found that using the pre-built distributions of Hadoop and HBase much better than trying to build from source. I need more Java/Ant-fu to do the build from scratch. The HBase-0.20.0 Release Candidates are really great and seemingly easier to&#8230;</p>
<p>The post <a href="https://www.ibd.com/runa/hadoop-hdfs-and-hbase-on-ubuntu/">Hadoop, HDFS and Hbase on Ubuntu & Macintosh Leopard</a> first appeared on <a href="https://www.ibd.com">Cognizant Transmutation</a>.</p>]]></description>
										<content:encoded><![CDATA[<p><strong>UPDATE: </strong>This has been replaced by a newer post, <a href="http://blog2.ibd.com/scalable-deployment/experience-installing-hbase-0-20-0-cluster-on-ubuntu-9-04-and-ec2/" target="_blank">Experience installing Hbase 0.20.0 Cluster on Ubuntu 9.04 and EC2</a>. I found using the pre-built distributions of Hadoop and HBase to be much better than trying to build from source. I need more Java/Ant-fu to do the build from scratch. The HBase-0.20.0 Release Candidates are really great and seem to make it easier to get a cluster going than previous releases did.</p>
<h2>Introduction</h2>
<p>Hadoop and MapReduce are all the rage nowadays, so we figure we should be using them too.</p>
<p>HBase is an open-source implementation of Google&#8217;s Bigtable design. It&#8217;s built on top of the Hadoop Distributed File System (HDFS).</p>
<p>It&#8217;s trivial to install HBase standalone on top of a local filesystem, but I had some difficulty getting it working on top of HDFS in the &#8220;Pseudo-Distributed&#8221; mode.</p>
<h2>Follow the Instructions</h2>
<p>I set up Hadoop with no problems following the <a href="http://hadoop.apache.org/core/docs/current/quickstart.html#PseudoDistributed">instructions on the Hadoop site</a> for Pseudo-Distributed Operation. In this mode HBase runs on top of HDFS, but everything runs on one server (i.e. it&#8217;s configured pretty much like a cluster, but all the pieces are on the same machine). Another helpful set of instructions is at <a href="http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Single-Node_Cluster%29">Running Hadoop On Ubuntu Linux (Single-Node Cluster)</a>.</p>
<p>I followed the <a href="http://hadoop.apache.org/hbase/docs/current/api/overview-summary.html#overview_description">HBase installation instructions</a> also for Pseudo-Distributed Operation.</p>
<p>A few things to be aware of:</p>
<ul>
<li> Make sure that the Hadoop and HBase versions are from the same release line, i.e. the first two components of the version number match<br />
(I used Hadoop 0.18.2 and HBase 0.18.1)</li>
<li> Make sure that the Hadoop and HBase trees, as well as the directories and files that hold the HDFS filesystem, are owned by hadoop:hadoop (you have to create the user and group)</li>
<li> No need to disable IPv6, as some sites suggest</li>
</ul>
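<p>The first point is easy to check mechanically. Here is a minimal sketch, assuming version strings of the form 0.18.2 and that &#8220;same release line&#8221; means the first two components agree. The function name same_release_line is made up for illustration; it is not part of Hadoop or HBase.</p>

```shell
#!/bin/sh
# Hypothetical helper: succeed when two version strings share the same
# first two components (e.g. 0.18.2 and 0.18.1 are both in the 0.18 line).
same_release_line() {
    a=$(printf '%s\n' "$1" | cut -d. -f1,2)
    b=$(printf '%s\n' "$2" | cut -d. -f1,2)
    [ "$a" = "$b" ]
}

# Example: compare the versions used in this post.
if same_release_line 0.18.2 0.18.1; then
    echo "Hadoop and HBase release lines match"
else
    echo "Release lines differ; expect trouble"
fi
```

<p>The same test can be dropped into a deployment script before untarring anything.</p>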
<p>You can download the Hadoop tar file from <a href="http://www.apache.org/dyn/closer.cgi/hadoop/core/" target="_blank">http://www.apache.org/dyn/closer.cgi/hadoop/core/</a> and the HBase tar file from <a href="http://www.apache.org/dyn/closer.cgi/hadoop/hbase/" target="_blank">http://www.apache.org/dyn/closer.cgi/hadoop/hbase/</a>. They are also available as git repositories via:</p>
<pre>git clone git://git.apache.org/hadoop.git
git clone git://git.apache.org/hbase.git</pre>
<p>You can track a particular release with the following commands (we&#8217;re stuck at Hadoop 0.19.1 / HBase 0.19.0):</p>
<pre>cd hadoop
git branch --track release-0.19.1 origin/tags/release-0.19.1
git checkout release-0.19.1
cd ../hbase
git branch --track 0.19.0 origin/tags/0.19.0
git checkout 0.19.0</pre>
<p>Then build in each directory. As far as I can tell, the default ant target is all you need, but you can also build the jar explicitly:</p>
<pre>cd ../hadoop
ant
ant jar</pre>
<pre>cd ../hbase
ant
ant jar</pre>
<h2>Biggest Problem I Had</h2>
<p>The thing that took the longest time to get right was accessing HBase from other hosts. You would think you could put the DNS Fully Qualified Domain Name (FQDN) in the config file. It turns out that, by default, the Hadoop tools don&#8217;t seem to use the host&#8217;s DNS resolver and use only what is in /etc/hosts (as far as I can tell). So you have to use the IP address in the config file.</p>
<p>I believe there are ways to configure around this, but I haven&#8217;t found one yet.</p>
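<p>When chasing this kind of problem, it helps to see exactly what /etc/hosts says for a given name. The following is only a sketch: hosts_lookup is a made-up helper, and the host name in the example is a placeholder. It prints the first address that a hosts-format file maps to a name, and prints nothing if the name is not listed.</p>

```shell
#!/bin/sh
# Hypothetical helper: print the first address that a hosts-format file
# maps to the given name. The file defaults to /etc/hosts.
hosts_lookup() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v n="$name" '
        { sub(/#.*/, "") }                    # drop comments
        { for (i = 2; i <= NF; i++)           # fields after the address are names
              if ($i == n) { print $1; exit } }
    ' "$file"
}

# Example (the host name is a placeholder):
# hosts_lookup hbase-master.example.com
```

<p>If the lookup comes back empty, the name is not in /etc/hosts, which matches the situation where I had to fall back to the IP address.</p>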
<h2>Configuration Examples</h2>
<h2>File System Layout</h2>
<p>I untarred the distributions into /usr/local/pkgs, made symbolic links to them at /usr/local/hadoop and /usr/local/hbase, and created the directory that Hadoop/HDFS will use for storage. First, create the hadoop user and group.</p>
<p>For Ubuntu:</p>
<pre>sudo addgroup hadoop
sudo adduser --ingroup hadoop hadoop</pre>
<p>For Mac:</p>
<p>Create a Home Directory</p>
<pre>mkdir /Users/_hadoop</pre>
<p>Find an unused groupid by seeing what ids are already in use:</p>
<pre>sudo dscl . -list /Groups PrimaryGroupID | cut -c 32-34 | sort -rn</pre>
<p>Then find an unused userid by seeing what userid&#8217;s are in use:</p>
<pre>sudo dscl . -list /Users UniqueID | cut -c 20-22 | sort -rn</pre>
<p>Pick a number that is in neither list. In our case we will use 402 for both the userid and groupid for _hadoop (Mac OS X puts an underscore in front of daemon user/group names). We will also add hadoop, without the underscore, as an alias record name:</p>
<pre>sudo dscl . -create /Groups/_hadoop PrimaryGroupID 402
sudo dscl . -append /Groups/_hadoop RecordName hadoop</pre>
<p>Use the same number (402 in this case) as the PrimaryGroupID when creating the user:</p>
<pre>sudo dscl . -create /Users/_hadoop UniqueID 402
sudo dscl . -create /Users/_hadoop RealName "Hadoop Service"
sudo dscl . -create /Users/_hadoop PrimaryGroupID 402
sudo dscl . -create /Users/_hadoop NFSHomeDirectory /Users/_hadoop
sudo dscl . -append /Users/_hadoop RecordName hadoop</pre>
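<p>Picking the unused ID by eye is error-prone, so here is a sketch of doing it with a small filter. The helper first_free_id is hypothetical, and the starting value of 400 in the example is just an assumption. It reads IDs on stdin and prints the lowest one at or above the starting value that is not taken.</p>

```shell
#!/bin/sh
# Hypothetical helper: read numeric IDs (one per line) on stdin and print
# the lowest ID >= $1 that does not appear in the input.
first_free_id() {
    start=$1
    sort -n | uniq | awk -v id="$start" '
        $1 == id { id++ }     # taken: try the next number
        END      { print id }
    '
}

# On Mac OS X, feed it the IDs from both listings, e.g.:
# { dscl . -list /Groups PrimaryGroupID; dscl . -list /Users UniqueID; } |
#     awk '{ print $NF }' | first_free_id 400
```

<p>Taking the last field with awk avoids depending on the column positions that the cut commands above assume.</p>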
<p>For both Ubuntu and Mac (Note that the Mac will end up having a user/group id of _hadoop)</p>
<pre>cd /usr/local/pkgs
tar xzf hadoop-0.18.2.tar.gz
tar xzf hbase-0.18.1.tar.gz

cd ..
ln -s /usr/local/pkgs/hadoop-0.18.2 hadoop
ln -s /usr/local/pkgs/hbase-0.18.1 hbase
mkdir /var/hadoop_datastore
chown -R hadoop:hadoop hadoop/ hbase/ /var/hadoop_datastore /Users/_hadoop</pre>
<h2>Hadoop Config files</h2>
<p>The following are all in /usr/local/hadoop/conf</p>
<h4>hadoop-env.sh</h4>
<p>You need to set the JAVA_HOME variable. I installed Java 6 via Synaptic. You can also install it with:</p>
<pre>apt-get install sun-java6-jdk</pre>
<p>The Macintosh is easy if you have an Intel Core 2 Duo (the Intel Core Duo doesn&#8217;t count). Apple only supports Java 1.6 on its 64-bit processors. If you have a 32-bit processor, like the first-generation 17&#8243; MacBook Pro or the first-generation Mac mini, or you have a PPC, see <a href="http://wiki.netbeans.org/JavaFXAndJDK6On32BitMacOS" target="_blank">Tech Tip: How to Set Up JDK 6 and JavaFX on 32-bit Intel Macs</a>.</p>
<p>So my config is (only the things I changed, the rest was left as is):</p>
<pre>...
# The java implementation to use.  Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
export JAVA_HOME=/usr/lib/jvm/java-6-sun
...</pre>
<p>For the Macintosh:</p>
<pre>export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/Current</pre>
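<p>Since the Hadoop scripts run the java binary under JAVA_HOME, it is worth checking that the directory really contains one before starting anything. A minimal sketch follows; valid_java_home is a made-up name, and the check only looks for an executable bin/java, nothing more.</p>

```shell
#!/bin/sh
# Hypothetical helper: succeed when the given directory looks like a
# usable JAVA_HOME, i.e. it contains an executable bin/java.
valid_java_home() {
    [ -n "$1" ] || return 1
    [ -x "$1/bin/java" ]
}

# Example: report on whatever JAVA_HOME is set in the current shell.
if valid_java_home "${JAVA_HOME:-}"; then
    echo "JAVA_HOME looks usable: $JAVA_HOME"
else
    echo "JAVA_HOME is unset or has no bin/java"
fi
```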
<h4>hadoop-site.xml</h4>
<pre>&lt;?xml version="1.0"?&gt;
&lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&gt;
&lt;!-- Put site-specific property overrides in this file. --&gt;
&lt;configuration&gt;
&lt;property&gt;
  &lt;name&gt;hadoop.tmp.dir&lt;/name&gt;
  &lt;value&gt;/var/hadoop_datastore/hadoop-${user.name}&lt;/value&gt;
  &lt;description&gt;A base for other temporary directories.&lt;/description&gt;
&lt;/property&gt;

&lt;property&gt;
  &lt;name&gt;fs.default.name&lt;/name&gt;
  &lt;value&gt;hdfs://localhost:54310&lt;/value&gt;
  &lt;description&gt;The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.&lt;/description&gt;
&lt;/property&gt;

&lt;property&gt;
  &lt;name&gt;mapred.job.tracker&lt;/name&gt;
  &lt;value&gt;localhost:54311&lt;/value&gt;
  &lt;description&gt;The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  &lt;/description&gt;
&lt;/property&gt;

&lt;property&gt;
  &lt;name&gt;dfs.replication&lt;/name&gt;
  &lt;value&gt;1&lt;/value&gt;
  &lt;description&gt;Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  &lt;/description&gt;
&lt;/property&gt;
&lt;!-- As per note in http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200810.mbox/&lt;C20126171.post@talk.nabble.com&gt; --&gt;
&lt;property&gt;
  &lt;name&gt;dfs.datanode.socket.write.timeout&lt;/name&gt;
  &lt;value&gt;0&lt;/value&gt;
&lt;/property&gt;

&lt;property&gt;
   &lt;name&gt;dfs.datanode.max.xcievers&lt;/name&gt;
   &lt;value&gt;1023&lt;/value&gt;
&lt;/property&gt;
&lt;/configuration&gt;</pre>
<h2>HBase Config Files</h2>
<p>The following are all in /usr/local/hbase/conf</p>
<h4>hbase-env.sh</h4>
<p>Again, just need to set up JAVA_HOME:</p>
<pre>...
# The java implementation to use.  Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
export JAVA_HOME=/usr/lib/jvm/java-6-sun
...</pre>
<p>For the Macintosh:</p>
<pre>export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/Current</pre>
<h4>hbase-site.xml</h4>
<p>Here is where I wanted to give a FQDN for the host that is the hbase.master, but had to use an IP address instead.</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&gt;
&lt;configuration&gt;
  &lt;property&gt;
    &lt;name&gt;hbase.rootdir&lt;/name&gt;
    &lt;value&gt;hdfs://localhost:54310/hbase&lt;/value&gt;
    &lt;description&gt;The directory shared by region servers.
    Should be fully-qualified to include the filesystem to use.
    E.g: hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR
    &lt;/description&gt;
  &lt;/property&gt;

  &lt;property&gt;
    &lt;name&gt;hbase.master&lt;/name&gt;
    &lt;value&gt;192.168.10.50:60000&lt;/value&gt;
    &lt;description&gt;The host and port that the HBase master runs at.
    &lt;/description&gt;
  &lt;/property&gt;
&lt;/configuration&gt;</pre>
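<p>Note that hbase.rootdir has to point at the same host and port as fs.default.name in hadoop-site.xml (hdfs://localhost:54310 in both files here). The sketch below checks that. Both function names are made up, and site_prop is a line-oriented hack rather than a real XML parser: it assumes each name and value sits on its own line, as in the files above.</p>

```shell
#!/bin/sh
# Hypothetical helper: print the value that follows the named property
# in a Hadoop-style *-site.xml. Splits each line on angle brackets, so
# it only works when name and value are each on a single line.
site_prop() {
    awk -F'[<>]' -v n="$2" '
        $2 == "name" && $3 == n { found = 1; next }
        found && $2 == "value"  { print $3; exit }
    ' "$1"
}

# Succeed when hbase.rootdir lives on the filesystem named by fs.default.name.
# $1 is the path to hadoop-site.xml, $2 the path to hbase-site.xml.
rootdir_matches_fs() {
    fs=$(site_prop "$1" fs.default.name)
    root=$(site_prop "$2" hbase.rootdir)
    [ -n "$fs" ] || return 1
    case "$root" in
        "$fs"/*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Example, using the paths from this post:
# rootdir_matches_fs /usr/local/hadoop/conf/hadoop-site.xml \
#     /usr/local/hbase/conf/hbase-site.xml || echo "rootdir mismatch"
```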
<h2>Formatting the Name Node</h2>
<p>You must do this as the same user that will be running the daemons (hadoop). Note the quotes, so that the whole command is passed to -c:</p>
<pre>su hadoop -s /bin/sh -c "/usr/local/hadoop/bin/hadoop namenode -format"</pre>
<p>on the Mac:</p>
<pre>/usr/bin/su _hadoop /usr/local/hadoop/bin/hadoop namenode -format</pre>
<h2>Setup passphraseless ssh</h2>
<p>Now check that you can ssh to the localhost without a passphrase:</p>
<pre>su - hadoop
ssh localhost</pre>
<p>If you cannot ssh to localhost without a passphrase, execute the following commands (as hadoop):</p>
<pre>$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub &gt;&gt; ~/.ssh/authorized_keys</pre>
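<p>Running the cat command twice leaves duplicate lines in authorized_keys. If you script this step, something like the following sketch avoids that. The name authorize_key_once is made up, and the chmod 600 is my assumption about the permissions sshd will accept.</p>

```shell
#!/bin/sh
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already there.
#   $1 = public key file, $2 = authorized_keys file
authorize_key_once() {
    pub=$1
    auth=$2
    touch "$auth"
    if ! grep -qxF -f "$pub" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    chmod 600 "$auth"
}

# Example (as the hadoop user):
# authorize_key_once ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys
# A non-interactive check that fails instead of prompting:
# ssh -o BatchMode=yes localhost true
```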
<h2>Ubuntu /etc/init.d style startup scripts</h2>
<p>I scoured the InterTubes for example hadoop/hbase startup scripts and found absolutely none! I ended up creating a minimal one that is so far only suited for the Pseudo-Distributed Operation mode as it just calls the start-all / stop-all scripts.</p>
<h4>/etc/init.d/hadoop</h4>
<p>Create the place it will put its startup logs</p>
<pre>mkdir /var/log/hadoop</pre>
<p>Create /etc/init.d/hadoop with the following:</p>
<pre>#!/bin/sh
### BEGIN INIT INFO
# Provides:          hadoop services
# Required-Start:    $network
# Required-Stop:     $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Description:       Hadoop services
# Short-Description: Enable Hadoop services including hdfs
### END INIT INFO
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
HADOOP_BIN=/usr/local/hadoop/bin
NAME=hadoop
DESC=hadoop
USER=hadoop
ROTATE_SUFFIX=
test -x $HADOOP_BIN || exit 0
RETVAL=0
set -e
cd /

start_hadoop () {
    set +e
    su $USER -s /bin/sh -c $HADOOP_BIN/start-all.sh &gt; /var/log/hadoop/startup_log
    case "$?" in
      0)
        echo SUCCESS
        RETVAL=0
        ;;
      1)
        echo TIMEOUT - check /var/log/hadoop/startup_log
        RETVAL=1
        ;;
      *)
        echo FAILED - check /var/log/hadoop/startup_log
        RETVAL=1
        ;;
    esac
    set -e
}

stop_hadoop () {
    set +e
    if [ $RETVAL = 0 ] ; then
        su $USER -s /bin/sh -c $HADOOP_BIN/stop-all.sh &gt; /var/log/hadoop/shutdown_log
        RETVAL=$?
        if [ $RETVAL != 0 ] ; then
            echo FAILED - check /var/log/hadoop/shutdown_log
        fi
    else
        echo No nodes running
        RETVAL=0
    fi
    set -e
}

restart_hadoop() {
    stop_hadoop
    start_hadoop
}

case "$1" in
    start)
        echo -n "Starting $DESC: "
        start_hadoop
        echo "$NAME."
        ;;
    stop)
        echo -n "Stopping $DESC: "
        stop_hadoop
        echo "$NAME."
        ;;
    force-reload|restart)
        echo -n "Restarting $DESC: "
        restart_hadoop
        echo "$NAME."
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|force-reload}" &gt;&amp;2
        RETVAL=1
        ;;
esac
exit $RETVAL</pre>
<h4>/etc/init.d/hbase</h4>
<p>Create the place it will put its startup logs</p>
<pre>mkdir /var/log/hbase</pre>
<p>Create /etc/init.d/hbase with the following:</p>
<pre>#!/bin/sh
### BEGIN INIT INFO
# Provides:          hbase services
# Required-Start:    $network
# Required-Stop:     $network
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Description:       Hbase services
# Short-Description: Enable Hbase services including hdfs
### END INIT INFO

PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
HBASE_BIN=/usr/local/hbase/bin
NAME=hbase
DESC=hbase
USER=hadoop
ROTATE_SUFFIX=
test -x $HBASE_BIN || exit 0
RETVAL=0
set -e
cd /

start_hbase () {
    set +e
    su $USER -s /bin/sh -c $HBASE_BIN/start-hbase.sh &gt; /var/log/hbase/startup_log
    case "$?" in
      0)
        echo SUCCESS
        RETVAL=0
        ;;
      1)
        echo TIMEOUT - check /var/log/hbase/startup_log
        RETVAL=1
        ;;
      *)
        echo FAILED - check /var/log/hbase/startup_log
        RETVAL=1
        ;;
    esac
    set -e
}

stop_hbase () {
    set +e
    if [ $RETVAL = 0 ] ; then
        su $USER -s /bin/sh -c $HBASE_BIN/stop-hbase.sh &gt; /var/log/hbase/shutdown_log
        RETVAL=$?
        if [ $RETVAL != 0 ] ; then
            echo FAILED - check /var/log/hbase/shutdown_log
        fi
    else
        echo No nodes running
        RETVAL=0
    fi
    set -e
}

restart_hbase() {
    stop_hbase
    start_hbase
}

case "$1" in
    start)
        echo -n "Starting $DESC: "
        start_hbase
        echo "$NAME."
        ;;
    stop)
        echo -n "Stopping $DESC: "
        stop_hbase
        echo "$NAME."
        ;;
    force-reload|restart)
        echo -n "Restarting $DESC: "
        restart_hbase
        echo "$NAME."
        ;;
    *)
        echo "Usage: $0 {start|stop|restart|force-reload}" &gt;&amp;2
        RETVAL=1
        ;;
esac
exit $RETVAL</pre>
<h4>Set up the init system</h4>
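<p>This assumes you put the above init files in /etc/init.d. HBase gets start priority 25 and stop priority 15, so that it starts after Hadoop (which uses the default of 20) and stops before it:</p>
<pre>chmod +x /etc/init.d/{hbase,hadoop}
update-rc.d hadoop defaults
update-rc.d hbase defaults 25 15</pre>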
<p>You can now start / stop hadoop by saying:</p>
<pre>/etc/init.d/hadoop start</pre>
<pre>/etc/init.d/hadoop stop</pre>
<p>And similarly with hbase</p>
<pre>/etc/init.d/hbase start</pre>
<pre>/etc/init.d/hbase stop</pre>
<p>Make sure you start hadoop before hbase, and stop hbase before you stop hadoop.</p>
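<p>To avoid getting the order wrong by hand, the two init scripts can be wrapped. This is only a sketch: the function name stack and the INIT_DIR variable are my own inventions, and it stops at the first script that fails.</p>

```shell
#!/bin/sh
# Hypothetical wrapper: run both init scripts in a safe order.
# INIT_DIR can be overridden (handy for dry runs); it defaults to /etc/init.d.
INIT_DIR=${INIT_DIR:-/etc/init.d}

stack() {
    case "$1" in
        start) order="hadoop hbase" ;;   # HDFS must be up before HBase
        stop)  order="hbase hadoop" ;;   # HBase must go down before HDFS
        *)     echo "Usage: stack {start|stop}" >&2; return 1 ;;
    esac
    for svc in $order; do
        "$INIT_DIR/$svc" "$1" || return 1
    done
}

# Example (as root):
# stack start
# stack stop
```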
<h2>Macintosh launchd style startup</h2>
<p>Starting processes on Macintosh Leopard is pretty easy with launchd/launchctl.</p>
<p>For hadoop, create a file /Library/LaunchAgents/com.yourdomain.hadoop.plist with the following content (replace yourdomain with the domain you want to use for this class of apps):</p>
<pre><code>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"&gt;
&lt;plist version="1.0"&gt;
&lt;dict&gt;
    &lt;key&gt;GroupName&lt;/key&gt;
    &lt;string&gt;_hadoop&lt;/string&gt;
    &lt;key&gt;KeepAlive&lt;/key&gt;
    &lt;true/&gt;
    &lt;key&gt;Label&lt;/key&gt;
    &lt;string&gt;com.yourdomain.hadoop&lt;/string&gt;
    &lt;key&gt;ProgramArguments&lt;/key&gt;
    &lt;array&gt;
        &lt;string&gt;/usr/local/hadoop/bin/start-all.sh&lt;/string&gt;
    &lt;/array&gt;
    &lt;key&gt;RunAtLoad&lt;/key&gt;
    &lt;true/&gt;
    &lt;key&gt;ServiceDescription&lt;/key&gt;
    &lt;string&gt;Hadoop Process&lt;/string&gt;
    &lt;key&gt;UserName&lt;/key&gt;
    &lt;string&gt;_hadoop&lt;/string&gt;
&lt;/dict&gt;
&lt;/plist&gt;
</code></pre>
<p>And for hbase, /Library/LaunchAgents/com.yourdomain.hbase.plist:</p>
<pre><code>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"&gt;
&lt;plist version="1.0"&gt;
&lt;dict&gt;
	&lt;key&gt;GroupName&lt;/key&gt;
	&lt;string&gt;_hadoop&lt;/string&gt;
	&lt;key&gt;KeepAlive&lt;/key&gt;
	&lt;true/&gt;
	&lt;key&gt;Label&lt;/key&gt;
	&lt;string&gt;com.yourdomain.hbase&lt;/string&gt;
	&lt;key&gt;ProgramArguments&lt;/key&gt;
	&lt;array&gt;
		&lt;string&gt;/usr/local/hbase/bin/start-hbase.sh&lt;/string&gt;
	&lt;/array&gt;
	&lt;key&gt;RunAtLoad&lt;/key&gt;
	&lt;true/&gt;
	&lt;key&gt;UserName&lt;/key&gt;
	&lt;string&gt;_hadoop&lt;/string&gt;
&lt;/dict&gt;
&lt;/plist&gt;
</code></pre>
<p>Set the owner to root and the mode to 644:</p>
<pre>chown root /Library/LaunchAgents/com.yourdomain.hadoop.plist /Library/LaunchAgents/com.yourdomain.hbase.plist
chmod 644 /Library/LaunchAgents/com.yourdomain.hadoop.plist /Library/LaunchAgents/com.yourdomain.hbase.plist</pre>
<p>The next time you restart, it should start hbase and hadoop. You can also start them manually with the commands:</p>
<pre>sudo launchctl load /Library/LaunchAgents/com.yourdomain.hadoop.plist
sudo launchctl load /Library/LaunchAgents/com.yourdomain.hbase.plist</pre>
<h2>Conclusion</h2>
<p>You should now be able to see the HBase web interface at http://&lt;your domain name&gt;:60010</p>
<p>If you have problems check /var/log/{hbase,hadoop}/startup_log as well as /usr/local/hadoop/logs/hadoop-hadoop-namenode-yourhostname.log and /usr/local/hbase/logs/hbase-hadoop-master-yourhostname.log</p>
<p>The error messages are pretty poor (i.e. useless, as far as I could tell, when tracking down the FQDN/IP address problem), but they are better than nothing.</p>
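<p>To skim all of those logs in one go, a small helper along these lines works. It is a sketch: tail_existing is a made-up name, and using the output of hostname for the yourhostname part of the file names is an assumption.</p>

```shell
#!/bin/sh
# Hypothetical helper: print the last N lines of every log file named on
# the command line that actually exists, skipping the ones that do not.
#   $1 = number of lines, remaining arguments = log files
tail_existing() {
    n=$1
    shift
    for f in "$@"; do
        if [ -f "$f" ]; then
            echo "== $f"
            tail -n "$n" "$f"
        fi
    done
}

# Example, using the paths mentioned above:
# h=$(hostname)
# tail_existing 20 /var/log/hadoop/startup_log /var/log/hbase/startup_log \
#     "/usr/local/hadoop/logs/hadoop-hadoop-namenode-$h.log" \
#     "/usr/local/hbase/logs/hbase-hadoop-master-$h.log"
```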
<p>I will post an update when I deploy a Full Cluster.</p>]]></content:encoded>
					
					<wfw:commentRss>https://www.ibd.com/runa/hadoop-hdfs-and-hbase-on-ubuntu/feed/</wfw:commentRss>
			<slash:comments>8</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">95</post-id>	</item>
		<item>
		<title>Deploying RabbitMQ and Stomp on Ubuntu</title>
		<link>https://www.ibd.com/runa/deploying-rabbitmq-and-stomp-on-ubuntu/</link>
					<comments>https://www.ibd.com/runa/deploying-rabbitmq-and-stomp-on-ubuntu/#comments</comments>
		
		<dc:creator><![CDATA[Robert J Berger]]></dc:creator>
		<pubDate>Fri, 02 Jan 2009 10:33:31 +0000</pubDate>
				<category><![CDATA[Runa]]></category>
		<category><![CDATA[Scalable Deployment]]></category>
		<category><![CDATA[Sysadmin]]></category>
		<category><![CDATA[Cloud Computing]]></category>
		<category><![CDATA[rabbitmq]]></category>
		<category><![CDATA[stomp]]></category>
		<category><![CDATA[ubuntu]]></category>
		<guid isPermaLink="false">http://blog2.ibd.com/?p=89</guid>

					<description><![CDATA[<p>Install rabbitmq via synaptic Make sure that the erlang package is installed Add a repository from the rabbitmq site Set up Repository via the Synaptic GUI tool (http://www.rabbitmq.com/debian/) Set up Repository via command line Ubuntu Documentation for Managing Repositories via the Command Line How to use the RabbitMQ Debian repository and available RabbitMQ Debian packages The repositories are described in&#8230;</p>
<p>The post <a href="https://www.ibd.com/runa/deploying-rabbitmq-and-stomp-on-ubuntu/">Deploying RabbitMQ and Stomp on Ubuntu</a> first appeared on <a href="https://www.ibd.com">Cognizant Transmutation</a>.</p>]]></description>
										<content:encoded><![CDATA[<h2>Install rabbitmq via synaptic</h2>
<h3>Make sure that the erlang package is installed</h3>
<h3>Add a repository from the rabbitmq site</h3>
<h4>Set up Repository via the Synaptic GUI tool</h4>
<p>(http://www.rabbitmq.com/debian/)</p>
<h4>Set up Repository via command line</h4>
<p>Ubuntu Documentation for <a href="https://help.ubuntu.com/community/Repositories/CommandLine" target="_blank" rel="noopener">Managing Repositories via the Command Line</a></p>
<p>How to use the RabbitMQ Debian repository and available <a href="http://www.rabbitmq.com/debian.html" target="_blank" rel="noopener">RabbitMQ Debian packages</a></p>
<p>The repositories are described in /etc/apt/sources.list</p>
<p>So do the following:</p>
<pre>sudo cp /etc/apt/sources.list /etc/apt/sources.list.backup</pre>
<p>Edit /etc/apt/sources.list and add the following line:</p>
<pre>deb http://www.rabbitmq.com/debian/ testing main</pre>
<p>Then import the RabbitMQ signing key and update the apt-get package index:</p>
<pre class="sourcecode">wget http://www.rabbitmq.com/rabbitmq-signing-key-public.asc
sudo apt-key add rabbitmq-signing-key-public.asc
sudo apt-get update</pre>
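<p>The manual edit of /etc/apt/sources.list can also be scripted. The following is only a minimal sketch: <code>add_repo_line</code> is a hypothetical helper (it is not part of apt) that appends the repository line only when it is not already in the file, so running it twice does not create a duplicate entry.</p>
<pre><code>#!/bin/sh
# Hypothetical helper: append a repository line to a sources file only once.
# Usage: add_repo_line FILE LINE
add_repo_line() {
  file="$1"
  line="$2"
  # -q quiet, -s no error for a missing file, -x whole-line match, -F fixed string
  if grep -qsxF -- "$line" "$file"; then
    echo "already present"
  else
    # tee -a appends the line to the file (and echoes it to the terminal)
    printf '%s\n' "$line" | tee -a "$file"
  fi
}

# Example, run from a root shell after taking the backup shown earlier:
# add_repo_line /etc/apt/sources.list "deb http://www.rabbitmq.com/debian/ testing main"
</code></pre>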
<h3>Install the RabbitMQ Server</h3>
<h4>Install via the command line</h4>
<pre class="sourcecode">sudo apt-get install rabbitmq-server</pre>
<p>This should have installed the main portion of the code base in <code>/usr/lib/erlang/lib/rabbitmq_server-1.5.1</code> (the trailing version number may differ from 1.5.1).</p>
<h4>After the server is installed, make sure it is stopped</h4>
<p><code># /etc/init.d/rabbitmq-server stop</code></p>
<h2>Install rabbitmq-stomp</h2>
<p>I could not find any Ubuntu/Debian packages, so I installed it from the Mercurial repository. If you don&#8217;t already have Mercurial (the hg command), you can install it with the following command:</p>
<pre>apt-get install mercurial</pre>
<h3>Install the rabbitmq-stomp code</h3>
<p>The rabbitmq-stomp code goes in the same directory where the Ubuntu package put the main RabbitMQ server code and the rabbit-codegen, so it sits alongside them.</p>
<pre><code>cd /usr/lib/erlang/lib/
hg clone http://hg.rabbitmq.com/rabbitmq-stomp/
</code></pre>
<h3>Compile the stomp code</h3>
<h4>Build and test run rabbitmq and stomp via make</h4>
<pre><code>cd /usr/lib/erlang/lib/rabbitmq-stomp
make RABBIT_SERVER_SOURCE_ROOT=../rabbitmq_server-1.5.1 all
</code></pre>
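<p>This should produce output like the following (the include path shows whichever server version you passed to make; it was 1.5.0 when this output was captured):</p>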
<pre><code>mkdir -p ebin
erlc -I ../rabbitmq_server-1.5.0/include -I include -o ebin -Wall +debug_info  src/rabbit_stomp.erl
erlc -I ../rabbitmq_server-1.5.0/include -I include -o ebin -Wall +debug_info  src/stomp_frame.erl </code></pre>
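<p>Since the installed version number may differ, the directory name can be looked up instead of typed by hand. This is only a sketch: <code>find_rabbitmq_server_dir</code> is a hypothetical helper, and it simply takes the last match in the shell's sorted glob order, which is not a true version comparison if several versions are installed.</p>
<pre><code>#!/bin/sh
# Hypothetical helper: print the name of the rabbitmq_server-* directory
# found under the given Erlang lib directory. Returns 1 if there is none.
find_rabbitmq_server_dir() {
  lib_dir="$1"
  found=""
  # The glob expands in sorted order; remember the last real directory
  for dir in "$lib_dir"/rabbitmq_server-*; do
    if [ -d "$dir" ]; then
      found="$dir"
    fi
  done
  if [ -z "$found" ]; then
    return 1
  fi
  basename "$found"
}

# Example, run from /usr/lib/erlang/lib/rabbitmq-stomp:
# server_dir=$(find_rabbitmq_server_dir /usr/lib/erlang/lib)
# make RABBIT_SERVER_SOURCE_ROOT="../$server_dir" all
</code></pre>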
<h3>Add a file /etc/default/rabbitmq and restart rabbitmq-server</h3>
<p>You need to tell the main RabbitMQ server to load and run the rabbitmq-stomp code when it starts up. You do that by creating the file /etc/default/rabbitmq with the following content:</p>
<pre>SERVER_START_ARGS='
  -pa /usr/lib/erlang/lib/rabbitmq-stomp/ebin
  -rabbit
     stomp_listeners [{"0.0.0.0",61613}]
     extra_startup_steps [{"STOMP-listeners",rabbit_stomp,kickstart,[]}]'</pre>
<h4>Restart the RabbitMQ server:</h4>
<pre>/etc/init.d/rabbitmq-server start</pre>
<p>You can then run</p>
<pre>ps -ax | grep stomp</pre>
<p>and you should see an Erlang process that is running rabbitmq-stomp.</p>
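<p>Besides looking at the process list, you can check that something is actually accepting connections on the STOMP port configured above (61613). A minimal sketch, assuming a netcat (<code>nc</code>) that supports the <code>-z</code> and <code>-w</code> options; <code>stomp_port_open</code> is a hypothetical helper name:</p>
<pre><code>#!/bin/sh
# Hypothetical helper: succeed if something accepts TCP connections on HOST:PORT.
# Assumes nc supports -z (connect without sending data) and -w (timeout in seconds).
stomp_port_open() {
  host="$1"
  port="$2"
  nc -z -w 2 "$host" "$port"
}

# Example:
# if stomp_port_open localhost 61613; then echo "STOMP listener is up"; fi
</code></pre>
<p>This only shows that the port is open; it does not confirm that the listener speaks STOMP, which the Ruby clients below will exercise.</p>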
<h3>Install ruby stomp client code and test</h3>
<h4>Install the ruby stomp gems</h4>
<p>If you don&#8217;t have ruby already installed:</p>
<pre><code>sudo apt-get install ruby
sudo apt-get install rubygems
</code></pre>
<p>Then install the Ruby stomp gem:</p>
<pre><code>sudo gem install stomp
</code></pre>
<h4>Run the ruby receiver client in one window</h4>
<p><code>ruby /usr/lib/erlang/lib/rabbitmq-stomp/examples/ruby/cb-receiver.rb</code></p>
<h4>In another window run the ruby sender client</h4>
<p><code>ruby /usr/lib/erlang/lib/rabbitmq-stomp/examples/ruby/cb-sender.rb</code></p>
<h4>In the receiver window you should see 10,000 test message lines:</h4>
<pre><code>...
Test Message number 9998
Test Message number 9999
All Done!</code></pre>
<p><span style="font-family: -webkit-monospace;"><strong>That&#8217;s it! Now you can use Stomp</strong></span></p>
<p><span style="font-family: -webkit-monospace;"><strong>(See later post <a title="Permanent Link to Updating RabbitMQ and RabbitMQ-Stomp to RabbitMQ 1.5.3" href="http://blog2.ibd.com/scalable-deployment/updating-rabbitmq-and-rabbitmq-stomp-to-rabbitmq-153/" rel="bookmark">Updating RabbitMQ and RabbitMQ-Stomp to RabbitMQ 1.5.3</a>)</strong></span></p><p>The post <a href="https://www.ibd.com/runa/deploying-rabbitmq-and-stomp-on-ubuntu/">Deploying RabbitMQ and Stomp on Ubuntu</a> first appeared on <a href="https://www.ibd.com">Cognizant Transmutation</a>.</p>]]></content:encoded>
					
					<wfw:commentRss>https://www.ibd.com/runa/deploying-rabbitmq-and-stomp-on-ubuntu/feed/</wfw:commentRss>
			<slash:comments>6</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">89</post-id>	</item>
	</channel>
</rss>
