Monday, March 14, 2016

Random notes: Building a Hadoop test cluster

WARNING: UNDER CONSTRUCTION

We want to install a Hadoop cluster on either Vagrant or a set of remote Linux servers, copy ssh keys to the machines to bootstrap Ansible on the servers, install Ansible modules, and use an Ambari Blueprint. This is a parallel bookkeeping problem: the different config files all need the same info, namely the host names and IP addresses.
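One way to keep that bookkeeping in one place is to hold the host list in a single Ruby structure and render the other files from it. This is only a rough sketch: the helper names, the `hadoop` group name, and the choice of output formats are my assumptions, and the `ansible_host` variable assumes Ansible 2.0 or later. The names and IPs are a subset of the ones used in the Vagrantfile in these notes.

```ruby
# Single source of truth for the cluster hosts (a subset of the boxes used
# in these notes; the helper names below are hypothetical).
HOSTS = [
  { name: 'hmaster', ip: '192.168.33.10' },
  { name: 'hslave0', ip: '192.168.33.11' },
  { name: 'hslave1', ip: '192.168.33.12' }
]

# Render /etc/hosts style lines: "<ip> <fqdn> <short name>".
def hosts_file_lines(hosts, domain = 'vagrant')
  hosts.map { |h| "#{h[:ip]} #{h[:name]}.#{domain} #{h[:name]}" }
end

# Render a minimal INI-style Ansible inventory with a single group.
# The group name is an assumption, not something the notes prescribe.
def ansible_inventory(hosts, group = 'hadoop')
  lines = ["[#{group}]"]
  hosts.each { |h| lines << "#{h[:name]} ansible_host=#{h[:ip]}" }
  lines.join("\n") + "\n"
end

puts hosts_file_lines(HOSTS)
puts ansible_inventory(HOSTS)
```

The same list could also feed the Vagrant box definitions, so a new slave only has to be added once.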

https://www.vagrantup.com/docs/multi-machine/index.html

https://gist.github.com/ryanwi/9923791

https://sukhjinderkainth.wordpress.com/2015/08/19/create-a-multi-machine-vagrant-config-file/


# -*- mode: ruby -*-
# vi: set ft=ruby :
# inspired from https://gist.github.com/dlutzy/2469037
# inspired from https://gist.github.com/ryanwi/9923791

boxes = [
  { :name => :hmaster, ip: '192.168.33.10', ssh_port: 2223 },
  { :name => :hslave0, ip: '192.168.33.11', ssh_port: 2224 },
  { :name => :hslave1, ip: '192.168.33.12', ssh_port: 2225 },
  { :name => :hslave2, ip: '192.168.33.13', ssh_port: 2226 },
  { :name => :hslave3, ip: '192.168.33.14', ssh_port: 2227 },
  { :name => :hslave4, ip: '192.168.33.15', ssh_port: 2228 },
]

CUSTOM_CONFIG = {
                  "BOX_NAME"  =>  "ubuntu/trusty64",
                  "BOX_URL"   =>  "https://vagrantcloud.com/ubuntu/boxes/trusty64",
                  "HEADLESS"  =>  false
                }

Vagrant.configure("2") do |config|

  # GUI toggle: the 'HEADLESS' value is passed straight to vb.gui,
  # so false runs the VMs without a window and true shows the VM's window
  config.vm.provider :virtualbox do |vb|
    vb.gui = CUSTOM_CONFIG['HEADLESS']
  end

  # Disable default ssh in order to manually assign to a known value
  # https://github.com/mitchellh/vagrant/issues/3232
  config.vm.network :forwarded_port, guest: 22, host: 2222, id: "ssh", disabled: true

  boxes.each do |opts|
    config.vm.define opts[:name] do |boxconfig|
      boxconfig.vm.box        = CUSTOM_CONFIG['BOX_NAME']
      boxconfig.vm.box_url    = CUSTOM_CONFIG['BOX_URL']
      boxconfig.vm.hostname   = "%s.vagrant" % opts[:name].to_s
      boxconfig.vm.network      "private_network", ip: opts[:ip]
      boxconfig.vm.network    :forwarded_port, guest: 22, host: opts[:ssh_port]
    end
  end

  # provisioning with ansible
  # config.vm.provision :ansible do |ansible|
  #   ansible.playbook = "./provisioning/site.yml"
  # end

end

$ vagrant status
Current machine states:

hmaster                   running (virtualbox)
hslave0                   running (virtualbox)
hslave1                   running (virtualbox)
hslave2                   running (virtualbox)
hslave3                   running (virtualbox)
hslave4                   running (virtualbox)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.


Now let's add an include file with the server vars to the above Vagrantfile:
http://stackoverflow.com/questions/16708917/how-do-i-include-variables-in-my-vagrantfile
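A minimal sketch of what such an include could look like, assuming the server vars live in a YAML file next to the Vagrantfile. The file name `servers.yml`, its layout, and the helper names are my own assumptions; the demo writes a temporary file so it can run on its own.

```ruby
require 'yaml'
require 'tempfile'

# YAML gives us string keys; the Vagrantfile above indexes with symbols.
def symbolize(hash)
  Hash[hash.map { |k, v| [k.to_sym, v] }]
end

# Load the box list from a YAML file (file layout is an assumption).
def load_boxes(path)
  YAML.load_file(path).map { |entry| symbolize(entry) }
end

# Demo: write a small sample file, then load it the way a Vagrantfile could.
sample = <<~EOS
  - name: hmaster
    ip: 192.168.33.10
    ssh_port: 2223
  - name: hslave0
    ip: 192.168.33.11
    ssh_port: 2224
EOS

file = Tempfile.new(['servers', '.yml'])
file.write(sample)
file.close

DEMO_BOXES = load_boxes(file.path)
puts DEMO_BOXES.inspect
```

In a Vagrantfile the hard-coded list would then become something like `boxes = load_boxes(File.join(File.dirname(__FILE__), 'servers.yml'))`. Names come back as strings rather than symbols, which the `opts[:name].to_s` call above tolerates.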

Ambari Ansible installation.
http://dataminded.be/blog/how-install-hortonworks-hdp-22-ubuntu-1204

Hortonworks blueprints
https://medium.com/@b23llc/managing-distributed-data-products-with-ansible-and-ambari-44c7175555d8#.ytz5ceqsn

https://cwiki.apache.org/confluence/display/AMBARI/Blueprints#Blueprints-Step0:PrepareAmbariServerandAgents

Blueprint Details
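Manually Prepare Ambari Server and Agents
(The commands below are copied from the Ambari wiki and assume a yum-based distro; the ubuntu/trusty64 boxes above would need the apt-get equivalents.)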
  1. Perform your Ambari Server install and setup.
    yum install ambari-server
    ambari-server setup
  2. After setup completes, start your Ambari Server.
    ambari-server start
  3. Install Ambari Agents on all of the hosts you plan to include in your cluster.
    yum install ambari-agent
  4. Set the Ambari Server on the Ambari Agents.
    vi /etc/ambari-agent/conf/ambari-agent.ini
  5. Set hostname= to the Fully Qualified Domain Name for the Ambari Server. Save and exit.
    hostname=c6401.ambari.apache.org
  6. Start the Agents to initiate registration to Server.
    ambari-agent start
  7. Confirm the Agent hosts are registered with the Server.
    http://your.ambari.server:8080/api/v1/hosts
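Step 7 can be scripted instead of eyeballed in a browser. The sketch below compares the hosts we expect against the ones the API reports. It assumes the response has the shape `{"items": [{"Hosts": {"host_name": ...}}]}` and that the server still uses the default admin/admin login; both are assumptions to verify against your Ambari version. The sample JSON is made up for illustration, not captured from a real server, and the fetch function is defined but not called.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Pull host names out of a /api/v1/hosts response body.
# Assumes the layout {"items": [{"Hosts": {"host_name": "..."}}]}.
def registered_hosts(json_body)
  JSON.parse(json_body).fetch('items', []).map { |item| item['Hosts']['host_name'] }
end

# Return the expected hosts that have not registered yet.
def missing_hosts(expected, json_body)
  expected - registered_hosts(json_body)
end

# Fetch the host list from a live server (defined here, not called).
# The admin/admin credentials are an assumption (the Ambari default).
def fetch_hosts_json(base_url, user = 'admin', password = 'admin')
  uri = URI.join(base_url, '/api/v1/hosts')
  request = Net::HTTP::Get.new(uri)
  request.basic_auth(user, password)
  response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
  response.body
end

# Illustrative response body, not real server output.
SAMPLE_RESPONSE = '{"items":[{"Hosts":{"host_name":"hmaster.vagrant"}},' \
                  '{"Hosts":{"host_name":"hslave0.vagrant"}}]}'

expected = %w[hmaster.vagrant hslave0.vagrant hslave1.vagrant]
puts missing_hosts(expected, SAMPLE_RESPONSE).inspect
```

Against a running server the last line would become `missing_hosts(expected, fetch_hosts_json('http://your.ambari.server:8080'))`.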

Generate the ssh key for the Hadoop servers
ssh-keygen

Copy the id_rsa.pub key to the Vagrant folder
cp id_rsa.pub /Vagrant

Script (run on each guest, where the project folder is mounted at /vagrant by default)
cat /vagrant/id_rsa.pub >> ~/.ssh/authorized_keys
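A plain `cat >>` appends the key again every time the machine is re-provisioned. Here is a rough Ruby sketch of an idempotent version. The function name is hypothetical, the key in the demo is a placeholder string rather than a real key, and the demo runs in a temporary directory instead of touching a real `~/.ssh`.

```ruby
require 'fileutils'
require 'tmpdir'

# Append a public key to authorized_keys unless that exact line is present.
# Returns true when the key was added, false when it was already there.
def append_key_once(pubkey_path, auth_keys_path)
  key = File.read(pubkey_path).strip
  FileUtils.mkdir_p(File.dirname(auth_keys_path), mode: 0o700)

  content = File.exist?(auth_keys_path) ? File.read(auth_keys_path) : ''
  return false if content.lines.map(&:strip).include?(key)

  # Start on a fresh line if the existing file lacks a trailing newline.
  prefix = content.empty? || content.end_with?("\n") ? '' : "\n"
  File.open(auth_keys_path, 'a', 0o600) { |f| f.write(prefix + key + "\n") }
  true
end

# Demo in a throwaway directory with a placeholder key.
Dir.mktmpdir do |dir|
  pub  = File.join(dir, 'id_rsa.pub')
  auth = File.join(dir, '.ssh', 'authorized_keys')
  File.write(pub, "ssh-rsa PLACEHOLDERKEY user@host\n")
  puts append_key_once(pub, auth)  # first call adds the key
  puts append_key_once(pub, auth)  # second call finds it and skips
end
```

On a guest this would be called with `/vagrant/id_rsa.pub` and `File.expand_path('~/.ssh/authorized_keys')`.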

Copy the ssh id_rsa.pub key to the Vagrant machines
  ssh root@MachineB 'bash -s' < local_script.…