@isseium
Last active August 29, 2015 14:06
Better late than never: building an RStudio Server + Mahout analysis environment with Vagrant. ref: http://qiita.com/isseium/items/d02767bb68210b2bca75
#
# Cookbook Name:: mahout
# Recipe:: default
#
# Copyright 2014, YOUR_COMPANY_NAME
#
# All rights reserved - Do Not Redistribute
#
# TODO: move these settings to attributes in the future
install_version = "0.9"
download_filepath = "/tmp/mahout.tar.gz"
install_path = "/usr/local"
install_sym_path = "/usr/local/bin/mahout"
bin_path = "#{install_path}/mahout-distribution-#{install_version}/bin/mahout"
# Install the required packages
%w{java-1.6.0-openjdk java-1.6.0-openjdk-src java-1.6.0-openjdk-devel}.each do |pkg|
  package pkg do
    action :install
  end
end
# Download and install Mahout
bash "install_mahout" do
  code <<-EOL
    wget ftp://ftp.riken.jp/net/apache/mahout/#{install_version}/mahout-distribution-#{install_version}.tar.gz -O #{download_filepath}
    tar xvzf #{download_filepath} -C #{install_path}
    rm -f #{install_sym_path}
    ln -s #{bin_path} #{install_sym_path}
    rm #{download_filepath}
  EOL
end
# Create .bashrc
template "/home/vagrant/.bashrc" do
  source "dot.bashrc.erb"
  variables :partials => {
              "mahout.bashrc.erb" => "hack the planet",
            },
            :top_level => "I'm a variable from the template resource"
end
# .bashrc
# Source global definitions
if [ -f /etc/bashrc ]; then
. /etc/bashrc
fi
# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=
# User specific aliases and functions
# Mahout
<%= render "mahout.bashrc.erb" %>
$ vagrant --version
$ vagrant box add centos70 http://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_centos-7.0_chef-provisionerless.box
(output omitted)
# Check the installed boxes
$ vagrant box list
centos70 (virtualbox, 0) # with VMware, this shows vmware_desktop instead
$ knife cookbook create mahout -o site-cookbooks/
$ berks vendor
$ cd ..
$ vagrant reload --provision
$ vagrant ssh
$ mahout
hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
An example program must be given as the first argument.
Valid program names are:
arff.vector: : Generate Vectors from an ARFF file or directory
baumwelch: : Baum-Welch algorithm for unsupervised HMM training
canopy: : Canopy clustering
cat: : Print a file or resource as the logistic regression models would see it
cleansvd: : Cleanup and verification of SVD output
clusterdump: : Dump cluster output to text
clusterpp: : Groups Clustering Output In Clusters
cmdump: : Dump confusion matrix in HTML or text formats
concatmatrices: : Concatenates 2 matrices of same cardinality into a single matrix
cvb: : LDA via Collapsed Variation Bayes (0th deriv. approx)
(rest omitted)
$ mahout recommenditembased --input mydata.dat --numRecommendations 2 --output output/ --similarityClassname SIMILARITY_PEARSON_CORRELATION
(output omitted)
$ cat output/part-r-00000
1 [104:3.9258494]
3 [102:3.2698717]
4 [102:4.7433763]
$ rm -rf output temp
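Each line of `output/part-r-00000` above pairs a user ID with a bracketed list of `itemID:score` recommendations. As a minimal sketch (not part of the original gist), the Ruby snippet below parses one such line; the method name `parse_recommendations` is hypothetical, and it splits on any whitespace so it accepts either a space or a tab between the two fields.

```ruby
# Hypothetical helper: parse one recommender output line such as
#   "1 [104:3.9258494]"
# into [user_id, [[item_id, score], ...]].
def parse_recommendations(line)
  # Split the user ID from the bracketed recommendation list.
  user, items = line.strip.split(/\s+/, 2)
  # Drop the brackets, then split "item:score" pairs on commas.
  pairs = items.delete("[]").split(",").map do |pair|
    item, score = pair.split(":")
    [Integer(item), Float(score)]
  end
  [Integer(user), pairs]
end

# Sample lines copied from the output shown above.
["1 [104:3.9258494]", "3 [102:3.2698717]", "4 [102:4.7433763]"].each do |line|
  user, recs = parse_recommendations(line)
  recs.each { |item, score| puts "user #{user} -> item #{item} (#{score})" }
end
```

Because `--numRecommendations 2` was requested, a line may carry up to two comma-separated pairs; the sketch handles that case as well.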
$ mkdir -p ~/vagrant/cheekitrip_development
$ cd ~/vagrant/cheekitrip_development
$ vagrant init centos70
(output omitted)
$ vagrant up
(output omitted)
$ vagrant ssh
# Check what will be appended
$ vagrant ssh-config --host mahout # the host name is set to mahout
Host mahout
HostName 127.0.0.1
User vagrant
Port 2222
UserKnownHostsFile /dev/null
StrictHostKeyChecking no
PasswordAuthentication no
IdentityFile /Users/issei/.vagrant.d/insecure_private_key
IdentitiesOnly yes
LogLevel FATAL
# Append to ssh_config
$ vagrant ssh-config --host mahout >> ~/.ssh/config
# Log in via the ssh command
$ ssh mahout
$ vagrant plugin install vagrant-omnibus
# Add the following around line 90
config.omnibus.chef_version = :latest
# Add the following around line 24
config.vm.network "forwarded_port", guest: 8787, host: 8787
$ knife solo init MyAnalysis
$ cd MyAnalysis
source "https://supermarket.getchef.com"
cookbook 'yum', git: "git@github.com:opscode-cookbooks/yum.git"
cookbook 'rstudio', git: "git@github.com:takemikami/chef-rstudio.git"
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk.x86_64/
1,101,5.0
1,102,3.0
1,103,2.5
2,101,2.0
2,102,2.5
2,103,5.0
2,104,2.0
3,101,2.5
3,104,4.0
3,105,4.5
3,107,5.0
4,101,5.0
4,103,3.0
4,104,4.5
4,106,4.0
5,101,4.0
5,102,3.0
5,103,2.0
5,104,4.0
5,105,3.5
5,106,4.0
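The rows above are `userID,itemID,preference` triples, the CSV layout that Mahout's `recommenditembased` job reads (presumably this is the `mydata.dat` used earlier). As a rough sketch (not part of the original gist), the following Ruby snippet groups such triples per user, which can help sanity-check the file before handing it to Mahout. The method name `load_preferences` and the inline `SAMPLE` constant are hypothetical.

```ruby
# Hypothetical helper: parse "userID,itemID,preference" lines into
# a nested hash of { user_id => { item_id => preference } }.
def load_preferences(text)
  prefs = Hash.new { |hash, key| hash[key] = {} }
  text.each_line do |line|
    line = line.strip
    next if line.empty? # skip blank lines
    user, item, value = line.split(",")
    prefs[Integer(user)][Integer(item)] = Float(value)
  end
  prefs
end

# Small inline sample copied from the first rows of the data above.
SAMPLE = <<~DATA
  1,101,5.0
  1,102,3.0
  1,103,2.5
  2,101,2.0
DATA

prefs = load_preferences(SAMPLE)
prefs.each do |user, items|
  puts "user #{user}: #{items.size} rated items"
end
```

To check a real file, the same method could be called with `File.read("mydata.dat")`, assuming the file sits in the current directory.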