First, stop Apache, nginx, and php-fpm if they are running.
List all the installed PHP 5.4 packages:
> yum list installed | grep php54
php54.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-bcmath.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-cli.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-common.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-devel.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-fpm.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-gd.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-intl.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-mbstring.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-mcrypt.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-mysqlnd.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-pdo.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-pecl-apc.x86_64 3.1.13-1.12.amzn1 @amzn-updates
php54-pecl-igbinary.x86_64 1.1.2-0.2.git3b8ab7e.6.amzn1 @amzn-updates
php54-pecl-memcache.x86_64 3.0.7-3.10.amzn1 @amzn-updates
php54-pecl-memcached.x86_64 2.1.0-1.5.amzn1 @amzn-updates
php54-pecl-xdebug.x86_64 2.2.1-1.6.amzn1 @amzn-updates
php54-process.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-soap.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-xml.x86_64 5.4.21-1.46.amzn1 @amzn-updates
php54-xmlrpc.x86_64 5.4.21-1.46.amzn1 @amzn-updates
Remove all of them:
yum remove php54.x86_64 php54-bcmath.x86_64 php54-cli.x86_64 php54-common.x86_64 php54-devel.x86_64 php54-fpm.x86_64 php54-gd.x86_64 php54-intl.x86_64 php54-mbstring.x86_64 php54-mcrypt.x86_64 php54-mysqlnd.x86_64 php54-pdo.x86_64 php54-pecl-apc.x86_64 php54-pecl-igbinary.x86_64 php54-pecl-memcache.x86_64 php54-pecl-memcached.x86_64 php54-pecl-xdebug.x86_64 php54-process.x86_64 php54-soap.x86_64 php54-xml.x86_64 php54-xmlrpc.x86_64
Install the PHP 5.5 equivalents:
yum install php55.x86_64 php55-bcmath.x86_64 php55-cli.x86_64 php55-common.x86_64 php55-devel.x86_64 php55-fpm.x86_64 php55-gd.x86_64 php55-intl.x86_64 php55-mbstring.x86_64 php55-mcrypt.x86_64 php55-mysqlnd.x86_64 php55-pdo.x86_64 php55-pecl-apc.x86_64 php55-pecl-igbinary.x86_64 php55-pecl-memcache.x86_64 php55-pecl-memcached.x86_64 php55-pecl-xdebug.x86_64 php55-process.x86_64 php55-soap.x86_64 php55-xml.x86_64 php55-xmlrpc.x86_64
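As a hedged alternative to typing every package name out, you could derive the PHP 5.5 list from the installed 5.4 one. The sample `yum list installed` output is inlined below so the pipeline runs anywhere; on the real machine you would feed it `yum list installed | grep '^php54'` instead.

```shell
# Sketch: turn installed php54 package names into their php55 equivalents.
# The printf lines stand in for real `yum list installed | grep '^php54'` output.
printf '%s\n' \
  'php54.x86_64 5.4.21-1.46.amzn1 @amzn-updates' \
  'php54-cli.x86_64 5.4.21-1.46.amzn1 @amzn-updates' \
  'php54-mbstring.x86_64 5.4.21-1.46.amzn1 @amzn-updates' |
  awk '{print $1}' |         # keep only the package-name column
  sed 's/^php54/php55/' |    # swap the version prefix
  tr '\n' ' '                # single line, ready to paste after `yum install`
```

The resulting line can be pasted straight after `yum install`.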
You may need to tweak the php-fpm settings
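If your php-fpm pool was tuned for 5.4, the same directives carry over to 5.5. As a hypothetical starting point for /etc/php-fpm.d/www.conf (the values below are illustrative, not from this post):

```ini
; Illustrative pm settings -- tune pm.max_children to your available RAM
pm = dynamic
pm.max_children = 20
pm.start_servers = 5
pm.min_spare_servers = 5
pm.max_spare_servers = 10
```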
Thursday, July 2, 2015
Friday, November 15, 2013
Munin not generating graphs - Make sure CRON job is running
I am currently using Ubuntu 12.04 on EC2.
If your munin master is not running, you should check if munin is set up as a CRON job.
List all the scheduled cron jobs:
crontab -l
If munin-cron is not set up, we will add it. Edit the crontab file:
crontab -e
Let's make the munin master run every 5 minutes. Append the following to the end of the file:
*/5 * * * * /usr/bin/munin-cron
Now let's make munin run:
sudo -u munin munin-cron
Wednesday, October 9, 2013
Elastic Search on EC2 - Install ES cluster on Amazon Linux AMI
We will install ElasticSearch (ES) on an EC2 instance.
Here are the specs:
- Amazon Linux AMI 2013.09
- Medium instance
- 64-bit machine
- Elastic Search 0.90.5
- Spring MVC
- Maven
Begin by launching an instance. If you use a micro instance, you may get an out-of-memory error in /var/log/syslog. If you are not sure how to launch an instance, read Amazon EC2 - Launching Ubuntu Server 12.04.1 LTS step by step guide.
For the security group, you will need to open the following ports:
- 22 (SSH)
- 9300 (ElasticSearch Transport)
- 9200 (HTTP Testing)
Attach Two EBS drives
We will be using one for saving data and one for logging. Create and attach two EBS drives in the AWS console.
You will have two volumes: /dev/xvdf and /dev/xvdg. Let's format them using XFS.
yum -y install xfsprogs xfsdump
Make the data drive /vol and the log drive /vol1.
sudo mkfs.xfs /dev/xvdf
sudo mkfs.xfs /dev/xvdg
vi /etc/fstab
Append the following:
/dev/xvdf /vol xfs noatime 0 0
/dev/xvdg /vol1 xfs noatime 0 0
Mount the drives. Read Amazon EC2 - Mounting a EBS drive for more information.
mkdir /vol
mkdir /vol1
mount /vol
mount /vol1
ssh into the instance
ssh -i {key} ec2-user@{ec2_public_address}
Update the machine
sudo yum -y update
Install Oracle Sun Java
In order to run ES efficiently, the JVM must be able to allocate a large virtual address space and perform garbage collection on large heaps without long pauses. There are also some stories online suggesting OpenJDK is not as good as Oracle Java for ES. Feel free to let me know in the comments below if this is not the case.
Download Java 7 from Oracle.
Put it in /usr/lib/jvm.
Extract and install it
tar -zxvf jdk-7u40-linux-x64.gz
Rename the folder from jdk1.7.0_40 to jdk1.7.0.
You should now have jdk1.7.0 inside /usr/lib/jvm
Set java, javac.
sudo /usr/sbin/alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/jdk1.7.0/bin/java" 1
sudo /usr/sbin/alternatives --install "/usr/bin/javac" "javac" "/usr/lib/jvm/jdk1.7.0/bin/javac" 1
Correct the permissions.
sudo chmod a+x /usr/bin/java
sudo chmod a+x /usr/bin/javac
sudo chown -R root:root /usr/lib/jvm/jdk1.7.0
Switch to Sun Java by:
sudo /usr/sbin/alternatives --config java
Check your java version.
java -version
Download and install ElasticSearch
Download ElasticSearch (Current version as of this writing is 0.90.5).
sudo su
mkdir /opt/tools
cd /opt/tools
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.5.zip
unzip elasticsearch-0.90.5.zip
Install ElasticSearch Cloud AWS plugin.
cd elasticsearch-0.90.5
bin/plugin -install elasticsearch/elasticsearch-cloud-aws/1.15.0
Configuring ES
AWS can shut down your instances at any time. If you are storing indexed data in ephemeral drives, you will lose all the data when all the instances are shut down.
There were two ways to persist data:
- Store data in EBS via local gateway
- Store data in S3 via S3 gateway
A restart of the nodes would begin to recover data from the gateway. The EBS route is better for performance, while the S3 route is better for persistence [S3 is deprecated].
We will be setting up an ES cluster and use a local gateway. The S3 gateway is deprecated at the time of this writing. The ES team has promised a new backup mechanism in the future.
vi /opt/tools/elasticsearch-0.90.5/config/elasticsearch.yml
cluster.name: mycluster
cloud:
  aws:
    access_key:
    secret_key:
    region: us-east-1
discovery:
  type: ec2
We have specified a cluster called "mycluster" above. You will need to input your AWS access keys so the EC2 discovery can find the other nodes.
We also need to ensure the JVM does not swap by doing two things:
1) Locking the memory (find this setting inside elasticsearch.yml)
bootstrap.mlockall: true
2) Set ES_MIN_MEM and ES_MAX_MEM to the same value. It is also recommended to set them to half of the system's available ram. We will set this in the ElasticSearch Service Wrapper later in the article.
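For the memory lock to take effect, the ES process also needs permission to lock that much memory. A hypothetical addition to /etc/security/limits.conf, assuming ES runs as root as in this walkthrough:

```
# Allow the ES process (running as root here) to mlock its whole heap
root soft memlock unlimited
root hard memlock unlimited
```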
Create the data and log paths.
mkdir -p /vol/elasticsearch/data
mkdir -p /vol1/elasticsearch/logs
Set the data and log paths in config/elasticsearch.yml:
path.data: /vol/elasticsearch/data
path.logs: /vol1/elasticsearch/logs
Let's edit config/logging.yml
vi /opt/tools/elasticsearch-0.90.5/config/logging.yml
Edit these settings and make sure these lines are uncommented and present
logger:
gateway: DEBUG
org.apache: WARN
discovery: TRACE
Testing the cluster
bin/elasticsearch -f
Browse to the ec2 address at port 9200:
http://ec2-XX-XXX-XXX-XXX.compute-1.amazonaws.com:9200/
You should see the following:
{
  "ok" : true,
  "status" : 200,
  "name" : "Storm",
  "version" : {
    "number" : "0.90.5",
    "build_hash" : "c8714e8e0620b62638f660f6144831792b9dedee",
    "build_timestamp" : "2013-09-17T12:50:20Z",
    "build_snapshot" : false,
    "lucene_version" : "4.4"
  },
  "tagline" : "You Know, for Search"
}
Installing ElasticSearch as a Service
We will be using the ElasticSearch Java Service Wrapper.
Download the service wrapper and move it to bin/service.
curl -L -k http://github.com/elasticsearch/elasticsearch-servicewrapper/tarball/master | tar -xz
mv *servicewrapper*/service /opt/tools/elasticsearch-0.90.5/bin
Make ElasticSearch start automatically when the system reboots:
bin/service/elasticsearch install
Make the ElasticSearch service a default command (we will call it es_service):
ln -s /opt/tools/elasticsearch-0.90.5/bin/service/elasticsearch /usr/bin/es_service
Start the service:
es_service start
You should see:
Starting ElasticSearch...
Waiting for ElasticSearch......
running: PID:2503
Tweaking the memory settings
There are three settings you want to care about:
- ES_HEAP_SIZE
- ES_MIN_MEM
- ES_MAX_MEM
It is recommended to set ES_MIN_MEM to be the same as ES_MAX_MEM. However, you can just set ES_HEAP_SIZE as it will be assigned to both ES_MIN_MEM and ES_MAX_MEM.
We will be tweaking these settings in the service wrapper's elasticsearch.conf instead of elasticsearch's.
vi /opt/tools/elasticsearch-0.90.5/bin/service/elasticsearch.conf
set.default.ES_HEAP_SIZE=1024
There are a few things you need to beware of:
- You need to leave some memory to the OS for non-ElasticSearch operations. Try leaving it at least half of the available memory.
- As a reference, use 1024 MB for every 1 million documents you are saving.
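As a back-of-the-envelope sketch of the two rules of thumb above (leave half the RAM to the OS, roughly 1024 MB of heap per million documents), with made-up example inputs:

```shell
# Made-up inputs: 3 million documents on an 8 GB machine
docs_in_millions=3
total_ram_mb=8192

heap_mb=$((docs_in_millions * 1024))   # ~1024 MB per 1M documents
half_ram_mb=$((total_ram_mb / 2))      # leave at least half the RAM to the OS
if [ "$heap_mb" -gt "$half_ram_mb" ]; then
  heap_mb=$half_ram_mb                 # cap the heap at half of RAM
fi
echo "set.default.ES_HEAP_SIZE=$heap_mb"   # prints 3072 for these inputs
```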
Restart the service.
Ubuntu EC2 - Install Sun Oracle Java
Download Java 7 from Oracle.
Put it in /usr/lib/jvm.
Extract and install it
tar -zxvf jdk-7u40-linux-x64.gz
Rename the folder from jdk1.7.0_40 to jdk1.7.0.
You should now have jdk1.7.0 inside /usr/lib/jvm
Set java, javac.
sudo update-alternatives --install "/usr/bin/java" "java" "/usr/lib/jvm/jdk1.7.0/bin/java" 1
sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/lib/jvm/jdk1.7.0/bin/javac" 1
Correct the permissions.
sudo chmod a+x /usr/bin/java
sudo chmod a+x /usr/bin/javac
sudo chown -R root:root /usr/lib/jvm/jdk1.7.0
If you have more than one version of java, you can always switch them using
sudo update-alternatives --config java
Check your java version.
java -version
Monday, September 30, 2013
ElasticSearch Query - how to insert and retrieve search data
ElasticSearch uses HTTP methods (e.g. GET, POST, PUT, DELETE) to retrieve, save, and delete search data from its index.
For simplicity, we will use curl to demonstrate some usages. If you haven't done so already, start ElasticSearch in your terminal.
Adding a document
We will send an HTTP POST request to add the subject "sports" to an index. The request will have the following form:
curl -XPOST "http://localhost:9200/{index}/{type}/{id}" -d '{"key0": "value0", ... , "keyX": "valueX"}'
Example:
curl -XPOST "http://localhost:9200/subjects/subject/1" -d '{"name": "sports", "creator": {"first_name":"John", "last_name":"Smith"}}'
Retrieving the document
We can get back the document by sending a GET request.
curl -X GET "http://localhost:9200/subjects/_search?q=sports"
We can also use a POST request to query the above.
curl -X POST "http://localhost:9200/subjects/_search" -d '{
"query": {"term":{"name":"sports"}}
}'
Both of the above will give you the following:
{"took":1,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":0.30685282,"hits":[{"_index":"subjects","_type":"subject","_id":"1","_score":0.30685282, "_source" : {"name": "sports"}}]}}
The _source field above holds the results for the query.
To search based on the nested properties (Ex. first_name, last_name), we can do the following:
curl -XGET "http://localhost:9200/subjects/_search?q=subject.creator.first_name:John"
curl -XGET "http://localhost:9200/subjects/subject/_search?q=creator.first_name:John"
curl -XGET "http://localhost:9200/subjects/subject/_search?q=subject.creator.first_name:John"
All the above queries will return the same results.
Deleting the document
Similarly, we can delete the subject index by a DELETE request.
curl -X DELETE "http://localhost:9200/subjects"
Creating Document with settings and mappings
If you want to adjust settings like number of shards and replicas, you may find the following useful. The more shards you have, the better the indexing performance. The more replicas you have, the better the searching performance.
curl -X PUT "http://localhost:9200/subjects" -d '{
  "settings": {"index": {"number_of_shards": 3, "number_of_replicas": 2}},
  "mappings": {"document": {
    "properties": {
      "name": {"type": "string", "analyzer": "full_text"}
    }
  }}
}'
The above created an index called subjects. Each document in the index has a property called name.
Checking the Mapping
curl -X GET "http://localhost:9200/subjects/_mapping?pretty=true"
You should see:
{
  "subjects" : { }
}
The pretty parameter above just formats the JSON result in a human readable format.
How to Install ElasticSearch on EC2
Search is not easy. There are a lot of things you need to consider.
At the software level:
Can a search query have spelling mistakes?
Should stop words (e.g. a, the) be filtered?
What about a phrase search given a non-exact phrase?
At the operations level:
Should the search be decoupled from the app machines?
Should the search be distributed? If so, how many shards and replicas should there be?
Doing a quick search would tell you that Apache Lucene is the industry standard. There are two popular abstractions on top of Lucene: Solr and ElasticSearch (ES).
There are a lot of debates on which one should be used. I chose ES because:
- it's distributed by design
- easier to integrate for AWS EC2
The following post will talk about how you can install ElasticSearch on your Linux machine (I like to use the Ubuntu 12.04 build on EC2).
Download elasticsearch from elasticsearch.org. Extract the files and put them into a folder of your choice (e.g. /opt/tools).
cd /opt/tools
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.5.zip
unzip elasticsearch-0.90.5.zip
You can start elasticsearch by:
bin/elasticsearch -f
You may want to tweak the Xmx (max heap size the JVM can reach) and Xms (initial heap size for the JVM) values.
bin/elasticsearch -f -Xmx2g -Xms2g -Des.index.storage.type=memory -Des.max-open-files=true
You can also run it as a service using the script located in bin/service.
After you started your service, visit "http://localhost:9200" in the browser. You should see the following:
{
"ok" : true,
"status" : 200,
"name" : "Solitaire",
"version" : {
"number" : "0.90.5",
"build_hash" : "c8714e8e0620b62638f660f6144831792b9dedee",
"build_timestamp" : "2013-09-17T12:50:20Z",
"build_snapshot" : false,
"lucene_version" : "4.4"
},
"tagline" : "You Know, for Search"
}
Wednesday, July 17, 2013
Ansible EC2 - setting up Nginx, MySQL, php, git
In this post, we will write a playbook that's going to set up an EC2 machine with a fully workable PHP environment.
Full Ansible Playbook source:
Starting from a fresh machine with an attached EBS volume, we will do the following:
- Format the new EBS volume with XFS and mount it as /vol
- Install PHP, MySQL and nginx
- Create a MySQL user and create a database
- Copy the public and private keys into the targeted machine
- Check out a project from GitHub
Begin by spinning up a fresh EC2 instance and attaching an EBS volume to it. Read Ansible - how to launch EC2 instances and setup the php environment.
Format the new EBS volume with XFS and mount it as /vol
We will format the new EBS volume /dev/xvdf with XFS and mount it as /vol.
- name: update machine with latest packages
action: command yum -y update
- name: install xfsprogs
action: yum pkg=xfsprogs state=latest
- name: format new volume
filesystem: fstype=xfs dev=/dev/xvdf
- name: edit fstab and mount the vol
action: mount name={{mount_dir}} src=/dev/xvdf opts=noatime fstype=xfs state=mounted
Install php, mysql and nginx
- name: install php
action: yum pkg=php state=latest
- name: install php-mysql
action: yum pkg=php-mysql state=latest
- name: install nginx
action: yum pkg=nginx state=latest
- name: ensure nginx is running
action: service name=nginx state=started
- name: install mysql server
action: yum pkg=mysql-server state=latest
- name: make sure mysql is running
action: service name=mysqld state=started
Create a mysql user and a database
- name: install python mysql
action: yum pkg=MySQL-python state=latest
- name: create database user
action: mysql_user user=admin password=1234qwer priv=*.*:ALL state=present
- name: create db
action: mysql_db db=ansible state=present
Copy the public and private keys into the targeted machine
We want the target machine to be able to do a git pull without username and password prompts.
mkdir ~/.ssh
ssh-keygen -t rsa -C "you@email.com"
You will see:
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Just press Enter on the above prompts.
Two files will be generated: id_rsa, id_rsa.pub
Log in to Github and then Go to Account Settings -> SSH Keys
Add new key by giving it a name and pasting the content of id_rsa.pub
Test it by:
ssh -T git@github.com
Here are the Ansible tasks:
- name: install git
action: yum pkg=git state=latest
- name: copy public key
action: template src=~/.ssh/id_rsa.pub dest=~/.ssh/id_rsa.pub
- name: copy private key
action: template src=~/.ssh/id_rsa dest=~/.ssh/id_rsa
Checkout a project from github
- name: git checkout source
action: git repo=git@github.com:{your_git_repo}.git dest={{work_dir}} version=unstable
Tuesday, July 16, 2013
Ansible - how to launch EC2 instances and setup the php environment
In this post, we will create a script that will launch an instance in the EC2 cloud and install php and nginx (Installing httpd is going to be very similar) on it.
First, you will need to set up Ansible.
If you are using ubuntu, read Install Ansible on ubuntu EC2.
If you are using a Mac, read Installing and Running Ansible on Mac OSX and pinging ec2 machines.
You must:
- have python boto installed
- set up the AWS access keys in the environment settings
Adding a host
We will use the ec2 module. It runs against localhost, so we will add a host entry.
vi /etc/ansible/hosts
Append the following:
localhost ansible_connection=local
Launching a micro instance
Label this launch_playbook.yml
Execute the script.
ansible-playbook launch_playbook.yml
In your AWS EC2 console, you will see an instance named ansible. Each task is executed in sequence.
Now add this new host in the ansible host file and label it webservers.
vi /etc/ansible/hosts
[webservers]
{the_ip_of_ec2_instance_we_just_created} ansible_connection=ssh ansible_ssh_user=ec2-user ansible_ssh_private_key_file={path_to_aws_private_key}
You don't have to do the above. In fact, you can use the group name "ec2-servers" for the following script. But the following script will need to be in the same file as the first script. I am just separating these files for easier configuration in the future.
Installing php, nginx, mysql
Label this configure_playbook.yml
Execute the script.
ansible-playbook configure_playbook.yml
Go to the public address of this instance. You should see the nginx welcoming message.
Remember to terminate the instance when you finish, else it will incur charges.
Install Ansible on ubuntu EC2
Begin by spinning up a new EC2 Ubuntu instance.
Install Ansible and its dependencies
sudo apt-get install python-pip python-dev
sudo pip install ansible
sudo apt-get install python-boto
Make sure the boto version is greater than 2.3.
To check boto version:
pip freeze | grep boto
Make the hosts file
sudo mkdir /etc/ansible
sudo touch /etc/ansible/hosts
Put the IPs of your machines in the hosts file.
Ex. [webservers] is a group name for the 2 IPs below.
[webservers]
255.255.255.255
111.111.111.111
Check the Playbook Settings
ansible-playbook playbook.yml --list-hosts
You will see the servers that the Playbook will run against:
play #1 (create instances): host count=1
localhost
play #2 (configure instances): host count=0
Play the Playbook
ansible-playbook playbook.yml
AWS credentials
If you are going to use the ec2 module, you will need to set up the access keys in your environment.
vi ~/.bashrc
Append the following with your keys (you need to log in to your AWS console to get the access key pair):
export AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY}
export AWS_SECRET_ACCESS_KEY=${AWS_SECRET_KEY}
Saturday, July 13, 2013
Installing and Running Ansible on Mac OSX and pinging ec2 machines
We will be installing Ansible with pip.
Install Ansible
Download ez_setup.py.
sudo python ez_setup.py
sudo easy_install pip
sudo pip install ansible
sudo pip install jinja2
Define your host file
Create the file /etc/ansible/hosts.
Put the IP of each machine you want to ping.
Example:
[appservers]
255.255.255.255 ansible_ssh_private_key_file={your_key_path}.pem ansible_ssh_user=ec2-user
Change the IP to your EC2 instance's IP. The [appservers] is just a label for grouping. You may have servers grouped as web servers, app servers, db servers, etc.
Run Ansible
ansible all -m ping
You will see a response similar to the following if it's successful.
255.255.255.255 | success >> {
    "changed": false,
    "ping": "pong"
}
Let's execute a command on all the machines:
ansible all -a "/bin/echo hello"
You will see:
255.255.255.255 | success | rc=0 >>
hello
Saving the key in memory
If you don't specify the ansible_ssh_private_key_file and ansible_ssh_user attributes in the inventory file above, you can either 1) specify the key and user in the ansible command or 2) use ssh-agent.
1.) Explicitly specifying the user and key:
ansible all -m ping -u ec2-user --private-key={your_key}.pem
2.) Using ssh-agent and ssh-add:
ssh-agent bash
ssh-add ~/.ssh/{your_key}.pem
Then you can ping the ec2 server like this:
ansible all -m ping -u ec2-user
Tuesday, April 16, 2013
Scaling Pinterest from 0 to 1 billion
The following is a very informative link sharing how Pinterest scaled from 0 to tens of billions of page views a month in under two years. Throughout the years, they have tried different technologies and abandoned some.
Below are some key points:
- an architecture is good when growth can be handled by adding more of the same staff (machines)
- when you push a technology to the limit, it will fail in its own special way
- the stack used is MySQL with sharding, Python, Amazon EC2, S3, Akamai, Elastic Load Balancer, memcache, Redis
Notice that they dropped Cassandra, and Rackspace.
Here's the link:
http://highscalability.com/blog/2013/4/15/scaling-pinterest-from-0-to-10s-of-billions-of-page-views-a.html?utm_source=feedly
Saturday, February 9, 2013
Micro Instance out of memory - add swap
While trying to update my Symfony project's database schema and assets, I got the following:
Fatal error: Uncaught exception 'ErrorException' with message 'Warning: proc_open(): fork failed - Cannot allocate memory in
An Amazon EC2 t1.micro instance only has 613MB of RAM. It is not enough to run a lot of processes.
What I can do is either 1) switch to a small instance or 2) add a 1GB swap on disk.
Here are the commands to add a 1GB swap:
sudo /bin/dd if=/dev/zero of=/var/swap.1 bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 34.1356 s, 31.5 MB/s
sudo /sbin/mkswap /var/swap.1
Setting up swapspace version 1, size = 1048572 KiB
no label, UUID=9cffd7c9-8ec6-4f6c-8eea-79aa3173a59a
sudo /sbin/swapon /var/swap.1
To turn off the swap do the following:
sudo /sbin/swapoff /var/swap.1
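Note that the swap file above will not survive a reboot. If you want it back automatically, a hypothetical /etc/fstab entry (not part of the original commands) would be:

```
/var/swap.1 swap swap defaults 0 0
```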
Saturday, February 2, 2013
EC2 instance terminated from custom AMI
If your instance terminates every time you try to launch it from an AMI, the AMI is probably corrupted.
What you need to do is to recreate the AMI from the original instance.
Make sure all processes are shut down except the ssh daemon.
Run netstat -tupln
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 701/sshd
tcp6 0 0 :::22 :::* LISTEN 701/sshd
udp 0 0 0.0.0.0:68 0.0.0.0:* 491/dhclient3
Make sure only the above processes are running and then create the AMI.
How to deal with log files in AWS EC2
When you are launching your application for production, it's best to keep the logs in a separate drive.
What I typically do is mount two EBS volumes on an instance: one for source code, one for log files.
For convenience, you can mount the source code directory as /var/www and the log files directory as /var/log.
If you are looking for information on how to mount and format a volume, read Amazon EC2 - Mounting a EBS drive.
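As a sketch, assuming the two volumes show up as /dev/xvdf and /dev/xvdg (the device names and XFS choice are assumptions, not from the post), the /etc/fstab entries could look like:

```
/dev/xvdf /var/www xfs noatime 0 0
/dev/xvdg /var/log xfs noatime 0 0
```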
Tuesday, January 29, 2013
Setting up Cassandra Multi Nodes on Amazon EC2
Cassandra is a NoSQL database. It is designed to be launched in a cluster of machines, providing high availability and fault tolerance.
Before starting, make sure you scan through Node and Cluster Initialization Properties.
Important Node attributes:
cluster_name
All nodes in a cluster must have the same name.
commitlog_directory
Datastax recommends to put this into a separate disk partition (Perhaps, EBS).
data_file_directories
Stores the column family data.
partitioner
defaults to RandomPartitioner
rpc_address
set to 0.0.0.0 to listen on all configured interfaces
rpc_port
Port for Thrift server. Default is 9160.
saved_caches_directory
Location where column family key and row caches will be stored.
seeds
Nodes that contain information about the ring topology and obtain gossip information.
storage_port
Port for inter-node communication. Default is 7000.
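Pulled together, the attributes above could look like this in cassandra.yaml. The paths, cluster name, and seed IPs are illustrative assumptions, not values from this post:

```yaml
cluster_name: 'MyCluster'          # must match on every node
commitlog_directory: /vol/cassandra/commitlog
data_file_directories:
    - /vol/cassandra/data
saved_caches_directory: /vol/cassandra/saved_caches
partitioner: org.apache.cassandra.dht.RandomPartitioner
rpc_address: 0.0.0.0               # listen on all configured interfaces
rpc_port: 9160                     # Thrift server
storage_port: 7000                 # inter-node communication
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "10.0.0.1,10.0.0.2"   # private IPs of your seed nodes
```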
Create a Large Instance
You will need to launch at least a large instance. See Cassandra Hardware for more details. If you have a smaller instance, you may not be able to use the /etc/init.d/cassandra start command and you will see a JVM heap memory error.
Ideal Cassandra Instance Specs:
We will NOT be using EBS due to the bad I/O throughput and reliability. We will use the ephemeral volume instead. Click here for more details.
Security Group
Create a Security Group with the settings above.
Port 22 and 8888 will be 0.0.0.0/0.
1024-65535 will be your group id. (Click on Details tab on a Security Group to check your group id)
All other ports will be your group id.
Mounting the ephemeral drive
We will begin by formatting the ephemeral drive with XFS.
Use fdisk -l to check what's your ephemeral drive. It may come with ext3 already.
umount /dev/xvdb
mkfs.xfs -f /dev/xvdb
vi /etc/fstab
Remove the original entry and put
/vol xfs noatime 0 0
sudo mount /vol
You may also want to use RAID-0 to strip a set of ephemeral volumes.
Install Oracle Sun Java
Do not use OpenJDK. Cassandra works only with Oracle Sun Java.
Download jdk-6u38-linux-x64.bin.
mkdir /usr/java/latest
Upload or wget the JDK in this folder.
chmod +x jdk-6u38-linux-x64.bin
sudo ./jdk-6u38-linux-x64.bin
sudo update-alternatives --install "/usr/bin/java" "java" "/usr/java/latest/jdk1.6.0_38/bin/java" 1
sudo update-alternatives --set java /usr/java/latest/jdk1.6.0_38/bin/java
java -version
Make sure JNA is installed. Linux does not swap out the JVM and performance can improve.
sudo apt-get install libjna-java
vi /etc/security/limits.conf
Add:
cassandra soft memlock unlimited
cassandra hard memlock unlimited
Install Cassandra
Begin by installing a single node Cassandra. Read Cassandra - installing on Ubuntu 12.04 Amazon EC2.
Make sure Cassandra is at version 1.2.x and cqlsh is 2.3.x
mkdir /vol/cassandra
mkdir /vol/cassandra/commitlog
mkdir /vol/cassandra/data
mkdir /vol/cassandra/saved_caches
chown cassandra:cassandra -R /vol/cassandra
vi /etc/cassandra/cassandra.yaml
Point these directories to the ones we created above.
Kill Cassandra if you started with cassandra -f command. We will want to start from init.d
sudo /etc/init.d/cassandra start
sudo /etc/init.d/cassandra stop
sudo /etc/init.d/cassandra status
Use nodetool to check the status:
If it's not starting, check the log /var/log/cassandra/output.log
If it's complaining about oldClusterName != newClusterName, just remove everything in the data_file_directories.
Create a Cassandra AMI
We will be setting up a ring (multi-node Cassandra). Before you create an AMI, umount /dev/xvdb and comment out the xfs record in /etc/fstab. Else you won't be able to ssh into the instances launched by this image.
Launch a second instance in another availability zone
Setting up a Cassandra Ring
Before you begin, make sure you have the following:
A snitch is used to determine which data centers and racks are written to and read from, and distribute replicas by grouping machines into data centers and racks.
For Ec2Snitch, a region is treated as a data center, and an availability zone is treated as a rack within a data center.
Setting up Multi Data Center Cassandra Ring
We will begin by tweaking the first node.
cd /etc/cassandra/cassandra.yaml
Set the following:
We will now add a second node in a different region (Ex. if first region is at us-east-1a, then make second region to be at us-east-1d).
ssh into your second instance. Remember to mount the partition back. Use "df" to make sure /dev/xvdb is mounted.
The following needs to be changed on all nodes:
Seeds
Add the private IPs of all nodes
RPC Address
The address in which clients connect to
Listen Address
The address in which nodes connect with each other
For the first node,
Initial token (skip to Virtual nodes if you are using Cassandra 1.2.x and above)
This is used for load balancing. The first node should have a value of zero. All other nodes will need to recalculate this value every time a new node joins the cluster.
Calculate this based on the number of nodes. Use the Python problem from Cassandra.
Create a file called token_generator.py. Paste the following in the file.
Virtual nodes (Cassandra 1.2.x or above)
vnodes are introducted in 1.2.x.
Set num_tokens to 256 and leave initial_token to empty.
Auto bootstrapping
When a new node is added, the cluster will automatically migrate the correct range of data from existing nodes.
Do not set autobootstrap: true and include it in the seed list together.
After all the above setup, start both nodes. Then check if they are up.
PropertyFileSnitch
Set endpoint_snitch: PropertyFileSnitch
We will be using PropertyFileSnitch and define our data centers and racks.
We will use dc1 to represent data center 1 and rac1 to represent rack 1.
Create /etc/cassandra/cassandra-topology.properties on all nodes and place the following:
10.216.218.73=dc1:rac1
10.31.2.31=dc2:rac1
default=dc1:rac1
default=dc1:rac1 is for when a node first joined and it's not specified in file.
Keep in mind that when creating our schema we will be using NetworkTopologyStrategy and use the dc and rac references we used above.
You may want to create an image again.
Testing the cluster replication
Start Cassandra for both nodes:
nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.32.6.31 28.94 KB 256 48.2% eab0379f-2ac6-408a-b6dc-0ad475337a28 rac1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.108.23.52 47.66 KB 256 51.8% 35f1f17c-84c1-4e10-83b5-857feba03f4d rac1
In both nodes, try executing the following:
cqlsh 10.216.218.73
cqlsh 10.31.2.31
You should not have a problem connecting to both of these machines. Make sure you are have the latest cqlsh (2.3.0 at the moment of this post).
We will be executing a script. I would recommend setting up Git and pull your code from Github for a production machine.
Create a script called test.cql.
Paste the following:
Execute your script by running:
Connection your Application to Cassandra
In the Security Group for Cassandra, open up 9160 to your application security group id.
Check if the connection is okay by telnet
set to 0.0.0.0 to listen on all configured interfaces
rpc_port
Port for Thrift server. Default is 9160.
saved_caches_directory
Location where column family key and row caches will be stored.
seeds
Nodes that contain information about the ring topology and obtain gossip information.
storage_port
Port for inter-node communication. Default is 7000.
Create a Large Instance
You will need to launch at least a large instance. See Cassandra Hardware for more details. On a smaller instance, the /etc/init.d/cassandra start command may fail with a JVM heap memory error.
Ideal Cassandra Instance Specs:
- 32 GB RAM (Minimum 4GB)
- 8-core cpu
- 2 disks (one for CommitLogDirectory, one for DataFileDirectories)
- RAID 0 for DataFileDirectories disk when disk capacity is 50% full
- XFS file system
- Minimum 3 replications (instances)
We will NOT be using EBS due to its poor I/O throughput and reliability; we will use the ephemeral volume instead.
Security Group
Port | Description |
---|---|
Public Facing Ports | |
22 | SSH port. |
8888 | OpsCenter website port. |
Cassandra Inter-node Ports | |
1024+ | JMX reconnection/loopback ports. See description for port 7199. |
7000 | Cassandra inter-node cluster communication. |
7199 | Cassandra JMX monitoring port. After the initial handshake, the JMX protocol requires that the client reconnects on a randomly chosen port (1024+). |
9160 | Cassandra client port (Thrift). |
OpsCenter ports | |
61620 | OpsCenter monitoring port. The opscenterd daemon listens on this port for TCP traffic coming from the agent. |
61621 | OpsCenter agent port. The agents listen on this port for SSL traffic initiated by OpsCenter. |
Create a Security Group with the settings above.
Ports 22 and 8888 will be open to 0.0.0.0/0.
Ports 1024-65535 will be open to your group id. (Click on the Details tab of a Security Group to check your group id.)
All other ports will be open to your group id.
Mounting the ephemeral drive
We will begin by formatting the ephemeral drive with XFS.
Use fdisk -l to check which device is your ephemeral drive. It may already be formatted with ext3.
umount /dev/xvdb
mkfs.xfs -f /dev/xvdb
vi /etc/fstab
Remove the original entry and put:
/dev/xvdb /vol xfs noatime 0 0
sudo mount /vol
You may also want to use RAID-0 to stripe a set of ephemeral volumes.
Install Oracle Sun Java
Do not use OpenJDK. Cassandra works only with Oracle Sun Java.
Download jdk-6u38-linux-x64.bin.
mkdir /usr/java/latest
Upload or wget the JDK in this folder.
chmod +x jdk-6u38-linux-x64.bin
sudo ./jdk-6u38-linux-x64.bin
sudo update-alternatives --install "/usr/bin/java" "java" "/usr/java/latest/jdk1.6.0_38/bin/java" 1
sudo update-alternatives --set java /usr/java/latest/jdk1.6.0_38/bin/java
java -version
java version "1.6.0_38"
Java(TM) SE Runtime Environment (build 1.6.0_38-b05)
Java HotSpot(TM) 64-Bit Server VM (build 20.13-b02, mixed mode)
Make sure JNA is installed. With JNA, Linux will not swap out the JVM, which can improve performance.
sudo apt-get install libjna-java
vi /etc/security/limits.conf
Add:
cassandra soft memlock unlimited
cassandra hard memlock unlimited
Install Cassandra
Begin by installing a single node Cassandra. Read Cassandra - installing on Ubuntu 12.04 Amazon EC2.
Make sure Cassandra is at version 1.2.x and cqlsh is 2.3.x
cassandra -version
cqlsh --version
We want to save the data in the ephemeral drive. The mount point we created earlier is /vol.
mkdir /vol/cassandra
mkdir /vol/cassandra/commitlog
mkdir /vol/cassandra/data
mkdir /vol/cassandra/saved_caches
chown cassandra:cassandra -R /vol/cassandra
vi /etc/cassandra/cassandra.yaml
Point these directories to the ones we created above.
- commitlog_directory
- data_file_directories
- saved_caches_directory
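Assuming the mount point and directories created above, the relevant cassandra.yaml entries would then look like this (note that data_file_directories takes a list):

```yaml
commitlog_directory: /vol/cassandra/commitlog
data_file_directories:
    - /vol/cassandra/data
saved_caches_directory: /vol/cassandra/saved_caches
```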
Kill Cassandra if you started with cassandra -f command. We will want to start from init.d
sudo /etc/init.d/cassandra start
sudo /etc/init.d/cassandra stop
sudo /etc/init.d/cassandra status
Use nodetool to check the status:
nodetool -h localhost -p 7199 ring
Reboot and check that it's running with "netstat -tupln".
If it's not starting, check the log /var/log/cassandra/output.log
If it's complaining about oldClusterName != newClusterName, just remove everything in the data_file_directories.
Create a Cassandra AMI
We will be setting up a ring (multi-node Cassandra). Before you create an AMI, umount /dev/xvdb and comment out the xfs record in /etc/fstab. Otherwise you won't be able to ssh into instances launched from this image.
Launch a second instance in another availability zone
Setting up a Cassandra Ring
Before you begin, make sure you have the following:
- Cassandra on each node
- a cluster name
- IP of each node
- seed nodes
- snitches (EC2Snitch, EC2MultiRegionSnitch)
- open required firewalls
A snitch is used to determine which data centers and racks are written to and read from, and distribute replicas by grouping machines into data centers and racks.
For Ec2Snitch, a region is treated as a data center, and an availability zone is treated as a rack within a data center.
Setting up Multi Data Center Cassandra Ring
We will begin by tweaking the first node.
cd /etc/cassandra/cassandra.yaml
Set the following:
cluster_name: my_cluster
initial_token: 0
Start Cassandra. If you face any problems starting it, delete all the files in commitlog_directory and data_file_directories.
We will now add a second node in a different region (Ex. if first region is at us-east-1a, then make second region to be at us-east-1d).
ssh into your second instance. Remember to mount the partition back. Use "df" to make sure /dev/xvdb is mounted.
umount /mnt
vi /etc/fstab
uncomment "/dev/xvdb /vol xfs noatime 0 0" and remove entries that are using /dev/xvdb if appropriate
mkfs.xfs -f /dev/xvdb
mount /vol
mkdir /vol/cassandra
mkdir /vol/cassandra/commitlog
mkdir /vol/cassandra/data
mkdir /vol/cassandra/saved_caches
chown cassandra:cassandra -R /vol/cassandra
Now edit /etc/cassandra/cassandra.yaml.
The following needs to be changed on all nodes:
- seeds
- rpc_address
- listen_address
- initial_token
- auto_bootstrap
Seeds
Add the private IPs of all nodes
- seeds: "10.31.2.31,10.216.218.73"
RPC Address
The address clients connect to
Listen Address
The address nodes use to communicate with each other
For the first node,
listen_address: 10.31.2.31
rpc_address: 10.31.2.31
For the second node,
listen_address: 10.216.218.73
rpc_address: 10.216.218.73
Initial token (skip to Virtual nodes if you are using Cassandra 1.2.x and above)
This is used for load balancing. The first node should have a value of zero. All other nodes will need to recalculate this value every time a new node joins the cluster.
Calculate this based on the number of nodes. Use the Python program from Cassandra.
Create a file called token_generator.py. Paste the following in the file.
#! /usr/bin/python
import sys
if (len(sys.argv) > 1):
    num=int(sys.argv[1])
else:
    num=int(raw_input("How many nodes are in your cluster? "))
for i in range(0, num):
    print 'node %d: %d' % (i, (i*(2**127)/num))
Change it to an executable.
chmod 777 token_generator.py
Execute the program with the number of nodes as the first argument. In our case, it's 2.
./token_generator.py 2
The output should be similar to the following:
node 0: 0
node 1: 85070591730234615865843651857942052864
Put 85070591730234615865843651857942052864 as the initial token for the second node.
initial_token: '85070591730234615865843651857942052864'
If you get DatabaseDescriptor.java (line 509) Fatal configuration error, you are probably using Cassandra 1.2.x.
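If you prefer Python 3, the same token math looks like this (integer division keeps the values exact; the result for two nodes matches the generator output above):

```python
def initial_tokens(num_nodes):
    """Evenly space initial tokens around the 2**127 RandomPartitioner ring."""
    return [i * (2 ** 127) // num_nodes for i in range(num_nodes)]

for node, token in enumerate(initial_tokens(2)):
    print('node %d: %d' % (node, token))
```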
Virtual nodes (Cassandra 1.2.x or above)
vnodes were introduced in 1.2.x.
Set num_tokens to 256 and leave initial_token empty.
Auto bootstrapping
When a new node is added, the cluster will automatically migrate the correct range of data from existing nodes.
Do not set auto_bootstrap: true on a node and also include that node in the seed list.
After all the above setup, start both nodes. Then check if they are up.
nodetool status
PropertyFileSnitch
Set endpoint_snitch: PropertyFileSnitch
We will be using PropertyFileSnitch and define our data centers and racks.
We will use dc1 to represent data center 1 and rac1 to represent rack 1.
Create /etc/cassandra/cassandra-topology.properties on all nodes and place the following:
10.216.218.73=dc1:rac1
10.31.2.31=dc2:rac1
default=dc1:rac1
default=dc1:rac1 is used when a node joins but is not listed in the file.
Keep in mind that when creating our schema we will be using NetworkTopologyStrategy and use the dc and rac references we used above.
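The topology file is just key=value pairs mapping each IP to dc:rack. A quick sketch of parsing it (illustrative only, not Cassandra's actual parser):

```python
def parse_topology(text):
    """Parse cassandra-topology.properties into {ip: (dc, rack)}."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        ip, location = line.split("=", 1)
        dc, rack = location.split(":", 1)
        mapping[ip] = (dc, rack)
    return mapping

topology = parse_topology("""\
10.216.218.73=dc1:rac1
10.31.2.31=dc2:rac1
default=dc1:rac1
""")
print(topology["10.31.2.31"])  # ('dc2', 'rac1')
```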
You may want to create an image again.
Testing the cluster replication
Start Cassandra for both nodes:
service cassandra start
Check the status:
nodetool status
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.32.6.31 28.94 KB 256 48.2% eab0379f-2ac6-408a-b6dc-0ad475337a28 rac1
Datacenter: dc2
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 10.108.23.52 47.66 KB 256 51.8% 35f1f17c-84c1-4e10-83b5-857feba03f4d rac1
In both nodes, try executing the following:
cqlsh 10.216.218.73
cqlsh 10.31.2.31
You should not have a problem connecting to both of these machines. Make sure you have the latest cqlsh (2.3.0 at the time of this post).
We will be executing a script. For a production machine, I would recommend setting up Git and pulling your code from GitHub.
Create a script called test.cql.
Paste the following:
create keyspace helloworld with replication = {'class': 'NetworkTopologyStrategy', 'dc1': 1, 'dc2': 1};
use helloworld;
create table activity (
    activity_key int,
    activity_time timeuuid,
    activity_type varint,
    primary key (activity_key, activity_time)
)
with clustering order by (activity_time desc);
cqlsh 10.216.218.73 -f test.cql
Check to see if the keyspace "helloworld" exists:
cqlsh 10.216.218.73
describe keyspaces;
Connecting your Application to Cassandra
In the Security Group for Cassandra, open up 9160 to your application security group id.
Check if the connection is okay by telnet
telnet 10.31.2.31 9160
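If telnet isn't installed, a few lines of Python can do the same reachability check (a sketch; the IP and port are the examples above):

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# e.g. port_open("10.31.2.31", 9160) should be True from an allowed security group
```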
Labels:
ami,
aws,
cassandra,
cassandra ami,
cassandra cluster,
ec2,
multi nodes,
nosql
Friday, January 25, 2013
Setting up Lighttpd Load Balancer on EC2 Ubuntu
Lighttpd is an asynchronous server. Along with Nginx, it is one of the fast servers designed to counter the C10k problem. If you want to set up Nginx, read Setting up Nginx on EC2 Ubuntu.
This tutorial will demonstrate how to use Lighttpd to load balance application servers.
Creating an EC2 Instance
In the AWS Management Console, begin by creating a t1.micro Ubuntu Server 12.04.1 LTS 64-bit instance. (If you don't know how to create an instance, read Amazon EC2 - Launching Ubuntu Server 12.04.1 LTS step by step guide.)
Here are some guidelines:
- Uncheck Delete on Termination for the root volume
- Add ports 22, 80 and 443 to the Security Group; call it lighttpd.
Install Lighttpd
ssh -i {key} ubuntu@{your_ec2_public_address}
sudo apt-get update -y
sudo apt-get install -y lighttpd
Lighttpd should be running. To check its status, run
service lighttpd status
All the configuration files are located in /etc/lighttpd
To enable/disable a module
- Use /usr/sbin/lighty-enable-mod and /usr/sbin/lighty-disable-mod
- Or create a symbolic link from /etc/lighttpd/conf-available/{module} to /etc/lighttpd/conf-enabled/{module}
To load balance application servers, we will be using the 10-proxy.conf file as a template.
cd /etc/lighttpd/conf-available
cp 10-proxy.conf 11-proxy.conf
vi 11-proxy.conf
We are interested in the following two variables:
- proxy.balance - choose from hash, round-robin or fair
- proxy.server - put the servers you want to load balance to
For example:
proxy.balance = "hash"
proxy.server = ( "" =>
(
( "host" => "10.204.199.85",
"port" => 80
),
( "host" => "10.202.111.140",
"port" => 80
)
)
)
The above settings will load balance to two other servers based on IP.
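To see why "hash" balancing keeps a client pinned to one backend, here is a rough illustration of the idea in Python (not lighttpd's actual hash function):

```python
import hashlib

BACKENDS = ["10.204.199.85", "10.202.111.140"]

def pick_backend(client_ip, backends=BACKENDS):
    """Map a client IP to a stable backend choice."""
    digest = hashlib.md5(client_ip.encode()).hexdigest()
    return backends[int(digest, 16) % len(backends)]

# The same client IP always lands on the same backend
print(pick_backend("203.0.113.7") == pick_backend("203.0.113.7"))
```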
Restart the server.
service lighttpd restart
Test the server.
To check the status:
netstat -ntulp
Thursday, January 24, 2013
Setting up a Java Tomcat7 Production Server on Amazon EC2
This tutorial will demonstrate how to build a Tomcat7 server running a Java application on Amazon EC2.
Here are the tools we will set up:
- Apache Tomcat7
- Open JDK7
- GitHub
- Maven 3.0.4
- MySQL
Creating an EC2 Instance
In the AWS Management Console, begin by creating a t1.micro Ubuntu Server 12.04.1 LTS 64-bit machine. (If you don't know how to create an instance, read Amazon EC2 - Launching Ubuntu Server 12.04.1 LTS step by step guide.)
Here are some guidelines:
- Uncheck Delete on Termination for the root volume
- Add port 22, 80, 443 to the Security Group.
Create an EBS volume
We will create a 20GB volume to store our Java code. The EBS volume will be formatted with XFS.
If the volume gets stuck attaching, keep restarting the EC2 instance until it attaches.
Configure the EC2 instance
ssh into the instance (ssh -i {key} ubuntu@{your_ec2_public_address})
sudo apt-get update -y
My mounting point for /dev/xvdf is called /vol.
cd /vol
mkdir src
mkdir webapps
mkdir war_backups
/vol/src is where we will place the application code. /vol/webapps is where we will deploy the WAR file. /vol/war_backups is for making war backups, as the name implies.
Deploying code from GitHub
Skip this if you are using other source control. The idea is that we will put the Java application code in the /vol/src folder.
sudo apt-get install git -y
mkdir /vol/src
cd /vol/src
git config --global user.name "your_name"
git config --global user.email "your_email"
git config --global github.user "your_github_login"
git clone ssh://git@github.com/username/repo.git
You will want to establish a connection with GitHub using ssh rather than https because, if you are building an image that can be used for auto-scaling, you don't want to input the username and password every time. See Generating SSH Keys for more details.
Your project should be located in /vol/src/{your_project}
Set up the Tomcat7 server
Begin by installing OpenJDK 7. Read Install Java OpenJDK 7 on Amazon EC2 Ubuntu.
Run echo $JAVA_HOME to check that it's set.
Install Tomcat 7. Read Install Tomcat 7 on Amazon EC2 Ubuntu.
Remember to change Tomcat to ports 80 and 443, and set the root web directory as the Tomcat WAR root path.
Check http://{your_ec2_public_address} in your browser to make sure Tomcat7 is running.
Make sure Tomcat7 is still up after you reboot the machine.
Generating the war file
We will be using Maven to compile our Spring Java project. If you are using other build frameworks, skip this.
Read Install Maven 3 on Amazon EC2 Ubuntu.
Run "mvn --version" to make sure it's using OpenJDK 7 and running the latest version of Maven.
cd /vol/src/{your_project}
mvn clean install
A WAR file should be built.
Move this WAR file into the Tomcat webapps directory. If you are following this tutorial, it should be at /vol/webapps.
Remember to label this WAR file as ROOT.war; it makes load balancing mapping easier later.
Browse to check you can access the site.
Using Amazon SES as the SMTP email service
Using SES will increase the likelihood of email delivery. Read Using Amazon SES to send emails.
Recompile your project and test it.
Moving MySQL to Amazon RDS
If you are using MySQL, you should move to Amazon RDS as it simplifies a lot of management, backup operations for you.
Read Using MySQL on Amazon RDS.
To interact with RDS through your EC2 instance, install MySQL Server or just the MySQL client interface.
sudo apt-get install -y mysql-server
Stop the local MySQL server. We won't be using it.
sudo /etc/init.d/mysql stop
Connect to your RDS instance:
mysql -h {rds_public_address} -P 3306 -u{username} -p{password}
Do NOT use the following form. You will get an access denied error.
mysql -u{username} -p{password} -h {rds_public_address} -P 3306
Update the JDBC settings in your application, recompile and test it.
Load Balancing Tomcat7
If you are planning to run multiple instances, read Setting up Lighttpd Load Balancer on EC2 Ubuntu or Setting up Nginx on EC2 Ubuntu.
Wednesday, January 23, 2013
Amazon EC2 - remote host identification has changed
You may get the following message when you ssh into your EC2 machine:
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
aa:c3:4d:d2:db:64:17:f0:b3:9c:77:d7:47:2f:31:ab.
Please contact your system administrator.
This can happen when you are associating your Elastic IP with another instance.
All you need to do is remove the known_hosts file:
rm ~/.ssh/known_hosts
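Deleting the whole known_hosts file also drops every other saved host key. A gentler option is to strip just the offending entry, as sketched below (or use ssh-keygen -R {host}, which does the same thing):

```python
def remove_host_entry(known_hosts_text, host):
    """Return known_hosts contents without lines for the given host."""
    kept = []
    for line in known_hosts_text.splitlines():
        fields = line.split()
        # First field may be a comma-separated list of host names/IPs
        # (does not handle hashed known_hosts entries)
        if fields and host in fields[0].split(","):
            continue
        kept.append(line)
    return "\n".join(kept)
```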
Tuesday, January 22, 2013
Setting up Nginx on EC2 Ubuntu
Nginx is a high performance Web server and a reverse proxy. It is one of the top servers that can counter the C10K problem. It can be used to load balance application servers and serve static assets.
There are many ways to set up your load balancers in AWS. Here are some examples:
- Elastic Load Balancer -> Application and Nginx on each server
- Three layers: Elastic Load Balancer -> Nginx servers (cache, load balancers) -> Application servers
- Elastic Load Balancer -> Application servers; using CloudFront to serve assets
- Nginx -> Application servers
Instagram uses the 2nd approach above.
This tutorial will focus on setting up Nginx on a single EC2 instance, while load balancing the application servers.
Creating an EC2 Instance
In the AWS Management Console, begin by creating a t1.micro Ubuntu Server 12.04.1 LTS 64-bit instance. (If you don't know how to create an instance, read Amazon EC2 - Launching Ubuntu Server 12.04.1 LTS step by step guide.)
Here are some guidelines:
- Uncheck Delete on Termination for the root volume
- Add port 22, 80 and 443 to the Security Group, call it Nginx.
Installing Nginx
ssh into your instance.
ssh -i {your key} ubuntu@{your_ec2_public_address}
sudo apt-get update
sudo apt-get install -y nginx
Check the nginx version
nginx -v
If this is not the latest version, do the following:
sudo vi /etc/apt/sources.list
Add:
deb http://nginx.org/packages/ubuntu/ precise nginx
deb-src http://nginx.org/packages/ubuntu/ precise nginx
sudo apt-get update
You will get:
W: GPG error: http://nginx.org precise Release: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY ABF5BD827BD9BF62
Add the public key:
wget http://nginx.org/packages/keys/nginx_signing.key
cat nginx_signing.key | sudo apt-key add -
sudo apt-get install nginx
You may get the following:
dpkg: error processing /var/cache/apt/archives/nginx_1.2.6-1~precise_amd64.deb (--unpack):
trying to overwrite '/etc/logrotate.d/nginx', which is also in package nginx-common 1.1.19-1ubuntu0.1
dpkg-deb: error: subprocess paste was killed by signal (Broken pipe)
Errors were encountered while processing:
/var/cache/apt/archives/nginx_1.2.6-1~precise_amd64.deb
apt-get remove nginx-common
sudo apt-get install nginx
Check your version to make sure it's the latest version (nginx -v).
Make Nginx start on boot.
update-rc.d nginx defaults
Nginx Basic Commands
sudo service nginx start
sudo service nginx stop
sudo service nginx restart
sudo service nginx status
Checking IP of your browser:
ifconfig eth0 | grep inet | awk '{ print $2 }'
Load balancing servers
We will have the Nginx server to load balance two servers (backend1.example.com and backend2.example.com) in a round robin fashion.
Begin by creating a new virtual host configuration file.
cp /etc/nginx/sites-available/default /etc/nginx/sites-available/{domain}
Put the following into the file:
upstream domain {
ip_hash;
server backend1.example.com:8080;
server backend2.example.com:8080;
}
server {
listen 80;
server_name domain;
access_log /var/log/nginx/web_portal.access.log;
location / {
proxy_pass http://domain/;
proxy_next_upstream error timeout invalid_header http_500;
proxy_connect_timeout 2;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_intercept_errors on;
}
}
Make sure the domain above matches your request domain.
It is very important to have the following two directives. They define what happens when a server is down. In this case, the client request is redirected to the next machine if the server does not respond within 2 secs.
proxy_next_upstream error timeout invalid_header http_500;
proxy_connect_timeout 2;
Check out proxy_read_timeout as well.
ip_hash will always send the client back to the same server based on the IP.
Check the Nginx Wiki for more info.
Disable the default config.
rm /etc/nginx/sites-enabled/default
Enable the configuration by symbolic link to sites-enabled.
sudo ln -s /etc/nginx/sites-available/{domain} /etc/nginx/sites-enabled/{domain}
If Nginx doesn't seem to pick up on the configuration, make sure /etc/nginx/nginx.conf has the following within the http block:
include /etc/nginx/sites-enabled/*;
Restart the server.
service nginx restart
To deploy code without service interruption, read Nginx - How to deploy code without service disruption.
How to build a NodeJS AMI on EC2
This demo will provide guidelines on how to configure a NodeJS EC2 instance and create a NodeJS AMI on Ubuntu.
We are going to format the xvdf with XFS file system. Refer to Amazon EC2 - Mounting a EBS drive.
Install NodeJS and other dependencies
sudo apt-get install -y nodejs npm
If you run "node --version", you will find the node version is 0.6.12. We want to use 0.8.18, since it's a lot faster.
sudo npm install -g n
sudo n 0.8.18
Now "sudo node --version" will show version 0.8.18 while "node --version" will show 0.6.12
Install Git and fetch your code (Optional)
sudo apt-get install git -y
mkdir /vol/src
cd /vol/src
git config --global user.name "your_name"
git config --global user.email "your_email"
git config --global github.user "your_github_login"
git clone ssh://git@github.com/username/repo.git
You will want to establish a connection with GitHub using ssh rather than https because, if you are building an image that can be used for auto-scaling, you don't want to input the username and password every time. See Generating SSH Keys for more details.
Test your application by running
sudo node {your_app}
Making the NodeJS start on boot
To make a robust image, we want the NodeJS app to start on boot and respawn when crashed. We will write a simple service. All service scripts are located in /etc/init.
Let's create the file /etc/init/{your_app_name}_service.conf
sudo vi http://upstart.ubuntu.com/wiki/Stanzas
Put the following into the file:
#######################
#!upstart
description "my NodeJS server"
author "Some Dude"
# start on startup
start on started networking
stop on shutdown
# Automatically Respawn:
respawn
respawn limit 5 60
script
cd /vol/src/{your_app}
exec sudo node /vol/src/{your_app}/app.js >> /vol/log/app_`date +"%Y%m%d_%H%M%S"`.log 2>&1
end script
post-start script
# Optionally put a script here that will notifiy you node has (re)started
# /root/bin/hoptoad.sh "node.js has started!"
end script
#######################
Refer to upstart stanzas for more details about what each field mean.
Create the directory to store NodeJS outputs:
sudo mkdir /vol/log
I have marked each log file with the start time of the app. You will probably want to change this to create logs daily.
To check if the services are running:
initctl list | grep {your_app_name}_service.conf
To start a service:
sudo service {your_app_name}_service.conf start
To stop a service:
sudo service {your_app_name}_service.conf stop
Now reboot your EC2 instance in the AWS console.
Test if your site is started.
Create a NodeJS AMI
In the AWS Management Console, click instances at the left sidebar.
Right click on the Wordpress instance created above and click on Create Image.
Fill in the image name. I like to name things in a convention that is systematic. If you are planning to write deploy scripts and do auto-scaling, it is easier to identify what an image is. I use the following convention:
{namespace}_{what_is_it}_{date}
Ex. mycompany_blog_20130118
You will want the date because you may create an image every time you deploy new codes.
Leave the other options as default, and click on Create Image.
On the left sidebar, click on AMIs under Images.
You can see the status of the AMI we just created.
You should launch an instance from this AMI and test all the data is there.
Specs:
Ubuntu Server 12.04.1 LTS 64-bit
Create an Ubuntu Server 12.04.1 LTS 64-bit t1.micro instance.
If you don't know how to do so, read Amazon EC2 - Launching Ubuntu Server 12.04.1 LTS step by step guide.
Uncheck Delete on Termination for the EBS root disk.
Create a Security Group called Node JS Production (or anything you want).
Add ports 22, 80, 443, and 3000 to the Security Group. (I am adding port 3000 because the app runs on port 3000.)
Launch the instance.
In the AWS Management Console, Volumes -> Create Volume.
Make the volume with:
- Type = Standard
- Size = 20 GB
- Availability Zone matching the EC2 instance's Availability Zone
- Device name = xvdf
Attach this EBS volume to the EC2 instance we just created.
ssh into your instance.
Ex. ssh -i {key} ubuntu@{ec2-address}.compute-1.amazonaws.com
sudo apt-get update
We are going to format the xvdf with XFS file system. Refer to Amazon EC2 - Mounting a EBS drive.
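For reference, the steps boil down to roughly the following (this assumes the volume appears as /dev/xvdf and is mounted at /vol; note that mkfs.xfs destroys any data on the volume):

```shell
sudo apt-get install -y xfsprogs          # XFS userland tools
sudo mkfs.xfs /dev/xvdf                   # format the EBS volume with XFS
sudo mkdir -p /vol
sudo mount /dev/xvdf /vol
# Persist the mount across reboots
echo "/dev/xvdf /vol xfs noatime 0 0" | sudo tee -a /etc/fstab
```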
Install NodeJS and other dependencies
sudo apt-get install -y nodejs npm
If you run "node --version", you will find the node version is 0.6.12. We want to use 0.8.18, since it's a lot faster.
sudo npm install -g n
sudo n 0.8.18
Now "sudo node --version" will show version 0.8.18, while "node --version" will show 0.6.12.
Install Git and fetch your code (Optional)
sudo apt-get install git -y
mkdir /vol/src
cd /vol/src
git config --global user.name "your_name"
git config --global user.email "your_email"
git config --global github.user "your_github_login"
git clone ssh://git@github.com/username/repo.git
You will want to connect to Github over ssh rather than https: if you are building an image that can be used for auto-scaling, you don't want to input the username and password every time. See Generating SSH Keys for more details.
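A rough sketch of the key setup (an empty passphrase is assumed so that unattended clones work; protect the key accordingly):

```shell
# Generate a key pair for this machine (no passphrase, for unattended use)
ssh-keygen -t rsa -f ~/.ssh/id_rsa -N ""
# Print the public key, then add it to your GitHub account or as a deploy key
cat ~/.ssh/id_rsa.pub
# Verify that GitHub accepts the key
ssh -T git@github.com
```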
Test your application by running
sudo node {your_app}
Making the NodeJS start on boot
To make a robust image, we want the NodeJS app to start on boot and respawn when crashed. We will write a simple service. All service scripts are located in /etc/init.
Let's create the file /etc/init/{your_app_name}_service.conf
sudo vi /etc/init/{your_app_name}_service.conf
Put the following into the file:
#######################
#!upstart
description "my NodeJS server"
author "Some Dude"
# start on startup
start on started networking
stop on shutdown
# Automatically Respawn:
respawn
respawn limit 5 60
script
cd /vol/src/{your_app}
exec sudo node /vol/src/{your_app}/app.js >> /vol/log/app_`date +"%Y%m%d_%H%M%S"`.log 2>&1
end script
post-start script
# Optionally put a script here that will notify you node has (re)started
# /root/bin/hoptoad.sh "node.js has started!"
end script
#######################
Refer to upstart stanzas (http://upstart.ubuntu.com/wiki/Stanzas) for more details about what each field means.
Create the directory to store NodeJS outputs:
sudo mkdir /vol/log
I have marked each log file with the start time of the app. You will probably want to change this to create logs daily.
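One way to do that is logrotate. A hypothetical /etc/logrotate.d/{your_app_name} entry (the path matches the log location above):

```
/vol/log/app_*.log {
    daily
    rotate 14
    compress
    missingok
    notifempty
}
```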
To check if the services are running:
initctl list | grep {your_app_name}_service
To start a service:
sudo service {your_app_name}_service start
To stop a service:
sudo service {your_app_name}_service stop
Now reboot your EC2 instance in the AWS console.
Test if your site is started.
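A quick way to check from the instance itself (assuming the app listens on port 3000, as in the Security Group above):

```shell
# Print the HTTP status code of the app's front page, e.g. 200 if it's up
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/
```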
Create a NodeJS AMI
In the AWS Management Console, click instances at the left sidebar.
Right click on the NodeJS instance created above and click on Create Image.
Fill in the image name. I like to use a systematic naming convention; if you are planning to write deploy scripts and do auto-scaling, it makes it easier to identify what each image is. I use the following convention:
{namespace}_{what_is_it}_{date}
Ex. mycompany_blog_20130118
You will want the date because you may create an image every time you deploy new code.
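The convention is easy to script once you automate image creation. A small sketch ("mycompany" and "blog" are example values, not part of this setup):

```shell
# Build an AMI name like mycompany_blog_20130118 from today's date
NAMESPACE="mycompany"
COMPONENT="blog"
IMAGE_NAME="${NAMESPACE}_${COMPONENT}_$(date +%Y%m%d)"
echo "$IMAGE_NAME"
```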
Leave the other options as default, and click on Create Image.
On the left sidebar, click on AMIs under Images.
You can see the status of the AMI we just created.
You should launch an instance from this AMI and test all the data is there.