Apache Airflow – Bash Install U16.04 LTS on EC2

I will try to create an Ansible version soon.

This installs Airflow via Bash on Ubuntu 16.04 LTS running on EC2. The specific image used was (HVM) / ami-f4cc1de2.

There are three parts, because in this example the Postgres configuration file (pg_hba.conf) is edited by hand between the two script sections.

Part 1:

# aws – ec2 – ubuntu 16.04 LTS
# (HVM) / ami-f4cc1de2
HOSTS_FILE="/etc/hosts"
PUBLIC_ADAPTER="eth0"
echo "Add appropriate mapping for local IP, since EC2 instance.  So if IP = 172.30.1.5, then…"
echo "172.30.1.5 ip-172-30-1-5 >> $HOSTS_FILE"
echo ""
echo "Attempting to add IP to Hosts file…"
IP=$(ip addr show "$PUBLIC_ADAPTER" | grep "inet\b" | awk '{print $2}' | cut -d/ -f1)
EC2_IP="${IP//./-}"
HOST_LINE="$IP ip-$EC2_IP"
echo "Adding '$HOST_LINE' to $HOSTS_FILE"
# /etc/hosts is root-owned, so append through sudo tee rather than a plain >> redirect
echo "$HOST_LINE" | sudo tee -a "$HOSTS_FILE" > /dev/null
cat "$HOSTS_FILE"

sudo apt-get update -y && sudo apt-get upgrade -y
sudo apt-get install -y unzip build-essential libssl-dev libffi-dev python-dev libsasl2-dev python-pandas python-pip
sudo apt-get update
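# postgresql-9.6 may not be in the stock Ubuntu 16.04 repositories (which ship 9.5);
# if this fails, add the PostgreSQL apt repository first
sudo apt-get install -y postgresql-9.6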

echo "(Edit File) sudo nano /etc/postgresql/9.6/main/pg_hba.conf"
echo "Since new install, comment out or remove all lines within the file."
echo "And replace them with:"
echo "# TYPE  DATABASE        USER            ADDRESS                 METHOD"
echo "local   all             postgres                                peer"
echo "local   all             all                                     peer"
echo "host    all             all             127.0.0.1/32            md5"
echo "host    all             all             ::1/128                 md5"
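
The hosts-file step at the top of Part 1 appends a new line every time it runs. As a rough sketch, it could be wrapped in two small functions so that a rerun does not add a duplicate. The names ec2_host_line and add_host_line are made up for illustration and are not in the script linked from this post.

```shell
#!/usr/bin/env bash
# Sketch only: ec2_host_line and add_host_line are hypothetical helper names.

# Turn an IPv4 address into the "IP ip-a-b-c-d" line used in Part 1.
ec2_host_line() {
    local ip
    ip="$1"
    printf '%s ip-%s\n' "$ip" "$(printf '%s' "$ip" | tr . -)"
}

# Append the line to the given hosts file only if it is not already there.
add_host_line() {
    local line
    local file
    line="$1"
    file="$2"
    # -x: match the whole line, -F: treat the line as a fixed string
    if ! grep -qxF "$line" "$file"; then
        echo "$line" >> "$file"
    fi
}
```

Against the real /etc/hosts the second function has to run as root, so it would need to be called from a root shell rather than with a plain sudo prefix.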

Part 2:

Now actually edit the pg_hba.conf file as described in the last section of the script above.
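
For anyone who would rather script this step than open an editor, here is a minimal sketch that backs up the existing file and writes the rules listed at the end of Part 1. It is not part of the linked script; write_pg_hba is a hypothetical helper name, and the target path is passed in as an argument so nothing is assumed about where the file lives.

```shell
#!/usr/bin/env bash
# Sketch only: write_pg_hba is a hypothetical helper, not part of the original script.
# Usage: write_pg_hba /path/to/pg_hba.conf
write_pg_hba() {
    local target
    target="$1"
    # Keep a copy of the existing file, if any, before overwriting it.
    if [ -f "$target" ]; then
        cp "$target" "$target.bak"
    fi
    # Quoted heredoc delimiter: the contents are written literally, with no expansion.
    cat > "$target" <<'EOF'
# TYPE  DATABASE        USER            ADDRESS                 METHOD
local   all             postgres                                peer
local   all             all                                     peer
host    all             all             127.0.0.1/32            md5
host    all             all             ::1/128                 md5
EOF
}
```

To use it on the real file it would have to run from a root shell with /etc/postgresql/9.6/main/pg_hba.conf as the argument, followed by sudo service postgresql restart so that Postgres loads the new rules.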

Part 3:

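# restart (rather than start) so the edited pg_hba.conf is loaded even if Postgres is already running
sudo service postgresql restart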

# upgrade pip itself
sudo pip install --upgrade pip

# added to overcome a potential error
sudo pip install cryptography

### if virtualenv is needed
#sudo pip install virtualenv virtualenvwrapper
#mkvirtualenv airflow
#workon airflow

export AIRFLOW_HOME=~/airflow

sudo pip install airflow

### if error "error trying to exec 'as': execvp: No such file or directory" ###
# apt-get install binutils
# apt-get install gcc
# apt-get install build-essential
# pip install pandas
### and retry pip install airflow
#
### If the problem persists, uninstall the packages listed above and reinstall. Then rerun.

# added because of "ImportError: cannot import name HiveOperator"
# extras are quoted so the shell does not treat the brackets as a glob pattern
sudo pip install "airflow[hive]"

sudo pip install "airflow[crypto]"
sudo pip install "airflow[postgres]"
sudo pip install "airflow[celery]"
sudo pip install "airflow[rabbitmq]"

airflow initdb
airflow webserver

# The following tutorial worked at this point
# https://airflow.incubator.apache.org/tutorial.html
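
pip accepts several extras in one requirement, separated by commas, so the separate extras installs in Part 3 could probably be collapsed into a single command. A small sketch under that assumption follows. airflow_requirement is a hypothetical helper name, and all it does is build the requirement string.

```shell
#!/usr/bin/env bash
# Sketch only: airflow_requirement is a hypothetical helper, not part of the script.
# Joins the extras passed as arguments into one requirement string, e.g. airflow[a,b].
airflow_requirement() {
    # With IFS set to a comma, "$*" expands to the arguments joined by commas.
    local IFS=,
    echo "airflow[$*]"
}
```

It could then be used as sudo pip install "$(airflow_requirement hive crypto postgres celery rabbitmq)", with the double quotes keeping the shell from interpreting the brackets.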

 

And here is the actual script (warning: the Postgres pg_hba.conf update is not done automatically in this script):

EC2_U16.04_Install_Airflow.sh

Disclaimer: I provide this information as an example of what can be possible.  Use at your own risk.
