Guide for Deploy – Django Python 3

crianças fazendo deploy

There is a lot of tutorials out there, especially in English. Here it goes another one. I wrote it originally in Portuguese.

The reason many people has problems deploying is that they don’t pay enough attention to details. Deploying is easy when you are familiarized with all parts involved. You must know how to authenticate through ssh, be used to command line and Linux, understand how to configure and set up your project, have an idea of what serving static files is, what is Gunicorn… Ok, it’s not that simple. That’s why there is a lot of deploy tools, kits, and tutorials. Currently, with Ansible, Docker and whatever kids are using these days it’s easier to deploy, but what happens under the hood gets more abstract.

Maybe in a couple of years, this post is going to be obsolete if it’s not already with serverless and everything else. Anyway, just a few people want to learn how to deploy Django as I’ll show here, but if it helps at least one person, I’ll be satisfied.

Enjoy this Old-Style guide!

The Server

I presume you don’t have a server or AWS account, DigitalOcean, Linode… Nothing! You have to create an account in one of them and launch a server with the distro you want. If it’s your first time, don’t go with AWS because it’s way more complicated than the others.

In this tutorial, I’m using an Ubuntu 16.04, the most common distro you’ll see around. You can also pick a Debian if you like.

Initial Set Up

Configure server timezone

sudo locale-gen --no-purge --lang pt_BR  # I'm using pt_BR, because HUE HUE BR BR
sudo dpkg-reconfigure tzdata

Update and upgrade OS Packages:

sudo apt-get update 
sudo apt-get -y upgrade

Installing Python 3.6 over Python 3.5

Replace Python 3.5 which is default on our distro with Python 3.6.

sudo apt-get update
sudo add-apt-repository ppa:jonathonf/python-3.6
sudo apt-get install python3.6
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.5 1
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.6 2

You can choose which Python version the OS will call when you type python3.

sudo update-alternatives --config python3

Having trouble, take a look here:

How to Install Python 3.6.1 in Ubuntu 16.04 LTS

Install OS requirements

sudo apt-get install python3-pip nginx supervisor git git-core libpq-dev python-dev 
python-virtualenv

If your project has more OS requirements, install them as well.

VirtualEnvWrapper for Python3

I’m a fan of VirtualEnvWrapper. It’s super easy and creates all my virtual environments in the same place. That’s a personal choice, if you don’t like it, use what you know how to use.

First, you install virtualenvwrapper, and then define where to put your virtualenvs. (WORKON_HOME).

If you need to use it with multiple Python versions, you must define VIRTUALENVWRAPPER_PYTHON. Here I’m using always with python3. It’s not a problem since you can create a virtualenv pointing which Python that env will use.

sudo pip3 install virtualenvwrapper
echo 'export WORKON_HOME=~/Envs' >> ~/.bashrc
echo ‘export VIRTUALENVWRAPPER_PYTHON=`which python3`’ >> ~/.bashrc
echo 'source /usr/local/bin/virtualenvwrapper.sh' >> ~/.bashrc
source ~/.bashrc

Now, create your virtualenv and define what Python is going to use.

mkvirtualenv name_venv --python=python3

VirtualEnvWrapper is really easy to use. If you want to activate a virtual env, you can use workon.

workon name_venv

To deactivate this virtualenv:

deactivate

To remove a virtualenv:

rmvirtualenv name_venv

Generate SSH for GitHub Authentication

You don’t want (neither should) write your password to git pull your project on the server.

Generating SSH Keys:

cd ~/.ssh
ssh-keygen -t rsa -b 4096 -C "[email protected]"
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa

See and copy the content of your public key (id_rsa.pub)

cat ~/.ssh/id_rsa.pub

Then sign in your GitHub account and go to Settings > SSH and GPG Keys. Click on New SSH Key, give it a name, like (“test server keys”) and in Key paste the content of your  id_rsa.pub

Clone your Django Project

Copy the SSH link from GitHub to clone your project. In this case, I’m using a project that I just have found as an example.

git clone [email protected]:kirpit/django-sample-app.git

In the project folder, install the project requirements.

Remember that you have to be in your virtual environment

cd django-sample-app/
pip install -r requirements.txt

Now, make the necessary alterations for your deploy, such as create a settings_local.py file, change database settings or anything specific to your project.

After you’re done, run your migrations and collect your static files (if you’re using it).

python manage.py migrate
python manage.py collectstatic

Configuring NGINX

Nginx, like Apache, is an entirely separate world. Right now, you just need the basics.

/etc/nginx/sites-available/ is a directory where you have to put the config files of available sites. There is another directory, /etc/nginx/sites-enabled/ that shows which sites are enabled. They are the same thing, but what is put on enabled will be served by Nginx.

It’s usual to create your config file on sites-available and create just a symlink to sites-enabled.

First of all, I’ll remove the default site from Nginx.

sudo rm /etc/nginx/sites-enabled/default

Now, create the config file for your site. (If you don’t know how to use VIM, use nano instead of vi)

sudo vi /etc/nginx/sites-available/mysite

Past this on your file, changing the necessary paths:

server {
 listen 80;
 access_log /home/username/logs/access.log;
 error_log /home/username/logs/error.log;

 server_name nome-site.com.br;

 location / {
 proxy_pass http://127.0.0.1:8000; 

 proxy_pass_header Server;
 proxy_set_header X-Forwarded-Host $server_name;
 proxy_set_header X-Real-IP $remote_addr;
 proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
 proxy_set_header Host $http_host;


 }

 location /static {

   alias /home/username/project_path/static/;

 }

And create a symlink to sites-enabled:

sudo ln -s /etc/nginx/sites-available/mysite /etc/nginx/sites-enabled/mysite

Restart Nginx:

sudo service nginx restart

Ok, if you made it till here, if you access your website you will see a 502 Bad Gateway from Nginx. That’s because it’s nothing here: http://127.0.0.1:8000

Now, configure the website to run on 8000 port.

Configuring Gunicorn

Are you guys alive? Don’t give up, we’re almost there.

In your virtualenv (remember workon name_env?) install Gunicorn

pip install gunicorn

In your project’s directory, make a gunicorn_conf file:

bind = "127.0.0.1:8000"
logfile = "/home/username/logs/gunicorn.log"
workers = 3

Now, if you run Gunicorn you will see your website working!

/home/username/Envs/name_venv/bin/gunicorn project.wsgi:application -c gunicorn_conf

But what are you going to do? Run this command inside a screen and walk away? Of course not! You’ll use Supervisord to control Gunicorn.

Configuring Supervisor

Now create a gunicorn.conf:

sudo vi /etc/supervisor/conf.d/gunicorn.conf

That’s the content:

[program:gunicorn]
command=/home/username/Envs/name_venv/bin/gunicorn project.wsgi:application -c /home/username/project/project_django/gunicorn_conf
directory=/home/username/project/project-django
user=username
autostart=true
autorestart=true
redirect_stderr=true

And now, you just tell Supervisor that there is a new process in town and Supervisord will take care of it:

sudo supervisorctl reread
sudo supervisorctl update
sudo supervisorctl restart gunicorn

And voilá! A new running you will have.

Conclusion

There is a lot of things involved in a deploy process. You have to configure a firewall, probably you’ll have to serve more than one static folder, etc, etc… But you have to start somewhere.

I can’t believe I wrote a whole post without using any GIF. So, just to finish, pay attention to all paths I’ve used here.

Oops..

Using celery with multiple queues, retries and scheduled tasks

On this post, I’ll show how to work with multiple queues, scheduled tasks, and retry when something goes wrong.

If you don’t know how to use celery, read this post first: https://fernandofreitasalves.com/executing-time-consuming-tasks-asynchronously-with-django-and-celery/

Retrying a task

Let’s say your task depends on an external API or connects to another web service and for any reason, it’s raising a ConnectionError, for instance. It’s plausible to think that after a few seconds the API, web service, or anything you are using may be back on track and working again. In this cases, you may want to catch an exception and retry your task.


from celery import shared_task
 
@shared_task(bind=True, max_retries=3)  # you can determine the max_retries here
def access_awful_system(self, my_obj_id):
    from core.models import Object
    from requests import ConnectionError
 
    o = Object.objects.get(pk=my_obj_id)
 
    # If ConnectionError try again in 180 seconds
    try:
 
        o.access_awful_system()
  
    except ConnectionError as exc:
        self.retry(exc=exc, countdown=180)  # the task goes back to the queue

The self.retry inside a function is what’s interesting here. That’s possible thanks to bind=True on the shared_task decorator. It turns our function access_awful_system into a method of Task class. And it forced us to use self as the first argument of the function too.

Another nice way to retry a function is using exponential backoff:

self.retry(exc=exc, countdown=2 ** self.request.retries)

ETA – Scheduling a task for later

Now, imagine that your application has to call an asynchronous task, but need to wait one hour until running it.

In this case, we just need to call the task using the ETA(estimated time of arrival)  property and it means your task will be executed any time after ETA. To be precise not exactly in ETA time because it will depend if there are workers available at that time. If you want to schedule tasks exactly as you do in crontab, you may want to take a look at CeleryBeat).


from django.utils import timezone
from datetime import timedelta

now = timezone.now() 
 
# later is one hour from now
later = now + timedelta(hours=1)

access_awful_system.apply_async((object_id), eta=later) 

Using more queues

When you execute celery, it creates a queue on your broker (in the last blog post it was RabbitMQ). If you have a few asynchronous tasks and you use just the celery default queue, all tasks will be going to the same queue.

Suppose that we have another task called too_long_task and one more called quick_task and imagine that we have one single queue and four workers.

In that scenario, imagine if the producer sends ten messages to the queue to be executed by too_long_task and right after that, it produces ten more messages to quick_task. What is going to happen? All your workers may be occupied executing too_long_task that went first on the queue and you don’t have workers on quick_task.

The solution for this is routing each task using named queues.


# CELERY ROUTES
CELERY_ROUTES = {
    'core.tasks.too_long_task': {'queue': 'too_long_queue'},
    'core.tasks.quick_task': {'queue': 'quick_queue'},
}

Now we can split the workers, determining which queue they will be consuming.


# For too long queue
celery --app=proj_name worker -Q too_long_queue -c 2

# For quick queue
celery --app=proj_name worker -Q quick_queue -c 2

I’m using 2 workers for each queue, but it depends on your system.

As, in the last post, you may want to run it on Supervisord

There is a lot of interesting things to do with your workers here.

Calling Sequential Tasks

Another common issue is having to call two asynchronous tasks one after the other. It can happen in a lot of scenarios, e.g. if the second tasks use the first task as a parameter.

You can use chain to do that


from celery import chain
from tasks import first_task, second_task
 
chain(first_task.s(meu_objeto_id) | second_task.s())

The chain is a task too, so you can use parameters on apply_async, for instance, using an ETA:


chain(salvar_dados.s(meu_objeto_id) | trabalhar_dados.s()).apply_async(eta=depois)

Ignoring the results from ResultBackend

If you just use tasks to execute something that doesn’t need the return from the task you can ignore the results and improve your performance.

If you’re just saving something on your models, you’d like to use this in your settings.py:


CELERY_IGNORE_RESULT = True

 

Sources:

http://docs.celeryproject.org/en/latest/userguide/tasks.html

http://docs.celeryproject.org/en/latest/userguide/optimizing.html#guide-optimizing

https://denibertovic.com/posts/celery-best-practices/

https://news.ycombinator.com/item?id=7909201

http://docs.celeryproject.org/en/latest/userguide/workers.html

http://docs.celeryproject.org/en/latest/userguide/canvas.html

 

Super Bônus

Celery Messaging at Scale at Instagram – Pycon 2013

Creating and populating a non-nullable field in Django

Hey, what’s up guys, that’s another quick post! I’ll show you how to create a new non-nullable field in Django and how to populate it using Django migrations.

SAY WUUUUT?????

george costanza taking his glasses off

Here’s the thing, do you know when you have your website in production and everything set in order and then some guy (there’s always some guy) appears with a new must-have mandatory field that nobody, neither the client nor the PO, no one, thought about it? That’s the situation.

But it happens that you use Django Migrations and you want to add those baby fields and run your migrations back and forth, right?

For this example, I decided to use a random project from the web. I chose this Django Polls on Django 1.10.

So, as usual, clone and create your virtual environment.

git clone [email protected]:garmoncheg/django-polls_1.10.git
cd django-polls_1.10/
mkvirtualenv --python=/usr/bin/python3 django-polls 
pip install django==1.10  # Nesse projeto o autor não criou um requirements.txt
python manage.py migrate  # rodando as migrações existentes
python manage.py createsuperuser 
python manage.py runserver

Note: This project has one missing migration, so if you’re following this step-by-step run python manage.py makemigrations to create the migration 0002 (that’s just a minor change on a verbose_name)

Now, access the admin website and add a poll

django polls admin

Alright, you can go to the app and see your poll there, answer it and whatever. Until now we did nothing.

The idea is to create more questions with different pub_dates to get the party started.

After you use your Polls app a little you’ll notice that any poll stay forever on your website, i.e., you never close it.

So, our update on this project will be that: From now on, all polls will have an expiration date. When the user creates a poll, he/she must enter the expiration date. That’s a non-nullable, mandatory field. For the polls that already exists in our database, we will arbitrarily decide they will have a single month to expire from the publication date.

Before migrations exist, it was done through SQL, you had to add a DateField that allowed NULL, then you’d create a query to populate this field and finally another ALTER TABLE to turn that column into a mandatory field. With the migrations, it works in the same way.

So, let’s add the expires_date field to models.py

expires_date = models.DateTimeField('expires at', null=True)

The whole models:

class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField('date published')
    expires_date = models.DateTimeField('expires at', null=True)

    def __str__(self):
        return self.question_text

    def was_published_recently(self):
        now = timezone.now()
        return now - datetime.timedelta(days=1) <= self.pub_date <= now
    was_published_recently.admin_order_field = 'pub_date'
    was_published_recently.boolean = True
    was_published_recently.short_description = 'Published recently?'


It’s time to make migrations:

python manage.py makemigrations

This is going to generate the 0003_question_expires_date migration, like this:

class Migration(migrations.Migration):

    dependencies = [
        ('polls', '0002_auto_20170429_2220'),
    ]

    operations = [
        migrations.AddField(
            model_name='question',
            name='expires_date',
            field=models.DateTimeField(null=True, verbose_name='expires at'),
        ),
    ]

Let’s alter this migration’s code, NO PANIC!

Populating the new field

First of all, create a function to populate the database with the expires dates:

def populate_expires_date(apps, schema_editor):
    """
    Populates expire_date fields for polls already on the database.
    """
    from datetime import timedelta

    db_alias = schema_editor.connection.alias
    Question = apps.get_model('polls', 'Question')

    for row in Question.objects.using(db_alias).filter(expires_date__isnull=True):
        row.expires_date = row.pub_date + timedelta(days=30)
        row.save()

Originally, I’ve used this code in a project with multiple databases, so I needed to use db_alias and I think it’s interesting to show it here.

Inside a migration, you’ll find a operations list. On that list, we’ll add the commands to run our populate_expires_date function and after that, we’ll alter this field to make it non-nullable.

operations = [
    migrations.AddField(
        model_name='question',
        name='expires_date',
        field=models.DateTimeField(null=True, verbose_name='expires at'),
    ),
    migrations.RunPython(populate_expires_date, reverse_code=migrations.RunPython.noop),
    migrations.AlterField(
        model_name='question',
        name='expires_date',
        field=models.DateTimeField(verbose_name='expires at'),
    )
]

You can see that we used migrations.RunPython to run our function during the migration. The reverse_code is for cases of unapplying a migration. In this case, the field didn’t exist before, so we’ll do nothing.

Right after, we’ll add the migration to alter the field and turn null=True. We also could have done that just removing that from the model and running makemigrations again. ( Now, we have to remove it from the model, anyway).

models.py

class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField('date published')
    expires_date = models.DateTimeField('expires at')

    def __str__(self):
        return self.question_text

    def was_published_recently(self):
        now = timezone.now()
        return now - datetime.timedelta(days=1) <= self.pub_date <= now
    was_published_recently.admin_order_field = 'pub_date'
    was_published_recently.boolean = True
    was_published_recently.short_description = 'Published recently?'

And we’re ready to run the migrations

python mange.py migrate

Done! To see this working I’ll add this field to admin.py:

class QuestionAdmin(admin.ModelAdmin):
    fieldsets = [
        (None,               {'fields': ['question_text']}),
        ('Date information', {'fields': ['pub_date', 'expires_date'], 'classes': ['collapse']}),
    ]
    inlines = [ChoiceInline]
    list_display = ('question_text', 'pub_date', 'expires_date', 'was_published_recently')
    list_filter = ['pub_date']
    search_fields = ['question_text']

And voilá, all Questions you had on polls now have an expires_date, mandatory and with 30 days by default for the old ones.

That’s it, the field we wanted! The modified project is here on my GitHub: https://github.com/ffreitasalves/django-polls_1.10

If you like it, share it and leave a comment, if you didn’t, just leave the comment.

XD

 

References:

http://stackoverflow.com/questions/28012340/django-1-7-makemigration-non-nullable-field

http://stackoverflow.com/questions/29217706/django-view-sql-query-without-publishinhg-migrations

https://realpython.com/blog/python/data-migrations/

https://docs.djangoproject.com/en/1.10/howto/writing-migrations/#non-atomic-migrations

Pip Installing a Package From a Private Repository

That’s a python quick tip. It’s very basic, but still very helpful. When your company uses GitHub for private repositories you often want to put them on the requirements.

First of all, remember to add your public key on your GitHub settings.

 

You just have to use like this:

pip install git+ssh://[email protected]/<<your organization>>/<<the project>>[email protected]<< the tag>>

You can even use on your requirements, without the pip install. E.g. if your organization is called Django and your project is called… let’s say… Django and you’d like to add Django 1.11.4 in your requirements you can use like this:

git+ssh://[email protected]/django/[email protected]

Probably you already have a deploy key configured to your server or a machine user, it will work for your private Repos on your server, if you don’t, take a look at this

 

SSH Keys

If you don’t know how to generate your ssh-key, It’s easy:

cd ~/.ssh
ssh-keygen -t rsa -b 4096 -C "[email protected]"
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_rsa

Now, copy the content of your public key (id_rsa.pub).

cat ~/.ssh/id_rsa.pub

In your GitHub account go to  Settings > SSH and GPG Keys and add it.