
Serve large dataset w/ Docker, nginx, & django

Asked: 2016-11-08T09:57:35 · Author: Alex Hall


I am working on a research project that involves large video datasets (100s of GB, possibly multiple TB in the near future). I am fairly new to Linux, sysadmin work, and setting up servers, so please bear with me. I've provided quite a bit of info below; let me know if there is anything else that would be helpful.

I am using Ubuntu, Docker (w/ docker-compose), nginx, Python 3.5 & Django 1.10.

Uploading a large-ish (60GB) dataset leads to the following error:

$ sudo docker-compose build
postgres uses an image, skipping
Building django
Step 1 : FROM python:3.5-onbuild
# Executing 3 build triggers...
Step 1 : COPY requirements.txt /usr/src/app/
 ---> Using cache
Step 1 : RUN pip install --no-cache-dir -r requirements.txt
 ---> Using cache
Step 1 : COPY . /usr/src/app
ERROR: Service 'django' failed to build: Error processing tar file(exit status 1): write /usr/src/app/media/packages/video_3/video/video_3.mkv: no space left on device

My files are on a drive with 500GB free, and the current dataset is only ~60GB.
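
From what I can tell, the "no space left on device" happens while the onbuild image's COPY . /usr/src/app trigger copies the entire build context (including media/) into the image, so the free space that matters is wherever Docker stores its image layers (typically /var/lib/docker), not the drive holding my project. A minimal sketch to check this, assuming the default data root:

# Which storage driver is in use, and where does Docker keep its data?
docker info | grep -E 'Storage Driver|Docker Root Dir'

# How much space is free on the filesystem backing that directory?
df -h /var/lib/docker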

I found this discussion on container size. Perhaps I am misunderstanding Docker, but I believe I just want my volumes to be larger, not the containers themselves, so this doesn't seem appropriate. It also doesn't use docker-compose, so I'm unclear how to implement it in my current setup.
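
If the real issue is that COPY . bakes the whole dataset into the image, one approach (a sketch, not something I've verified) would be a .dockerignore file next to film_web's Dockerfile, so the build context never includes the bulky directories:

# .dockerignore (sketch) — keep large data out of the build context
# so the onbuild COPY step doesn't copy it into the image
media
static

The data would then only reach the containers through the volume mounts in docker-compose.yml.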

Just to be clear: with help from this question, I am able to serve static files & media files with a small test set of data. (It's unclear to me whether they're served from the django container or the nginx container, as the data appears in both containers when I ssh in.)
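
A sketch of how I check which container actually has the files (the container names are my guess at docker-compose's default project_service_1 naming):

# Does each container see the media directory?
docker exec filmweb_nginx_1 ls /usr/src/app/film_grammar/media
docker exec filmweb_django_1 ls /usr/src/app/film_grammar/media

# A HEAD request against a media URL (path taken from the error above);
# if nginx answers it directly, the /media location block is doing the serving
curl -I http://localhost/media/packages/video_3/video/video_3.mkv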

How can I get my setup to handle this large amount of data? I would like to be able to upload additional data later, so if a solution exists that can do this without having to rebuild volumes all the time, that'd be swell.
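
For example, what I imagine wanting is a plain host bind mount for the data, so that dropping new files onto the drive makes them visible without any rebuild. A sketch against my compose file below (/mnt/data is a hypothetical mount point for the big drive):

nginx:
  build: ./nginx
  volumes:
    # bind-mount the data drive directly; new files appear without a rebuild
    - /mnt/data/media:/usr/src/app/film_grammar/media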

My Setup

Directory Structure

film_web
├── docker-compose.yml
├── Dockerfile
├── film_grammar
│   ├── #django code lives here
├── gunicorn_conf.py
├── media
│   ├── #media files live here
├── nginx
│   ├── Dockerfile
│   └── nginx.conf
├── requirements.txt
└── static
    ├── #static files live here

docker-compose.yml

nginx:
  build: ./nginx
  volumes:
    - ./media:/usr/src/app/film_grammar/media
    - ./static:/usr/src/app/film_grammar/static
  links:
    - django
  ports:
    - "80:80"
  volumes_from:
    - django

django:
  build: .
  volumes:
    - ./film_grammar:/usr/src/app/film_grammar
  expose:
    - "8000"
  links:
    - postgres

postgres:
  image: postgres:9.3

film_web Dockerfile

FROM python:3.5-onbuild
ENV DJANGO_CONFIGURATION Docker
CMD ["gunicorn", "-c", "gunicorn_conf.py", "--chdir", "film_grammar", "fg.wsgi:application", "--reload"]

VOLUME /home/alexhall/www/film_web/static
VOLUME /home/alexhall/www/film_web/media
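
(I'm not sure these VOLUME lines are right. My understanding is that VOLUME in a Dockerfile names paths inside the container, with host directories attached at run time by docker-compose, so perhaps they should look more like this sketch:)

# VOLUME declares mount points *inside* the container;
# host paths are bound at run time via docker-compose volumes, not here
VOLUME /usr/src/app/film_grammar/static
VOLUME /usr/src/app/film_grammar/media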

nginx Dockerfile:

FROM nginx
COPY nginx.conf /etc/nginx/nginx.conf

nginx.conf

worker_processes 1;

events {
    worker_connections   1024;
}

http {
    include /etc/nginx/mime.types;
    server {
        listen 80;
        server_name film_grammar_server;

        access_log /dev/stdout;
        error_log /dev/stdout info;

        location /static {
            alias /usr/src/app/film_grammar/static/;
        }

        location /media {
            alias /usr/src/app/film_grammar/media/;
        }


        location / {
            proxy_pass http://django:8000;
            proxy_set_header   Host $host;
            proxy_set_header   X-Real-IP $remote_addr;
            proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header   X-Forwarded-Host $server_name;
        }
    }
}
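
In case it matters for files this size, my understanding is that nginx has a few directives for large transfers; a sketch of what I might add to the http/server blocks above (values are guesses, not tested):

http {
    # stream large static files efficiently from the kernel
    sendfile on;
    tcp_nopush on;

    server {
        # raise the upload limit if files come in through the proxy
        # (nginx's default client_max_body_size is only 1m)
        client_max_body_size 100g;

        # don't buffer huge proxied responses to temp files on disk
        proxy_max_temp_file_size 0;
    }
}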

Thanks in advance for your help!

Author: Alex Hall. Reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article: https://stackoverflow.com/questions/40477768/serve-large-dataset-w-docker-nginx-django