Monday, April 10, 2017

GitLab - Omnibus Edition

In an attempt to look for a build/deployment pipeline at home, I came upon GitLab, a source control and continuous integration/deployment system. This really appealed to me because my entire home setup was meant to be controlled by automation. GitLab definitely fills a gap, but I ran into a lot of issues, and still have some that are plaguing me.

Let’s get the obvious things out of the way. GitLab is “basically” GitHub and Travis CI smashed into one, with closer integrations with other services as direct plugins rather than external webhooks. You can host it on premise for free, or you can opt for the per-user, per-month Enterprise option, which is similar in cost to GitHub Enterprise.

Let’s talk editions real fast; there are two that are self-hostable: Community Edition and Omnibus. Community Edition is more of the “configure it yourself, here are all the components, have fun!” option, whereas Omnibus is a single Docker container you spin up and use. Omnibus was super appealing to me, but after trying it out in my sandbox, it was TOTALLY a pain in the ass to keep running… damn you, unicorn 502 errors…

I use Nomad to schedule everything in my cluster: create a job, nomad run job.nomad, and my life is done. Want to update? Edit the job file and run it again; if something needs to change, it will, and if not, it will just continue running. The Omnibus edition seemed like the perfect fit, because it was just a Docker container. “Just” should always be quoted here, because there are tons of things running inside this container. Out of the box, without any customization, the container is running roughly 78 processes. Granted, the NGINX server’s child workers (7) and the Postgres children (16) take up a bit of them, and 35/32 are just Ubuntu things in the background doing Ubuntu things.

One of the core concepts in Nomad is bin packing, meaning: put things where they fit. For that to work, you need to declare what each job “needs”. This is where most of my issues with Omnibus came in. I started, like a naive fool, by allocating 500 MHz of CPU and 512 MB of memory, my “bench” for most sandboxed applications, since the host “server” (i.e. old laptop) only has 3 GB total. After quite a long time, the system came online, I hit the website, and almost every other page hit was a 502. Digging through the logs for the allocation in Hashi-UI, I found this constantly in the unicorn_stderr log:
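To make the bin-packing idea concrete, here is a minimal Python sketch. It is not Nomad's scheduler: the Node and Job classes, the place function and the best-fit scoring are hypothetical names and choices made up for illustration. The memory figures match the 3 GB host and the 500 MHz / 512 MB ask above; the 2000 MHz CPU capacity is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Job:
    name: str
    cpu_mhz: int
    memory_mb: int


@dataclass
class Node:
    name: str
    cpu_mhz: int
    memory_mb: int
    jobs: List[Job] = field(default_factory=list)

    def free_cpu(self) -> int:
        return self.cpu_mhz - sum(j.cpu_mhz for j in self.jobs)

    def free_memory(self) -> int:
        return self.memory_mb - sum(j.memory_mb for j in self.jobs)

    def fits(self, job: Job) -> bool:
        # A job fits only if BOTH declared resources are available.
        return job.cpu_mhz <= self.free_cpu() and job.memory_mb <= self.free_memory()


def place(job: Job, nodes: List[Node]) -> Optional[Node]:
    """Best-fit placement: choose the node left with the least free memory."""
    candidates = [n for n in nodes if n.fits(job)]
    if not candidates:
        return None  # nothing can hold the job as declared
    best = min(candidates, key=lambda n: n.free_memory() - job.memory_mb)
    best.jobs.append(job)
    return best


# Hypothetical single-node "cluster": the old laptop (CPU capacity assumed).
laptop = Node("old-laptop", cpu_mhz=2000, memory_mb=3072)
first = place(Job("gitlab", 500, 512), [laptop])
second = place(Job("gitlab-bigger", 500, 1536), [laptop])
third = place(Job("too-big", 500, 2048), [laptop])  # no room left for this one
```

The scheduler in this sketch only ever sees the declared numbers, never what the job really uses, which is exactly the trap: an ask that is too small still gets placed happily.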

E, [2017-04-09T16:24:26.245792 #451] ERROR -- : reaped #<Process::Status: pid 27913 SIGKILL (signal 9)> worker=2

This has a lot to do with their OOM killer implementation, which is by design. Unfortunately, even after allocating 1.5 GB of memory, I still get 502 responses regularly. I have even seen reports of 3 GB having issues with a brand new install, so be aware: GitLab is memory hungry. I think a lot of it has to do with the master/worker configurations for so many different services on the same box, and with one of those services being a database, which (while tuned to only use 25% of memory) still causes issues. That 25% might also be another one of my problems: judging by the Monitoring page in the GitLab admin area (assuming this is what the bundled Prometheus is providing), the container still sees the entire host OS memory. If Postgres is configured to use 25% of that, it is sized against the host’s 3 GB (768 MB), not against the 1.5 GB I actually allocated.
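The mismatch can be put in numbers. The sketch below assumes, as described above, that Postgres is sized at 25% of whatever memory the container can see; the function name and constants are illustrative and are not read from GitLab's configuration.

```python
def postgres_share_mb(visible_memory_mb: int, fraction: float = 0.25) -> float:
    """Memory Postgres would be tuned for, given the memory it can see."""
    return visible_memory_mb * fraction


HOST_MB = 3 * 1024      # what the container sees: the whole host
ALLOCATION_MB = 1536    # what the Nomad job was actually given

tuned_for = postgres_share_mb(HOST_MB)        # sized against the host
intended = postgres_share_mb(ALLOCATION_MB)   # 25% of the real budget
share_of_allocation = tuned_for / ALLOCATION_MB
```

Under these assumptions the database alone is tuned for half of the allocation, twice what a 25% rule was meant to give it.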

So, at this point, I had a mostly working system. Granted, I would still get a lot of 502 errors, but they were no longer constant after increasing the memory. I created a new group (org) and project (repository). I really like that there are “context aware” UI hints. For example, there is an “Add License” button, and once the project has a license, the button shows the license type instead; same thing with the contributing guide, changelog, and CI settings. For most of these there is a template system for kicking off a new one out of the box. I am almost 100% sure you can customize these, probably by dropping the templates on the server, but this is a container, and it’s ephemeral, so… not sure.

Getting the CI running was… interesting. Nowhere in the UI did it mention that I needed a Runner, until the Pipelines literally just sat there and spun forever in “waiting”. I think this has a lot to do with the pull vs. push model that the system uses. Each Runner (builder) polls the server looking for things to do; when it finds something, it does it and reports back, most likely over a websocket, to keep the streaming events going. Starting up the runner was actually a no-brainer. The only annoying part was that the instructions have you exec into a container to configure it via the command line. The runner, like a lot of everything else, needs to accept more of this through ENV or ARGS on the container. I mounted a volume just so I didn’t have to keep re-running the configuration to set the URL and token.
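A pull model along these lines can be sketched in a few lines of Python. This is a toy simulation, not GitLab's runner protocol: Server, request_job, report and runner_poll_once are hypothetical names, and an in-memory queue stands in for the network.

```python
from collections import deque
from typing import Deque, Dict, Optional


class Server:
    """Holds pending jobs; it never contacts a runner on its own."""

    def __init__(self) -> None:
        self.pending: Deque[str] = deque()
        self.status: Dict[str, str] = {}

    def enqueue(self, job: str) -> None:
        self.pending.append(job)
        self.status[job] = "waiting"

    def request_job(self) -> Optional[str]:
        # Called BY the runner: hand out the oldest pending job, if any.
        if not self.pending:
            return None
        job = self.pending.popleft()
        self.status[job] = "running"
        return job

    def report(self, job: str, result: str) -> None:
        self.status[job] = result


def runner_poll_once(server: Server) -> bool:
    """One poll cycle: ask for work, do it, report back."""
    job = server.request_job()
    if job is None:
        return False
    # A real runner would execute the build here.
    server.report(job, "success")
    return True


server = Server()
server.enqueue("build")
before = server.status["build"]       # no runner has polled yet
did_work = runner_poll_once(server)
after = server.status["build"]
idle = runner_poll_once(server)       # queue is empty now
```

With no runner polling, the job in this sketch stays in “waiting” indefinitely, which is the same symptom as the pipeline that just sat there spinning.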

This URL thing… Let me start off really fast and say this: if you are using Omnibus over HTTP, you just lost port 80 on your container host, because the runners will NOT work otherwise. Dynamic ports don’t work at all. Here is the reason: when the container spins up and external_url is not set, it will use the hostname, which in most cases is garbage, because it’s a hash in Docker. If external_url is set and includes a port, internal systems will try to bind to this port. If you “accidentally” choose the same port for your container port mapping as they did internally, some of the internals may fail to start up because the port is in use… So don’t pick 8080, because… 1hr punching in face. I gave up and hosted it on port 80, because the builder won’t use the URL you configured during jobs, even though you went through a configuration process to tell it where the server was and it had to handshake and say hello; it will use the external_url provided to GitLab. That is as confusing as dressing a squirrel up as a chipmunk.
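A quick way to sanity-check a URL before handing it to the container is to pull out its port and compare it with ports you believe are already taken inside. The sketch uses Python's standard urllib.parse. The INTERNAL_PORTS set is an assumption (8080 is the one called out above; the real list depends on the GitLab version), the hostname is a placeholder, and effective_port and port_clashes are made-up helper names.

```python
from urllib.parse import urlsplit

# ASSUMPTION: ports already bound inside the container; adjust for your version.
INTERNAL_PORTS = {8080}

DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(external_url: str) -> int:
    """Port implied by the URL: explicit if given, else the scheme default."""
    parts = urlsplit(external_url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


def port_clashes(external_url: str) -> bool:
    """True if the URL's port collides with an assumed internal port."""
    return effective_port(external_url) in INTERNAL_PORTS


# Placeholder hostname, purely for illustration.
plain = effective_port("http://gitlab.example.local")
custom = effective_port("http://gitlab.example.local:8080")
clash = port_clashes("http://gitlab.example.local:8080")
safe = port_clashes("http://gitlab.example.local")
```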

FINALLY, after some points of frustration in getting things set up, I had a system that was capable of doing version control (easy revert), CI (make sure my shit works), and CD* (it can run commands… using the code). What was really nice is that my sample CI used Docker in Docker (docker:dind) to build/run a container within a container, by being started by another container… OK, in case you didn’t get it, this system relies pretty heavily on containers, and the “deployment” is pretty much based around Kubernetes/OpenShift. I would love to see more integrations, but at this point, my CD is running nomad.

I hit my final hurdle when setting up the “integrated registry” feature. It turns out that by integrated, they mean integrated with projects/builds, not that they ship one, so I still need to go and set up a Docker Registry before I can configure GitLab to use it. My ideal scenario is to edit a Dockerfile, have the system build and publish it, and then run a nomad job to update it on my servers.
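That ideal flow is three commands in a row. Here is a rough Python sketch that builds the command lines and only runs them on request. The registry host, image name and job file are placeholders, pipeline_commands and run_pipeline are hypothetical helpers, and nomad run is the same invocation used earlier in this post.

```python
import subprocess
from typing import List


def pipeline_commands(image: str, context_dir: str, job_file: str) -> List[List[str]]:
    """Build, publish, deploy: the three steps as argument lists."""
    return [
        ["docker", "build", "-t", image, context_dir],
        ["docker", "push", image],
        ["nomad", "run", job_file],
    ]


def run_pipeline(commands: List[List[str]]) -> None:
    """Run each step in order; stop at the first one that fails."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


# Placeholder values; nothing is executed until run_pipeline(commands) is called.
commands = pipeline_commands(
    image="registry.example.local:5000/myapp:latest",
    context_dir=".",
    job_file="job.nomad",
)
```

Keeping the commands as data makes it easy to print them first and see what a CI job would do before letting it touch the servers.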

Ending thoughts: don’t containerize GitLab, and if you do, use more than one container. I do get the “put all the stuff in a single thing” idea, but this was just not super suitable. If only there was a common way to describe a Docker cluster without getting too specific (docker-compose, k8s, swarm, etc…).

Until next time, I will be banging my head on hosting a cert authority and internal certificates for SSL (damn you docker registry for requiring this….)
