6DOS Helps You Explore Your Personal Network

In this episode of Running in Production, Henry Popp goes over building 6DOS, a platform built with Phoenix and Elixir that helps you explore your personal network. It’s hosted on Google Cloud on a self-managed Kubernetes cluster and has been up and running since September 2019.

Henry went into great detail about the value of using a service-oriented architecture, DDD, event-driven design and running a self-managed Kubernetes cluster. There are a lot of great insights in this episode around general code design and scaling that apply to any web framework.

Topics Include

  • 2:11 – 4 developers are actively working on the project
  • 2:50 – It’s been running in production since September 2019
  • 3:13 – Motivation for using Phoenix and Elixir
  • 4:26 – Henry started using Elixir back in late 2014
  • 5:48 – Ditching Umbrella apps for dedicated services
  • 7:35 – 6DOS is built on 6 independent Git repos with a service-oriented architecture
  • 8:20 – A breakdown of what those 6 repos are and what they do
  • 10:37 – Each service has its own independent database (Postgres, Neo4j, Elasticsearch)
  • 11:21 – Neo4j is a graph database which is a great fit for their main data model
  • 12:55 – How is Elixir support for Neo4j?
  • 13:46 – Each service talks to each other through RabbitMQ events / notifications
  • 15:43 – Walking through the request / response cycle when a visitor hits the site
  • 17:04 – How did you arrive at this service-oriented architecture?
  • 18:33 – It’s easy to get Domain-driven Design (DDD) wrong initially
  • 19:42 – Are Phoenix contexts being used? Nope
  • 20:20 – Monoliths vs microservices vs something in between and industry trends
  • 20:56 – “Instantaneous complexity”
  • 21:39 – Using an app skeleton project to help spin up new services quickly
  • 23:23 – Using VueJS on the front-end with Webpack, but not through Phoenix
  • 24:43 – Currently 6DOS doesn’t need websockets but that could change later
  • 26:47 – Quite a lot of work happens in the background
  • 27:37 – RabbitMQ handles queueing up all of the jobs
  • 29:10 – Docker is being used in production, but not in development (yet)
  • 29:38 – The workflow for starting everything up locally in development
  • 30:52 – Everything is hosted on a self-managed Kubernetes cluster on GCP
  • 31:19 – (3) 2-core master nodes, (3) 2-core worker nodes and extra servers for the databases
  • 32:24 – The self-managed Kubernetes cluster was terrifying to set up initially
  • 34:00 – Kubernetes is not a magic button you press to scale your application
  • 35:15 – Auto-recovering from a CrashLoopBackOff error with Kubernetes
  • 37:45 – Those 2 CPU core servers have 8 GB of RAM but the app isn’t using all of that
  • 38:47 – Handling an interesting auto-scaling problem with Kubernetes
  • 40:20 – Performing rolling restarts so there’s no downtime for each deploy
  • 40:41 – Dealing with restarting containers while an important action is happening
  • 43:23 – Walking through the deploy process from development to production
  • 43:34 – It starts with a self-hosted GitLab instance with automated CI
  • 44:15 – On the other side, Keel takes over to automate deploying any services
  • 45:12 – Helm is being used for a few things, but not everything
  • 46:17 – Humans needing to accept the release happens within Keel’s UI
  • 47:51 – Secrets are stored directly in the self-hosted config repo with strict access rights
  • 49:09 – Balancing your time between low level infrastructure vs app features
  • 49:58 – Handling SSL certificates on the cluster with cert-manager
  • 51:06 – Everything is behind a Cloudflare proxy too
  • 51:20 – Dealing with database migrations when you have automated deployments
  • 52:40 – Migrations get run as part of the app boot up process
  • 54:24 – Design your software like a space ship
  • 55:16 – Diagnosing errors with custom tasks and 3rd party tools
  • 56:23 – No one can agree on how to format API JSON errors
  • 57:32 – Elixir best practices are still being discovered, the future is bright
  • 58:19 – An example of one of Henry’s open source Elixir tools (Pigeon) taking off
  • 59:14 – All of the databases get backed up hourly
  • 1:00:26 – Kubernetes really needs to be configured
  • 1:01:16 – Rate limiting is currently being added to all of the services
  • 1:03:07 – What about alerts if something goes down? It’s a digital notification bomb
  • 1:03:36 – Using UptimeRobot as a second sanity check to make sure things are up
  • 1:04:12 – Hung over at 6am out in the middle of the woods and your server goes down
  • 1:04:55 – Using an external tool like UptimeRobot is worth it
  • 1:06:16 – Timber.io is being used for logging but that will change soon
  • 1:07:00 – Kubernetes’ Stern package helps with tailing logs across pods
  • 1:07:44 – Henry isn’t a fan of Kubernetes’ web UI tools to manage the cluster
  • 1:07:54 – Weave Scope was interesting but it used too many resources to run
  • 1:08:41 – Best tips? Don’t be afraid to break your code up into multiple repos
  • 1:09:45 – If you’re not at that point yet, at least look into using contexts and DDD
  • 1:10:24 – Developers as a whole are getting better over time
  • 1:11:56 – Codedge is Henry’s consulting company and they are on GitHub / Twitter too

Support the Show

This episode does not have a sponsor and this podcast is a labor of love. If you want to support the show, the best way to do it is to purchase one of my courses or suggest one to a friend.

  • Dive into Docker is a video course that takes you from not knowing what Docker is to being able to confidently use Docker and Docker Compose for your own apps. Long gone are the days of "but it works on my machine!". A bunch of follow-along labs are included.
  • Build a SAAS App with Flask is a video course where we build a real-world SAAS app that accepts payments, has a custom admin, includes high test coverage and goes over how to implement and apply 50+ common web app features. There are 20+ hours of video.

Questions

Feb 17, 2020
