Homelab 2021 - Data Ownership

2021 is the start of a new year, and with it comes a new chapter of data ownership for me. Like many, I have relied almost exclusively on cloud-based systems (some paid, some free) for email, file storage, photos, backups, and more. This strategy has served me well over the years - and many of the products I use will remain in use - but the reality is that we are giving vast troves of data away to companies and not getting enough in return.

I am fortunate to be in a position where I have both the knowledge and the financial means to tackle this, and late in 2020 I set out to build a homelab/home server to help rebalance my data ownership. It is now early 2021 and the very basics are up and running. Read on to learn more about my initial goals and solutions.

  • Move to paid services that respect my privacy where available
  • Create continuous backups of all cloud data
    • Email accounts
    • Dropbox data
    • Photos (iCloud and Google Photos)
    • Code on Github
  • Ensure I have secure remote access to the onsite data at all times
    • So my files are reachable from anywhere, not just at home

I am still developing these solutions, but so far I have made good progress; below I will share what I have done to get started.

Server Hardware

The first part of the project was purchasing parts for a new home server that was power efficient and had higher storage density than my current Synology DS218+. The Synology has served me well for many years as a small server running Plex and backups via Time Machine and rsync - but I was out of storage with no options for expansion.

My needs have changed over the years, so I decided to build something instead. I ended up with two servers: the first is a storage server, and the second is a small recycled corporate mini PC I call the worker node.

Storage Server

After surveying the landscape I decided on a new machine powered by the following.

  • ASRock Rack D1521D4I motherboard with a Xeon D-1521 processor
  • 64GB RAM
  • 4x 4TB hard drives
    • For the ZFS storage pool
  • 250GB NVMe SSD
    • For the OS and configuration
  • 450W power supply
  • Fractal Design Node 304 case

The storage server is responsible for providing durable storage and backups of all my data in the cloud. To keep this as maintenance-free as possible, I am utilizing Docker for all of the backup jobs and stateful services, which I will go over in a section below. The storage server runs the following major components.

  • Ubuntu 20.04.2 LTS
  • ZFS Filesystem
  • Docker

I have a small Ansible setup that handles basic server configuration such as setting the timezone and locale and installing Docker, which you can find here. I also set up unattended security upgrades so the machine always has the latest security patches, and hardened the SSH config for remote login if needed.
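
My playbook is not reproduced here, but a rough sketch of what it automates on a fresh Ubuntu install looks like this (the timezone and exact package choices are illustrative):

# Basics: timezone and Docker
sudo timedatectl set-timezone America/New_York
sudo apt-get install -y docker.io docker-compose unattended-upgrades

# Enable unattended security upgrades
printf 'APT::Periodic::Update-Package-Lists "1";\nAPT::Periodic::Unattended-Upgrade "1";\n' | \
  sudo tee /etc/apt/apt.conf.d/20auto-upgrades

# Harden SSH: disable password auth and root login
sudo sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
sudo systemctl restart ssh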

I then utilized ZFS to create a storage system that is durable and plenty fast for my needs. I went with a mirrored setup, so expanding the pool means adding two drives at a time. I created a number of datasets with quotas to manage each type of data, and the relevant datasets are shared on the network with Samba where required. Below is an example of the filesystem layout.

# Main storage pool
/zstorage

# Main media storage
/zstorage/music
/zstorage/movies
/zstorage/tv

# TimeMachine - 2TB Quota
/zstorage/timemachine

# Docker - 100GB Quota
# Stores persistent docker container data
# Note this is NOT actual backup data
/zstorage/docker/unifi
/zstorage/docker/plex
/zstorage/docker/mbsync
/zstorage/docker/rclone

# Backups for all of my cloud data
/zstorage/backup/zsiegel/mail
/zstorage/backup/zsiegel/dropbox
/zstorage/backup/zsiegel/photos
/zstorage/backup/zsiegel/github

# ... Many more ...
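
For reference, here is roughly how a pool and datasets like this get created; the device names and exact values below are illustrative, not my actual commands (in practice, prefer /dev/disk/by-id paths over /dev/sdX):

# Two mirrored pairs from the four 4TB drives
sudo zpool create zstorage mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd

# Expanding later means adding another mirrored pair
# sudo zpool add zstorage mirror /dev/sde /dev/sdf

# Datasets with quotas, matching the layout above
sudo zfs create -o quota=2T zstorage/timemachine
sudo zfs create -o quota=100G zstorage/docker
sudo zfs create zstorage/backup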

Worker Node

For the stateless services I went with something cheap and small: a used HP EliteDesk 800 G1 Mini desktop off eBay. Many of these tiny corporate PCs are sold at huge discounts after they are recycled out of corporate use. It came barebones, so I added 16GB of RAM and a small SSD for the OS. This machine is currently my testbed for playing around with Nix and NixOS.

  • HP EliteDesk 800 G1 Mini
  • 16GB RAM
  • 256GB SSD
  • NixOS Stable

I am in the process of setting it up to run all of the stateless services, such as my VPN and a number of DNS services, in Docker containers. The entire configuration is maintained as files stored in Git; you can check out the basics I have so far here.
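
One thing I already enjoy about NixOS is how well it fits this Git-driven workflow: deploying a configuration change is just a pull and a rebuild (this assumes the config is checked out at /etc/nixos):

# Apply the latest configuration from Git
git -C /etc/nixos pull
sudo nixos-rebuild switch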

Docker Containers

To reduce server maintenance I set up Docker and Docker Compose so I could drive most of the configuration and changes declaratively via Git. You can see a number of the configuration files and the corresponding services I am utilizing in my GitHub repo.

Note that this changes fairly frequently and is a bit behind my current system. As I iron out the kinks and bring more of these services online I will update the repo.
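
The day-to-day workflow is simple: every change starts as a commit, and deploying it is a pull followed by letting Compose reconcile the running containers:

# Deploy the latest configuration
git pull
docker-compose pull   # fetch any updated images
docker-compose up -d  # recreate only the services that changed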

Plex

My storage server runs a Plex container that hosts my music, movies, and TV shows. The container has read access to the various media files on the ZFS pool. It has been rock solid and is available outside my network for streaming on my phone or tablet.
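
As a sketch of how a container like this is wired up, here is a run using the official plexinc/pms-docker image (the image choice and timezone are illustrative, not my exact configuration):

docker run -d --name plex --network host \
  -e TZ=America/New_York \
  -v /zstorage/docker/plex:/config \
  -v /zstorage/music:/data/music:ro \
  -v /zstorage/movies:/data/movies:ro \
  -v /zstorage/tv:/data/tv:ro \
  plexinc/pms-docker

The media mounts are read-only, so even a misbehaving container cannot touch the underlying files.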

ddclient

ddclient is a Perl client that can update DNS records at a number of different domain registrars. It ensures my home machines are always reachable remotely if needed. This will move to the worker node in the future because it's a stateless process.
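
A minimal run looks something like this, assuming the linuxserver/ddclient image with a ddclient.conf placed in the mounted config directory (the host path is illustrative):

docker run -d --name ddclient \
  -v /zstorage/docker/ddclient:/config \
  linuxserver/ddclient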

mbsync

mbsync is a command line tool for synchronizing mailboxes. I am currently running this on the storage server, using Jake Wharton's customized container because it has healthcheck integration I appreciate for ensuring my backups are running smoothly. Right now it backs up my old Gmail account.
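
Under the hood it is all driven by an mbsync config; a minimal sketch looks like this (isync 1.4 syntax; the account details and secret path are placeholders, and the exact wiring inside Jake's image differs slightly - see its README):

# .mbsyncrc
IMAPAccount gmail
Host imap.gmail.com
User example@gmail.com
PassCmd "cat /run/secrets/gmail-app-password"
SSLType IMAPS

IMAPStore gmail-remote
Account gmail

MaildirStore gmail-local
Path /zstorage/backup/zsiegel/mail/
Inbox /zstorage/backup/zsiegel/mail/INBOX
SubFolders Verbatim

Channel gmail
Far :gmail-remote:
Near :gmail-local:
Patterns *
Create Near
SyncState *

With that in place, running mbsync -a syncs every channel defined in the file.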

I have recently moved to Fastmail (referral link), which I will write about in a future post. I am also in the process of setting up a Fastmail backup.

cloudflare-ddns

cloudflare-ddns does a similar job to ddclient. In this case it updates my DNS entries in Cloudflare, as I have some domains registered there instead of with Google Domains. This will move to the worker node in the future because it's a stateless process.

docker-rclone

docker-rclone is how I intend to back up my entire Dropbox folder with rclone. I have many important documents and files in Dropbox, and it's imperative that I keep a local copy in case anything goes awry.
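
The core of it is a single rclone command, something like the following (this assumes a Dropbox remote named "dropbox" has been set up with rclone config):

# Mirror Dropbox into the ZFS backup dataset
rclone sync dropbox: /zstorage/backup/zsiegel/dropbox

The container essentially wraps that command in a scheduled job and keeps the rclone config in persistent storage.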

portainer

portainer is a wonderful open source GUI for getting an overview of the running containers on a server. I do not use it often, since most changes are driven by Git, but it's nice to have in place for browsing the state of the containers. I run it on both the storage server and the mini PC.
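
Setup is essentially a one-liner; it just needs access to the Docker socket and a volume for its own data:

docker run -d --name portainer --restart=always \
  -p 9000:9000 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v portainer_data:/data \
  portainer/portainer-ce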

wireguard

wireguard is my newest experiment: WireGuard in a container to augment, and hopefully replace, the VPN configured inside my UniFi router. This will run on the mini PC and provide access to my home network.
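
As a sketch, running the linuxserver/wireguard image in server mode looks roughly like this (the hostname, peer count, and host config path are placeholders):

docker run -d --name wireguard \
  --cap-add=NET_ADMIN --cap-add=SYS_MODULE \
  --sysctl net.ipv4.conf.all.src_valid_mark=1 \
  -e SERVERURL=vpn.example.com \
  -e PEERS=2 \
  -p 51820:51820/udp \
  -v /path/to/wireguard:/config \
  -v /lib/modules:/lib/modules \
  linuxserver/wireguard

Each peer gets a generated client config (including a QR code in the container logs) for connecting from a phone or laptop.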

Closing Thoughts

I look forward to continuing to pursue a higher degree of data ownership and ensuring I have full backups of all my cloud data. If you have other ideas for useful services to run or migrate to, please let me know. I will be publishing more updates about backing up Dropbox and my photos in future posts.

Special thanks to @JakeWharton and @TechnoTimLive for their articles and YouTube videos; I got a ton of inspiration from them.