Let’s Talk Backups and Data Preservation

February 27th, 2016

I’ve produced a lot of data over the years.  Some of this data is super crazy important (we’ve been working on Aztez for over 5 years now–the project must absolutely be immune to data failure).  I’ve refined and clarified my thinking on handling data over the last few years, so I thought I’d do a brain dump here:

Classifying Data

(ravenous data appetite)

I think about my data in roughly three ways:

1) Data I’ve Created

This is the precious stuff! Some of these things might be published, and potentially possible to recover in the case of failure, but most of them aren’t.  If I lose this data, it’s gone, and it’s gone forever.  Examples:

  • Game projects (source code, assets, whole source control archives)
  • Photography (surprisingly large, north of 2TB these days)
  • Misc personal projects (usually small, some with big data sets)

2) Data I’ve Collected

I collect a fair amount of data!  Most of this is just media, but sometimes it might be harder-to-come-by bits (obscure emulation sets, very old games, etc).  The key point, though, is that if I did lose any of this data, I could find it again on the Internet.  I’m not the sole person holding copies of it.

3) Transient Data

Everything else, basically–all the crap pooling up in a downloads directory, or currently-installed games and software.  All of this stuff is easily recoverable.  Note that backup strategies here are usually aimed at easing downtime and continuity in case of hardware failure, rather than preserving data.

Sources of Failure

There are roughly three ways you can lose data:

1) Hardware Failure

Hard drives are mechanical.  All hard drives will eventually fail–it’s just a question of two years or twenty years.  I think drive failure is probably the single biggest source of failure for most people.  They keep a bunch of stuff on a single hard drive, and have for years, and then one day it starts clicking/grinding/gasping and they’re in deep trouble.

2) Software Error (or user error)

Something goes awry and deletes a bunch of data.  Maybe a script accidentally wipes a drive, or you delete the original copy of something and don’t realize it, or you simply make a mistake in moving/organizing/configuring your setup.

Software error can be especially nefarious, since all the hardware redundancy in the world won’t help you if deletion is permanent.

3) Catastrophic Failure

An entire system is destroyed:  Theft, fire, water damage, lightning, malice (digital intruder, CryptoLocker ransomware, etc)…

My Current Setup

Multi-Drive NAS

I keep all data, created and collected, on a multi-hard drive network attached storage system.  At home I have a Synology DS1815+, and the office has a DS1515+.  I wholeheartedly recommend Synology units.  They’re incredibly stable, easy to expand, and have a wealth of useful software available.

They’re also kind of expensive, but if you can afford it they’re absolutely worth it.  (I already wish I had ponied up the additional cash for the DS1815+ at the office, just to have the extra bays for caching or extra storage).

Home NAS

I run both Synology units with two-drive redundancy (the volume can survive two simultaneous drive failures).

Other options, if you wanted to go the DIY route:

  • XPEnology is a community-built version of the DSM software that runs on Synology hardware.
  • unRAID is an easy-to-use, oft-recommended OS (not free, but pretty cheap). Handles uneven drive sizes very well, so it’s a good option if you have a bunch of hardware/drives lying around already.
  • FreeNAS is an open-source NAS solution. Sadly, like a lot of open source software, it’s ugly, complicated, and has a caustic community.  But hey, if your goal is tinkering…

Physical Offsite Backup Cycling

I use 5TB external hard drives for offsite backups.  Each Synology (home/office) has a backup drive connected.  The drives themselves are inaccessible on the network; only the DSM software can touch them.

Nightly backups are performed using DSM 6’s HyperBackup.  This is kind of like Apple’s Time Machine–the backups include versioned files until the disk is full.  Backups are thinned to be recently dense (it keeps daily copies for a few weeks, then weekly copies for a few months, monthly copies after that, etc).

The drives are large enough to hold all created data.  Every week I swap the drives between the two locations.  In an absolute worst-case scenario, if either location is totally destroyed, I have backups from at most a week ago.
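If you’re not on HyperBackup, a “recently dense” thinning policy is easy to approximate with a small script. Here’s a minimal POSIX shell sketch–the date-stamped archive names and the keep-the-first-of-the-month rule are my own illustration of the idea, not what DSM actually does internally:

```shell
# prune_backups DIR CUTOFF: delete archives named backup-YYYYMMDD.tar.gz
# that are older than CUTOFF (a YYYYMMDD date), but always keep the copy
# made on the 1st of each month as a crude "monthly" tier.
prune_backups() {
  dir=$1; cutoff=$2
  for f in "$dir"/backup-*.tar.gz; do
    [ -e "$f" ] || continue
    # pull the 8-digit date stamp out of the filename
    stamp=$(basename "$f" | sed 's/^backup-\([0-9]\{8\}\).*/\1/')
    day=${stamp#??????}   # last two digits = day of month
    if [ "$stamp" -lt "$cutoff" ] && [ "$day" != "01" ]; then
      rm -f "$f"
    fi
  done
}
```

Run it from cron after each nightly backup: everything newer than the cutoff survives, plus one archive per month forever.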

Office setup

(Office setup–Synology and backup drive lower-left)

Digital Offsite Backups

I push nightly backups of important things into a Dropbox folder.  The Synology units sync to Dropbox, which means these backups also end up in the offline external HD backups too.  I don’t place 100% trust in Dropbox (or 100% trust in any one service, really).


Nightly full-database Aztez source control backups for the last 1,703 days?  Sure, why not.

On my home desktop, I run Backblaze for a complete online backup of most files.  Backblaze is great–$5/month per machine for unlimited storage.  I have 2TB+ on Backblaze, which includes all of my personal creative projects like photography.  Their recovery options include sending you a physical HD or thumb drive, too!

Continuity of Infrastructure

Some of my backups are intended to minimize downtime in case of failure:

  • I keep a cloned copy of my desktop boot drive (with Carbon Copy Cloner).
  • Our source control servers are VMs with data stores hosted directly from the Synology units. Even if their host fails completely, I can spin up the VM on a new host in minutes.
  • VMs with local storage maintain nightly snapshot backups with 3-day retention.
  • All of my Digital Ocean VPS instances make nightly database backups and weekly file backups (these get pushed into Dropbox, which then sync into the Synology, and eventually the external HD rotation).
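As a concrete illustration of that last bullet, the per-VPS half can be as small as two crontab entries (the paths and the Dropbox-synced folder name are hypothetical; note that `%` must be backslash-escaped inside crontabs):

```
# m  h dom mon dow  command
15 3   *   *   *   mysqldump --all-databases | gzip > /root/Dropbox/backups/db-$(date +\%F).sql.gz
45 3   *   *   0   tar czf /root/Dropbox/backups/files-$(date +\%F).tar.gz /srv/www
```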

Takeaways and Final Thoughts

Some final thoughts on things!  Also a short list of what to do:

  • Use the “n-1” rule for copies.  If you have one copy of something, you really have zero.  Aim for three copies of any and all bits you’ve created.
  • Invest in a multi-drive storage system.  It’s worth it.
  • Bit rot is real.  Run regular parity checks, or if you have a Synology, run DSM 6 (currently in beta) to utilize BTRFS, a filesystem more resilient to bit rot.
  • For important data, you absolutely must have some kind of versioned backup system that can handle deletes.  Hardware redundancy won’t help you if a file deletion is still a permanent deletion. Maybe this is something fancy like a snapshotted backup program, or maybe it’s just a bunch of thumbdrives/HDs with your project.
  • Audit your backups! If you run any kind of stat dashboard, prominently include backup age as red/green stoplights.
  • Test recovery. Make sure your system backups contain what you think they contain, and that you can actually recover from them (database backups especially).
  • Monitor your systems.  Synology can be configured to send an email when issues arise, but do something for any/all systems. Failure-resistant systems don’t do much good if they don’t warn you when something starts to go wrong.

Moving From Dedicated Hosting to $5/Month VPSes

April 28th, 2015

I recently moved 25 projects from one dedicated server to 8 different VPS instances on DigitalOcean.  Here is what I learned!


For the last 8 years, I hosted all of my personal and professional web projects on a single, dedicated server. Dedicated hardware has a lot of benefits, especially ample, fixed resources and total root control of your setup. But it also has a lot of downsides: Higher expense, rare but inevitable hardware issues, and lack of separation between projects. Fixed resources can be a double-edged sword–clobbered MySQL trying to back up a huge 100GB database? Oops, everything goes offline.

More than anything, the expense of the server was becoming a personal burden as my requirements diminished and the server itself became overkill. Blurst was our largest project, but we halted development years ago. I found myself paying $250/month for a server that hosted a bunch of small-scale projects (including a lot of friends’ personal sites). It was time for a change.

No More Bare Metal

Virtual machine and hypervisor technologies have advanced significantly in recent years.  It’s now feasible to run production hosting on virtual machines.  In moving away from our dedicated server, I had two basic options for moving into the cloud:

1) Switch to a monolithic virtual instance.  I’d move all sites to a single instance, with enough storage/CPU/memory to handle hosting all sites.

2) Move sites to their own instances, with instance specs tailored to each site (or group of sites)

#1 seemed like my best option for an easy transition, in terms of doing the actual move, but it also had many of the same downsides as dedicated hardware–I didn’t want to end up paying for resources I didn’t need, or to have changes or issues with one site affect all the others.

#2 fit a lot better with how I was actually using the old server, but I was worried about overhead from managing multiple virtual servers.  I’m not a sysadmin by trade, and my time is already spent working on Aztez full-time plus a bunch of demanding hobbies.

Ultimately, two technologies made #2 the best option and the easiest option:  DigitalOcean and ServerPilot.


DigitalOcean is a VPS hosting company with a focus on fast hardware and an easy-to-use backend for managing instances (which they refer to as “droplets”).  It hits a lot of my needs:

– Cheap! Their lowest tier costs $5/month. Many of my projects can actually live just fine at the $5/month specs.

– Flexible. Droplets can be resized anytime, so if a project goes up or down in traffic I can adjust accordingly.  This is particularly good for projects like the IGF judging backend which actually sit idle half the year (and really only spike in traffic occasionally for deadlines).

– Full control. A freshly-provisioned droplet is a full operating system of your choice, running on a hypervisor.  This is an important distinction–some cheaper VPS solutions use container virtualization, which shares the kernel with the host OS.

The backend is really nice.  Here’s the interface for a new instance:


They do provide an API, too, in case your infrastructure gets fancy enough to load balance instances on the fly (instances are billed hourly up until a monthly cap).

An obvious question: Why not Amazon AWS?

For me, AWS is overkill.  AWS is a fantastic solution if you’re some kind of startup aiming to build an entire business around your technology, with thousands of active users from tens of thousands of total users.

DigitalOcean lacks many of the robust scalability features that AWS provides, especially flexibility like elastic IPs and elastic block storage. If the hypervisor backing your droplet dies–and I assume in the next few years I’ll eventually experience an instance failure–you’re looking at some downtime until you can work around it.

Because of the flexibility of AWS, their backend interface is also complicated and multifaceted. There’s a lot of overhead in learning how things work, even for simple tasks like getting a single EC2 instance online.


DigitalOcean can give you the IP to a fresh Linux install in 60 seconds.  But a fresh Linux install doesn’t get you very far with hosting an actual website: You’ll need to install some kind of web server (probably Apache), some kind of database server (probably MySQL or MariaDB), configure virtual hosting and paths, logins, passwords, firewall, etc…
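For a sense of scale, the bare-minimum manual route on a fresh 2015-era Ubuntu droplet looks something like this (a sketch of the obvious packages, not a hardened setup):

```
apt-get update
apt-get install -y apache2 mysql-server php5 php5-mysql libapache2-mod-php5
a2enmod rewrite && service apache2 restart
# ...and you still haven't touched virtual hosts, users, firewall rules,
# PHP opcode caching, or a reverse proxy in front of Apache.
```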

And in practice, a stock LAMP configuration has a lot of performance bottlenecks.  Really, you’ll want some kind of reverse proxy in front of Apache to manage connection pooling, PHP configured with FastCGI and compilation caches/accelerators, and so on. This left me with a few options for configuring multiple VPS instances:

1) Configure a single instance with an initial setup, and then use DigitalOcean’s snapshot system to spin up additional instances from there.  Making changes after the initial setup phase would be painful and manual.

2) Roll my own multi-server management scripts to aid in adding new sites, applying updates/etc.  While this is a genuinely appealing side project, I just don’t have time for something like this.

3) Use an all-in-one server management suite like cPanel.  cPanel has per-server licensing costs, and is also a gigantic system that makes a mess out of your server configuration.  Once you’ve moved a server to cPanel, you’re basically stuck with it for any kind of administration tasks (and therefore stuck with the monthly cost).

In looking around for something lighter than cPanel, but still far more automated than rolling my own systems, I bumped into ServerPilot.  It’s pretty great:

– It bootstraps your server into a very competent initial state

– Their documentation already covers common configuration changes

– Their free pricing tier includes unlimited servers/sites, which works very well for this kind of  single-server-to-multiple-VPS transition

ServerPilot setup is very, very quick.  You paste their setup script into a fresh install, and a few minutes later everything is ready to be managed by their web UI.  Once that process finishes, adding a new website is done via this simple form:



So how do things perform?

TIGSource is a good example of one of the decently-active projects on the server.  The forums have one million posts with 200-300 active users at any time.  This is now hosted on the $10 tier (1GB RAM, 1 CPU).  It was actually performing well on the $5/month tier until the server hit the 512MB memory limit and had a load runaway.


On a $5/month Droplet, I tested incoming network speeds (SFO location):

–2015-04-28 17:55:00– http://speedtest.wdc01.softlayer.com/downloads/test500.zip
Resolving speedtest.wdc01.softlayer.com (speedtest.wdc01.softlayer.com)…
Connecting to speedtest.wdc01.softlayer.com (speedtest.wdc01.softlayer.com)||:80… connected.
HTTP request sent, awaiting response… 200 OK
Length: 524288000 (500M) [application/zip]
Saving to: ‘/dev/null’

100%[==============================================================================>] 524,288,000 78.0MB/s in 9.2s

2015-04-28 17:55:09 (54.6 MB/s) – ‘/dev/null’ saved [524288000/524288000]

And outgoing network speeds (downloading from a RamNode VPS in Seattle):

–2015-04-28 22:59:39– http://mwegner.com/test.zip
Resolving mwegner.com (mwegner.com)…
Connecting to mwegner.com (mwegner.com)||:80… connected.
HTTP request sent, awaiting response… 200 OK
Length: 524288000 (500M) [application/zip]
Saving to: ‘/dev/null’

100%[==============================================================================>] 524,288,000 86.9MB/s in 8.1s

2015-04-28 22:59:48 (61.5 MB/s) – ‘/dev/null’ saved [524288000/524288000]

And a quick test of the SSD disk speeds (again, this is the cheapest possible droplet):

hdparm -Tt /dev/vda1

Timing cached reads: 15380 MB in 2.00 seconds = 7696.35 MB/sec
Timing buffered disk reads: 1558 MB in 3.00 seconds = 518.76 MB/sec


The DigitalOcean backend provides some simple bandwidth and CPU performance graphs:

But for more advanced resource monitoring, New Relic has great stats, and their free pricing tier includes 24 hour data retention for unlimited servers:

Their at-a-glance page is especially useful for managing multiple VPS instances together:


Tips and Tricks

– DigitalOcean resizing is a little confusing.  If you resize upwards and increase disk size, that resize operation is permanent.  However, you can select “flexible” resizing, which increases CPU/memory resources but leaves your disk alone, letting you resize back down later.  (Put another way, increasing disk size is a one-way operation).

– DigitalOcean backups can only be enabled at droplet creation.  You can turn them off later, though, so if you’re unsure if you’ll need them it’s best to leave them enabled initially.  Backups cost 20% more on your instance and take weekly snapshots of the entire OS image.

– You can transfer DigitalOcean snapshots to other customers.  This is actually super appealing to me, because it means I can deploy a contract job to a separate VPS and then potentially transfer it entirely to the client after the project.

– I ended up using my own backup system.  Each VPS instance runs a nightly cron job, which exports my control scripts out of a Subversion repository and runs them.  Right now, I’m backing up databases nightly and web files weekly. Backups go directly into a Dropbox folder via this excellent Bash script.

– It’s worth mentioning that I deploy all of my own systems via Subversion already.  A staging copy of a website exists as a live checkout, and I export/move it to production.  With ServerPilot this means I just have a “source” directory living next to each app’s “public” folder.

– ServerPilot seems like they aim their setup at developers.  You probably want to disable display_errors or at least suppress warnings/notices.  They have a per-website php.ini configuration you can use for this.  Ideally I would like to see a “development/production” toggle on their app settings.

– ServerPilot configures apps alphabetically on a server.  If you request the IP directly, or a hostname that isn’t configured in the apps’ domain settings, you’ll get the first app alphabetically.   (Some of my curation systems use wildcard DNS entries with URLs configured in the backend, so this mattered quite a lot!  I just ended up prefixing with “aaa”, which feels a bit sloppy, but hey it worked).

– ServerPilot puts the MySQL root password in ~root/.my.cnf, and requires localhost for connections by default.  I manage database development with a graphical client that lets me SSH tunnel in to the server so I can connect as root on localhost.

– ServerPilot will push security fixes and the like to your servers.

– NewRelic has application monitoring in addition to their server daemon, which operates via a PHP extension that injects JavaScript into your output.  It’s pretty neat, and lets you see how long each page request is taking in each area (PHP/MySQL/memcached/etc), in case you’re developing your own tech and having performance issues.

– Uptime Robot is a free monitoring service (5-minute intervals; the paid plan goes down to 1-minute).  I just set this up, though, so I can’t speak to its reliability yet! Previously I was monitoring the single dedicated server, but most monitoring services are priced per-host, so the VPS exodus didn’t work so well there.

– The DigitalOcean links in this post use my referral code (you get $10 credit, I get $25 credit once you spend $25 there).  Just an FYI!
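By the way, the staging-checkout/export deploy flow I mentioned is only a couple of commands per site (paths are illustrative):

```
APP=/srv/users/serverpilot/apps/mysite   # illustrative ServerPilot-style layout
svn update "$APP/source"                 # staging: a live working copy
svn export "$APP/source" "$APP/public.new"
mv "$APP/public" "$APP/public.old" && mv "$APP/public.new" "$APP/public"
```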


At the end of the move, I ended up with a $50/month recurring bill at DigitalOcean.  Not bad!

(There’s actually another $20/month on top of that for Blurst itself, but that’s a temporary situation and the droplet size is due to the disk space requirements on the database).

IGS 2013 Soapbox Talk

March 27th, 2013

Here’s an article version of my Independent Games Summit 2013 soapbox talk.  I added the IGF stats pretty last minute, and I definitely felt like I rushed through the first part of my talk to cram everything in.

The prompt I give to the speakers for this session is “what are you thinking about?”.  This is deliberately wider than asking someone to give a “rant” (although if someone is thinking a lot about something that’s been bothering them, they’re more than welcome to rant)!

So What Am I Thinking About?

My first company, Flashbang Studios, just turned 10 years old, so I’ve been thinking a lot about that.  Many of these thoughts are just nostalgic rememberings, but a big observation is that my focus on game development has shifted over the years.


In the early days, nearly 100% of my time was spent making games.  Partly this was due to lifestyle. I lived with my then-girlfriend in her mom’s house, after recently moving to a new city.  My total expenses were $150/month, including rent, food, power, etc.  I basically sat in a room with my computer and worked on Flashbang projects.  I didn’t get out much.

These days, my time is spent roughly something like this:


My time allocation shifts month-to-month, of course, but there are now a lot of competing elements besides straight-up game development.  Surprisingly, I find some of these distractions really satisfying!  “Web development” encompasses my contract work–I built the proposals/sessions backend for GDC, and the entry/judging system for IGF.  Another surprise was that I’m a pretty good web developer.  In fact, I think I’m a better web developer than I am a game developer.  I’m definitely more productive as a web dev, and I even think I might be more capable.  So why is that?

Developing Across Abstraction Levels

My web development projects are all solo projects:  I do everything, from the database to the client-side HTML.  This turns out to be super useful!  I’m able to take a high-level problem and implement a solution in multiple places, using the strengths of each layer.  It may be more efficient to do something more client-side, or maybe I should calculate something server-side first, or maybe I need to restructure the database relationships entirely.



In Aztez, I’m using one language, C#:


Developing Across Team Members

The above graphs reminded me of the codebase interaction breakdown we might have in one of our Blurst games, which might look like this:


Versus Aztez, which is again something like this:


Coding for Delineation

Solo coding across multiple languages/abstractions, and team coding in the same language actually feel very similar to me.  With web development, I need to clearly hand off data and functionality between the different layers.  “PHP Matthew” needs to collaborate with “HTML Matthew”.  In a team project, we would similarly create clear demarcation points between systems and people.  You want logical membranes in your project!  They naturally create clearer code and more independent systems.

When I write in one language, by myself, I tend to create much messier systems and abstractions.  Things stay in my head longer, and I’m not as strict or clean with my abstractions, simply because I don’t need to be.

Perhaps I should create artificial constraints to enforce this?  Or better discipline on code quality?  Reintroduce Unity’s JavaScript in Aztez, for different game systems?  I’m not really sure what the solution is here.

Time-to-Reward and Challenge Boredom


Game development is slow.  Even fast development is slow.  Our 8-week cycles on Blurst games felt insanely fast, but that’s still two months to any kind of finished product (and Blurst games used a pretty loose definition of “finished”).

I’ve also been developing games for half of my life.  A lot of the fascination with the challenges of development has waned for me.  I sometimes get bored with game work, especially a lot of the grunt work (UI, etc).

Enter Photography

I bought my first serious camera three years ago.  I was drawn to the promise of having a creative outlet where I could quickly make finished things, and also to a new set of challenges and puzzles.  It provides a lot of the elements that have waned for me in game development over the years!


You can see more of my travel/portrait photography at mwegner.com.

Creative Turnaround

As a very concrete example, multiple hours of my work in Aztez could result in a change that’s directly visible as this:


(The change is the addition of a “Force Direction” option in our move system, if you didn’t catch it).

And yes, I can mentally understand how this new flexibility enables new combat options, and eventually affects the player experience, but it still feels so deeply buried.  The long feedback cycle between work and results hurts the virtuous cycle of being motivated by the results of my work.

Compare this to my experiences with photography.  I had an idea for this photo while taking a bath, and I literally went from concept to final product in 30 minutes.  “Final” is really important to me here!  I won’t ever re-edit this shot.  It’s done and out there:


Even a “big” photo shoot is measured in hours.  Here’s a shot from a friend coming over with a borrowed straightjacket and an idea for an image:


(I shoot a lot of fetish material, if you’re curious–and it’s been really interesting and enjoyable to learn an entirely new scene over a few years from a totally cold start!)

Increasing Game Development Rewards

Fortunately for Aztez, we’re actually entering a period of development where we can begin producing alpha builds of the game that encompass the entire game experience.  I think this will help the work->results->reward cycle tremendously.  It certainly won’t be anywhere near the same feedback loop as a completed photo in under an hour, but I’m really looking forward to the feedback cycle of being excited by people actually playing our game out in the wild.

(By the way, the “Internet” slice on that pie chart is just bullshit like Reddit.  The Internet is much better at distracting you in 2013 than it was in 2003.  YouTube didn’t even exist until 2005!)

Addendum:  IGF Entry Stats

I ended my soapbox talk with some stats from the IGF backend.  In the interests of transparency, here they are:

Unique views are the number of judges that viewed the entry.  Comments are judge-to-judge discussion left on the entries.  Notes are judge-to-jury comments (other judges can’t see them).  Ratings are reports on a game (which might be “hey, couldn’t play, it was broken”).  Votes are the average upvotes for any category on a game (note you can rate a game and leave zero votes).

The all-entries average isn’t the important one, in my opinion.  The screensaver I was running during some of the IGS breaks was built from screenshots from all entries.  You may have noticed that a lot of the games looked pretty rough, or not very “indie”, or whatever.  As any judge will know, there are a lot of IGF entries that are too early in development, or not a good fit for the conference, or simply broken.

Sorting by total votes, the top chunk of more competitive entries looks like this:

And the very top of the entries has even wider visibility.  Note that many of these entries are finalists, and that lots of judges check out finalists in the backend after they’re announced (this data was taken the other day, so it’s post-finalist-announcement):


Addendum:  IGF Judge Stats

I also crunched the judge data.  There are 195 active judges in the IGF backend.


The top quarter of judges, sorted by total ratings:


The 10 most active judges were really active:


With one judge really standing out (these stats include student entries and main competition entries both):


Contact Details

Questions or comments?  Feel free to hit me up on Twitter or via email!

Building a Hackintosh

November 24th, 2011

For the past 4 years, I’ve been using a Mac Pro as my primary desktop.  It’s been a fantastic machine:  Ultra stable, fast, and reasonably expandable.  It’s getting old, though.  The whole Mac Pro line is getting old; Apple’s last update was almost 500 days ago (as of this writing, check the MacRumors Buying Guide for current info).  Intel has repeatedly delayed the Xeon CPUs used by Apple in the Mac Pros.  Today, an i7 MacBook Pro can easily outperform an entry-level Mac Pro.

So Use an iMac or MacBook Pro?

The writing is pretty clearly on the wall, and there are indeed rumors that Apple will kill the Mac Pro line soon.  Thunderbolt, which is basically external PCI Express, is meant to supplant most use cases for a burly desktop.  iMacs and MacBook Pros both ship with i7s, and the new iMacs can support 16GB of memory.

I actually have an i7/SSD/HD 27″ iMac, which I use for work, and an i7/SSD/HD MacBook Pro, which I use for travel.  They’re both awesome devices, but they fail to meet my needs for a desktop in three primary areas:

  • Storage (my desktop has 4 internal drives and 8 external drives)
  • Gaming performance (the iMacs use laptop graphics cards)
  • Support for more than 2 displays

Enter Hackintosh

I use OS X as my primary operating system–I only boot to Windows for games–which puts me in a bit of a bind. Do I wait for a Mac Pro upgrade, and shell out a ridiculous premium once they launch?  Part of the price problem with the Pro line is that you’re paying for huge amounts of upgradability that you’re probably not going to use (massive power supply, dual socket motherboard, ECC memory, etc).  The Mac Pro line is meant to accommodate serious scientific computing and video production needs, but all I really want is a faster photo editing machine with decent graphics performance.

“Hackintosh” refers to running OS X on commodity hardware.  This has serious price advantages.  Let’s take a look at a higher-end build, as of November 2011.  Check the Kakewalk Builds page for current pricing:

Up against a low-end Mac Pro, this is awfully appealing:

Hackintosh Downsides

Modern Hackintosh machines are actually pretty good.  You can certainly achieve stability with minimal hassle.  Still, the Hackintosh route is not nearly as painless as owning actual Apple hardware.  In short, you’ll be facing:

  • Complicated initial setup
  • Inconvenient minor/point updates for OS X
  • Painful major updates (e.g. Snow Leopard -> Lion)
  • Potential compatibility problems with your workflow (e.g. MainStage)

My Build

Apple maintains a fairly minimal set of drivers in OS X itself.  A lot of hardware has 3rd-party support, but this means adding kernel extensions after initial install (.kext packages).  These 3rd-party kext files are what make updates tricky, especially since some hardware compatibility relies on replacing or patching existing Apple extensions.

I tried to minimize kernel hacking by selecting hardware with known/stable OS X compatibility.

(Note that I changed my network card configuration after taking this video)

Hardware List

These link to Amazon:

I moved most of my drives, but wanted new SSDs so I could sell the old Mac Pro with boot SSD intact:
GPU-wise, I went with two of these:
I needed FireWire 800 for my Drobo.  This card works out of the box:
Post-setup, onboard audio had a slight-but-annoying noise floor in OS X (but not Windows).  I added this USB 2.0 audio interface, which has been super awesome.  I run Audioengine A5 speakers and can definitely hear the difference:
Finally, this case is awesome and contains everything:

Setup Process

Now that I had a pile of hardware I had to get everything running!

Motherboard/BIOS Settings

Taking some steps from this guide, I made the following BIOS changes:

  • Reset to “Optimized Defaults”
  • Set SATA controller modes to AHCI instead of IDE
  • Changed HPET to 64-bit in power management
  • Disabled onboard FireWire
  • Later, I overclocked to a 42x multiplier

UniBeast (Initial Lion Install)

UniBeast is an amazing tool that creates a bootable USB drive with the Mac App Store version of Lion.  You need a working copy of OS X to create the bootable drive, and a purchased copy of Lion from the App Store in the form of the “Install Mac OS X Lion.app”.  Since I was already running Lion on all my machines, I needed to redownload this.  Turns out you can hold option and click on “Purchases” to redownload Lion in the App Store application.  The App Store version of Lion was current (10.7.2).

The UniBeast guide on tonymac’s site was comprehensive and worked flawlessly.  As soon as my install was complete, I had a working version of Lion with 1024×768 display, bootable via the USB drive’s copy of Chameleon (a boot manager by the Hackintosh community).

MultiBeast (Post-Lion Drivers)

MultiBeast is an all-in-one installation tool designed to modify a clean Lion install with the necessary files, drivers, and settings to self-boot your copy of OS X.  I used the settings from the same guide I used for the BIOS settings.  Note that you’ll need to download a DSDT file for your motherboard, which contains the proper lookup tables to map hardware features to the OS.

Dual Video Cards

If you’re looking at a Hackintosh build, and don’t need crazy multi-monitor support or high-end gaming performance, you should stick to one of the video cards supported by OS X out of the box (there’s a great list on the tonymacx86 blog).

This is where things got tricky.  The GTX5xx series of cards are not officially supported, which means you need to do a bit of technical muckery to get things working.  I booted with a single GTX560Ti connected and ran this GT5xx Enabler.  This produced native resolutions with the one card.

To support two cards, I poked around this thread.  The gist is that you must pass all graphics card parameters into your boot.plist file as a hex-encoded EFI string, since auto-detection will fail.  You also need to probe the PCI addresses for both slots independently (boot with one card, probe, boot with the second card, probe).

My input plist looks like this.  The first two addresses are for my onboard Ethernet and PCI slot; I’m not sure if I still need these, to be honest.  After encoding to hex with gfxutil, here’s my full /Extra/org.chameleon.Boot.plist.
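Shape-wise, the injected section of the Boot.plist ends up looking roughly like this (the actual hex string is machine-specific, so it’s truncated here; as I understand the convention, GraphicsEnabler should be off when you inject device-properties by hand, so Chameleon doesn’t also try to auto-detect):

```
<key>GraphicsEnabler</key>
<string>No</string>
<key>device-properties</key>
<string>...hex output from gfxutil for both cards...</string>
```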

Network Setup

My initial setup used an out-of-the-box-supported network adapter (this Rosewill PCIe card).  However, that meant using a PCI FireWire adapter, since the two GPUs cover some of my slots.  It turned out that whatever sat in that PCI slot failed to re-initialize after sleep.  I wanted to preserve sleep, so I rearranged my hardware to use a PCIe FireWire card and the onboard network.  Apparently onboard Ethernet is flaky for some people, but it’s been fine for me.  I used the official Realtek drivers available in MultiBeast.  Gigabit speeds are great, it recovers from sleep; everything seems fine.

Juggling my network cards around meant my MAC address changed.  Definitely do all App Store purchases after you’ve figured out your network setup.  I had to reinstall apps due to Apple’s DRM.  I was also unable to log in to the App Store initially, with a “could not be verified” error (this has to do with the default calls for discovering your MAC address failing).  If you find yourself in that position, this guide worked perfectly!
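For reference, the core of that fix (as I understand it) is just forcing OS X to rebuild its network interface list so Ethernet is re-detected as the built-in en0 device, which the App Store checks rely on.  A hedged sketch; the plist path is the standard OS X location:

```shell
#!/bin/sh
# Hedged sketch: remove the cached interface list so OS X rebuilds it
# on the next boot, re-detecting Ethernet as built-in en0.
PLIST=/Library/Preferences/SystemConfiguration/NetworkInterfaces.plist
if [ -f "$PLIST" ]; then
  sudo rm "$PLIST"
  echo "removed $PLIST; reboot so OS X rebuilds it"
else
  echo "no NetworkInterfaces.plist found"
fi
```

After the reboot, check System Preferences to confirm Ethernet came back as en0 before logging in to the App Store again.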

Sleep and CMOS Reset

Apparently this motherboard will reset CMOS/BIOS settings on sleep.  I applied this fix without testing that I had the problem.  Sleep has been fine, so it was either a solution or a benign change.

TRIM Support

OS X now supports TRIM for SSDs, although Apple has hardcoded specific device IDs into that support.  You can patch in general support for all SSDs.  I did this (and had been running the patch on my Mac Pro without any ill effects).
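Whichever patch you use, it’s worth verifying afterward that the OS actually reports TRIM as enabled.  This check works on OS X (the guard is just so it degrades gracefully elsewhere):

```shell
#!/bin/sh
# Ask OS X whether it reports TRIM support for attached SATA devices.
if command -v system_profiler >/dev/null 2>&1; then
  system_profiler SPSerialATADataType | grep "TRIM Support"
else
  echo "system_profiler not available (not OS X)"
fi
```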

“Fermi Freeze”

Apparently Lion has much better overall support for NVIDIA’s “Fermi” GF100 architecture (via its official GTX4xx drivers, which are what you’re patching into with GTX5xx support).  Still, I experienced two graphics freezes/corruption events.  Most of the information out there appears to be from people experiencing issues on Snow Leopard.  I applied two suggested fixes:  disabling hardware acceleration in Flash, and upping my base clock/bus speed from 100MHz to 103MHz.  The theory on the clock change has something to do with GPU idle states.

I’m not sure whether my freezes were random and will return, but as of this writing I’ve had 4 days of uptime (partly to run a gigantic Time Machine backup), during which I’ve edited a photo set in Lightroom/Photoshop, encoded videos, watched a movie, and done some work in Unity.  Seems very stable!

Still, there are rumors that Apple is returning to NVIDIA for its next refresh, which would hopefully bring official support for the GTX5xx GPUs.  Patching graphics card drivers is the only real Hackintosh requirement that makes me nervous.


Windows 7

The Windows install was straightforward.  Gaming performance has been great with the GTX560s in SLI mode.  The Service Pack 1 installer was unhappy when booted via Chameleon, since it wanted to modify the reserved partition, but booting directly via the motherboard’s F12-key boot selector fixed that.  In general, Windows has been super stable, and much more stable than Windows 7 on my Mac Pro.  Apple doesn’t give a shit about driver support on Windows; I was bluescreening pretty regularly after updating to the Boot Camp driver version that accompanied Lion.

Hiding Partitions

My Windows 7 install actually created two partitions:  my actual Windows partition, plus a “System Reserved” NTFS partition.  This partition shows up in Mac OS X.  To hide it, I created an /etc/fstab file containing:

LABEL=System\040Reserved none ntfs rw,noauto

(The \040 is an octal escape for the space in the volume label.)  The partition is still there; it just no longer mounts at boot.
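A quick way to confirm the rule took effect after a reboot:

```shell
#!/bin/sh
# Confirm the System Reserved partition no longer auto-mounts.
mount | grep "System Reserved" || echo "System Reserved is not mounted"
```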

I also use Carbon Copy Cloner to maintain a hot copy of my OS X boot drive.  To hide this drive from Finder, I used:

sudo SetFile -a V /Volumes/Backup

I think CCC actually clears this flag when it runs, so I guess I need to re-apply it immediately after each backup.
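If memory serves, CCC can run a shell script after each backup task, so something like this hypothetical post-backup hook should keep the volume hidden (SetFile ships with Xcode’s command-line tools; /Volumes/Backup is my clone’s mount point, so adjust to yours):

```shell
#!/bin/sh
# Hypothetical post-backup hook: re-apply the Finder-invisible flag
# that CCC seems to clear on the destination volume.
VOLUME="${1:-/Volumes/Backup}"   # my clone target; adjust to yours
if command -v SetFile >/dev/null 2>&1; then
  SetFile -a V "$VOLUME"
  echo "re-hid $VOLUME"
else
  echo "SetFile not available (install Xcode command-line tools)"
fi
```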

Impressions and Conclusions

I’ve been running the new build for a few days now, and have completed some real-world project work on it:  Lightroom/Photoshop photo editing, video encoding, Unity game development, and general browsing.  It’s fast!

Benchmark-wise, my old Geekbench score was 9,800; the new system currently scores 13,800.  My old Mac Pro was an 8-core machine, though, while the new rig is 4-core.  General “speediness” is way up, since each individual core absolutely crushes my old box.  My cooling setup definitely accommodates more overclocking, although I strongly favor stability over speed.

Gaming is awesome again!  My old setup was struggling to perform well on a DX10 code path with its GTX285.  Now I can run max settings at 60+ FPS on all modern games.  It’s crazy to see the difference in graphics fidelity in a game like Battlefield 3.

I’m a little wary about system fragility with updates.  Cloning your boot drive is a very wise move with a Hackintosh (I’ve already used it once when I broke my boot with a kernel panic).  If you’re going the Hackintosh route, spend the $50 to get a cheap drive to clone to!

I spent around $2k on this build.  A huge portion of that was SSD/GPU, so you can definitely go cheaper.  I’m going to offset much of the cost when I sell my old Mac Pro after another week of testing.  All told, I’m very happy with this upgrade!  I’d recommend an out-of-the-box-friendly Hackintosh build to most people, with the caveat that you need to be quite technical to manage unsupported GPU hardware.  Good luck!