Data on Ice - Decentralize Your Damn Data in a Norwegian Mine

Tagged: data & files & storage & decentralized

Let's say you have a LOT of data, really important data, and you need to maintain the information for generations to come, and you can't trust Google or Apple or even Amazon to provide that for you, given their track record.

In the first three parts of this blog post I talked about storing data on hard drives, burning data to optical discs, and going long-term with LTO. There's a fourth option if those aren't hardcore enough.

Svalbard - Unincorporated free area of Norway

In the Arctic Circle, north of mainland Norway, lies the Svalbard archipelago. It is legally outside of the Schengen Area and the European Economic Area, which also means anyone can come there without a visa and even work there without a permit. It's been declared a demilitarized zone by 42 countries, and while it is under Norwegian sovereignty, other states are allowed certain rights to use it, and Russians maintain settlements there for mining.

This is the setting for the Arctic World Archive, a facility alongside the Svalbard Global Seed Vault (many countries around the world depend on it). Both use the unique geographic properties of a decommissioned coal mine inside a geologically stable sandstone mountain where the temperature is constantly well below freezing. The facility built inside the mine is also well above sea level. Thus, even if maintenance were stopped and power cut for months, the facilities should stay cold and dry for a long time.

Cold and dry is good for preserving the seeds of many nations' agricultural heritage. It's also ideal for many kinds of data storage media, such as film.

The Arctic World Archive is operated by a Norwegian data-storage company called Piql and a state-owned Norwegian mining company. Keep in mind that using this storage method does have some dependency on those two parties. But remember this is a demilitarized zone, not on US soil, accessible without a European or American visa, and not controlled by any US public or state companies or organizations. Risk remains, but the profile is very different from the usual.

Piql provides a service to digitize just about anything worth saving (anything with informational value) to "piqlFilm" reels. These store digital data as QR codes, which can be read by machine but also by your own eye in case there's no hardware left to read them, which is already the case for things like floppy disks. After writing, these piqlFilms can be stored by you yourself, with a storage provider you choose, or at the Svalbard facility (the best option).

Organizations like the Internet Archive, companies like Microsoft, and even the governments of Mexico and Brazil have put their faith into the setup designed by Piql.

I became aware of this because GitHub (owned by Microsoft), in partnership with the Long Now Foundation, the Internet Archive, and others, recently piql'd the repositories for Linux, Bitcoin, and tons of other open source projects and put them down the mine shaft, aiming for them to be stored for 1000 years. You probably use Linux or Bitcoin in some form even if you don't realize it, and you may not realize what it would mean if the world lost the source code to such software. The same goes for thousands of lesser known projects which the modern world depends on. If a data apocalypse were to happen today, maybe we would be able to piece much of our open source world back together from thousands of distributed hard drives. But will that hold true 100 years from now, for today's software?

Far into the future, what artifacts will our descendants look at to try to understand the world we live in today? A big piece of it will be running ancestor simulations: the ancient operating systems that support our everyday lives. It's already tough trying to run software from just a few years ago, simply due to the nested dependencies where a breaking change in any one of them halts the running of your program.

While tapes (LTO) are designed for long term archival, relative to things like recorded DVDs, they still need to be kept away from heat and humidity (optical discs do too). Even in the best case, tapes aren't expected to last many decades or centuries. M-DISC is purported to be able to last that long and longer. But who knows how the discs will be affected by years of improper storage in dank attics? Organizations that take this sort of thing seriously have chosen Piql.

Part 1: Hard drives. Every computer has one and is already storing data on it. You can just turn off your computer and save the drive for a few years (it might need occasional power if it's an SSD). Some percentage of hard drives will fail after spinning for a year. But it's not a bad option for a few terabytes of data.

Part 2: Optical discs were horrible archival options in the past. But with M-DISC and the capacity of today's BD-R, they are possibly a feasible long term option. But only with M-DISC.

Part 3: For decades, tape storage has served for data archival purposes, and with the standardization of LTO, the situation is looking stable long term. The tapes should last a decade or three, better than most hard drives and optical discs except for M-DISC. The downside is the high cost of the tape drive itself.

Finally, for those who think deeply about very long term secure data storage, there's this unorthodox solution from Norway, kept in a "data vault" deep inside a frozen mountain, far from people and armies yet accessible to individuals globally. I'm not sure how much it costs, and it's probably the most expensive option, but it's a cost that must be amortized over hundreds of years, which might make it the cheapest option in the end.
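
To make the amortization argument concrete, here is a rough sketch in Python. Every number in it is made up for illustration; I don't know the real prices, and the model ignores drives, labor, storage fees, and inflation. It only assumes that a copy has to be re-bought each time the media wears out, and compares the average yearly cost over a long horizon.

```python
import math


def cost_per_year(price_per_copy: float,
                  media_lifetime_years: float,
                  horizon_years: float) -> float:
    """Average yearly cost of keeping one copy alive over the horizon.

    Assumes (hypothetically) that the copy is simply re-bought at the
    same price every time the media reaches the end of its lifetime.
    """
    purchases = math.ceil(horizon_years / media_lifetime_years)
    return purchases * price_per_copy / horizon_years


# Hypothetical options: (price per copy, expected media lifetime in years).
# These figures are placeholders, not real quotes for any product or service.
options = {
    "cheap, short-lived media": (100.0, 5),
    "expensive, long-lived media": (2000.0, 500),
}

horizon = 500  # years we want the data to survive

for name, (price, lifetime) in options.items():
    yearly = cost_per_year(price, lifetime, horizon)
    print(f"{name}: {yearly:.2f} per year over {horizon} years")
```

With these placeholder figures the expensive option comes out cheaper per year, because it is bought once while the cheap one is bought a hundred times. With different numbers the result can flip, which is why I say "might".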

Decentralization can mean everyone keeping a copy of the data, like a blockchain. Or it can mean everyone not putting their data in the same center. Maybe the right word for that is "uncentralized".