Steven Melendez
Today 10:25am
Garbage: This week, we are writing about waste and trash, examining the junk that dominates our lives, and
digging through garbage for treasure.
GARBAGE
Delete Never: The Digital Hoarders Who
Collect Tumblrs, Medieval Manuscripts,
and Terabytes of Text Files
Illustration: Jim Cooke (Gizmodo)
Filed to: HOARDING
When it comes to their stuff, people often have a hard time letting go. When
the objects of their obsession are rooms full of old clothes or newspapers, it can
be unhealthy—even dangerous. But what about a stash that fits on 10 5-inch
hard drives?
Online, you’ll find people who use hashtags like “#digitalhoarder” and hang
out in the 120,000-subscriber Reddit forum called /r/datahoarder, where they
trade tips on building home data servers, share collections of rare files from
video game manuals to ambient audio records, and discuss the best cloud
services for backing up files.
Hoarders who let heaps of physical items of questionable utility dominate
their homes and lives are often stereotyped and suffer social stigma and anxiety as
a result. By contrast, many self-proclaimed digital hoarders say they enjoy
their collections, can keep them contained in a relatively small amount of
physical space, and often take pleasure in sharing them with other hobbyists or
anyone who wants access to the same public data.
“Data hoarder means to me simply someone who collects and curates digital
data,” said the user -Archivist, one of the moderators of /r/datahoarder, in a
private message on Reddit. “It’s a little removed from the disorder we usually
see from traditional hoarders.”
He and many of his fellow subreddit users also take pride in keeping their data
well organized into folders and subfolders. Some even take pains to keep the
forum itself from getting bogged down with dubious material: One of the most
popular recent threads begs users to stop spamming the subreddit with photos
of their hard drives.
“Data hoarding isn’t about just buying $3,000 worth of hard drives just for
posting them here,” wrote user Nooco24, one of the site’s moderators.
“What’s interesting is what you do with your storage.”
What users seem to prefer to see are discussions of unusual and intricate
storage setups, guides to using complex archive software and, of course,
interesting datasets, from public-domain collections of vintage scientific
papers to old BBC sound effect samples. Public archives, naturally, are a plus.
In addition to roughly 2.6 petabytes stored on a system of servers in his spare
room—data collection size is the one fact each moderator highlights on the
forum’s mod list—-Archivist is also the data curator and server manager of
The Eye, a sprawling online archive of everything from vintage movie posters
to beer-brewing guides to video games from short-lived console systems of
the 1980s. A German resident in his late 20s who restores historic paintings
and documents for a living, -Archivist said he got his start collecting printed
and digitized medical journals.
“After that came piracy, which I was introduced to early on by my stepfather,”
he quipped, leading him to start developing collections of movies and TV
shows. Today, he personally prefers to collect digital books and texts, which he
said are often quick to disappear from the internet.
“Most other data types aren’t so rare,” he said. “Weird and obscure books and
texts seem to vanish first.”
Many people active in the data hoarding community take pride in tracking
down esoteric files of the kind that often quietly disappear from the internet—
manuals for older technologies that get taken down when manufacturers
redesign their websites, obscure punk show flyers whose only physical copies
have long since been pulled from telephone poles and thrown in the trash, or
episodes of old TV shows too obscure for streaming services to bid on—and
making them available to those who want them.
GitHub, owned by Microsoft since late last year, is mostly known for hosting
source code for collaborative programming projects. But it’s also home to a
collection of works by the Polish surrealist painter Zdzisław Beksiński uploaded
by the user itdaniher, a Midwesterner and /r/datahoarder user who’s been
collecting data for a decade and asked to only be identified by their username.
“I’ve been in touch with his estate a little bit, and they’re fine with me hosting
a mirror of his works,” said itdaniher, who first obtained the images from a
shared BitTorrent file, in a phone interview. Another file they uploaded to
GitHub is a database mapping more than 2,000 common names of plants to
their Latin scientific names, with entries from “Abe Lincoln Tomato” to “Zuni
Gold Bean.” Itdaniher, who also enjoys gardening and doesn’t identify as a true
“hoarder”—“I try to exercise a certain level of judiciousness,” they say, usually
spending three or four hours a week archiving—hopes to expand the list into a
larger project documenting ideal temperatures, soil and other conditions for
growing the various plants. They hope to find that data scattered across the
internet, just as the list of names initially was.
“The internet is a big place, and a lot of times I will find other people who have
HTML tables on their web pages that have some information, but a small
fraction of the information that I want,” itdaniher said. “Sometimes it’s finding
personal sites where [someone’s said] here’s the list of the common and Latin
names for the plants I’m growing this year.”
Itdaniher, an experienced Linux system administrator, also runs software
provided by the group Archive Team to help download materials at risk of
disappearing from the internet and help them make their way to the nonprofit
Internet Archive. Founded by the digital archivist and filmmaker Jason Scott in
2009, Archive Team calls itself “a loose collective of rogue archivists,
programmers, writers and loudmouths dedicated to saving our digital
heritage.” Members frequently scramble to preserve aspects of internet history
before they disappear as sites fade from the web. Through a mix of manual
labor and distributed bots, the project has archived large swaths of sites
including the classic free web host Geocities, the text-hosting platform
Etherpad and the blog platform Xanga.
“Since 2009 this variant force of nature has caught wind of shutdowns,
shutoffs, mergers, and plain old deletions—and done our best to save the
history before it’s lost forever,” the group says on its official site.
Itdaniher shared with Scott a collection of Tumblr postings linked from Reddit
and tagged as “not safe for work” as part of a global effort to preserve adult
content on the now-Verizon-owned blogging network, after the company
controversially announced it would no longer allow such material. At least
344,000 archived Tumblr sites marked for deletion are en route to the Internet
Archive or already uploaded where they’ll be publicly accessible, Scott said.
“I was able to contribute to that larger project of saving that aspect of internet
culture for future generations,” said itdaniher.
Some /r/datahoarder users acknowledge they collect files that other people
might not find interesting: HeloRising, a man in his mid-30s from the Pacific
Northwest, said via Reddit PM that he’s built up a collection of high-quality
digital copies of illuminated manuscripts, which he said he finds fascinating but
has yet to find other users interested in sharing. The files sometimes get posted
by institutions that house and scan the medieval documents, but they’re often
difficult to download and can disappear over time or live on only in obscure
online archives.
“The illuminated manuscripts are unicorns,” he said. “They turn up in odd
places.”
HeloRising, who has about 30 terabytes in total of data and spends five or six
hours per week on the hobby, said the Reddit community has been a “treasure
trove” of useful advice and information. It’s a common sentiment from users,
who enjoy solidarity and support on the subreddit, where a recent comment
thread filled with excitement about a newly organized collection of thousands
of vintage video game manuals.
“Having a community is great,” said itdaniher. “It makes me feel like the time
that I spend, I’m working towards of a common goal of not throwing things
down the proverbial memory hole, the 1984 trash disposal of uncomfortable
facts.”
While people with hoarding disorders are often isolated, embarrassed and
overwhelmed by disorganized piles of clutter, members of /r/datahoarder tend
to take pride in their digital collections and thrive on keeping them organized,
whether for sharing or personal use. More than a few work in technology or
simply enjoy tinkering with computers, so tweaking download scripts and data
storage networks is a fun part of their hobby, not a chore. Some also share
custom-crafted archiving tools and other software they’ve created on GitHub,
which can serve as a portfolio for those seeking programming jobs or just a
high-tech social outlet.
“With time flying, we aren’t just people archiving data together, we are more
than that,” said Corentin Barreau, a 19-year-old administrator on The Eye who
is nicknamed “The French Guy,” in a Twitter direct message. “Beside that, I
have an affection to everything that links to collections, even IRL, I like to
collect, and it’s peaceful to sort data, it’s satisfying. And the joy of people when
you share something [is] worth more than everything.”
His most prized archive is a set of “family memories,” digitized from analog
photos and VHS tapes taken by his loved ones over the years. Barreau keeps
local copies of the digital versions, as well as looking after cloud backups and
the analog originals.
“That’s the most exciting thing [I’ve] done, and the collection I’m most proud
of,” he said.
Barreau said he doesn’t see himself as a hoarder in a negative sense, since it
doesn’t negatively impact his personal life.
“It’s just a passion, like people doing sports every day, or painting,” he said
with an ASCII wink.
As with other mental health issues, experts say hoarding really becomes an
issue when it interferes with people’s happiness or gets in the way of everyday
life. Collecting, on the other hand, can be a perfectly healthy hobby, whether
people are collecting baseball cards or rare Frank Zappa MP3s.
“The collections tend to give pride and positive feelings, whereas hoarding
tends to be associated with stress and disorganization,” said Gregory Chasson,
an associate professor of psychology at the Illinois Institute of Technology who
has studied hoarding disorder. “There doesn’t tend to be a sense of cohesion or
a theme.”
And digital media’s small physical footprint means it’s harder for even
disorganized files on hard drives or USB sticks to grow unmanageable and
dominate spaces the way physical collections of clothes, books or other
materials can.
“I walk into homes where I can’t discern where sleeping, bathing and eating
takes place because of the volume of the stuff,” said Regina Lark, owner of the
Los Angeles-area professional organizing firm A Clear Path, which helps people
with physical hoarding problems. “I would imagine the uber-acquiring of
digital media is not impairing the quality of your life, unless that is what you’re
spending your life on, is acquiring.”
Still, problem digital hoarding, where massive collections of files, inbox
messages and other digital data bring stress to their owners, isn’t unheard of,
including among people who already struggle with hoarding tangible objects.
Chasson said that, anecdotally, it’s not uncommon for people with hoarding
issues to also have computer desktops riddled with icons or email accounts
stuffed with unread messages. There hasn’t yet been much formal research
into digital hoarding, he said. But a recent paper he coauthored does suggest a
connection with physical hoarding, finding “higher levels of physical acquiring
behaviors were significantly related to increased distress” when experimental
subjects were falsely told a digital item from their Pinterest collections would
be deleted.
“Ultimately, I think it’s tapping into the same mechanisms for a lot of people,”
he said.
Both physical and digital hoarding can be motivated by the fear of permanently
losing something important, even if others might think it’s easily replaceable
or simply trash, said the creator of the YouTube channel I am a Compulsive
Hoarder, a self-proclaimed “disposophobic” (referring to her fear of throwing
out something that might prove valuable) who asked that her name not be
used.
“I start thinking, but that particular article has such good information, I’m not
going to find it again,” she said. “We can’t even consider the possibility we
could find a better article.”
She said she has a tendency to store disorganized collections of web articles
describing exercises she’s never done, foods she’s never prepared and even
treatments for hoarding. Managing text messages can also be stressful, since
she worries about deleting conversation histories en masse without going
through each individual message. Even e-commerce can bring challenges for
people with hoarding issues, she said, as websites guilt them into signing up
for inbox-clogging discount newsletters they hesitate to delete or unsubscribe
from.
“They get inundated about marketing emails,” she said. “Once you’re there,
it’s hard to get unsubscribed, because now you’ve got FOMO.”
When old files do turn out to be valuable—like old Christmas newsletters that
bring back old memories, or a wedding speech she recently unearthed and
shared with a delighted friend—she has to remind herself it’s not a reason to
stockpile every bit of data.
“When I found something else everyone else is so glad I kept, I really have to
splash cold water on my face and tell myself, don’t let this be a reason to start
saving stuff,” she said. “I don’t want to keep getting more hard drives.”
The fact is, though, it is often genuinely difficult for users without a decent
amount of technical experience to find the right balance. Many systems don’t
make it easy to find, organize and back up valuable files, while shunting more
ephemeral data to the digital trash heap. Social networking sites are
notoriously difficult to search, let alone download content from. Cloud services
shut down or change policies, often with little notice, said the Archive Team’s
Jason Scott, like Tumblr’s about-face on erotic pictures, Google’s move to shut
down social network Google+ or the venerable photo-sharing site Flickr’s
recent announcement it would begin purging images from legacy free accounts
with more than 1,000 pictures uploaded as of March 12.
“We have consistently been working since the mid-80s to turn every single
aspect of life into a digital file in one way or another,” Scott said. “People are
suddenly discovering they don’t own their data, and all your life is data.”
Archive Team sometimes finds itself effectively the last stop before data
disappears from shuttering services. That means there’s often little time or
desire to distinguish between trash and treasure. But many of the group’s
volunteer archivists—some of whom also frequent forums like /r/datahoarder
—are more inclined to find joy and pride than frustration in loading their hard
drives and public online archives with as much data as they can save for
posterity.
“People are like really, you’re gonna save a bunch of furry art?” Scott said.
“Well, we don’t know, and we’re not going to be the ones to make that
decision.”
Steven Melendez is an independent journalist living in New Orleans.
© 2019 Gizmodo Media Group