arachnaut
www.arachnaut.net

  Arachnaut's Lair

Sizes of Data

When I got a terabyte drive sometime in 2008, I found that 'terabyte' was not in my spell-check dictionary. I loaded all the large number names into it so that wouldn't happen again. Then I thought, 'How big is a terabyte of data?' I came up with this list to put it in perspective. The figures were taken from all over the web, mostly Wikipedia articles such as the one on 'exabyte'. You can find them, and more, easily.

First the unit names, then the list. (All figures are approximate.)

Sizes (^ = exponent):

MB = megabyte  = 10^6 bytes
GB = gigabyte  = 10^9 bytes
TB = terabyte  = 10^12 bytes
PB = petabyte  = 10^15 bytes
EB = exabyte   = 10^18 bytes
ZB = zettabyte = 10^21 bytes
YB = yottabyte = 10^24 bytes
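
As a rough illustration of how these unit names map onto raw byte counts, here is a minimal Python sketch. The function name human_size and the UNITS table are made up for this example, and it assumes the decimal (power-of-ten) definitions listed above rather than the binary (power-of-two) ones some operating systems report.

```python
# Decimal byte units from the list above, smallest to largest.
UNITS = [("MB", 10**6), ("GB", 10**9), ("TB", 10**12), ("PB", 10**15),
         ("EB", 10**18), ("ZB", 10**21), ("YB", 10**24)]

def human_size(num_bytes):
    """Express num_bytes in the largest listed unit that keeps the value >= 1.

    Values below 1 MB fall back to a fraction of a megabyte.
    """
    label, factor = UNITS[0]
    for unit_label, unit_factor in UNITS:
        if num_bytes >= unit_factor:
            label, factor = unit_label, unit_factor
    return f"{num_bytes / factor:g} {label}"

print(human_size(15 * 10**15))  # 15 PB
print(human_size(10**12))       # 1 TB
```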

  2 MB - about a thousand pages of text
 10 MB - a very high resolution JPEG image (a picture is worth a thousand pages)
 40 MB - a typical music album in MP3 format (approx. 45 minutes)
400 MB - a typical audio CD in raw format (approx. 45 minutes)

  1 GB - an old-time black-and-white movie in normal DVD format
  1 GB - size of Windows 98 (Windows folder)
  2 GB - 10 seconds of detector data captured by the Large Hadron Collider
  8 GB - size of Windows XP (Windows folder)
 15 GB - size of Windows Vista (Windows folder)
 20 GB - a collection of the works of Beethoven
 24 GB - size of Windows 7 (Windows folder)
 50 GB - size of data on a dual-layer Blu-ray disc
100 GB - a library floor of academic journals

  1 TB - 50,000 trees (5 acre dense forest) made into paper and printed
  1 TB - the size of my personal computer data backup on March 16, 2012
 10 TB - the print collections of the U.S. Library of Congress
500 TB - data storage of Large Hadron Collider
850 TB - Google search crawler raw data from web in September 2006

  2 PB - all U.S. academic research libraries
 15 PB - data generated annually by the Large Hadron Collider
 20 PB - data size of all the hard disks made in 1995
576 PB - all the telephone data sent in 2003

  2 EB - total volume of information generated in 1999
  5 EB - all words ever spoken by human beings
  5 EB - all new print, film, magnetic, and optical storage data made in 2002

1.2 ZB - the volume of digital information created and duplicated in 2010
3.6 ZB - amount of content that all US consumers combined went through in 2008

  1 YB - US Department of Defense planned capacity for Global Information Grid

Notes on some really large sizes:

 2^128 =~ 10^38    = address space of IPv6
 2^265 =~ 10^80    = number of hydrogen atoms in the universe (conservative estimate)
10^100 == 1 googol = ten duotrigintillion
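
The power-of-two to power-of-ten conversions above can be checked with a logarithm, since log10(2^n) = n * log10(2). The sketch below is only a sanity check; the helper name is invented for this example, and the duotrigintillion value assumes the short-scale naming rule, where the n-th "-illion" is 10^(3n+3).

```python
import math

def power_of_two_as_power_of_ten(n):
    """Return x such that 2**n == 10**x, using log10(2**n) = n * log10(2)."""
    return n * math.log10(2)

print(round(power_of_two_as_power_of_ten(128), 2))  # 38.53
print(round(power_of_two_as_power_of_ten(265), 2))  # 79.77

# Short scale assumed: duotrigintillion is the 32nd "-illion".
duotrigintillion = 10 ** (3 * 32 + 3)
print(10 * duotrigintillion == 10 ** 100)  # True
```

So 2^128 is about 3.4 x 10^38 and 2^265 is a little under 10^80, which is close enough for order-of-magnitude purposes.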

More notes:
Technically, the first part of each of these unit names (mega-, giga-, tera-, and so on) is called an "SI prefix".
The yottabyte scale for the Global Information Grid comes from this document.
On March 16, 2012, the compressed Acronis image backup of my disk partitions C, D, E, F, G, H, J and K exceeded 1 terabyte.
It took 8 hours to store and validate.
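
For a sense of what that backup time means, here is a back-of-the-envelope throughput estimate. It assumes exactly 10^12 bytes (the backup only "exceeded" that) and treats the full 8 hours as one pass, even though the time covered both storing and validating, so take the result as a rough average.

```python
# Assumptions: exactly 1 TB (decimal) written, 8 hours of wall-clock time.
backup_bytes = 10**12
elapsed_seconds = 8 * 60 * 60

bytes_per_second = backup_bytes / elapsed_seconds
print(f"{bytes_per_second / 10**6:.1f} MB/s")  # 34.7 MB/s
```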

According to Wolfram Alpha, three more prefixes have been proposed, each a further three orders of magnitude:
hella or xenna - 10^27
weka - 10^30
vendeka - 10^33

Made on March 23, 2012, but originally posted in an old blog of mine on October 6, 2008.

 
 