Friday, June 27, 2008

Facebook Photo Storage Architecture

There is an awesome presentation called "Facebook - Needle in a Haystack: Efficient Storage of Billions of Photos"; here is an excerpt. You should see the full presentation (if you can get Flowgram to work).

Facebook uses MySQL, Memcache, Apache, and PHP (with extensions) in its application stack.

Facebook uses NetApp filers for storing files.

Facebook scale of photos
  • ~6.5 billion total images, 4-5 sizes stored for each image => ~30 billion files => 540 TB total storage capacity.
  • ~475,000 images served per second at peak – most through CDNs.
  • ~100 million uploaded per week.
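
The numbers above can be sanity-checked with a little arithmetic. This is only a back-of-the-envelope sketch: it assumes 540 TB means decimal terabytes and that the weekly upload figure counts images, neither of which the presentation excerpt spells out. The function names are mine, not Facebook's.

```python
# Figures quoted in the post; everything derived from them is approximate.
TOTAL_IMAGES = 6_500_000_000      # ~6.5 billion images
TOTAL_FILES = 30_000_000_000      # ~30 billion files
TOTAL_BYTES = 540 * 10**12        # 540 TB, assuming decimal terabytes
UPLOADS_PER_WEEK = 100_000_000    # ~100 million uploads per week (assumed to be images)

SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def files_for(sizes_per_image: int) -> int:
    """Number of stored files if every image is kept in this many sizes."""
    return TOTAL_IMAGES * sizes_per_image


def average_file_bytes() -> float:
    """Average bytes per stored file implied by the quoted totals."""
    return TOTAL_BYTES / TOTAL_FILES


def average_uploads_per_second() -> float:
    """Weekly uploads spread evenly over the week (real traffic is burstier)."""
    return UPLOADS_PER_WEEK / SECONDS_PER_WEEK


if __name__ == "__main__":
    print(files_for(4), files_for(5))
    print(average_file_bytes())
    print(round(average_uploads_per_second(), 1))
```

With 4 to 5 sizes per image you get 26 to 32.5 billion files, which brackets the quoted ~30 billion. The totals also imply an average of roughly 18 KB per stored file and about 165 uploads per second when averaged over a week.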
Facebook uses a 4-tier architecture for serving profiles and photos.
The first tier is the CDN, then their proprietary “Cachr”, then their photo servers, and finally the NetApp filers.
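
To make the tier ordering concrete, here is a minimal sketch of a read that walks the tiers front to back. The `Tier` class and `fetch` function are hypothetical, each tier is modelled as a plain dictionary, and filling the front tiers on a hit is my assumption about how such a chain usually behaves, not something the presentation excerpt states.

```python
from typing import Dict, List, Optional, Tuple


class Tier:
    """Hypothetical stand-in for one tier; a dict plays the role of its storage."""

    def __init__(self, name: str, store: Optional[Dict[str, bytes]] = None) -> None:
        self.name = name
        self.store: Dict[str, bytes] = dict(store or {})

    def get(self, key: str) -> Optional[bytes]:
        return self.store.get(key)

    def put(self, key: str, value: bytes) -> None:
        self.store[key] = value


def fetch(key: str, tiers: List[Tier]) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (value, name of the tier that answered), or (None, None) on a total miss."""
    for index, tier in enumerate(tiers):
        value = tier.get(key)
        if value is not None:
            # Assumption: tiers in front of the one that answered keep a copy.
            for front in tiers[:index]:
                front.put(key, value)
            return value, tier.name
    return None, None


def build_tiers() -> List[Tier]:
    """CDN -> Cachr -> photo server -> NetApp, with one photo stored at the origin."""
    return [
        Tier("cdn"),
        Tier("cachr"),
        Tier("photo_server"),
        Tier("netapp", {"photo-1": b"jpeg-bytes"}),
    ]
```

On the first `fetch("photo-1", tiers)` the request falls all the way through to the `netapp` tier. A second fetch of the same key is answered by the `cdn` tier, which is the whole point of stacking caches in front of the filers.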

Cachr
  • Protects the origin for profile pictures
  • Based on modified evhttp
  • Uses memcache as backing store
    • Microsecond response time on cache hit
    • Server can die or restart without losing cache
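
The "server can die or restart without losing cache" point follows from keeping the cached data outside the server process. Here is a minimal sketch of that idea. `FakeMemcache` is a dictionary standing in for a separate memcached process, and `CacheServer` and the origin callable are hypothetical names for illustration only; none of this is Facebook's actual code.

```python
from typing import Callable, Dict, Optional


class FakeMemcache:
    """Stand-in for an external memcached process that outlives the front-end server."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def set(self, key: str, value: bytes) -> None:
        self._data[key] = value


class CacheServer:
    """Stateless front end: all cached bytes live in the backing store."""

    def __init__(self, backing: FakeMemcache, origin: Callable[[str], bytes]) -> None:
        self.backing = backing
        self.origin = origin
        self.origin_fetches = 0  # counts how often this instance had to hit the origin

    def get(self, key: str) -> bytes:
        value = self.backing.get(key)
        if value is None:
            # Cache miss: go to the origin once, then remember the result.
            self.origin_fetches += 1
            value = self.origin(key)
            self.backing.set(key, value)
        return value


def load_from_origin(key: str) -> bytes:
    """Hypothetical origin fetch; real code would read from the storage tier."""
    return ("picture-for-" + key).encode()
```

If you create one `CacheServer`, request a key, throw the server away, and create a second one on the same `FakeMemcache`, the second server answers from the backing store without touching the origin.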
File handle cache
  • Based on lighttpd
  • Uses memcache as backing store
  • Reduces metadata workload on NetApp
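
The general idea of a file handle cache is that resolving a path costs metadata lookups, so you resolve it once and reuse the handle. The sketch below is a loose local analogy, assuming an in-process LRU of open Python file objects. The excerpt says the real thing keeps its handles in memcache, and it does not describe the mechanism further, so treat `FileHandleCache` as a hypothetical illustration rather than how the photo servers work.

```python
from collections import OrderedDict
from typing import BinaryIO


class FileHandleCache:
    """Keeps recently used files open so repeat reads skip the path lookup."""

    def __init__(self, capacity: int = 1024) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._handles: "OrderedDict[str, BinaryIO]" = OrderedDict()
        self.opens = 0  # how many times a path actually had to be opened

    def _handle(self, path: str) -> BinaryIO:
        handle = self._handles.get(path)
        if handle is None:
            handle = open(path, "rb")
            self.opens += 1
            self._handles[path] = handle
            if len(self._handles) > self.capacity:
                # Evict the least recently used handle.
                _, oldest = self._handles.popitem(last=False)
                oldest.close()
        else:
            self._handles.move_to_end(path)
        return handle

    def read(self, path: str) -> bytes:
        handle = self._handle(path)
        handle.seek(0)
        return handle.read()

    def close(self) -> None:
        for handle in self._handles.values():
            handle.close()
        self._handles.clear()
```

Reading the same path twice through one `FileHandleCache` opens the file only once; the `opens` counter is there so you can see that.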
NetApp storage architectural issues
  • NetApp Storage is overwhelmed with metadata
  • ~3 disk reads to read one photo
  • Totally bottlenecked on disk bandwidth
Thus, heavy reliance on expensive CDNs to serve reads:
  • 99.8% hit rate in CDN for profile images
  • ~92% hit rate for photos
  • Drastically reduces load on the storage
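
To see why those hit rates matter so much, here is a rough sketch that turns a CDN hit rate into the number of requests that fall through to Facebook's own tiers. Hit rates are passed in basis points (9980 means 99.8%) to keep the arithmetic exact. The worst-case disk read figure assumes every CDN miss reaches the NetApp filers and costs ~3 disk reads, which ignores anything Cachr and the file handle cache absorb, so it is an upper bound under my assumptions and not a measured number.

```python
DISK_READS_PER_PHOTO = 3  # "~3 disk reads to read one photo", from the list above


def origin_requests(total_requests: int, hit_rate_bp: int) -> int:
    """Requests that miss the CDN; hit_rate_bp is in basis points (9980 = 99.8%)."""
    if not 0 <= hit_rate_bp <= 10_000:
        raise ValueError("hit_rate_bp must be between 0 and 10000")
    return total_requests * (10_000 - hit_rate_bp) // 10_000


def worst_case_disk_reads(total_requests: int, hit_rate_bp: int) -> int:
    """Upper bound: assumes every CDN miss ends up as ~3 disk reads on the filers."""
    return origin_requests(total_requests, hit_rate_bp) * DISK_READS_PER_PHOTO


if __name__ == "__main__":
    per_million = 1_000_000
    print(origin_requests(per_million, 9980))        # profile images
    print(origin_requests(per_million, 9200))        # photos
    print(worst_case_disk_reads(per_million, 9200))  # photos, worst case
```

Per million requests, a 99.8% hit rate lets 2,000 through while a 92% hit rate lets 80,000 through, a 40x difference. At ~3 disk reads each, those 80,000 photo misses could mean up to 240,000 disk reads.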



Anonymous said...

Hi there... nice job you're doing! I was wondering if you could assist me. I'm currently in my final year, about to start my project and dissertation. My project is creating a social network website. The only problem is I can't seem to find any suitable research topics. I'm doing one on usability, but I just can't think of a second one that relates to my project.

9:17 PM  
