Create an Account
username: password:
 
  MemeStreams Logo

MemeStreams Discussion

search


This page contains all of the posts and discussion on MemeStreams referencing the following web page: Petascale SQL DB at Yahoo!. You can find discussions on MemeStreams as you surf the web, even if you aren't a MemeStreams member, using the Threads Bookmarklet.

Petascale SQL DB at Yahoo!
by possibly noteworthy at 7:13 pm EDT, May 29, 2008

Wednesday Yahoo announced they have a built a petascale, distributed relational database. In Yahoo Claims Record With Petabyte Database, the details are thin but they built on the PostgreSQL relational database system. In Size matters: Yahoo claims 2-petabyte database is world's biggest, busiest, the system is described as an over 2 petabyte repository of user click stream and context data with an update rate for 24 billion events per day. Waqar Hasan, VP of Engineering at Yahoo! Data group, describes the system as updated in real time and live – essentially a real time data warehouse where changes go in as they are made and queries always run against the most current data. I strongly suspect they are bulk parsing logs and the data is being pushed into the system in large bulk units but, even near real time at this update rate, is impressive.

The original work was done at a Seattle startup called Mahat Technologies acquired by Yahoo! in November 2005.


 
 
Powered By Industrial Memetics