
MemeStreams Discussion



Why MapReduce matters to SQL data warehousing | DBMS2 -- DataBase Management System Services
by Lost at 11:48 pm EST, Feb 8, 2009

In essence, you can do almost anything to a single record; that’s a map step. But you are sharply limited in how you combine information about multiple (often intermediate) records; that’s a reduce step. Still, reduce steps let you do counts, sums, or other aggregations. That, plus the general power of map steps, makes MapReduce useful for at least three major classes of applications:

1. Text tokenization, indexing, and search
2. Creation of other kinds of data structures (e.g., graphs)
3. Data mining and machine learning

Except for the building of entire search engines, these are all application areas that data warehouse users should and do care about. And they all still could benefit from large performance increases, as is evidenced by the routine compromises analysts make in areas such as data reduction, sampling, over-simplified models and the like.
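
To make the map/reduce split described above concrete, here is a minimal single-process sketch in Python of the first application class (text tokenization and counting). It is not taken from the quoted article; the names map_step, reduce_step, and map_reduce are hypothetical, and a real MapReduce system would distribute the grouping (shuffle) phase across many machines rather than hold it in one dictionary.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_step(record: str) -> Iterator[Tuple[str, int]]:
    """Map: arbitrary work on ONE record. Here: tokenize a line, emit (word, 1)."""
    for token in record.lower().split():
        word = token.strip(".,;:!?\"'()")
        if word:
            yield word, 1


def reduce_step(key: str, values: List[int]) -> int:
    """Reduce: a limited combination of the values for one key. Here: a sum."""
    return sum(values)


def map_reduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Hypothetical in-memory driver: map every record, group by key, reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # stands in for the shuffle phase
    return {key: reducer(key, values) for key, values in groups.items()}


if __name__ == "__main__":
    docs = [
        "Map steps work on one record.",
        "Reduce steps combine many records.",
    ]
    counts = map_reduce(docs, map_step, reduce_step)
    print(counts["steps"])
```

The mapper is free to do anything with its single input line, while the reducer only ever sees the list of values that share a key, which is the restriction the excerpt describes.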


 
 