Current Topic: Technology |
|
Ubuntu Enterprise Cloud | Ubuntu |
Topic: Technology |
11:06 pm EDT, Jun 7, 2009 |
Ubuntu Enterprise Cloud (new in Ubuntu 9.04): Ubuntu Enterprise Cloud brings Amazon EC2-like infrastructure capabilities inside the firewall. The Ubuntu Enterprise Cloud is powered by Eucalyptus, an open source implementation for the emerging standard of EC2. This solution is designed to simplify the process of building and managing an internal cloud for businesses of any size, thereby enabling companies to create their own self-service infrastructure.
|
pero on anything - Hadoop Indexes |
Topic: Technology |
9:31 pm EDT, Jun 6, 2009 |
Just a simple order log with a unique identifier (id) and a single associated product and customer. Since we want to view our data from different perspectives, we added two additional indexes on product and customer. (In this example we need two indexes because MySQL can only use the leftmost prefix of an index.)

Dumping the whole table as a single CSV file into your Hadoop cluster would mean that you always have to use what (R)DBMSs call a “full table scan”. It would be pretty much the same as removing all indexes from your MySQL table. Try to search for all products a customer ordered without the index idx_product_customer. (In fact Hadoop would perform this full table scan an order of magnitude faster.) It would be ridiculous to remove all indexes from your table, but that is actually what you did when you exported the whole table into a flat file!

What you should do, and what we did with great success, is to split up your flat-file CSV and arrange the data so that you can decide beforehand which part of the data needs to be accessed. So let’s split up the data and simulate all of the indexes (besides the primary key, more on that later on). A file-system layout could look like this:

orders/
    product_A/
        customer_1.csv
        customer_2.csv
    product_B/
        customer_1.csv
        customer_3.csv

So when searching for all orders customer_1 placed, we just use the file pattern orders/*/customer_1.csv. Remember: HDFS and MapReduce’s inputs (like FileInputFormat) support globbing. Now we have actually simulated indexes by partitioning the data!

From here on you can go into more detail depending on your data structure. As an example, you could add the date range and id range to the file name like this:

orders/product_A/customer_1.2009-06-04.2009-06-05.1000.2000.csv
orders/product_A/customer_1.2009-06-06.2009-06-07.5000.7000.csv
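The partitioning idea in the excerpt can be tried out on a local directory before touching a cluster. The following is only a minimal sketch: it uses Python's `pathlib` globbing as a stand-in for HDFS globbing, and the helper names (`build_layout`, `orders_for_customer`, `covers_id`) are hypothetical, not from the original post. It also assumes the two trailing numbers in the extended file names are an inclusive id range, which the excerpt implies but does not state.

```python
import re
import tempfile
from pathlib import Path

# Example layout from the excerpt; the file contents are made-up placeholder rows.
EXAMPLE_FILES = [
    "orders/product_A/customer_1.csv",
    "orders/product_A/customer_2.csv",
    "orders/product_B/customer_1.csv",
    "orders/product_B/customer_3.csv",
]

# customer.<start date>.<end date>.<first id>.<last id>.csv
RANGE_NAME = re.compile(
    r"(?P<customer>[^.]+)\."
    r"(?P<d0>\d{4}-\d{2}-\d{2})\.(?P<d1>\d{4}-\d{2}-\d{2})\."
    r"(?P<id0>\d+)\.(?P<id1>\d+)\.csv$"
)


def build_layout(root: Path) -> None:
    """Create the partitioned directory tree under root."""
    for rel in EXAMPLE_FILES:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("placeholder,row\n")


def orders_for_customer(root: Path, customer: str) -> list[str]:
    """Select only the files for one customer, across all products."""
    pattern = f"orders/*/{customer}.csv"
    return sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))


def covers_id(filename: str, order_id: int) -> bool:
    """Decide from the file name alone whether a file can contain order_id."""
    match = RANGE_NAME.match(filename)
    if match is None:
        return False
    # Assumption: the id range encoded in the name is inclusive on both ends.
    return int(match["id0"]) <= order_id <= int(match["id1"])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_layout(root)
        print(orders_for_customer(root, "customer_1"))
```

The point of the sketch is that the selection happens on path names only, so files for other customers or id ranges are never opened. On a real cluster the same glob string would be passed as the job's input path instead.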
|
[#HADOOP-5815] Sqoop: A database import tool for Hadoop - ASF JIRA |
Topic: Technology |
9:58 pm EDT, May 19, 2009 |
Overview: Sqoop is a tool designed to help users import existing relational databases into their Hadoop clusters. Sqoop uses JDBC to connect to a database, examine the schema for tables, and auto-generate the necessary classes to import data into HDFS. It then instantiates a MapReduce job to read the table from the database via the DBInputFormat (JDBC-based InputFormat). The table is read into a set of files loaded into HDFS. Both SequenceFile and text-based targets are supported. Longer term, Sqoop will support automatic connectivity to Hive, with the ability to load data files directly into the Hive warehouse directory, and also to inject the appropriate table definition into the metastore.
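The flow described above (connect, examine the schema, read the table into a set of files) can be illustrated at toy scale. This is not Sqoop's code or API: it is a rough sketch that uses Python's built-in `sqlite3` in place of JDBC, a local directory in place of HDFS, and a simple LIMIT/OFFSET split chosen here for illustration. The function names and the `part-NNNNN.csv` file naming are assumptions.

```python
import csv
import sqlite3
from pathlib import Path


def table_columns(conn: sqlite3.Connection, table: str) -> list[str]:
    """Examine the schema and return the column names in table order."""
    # PRAGMA table_info is SQLite-specific; a JDBC tool would read
    # the driver's metadata instead.
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [row[1] for row in rows]


def import_table(conn: sqlite3.Connection, table: str, out_dir, num_splits: int = 2):
    """Write the table as up to num_splits text files and return their paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    columns = table_columns(conn, table)
    total = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    chunk = max(1, -(-total // num_splits))  # ceiling division
    written = []
    for i in range(num_splits):
        # Each split reads one contiguous slice, ordered by the first column.
        rows = conn.execute(
            f'SELECT * FROM "{table}" ORDER BY "{columns[0]}" LIMIT ? OFFSET ?',
            (chunk, i * chunk),
        ).fetchall()
        if not rows:
            continue
        part = out_dir / f"part-{i:05d}.csv"
        with part.open("w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        written.append(part)
    return written
```

In the real tool each split would be read by a separate map task rather than a loop. The sketch only shows why schema inspection comes first: the column list drives both the ordering and the shape of the output records.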
|
Russell Jurney Journeys to Silicon Valley — TechDrawl |
Topic: Technology |
6:38 pm EDT, May 15, 2009 |
In an upcoming series spanning both coasts, Russell Jurney will explore startup geography — the importance and effect of physical location on technology startups. TechDrawl is flying him business class on AirTran round-trip to San Francisco and will be raising additional support for his travels by auctioning a pair of AirTran business class domestic passes on eBay and generally hitting-up and guilt-tripping our personal contacts for cash to cover his week of expenses. We will create a visualization page for donors to said fund. (Details to follow in a separate post). AirTran delivers the “best flying experience to smart travelers” and has 270 of its 750 daily departures leaving the ATL via Hartsfield-Jackson Atlanta International Airport (its primary hub). Jurney will try to quantify the ‘Valley Advantage.’ Tooling around Silicon Valley on a tractor (if they’ll rent him one at Dahl’s in the Valley), he will focus on several aspects of startup geography, including personal experiences of founders of both successful and unsuccessful startups and how their physical location influenced their startup’s trajectory. He will examine the importance of physical proximity in a social networked and internet collaborative world and try to answer the question, “How much does social networking make up for physical separation in developing a new venture and building a startup ecosystem?” Finally, he will examine regional specialization in technology startups and explore the question, “How important is it that the market your startup pursues is well suited to your physical location, and what kinds of products and markets are suited to different regions?” Jurney is a technologist and a compulsive, degenerate technology entrepreneur who is writing his next business plan by creating this series of articles. You can read more about him on Cloudstenography and follow his journey to Silicon Valley on Twitter and on TechDrawl.
Doing a bit of writing about startup geography and regional advantage.
|
Behind The Business Plan Of Pirates Inc. : NPR |
Topic: Technology |
9:39 pm EDT, Apr 30, 2009 |
From this and other ransom situations, here's a typical accounting for a piracy operation: About 20 percent goes to pay off officials who look the other way. About 50 percent is for expenses and payroll. The leader of an attack makes $10,000 to $20,000 (the average Somali family lives on $500 a year). The initial investor — who put in $250,000 of seed capital — gets 30 percent, sometimes up to $500,000. Gullestrup's ship and crew were returned safely, although the pirates didn't actually want to get off the ship right away. That's because they were afraid of getting robbed by other pirates on their way back to shore, Gullestrup says, so he gave them a ride north, dropping them closer to home. Fortunately, he says, he was going that way anyway.
|