Big data has arrived. At BIDMC, I oversee 1.5 petabytes of clinical and administrative data. At HMS, I oversee nearly 3 petabytes of research data.
As Blackberry's recent outage illustrates, depending on a single monolithic infrastructure has its risks, and the impact of a failure can be enormous.
How can we leverage commodity hardware infrastructure, reduce risk, and meet user demands for mining big data? Apache Hadoop is a cool technology worth knowing about.
Hadoop is an open source framework that allows for the distributed processing of large data sets across clusters of computers, designed to scale from a single server to thousands of machines. Rather than rely on hardware to deliver high availability, Hadoop detects failures in software and automatically falls back to redundant copies of the data. The Hadoop library includes:
* The Hadoop Distributed File System (HDFS), which splits user data across servers in a cluster.
* MapReduce, a parallel distributed processing system that takes advantage of the distribution and replication of data in HDFS to spread execution of any job across many nodes in a cluster.
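To make those two pieces concrete, here is a toy sketch in plain Python. It is not Hadoop code and does not use any Hadoop API; it only imitates the idea on one machine. The block size, replication factor, node names, and every function name are hypothetical choices made for illustration. Data is split into blocks, each block is placed on more than one node, a word-count job maps over each block and reduces the results, and the job still completes when one node is marked as failed because another copy of each block survives.

```python
from collections import defaultdict

# Hypothetical toy settings; real HDFS blocks are far larger than two lines.
BLOCK_SIZE = 2    # lines per block
REPLICATION = 2   # copies kept of each block


def split_into_blocks(lines, block_size=BLOCK_SIZE):
    """Cut the input into fixed-size blocks, loosely like a distributed file system."""
    return [lines[i:i + block_size] for i in range(0, len(lines), block_size)]


def place_replicas(num_blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def map_block(block):
    """Map step: emit a (word, 1) pair for every word in the block."""
    for line in block:
        for word in line.lower().split():
            yield word, 1


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each word."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def run_job(lines, nodes, failed=frozenset()):
    """Run the word count, reading each block from any replica on a live node."""
    blocks = split_into_blocks(lines)
    placement = place_replicas(len(blocks), nodes)
    pairs = []
    for block_id, block in enumerate(blocks):
        live = [node for node in placement[block_id] if node not in failed]
        if not live:
            raise RuntimeError(f"block {block_id} has no surviving replica")
        # In a real cluster the map task would run on one of the live nodes;
        # here everything runs in a single process.
        pairs.extend(map_block(block))
    return reduce_counts(pairs)


if __name__ == "__main__":
    sample = [
        "big data has arrived",
        "hadoop handles big data",
        "hadoop scales out",
        "data data data",
    ]
    cluster = ["node-a", "node-b", "node-c"]
    print(run_job(sample, cluster))
    print(run_job(sample, cluster, failed={"node-b"}))
```

Losing one node leaves the result unchanged because every block still has a copy elsewhere. Losing both nodes that hold the same block makes the job fail, which is why the replication factor matters.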
Microsoft has just introduced support for Hadoop in SQL Server 2012 as part of its end-to-end Big Data roadmap.
A fault-tolerant distributed file system for big data that runs on commodity hardware and is even integrated into mainstream data mining tools like SQL Server. That's cool!
Friday, October 14, 2011
3:00 AM