SQL Server leads the (verti)paq | Data Architecture

Over the last few years, many large organisations have very likely been taking a serious look at the Microsoft Business Intelligence product stack. It’s a no-brainer to give it a try, as in general the licensing is free, but there are other offerings.

Some big players out there simply attract attention because of their high price tags (it must be good, right?). Informatica, Business Objects, Cognos, SAP and plenty of others offer the tantalising promise of end-user BI solutions.

Recently there appears to have been significant activity from newer outfits deploying very compelling BI solutions that interface with SQL Server. They don’t all attempt to be all-singing, all-dancing, but instead target specific needs within the BI arena. Varigence is one such example.

Whilst Oracle has been sitting on a ticking bug, possibly set to rival Y2K, IBM are what they are, and niche vendors like Tandem NonStop SQL evoke questions like “who?”, Microsoft has continued to roll out significant product improvements that genuinely raise the bar in corporate database solutions.

During a course I attended nearly a year ago, Ralph Kimball mentioned the “big data” white paper that Informatica had commissioned him to write, casually dropping a few BIG company names and quoting outrageously BIGGER data volumes. It began to get airplay a few weeks later. Since then every vendor (and their best mate) has been dropping the “big data” buzzword.

Microsoft’s response has been to promote Vertipaq from SSAS into the core database engine. This is one of the biggest upgrades to the basic RDBMS infrastructure in a very long time. Previous enhancements across the industry have focused on faster disk hardware, comms channels, RAM increases, very clever caching, DBA-focused performance and tuning, SSDs, virtualisation etc. I am sure there are plenty of other techniques, and other vendors’ solutions won’t have been static.

But ColumnStore goes right to the core of processing the mind-blowing datasets that are now becoming commonplace. Big data handling frameworks like MapReduce and then Hadoop kick-started the trend, but Microsoft has delivered it into the heart of BI. Clearly they have been cooking this up for some time. If you want to read up on it, then get yourself the free ebook released by Microsoft last week.
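For anyone who hasn’t met column storage before, here is a rough Python sketch of the general idea. To be clear, this is my own toy illustration and not how Vertipaq or ColumnStore is actually implemented: the table, the column names and the helper functions (`to_columns`, `rle_encode`, `sum_where`) are all made up for the example. It pivots a small row-based fact table into one list per column, run-length encodes a repetitive column, and then answers a filtered sum by walking the compressed runs rather than every row.

```python
from itertools import groupby

# Hypothetical toy fact table, stored row by row: (country, month, sales).
rows = [
    ("UK", "2011-01", 10),
    ("UK", "2011-01", 20),
    ("UK", "2011-02", 5),
    ("US", "2011-02", 7),
    ("US", "2011-02", 8),
]


def to_columns(rows, names):
    """Pivot row tuples into one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def rle_encode(values):
    """Run-length encode a column as (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def sum_where(rle_filter_col, measure_col, wanted):
    """Sum measure_col over the rows whose RLE-encoded filter column equals wanted."""
    total = 0
    pos = 0  # index of the first row covered by the current run
    for value, run in rle_filter_col:
        if value == wanted:
            total += sum(measure_col[pos:pos + run])
        pos += run
    return total


columns = to_columns(rows, ["country", "month", "sales"])
country_rle = rle_encode(columns["country"])

print(country_rle)
print(sum_where(country_rle, columns["sales"], "UK"))
```

The point of the sketch is simply that a query touching two columns never has to read the third, and that a sorted, repetitive column shrinks to a handful of runs. Real engines layer dictionary encoding, segment elimination and batch processing on top of this, none of which is attempted here.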

Dave Campbell, a Microsoft Technical Fellow, dropped this article on us early last week: “The coming in-memory database tipping point”. It was absolutely fascinating reading, especially for someone who “grew up” on b-trees, block splits, file reorgs, file contention measurement etc., on a system with only 512 MB of memory for each of its 8 CPUs (and some of that had to be allocated as disk cache). I still can’t believe we are prepping a BI server with 96 GB of RAM.

From what Dave has indicated, we have a lot more developments in this area to look forward to over the coming decade. This is one snippet, lifted directly from his article, of what’s in store:

Microsoft is also investing in other in-memory database technologies which will ship as the technology and opportunities mature. As a taste of what’s to come, we’re working on an in-memory database solution in our lab and building our real-world scenarios to demonstrate the potential. One such scenario, based upon one of Microsoft’s online services businesses, contains a fact table of 100 billion rows. In this scenario we can perform three calculations per fact – 300 billion calculations in total, with a query response time of 1/3 of a second. There are no user defined aggregations in this implementation; we actually scan over the compressed column store in real time.

As John would say, “That’s awesome, dude”.


