News & Events

Appliance Affinity: Why Appliance Vendors are Buying the Kickfire Appliance

December 10th, 2008
Posted by Karl Van den Bergh

The demand for high-tech appliances has been on the rise in the last few years. Their benefits, including high performance, low TCO, rapid time-to-value, and ease of use, have driven adoption in a variety of industries, from data warehousing to network and security management, storage, retail, and telephony. As the analyst firm Forrester noted in a 2008 report:

“Appliances - in all their proliferation - are here to stay and are moving into the mainstream of computing and networking”

It turns out that the database of preference for a growing number of appliance vendors is MySQL. As noted on its appliance page, MySQL’s benefits of low TCO, ease of use, and rapid time-to-value map well to the requirements of appliance offerings.

As appliance markets have matured and competition has increased, there has been a growing need to differentiate. In many cases, appliance vendors are accomplishing this with the addition of analytics that enable customers to generate more value from the data being collected and stored in the appliance. However, due to its data warehousing limitations on general-purpose hardware, MySQL presents performance challenges for these appliance vendors. What we are now starting to see is that these vendors are turning to Kickfire to address this challenge.

As an example, one of our first customers is a mid-market ERP vendor with over 30,000 customer locations that delivers its application as an appliance. This company has developed a new analytic offering that gives customers a series of reports and dashboards to analyze their transactional data captured in the ERP application. Using the Kickfire appliance as the platform for their analytic offering, this vendor saw 50X - 100X query performance improvement.

Another early customer, a network performance management appliance vendor that services many of the world’s largest organizations, was seeing its larger customers starting to hit the scalability and performance limits of MySQL. After testing with the Kickfire appliance, this vendor saw its query performance improve by an average of 600X.

Performance aside, there are other features of the Kickfire offering that make it compelling to appliance vendors. Some of these include:

  • High analytic user concurrency (100+). Many of these appliance offerings have large numbers of users that need access to the data, not just a select few analysts as in traditional data warehousing
  • High volume, near real-time updates. This feature makes Kickfire suitable in operational-type environments, common in those markets where appliances have succeeded
  • Extensive SQL performance coverage. With this capability, Kickfire can deliver performance on the broad range of analytic applications found within the appliance world
  • The appliance form factor. As an appliance, Kickfire fits well with the vendors’ own delivery mechanism and plug-and-play value proposition

In summary, the Kickfire MySQL appliance fits well with both the delivery model and the analytic requirements of many appliance vendors, which is why we believe we are continuing to see strong interest from this market sector.

New, New, New … News at Kickfire

November 12th, 2008
Posted by Karl Van den Bergh

It’s been a crazy month here at Kickfire which is why I have fallen a bit behind on my postings – a new product, new customers, a new CEO, a new relationship with Sun/MySQL, a new website … and a new baby girl! Here’s a quick summary of all that has been going on:

New Product
We quietly came out of beta a month ago. After nearly two and a half years in development, this is a great achievement for the company. The team took on a hugely ambitious project: to re-design how SQL is processed today to be able to deliver an order of magnitude improvement in price/performance relative to any other data warehousing solution on the market. This project involved bringing together over 50 of the industry’s smartest database and hardware engineers to build a new type of database machine that includes the world’s first SQL chip, an ultra-modern database kernel, and advanced system features. Kickfire’s four data warehousing benchmark world records and subsequent customer performance results have proven that the team has successfully delivered on this lofty goal.

New (Paying) Customers
I will write about this in more detail in a future post but I wanted to include a couple of reference quotes from some of our early paying customers to give a sense of the performance results they are seeing:

“In the global deployments of our largest customers, MySQL performance placed limitations on the volume of data we could analyze. Kickfire’s MySQL Appliance delivered impressive 600X average performance improvements in our tests with much larger data volumes than we could previously support. We just placed an order.”  Research Director, Network Management Company

“We started evaluating Kickfire for the excellent price/performance they deliver for MySQL which we use extensively. After the initial tests showed performance improvements of over 100X for some of our most complex queries, we decided to purchase our first system.” SVP, $400M High-Tech Company

New CEO
Today we announced Bruce Armstrong as our new CEO. We’re all very excited to have Bruce join the company. He is a respected database and data warehousing veteran who brings with him a wealth of public and startup company experience. He started his career by helping to build Teradata, the data warehousing giant, into a successful public company. Following its acquisition by AT&T/NCR he was named president of the Teradata subsidiary and an AT&T company officer. After Teradata he was VP and GM of Sybase’s $700M database business. Prior to joining Kickfire he was CEO at startup KNOVA, which he grew from a private venture to a public company. We all think Bruce makes an excellent addition to the Kickfire team and will help us take the company to the next level.

New Relationship with Sun Microsystems
On October 14th we announced an important global, multi-year agreement with Sun. As part of this agreement all of Kickfire’s 2000 and 3000 Series Appliances will come with a pre-packaged MySQL Enterprise (TM) subscription. This agreement builds on our earlier work together to announce record-breaking TPC-H numbers. This announcement reinforces Kickfire’s absolute dedication to and focus on MySQL. As I mentioned in an earlier post, we have built our business around MySQL as we believe it will eventually establish itself as the #1 deployed database by virtue of its ease of use and cost benefits. This sea change in the database world will ultimately benefit hardware vendors, not software vendors, which is why Sun spent a billion dollars on MySQL. Kickfire’s vision is to become the #1 hardware platform for MySQL for data warehousing and business intelligence applications.

New Website
OK, so it’s not completely new but our Marketing team did upgrade our site to include some nice new features. As an example, I like the image below that appears on our home page highlighting the Kickfire difference. When people ask me to describe Kickfire’s difference in a nutshell I talk about the SQL Chip, MySQL, and being a true appliance. The first gives us the industry’s best price/performance. The second means we operate as a standard. The third delivers a true plug-and-play experience. All three requirements are critical for the data warehousing mass market (sub 10TB) which is Kickfire’s sweet spot.

And a New Baby Girl
To end on a personal note — and by way of an excuse regarding my dearth of postings — in addition to all the excitement at Kickfire over the last month, my wife and I welcomed little Zoe Margaret to our family. We are thrilled to have her with us, as is her older sister Maia. That said, my daily routine is completely upside down right now. All I can say is thank God for coffee!

More Good News for Data Warehousing on MySQL

September 24th, 2008
Posted by Karl Van den Bergh

Last week, Infobright announced it had open sourced its data warehousing software code. This is good news for the growing number of organizations looking to use MySQL as a data warehousing platform. According to IDC, MySQL is already the third-most deployed database for data warehousing and Infobright’s move will give users yet another reason to seriously consider MySQL for this application.

For those of you not familiar with the Infobright offering, it is essentially a column-oriented data store for data warehousing. While the column-oriented approach is not exclusive to Infobright (Kickfire’s MySQL storage engine is also column-oriented, as are some other non-MySQL data warehousing solutions on the market) Infobright does have some unique technology that Lou Agosta recently described as follows in his post on Trends in Data Warehousing for the Second Half of 2008:

“Infobright is applying “rough set theory” – related to fuzzy logic – to enable what I would describe as “smart data” (this is not Infobright’s term) without the requirement for indexing. Infobright calls it an “intelligent knowledge grid,” an automatically tuned, ultra-small layer of metadata.”

I think Infobright has made the right choice in going open source. It is difficult to stand out in the crowded data warehousing market and this move helps the company differentiate. Additionally, it is a business model that makes sense when targeting organizations interested in using MySQL. Although, unlike Kickfire, Infobright is not uniquely focused on the MySQL market — it is first and foremost a standalone analytic database — it does integrate with MySQL through the storage engine API and has targeted MySQL accounts. In its prior incarnation it was attempting to sell commercial software to open-source-sensitive accounts — never an easy value proposition. Today, its model is much better suited to the MySQL user base. [As a side note, Kickfire has innovated an open-source friendly business model by creating what I term as an open source-based business.]

Some might view Infobright’s and Kickfire’s offerings as competitive but I believe they are complementary. First of all, the two solutions target different ends of the market. Kickfire is primarily focused on the 100GB to 10TB market whereas Infobright mainly targets data warehouses in the 10TB to 30TB range. Secondly, Infobright is focused on data volume, enabling MySQL-based data warehouses to reach the 10’s of terabytes. Kickfire is focused on price/performance: it delivers the industry’s highest price/performance offering. Finally, Infobright is a specialist in traditional and historic analysis whereas Kickfire focuses on covering a breadth of data warehousing workloads including ad hoc querying, operational data warehousing, and reporting on OLTP schemas.

The bottom line is that I believe Infobright’s announcement is good for the MySQL data warehousing market. It will hopefully incent more organizations to try MySQL for their data warehousing needs which will ultimately benefit all vendors, Infobright and Kickfire included, that choose to focus on this market.

A New Business Model for Open Source?

September 5th, 2008
Posted by Karl Van den Bergh

Kickfire was recently selected by Network World as one of 10 Open Source Companies to Watch. First of all, the disclaimer: we are not an open source company. As any of you reading this blog know, Kickfire is an appliance company. So, why then did we appear on the list? The link of course is MySQL.

The Kickfire appliance was built to run MySQL for high-performance business intelligence and data warehousing workloads. So, while we are not an open source company, we are very much what I would term as an “open source-based business”. Now, for those who track the data warehousing market, it might seem that a lot of vendors could claim that mantle as a large proportion have code that is derived from PostgreSQL. However, that’s not what I mean by an open source-based business. So, how would one define such a business and what is the significance? Before I get to an explanation, let me make a brief digression.

No one would dispute that the open source movement has irrevocably changed the software landscape. The days of commercial software’s monopoly are numbered. Of course Oracle and its ilk are not going anywhere anytime soon but they will die a slow death as they run out of companies to acquire and as fewer customers accept the exorbitant license and maintenance fees. The results of the Independent Oracle User’s Group 2007 open source survey are an early indication of this transformation. The survey found that over one third of Oracle shops are using open-source databases (75% of which are MySQL by the way). The primary driver? Cost. Now, the deployments today are small in size but will grow over time as the capabilities of the open-source databases increase. And as they grow in size and frequency, they will eat away at the market share of the established commercial vendors. In fact, according to Gartner, MySQL is already the third most deployed database (behind SQL Server and Oracle but ahead of DB2). I suspect, given its growth rate, it won’t be long before it reaches the #1 spot.

Given the inescapable fact that open source will eventually eat commercial software’s lunch, isn’t this a concern for the future of innovation? It is, if you care to listen to pundits such as Craig Mundie, Chief Research and Strategy Officer at Microsoft, who not surprisingly don’t see open source as a viable business model. The conclusion of Mundie’s article titled Commercial Software, Sustainable Innovation sums it up:

“When comparing the commercial software model to the open-source software model, look carefully at the business plans and licensing structures that form their foundations. This comparison leads to the conclusion that the commercial software model alone has the capacity for sustaining real economic growth.”

While there is clearly an element of self interest in his article one can’t deny the fact that, from a pure revenue perspective, commercial software has vastly outperformed open source software to date. Red Hat, the largest commercial open source software company has approximately $500M in sales while Microsoft, the largest commercial software company, has about 120 times that.

So, is the dawn of open source also the dusk of innovation? If you’re Microsoft or Oracle, maybe. But that’s because they are looking at open source exclusively through the lens of revenue tied to software licenses. What’s missing from Craig’s argument and others like it is that open source DOES have massive revenue-generating capability (and hence can sustain innovation) - it’s just not related to software licenses or subscriptions. Take MySQL as an example. There are approximately 11 million active MySQL deployments worldwide but only a tiny fraction (<.01% ?) generate subscription revenue for MySQL/Sun. However, these MySQL instances require billions of dollars of hardware to run. So, whereas MySQL might look insignificant from a software revenue perspective it is a gold mine from a hardware perspective. And this is how Kickfire sees the world.

Kickfire believes MySQL will ultimately win the database war (from a deployment, not software revenue perspective) thanks to its cost advantages, its architectural flexibility and its ease of use. Starting from the premise that the preponderance of money in open source is to be made from hardware, the company built hardware customized for MySQL that significantly improves its performance. Specifically, MySQL running on the Kickfire appliance outperforms general-purpose hardware by 10-1000X for business intelligence and data warehousing workloads.

Simply put, Kickfire has built a business that complements this open source standard, without itself becoming an open source company in the traditional sense. The latter is what I define as an open source-based business. On the other hand, a significant number of Kickfire’s competitors in the data warehousing world are what I would term open source-derived businesses. Specifically, they have taken the PostgreSQL code base as a starting point to build proprietary and commercial software offerings. On the surface, the difference might seem academic but the reality is that the market approaches and business models are fundamentally different. An open source-based business builds its products to complement the open source standard. An open source-derived business takes the open source standard and makes it proprietary. An open source-based business generates its revenue from offerings that add value to the open source standard. An open source-derived business generates its revenue based on a traditional commercial software model. The bottom line is that although open source appears somewhere in the mix, the open-source derived businesses are really no different from traditional commercial software vendors.

Kickfire believes open source-based businesses are the wave of the future and will help in continuing to fund the innovation that open source brings. Let us know your thoughts.

When VLSI meets DBMS: The Story behind the World’s First SQL Chip

August 21st, 2008
Posted by Raj Cherabuddi

In April this year, Kickfire announced the first high-performance appliance for MySQL. As part of the announcement, the company released data warehouse benchmark results that broke prior records in terms of price/performance and performance in a non-clustered environment. While the creation of a new appliance built exclusively for MySQL along with the benchmark records was noteworthy, perhaps the bigger story lies in what we believe to be the beginning of a paradigm shift in the database world - one marked by the advent of the first SQL chip.

To give some context to this story I have included a graph below which depicts the evolution of VLSI (Very-Large-Scale Integration) semiconductor technology and its growing impact on a broadening range of industries.

[Figure: VLSI Impact]

Specifically, this diagram shows that as VLSI density (# transistors per sq millimeter) has increased over time per Moore’s Law, it has been possible to transition an increasing number of applications from a “software/CPU” model to a “custom chip” model. Starting with Digital Signal Processing in the 1970’s through to SQL Processing today, there has been a long history of industries that have witnessed this transition and seen a major upheaval in the status quo.

Take graphics processing as an example. Initially, the graphics market was led by companies such as Silicon Graphics with their high-end terminals built on a combination of proprietary software and general-purpose CPUs. This all changed with the arrival of the graphics chip. Designed by companies like ATI and Nvidia, the graphics chip delivered a much higher price/performance ratio, which opened up high-end graphics processing to a much broader audience (e.g. gamers) and transformed the industry. Silicon Graphics, now called SGI, is worth $73M today. Nvidia is worth 100 times more at $7.3B.

The question you might be asking yourself is why these particular applications? What is it about these applications that made them suitable for such a transition? In a word, Dataflow.

The common characteristic underlying these application domains is that they all deal with the need to process large volumes of data at high speed. Now, general-purpose CPUs are based on the von Neumann architecture which was conceived in the 1940’s at a time when data volumes were much much smaller than today. This architecture is an instruction-centric or control flow one that is good at processing large numbers of instructions quickly but not well suited to processing large data sets due to the so-called von Neumann bottleneck.

What the pioneers in each of the application domains we mentioned discovered is that a Dataflow architecture is much better suited to solving the problem of high-volume data processing because it eliminates the von Neumann bottleneck. In a dataflow architecture the data, as opposed to instructions, flows directly through the execution engine. There are no wasted clock cycles spent waiting for data to arrive into the registers as in the case of the von Neumann architecture. The difference is significant. As an example, a single SQL chip from Kickfire provides better performance than 10’s of CPU cores, as demonstrated in the data warehouse benchmark results we published.

In my next post I’ll discuss this topic a little more, explaining why the transition from general-purpose CPU to custom chip is only happening now in the database world and why we believe this will be an irreversible trend.

Why $20 million for Kickfire?

July 31st, 2008
Posted by Karl Van den Bergh

As Matt Asay recently mentioned in his post about Kickfire, the company just closed a Series B for $20 million. In today’s credit-scarce market where VC funding is flat/declining, $20 million is a lot of money, especially for a company whose product is still in beta. What’s more, there seems to be an investment bubble in the broader data warehousing space in which Kickfire participates (at last count, there were over two dozen vendors, the majority of which are relatively new entrants) and that bubble looks like it is starting to burst as witnessed by Microsoft’s recent acquisition of DATAllegro. So, are the Kickfire investors misguided or is there something more here than just another data warehousing play? If the successful track records of the top-tier firms (Accel, Greylock, and Mayfield) that have invested in Kickfire  are anything to go by, a betting man would probably assume the latter.

There are many reasons I could give for why this $20 million bet makes sense but I’ll just give the two most important ones here - ones that set Kickfire apart from every other player in this space.

1) The SQL chip. This is Kickfire’s core technology differentiator. It has become clear from the TPC-H benchmark records that Kickfire announced at launch and subsequently  that this technology delivers results. What may not be as apparent is the macro-level implication. We believe that this is the start of a trend that has been seen before in many other industries such as graphics and network routing. Specifically, as VLSI technology has improved, more computing workloads have moved from software running on general-purpose CPUs to custom chips specifically designed for these workloads. The resulting chips have far outperformed their general-purpose counterparts at a fraction of the cost leading to tectonic shifts in the industry landscape. Such shakeups have played out numerous times before, leading in many cases to the creation of new markets and the birth of industry behemoths (think Nvidia/ATI in graphics and Cisco/Juniper in network routing). VLSI technology has now gotten to the point that SQL-like operations can be run natively in silicon. Much as has happened in other industries, we believe this shift will also happen in the database world. Whether or not Kickfire will ultimately be a player in this transformation, the transformation WILL happen, and Kickfire has started the ball rolling.

2) MySQL. This is Kickfire’s second not-so-secret “secret weapon.” The two dozen data warehousing vendors I mentioned can be broadly grouped into two buckets. First, the traditional database vendors (Oracle, Microsoft, IBM). Second, the pure-play data warehousing vendors. Up until now, customers have had to choose between these two imperfect options a) incumbents which deliver the benefits of being standard (e.g. broad third-party tool and app support) but fall down from a performance perspective and b) pure-plays which deliver the performance benefit but fail the “standard” metric as their databases are proprietary. With Kickfire this changes. The Kickfire appliance looks and feels just like MySQL running on a Linux server, except it is 10-1000X faster for data warehousing. It therefore delivers the benefit of running a standard database while delivering performance at the same time. And just in case there is any dispute that MySQL is now a standard here are some stats. MySQL has now 11 million active installations, growing at an estimated 30% a year. According to Gartner, MySQL is now the third most deployed database on the planet, ahead of DB2. According to IDC, MySQL is now the third most used database for data warehousing.

On a final note, some skeptics might still say that targeting an open source market is no guarantee of success. Very few companies, with a couple of notable exceptions, have made it big thus far. After all, isn’t one of the most appealing aspects of open source its low cost (read “free” for the majority of users)? All true if you think about this from a software perspective. But from a hardware perspective, the open source world looks very different. Today, billions of dollars are spent on the servers running MySQL. Kickfire is a systems company. We build appliances and we’re targeting those billions of dollars.

600X MySQL Performance Improvement with Kickfire

July 17th, 2008
Posted by Karl Van den Bergh

As promised, in this post I will update on the performance improvements another Kickfire beta customer is seeing relative to its query response times.

The customer in question is a successful mid-sized company in the network management space. As part of their network management offering, they provide network monitoring and analysis capabilities. They are currently using MySQL as their backend database. The trouble they are having is that they can’t scale beyond about 50GB of data without impacting their monitoring and analysis performance. What this translates to is that their customers can’t use their solution to monitor more than 30 days’ worth of network traffic. While this is OK for some, others are clamoring for the ability to track and analyze up to three years of traffic and are willing to pay significantly more to do so. Today, if they try to accommodate these customers, the queries end up taking hours to run, which is unacceptably long.

To test the Kickfire appliance, the customer ran their 12 hardest queries on about half a terabyte of data. The customer schema has 125 tables and about half a billion rows in the fact table. As I received a request for more detail in my last post, I’ve pasted below one of the queries (obfuscated for privacy) that the customer is trying to run as an example of what they are trying to do.

SELECT (CEILING(TIME_END/900)*900) + -21600 AS TIMEBIN,

IFNULL(SUM(RTT_SUM)/SUM(RTT_COUNT), 0.0) AS RTT,
IFNULL(SUM(RETRANS_SUM)/SUM(RETRANS_COUNT), 0.0) AS RETRANS,
IFNULL(SUM(APP_SUM)/SUM(APP_COUNT), 0.0) AS DTT,
IFNULL(SUM(SERVER_SUM)/SUM(SERVER_COUNT), 0.0) AS SRT,
IFNULL(SUM(RTT_COUNT),0) AS RTTCOUNT,
IFNULL(SUM(RETRANS_COUNT),0) AS RETRANSCOUNT,
IFNULL(SUM(APP_COUNT),0) AS DTTCOUNT,
IFNULL(SUM(SERVER_COUNT),0) AS SRTCOUNT,
IFNULL(SUM(RTT_SUM+RETRANS_SUM)/SUM(RTT_COUNT), 0.0) AS EFFECTIVERTT

FROM RUNS1

WHERE (TIME_END > 1197472400 - (86400*30) AND TIME_END <= 1197472400)
AND MAINTENANCE = 0

GROUP BY TIMEBIN
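
To make the arithmetic in this query easier to follow, here is a rough Python sketch of its three building blocks: the 15-minute time binning, the averages that fall back to zero, and the 30-day window. The function names are hypothetical and chosen only for illustration. The constants are taken from the query itself; what the -21600 shift represents (it equals six hours in seconds) is not stated, so it is treated here as an opaque offset.

```python
# Constants copied from the query; the helper names below are hypothetical.
BIN_SECONDS = 900             # 900 seconds = 15-minute bins
OFFSET_SECONDS = -21600       # fixed shift of six hours, meaning not stated in the post
WINDOW_SECONDS = 86400 * 30   # 30 days, as in the WHERE clause


def time_bin(time_end: int) -> int:
    """Mirror (CEILING(TIME_END/900)*900) + -21600 for integer timestamps."""
    bins = -(-time_end // BIN_SECONDS)  # integer ceiling division
    return bins * BIN_SECONDS + OFFSET_SECONDS


def ratio_or_zero(total, count) -> float:
    """Mirror IFNULL(SUM(x)/SUM(n), 0.0): a NULL or zero-count ratio becomes 0.0."""
    if total is None or not count:
        return 0.0
    return total / count


def in_window(time_end: int, now: int) -> bool:
    """Mirror TIME_END > now - 30 days AND TIME_END <= now."""
    return now - WINDOW_SECONDS < time_end <= now
```

Every row in the 30-day window is assigned to a bin and each bin then yields one output row, so the amount of work grows with the number of rows scanned rather than with the number of rows returned.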

Prior to using Kickfire, the customer had taken these 12 queries and set up a lab environment to try and achieve higher performance. By re-architecting the application, leveraging things like partitioning, they were able to achieve a 60X improvement. The problem with this approach is that this re-architecture would have introduced significant development and testing efforts and forced their customers to make a major and costly upgrade to their installations.

The customer then did the query performance comparison with Kickfire by simply moving the data and schema as is to the Kickfire appliance. The appliance in question was the Kickfire 2300 which comes with 64GB memory.

Without any re-architecture, the customer was able to achieve an average 600X improvement out of the box. What this means is the customer can now support its larger customers, and generate more revenue for the company, without any system re-architecture or associated cost.

A New Hardware-Based Approach to Data Warehousing

June 27th, 2008
Posted by Ravi Krishnamurthy

My name is Ravi Krishnamurthy - I am the Chief Software Architect here at Kickfire. I’ll be blogging about our thoughts on database technologies for data warehousing. More specifically, I’ll be talking about current challenges, directions going forward, simplifications for wider market deployment, and other ideas.

Data Warehouse (DW) queries are known to be more complex, more demanding, and longer running than OLTP queries. Some of the distinctive features of these DW queries that produce these characteristics are:

1) Table scan: Most OLTP queries are point queries updating or inserting a few rows of transactional data. Most DW queries on the other hand are reporting or business intelligence (BI) queries which typically touch large numbers of rows of data, often computed by sequential table scans over the large data sets.

2) Many/complex joins: Multiple tables with many joins in the query pose a number of challenges. For example, any sizeable table (except the first/outermost) being joined using a table scan would cause the performance to degrade significantly in most cases. Obviously, if all joins are foreign-key (FK) to primary-key (PK) joins and can use the index on the PK then there’s no problem. However, for many reasons (e.g., use of functions, PK-to-FK joins, etc.) using indexed-join methods may not be possible, thereby making joins very expensive.

3) Lots of GROUP BY aggregations: Typical reporting and BI queries leverage many grouped aggregations with multiple GROUP BY keys over a large number of rows, which can be very slow. Use of the DISTINCT clause compounds this problem further. Clearly, using indexes on GROUP BY keys helps improve the performance of this type of query. However, the presence of multiple GROUP BY keys or the use of indexing in the join operation may preclude the use of indexing for the grouping operation, which impacts performance.

4) ORDER BY limit: A request for the top ten rows, especially based on a GROUP BY aggregated value, is typically done by sorting all the computed data and then returning the top ten rows. If this grouping (for aggregated values) were done by, say, department, then even for a large enterprise you end up with a sort over thousands of rows, which is not that bad. However, if you group by, say, visitor ID from a clickstream data set or by product ID from a point-of-sale data set to get the top ten rows, you end up with a sort over potentially millions of rows.

5) Complex filters: LIKE predicates (over a large number of rows) and STRING functions creating complex filters involving AND/OR conditions as well as the use of CASE expressions are typical in DW queries. These tend to create execution flows that significantly reduce the ability to use indexes and also incur computational overhead.

6) Correlated sub-queries: Creating queries that use other queries as building blocks is a common practice in BI. If correlated variables are used in these sub-queries then it becomes difficult to process those queries efficiently which becomes a challenge for the optimizer. Any failure to correctly optimize these types of queries can quickly degrade performance.
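
As a rough illustration of items 3, 4, and 6, here is a small Python sketch that runs two such query shapes against an in-memory SQLite database. SQLite stands in for MySQL only to keep the example self-contained, and the clicks table, its columns, and its rows are all invented for this sketch.

```python
import sqlite3

# Hypothetical clickstream table, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (visitor_id INTEGER, page TEXT)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?)",
    [(1, "home"), (1, "home"), (1, "news"),
     (2, "home"),
     (3, "home"), (3, "news"), (3, "shop"), (3, "shop")],
)

# Items 3 and 4: a grouped aggregation with DISTINCT feeding ORDER BY ... LIMIT.
# Every visitor_id group has to be aggregated before the top rows can be chosen.
top_visitors = conn.execute(
    """
    SELECT visitor_id,
           COUNT(*)             AS click_count,
           COUNT(DISTINCT page) AS distinct_pages
    FROM clicks
    GROUP BY visitor_id
    ORDER BY click_count DESC, visitor_id
    LIMIT 2
    """
).fetchall()

# Item 6: a correlated sub-query. The inner query refers to the outer row
# (c.visitor_id, c.page), so logically it is re-evaluated per outer row.
repeat_views = conn.execute(
    """
    SELECT DISTINCT c.visitor_id, c.page
    FROM clicks AS c
    WHERE (SELECT COUNT(*)
           FROM clicks AS i
           WHERE i.visitor_id = c.visitor_id
             AND i.page = c.page) > 1
    ORDER BY c.visitor_id, c.page
    """
).fetchall()

print(top_visitors)
print(repeat_views)
```

With only eight rows both queries return instantly. The point is the shape of the work: the first query has to aggregate every group before it can keep two rows, and the second logically repeats an inner scan for each outer row.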

The above are examples of the types of problems that database administrators (DBAs) are facing today when trying to deliver performance for reporting and data warehousing applications. To get around these problems, DBAs currently spend a lot of time and effort tuning the system, rewriting queries, configuring the I/O subsystem, increasing memory/disk resources and so on.

What if a hardware-based approach could mitigate these performance issues?

In the next few blog posts I will dig into this question in more detail. I’ll talk about some of the issues mentioned above, discuss how DBAs try to work around them today, the pitfalls inherent in these workarounds, and how a hardware-based approach could significantly simplify things.

Kickfire: Early MySQL customer success

June 12th, 2008
Posted by Karl Van den Bergh

I’m happy to say that the market response to our launch continues to be positive. So far we have had nearly 30 postings on the leading blogs in the MySQL world as well as close to 20 articles published in traditional media. Our press releases were picked up and published on over 30 sites. We had about 400 people stop by our booth at the MySQL conference and we continue to get a significant number of prospective customers and partners contacting us every week who want to know more about the company and our product.

Though the response has been very enthusiastic there has also been some healthy skepticism about how well the product would perform in real customer environments. In this post I’d like to briefly describe the results we are seeing at one of our beta customers.

The customer in question is a publicly traded company that manages the online forums for large media and web businesses. They use MySQL extensively today to store the forum data and have racks of servers doing this. They have also used MySQL to build their data warehouse. They use the data warehouse to provide their clients with reports on forum traffic, user activity, hot topics etc. For example, one of their customers, a leading cable channel, uses the data from the reports to understand how viewers are reacting to new shows being introduced.

The current data set for one of their clients has 50 million rows, or about 45GB of data. The data set contains clickstream data with all the associated dimension tables. In order to validate the performance of the Kickfire Database Appliance, the customer gave us six of their poorest performing queries to run. Even though these queries had already been highly optimized, they still took too long to run, in one case over an hour. These queries run slowly because they contain multiple GROUP BY aggregations with the DISTINCT clause, whose non-linear processing causes the performance degradation. That this would be a performance challenge for MySQL will not come as a surprise to MySQL DBAs, but the fact that the impact is significant even on a relatively small data set makes the observation notable.

After loading the data into the Kickfire Database Appliance the customer saw an average 35X speedup in the performance of the previously tuned queries. The query that had taken over an hour to run now runs in a fraction of a minute, yielding a 150X speedup. A positive outcome of this speedup is that the customer no longer has to resort to using and maintaining custom stored procedures to make sure the queries are processed in an acceptable time window.

Given these significant performance improvements, the customer has conceived of a new revenue-generating service that the Kickfire appliance will allow them to implement. The service consists of enabling the company’s community managers to create ad hoc queries for their clients on a fee basis. Clients have been asking for additional information beyond the canned reports they have been getting, but it has been too difficult, too time consuming and ultimately too expensive to provide custom data access for all but the very largest clients. Now, with Kickfire, the customer plans to give its community managers the ability to create these ad hoc queries directly themselves. By combining a user-friendly BI tool with the raw power of the Kickfire appliance, the customer believes even its non-technical community managers will be able to quickly generate these ad hoc queries and consequently create a new revenue stream for the company.

In my next post I’ll write about a customer in the network monitoring space.

MySQL and Kickfire Break Records (Again)

May 20th, 2008
Posted by Karl Van den Bergh

Following on from the announcement at the MySQL conference where Sun and Kickfire jointly announced data warehousing benchmark records, we have just announced new TPC-H benchmark records. Specifically, the Kickfire Database Appliance 2400 is the highest price/performance offering at 300GB, again breaking the $1 barrier for the first time coming in at 89 cents per QphH (Queries per hour on the TPC-H benchmark). The 2400 is also the highest performance (non-clustered) offering at 300GB.

I’m not going to dwell further on the numbers in this post, other than to point out another aspect of this achievement that Justin noted in his blog: the energy savings the Kickfire appliance delivers in addition to the performance and price/performance. What I want to address is why we decided to do these benchmarks in the first place and what we believe their relevance to be. As we continue publishing these benchmarks, we occasionally get questioned about their importance (or lack thereof). Here’s my take.

First of all, they’re benchmarks and so, by definition, are limited. Let’s get this one out of the way. No benchmark, no matter how thorough, is going to cover every possible real-world scenario. But just because benchmarks have limitations is not a reason to discard them, particularly if they are thoughtfully conceived and rigorously applied as is the case for the TPC-H benchmarks.

Some vendors have been pushing the idea that only POCs count. Not surprisingly, these are the vendors who haven’t published their results (and the reason for this should be obvious). While I agree that POCs are clearly a critical part of an evaluation process, much as test driving is when buying a car, that doesn’t mean you should discard the objective comparison of pertinent metrics. Going back to our car buying analogy, it would be like throwing away the information sheet you find on the car’s passenger window at the dealership. It would seem to me that prospective buyers would want to know about things such as the MPG rating or horsepower and how these compare to other cars they are considering.

To speak more specifically about the TPC-H benchmarks, it is not immediately obvious (unless you have been through the process) how extensive and rigorous they are. First, the 22 queries test a broad spectrum of SQL complexity, spanning everything from simple reporting-type queries to deep analytic-type queries with multi-table joins, correlated sub-queries and the like. Second, the system performance is measured on a single query stream (the Power Run) but also on concurrent queries (the Throughput Run). Third, the load performance (important for data warehousing) is measured. Finally, full ACID compliance is tested. You can check out the full details of the benchmark specifications here.
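
For readers wondering how the two runs feed the headline number: as I read the TPC-H specification, the composite QphH figure is the geometric mean of the Power metric and the Throughput metric. The short Python sketch below shows what that implies. The function name and the input values are hypothetical and are not taken from any published result.

```python
import math


def composite_qphh(power, throughput):
    """Geometric mean of the Power and Throughput metrics."""
    return math.sqrt(power * throughput)


# Hypothetical inputs: both systems average 40,000 arithmetically,
# but the second is strong on one run and weak on the other.
balanced = composite_qphh(40_000, 40_000)
lopsided = composite_qphh(64_000, 16_000)

# The geometric mean rewards balance between the two runs.
assert lopsided < balanced
```

In other words, a system cannot buy a high composite score by excelling on a single query stream while doing poorly under concurrency, or the other way around.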

The benchmark specification also places extensive restrictions on what may be done to prepare the test system. As an example, anything that would circumvent a true test of the system’s performance, such as pre-built aggregates, is disallowed.

The audit is also a rigorous process which is carried out by independent, TPC-sanctioned auditors who certify the benchmark and must sign off on the disclosure report in order for the results to be approved and published.

The point I’m making here is that these are not your homegrown benchmarks too often seen in vendors’ marketing material. The fact that there is an independent body, the Transaction Processing Performance Council, which has been around for 20 years and whose sole purpose is to define and monitor these benchmarks, should be a clear indicator that these benchmarks mean business.

Finally, and to make the point that these are serious, I have to be diligent here in how I talk about our performance numbers as there is a fair use clause when speaking about results that must be carefully adhered to in order to avoid penalties. To that end and to wrap up this post, here are some additional details I must disclose in order to be fully in compliance:

The Kickfire Database Appliance Series 2400 delivers a performance of 54,895 QphH@300GB (Queries per hour on the TPC-H benchmark) on the 300GB TPC-H benchmark. The Kickfire Database Appliance has a price/performance of $0.89/QphH@300GB USD on the 300GB benchmark. Kickfire delivers this performance with a 3 year total system cost of $48,790 USD. The Kickfire Database Appliance is in Beta and will be available October 14, 2008.
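
The price/performance figure follows directly from the other two numbers in the paragraph above, since it is the total system cost divided by the QphH result. A quick check in Python, using only the figures quoted above (the variable names are just for illustration):

```python
total_system_cost_usd = 48_790  # 3 year total system cost quoted above
qphh_at_300gb = 54_895          # QphH@300GB quoted above

# Price/performance is dollars per unit of QphH.
price_performance = total_system_cost_usd / qphh_at_300gb

print(f"${price_performance:.2f}/QphH@300GB")  # prints $0.89/QphH@300GB
```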

 

TPC-H, QphH and $/QphH are trademarks of the TPC. For additional information on the TPC-H benchmark, please visit the Transaction Processing Performance Council’s Web site at http://www.tpc.org/.

 

 


