Blog

Promises, Promises

December 17th, 2009
Posted by Bruce Armstrong, CEO

I was contacted by Doug Henschen on Monday for an article he was writing for Intelligent Enterprise on Oracle’s just-announced promises to the European Commission regarding MySQL.  It turns out Doug was polling all the major open source data warehousing and business intelligence vendors to get our perspectives on the promises, and it appears that we all had roughly the same response:  “cautious optimism,” as Doug put it.

Having invented the industry’s first SQL Chip, which improves the performance and scalability of reporting and analytics for MySQL by 10x – 1000x, we have been keenly interested in Oracle’s announced intent to acquire Sun, and along with it MySQL.  While we have architected our product to work with other database management systems in the future, we chose MySQL because of its incredibly rapid adoption in the market (especially for high-growth web businesses), its open storage engine API, and its large and growing open source community and ecosystem of partners.

Now that we have production reference customers like Mamasource, LiveRail, MindSpark, and others running their operational systems on MySQL and their reporting and analytics on MySQL-based Kickfire, we are even more focused on making sure the leading open source database has a bright future.  As such, we were very pleased to see Oracle promise to continue support of MySQL’s Pluggable Storage Engine Architecture and commit to a policy of non-assertion against third parties using the storage engine API.  We believe this is in the best interests of Kickfire and of our customers and partners, so we applaud the announcement.

Going forward, should the deal be consummated, we plan on submitting our name for a seat on the promised MySQL Storage Engine Vendor Advisory Board to ensure that our interests, as well as our customers’ and partners’ interests, continue to be met by Oracle.

Handmark is the Winner!

December 7th, 2009
Posted by Bruce Armstrong, CEO

The votes are in, and we are delighted to award Handmark a Kickfire analytic appliance. Handmark powers more than 50 mobile content storefronts globally with entertainment, information and productivity applications and services.

Currently, their reporting is backlogged, largely due to poor query performance. Their data volumes are growing rapidly, making the problem worse every day. They need to report vital business data such as ad metrics, billing, purchasing, and user statistics. In addition, they need to run ad hoc reports. The Kickfire analytic appliance will eliminate Handmark’s backlog by dramatically improving performance and providing the scale necessary to keep up with their data volumes.

The popularity of Handmark’s mobile content is driving that data volume. Ramesh Narayanan, Handmark Database Analyst, explains, “We employ very large tables such as our call and ad-metrics tables so database performance is not always up to our standards.  We look forward to using Kickfire as part of our ongoing efforts to lead the mobile application industry in service and performance.”

We look forward to helping you achieve that goal, Ramesh!

From everyone on the Kickfire team, we thank all the contestants, the judges and the community for making this contest so successful.

Help Select the Winner!

November 30th, 2009
Posted by Bruce Armstrong, CEO

Our contest to give away one of our blazingly fast Kickfire Analytic Appliances to a deserving organization is heading into the home stretch. Almost 1,000 people have already voiced their opinion, and we encourage everyone interested in data warehousing, business intelligence, MySQL, open source, or any other aspect of this contest to vote for their favorite semi-finalist before 5:00 PM PT Tuesday, December 1st.  We will announce the winner on Wednesday, December 5th.

To enter the contest, we asked organizations to submit their most compelling story of “data warehouse pain.” Our esteemed judges selected the semi-finalists: Curt Monash, founder of Monash Research (www.monash.com) and publisher of DBMS2; Joy Mundy, principal at the Kimball Group (www.kimballgroup.com); Peter Zaitsev, founder and chief executive officer of Percona (www.percona.com); and Robin Schumacher, formerly director of product management at Sun Microsystems™ (www.sun.com).

Next, the semi-finalists prepared and submitted videos of their stories. As you can see from the videos, some of the teams went all out with very creative productions!

I encourage you to vote. Once you’ve voted, ask your friends and colleagues to vote as well in order to build a groundswell for your favorite contestant.

You can also post the reasons behind your choice on our Facebook Fan page or reach us on Twitter @kickfire.

We have worked closely with the MySQL community leaders, getting their guidance and feedback on product direction. This contest is our way of giving back. Thank you for your help and support in selecting our winner.

Enterprise Data Trapped by Enterprise Apps: The New Data Mart Imperative?

September 9th, 2009
Posted by Bruce Armstrong, CEO

And then there were two.  As Oracle and SAP have consolidated enterprise applications from scores of providers down to basically two vendors, have they reduced choice and freedom in the world of analytic applications and business intelligence tools?  Some are saying that the consolidation of enterprise applications and analytic tools and applications has resulted in more customer dissatisfaction, not less.  For many, it seems like license prices are up, implementation costs are up, and data is even less accessible than before.

With the combination of Siebel’s nQuire and Hyperion into Oracle Business Intelligence Enterprise Edition Plus (OBIEE Plus), along with the fusing of PeopleSoft’s Enterprise Performance Management (EPM) – and, soon, GoldenGate’s data integration products – customers now don’t have much choice but to turn to their ERP provider for reporting and analytics.  A similar fate awaits SAP’s customers, with Business Objects’ tools being combined with the Business Warehouse (BW) – now referred to as SAP NetWeaver Business Intelligence (SAP BI).

Has this resulted in higher prices to purchase and implement reporting and analytics for line-of-business owners?  One of our system integration partners, Glenridge Solutions of Atlanta, GA, thinks so.  In fact, they are now targeting customers that are unhappy with the seven-figure license and services price tags quoted by their ERP provider to deploy reporting and analytic solutions.  Glenridge’s offering, based on the Kickfire platform and open source business intelligence tools – along with their own domain knowledge in departmental reporting and analytics – can be deployed for a fraction of that cost.

Along with the downturn in the economy forcing data warehouse managers to consider lower-cost alternatives, is the consolidation of enterprise and analytic applications contributing to the resurgence of more affordable, fast time-to-value data marts?

Kickfire Sponsoring Open Source BI/Data Warehousing Survey

September 9th, 2009
Posted by Karl Van den Bergh, VP Marketing

Mark Madsen, in conjunction with BeyeNetwork, is running the survey. If you have a few minutes and would like a free copy of the final report, plus a $5 Amazon card to boot, you can take the survey here: http://www.zoomerang.com/Survey/?p=WEB229FHNWQXDQ

High-Performance, Affordable, Open Data Marts

August 3rd, 2009
Posted by Bruce Armstrong, CEO

Departmental or subject-specific data warehouses – known as “data marts” in the industry – seem to be gaining in popularity.  Fueled partly by companies wanting to start small with focused projects in today’s economy, and partly by advances in data warehousing technology improving affordability and deployability, data marts seem to be popping up everywhere.

In most cases, data mart projects are driven by the head of a business unit or a functional group (like Sales) needing to analyze their own slice of data in order to run their department more efficiently and effectively.  The data may come directly from an operational system or a combination of source systems resulting in what’s called an “independent data mart”, or it may come directly from a larger, enterprise data warehouse in a hub-and-spoke or “dependent data mart” configuration.

In either case today, according to industry analysts, companies are looking for data mart products that provide compelling price-performance and plug-and-play simplicity based on open architectures.

With our Kickfire Data Mart Appliance, we believe we have done just that.  By dramatically reducing the cost of high-performance data warehousing with our SQL Chip and ultra-modern column-store database, and by packaging our technology in a true appliance, we have been able to achieve the industry’s leading price-performance and very compelling time-to-value.

Furthermore, by leveraging the de facto standard open source database, MySQL, our customers are able to design, develop, and deploy their data marts quickly and flexibly with the tools of their choice.  In this way, we’re able to provide high-performance, affordable, open data marts that allow businesses to respond to a market opportunity or competitive threat quickly and effectively.

Kickfire Basics – The KFDB columnar storage engine

July 28th, 2009
Posted by Justin Swanhart, MySQL Czar

This is the first post in a new series of “Kickfire Basics” blog posts by myself and others here at Kickfire.  This series will review the basics of the Kickfire appliance starting from this post describing how data is stored on disk, to future posts on topics such as loading data into the appliance and writing queries which best leverage the capabilities of the SQL chip.

The Kickfire Equation
Column store + Compression + SQL Chip = performance

The Kickfire Analytic Appliance features the new KFDB storage engine, which was built from scratch to handle queries over vast amounts of data.  KFDB is a column store, in contrast to most MySQL storage engines, which are row stores.  What follows is a description of our column-oriented storage engine and how it improves performance over typical row stores.

This post concerns itself with the first part of the equation, the KFDB column store.

Column stores provide significant IO benefits over row stores

In general, row stores are optimized for the quick storage and retrieval of many columns from a table for a small number of rows. Performance may suffer when a large number of rows must be accessed, particularly if the query needs only a small subset of columns.

In contrast, column stores perform very well when querying over a large number of rows, particularly if only a small number of columns must be accessed. This is because instead of having to access entire rows of data, the column store can quickly retrieve data for only the subset of columns included in the query. The trade-off is that column stores may struggle if asked to return only a small number of rows.
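To make the IO argument concrete, here is a minimal Python sketch. It is not KFDB code: the table, column names, and byte widths are hypothetical, and it ignores compression, pages, and indexes. It only counts how many bytes each layout has to touch to sum a single column.

```python
import struct

# Hypothetical table: order_id int32, quantity int32, price float64, flag int32.
COLUMNS = ["order_id", "quantity", "price", "flag"]
FORMATS = {"order_id": "i", "quantity": "i", "price": "d", "flag": "i"}

# Row layout: one tuple per row, all columns stored together.
rows = [(i, i % 50, i * 1.5, i % 7) for i in range(1000)]

# Column layout: one contiguous list per column.
columns = {name: [row[idx] for row in rows] for idx, name in enumerate(COLUMNS)}


def width(names):
    """Packed byte width of the given columns (no alignment padding)."""
    return sum(struct.calcsize("<" + FORMATS[n]) for n in names)


def row_store_scan(needed):
    """Sum one column; every full row is read to get at it."""
    idx = COLUMNS.index(needed)
    total = sum(row[idx] for row in rows)
    bytes_read = len(rows) * width(COLUMNS)
    return total, bytes_read


def column_store_scan(needed):
    """Sum one column; only that column's values are read."""
    total = sum(columns[needed])
    bytes_read = len(columns[needed]) * width([needed])
    return total, bytes_read


row_total, row_bytes = row_store_scan("quantity")
col_total, col_bytes = column_store_scan("quantity")
print(row_total == col_total, row_bytes, col_bytes)
```

Both scans return the same sum, but in this toy layout the row scan touches 20 bytes per row while the column scan touches 4. The flip side is that fetching one complete row from the column layout means visiting four separate lists, which is why small point lookups favor row stores.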

Kickfire Launches On-Demand Trials

July 15th, 2009
Posted by Karl Van den Bergh, VP Marketing

Join the Sun and Kickfire team tomorrow to see the unveiling of Kickfire’s On-Demand Trial. You can sign up for the live webinar and trial review here: http://tinyurl.com/kickfiretrial.

At Kickfire we’re very excited about this launch. Many customers have asked for a quick way to trial the system and get a sense of its performance. To speed up setup, we are providing users with access to the US Bureau of Transportation’s database, which contains flight data from the last twenty years. The trial consists of four parts:

1) An overview of Kickfire and its technology (includes a short Flash movie).
2) An interactive tutorial covering a couple of sample queries. The tutorial explains the DB schema, the SQL, and the Kickfire features that deliver the performance.
3) A pick list of sample queries and comparison times against MySQL running on commodity hardware. As Kickfire also runs MySQL, the comparison times show the performance boost from the SQL chip and the column store engine.
4) True ad hoc access to the Kickfire machine through the popular open source phpMyAdmin tool.

We also have a Live Chat feature available so you can ask any questions you might have of our technical staff.

All in all, we believe this provides a great first introduction to Kickfire’s capabilities. You can sign up through our self-service calendaring application here: http://ondemand.kickfire.com.

If you do give it a try, we’d love to hear your feedback.

Prepare to be amazed: DBT-3 Query #17 on a one terabyte DBT-3 database.

June 30th, 2009
Posted by Justin Swanhart, MySQL Czar

Kickfire is really different than anything you have seen before

The Kickfire column store and SQL chip combine to achieve database performance never before seen in a small-footprint, power-efficient database appliance, or in any other relational database to date for that matter.

I’d like to demonstrate the performance of the Kickfire model 2400 appliance running query #17 of the DBT-3 benchmark.  Others have blogged about this query recently, so I figured it would be good to look at Kickfire performance on this query.  I decided to present results not at one or ten gigabytes, but instead at one terabyte of data.

Have you ever experienced using MySQL on a very large database?

Let me begin by saying that I don’t think anybody has ever had the patience to allow this query to run to completion on MySQL with one terabyte of data.  At the time of this writing, Kickfire is the only MySQL database capable of running this query at all, let alone quickly.

I’ve personally let the query run for three days on a 300 gigabyte MyISAM database with a large 24GB key cache before finally killing it.  I’m not saying that MySQL isn’t a great database, because really it is.  It is just that MySQL was not designed to run queries which examine large volumes of data.  It is optimized for running queries which examine small amounts of data, even when the database is large.  With Kickfire the exact opposite is true.  Our appliances excel at running queries which examine vast amounts of data.

If any one query can attempt to demonstrate the power of the Kickfire appliance, then this is probably the one.  Remember that I’m going to run this query on a database with seven hundred more gigabytes of data than that test at 300G.

A quick review of our hardware

Before I actually show the query results, I want to quickly review the appliance architecture. The SQL chip is situated in what we call the QPM (query processing module), which is attached to the base Kickfire server (the BSM) with a PCI-X cable.  The QPM contains the SQL chip and RAM, which constitutes a very large CPU cache for the chip. While various CPUs today feature tens of megabytes of direct-attached cache, the SQL chip has direct access to gigabytes of cache.

The SQL chip, which features a dataflow architecture, uses its cache for multiple purposes, from caching portions of columns in memory to storing intermediate results during the execution of queries.

When a query is processed by the appliance it is broken up into a set of interconnected data flow operators which access data stored in QPM memory. This memory is managed very efficiently by the Kickfire OS, which also handles prefetching data from disk for upcoming tasks.  The Kickfire model 2400 appliance comes with 128GB of QPM memory which provides plenty of performance.
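As a loose software analogy for "a set of interconnected dataflow operators", the sketch below chains Python generators so that values stream from one operator to the next. This is only an illustration of the dataflow idea: the operator names and the sample data are hypothetical, and it says nothing about how the SQL chip is actually built.

```python
def scan(column):
    """Source operator: emit values from an in-memory column."""
    for value in column:
        yield value


def filter_op(stream, predicate):
    """Pass along only the values that satisfy the predicate."""
    for value in stream:
        if predicate(value):
            yield value


def sum_op(stream):
    """Sink operator: consume the stream and return the running total."""
    total = 0
    for value in stream:
        total += value
    return total


# Hypothetical column of quantities held in memory.
quantities = [5, 12, 7, 30, 2]

# Wire the operators together: scan -> filter -> sum.
result = sum_op(filter_op(scan(quantities), lambda q: q < 10))
print(result)  # 14
```

Each value flows through the whole chain as soon as it is produced, so no operator waits for a complete intermediate table before starting its work.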

Okay, now that you made it through the hardware review, on to the show:

Kickfire, like MyISAM, stores the count of rows in the table, so COUNT(*) with no predicate is always fast. I used COUNT(*) to verify that indeed I am using a 1TB database with nearly 6 billion rows:

mysql> select count(*) from lineitem;
+------------+
| count(*)   |
+------------+
| 5999989709 |
+------------+
1 row in set (0.01 sec)

First run query results

mysql> select
    ->         sum(l_extendedprice) / 7.0 as avg_yearly
    -> from
    ->         lineitem,
    ->         part
    -> where
    ->         p_partkey = l_partkey
    ->         and p_brand = 'Brand#33'
    ->         and p_container = 'WRAP PACK'
    ->         and l_quantity < (
    ->                 select
    ->                         0.2 * avg(l_quantity)
    ->                 from
    ->                         lineitem
    ->                 where
    ->                         l_partkey = p_partkey
    ->         );
+------------------+
| avg_yearly       |
+------------------+
| 308023084.442857 |
+------------------+
1 row in set (10 min 49.89 sec)

The first run of the query is the most expensive run, and thus the slowest.  The database was cold, so there was IO necessary to process the results of the query.  The first run takes just under 11 minutes, which is very impressive.  Remember that this query was still running after 3 days on regular MySQL and that was with much less data in the database.
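For readers who don’t have the DBT-3 schema in their heads, this small Python sketch spells out what query #17 computes. The two tiny tables are made up for illustration and the function name q17 is my own; it mirrors the logic of the SQL, not the way Kickfire or MySQL executes it.

```python
from collections import defaultdict

# Tiny hypothetical tables; the values are invented for illustration.
part = [
    {"p_partkey": 1, "p_brand": "Brand#33", "p_container": "WRAP PACK"},
    {"p_partkey": 2, "p_brand": "Brand#33", "p_container": "SM BOX"},
]
lineitem = [
    {"l_partkey": 1, "l_quantity": 1, "l_extendedprice": 70.0},
    {"l_partkey": 1, "l_quantity": 10, "l_extendedprice": 700.0},
    {"l_partkey": 1, "l_quantity": 19, "l_extendedprice": 1330.0},
    {"l_partkey": 2, "l_quantity": 1, "l_extendedprice": 50.0},
]


def q17(part, lineitem, brand, container):
    """Sum the price of unusually small orders for matching parts, over 7 years."""
    # The correlated subquery: 0.2 * avg(l_quantity) for each part key.
    quantities = defaultdict(list)
    for li in lineitem:
        quantities[li["l_partkey"]].append(li["l_quantity"])
    threshold = {key: 0.2 * sum(q) / len(q) for key, q in quantities.items()}

    # The part filter: brand and container predicates.
    wanted = {
        p["p_partkey"]
        for p in part
        if p["p_brand"] == brand and p["p_container"] == container
    }

    # The join plus the quantity predicate.
    # Note: SQL returns NULL when nothing matches; this sketch returns 0.0.
    total = sum(
        li["l_extendedprice"]
        for li in lineitem
        if li["l_partkey"] in wanted
        and li["l_quantity"] < threshold[li["l_partkey"]]
    )
    return total / 7.0


print(q17(part, lineitem, "Brand#33", "WRAP PACK"))  # 10.0
```

Part 1 has an average quantity of 10, so only the line item with quantity 1 falls under the 20% threshold. The expensive piece on a real database is the per-part average, which has to be computed over the lineitem table for every qualifying part.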

It gets better

Are you impressed yet?  If so, great; if not, keep reading, because Kickfire can do better.  Don’t get me wrong, 11 minutes is a very impressive time in which to complete this query, particularly with just 3U of hardware. However, remember that I said the first run would be the slowest. In fact, around 99% of the time spent answering the query was spent on IO.  Believe it or not, the SQL chip sat idle for most of the query. With most of the data prefetched in compressed form into the QPM, the SQL chip can blaze through tens of millions of rows per second to produce results.

Most of the information necessary to answer the query is already in the QPM at this time.  At this point the IO bottleneck has been, on the whole, removed for this query. We can say in a real sense that the data is in memory for the query.

Everyone knows that a query will be fast when the data is in memory, and they are right, but “in memory” databases must still live with a very big instruction-flow bottleneck, also known as the von Neumann bottleneck or the traditional CPU bottleneck.  This bottleneck is not often discussed, because it is nearly ubiquitous in computing.  Almost all CPUs use instruction flow, so without different hardware it is impossible to avoid.  The SQL chip is the first specialized dataflow hardware for processing SQL instructions.

To demonstrate results when both bottlenecks are removed, I’ve changed the query slightly from examining brand #33 to #32.  The database must process roughly the same amount of data for both queries.  The database is synthetically generated and its properties are well known.  I made the change to demonstrate that the query results are not cached, only the data.

This new query completes in 5.55 seconds.  Kindly pick your jaw up off the floor.  When the von Neumann bottleneck is removed, it is possible to process gigabytes of data per second with the SQL chip.

mysql> select
    ...
    ->         and p_brand = 'Brand#32'
    ->         and p_container = 'WRAP PACK'
    ...
    ->         );
+------------------+
| avg_yearly       |
+------------------+
| 308023084.442857 |
+------------------+
1 row in set (5.55 sec)
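For a rough sense of scale, the two timings reported above can be compared directly. This is back-of-the-envelope arithmetic only: the two runs filter on different brands, so it assumes, as stated earlier, that both touch roughly the same amount of data.

```python
def to_seconds(minutes, seconds):
    """Convert a 'min sec' timing as printed by the mysql client to seconds."""
    return minutes * 60 + seconds


cold = to_seconds(10, 49.89)  # first run, data read from disk
warm = 5.55                   # later run, data already in QPM memory
speedup = cold / warm

print(f"{cold:.2f} s cold, {warm:.2f} s warm, about {speedup:.0f}x faster")
```

On these numbers the warm run is about 117 times faster than the cold one, which lines up with the point that nearly all of the first run was spent waiting on IO.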

We can even complicate the query, giving it a disjunction, proving that the query result isn’t simply cached or materialized or in some other way “faked” by the database engine. The SQL chip is really processing gigabytes per second, this time coming up with the answer in just over 7 seconds.

mysql> select
    ...
    ->         and p_brand = 'Brand#32'
    ->         and (p_container = 'WRAP PACK' or p_container = 'SM BOX')
    ...
    ->         );
+------------------+
| avg_yearly       |
+------------------+
| 614714472.377143 |
+------------------+
1 row in set (7.07 sec)

How much faster will Kickfire be for you?

I hope you are impressed.  I’ve demonstrated that the SQL chip uses a dataflow architecture, which eliminates the traditional CPU register (instruction flow) bottleneck, and that it works in combination with an advanced column store and memory management system that reduces or eliminates IO bottlenecks.

Of course, not every query is going to perform as well as this one, and it would be dishonest of me to claim otherwise.  As with any benchmark, your mileage may vary.  This query in particular runs completely in our SQL hardware, but this isn’t possible for all queries. Speed improvements of 10x to 100x over a regular MySQL server are typically possible, with most queries achieving results somewhere between those two extremes.  Then there are queries like these, which aren’t even possible with MySQL. I can’t really say how many times faster we are on this query because I don’t know how long it would take to eventually complete on a regular MySQL server.  I will leave that judgement up to you.

Free Kimball Group Data Warehousing Educational Webinar

June 23rd, 2009
Posted by Bruce Armstrong, CEO

We’re sponsoring an important webinar series along with Sun/MySQL starting this week on June 25th: the Kimball Group Data Warehousing Educational Webinar Series.  This webinar series will introduce the audience to data warehousing concepts and best practices, and will cover the history and evolution of data warehousing, provide an overview of dimensional modeling, and review the full life cycle of designing and implementing a data warehouse.  Part 1, on June 25th at 1:00 PM PDT, is on Data Warehousing Fundamentals.

There are two key reasons why we think this webinar series is important:

  • First, we believe this webinar further advances data warehousing in the MySQL world. There is a whole new generation of database developers in the MySQL community that are at various stages of understanding data warehousing – what it is, why they would consider it, and how best to get started. One of the key benefits of this webinar series will be to explain why someone would build a data warehouse at all and then how to go about it with a specific emphasis on leveraging MySQL and open source.
  • Second, this webinar will further evangelize MySQL and open source in the data warehousing world. The more traditional data warehousing market is early in its understanding of MySQL and open source. This webinar will help these practitioners better understand the benefits and practical implications of MySQL and open source data warehousing and business intelligence tools.

In this way, we’re doing our best to cross-pollinate the two environments – MySQL and Data Warehousing – so that each can learn from and benefit the other.

According to Gartner Group’s annual survey of 1,500 CIOs, data warehousing and business intelligence is once again the #1 spending priority – for the fourth year in a row.  While data warehousing and business intelligence have been around for decades now, these concepts are relatively new to the MySQL and open source communities.  As more and more businesses – especially web-based businesses that built their infrastructure on the LAMP stack – begin to analyze the information they have been generating and collecting, it’s important to consider the best practices developed over the years around reporting and analytics.

This webinar series will provide very practical methodology, implementation advice, and best practices from one of the most well-known consulting groups in the data warehouse business.  The Kimball Group provides education, training, and consulting on data warehousing and business intelligence.  Founded in 2003 by Ralph Kimball, who is considered to be one of the most well-known and well-regarded figures in the industry, the Kimball Group specializes in dimensional modeling techniques including the “star schema” concept, which is designed to optimize data warehousing usability and performance.

Upcoming webinars in this educational series will include Data Warehouse Data Modeling on September 10th and Data Warehouse Lifecycle Management on October 6th.  Both Kickfire and Sun/MySQL look forward to your participation and feedback on this important educational webinar series.


