OLAP is an acronym for Online Analytical Processing, an approach to quickly answering analytical queries that are dimensional in nature. It is part of the broader category of business intelligence, which also includes extract, transform, load (ETL), relational reporting and data mining. Typical applications of OLAP include business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas. The term OLAP was created as a slight modification of the traditional database term OLTP (Online Transaction Processing).
Databases configured for OLAP employ a multidimensional data model, allowing complex analytical and ad hoc queries to execute rapidly. They borrow aspects of navigational databases and hierarchical databases, which are speedier than their relational kin. Nigel Pendse has suggested that an alternative and perhaps more descriptive term for the concept of OLAP is Fast Analysis of Shared Multidimensional Information (FASMI).
OLAP takes a snapshot of a set of source data and restructures it into an OLAP cube. Queries can then be run against the cube. It has been claimed that for complex queries OLAP cubes can produce an answer in around 0.1% of the time taken by the same query on OLTP relational data.
The cube is created from a star schema or snowflake schema of tables. At the centre is the fact table, which lists the core facts that make up the query. Numerous dimension tables are linked to the fact table. These tables indicate how the aggregations of relational data can be analyzed. The number of possible aggregations is determined by every possible manner in which the original data can be hierarchically linked.
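As a rough illustration of the star schema described above, the following sketch builds one fact table and two dimension tables in an in-memory SQLite database and aggregates the facts at a chosen level of each dimension. The table names, column names and sample rows are all hypothetical and chosen only for this example; a real OLAP product would store and query the data differently.

```python
import sqlite3

# Hypothetical star schema: one fact table linked to two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, city TEXT, country TEXT);
    CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE fact_sales (
        customer_id INTEGER REFERENCES dim_customer(customer_id),
        product_id  INTEGER REFERENCES dim_product(product_id),
        amount      REAL
    );
    INSERT INTO dim_customer VALUES (1, 'Leeds', 'UK'), (2, 'York', 'UK'), (3, 'Lyon', 'France');
    INSERT INTO dim_product  VALUES (10, 'Tea', 'Drinks'), (11, 'Coffee', 'Drinks'), (12, 'Bread', 'Bakery');
    INSERT INTO fact_sales   VALUES (1, 10, 5.0), (1, 12, 2.0), (2, 11, 4.0), (3, 10, 6.0), (3, 12, 3.0);
""")

CUSTOMER_LEVELS = {"city", "country"}
PRODUCT_LEVELS = {"name", "category"}


def sales_by(customer_level, product_level):
    """Sum the fact table at one hierarchy level of each dimension."""
    # Column names cannot be bound as SQL parameters, so check them first.
    if customer_level not in CUSTOMER_LEVELS or product_level not in PRODUCT_LEVELS:
        raise ValueError("unknown hierarchy level")
    query = f"""
        SELECT c.{customer_level}, p.{product_level}, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.customer_id = f.customer_id
        JOIN dim_product  AS p ON p.product_id  = f.product_id
        GROUP BY c.{customer_level}, p.{product_level}
        ORDER BY c.{customer_level}, p.{product_level}
    """
    return conn.execute(query).fetchall()


by_country_category = sales_by("country", "category")
by_city_product = sales_by("city", "name")
```

Each choice of levels passed to `sales_by` corresponds to one of the aggregations that the dimension tables make possible.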
For example, a set of customers can be grouped by city, by district or by country; so with 50 cities, 8 districts and two countries there are three hierarchical levels with 60 members (50 + 8 + 2). These customers can be considered in relation to products; if there are 250 products with 20 categories, three families and three departments then there are 276 product members (250 + 20 + 3 + 3). With just these two dimensions there are 16,560 (276 * 60) possible aggregations. As the amount of data considered increases, the number of aggregations can quickly total tens of millions or more.
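The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the counting rule used in the text (members per dimension are summed across hierarchy levels, and dimensions are multiplied together); the dictionary and function names are made up for illustration.

```python
from math import prod

# Members at each hierarchy level, taken from the example in the text.
customer_levels = {"city": 50, "district": 8, "country": 2}
product_levels = {"product": 250, "category": 20, "family": 3, "department": 3}


def member_count(levels):
    """Total members of one dimension: the sum over its hierarchy levels."""
    return sum(levels.values())


customer_members = member_count(customer_levels)
product_members = member_count(product_levels)

# Every customer member can be combined with every product member.
possible_aggregations = prod([customer_members, product_members])

print(customer_members, product_members, possible_aggregations)
```

Adding a third dimension multiplies the total again, which is why the count grows so quickly.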
The calculated aggregations and the base data combined make up an OLAP cube, which can potentially contain the answer to every query that can be answered from the data (as in Gray, Bosworth, Layman, and Pirahesh, 1997). Because the number of aggregations to be calculated is potentially large, often only a predetermined number are fully calculated while the remainder are solved on demand.
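A minimal sketch of this idea follows, assuming a tiny two-dimensional data set with no hierarchy levels. Each base row contributes to every combination of its own value and a special `ALL` member per dimension, so the resulting dictionary holds both the base data and the aggregations. The `answer` function then shows partial materialization: it looks up a precomputed cell if one is stored and otherwise falls back to scanning the base rows. The data, the `ALL` marker and all function names are assumptions made for this example, not the method of any particular product.

```python
from collections import defaultdict
from itertools import product

ALL = "ALL"  # placeholder member meaning "aggregated over this dimension"

# Hypothetical base data: (city, item, amount).
facts = [
    ("Leeds", "Tea", 5.0),
    ("Leeds", "Bread", 2.0),
    ("York", "Coffee", 4.0),
    ("Lyon", "Tea", 6.0),
]


def build_cube(rows):
    """Precompute every cell: base cells plus all ALL-combinations."""
    cube = defaultdict(float)
    for city, item, amount in rows:
        # Each row feeds four cells: (city, item), (city, ALL),
        # (ALL, item) and (ALL, ALL).
        for key in product((city, ALL), (item, ALL)):
            cube[key] += amount
    return dict(cube)


def answer(key, materialized, rows):
    """Use a stored cell if present, otherwise aggregate on demand."""
    if key in materialized:
        return materialized[key]
    city_key, item_key = key
    return sum(
        amount
        for city, item, amount in rows
        if city_key in (city, ALL) and item_key in (item, ALL)
    )


cube = build_cube(facts)

# Keep only the per-city totals precomputed; solve everything else on demand.
partial = {key: value for key, value in cube.items() if key[1] == ALL}
tea_total = answer((ALL, "Tea"), partial, facts)
```

Here `cube` is fully materialized, while `partial` trades query time for storage by keeping only some of the cells.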
There are three types of OLAP: multidimensional OLAP (MOLAP), relational OLAP (ROLAP) and hybrid OLAP (HOLAP).
Each type has certain benefits, although there is disagreement about the specifics of the benefits between providers.
Some MOLAP implementations are prone to database explosion. Database explosion is a phenomenon in which MOLAP databases use vast amounts of storage space when certain common conditions are met: a high number of dimensions, pre-calculated results and sparse multidimensional data. The typical mitigation technique for database explosion is to materialize not all the possible aggregations but only the optimal subset of aggregations, based on the desired trade-off between performance and storage.
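To give a feel for the scale involved, the sketch below counts the cells a fully materialized cube would need and compares that with the number of cells that actually hold data. The customer and product member counts come from the earlier example; the third dimension of 365 days and the figure of 100,000 populated cells are hypothetical values chosen only to illustrate sparsity.

```python
from math import prod


def dense_cells(members_per_dimension):
    """Cells needed if every combination of members were stored."""
    return prod(members_per_dimension)


def density(populated_cells, members_per_dimension):
    """Fraction of the dense cube that actually contains data."""
    return populated_cells / dense_cells(members_per_dimension)


two_dims = [60, 276]         # customer and product members from the text
three_dims = [60, 276, 365]  # plus a hypothetical 365-member time dimension

cells_2d = dense_cells(two_dims)
cells_3d = dense_cells(three_dims)

# Hypothetical: only 100,000 of the three-dimensional cells hold data.
fill_ratio = density(100_000, three_dims)

print(cells_2d, cells_3d, round(fill_ratio, 4))
```

Each added dimension multiplies the dense cell count, while the populated fraction tends to shrink, which is the condition under which storing every cell becomes wasteful.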
MOLAP generally delivers better performance due to specialised indexing and storage optimizations. MOLAP also needs less storage space compared to ROLAP because the specialised storage typically includes compression techniques.
ROLAP is generally more scalable. However, large volume pre-processing is difficult to implement efficiently so it is frequently skipped. ROLAP query performance can therefore suffer.
Since ROLAP relies more on the database to perform calculations, it has more limitations in the specialised functions it can use.
HOLAP encompasses a range of solutions that attempt to mix the best of ROLAP and MOLAP. It can generally pre-process quickly, scale well, and offer good function support.
The following acronyms are also used sometimes, although they are not as widespread as the ones above
Unlike relational databases, which had SQL as the standard query language and widespread APIs such as ODBC, JDBC and OLE DB, the OLAP world had no such unification for a long time. The first real standard API was the OLE DB for OLAP specification from Microsoft, which appeared in 1997 and introduced the MDX query language. Several OLAP vendors, both server and client, adopted it. In 2001 Microsoft and Hyperion announced the XML for Analysis specification, which was endorsed by most of the OLAP vendors. Since this also used MDX as a query language, MDX became the de facto standard in the OLAP world.
The first product that performed OLAP queries was IRI's Express, which was released in 1970 (and acquired by Oracle in 1995). However, the term did not appear until 1993, when it was coined by Ted Codd, who has been described as "the father of the relational database". Codd's paper, however, was financed by the former Arbor Software (now Hyperion Solutions) as a sort of marketing coup: the company had released its own OLAP product, Essbase, a year earlier. As a result, Codd's "twelve laws of online analytical processing" were explicit in their reference to Essbase. There was some ensuing controversy, and when Computerworld learned that Codd was paid by Arbor, it retracted the article.
According to the influential OLAP Report site, the market shares for the top commercial OLAP products in 2005 were: