Enterprise BI vs Departmental BI?

Posted by Juan on 22/12/2009 under Articles, How To | 22 Comments to Read

For a long time one of the hottest discussions in the BI arena has been the concept of Enterprise BI vs Departmental BI, Top-down approach vs Bottom-up, Pragmatism vs Idealism. In this corner we have Spreadmarts, spreading like a virus throughout the organization to provide a quick and dirty fix to the desperate need of end users for timely information out of IT databases…..and in the other corner we have multiyear, multimillion Enterprise Data Warehouse initiatives that focus first on creating infrastructure, BI governance committees, data integration, while end users keep waiting for the very much needed information.

Is there an intermediate approach? Can we implement fast solutions that address today’s business needs while keeping an eye on strategic issues such as scalability and single version of the truth? Our experience is showing that not only this vision is quite possible, but it is already becoming a reality in many of our clients.

In the case of QlikView the key to evolve from a group of departmental applications into an enterprise solution relies on what we call the “QVD layer” or “QlikView datamart”. In its simplest form, this architecture comprises the following:
1. QVD Generator (.qvw): a QlikView document containing the script that reads data from all needed data sources and stores the tables in QVD files.
2. The resulting QVD files.
3. QlikView Application (.qvw): the application itself, which contains a script to load the data from the QVDs (resulting in a QlikView data model) and a User Interface that provides a visual and interactive way of getting access to the information.

1. QVD Generator (qvw)
The QVD Generator.qvw is a QlikView document that contains the script to retrieve data from external data sources, usually through ODBC connections. The source tables may be transformed, filtered or merged when necessary (including incremental loads). The goal is to produce a set of QVD files containing tables that are going to be part of data models inside QlikView applications. The basic steps for each table being read are LOAD and STORE (into a QVD file). At the end of the script execution the QVD Generator.qvw contains no data (everything is already stored in QVDs) and no User Interface (this will be handled by the QlikView applications).

Regarding data sources, the current situation will vary depending on the scale and complexity of the company, ranging from a simple ERP to multiple legacy systems and Enterprise Data Warehouses. As part of this architecture there may be existing processes that cater for issues such as DI (Data Integration), MDM (Master Data Management), data cleansing, query performance tuning, etcetera. Besides these issues, the QVD Generator will consolidate tables when appropriate, reducing the number of resulting tables (and therefore QVDs) in order to achieve a simplified data model in the QlikView environment.

2. QVD Files
A QVD file contains a table in QlikView’s patented, compressed format. Unlike relational database tables, QVDs don’t need DBA maintenance. The initial goal of QVD files has been to improve reload performance (QlikView can load several million records in seconds when reading QVDs in optimized mode), but we can use them in other useful ways.

3. QlikView Application (.qvw)
At this stage, all the necessary tables are available in QVD files, ready to be used by the QlikView applications. The typical QlikView application will contain a script that loads tables from QVDs and potentially other data files and produces a QlikView data model that can be navigated through the User Interface.

An example architecture
This basic infrastructure can be expanded to accommodate the needs of rolling out analytical solutions throughout the organization. Take for example the Sales function, which is a common place to start with QlikView. In the following example Figure 1 shows tables that have been identified as part of the sales data model. Some of them are only relevant to the Sales function, while others may be reused by other applications in following phases.

Figure 1 - Sales QVDs

Figure 1 - Sales QVDs

Figure 1 shows a set of QVD files created with a Sales QVD Generator. The Sales Application will read the data from these QVDs.

Now at a next stage there’s a project to build a Procurement application, as a result of requirements from the Supply Chain area. This application would contain information about purchase orders, purchase invoices and inventory. Figure 2 shows the QVDs identified for this application, you can notice some of them already exist in the list of Sales QVDs (dotted lines).

Procurement QVDs

Figure 2 - Procurement QVDs

This presents the opportunity of creating the “Enterprise” QVD layer. In this phase, we have identified QVDs that can be shared between two or more applications, so we create a QVD Generator that deals with this data in common, apart from the area/department specific QVDs. The result is shown in figure 3.

Enterprise QVDs

Figure 3 - Enterprise QVDs

Some benefits of this approach include:
*Consistency: uniformity of concepts across all applications.
*Reuse: ability to reuse part of the already existing components in the architecture.
*Efficiency: no need to read the same data twice.
*Improved flow: ability to “drill-round” the applications. For example, by using the new feature of document chaining, the selections on fields coming from the shared QVDs would be carried along when moving from one application to another.
*Single version of the truth: each business concept is captured in a unique QVD or set of QVDs that are ready to be used by one or many applications.

Full architecture

Figure 4 shows the final architecture:

Figure 4 - Full architecture

Figure 4 - Full architecture

The full architecture is comprised of the following components:

QVD Generators:
*QVD Generator Sales.qvw (refreshes Sales related QVDs from its original data sources). The end product is the Sales QVDs.
*QVD Generator Procurement.qvw (refreshes Procurement related QVDs from its original data sources). The end product is the Procurement QVDs.
*QVD Generator Enterprise.qvw (refreshes shared QVDs from its original data sources). The end product is the Enterprise QVDs.

QVDs:
*Sales QVDs
*Procurement QVDs
*Enterprise QVDs

Applications:
*Sales Analysis.qvw: contains both the Sales data and the user interface to access it.
*Procurement Analysis.qvw: contains both the Procurement data and the user interface to access it.

Conclusion

Although simple, this is an Enterprise solution: all components (including the data) reside in a centralized, secured server environment that provides a single platform for analysis and reporting all across the organization. The architecture however is completely transparent to the end users, who get access to the applications through a web browser such as Internet Explorer.

There are many alternatives and variations from this basic approach. For example in the solution above we are assuming each application shows data related to a specific department, but in reality there’s an opportunity to provide applications that give information about the surrounding business processes as well. For instance a Sales application may provide information about inventory and inbound transactions to give a picture of outstanding orders and current inventory levels from a sales point of view.

I would like to know if you are implementing similar solutions. But regardless of the chosen approach, the ultimate goal is to achieve the vision of Enterprise BI without slowing down current initiatives that quickly unlock the value of existing databases.

Share and Enjoy:
  • Digg
  • del.icio.us
  • Facebook
  • Google Bookmarks
  • DZone
  • LinkedIn
  • Reddit
  • StumbleUpon
  • Technorati
  • TwitThis
  • Twitter
  • Gilles said,

    Hi Juan,

    Another very good post from Australia. We implemented a similar solution at one of our clients and we’re considering making this part of our standard approach in qlikview projects!

    Cheers and a mery christmas for you and those close to you!

    PS. Snow in The Netherlands, we’ll have a white christmas

  • william said,

    Great post Juan!

    Another big advantage of this approach is the flexibility of having qvd files. With the qvd’s you can develop and reload applications anywhere (of course this also is a security risk).

  • Juan said,

    Thanks Gilles, and merry christmas for you as well, unfortunately we’re not gonna have snow anytime soon here in Sydney so enjoy!

  • Juan said,

    Hey William, thank you. And that’s true, if you have to develop off-site and you don’t have VPN access, qvds are a good option.

  • Enterprise BI vs Departmental BI? | Quick - Qlear - Qool ERP Terms said,

    [...] Excerpt from: Enterprise BI vs Departmental BI? | Quick – Qlear – Qool [...]

  • Ralf Becher said,

    Hi Juan,

    this is a very good approach. I would also mention the side-effect of data archiving with QVD’s. Because of the high compression ratio you can store easily daily snapshots of your enterprise data outside of databases.

  • Juan said,

    Thanks Ralf. That’s true, we can use qvds to solve specific needs of archiving historical snapshots. But that means you will now have a repository in the QlikView environment and we need to consider having a backup routine and a plan for load failures (missing the snapshot on any given day). That’s fine if we keep the solution simple and the data volume growth is taken into account for HW specs(RAM). On a large scale a datawarehousing approach might be a better answer.

  • Brian said,

    Juan, We too are doing something similar although we are in healthcare. We have an Electronic Medical Records (EMR) system as our primary data source. We have constructed a staging area consisting of QVD files. The reload time for QVD’s in a fraction of what we were experiencing with traditional OLAP.

  • Juan said,

    Thanks for your feedback, Brian. It’s good to know these practices are becoming well known.

  • paul yeo said,

    Very good ariticle ,

    it is possible for you to share with us , how you o :-

    1) link the sales order with inventory

    so that we are able to understand how you do it. as in the past i try to do it , my problem is when the data file get big , it become very diff to manage and maintenance. at the end i give up when i get memory error.

    Any hints on how to over come the above.
    Paul

  • John Witherspoon said,

    I should probably copy my post from the QlikCommunity forums, since a question about your blog post lead to it. Might be of use here.

    “Our architecture is very similar to the blog post you linked to. About all we do differently is that we have a lot more QVD generators. There isn’t a single QVD Generator Enterprise.qvw. Instead, we have masterCustomer.qvw, masterOrderItem.qvw, masterManufacturingInstructions.qvw and so on. This gives us a lot more control over scheduling, limits the size of each application, and so on. One of our master QVWs might create multiple QVDs, but the QVDs from a single QVW are all closely-related. The QVWs used in this way all start with “master” and all go in one MasterData directory. The QVDs then go to the \QVD subdirectory of that, so that’s where most of our enterprise-wide data is stored after being cleaned up for QlikView’s use. Mind you, we consider MOST of our data to be enterprise-wide, even if only used in a single user application so far, so most of our data is loaded in this way.

    Another minor difference between us and the blogged approach is that sometimes we leave the data in the master QVW rather than dropping the table after creating the QVD. This can be convenient when trying to debug simple problems – no need to load the QVD in again, just look at the master QVW. But for some of the larger QVDs, we do indeed drop the tables from the QVW at the end.

    I can’t say that this is the right approach for you. I don’t think there IS a right approach. There are just better and worse approaches based on your needs. I think this is a good approach for a medium to large scale enterprises. A small company with only a few tables might find the extra overhead of managing all of these files to be a waste of time. But I’m very happy with how this approach has worked for us, with our QVD “data warehouse” growing over time without a lot of effort required to explicitly design it.

    You may have to be cautious with slowly growing a QVD “data warehouse” instead of designing it up front. In our case, a great deal of the data was coming from a carefully-designed, integrated business system. So most of our source data was already in a sensible layout, even if it wasn’t a layout ideal for reporting. I was personally involved in creating a lot of the core QVDs, and I have a strong background in database design (though not specifically in data warehousing). So while in our case there was little effort required for the QVD “data warehouse” to end up in good shape, it might take significantly more effort when assembling data from a large number of sources and if you have less of a background in database design or data warehousing. There is still a danger of ending up with a complete mess. On the other hand, even if you explicitly design your data warehouse from the beginning, you can still end up with a complete mess.

    One thing our approach does help guarantee is that even if you somehow end up with a mess, it will be a mess that at least supports the users’ current reporting needs, since the data warehouse is built up piece by piece, and only as it supports user reporting.”

  • Gilles said,

    Hi John, thanks for your response and your ideas about Qlikview architecture.

    For completeness the link to the community discussion: http://community.qlikview.com/forums/p/24218/93374.aspx#93374

    Cheers,
    Gilles

  • Juan said,

    Hi Paul, thanks. Inventory apps is indeed an interesting topic to cover on a next post. What type of information are you trying to link? How much inventory history you want to keep? Normally it is enough with a daily balance of the last month or two plus monthly balances for the last 2/3 years. The solution will depend on these and many other factors.

  • Juan said,

    Oh, I see, thanks Gilles! I was looking for the discussion on the forum.

  • Juan said,

    Hi John, thank you for your comment, I like the idea of splitting the loads into several qvws, particularly when there are dependencies. If a script with 30 loads fails in load #12 it may be desirable to be able to resume from that point after solving the load error.
    Now on the forum discussion, it looks like Paul is retrieving information from a report instead of a relational database. When reading relational databases the data is structured (normalized) so you don’t have to reinvent the wheel in QV. I agree with John in terms of the need for data modelling skills, perhaps one of the “bibles” is The Data Warehouse Toolkit by Ralph Kimball. The logical data models presented in that book are universal and can be applied regardless of the BI tool chosen.

  • Organise QVD Data « Blog On Business Intelligent said,

    [...] Organise QVD Data 2010 February 19 by paulyeo11 Read the blog post from From Juan [...]

  • Paul said,

    I already read this article many time and thank for yr sharing . And you are right for my approach I don,t need to use yr approach . Since I am reading my data from text data file and not from SQL

  • Julio said,

    Hi Juan.
    Very interesting post. I agree with you that an “Enterprise DataWarehouse” is nowadays difficult to achieve: high costs, long projects difficult to be financed,…
    I think the more realistic approach is to build departamental BI applications on a solid foundation. For instance, QlikView documents extracting data from a shared data repository. In your approach, you buid this repository with QVD files. Nevertheless, this QVD files can only be real from QlikView documents… What about building this layer in a “traditional” database??
    Regards.

  • Gilles said,

    Hello Julio,
    Thanks for your comment. Building this layer in a “traditional” database is a very commonly seen solution. That’s how we position Qlikview: on top of a solidly build datawarehouse. But building a reporting layer in a traditional database or building a datawarehouse are also costly and long taking projects. My thoughts on this subject are that you can perfectly start of with just a qlikview solution for one department. With the success of qlikview and other departemental solutions joining in that’s when the need for a more structured and solid foundation starts. That could be a solution as Juan describes based on QVD’s, but also a datawarehouse. The latter would be my preferred solution, because you can use it with other tools than qlikview alone.

    Just my two cents, regards,
    Gilles

  • Juan said,

    Hi Julio, thanks for your reply. I think Gilles’ answer is very reasonable (thanks Gilles!) because you would rather have more flexibility. Having said that, whether you need intermediate repositories/data warehouse or not will depend on the complexity of the source data (i.e. disparate systems with the need to consolidate master records/MDM and/or apply lots of complicated business rules). If the data is fairly straightforward you’re better off keeping it simple in my opinion.
    And with the new feature of storing tables to plain txt files (as an alternative to qvd) you can now get data out of the QlikView environment easily.

  • kribo.alim said,

    hi..

    how to update the QVD?
    is it going to add or reload it from the beginning?

  • Juan said,

    Hi Kribo, for each table->QVD load, you will decide whether it is appropriate to run a full load or an incremental load appending to an already existing qvd file containing historical data.
    The main reason to implement incremental loads is to reduce the time it takes to run a reload process. In some environments a 2 hour reload is not a big deal, but in others it is unacceptable. This may be due to tight batch windows, process dependencies and/or a requirement specifying early availability of a QlikView document.
    The typical candidates for incremental loads are big tables (i.e. several million records), particularly when SQL query performance is slow due to network or source DB issues.
    There is an introduction on how to implement incremental loads in the QlikView Reference Manual, but it would be an interesting topic for a new post.

Add A Comment

Get Adobe Flash playerPlugin by wpburn.com wordpress themes