Ten Steps to Deploy IBM Cognos 10.2.1

Last weekend, my team successfully migrated Georgetown’s IBM Cognos software from version 10.1 to 10.2.1.  Seems like a piece of cake, right?  Well, in our case, not exactly.  I thought I’d share a bit about our experience in case it helps others.


Our IBM Cognos upgrade effort followed a previously failed attempt to upgrade.  After much deliberation with IBM, we concluded that the failed attempt stemmed primarily from extremely old hardware and a dated Solaris OS.  Still being fairly new to Georgetown, I dug in a bit further.  Indeed – our hardware was ancient, and the Solaris OS was preventing us from getting much-needed support from the vendor (IBM).  We had previously implemented IBM Cognos in 2005 on version 8.4.  Then, in 2010, we migrated to version 10.1 on the same hardware.  Given these facts, I was dealt a server that was 8-9 years old and software that hadn’t been upgraded in at least 3-4 years.

Through several conversations with IBM SMEs, we settled on a proposed six-server architecture (depicted below).  I’ve removed the details for security reasons, but we designed these machines down to the processor and RAM level.  We also had conversations with IBM about which OS would best suit us longer term, and we landed on Windows Server 2012.

Cognos hardware sizing and virtual infrastructure diagrams

For anyone interested, below is the series of steps that we followed to get this project off the ground:

  1. Proposed business case and secured funding
  2. Assessed the current-state architecture internally and made an educated decision about what we needed to support our future-state business requirements.  We were specific – down to the processor and memory level for each machine in the architecture.  We lessened the hardware requirements for the DEV and TEST tiers; the QA and PROD tiers were identical.
  3. Validated the architecture with the vendor (IBM) and ensured that they supported and recommended the approach.
  4. Altered the architecture based on the vendor’s recommendations.
  5. Based upon the agreed architecture, engaged the IBM sales representative to identify the correct licensing.  This step will make your head spin a bit.  IBM calculates license costs on a Processor Value Unit (PVU) basis.  Effectively, it is a proprietary formula used to calculate how much processing power resides in each of your machines.  It is calculated per processor and accounts for the number of cores (a short worked illustration follows this list).
  6. Negotiated and completed the procurement process.  Thankfully, IBM has some decent higher education discounts.  For future operating budgets, please be aware that the licensing does not stay flat.  You’ll typically see a 4-5% increase per year.  Also, for renewals, you might consider working through a business partner (such as CDW-G) to save money on the renewal.
  7. Set up the infrastructure.  We chose to do this in a virtual environment.  We also set up the architecture in DEV, TEST, QA, and PROD.
  8. Configured the IBM Cognos software (in this case, 10.2.1).  This is more intensive across the distributed architecture, but well worth the performance and scalability benefit.
  9. Tested, tested, and tested.  We started at the DEV tier and slowly promoted to TEST, QA, and then PROD.  If you have an existing environment already in production, you may consider running the two production environments in parallel for a short period of time.  We did this for about a week and then recopied the content store from the live environment to the new production environment.  It provided an additional level of comfort for testing.
  10. Go-live and enjoy the new features of IBM Cognos 10.2.1.  Please note – we decided to go live with 10.2.1 on our existing 32-bit packages.  As a next phase, we are migrating all of the 32-bit packages to 64-bit.  You might consider doing this during the testing phase and deploying everything at once.
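
To make the PVU licensing math in step 5 concrete, here is a minimal sketch.  The PVU-per-core ratings and per-PVU price below are hypothetical placeholders – the real numbers come from IBM’s PVU tables and your own agreement – but the arithmetic (cores × PVU rating per core, summed across machines) is the idea.

```python
# Hypothetical illustration of PVU-based license math. The PVU-per-core
# ratings and per-PVU price are made-up placeholders, not IBM's actual
# rates (those come from IBM's PVU tables and your license agreement).

servers = [
    # (name, cores, assumed PVUs per core for the chip type)
    ("gateway", 4, 70),
    ("report-server-1", 8, 70),
    ("report-server-2", 8, 70),
    ("content-manager", 4, 70),
]

price_per_pvu = 50.00  # placeholder figure

total_pvus = sum(cores * pvus_per_core for _, cores, pvus_per_core in servers)
estimated_cost = total_pvus * price_per_pvu

print(f"Total PVUs: {total_pvus}")
print(f"Estimated license cost: ${estimated_cost:,.2f}")

# A 4-5% annual uplift (see step 6) compounds quickly over a renewal cycle.
for year in range(1, 6):
    print(f"Year {year} renewal estimate: ${estimated_cost * 1.045 ** year:,.2f}")
```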

Cognos PVU contract

What tips do we recommend?

  1. Ensure your SSL certificate is installed across all of the machines in your architecture.  If you only install it on the gateway server(s), the images on some of your reports will be broken.  They attempt to load via port 80 (HTTP) instead of port 443 (HTTPS) and are blocked.  (A quick certificate check is sketched after this list.)
  2. The governor limit on packages gets reset during the upgrade.  We had to go in and modify each package to set this limit again.
  3. The portal may seem slower initially. Allow the content store several business days to optimize and reindex.  You’ll then see an improvement.
  4. Don’t forget to import the new visualizations and install the mobile capability. Very cool stuff!
  5. Collaborate with IBM. They may offer to provide an overview of the new features to your team.  If you have budget, they may also have optimization recommendations.
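
Related to tip 1, below is a minimal sketch for sanity-checking that every machine in the architecture presents a certificate on port 443.  The hostnames are hypothetical placeholders – substitute your own gateway, dispatcher, and Content Manager servers – and it uses only the Python standard library.

```python
# Quick check that each box in the Cognos architecture presents a valid
# certificate on 443, so report images are not forced back to HTTP/port 80.
# Hostnames below are hypothetical placeholders.
import socket
import ssl

hosts = [
    "cognos-gw01.example.edu",
    "cognos-app01.example.edu",
    "cognos-app02.example.edu",
    "cognos-cm01.example.edu",
]

context = ssl.create_default_context()

for host in hosts:
    try:
        with socket.create_connection((host, 443), timeout=5) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
                print(f"{host}: OK, certificate expires {cert['notAfter']}")
    except OSError as exc:
        print(f"{host}: FAILED ({exc}) -- report images may fall back to HTTP")
```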

So what are our favorite features thus far?

  1. External object store for report output – helps tremendously with the size of our content store.
  2. New visualizations – very cool!
  3. We also enjoy the Cognos Mobile app which allows us to share content on mobile devices and push alerts.

Cognos Workspace

Cognos Mobile

Here’s the full list of new features from IBM:
http://pic.dhe.ibm.com/infocenter/cbi/v10r2m1/index.jsp?topic=%2Fcom.ibm.swg.ba.cognos.ug_cra.10.2.1.doc%2Fc_asg_new_10_2_1.html

Higher Education Data Warehouse Conference (HEDW) @ Cornell University

I just returned from an excellent conference  (HEDW) which was administered by the IT staff at Cornell University. Kudos to the 2013 Conference Chair, Jeff Christen, and the staff of Cornell University for hosting this year! Below is a little bit more information about the conference:

The Higher Education Data Warehousing Forum (HEDW) is a network of higher education colleagues dedicated to promoting the sharing of knowledge and best practices regarding knowledge management in colleges and universities, including building data warehouses, developing institutional reporting strategies, and providing decision support.

The Forum meets once a year and sponsors a series of listservs to promote communication among technical developers and administrators of data access and reporting systems, data custodians, institutional researchers, and consumers of data representing a variety of internal university audiences.

There are more than 2000 active members in HEDW, representing professionals from 700+ institutions found in 38 different countries and 48 different states.

This conference has proven to be helpful to Georgetown University over the last 5 years.  It is a great opportunity to network with peers and share best practices around the latest technology tools.  And, sorry vendors, you are kept at bay.  This is important, as the focus of the conference is less on technology sales – and more about relationships and sharing successes.

Cornell University Outside of Statler Conference Center

Personally, this was my first year in attendance.  I gained a lot of industry insight, but it was also helpful to find peer organizations that are using the same technology tools.  We are about to embark upon an Oracle PeopleSoft finance to Workday conversion, so it was helpful to connect with others that are going through similar projects.  And for me specifically, it was valuable to learn how folks are starting to extract data from Workday for business intelligence purposes.

Higher Education Data Warehouse Conference

2013 HEDW Attendee List

My key take-aways from the conference were:

  • Business intelligence is happening with MANY tools.  We saw A LOT of technology.  Industry leaders in the higher education space still seem to be Oracle and Microsoft.  Oracle seemed to be embedded in more universities; however, many are starting projects on the Microsoft stack – particularly with the Blackboard Analytics team standardizing on the Microsoft platform.  IBM Cognos still seemed to be the market leader in terms of operational reporting; however, Microsoft’s SSRS is gaining momentum.  From an OLAP and dashboard perspective, it seemed like a mixed bag.  Some were using IBM BI Dashboards, while others were using tools such as OBIEE Dashboards, Microsoft SharePoint’s Dashboard Designer, and an emerging product – Pyramid Analytics.  Microsoft’s PowerPivot was also demonstrated frequently, and users like it!  Power View was mentioned, but no one seemed to have it up and running…yet.  Tableau was also a very popular choice and highly recommended.  Several people mentioned how responsive both Microsoft and Tableau had been to their needs pre-sale.
  • Business intelligence requires a SIGNIFICANT amount of governance to be successful.  We saw presentation after presentation about the governance structures that should have been set up, or projects that had to be restarted in order to be governed in the appropriate way.  This includes changing business processes and ensuring that common data definitions are put in place across university silos.  A stove-piped approach does not work when you are trying to analyze data cross-functionally.
  • Standardizing on one tool is difficult.  We spoke to many universities that had multiple tools in play.  This is due to the difficulty of change management and training.  It is worth making the investment for change management in order to standardize on the appropriate tool set.
  • Technology is expensive.  There is no one size fits all.  Depending on the licensing agreements that are in place at your university – there may be a clear technology choice.  Oracle is expensive, but it may already be in use to support critical ERP systems.  We also heard many universities discuss their use of Microsoft due to educational and statewide discounts available.
  • Predictive analytics are still future state.  We had brief discussions about statistical tools like SAS and IBM’s SPSS; however, these tools were not the focus of many discussions.  It seems that most universities are trying to figure out simple ODS and EDW projects.  Predictive analytics and sophisticated statistical tools are in use – but seem to be taking a back seat while IT departments get the more fundamental data models in place.  Most had a great deal of interest in these types of predictive analytics, but felt, “we just aren’t there yet.”  GIS data also came up in a small number of presentations and is generating interest.  In fact, one presentation displayed a dashboard with student enrollment by county.  People like to see data overlaid on a map.  I can see more universities taking advantage of this soon.
  • Business intelligence technologists are in high demand and hard to find.  It was apparent throughout the conference that many universities are challenged to find the right technology talent.  Many are in need of employees that possess business intelligence and reporting skills.
  • Hadoop remains on the shelf.  John Rome from Arizona State gave an excellent presentation about Hadoop and its functional use.  He clarified how Hadoop got its name: its creator, Doug Cutting, named the project after his son’s stuffed yellow elephant!  John also presented a few experiments that ASU has been doing to evaluate the value that Hadoop may be able to bring to the university.  In ASU’s experiments, they used Amazon’s EC2 service to quickly spin up supporting servers and configure the services necessary to support Hadoop.  This presentation was entertaining, but it was almost the only mention of Hadoop during the entire conference.  Hadoop may have more use in research functions, but it does not seem widely adopted in key university business intelligence efforts as of yet.  I wonder if this will change by next year?

Doug Cutting with Son’s Stuffed Elephant


A Compilation of My Favorite DW Resources

Recently, I received an email as part of a listserv from a colleague at HEDW.org.  HEDW, or Higher Education Data Warehousing Forum, is a network of higher education colleagues dedicated to promoting the sharing of knowledge and best practices regarding knowledge management in colleges and universities, including building data warehouses, developing institutional reporting strategies, and providing decision support.

In the email that I referenced above, my colleague sent a link to an IBM Redbooks publication titled, “Dimensional Modeling: In a Business Intelligence Environment.”  This is a good read for someone who wants the basics of data warehousing.  It also may be a good refresher for others.  Here’s a short description of the book:

In this IBM Redbooks publication we describe and demonstrate dimensional data modeling techniques and technology, specifically focused on business intelligence and data warehousing. It is to help the reader understand how to design, maintain, and use a dimensional model for data warehousing that can provide the data access and performance required for business intelligence.

Business intelligence is comprised of a data warehousing infrastructure, and a query, analysis, and reporting environment. Here we focus on the data warehousing infrastructure. But only a specific element of it, the data model – which we consider the base building block of the data warehouse. Or, more precisely, the topic of data modeling and its impact on the business and business applications. The objective is not to provide a treatise on dimensional modeling techniques, but to focus at a more practical level.

There is technical content for designing and maintaining such an environment, but also business content.
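
To make the idea of a dimensional model slightly more tangible, here is a toy star schema: one fact table joined to two dimension tables.  The tables, columns, and numbers are hypothetical, and pandas stands in for a real database – the point is only the shape of the model and the slice-and-aggregate query pattern it supports.

```python
# A toy star schema in pandas: one fact table (enrollments) joined to two
# dimensions (term, department). All names and values are hypothetical.
import pandas as pd

dim_term = pd.DataFrame({
    "term_key": [1, 2],
    "term_name": ["Fall 2012", "Spring 2013"],
    "academic_year": ["2012-13", "2012-13"],
})

dim_department = pd.DataFrame({
    "dept_key": [10, 20],
    "dept_name": ["Biology", "History"],
    "school": ["Arts & Sciences", "Arts & Sciences"],
})

fact_enrollment = pd.DataFrame({
    "term_key": [1, 1, 2, 2],
    "dept_key": [10, 20, 10, 20],
    "enrolled_students": [120, 85, 110, 90],
})

# A typical dimensional query: join facts to dimensions, then slice and sum
# by dimension attributes.
report = (
    fact_enrollment
    .merge(dim_term, on="term_key")
    .merge(dim_department, on="dept_key")
    .groupby(["academic_year", "dept_name"], as_index=False)["enrolled_students"]
    .sum()
)
print(report)
```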

Dimensional Modeling: In a Business Intelligence Environment

Reading through a few responses on the listserv compelled me to put together a list of some of my favorite BI books.  I’ll publish a part II to this post in the future, but here is an initial list that I would recommend to any BI professional.  It is also worth signing up for the Kimball Group’s Design Tips.  They are tremendously useful.


10 Steps to Data Quality Delight!

Data quality is always an aspect of business intelligence (BI) projects that seems to be deprioritized.  It is easy to focus on the beautiful visualizations and drill-through reports that are the key selling features of a BI project.  This article, however, is about the value of cleansing your data so that those tools will work seamlessly with the data model that you establish.  Everyone knows the IT saying, “Garbage in, garbage out.”  That holds entirely true for BI projects.  If the incoming data is dirty, it is going to be very difficult to efficiently process the data and make it available to a reporting platform.  This isn’t an easy problem to solve, either.  When working across multiple functional areas, you may also have different sets of users entering data into the system in DIFFERENT ways.  So, in that instance, you may not have a data quality issue, but a business process issue.

As I have worked through my BI projects, here are 10 steps that I have followed to work with teams to create a data-centric culture and to improve data integrity.  I hope that these are of use to you…and please feel free to share any additional best practices in the comments of this blog!  We can all learn from one another.

Data Quality Workflow

  • Step #1:  Build data profiling and inspection into the design of your project
    Don’t wait until you are about to go live to start looking at the quality of your data.  From the very beginning of your project, you should start to profile the data that you are loading into your BI platform.  Depending on your technology stack, there are multiple tools that will aid you in data profiling and inspection.  You might consider tools such as Informatica Analyst or Microsoft SSIS Data Profiler.  A quick Google search will provide many alternatives, such as Talend.  Regardless of the tool, make sure that you incorporate this activity into your BI project as soon as possible.  You’ll want to do a fair amount of inspection on each system that you intend to load into your BI platform.  (A minimal profiling sketch appears after this list of steps.)

    Informatica Analyst

    Microsoft SSIS Data Profiler

    Talend Data Quality Tool

  • Step #2:  Don’t be afraid to discuss these data quality issues at an executive level (PRIOR TO THE EXECUTION PHASE OF THE PROJECT)
    Awareness is always a key factor for your executive team.  Executives and executive sponsors need to know about the data quality issues as soon as possible.  Why?  You will need their support not only to address the data quality issues themselves, but also because these issues sometimes stem from poor business processes and training.  Their support will be critical in addressing either problem.

  • Step #3:  Assign ownership and establish accountability
    Assign ownership for data as soon as possible.  This will help you not only resolve the known data quality issues; these key data stewards may also be able to identify additional issues, since they are likely more familiar with their data than you are.  In most cases, they have inherited this bad data too, and will likely want to partner with you to fix it.  However, you must consider that it will also place a burden on them from a bandwidth perspective.  Unless they are dedicated to your project, they will also have a day job.  Keep this in mind during your planning and see if you can augment and support these data cleansing efforts with your team.
  • Step #4:  Define rules for the data
    One of the biggest challenges that I continue to see is when data stewards do not want to cleanse their data and instead want the ETL scripts to handle the 1,001 permutations of how the data should be interpreted.  While the ETLs can handle some of this logic, the business owners need to ensure that the data is being entered into the transactional system via a single set of business processes and that it is being done consistently and completely.  Usually, the transactional systems can have business rules defined and field requirements put in place that help to enforce these processes.  In some cases, the transactional systems are sophisticated enough to handle workflow too.  Use these features to your advantage and do not over-engineer the ETL processes.  Not only will over-engineering be time-consuming to develop initially, it will be a management nightmare moving forward.
  • Step #5:  Modify business process as needed
    If you are working cross-functionally, you may run into the need to revise business processes to support consistent data entry into the transactional systems.  Recently, I was working on a project across 6 HR departments.  The net result of their hiring processes was the same, but they had 6 different processes and, unfortunately, they were all using the same transactional system.  We had to get their executives together and do some business process alignment work before we could proceed.  Once the business process is unified, you then have to consider the historical data.  Does it need to be cleansed or transformed?  In our case it did.  Don’t underestimate this effort!
  • Step #6:  Prioritize and make trade-offs.  Data will rarely be perfect.
    Once you have revised business processes and defined data cleansing activities, you will need to prioritize them.  Rarely are you in a position where data is perfect or resources are unlimited.  If you have done your design work correctly, you will have a catalog of the most critical reports and key pieces of data.  Focus on these areas first and then expand.  Don’t try to boil the ocean.  Keep your data cleansing activities as condensed as possible and make an honest effort to support the business units as much as you can.  In my experience, the BI developers can generally augment the full-time staff to get data cleansing and data corrections done more efficiently.  However, make sure that the business unit maintains responsibility and accountability.  You don’t want the data to become IT’s problem.  It is a shared problem and one that you will have to work very hard to maintain moving forward.
  • Step #7:  Test and make qualitative data updates
    As you prioritize and move through your data cleansing checklist, ensure that you have prioritized the efforts that will reap the largest reward.  You might be able to prioritize a few smaller wins at first to show the value of the cleansing activities.  You should then align your efforts with the primary requirements of your project.  You may be able to defer some of the data cleansing to later stages of the project, or handle it in a more gradual way.
  • Step #8:  Setup alerts and notifications for future discrepancies
    After data cleansing has occurred and you feel that you have the data in a good state, your job is not over!  Data quality is an ongoing activity.  You will almost always run into future data quality issues, and governance needs to be set up in order to address them.  Exception reports should be set up and made available “on-demand” to support data cleansing.  Also, one of my favorite tools is data-driven subscriptions, or report bursts.  Microsoft uses the “data-driven subscription” terminology; IBM Cognos uses the term “report burst.”  Once you have defined the types of data integrity issues that are likely to occur (missing data, incomplete data, inaccurate data, etc.), you can set up data-driven subscriptions, or report bursts, that will prompt the data stewards when these issues occur.  Of course, at the end of the day, you still have the issue of accountability.  We’ll take a look at that in the next step.  Depending on the tool that you are using, you may have the capability of sending the user an exception report with the data issue(s) listed.  In other systems, you may simply alert the user of a particular data issue and they must then take action.  These subscriptions should augment the exception reports that are available “on-demand” in your reporting portal.  (A small sketch of this exception-and-escalation flow appears after this list of steps.)

    Microsoft SSRS Data-Driven Subscription

    IBM Cognos Report Burst

  • Step #9:  Consider a workflow to keep data stewards accountable
    So, what now?  The user now has an inbox full of exception reports, or a portal inbox full of alerts, and they still haven’t run the manual, on-demand exception report.  Data integrity issues are causing reporting problems as the data is starting to slip in its quality.  You have a few options here.  In previous projects, I have set up a bit of workflow around the data-driven subscriptions.  The first port of call is the data steward.  They are alerted of an issue with the data, and a standard SLA is set to allow them an adequate amount of time to address the issue.  After that SLA period expires, the data issue is then escalated to their line manager.  This can also be set up as a data-driven subscription.  If both steps fail (i.e. both the data steward and the line manager ignore the data issue), then it is time to re-engage with your executive committee.  Make the data issues visible and help the executives understand the impact of the data being inaccurate.  Paint a picture for the executives about why data is important.  To further illustrate your point, if you have an executive dashboard that is using this data, it may be worthwhile to point out how the data integrity issue may impact that dashboard.  Not many executives want to be in a position where they are making decisions on inaccurate data.
  • Step #10:  Wash, rinse, and repeat
    By the time that you have gotten to this point, it will likely be time to fold another transactional system into your BI platform.  Remember this process and use it again!
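
As promised in Step #1, here is a minimal data-profiling sketch.  The file name and column names are hypothetical, and a real project would lean on a dedicated profiler (Informatica Analyst, SSIS Data Profiler, Talend) – but even a quick pass like this, run early, surfaces the obvious problems.

```python
# Minimal data-profiling pass over a source extract (Step #1).
# The file and columns are hypothetical placeholders.
import pandas as pd

df = pd.read_csv("hr_employees_extract.csv", parse_dates=["hire_date"])

profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "null_count": df.isna().sum(),
    "null_pct": (df.isna().mean() * 100).round(1),
    "distinct_values": df.nunique(),
})
print(profile.sort_values("null_pct", ascending=False))

# Spot-check a few fields that feed key reports.
print(df["hire_date"].min(), df["hire_date"].max())       # out-of-range dates?
print(df["department_code"].value_counts(dropna=False))   # unexpected codes?
```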
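
And here is a sketch of the exception-and-escalation idea from Steps #8 and #9: rows that fail a data rule are routed to the data steward, and anything older than the SLA is escalated to the line manager.  The names, addresses, dates, and SLA are hypothetical, and in practice the notification would be a data-driven subscription or report burst rather than a hand-rolled script.

```python
# Exception detection with SLA-based escalation (Steps #8 and #9).
# All names, addresses, dates, and the SLA are hypothetical placeholders.
from datetime import date, timedelta
import pandas as pd

SLA = timedelta(days=5)
STEWARD, MANAGER = "steward@example.edu", "manager@example.edu"

records = pd.DataFrame({
    "employee_id": [101, 102, 103],
    "department_code": ["BIO", None, None],       # None = data rule violation
    "first_flagged": [date(2013, 5, 1), date(2013, 5, 18), date(2013, 5, 26)],
})

today = date(2013, 5, 28)
exceptions = records[records["department_code"].isna()]

for _, row in exceptions.iterrows():
    # Escalate to the line manager once the steward's SLA has expired.
    recipient = MANAGER if today - row["first_flagged"] > SLA else STEWARD
    print(f"Notify {recipient}: employee {row['employee_id']} is missing department_code")
```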

Gartner releases 2013 Business Intelligence & Analytics Magic Quadrant

Last month, Gartner released the 2013 version of their Business Intelligence & Analytics Platform Magic Quadrant.  I always look forward to the release of Gartner’s magic quadrants as they are tremendously helpful in understanding the landscape of specific technology tools.

Gartner Magic Quadrant for Business Intelligence & Analytics - Comparison of 2012 to 2013

This year, I was pleased to observe the following:

  • Microsoft has improved its overall ability to execute.  Overall, it seems that Microsoft is moving in the right direction with their SQL Server 2012 product.  I’m excited about the enhancements to Master Data Services, and I like where they are headed with PowerPivot and Power View.  A full list of new features can be found at Microsoft’s website.  I’m a big Microsoft fan and I’m excited about Office 2013 and the impact that it will have on BI.
  • IBM has maintained, and slightly increased, its market position.  IBM continues to expand upon the features of their key acquisitions (Cognos, SPSS).  They have done a nice job of migrating customers from the old Cognos 8 platform to IBM Cognos 10.x, which has increased customer satisfaction.  I also really like their Analytic Answers offering.  In my opinion, BI will continue to become more service oriented – so big applause for IBM’s analytics-as-a-service offering.
  • Tableau has moved into the top-right (Leaders) quadrant.  Tableau deserves to be here and I’m excited to see this movement.  Tableau’s customer support and product quality have been consistently high.  They have also set a benchmark in terms of how straightforward it is to move to their platform and upgrade to the latest version release.
  • There is plenty of competition at the bottom of the market.  Niche players like Jaspersoft and Pentaho are at each other’s heels.  Competition is healthy!

The only thing that surprised me is that I didn’t see Pyramid Analytics on this list.  Microsoft acquired ProClarity back in 2006, and extended support for ProClarity ends in 2017.  Given that Microsoft has not migrated all of the ProClarity features to PerformancePoint, I am speaking to many users who are jumping ship on the ProClarity front and moving toward Pyramid Analytics.  Pyramid Analytics has done a nice job of aiding customers moving off ProClarity.  Keep an eye out.  We might see them on the list next year.

If you are interested in reading the full 2013 report, you may preview the online version here.