
Wednesday, 25 November 2015

Taming the Elephant....Building a Big Data & Analytics Practice - Part I

A couple of decades ago, the data and information management landscape was significantly different. Though the core concepts of Analytics have not, in a large sense, changed dramatically, adoption and the ease of analytical model development have shifted markedly in recent years. Traditional Analytics adoption has grown exponentially, and Big Data Analytics demands additional, newer skills.

To elaborate further, we need to go back in time and look at the journey of data. Before 1950, most data and information was stored in file-based systems (after the earlier discovery and use of punched cards). Around 1960, Database Management Systems (DBMS) became a reality with the introduction of hierarchical database systems like IBM Information Management System (IMS), and thereafter network database systems like Raima Database Manager (RDM). Then came Dr. Codd's Normal Forms and the Relational Model. Small-scale relational databases (mostly single-user initially) like dBase, Access and FoxPro started gaining popularity.

With IBM's System R (which pioneered the Structured Query Language, SQL, and from which IBM DB2 descended) and the ACID-compliant (Atomicity, Consistency, Isolation, Durability) Ingres database being released, commercialization of multi-user RDBMS became a reality, with Oracle and Sybase (now acquired by SAP) databases coming into use in the following years. Microsoft had licensed Sybase on OS/2 as SQL Server and later split with Sybase to continue on the Windows OS platform. The open source movement continued with PostgreSQL (an Object-Relational DBMS) and MySQL (now acquired by Oracle) being released around the mid-1990s. Over two decades, RDBMS and SQL grew to become the standard for enterprises to store and manage their data.
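The Atomicity half of the ACID guarantee mentioned above can be illustrated with a minimal sketch using Python's built-in sqlite3 module; the account names and amounts here are purely hypothetical.

```python
import sqlite3

# Atomicity sketch: two updates inside one transaction either both
# commit or both roll back, so the invariant (total balance) holds.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 100)])
conn.commit()

try:
    with conn:  # the connection context manager commits on success, rolls back on error
        conn.execute("UPDATE accounts SET balance = balance - 150 WHERE name = 'alice'")
        (bal,) = conn.execute("SELECT balance FROM accounts WHERE name = 'alice'").fetchone()
        if bal < 0:  # enforce the rule manually; a real schema might use a CHECK constraint
            raise ValueError("insufficient funds")
        conn.execute("UPDATE accounts SET balance = balance + 150 WHERE name = 'bob'")
except ValueError:
    pass  # the transaction was rolled back; neither update persisted

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # both accounts unchanged
```

Because the first debit would have driven the balance negative, the whole transaction is rolled back and both rows retain their original values.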

From the 1980s, Data Warehousing systems started to evolve, storing historical information to separate the overhead of Reporting and MIS from OLTP systems. With Bill Inmon's CIF model and later Ralph Kimball's popular OLAP-supporting Dimensional Model (denormalized Star and Snowflake schemas) gaining popularity, metadata-driven ETL and Business Intelligence tools started gaining traction, while database product strategies promoted the then lesser-used ELT approach and other in-database capabilities, such as the in-database data mining released in Oracle 10g. For DWBI products and solutions, storing and managing metadata in the most efficient manner proved to be the differentiator. Data Modeling tools started to gain importance beyond desktop and web application development. Business Rules Management technologies like ILOG JRules, FICO Blaze Advisor or Pega started to integrate with DWBI applications.
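The dimensional (star schema) model described above can be sketched in miniature: a fact table of measures joined to denormalized dimension tables, queried with an OLAP-style rollup. All table names, columns and figures below are illustrative, built on Python's in-memory sqlite3.

```python
import sqlite3

# A minimal star schema: one fact table joined to two dimension tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    amount REAL
);
INSERT INTO dim_date VALUES (1, 2015, 11), (2, 2015, 12);
INSERT INTO dim_product VALUES (10, 'Hardware'), (11, 'Software');
INSERT INTO fact_sales VALUES (1, 10, 100.0), (1, 11, 250.0), (2, 10, 75.0);
""")

# Typical dimensional rollup: revenue by month and product category.
rows = cur.execute("""
    SELECT d.year, d.month, p.category, SUM(f.amount) AS revenue
    FROM fact_sales f
    JOIN dim_date d ON d.date_key = f.date_key
    JOIN dim_product p ON p.product_key = f.product_key
    GROUP BY d.year, d.month, p.category
    ORDER BY d.year, d.month, p.category
""").fetchall()
for row in rows:
    print(row)
```

The query shape (fact table at the centre, one join per dimension, GROUP BY on dimension attributes) is what made star schemas so friendly to metadata-driven BI tools.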


Once Data Warehouses started maturing, the need for Data Quality initiatives started to rise. Most Data Warehousing development cycles would have used a subset of the production data (at times obfuscated / masked) during development, so even if the implementation approach included Data Cleansing and Standardization, core DQ issues would emerge after the production release, at times rendering the warehouse unusable until the DQ issues were resolved.

Multi-domain Master Data Management (both Operational and Analytical) and Data Governance projects started to grow in demand once organizations began to view data as an enterprise asset, enabling a single version of the truth to increase business efficiency and supporting both internal and, at times, external data monetization. OLAP integrated with BI to provide ad-hoc reporting, besides being popular for what-if modeling and analysis in EPM / CPM implementations (Cognos TM1, Hyperion Essbase, etc.).

Analytics was primarily implemented by practitioners using SAS (1976) and SPSS (1968) for Descriptive and Predictive Analytics in production environments, and ILOG CPLEX (1987) and ARENA (2000) for Prescriptive Modeling, including Optimization and Simulation. While SAS had programming components within Base SAS, SAS/STAT and SAS/GRAPH, the strategy evolved to move SAS towards a UI-based modeling platform with the launch of Enterprise Miner and Enterprise Guide, products similar to SPSS Statistics and Clementine (later IBM PASW Modeler), which were essentially UI-based drag-drop-configure analytics model development software for practitioners, usually with a background in Mathematics, Statistics, Economics, Operations Research, Marketing Research or Business Management. Models used representative sample data and a reduced set of factors / attributes, so performance was not an issue until then.

Around the middle of the last decade, anyone with knowledge and experience of Oracle, ERwin, Informatica and MicroStrategy, or competing technologies, could play the role of a DWBI Technology Lead, or even an Information Architect given additional exposure to and experience in designing non-functional DW requirements, including scalability, best practices, security, etc.

Soon, enterprise data warehouses, now needing to store years of data, often without an archival strategy, started to grow exponentially in size. Even with optimized databases and queries, performance dropped. Then came Appliances, or balanced / optimized data warehouses: optimized database software often coupled with the operating system and custom hardware. Most appliances supported only vertical scaling, but the benefits they brought were rapid accessibility, rapid deployment, high availability, fault tolerance and security.
Appliances thus became the next big thing, with Agile Data Warehouse migration projects being undertaken to move from RDBMS like Oracle, DB2 and SQL Server to query-optimized DW Appliances like Teradata, Netezza, Greenplum, etc., incorporating capabilities like data compression and massively parallel processing (shared-nothing architecture), apart from other features. HP Vertica, which took the appliance route initially, later reverted to being a software-only solution.

Initially, Parallel Processing had three basic architectures: MPP, SMP and NUMA. MPP stands for Massively Parallel Processing and is the most commonly implemented architecture for query-intensive systems. SMP stands for Symmetric Multiprocessing and had a Shared Everything (including shared disk) architecture, while NUMA stands for Non-Uniform Memory Access, essentially a combination of SMP and MPP. Over time, these architectural definitions became more amorphous as products kept improving their offerings.
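The shared-nothing idea behind MPP can be shown with a toy sketch: each "node" aggregates only its own partition of the data, and a coordinator merges the partial results. Real MPP databases run the per-partition step in parallel on separate hardware; here the partitions and values are hypothetical and the steps run sequentially for clarity.

```python
from functools import reduce

def local_aggregate(partition):
    """Runs on one node: sum and count over that node's own rows only."""
    return (sum(partition), len(partition))

def merge(a, b):
    """Coordinator step: merge partial (sum, count) aggregates from two nodes."""
    return (a[0] + b[0], a[1] + b[1])

# Hypothetical rows spread across three nodes (e.g. hash/range partitioned).
partitions = [[4, 8], [15, 16], [23, 42]]

# In a real MPP system these local scans run in parallel, one per node.
partials = [local_aggregate(p) for p in partitions]
total, count = reduce(merge, partials)
print(total / count)  # global average computed without any shared storage
```

Decomposing an aggregate into a per-partition step plus a merge step is exactly what lets shared-nothing systems scale out: no node ever needs to see another node's data.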

While industry and cross-industry packaged DWBI and Analytics solutions increasingly became a Product and SI / Solution Partner strategy, the end of the last decade saw growing adoption of open source ETL, BI and Analytics technologies like Talend, Pentaho and R libraries across industries (with the exceptions of the Pharma & Life Sciences and BFSI industry groups / sectors), in organizations where essential features and functionality were sufficient to justify the ROI on DWBI initiatives, which were usually undertaken for strategic requirements rather than for day-to-day operational intelligence or for insight-driven additional or new revenue generation.

Adoption of cloud-based platforms and solutions, and even DWBI and Analytics application development on private or public cloud platforms like Amazon and Azure (IBM has since come up with Bluemix and dashDB as alternatives), also started to grow, either as part of a start-up strategy or as a cost optimization initiative of Small and Medium Businesses, and even in some large enterprises as an exploratory initiative, given confidence in data security.

Visualization software also started to emerge and carve a niche, growing in relevance mostly as a complementary solution to IT-dependent Enterprise Reporting Platforms. The visualization products were business-driven, unlike technology-forward enterprise BI platforms that could also provide self-service, mobile dashboards, write-back, collaboration, etc., but had multiple components with complex integration and, at times, complex pricing.

Hence, while traditional enterprise BI platforms had a data-driven "Bottom Up" product strategy, with dependence on and control by the IT team, Visualization software took a business-driven "Top Down" product strategy, empowering business users to analyze data on their own and create their own dashboards with minimal or no support from the IT department.

With capabilities like geospatial visualization, in-memory analytics and data blending, visualization software like Tableau grew increasingly in acceptance. Some others blended Visualization with out-of-the-box Analytics, like TIBCO Spotfire and, in recent years, SAS Visual Analytics, a capability otherwise achieved in visualization tools mostly by integrating with R.

All of the above was manageable with reasonable flexibility and continuity as long as data was more or less structured, ECM tools took care of documents, and EAI technologies were used mostly for real-time integration and complex event processing between applications / transactional systems.

But a few years ago, Digital platforms including Social, Mobile and others like IoT / M2M started to grow in relevance, and Big Data Analytics grew beyond POCs undertaken as experiments, first complementing the enterprise data warehouse (along with enterprise search capabilities) and at times even replacing it. The data explosion gave rise to the 3-V dilemma of volume, velocity and variety, and data was now available in all possible forms and in newer formats like JSON and BSON, which had to be stored and transformed in real time.

Analytics now had to be done over millions of events in motion, unlike the traditional end-of-day analytics over data at rest. Business Intelligence, including Reporting and Monitoring, Alerts and even Visualization, had to become real-time. Even the consumption of analytics models now needed to be real-time, as in the case of customer recommendations and personalization, leveraging the smallest windows of opportunity to up-sell / cross-sell to customers.
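One common building block of analytics over data in motion is the sliding window: instead of rescanning history, each arriving event updates a running aggregate in constant time. A minimal sketch, with a purely hypothetical stream of transaction amounts:

```python
from collections import deque

class SlidingWindowMean:
    """Mean over the last `size` events; each update is O(1), no rescans."""

    def __init__(self, size):
        self.size = size
        self.window = deque()
        self.total = 0.0

    def update(self, value):
        # Admit the new event and maintain a running sum.
        self.window.append(value)
        self.total += value
        # Evict the oldest event once the window is full.
        if len(self.window) > self.size:
            self.total -= self.window.popleft()
        return self.total / len(self.window)

# e.g. scoring a stream of transaction amounts as they arrive
stream = [10, 20, 30, 40, 50]
w = SlidingWindowMean(size=3)
means = [w.update(x) for x in stream]
print(means)  # each element is the mean over the most recent (up to 3) events
```

The same evict-and-update pattern underlies windowed aggregation in stream processing engines, which is what makes real-time alerts and dashboards feasible at high event rates.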

It is Artificial Intelligence systems, powered by Big Data, that are becoming the game changer in the near future, with Google, IBM and others like Honda leading the way in this direction.
To be continued.........

Tuesday, 27 October 2015

Disruptive Technology Roundup - Product Engineering Services

Cloud computing is foremost among the disruptive technologies that rule the IT industry. Organizations are leveraging the public cloud to reduce infrastructure costs and to deliver technology projects faster. Since the data moved into the cloud is often dependent on the application that creates and maintains it, it is vital to integrate SaaS apps in the cloud with existing on-premise software. Configuring these multiple SaaS applications to share data in the cloud is crucial in determining the success or failure of cloud projects. Instead of choosing the most feature-rich SaaS application, organizations should consider the performance of the app and its ability to integrate into an overall portfolio. App purchasing decisions need to consider operational performance metrics beyond features and functionality, and how new SaaS apps will contribute to the way the business runs in the future. In this age of Big Data, where large volumes of data are analyzed to churn out business intelligence and insights, organizations should prefer SaaS vendors that provide access to their own data through better-performing and efficiently integrated SaaS applications.

Businesses are moving into an age of innovation and disruption under the influence of new-age disruptive technologies, including IoT and Big Data. When everything and everyone gets connected into an integrated global network, the safety of data from unauthorized access, dissemination and usage becomes a matter of greater concern. Organizations are now searching for new ways and means to protect their assets from cyber security breaches. At a time when traditional security measures are becoming inefficient, a major rethink of existing cyber security systems and strategies is the need of the hour. The global cyber security industry is going through a fundamental change, with parallel innovations happening in the security space to address the threats that IoT's own innovations and disruptions bring.

The technology space is witnessing a major upheaval, with new disruptive technologies changing the way business is carried out. Cloud, Social, and Mobile are converging and accelerating one another to give rise to a constant-access paradigm consisting of:

Continuous Services – solutions will increasingly need to be cloud-based to ensure they are always available and can be consumed on demand.

Connected Devices – proliferation of the number and types of devices that allow users to be continuously or intermittently connected to the internet and with one another.  
Product Engineering Services

With a combination of agile methodology, experienced architects and pre-built components, Happiest Minds delivers Product Engineering Services across four specific domains: Enterprise, catering to Enterprise ISV customers; Customer Platforms, focussing on E-Commerce and Media & Entertainment; IoT, focussing on Industrial, Automotive and Building Automation; and Data Center Technologies (DCT), focussing on Software Defined Networking and Data Centres. A strong team of technical experts offering Architecture and Engineering services, along with well-defined methodologies, frameworks, and product engineering processes and standards, makes Happiest Minds a preferred partner for Product Engineering Services.