
Lifecycle Management for Big Data Projects
Guaranteed success for all data projects

Gartner has reported for more than 30 years that a significant portion of big data projects, an estimated 85%, fail to deliver expected results. That persistent failure rate highlights the challenges businesses face in successfully implementing and leveraging big data initiatives.
The market is flooded with graphical ETL products, and for years the vendors of these products have pushed the notion that data projects fail primarily because of hand-coded ETL. If that belief were true, graphical tools should have solved the problem long ago, so we need to ask why the high failure rate has persisted for more than 30 years.
All ETL products are based on the same paradigm. The unit-of-work is a job, graph, drawing, or the like: it consumes one or more data sources; joins, cleanses, modifies, and restructures the source data; and finally produces one or more data targets. These products contribute to data project failure because that unit-of-work cannot accommodate or direct the entire lifecycle of a big data project. A data project produces a data application composed of many ETL units-of-work created by many project participants, and application development and application runtime must be a single endeavor. Something much larger is required.
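To make the paradigm concrete, here is a minimal sketch of one such unit-of-work in Python. The file names and fields are invented for illustration; no particular vendor's product is implied. It extracts two sources, joins and cleanses them, and loads one target:

    import csv

    def run_job(customers_path, orders_path, target_path):
        # Extract: read both sources.
        with open(customers_path, newline="") as f:
            customers = {row["customer_id"]: row for row in csv.DictReader(f)}
        with open(orders_path, newline="") as f:
            orders = list(csv.DictReader(f))

        # Transform: join, cleanse, and restructure.
        rows = []
        for order in orders:
            customer = customers.get(order["customer_id"])
            if customer is None:
                continue  # cleanse: drop orders with no matching customer
            rows.append({
                "order_id": order["order_id"],
                "customer_name": customer["name"].strip().title(),
                "amount": round(float(order["amount"]), 2),
            })

        # Load: write the single target.
        with open(target_path, "w", newline="") as f:
            writer = csv.DictWriter(
                f, fieldnames=["order_id", "customer_name", "amount"])
            writer.writeheader()
            writer.writerows(rows)

Notice what is absent: the job knows nothing about scheduling, recovery, load distribution, or the hundreds of sibling jobs that surround it in a real application.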
A big data project will employ many participants – data analysts, developers, testers, runtime operators – creating tens, hundreds, or even thousands of those ETL units-of-work concurrently, and those units-of-work collectively become the data application.
A solution is needed that guarantees success for all data projects by providing an environment that accommodates the entire data project lifecycle, from application requirements gathering, through object development and application deployment, to ongoing application maintenance. That solution is Mozart4Draw.io.
Many data projects begin with the acquisition of an ETL tool, which means developer training is required because these tools tend to be very complex. Imagine being an English speaker in the USA, learning Italian over a six-week period, and then being tasked with writing a successful Italian opera. This is how most data projects begin, and the projects that actually succeed are typically implemented by ETL vendor consultants at huge expense. Further, the applications they create cannot be easily maintained or modified by the novice Italian speakers, so you are now married to your ETL vendor's consultants whenever problems arise or modifications are needed as business requirements evolve. There is a better way.
Encapsulation principles provide a segregated context for every project role. Segregated, because a data analyst needs no knowledge of ETL development, testing, systems, or the runtime environment; a developer of an ETL object needs no knowledge of other ETL objects, testing, the overall application, or the application runtime environment; and production operators need no knowledge of the application's functionality or purpose, because load management, load distribution, and point-of-failure recovery are all automated.
Because objects resemble standalone black boxes whose contents are irrelevant to most project participants, different developers may use different ETL technologies to implement their objects' functionality. This means that if a project begins with three proficient Datastage developers and five proficient Oracle PL/SQL developers, it begins with the best object developers from day one, each working in a preferred technology. Further, because each object is self-contained, with defined interfaces and requirements, each object may be developed and tested anywhere in the world.
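As an illustration of the principle, here is one hypothetical way such an object contract could look. The interface below is invented for this page, not Mozart4Draw.io's actual API: each object declares only its inputs, its outputs, and an entry point, so how it is implemented internally is nobody else's concern.

    from abc import ABC, abstractmethod

    class DataObject(ABC):
        """An encapsulated unit: declared inputs, declared outputs, one entry point."""
        inputs: tuple[str, ...] = ()
        outputs: tuple[str, ...] = ()

        @abstractmethod
        def run(self) -> None:
            """Produce every declared output from the declared inputs."""

    class CleanseCustomers(DataObject):
        # One developer might implement this object as an Oracle PL/SQL call...
        inputs = ("raw.customers",)
        outputs = ("clean.customers",)
        def run(self) -> None:
            pass  # e.g., invoke a stored procedure

    class BuildOrderFacts(DataObject):
        # ...while another implements this one as a Datastage job invocation.
        inputs = ("clean.customers", "raw.orders")
        outputs = ("warehouse.order_facts",)
        def run(self) -> None:
            pass  # e.g., launch a Datastage job and wait for it to finish

Because the contract is all the rest of the project sees, the two objects above could be built and tested by different teams on different continents.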
Mozart4Draw.io is both a software plugin and a methodology created for the open-source Draw.io drawing application. It facilitates the decomposition of complex data application requirements into encapsulated, multi-level objects for distribution to developers and testers. When the developed and tested objects are plugged into the parent data application, the completed application is portable across processing environments, can be distributed within a single processing environment or across several, is point-of-failure recoverable, and is easily modified and maintained over time. You essentially draw a multi-level, object-oriented data diagram and run it on any processing topology. Mozart4Draw.io provides guided context for the entire data project lifecycle.
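Continuing the hypothetical sketch above (again, an illustration of the idea rather than Mozart4Draw.io's actual engine), the declared interfaces are enough for a parent application to derive a correct run order on its own, which is what makes a drawing of objects executable. This uses Python's standard-library graphlib (Python 3.9+):

    from graphlib import TopologicalSorter

    def run_application(objects):
        # Which object produces each named dataset?
        producer = {out: obj for obj in objects for out in obj.outputs}
        # An object depends on whichever objects produce its inputs.
        graph = {obj: {producer[i] for i in obj.inputs if i in producer}
                 for obj in objects}
        # Run every object exactly once, in dependency order.
        for obj in TopologicalSorter(graph).static_order():
            obj.run()

    run_application([BuildOrderFacts(), CleanseCustomers()])  # order is derived, not declared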
You will be amazed at how easy it is to manage any data project and produce successful, high-performing data applications.

The journey began in the mid-1990s, when I was a big data consultant working for the Informix Data Warehouse group. I was assigned to an enterprise data warehouse project for a large car rental company in the Midwest. It was the company's first data warehouse project and had 34 participants. The ETL product selected was Informix-4GL, a hand-coding product, and the ETL developers were hired from within and given very little training before the project began. I realized immediately that I could give a developer specifications defining the inputs, outputs, and mappings for an individual ETL job, but the developer would not be able to deal with things like application-wide point-of-failure recovery, load balancing and distribution, and application metadata. I wrote an engine, based on encapsulation principles, that managed the hundreds of ETL jobs in the production environment, and the project was a resounding success, delivering accurate data in a timely manner.
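As a toy illustration of that engine idea (the original was written in Informix-4GL; everything below is a simplified Python rendering with an invented checkpoint file, not the original code), point-of-failure recovery reduces to recording each completed job so a restarted run resumes where the failure occurred:

    import json, os

    STATE_FILE = "run_state.json"  # hypothetical checkpoint file

    def run_with_recovery(jobs):
        """jobs: a list of (name, callable) pairs, already in dependency order."""
        done = set()
        if os.path.exists(STATE_FILE):      # a previous run failed part-way
            with open(STATE_FILE) as f:
                done = set(json.load(f))
        for name, job in jobs:
            if name in done:
                continue                    # finished in an earlier attempt
            job()                           # if this raises, the state file survives
            done.add(name)
            with open(STATE_FILE, "w") as f:
                json.dump(sorted(done), f)  # checkpoint after every success
        os.remove(STATE_FILE)               # clean finish: next run starts fresh

With hundreds of jobs wrapped this way, operators could restart a failed nightly load without knowing anything about what the jobs actually did.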