DataOps and further DevOps derivatives

Technology Summary

Brief definition of DevOps

DevOps combines software development (Dev) and IT operations (Ops) in a closer, more effective and efficient way. A key objective is rapid and frequent software deployment, with release cycles often moving from ‘quarterly’ to ‘daily’ or even near-continuous. That includes improved monitoring, defect management, and rollback, without which rapid deployment would not be repeatable. DevOps is often visualized as an infinite loop.

DevOps achieves this through new practices and roles, optimized and automated processes, and advanced tooling. It puts a strong emphasis on continuous improvement by defining metrics and acting on them.

Applying DevOps to other objects

The object of DevOps is a software product. Because DevOps is very successful, the approach is applied to other objects as well. This works best if a target object has the following main characteristics:

  • The object consists of multiple parts that are worked on separately and can be automatically merged (‘built’) into a deployable.
  • The object can be tested automatically.
  • The object has the notion of a ‘productive use’. The production environments can be described declaratively and their setup automated.
  • The object can be automatically deployed to production environments, but also rolled back from them to an earlier version.
  • Metrics (KPIs) on the performance of the object in a target environment can be defined and automatically collected.

This is not a conclusive or binding set of characteristics; the ‘Ops’ term can be used (and abused) at will. But the more criteria are fulfilled, the better the DevOps principles apply.
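The characteristics above can be read as a simple checklist. The following is a minimal sketch of that idea in Python; the class name `OpsFit`, its field names, and the two example assessments are hypothetical and chosen purely for illustration. They are not part of any established DevOps framework.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class OpsFit:
    """Hypothetical checklist mirroring the five characteristics listed above."""

    buildable_from_parts: bool      # parts merge automatically into a deployable
    automatically_testable: bool    # tests run without manual steps
    declarative_production: bool    # production environments described declaratively
    deployable_with_rollback: bool  # automated deployment and rollback
    measurable_kpis: bool           # KPIs defined and collected automatically

    def score(self) -> int:
        """Number of characteristics the object fulfils (0 to 5)."""
        return sum(getattr(self, f.name) for f in fields(self))

    def missing(self) -> list[str]:
        """Names of the characteristics that are not fulfilled."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Illustrative assessments only (assumptions, not measurements).
data_pipeline = OpsFit(True, True, True, True, True)
design_process = OpsFit(False, False, False, False, True)

print(data_pipeline.score(), data_pipeline.missing())
print(design_process.score(), design_process.missing())
```

A higher score only suggests that the DevOps principles are likely to carry over well; it is not a formal classification.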

Technology Evaluation

Ops for Data Pipelines: DataOps

Data and analytics engineers implement data flows by building data pipelines. Objectives include, but are not limited to, serving Data Warehouses and Data Hubs, employing both ETL and ELT techniques. Data pipelines operate in complex and highly dynamic environments where changes occur without notice (sometimes described as ‘data drift’), which demands the level of agility that DevOps provides. It is therefore not surprising that DataOps is one of the hottest emerging trends in data management and integration.

Data pipelines are a specific kind of deployable software (even if not coded in the classical sense). All criteria described in the previous chapter apply, making data pipelines a perfect object to apply DevOps principles to. Tooling and some techniques have to be adapted. One example is KPI collection: a quality problem in a data pipeline manifests itself differently from an error in a software product.
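To illustrate how KPI collection differs for a data pipeline, here is a minimal sketch that computes two data-quality indicators for one batch: the share of missing values in a column and the relative change in row count against the previous run. The names `BatchKpis`, `collect_kpis`, and `violations`, the sample records, and the thresholds are all assumptions made for this example; real DataOps tooling will define its own metrics.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class BatchKpis:
    row_count: int
    null_rate: float                   # share of rows with a missing value in the checked column
    row_count_change: Optional[float]  # relative change versus the previous batch, if known


def collect_kpis(rows: Sequence[dict], column: str,
                 previous_row_count: Optional[int] = None) -> BatchKpis:
    """Compute simple quality KPIs for one batch of records."""
    row_count = len(rows)
    nulls = sum(1 for row in rows if row.get(column) is None)
    null_rate = nulls / row_count if row_count else 0.0
    change = None
    if previous_row_count:
        change = (row_count - previous_row_count) / previous_row_count
    return BatchKpis(row_count, null_rate, change)


def violations(kpis: BatchKpis, max_null_rate: float = 0.05,
               max_row_count_change: float = 0.5) -> list[str]:
    """Return a description of every threshold the batch exceeds (thresholds are assumed)."""
    problems = []
    if kpis.null_rate > max_null_rate:
        problems.append(f"null rate {kpis.null_rate:.0%} exceeds {max_null_rate:.0%}")
    if (kpis.row_count_change is not None
            and abs(kpis.row_count_change) > max_row_count_change):
        problems.append(f"row count changed by {kpis.row_count_change:+.0%}")
    return problems


# Made-up sample batch: two of four records lack an amount.
batch = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": 2, "amount": None},
    {"customer_id": 3, "amount": 7.5},
    {"customer_id": 4, "amount": None},
]
kpis = collect_kpis(batch, "amount", previous_row_count=10)
print(kpis)
print(violations(kpis))
```

Unlike a software defect, nothing crashes here: the pipeline runs to completion, and the problem only becomes visible through the collected indicators.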

It is important to keep in mind that no conclusive definition of DataOps exists (in fact, not even of DevOps). Some definitions are very broad and all-inclusive; Wikipedia tends to generate such definitions because of the way it works. Such definitions risk watering the terms down. Specifically, ‘data’ is not a deployable in itself. Using the term ‘DataOps’ just for advanced collaboration on data in a general sense misses major elements of the DevOps approach. As stated above, this is not the case for deployable data pipelines.

Figure: DevOps infinity loop. Also applicable to non-software objects that can be deployed.

Ops for analytical use cases: XOps and AnalyticsOps

XOps extends the DevOps principles to all key artefacts in the AI arena. Gartner identified XOps as a top trend in data and analytics for 2021. In ‘Demystifying XOps: DataOps, MLOps, ModelOps, AIOps and Platform Ops for AI’, Gartner describes the following elements of XOps:

  • DataOps with an emphasis on data pipelines, as described above.
  • Machine Learning Ops (MLOps), where the target objects are ML models. These are deployable objects and a good fit for ‘Ops’.
  • ModelOps covers all AI and decision models as objects, including ML models (so MLOps is actually a sub-discipline of ModelOps), graph models, heuristic models, linguistic models, and more.
  • PlatformOps is not targeting a specific object, but rather describes the orchestration of entire AI systems that include all of the above ‘Ops’ plus DevOps. Whether or not the orchestration of multiple ‘Ops’ disciplines should itself also be called ‘Ops’ is debatable (why not simply ‘orchestration’?). Nevertheless, this highlights the important point that once multiple ‘Ops’ beyond DevOps are established, orchestration across them becomes a necessity.
  • AIOps is the application of ML and AI towards IT operations, so the subject of ML and AI is the data generated during operation. ‘Ops’ stands for ‘operations’ here, not the notion that DevOps principles are adopted. This is an example of a misleading usage of the ‘Ops’ term.

Not mentioned by Gartner is AnalyticsOps, which broadens the scope from ML/AI to all kinds of analytics. For example, a streaming analytics pipeline could be the object of ‘Ops’. Both XOps and AnalyticsOps are advanced practices with the objective of accelerating data-driven decision making. In fact, numerous authors say that this is what should be called DataOps. The disadvantage of using the term ‘DataOps’ this way is that it narrows it down to analytical use cases.

Market – Current Adoption

DevOps derivatives enjoy strong adoption, with DataOps being the most mature example.

There is a continuously growing, confusing set of Ops terms. The association they are meant to evoke, namely that DevOps principles are being applied, is justified to varying degrees. Ops terms can be grouped as follows, in descending order of proximity to DevOps:

  • Extensions of the original DevOps, like BizDevOps (adding business to the scope) and DevSecOps (adding security).

  • Application of DevOps to an object that shares main characteristics of deployable code (as listed above). Examples are DataOps and MLOps.

  • Application of DevOps to anything that is not like code, but can be described in a declarative model that is then realized by an automated process resembling software deployment. For example, in Infrastructure as Code, IT environments are documented declaratively (‘codified’). GitOps combines a specific tool (Git) and DevOps principles to automatically deploy and implement infrastructure changes.

  • There is no deployable at all. Yet, principles and best practices of DevOps are applied to advance collaboration in a team by getting the right tools, systems, and practices in place: process-driven working, tool support, and KPI-driven continuous improvement. DesignOps, ContentOps, ResearchOps, ProductOps etc. come to mind. (This assumes that the resulting designs etc. are not codified, but advocates of these principles rarely think that far.)

  • Terms with a near non-existent relationship to DevOps:

    • Mostly unrelated neologisms like PeopleOps
    • Usage of ‘Ops’ simply as shorthand for ‘Operations’ in a traditional setup, with no relationship to DevOps. AIOps applies AI to operations and has nothing to do with DevOps.
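The third group in the list above (Infrastructure as Code, GitOps) rests on comparing a declared target state with the actual one and deriving the changes automatically. The following toy sketch illustrates that idea only; `plan`, `apply_state`, and the resource names are hypothetical, and the code does not reflect how any particular GitOps or Infrastructure as Code tool is implemented.

```python
def plan(desired: dict[str, dict], actual: dict[str, dict]) -> list[tuple[str, str]]:
    """Return (action, resource_name) steps that move `actual` towards `desired`."""
    steps = []
    for name in sorted(desired):
        if name not in actual:
            steps.append(("create", name))
        elif actual[name] != desired[name]:
            steps.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            steps.append(("delete", name))
    return steps


def apply_state(desired: dict[str, dict]) -> dict[str, dict]:
    """Simulate a successful deployment: the actual state now equals the desired one."""
    return {name: dict(spec) for name, spec in desired.items()}


# Two versions of a declared environment, as they might be stored in version control.
v1 = {"web": {"replicas": 2}}
v2 = {"web": {"replicas": 3}, "cache": {"size_gb": 1}}

actual = apply_state(v1)
print(plan(v2, actual))   # changes needed to roll forward to v2

actual = apply_state(v2)
print(plan(v1, actual))   # rollback is the same mechanism with an earlier version
```

In this sketch, rollback needs no special handling: it is the same comparison run against an earlier declared version, which is what makes such objects resemble deployable code.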

Finally: The ‘Ops’ in DevOps stands for ‘operations’, not ‘operationalization’. The scientific term ‘operationalization’ means, in essence, finding measurable observations for abstract concepts. It has no immediate relationship to operations. However, part of DevOps is to work in a metrics- and goal-driven way, so there is arguably an element of ‘operationalizing development and operations’ in DevOps. This leads to all kinds of confusion when ‘operations’ and ‘operationalization’ are used interchangeably.