|
| 1 | +# Migrating to Spring Cloud Data Flow |
| 2 | + |
| 3 | +[Spring Cloud Data Flow](https://cloud.spring.io/spring-cloud-dataflow/) is a microservices |
| 4 | +orchestration tool used to deploy Spring Boot applications (or Docker images) as either |
| 5 | +[streams](http://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#spring-cloud-dataflow-streams) or [tasks](http://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#spring-cloud-dataflow-task). This document outlines the |
| 6 | +differences between Spring Batch Admin and Spring Cloud Data Flow and walks through |
| 7 | +what is needed to migrate Spring Batch Admin use cases to Spring Cloud Data |
| 8 | +Flow. |
| 9 | + |
| 10 | +## What's different between Spring Batch Admin and Spring Cloud Data Flow? |
| 11 | + |
| 12 | +Spring Batch Admin is a legacy web application that is used to orchestrate |
| 13 | +[Spring Batch](https://projects.spring.io/spring-batch/) jobs. It does this by packaging |
| 14 | +the batch jobs and the code for Spring Batch Admin into a single WAR file and deploying |
| 15 | +that onto a servlet container. Once it is running, users can launch the jobs that were |
| 16 | +deployed with the application and monitor them through the provided web-based UI |
| 17 | +(the monitoring data comes from Spring Batch's job repository). |
| 18 | + |
| 19 | +Spring Cloud Data Flow is a server (itself a Spring Boot application) that [orchestrates independent microservices on a |
| 20 | +platform](http://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#architecture). In the Spring Batch use case, each batch job is packaged as an independent Spring |
| 21 | +Boot über jar that is registered with Spring Cloud Data Flow. From there, a user can |
| 22 | +orchestrate their jobs by launching them via the provided web-based UI, an interactive |
| 23 | +shell, or a set of REST endpoints directly. Spring Cloud Data Flow [supports a number |
| 24 | +of platforms](http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) for |
| 25 | +running batch jobs, including Cloud Foundry, Kubernetes, and YARN. For |
| 26 | +existing Spring Batch Admin users who are migrating, the Local deployer is also supported |
| 27 | +for limited production use cases. |
| 28 | + |
| 29 | +There are a few differences between Spring Batch Admin and Spring Cloud Data Flow when |
| 30 | +comparing them for Spring Batch Admin use cases: |
| 31 | + |
| 32 | +1. **Packaging -** Spring Batch Admin packages the Spring Batch jobs you want to run within a |
| 33 | +WAR file as a single monolith. While the ability to upload new XML configuration files is |
| 34 | +a feature within Spring Batch Admin, its use is limited given there is no ability to |
| 35 | +upload additional code. With Spring Cloud Data Flow, the batch jobs are packaged independently |
| 36 | +from the orchestration server. They are two entirely separate entities, which allows for |
| 37 | +more flexibility: the Spring Cloud Data Flow |
| 38 | +approach does not require a re-deploy every time a batch job is added or modified. |
| 39 | +2. **Execution Model -** Spring Batch Admin executes the batch jobs within the JVM the web |
| 40 | +application is running in. All batch jobs share the same memory heap and other JVM resources, which can lead to |
| 41 | +issues such as noisy neighbors. Spring Cloud Data Flow executes each batch job in an |
| 42 | +independent JVM. When used with Cloud Foundry or Kubernetes, each batch job runs within its |
| 43 | +own container and is completely isolated not only from the other batch jobs, but also from the |
| 44 | +orchestration server itself. When run with the Local deployer, new JVMs are |
| 45 | +launched on the same machine, again each with its own independent memory heap. These |
| 46 | +finite JVMs and containers in Spring Cloud Data Flow are called tasks. |
| 47 | +[Spring Cloud Task](http://cloud.spring.io/spring-cloud-task/) adds additional |
| 48 | +functionality to Spring Batch that allows Spring Batch jobs to work with Spring Cloud Data |
| 49 | +Flow. |
| 50 | +3. **Interaction options -** Spring Batch Admin has a web-based UI and a set of REST |
| 51 | +endpoints that can be used to execute and monitor jobs. Spring Cloud Data Flow provides |
| 52 | +not only a web-based UI and REST API to execute and monitor jobs, but also an [interactive |
| 53 | +shell](http://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#shell) to orchestrate jobs. It also provides a DSL and a drag-and-drop UI for orchestrating |
| 54 | +complex flows of jobs ([execute jobs B and C after A completes, execute job D after both B |
| 55 | +and C complete, etc](http://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#_composed_tasks_dsl)). |
| 56 | +[Video](https://www.youtube.com/watch?v=KT_4kVcyfRA) |
| 57 | +4. **Customization options -** Spring Batch Admin was designed to be customizable. It |
| 58 | +provided documentation not only on how to configure your own component overrides, but also |
| 59 | +on how to customize the UI. Spring Cloud Data Flow is more rigid in that respect. There are |
| 60 | +some features you can turn off via feature toggles (the stream functionality, for |
| 61 | +example), but the Angular-based web UI is not intended to be extensively customized. |
| 62 | +Doing so means forking the repository and making deeper modifications, or building your own |
| 63 | +dashboard on top of the provided REST APIs. |
| 64 | + |
| 65 | +## Migrating from Spring Batch Admin to Spring Cloud Data Flow |
| 66 | + |
| 67 | +With the above in mind, the steps for migrating from Spring Batch Admin to Spring Cloud |
| 68 | +Data Flow are rather straightforward. |
| 69 | + |
| 70 | +1. **Read the Spring Cloud Data Flow reference documentation -** Given the differences in |
| 71 | +packaging, execution model, and interaction options, the migration will be much easier |
| 72 | +once you fully understand how Spring Cloud Data Flow works. |
| 73 | +2. **Repackage your batch jobs as Spring Boot über jars (or Docker containers) -** If using |
| 74 | +either the Cloud Foundry or Local variants of Spring Cloud Data Flow, you'll want to |
| 75 | +repackage your Spring Batch jobs as Spring Boot über jars with the `@EnableTask` |
| 76 | +annotation from the [Spring Cloud Task](http://cloud.spring.io/spring-cloud-task/) project |
| 77 | +added (this annotation allows Spring Cloud Data Flow to work with Spring Batch natively). |
| 78 | +If you are going to be using the Kubernetes variant of Spring Cloud Data Flow, you'll want |
| 79 | +to package your batch jobs as über jars that are then wrapped in a Docker image. |
| 80 | +3. **Register your batch jobs with Spring Cloud Data Flow -** Once you have Spring Cloud Data Flow |
| 81 | +running, you'll need to register the jar files or Docker images with the server. The |
| 82 | +Spring Cloud Data Flow documentation walks through |
| 83 | +[how to register a task application](http://docs.spring.io/spring-cloud-dataflow/docs/1.2.2.RELEASE/reference/htmlsingle/#_registering_a_task_application). |
| 84 | +4. **Launch your Spring Batch jobs as tasks -** Once the batch jobs are registered, you can launch |
| 85 | +them as tasks. A task is nothing more than a microservice that has an expected end (as |
| 86 | +all batch jobs do). In the Spring Batch Admin use case, think of each task as a separate JVM |
| 87 | +that shuts down once your job is complete. |
| 88 | + |
| 89 | + |
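
The register and launch steps above can also be driven through the REST endpoints rather than the UI or shell. The following is a minimal sketch of what those calls might look like using only the JDK's `java.net.http` API (Java 11+). It only builds the requests and does not send them. Several things here are assumptions rather than details taken from this guide: the server address `http://localhost:9393`, the endpoint paths (`/apps/task/{name}`, `/tasks/definitions`, `/tasks/executions`) and their parameter names, which should be checked against the REST API guide for your Spring Cloud Data Flow version, and the hypothetical names `nightly-job`, `nightly` and `maven://com.example:nightly-job:1.0.0`.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

class DataFlowRequests {

    // Assumption: a Data Flow server running locally on its usual default port.
    static final String SERVER = "http://localhost:9393";

    // URL-encode a single path segment or query parameter value.
    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Build a body-less POST request against the assumed server address.
    static HttpRequest post(String pathAndQuery) {
        return HttpRequest.newBuilder(URI.create(SERVER + pathAndQuery))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Step 3: register a jar (or Docker image) as a task application.
    // Assumed endpoint: POST /apps/task/{name}?uri=...
    static HttpRequest registerTaskApp(String appName, String artifactUri) {
        return post("/apps/task/" + enc(appName) + "?uri=" + enc(artifactUri));
    }

    // Create a task definition that refers to the registered application.
    // Assumed endpoint: POST /tasks/definitions?name=...&definition=...
    static HttpRequest createTaskDefinition(String taskName, String appName) {
        return post("/tasks/definitions?name=" + enc(taskName)
                + "&definition=" + enc(appName));
    }

    // Step 4: launch the task, which runs the batch job in its own JVM or container.
    // Assumed endpoint: POST /tasks/executions?name=...
    static HttpRequest launchTask(String taskName) {
        return post("/tasks/executions?name=" + enc(taskName));
    }

    public static void main(String[] args) {
        // Hypothetical application, task, and artifact names for illustration only.
        HttpRequest[] requests = {
            registerTaskApp("nightly-job", "maven://com.example:nightly-job:1.0.0"),
            createTaskDefinition("nightly", "nightly-job"),
            launchTask("nightly")
        };
        for (HttpRequest request : requests) {
            System.out.println(request.method() + " " + request.uri());
        }
    }
}
```

To actually issue the calls, each request could be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. The same three operations are available from the web-based UI and the interactive shell, which may be more convenient for one-off migrations.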