Goal:
Status: Completed
My name is Pham Vu Tuan, and I am a final-year undergraduate student from Singapore. This is the first time I have participated in Google Summer of Code and contributed to an open-source organization.
My GSoC mentors for this project are Oleg Nenashev and Supun Wanniarachchi. I have also received great support from two developers on the Remoting project, Devin Nusbaum and Jeff Thompson.
Current versions of Jenkins Remoting are based on the TCP protocol. If the TCP connection fails, the agent connection and the build fail as well. There are also issues with traffic prioritization and multi-agent communications, which impact Jenkins stability and scalability.
This project aims to develop a plugin that adds support for a popular message queue/bus technology (Kafka) as a fault-tolerant communication layer in Jenkins.
The plugin source code can be found on GitHub.
The plugin provides useful features to the community:
Provides a new method to connect an agent to the master using Kafka, alongside existing methods such as JNLP or the SSH Build Agents plugin.
Helps to resolve the existing issues with TCP-based communication between master and agent in Jenkins.
Helps to resolve the traffic prioritization and multi-agent communication issues in Jenkins.
Kafka itself is not a queue like ActiveMQ or RabbitMQ; it is a distributed, replicated commit log. This removes the message delivery complexity found in traditional queue systems.
Data streaming is a requirement for us, and Kafka handles it well, whereas RabbitMQ lacks this capability.
Kafka is said to scale better and has good support from its development community.
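To make the "commit log, not a queue" distinction concrete, here is a minimal in-memory sketch. It is not the Kafka API: `CommitLogSketch`, `append`, `poll` and `seek` are hypothetical names chosen for illustration, and a real Kafka log is partitioned, replicated and persisted. The point is only that reading does not remove records; each consumer group keeps its own offset and can rewind.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy, in-memory stand-in for a single log partition (hypothetical, not the Kafka API). */
class CommitLogSketch {
    // Records are only ever appended; reading never deletes them.
    private final List<String> records = new ArrayList<>();
    // Each consumer group tracks its own read position in the log.
    private final Map<String, Integer> offsets = new HashMap<>();

    /** Appends a record and returns the offset it was stored at. */
    synchronized int append(String record) {
        records.add(record);
        return records.size() - 1;
    }

    /** Returns the records this group has not read yet and advances its offset. */
    synchronized List<String> poll(String group) {
        int from = offsets.getOrDefault(group, 0);
        List<String> batch = new ArrayList<>(records.subList(from, records.size()));
        offsets.put(group, records.size());
        return batch;
    }

    /** Moves a group's offset, for example to re-read records after a failure. */
    synchronized void seek(String group, int offset) {
        if (offset < 0 || offset > records.size()) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        offsets.put(group, offset);
    }
}
```

In this sketch, two groups polling the same log each see every record, and a group that calls `seek` can replay earlier records. A traditional queue would instead hand each message to one consumer and then discard it.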
The project consists of multiple components:
Kafka Client Library - new command transport implementation, producer and consumer client logic.
Remoting Kafka Plugin - plugin implementation with KafkaGlobalConfiguration, KafkaComputerLauncher and KafkaSecretManager.
Remoting Kafka Agent - a custom JAR agent with the remoting JAR packaged together with a custom Engine implementation to set up a communication channel with Kafka. The agent is also packaged as a Docker image on Docker Hub.
All the components are packaged together with Docker Compose.
The diagram below gives an overview of the current architecture:
With this design, the master no longer communicates with the agent over a direct TCP connection; all communication commands are transferred through Kafka.
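The idea of routing commands through a broker instead of a direct socket can be sketched as follows. This is a simplified illustration, not the plugin's code: `BrokerSketch` is an in-memory stand-in for Kafka, `TopicTransportSketch` is a hypothetical class, and the one-topic-per-direction layout is an assumption made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** In-memory stand-in for a broker: one FIFO of byte[] per topic name (hypothetical). */
class BrokerSketch {
    private final Map<String, BlockingQueue<byte[]>> topics = new ConcurrentHashMap<>();

    BlockingQueue<byte[]> topic(String name) {
        return topics.computeIfAbsent(name, n -> new LinkedBlockingQueue<>());
    }
}

/** One end of the channel: writes commands to one topic and reads them from another. */
class TopicTransportSketch {
    private final BlockingQueue<byte[]> out;
    private final BlockingQueue<byte[]> in;

    TopicTransportSketch(BrokerSketch broker, String writeTopic, String readTopic) {
        this.out = broker.topic(writeTopic);
        this.in = broker.topic(readTopic);
    }

    /** Serializes the command and publishes it; no socket to the peer is involved. */
    void write(Serializable command) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
            oos.writeObject(command);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        out.add(bytes.toByteArray());
    }

    /** Waits up to timeoutMs for the next command; returns null if none arrives. */
    Object read(long timeoutMs) {
        try {
            byte[] payload = in.poll(timeoutMs, TimeUnit.MILLISECONDS);
            if (payload == null) {
                return null;
            }
            try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(payload))) {
                return ois.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("unknown command class", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a command", e);
        }
    }
}
```

A master-side transport created with topics `("master-to-agent", "agent-to-master")` and an agent-side transport created with the same names swapped form a bidirectional channel. Neither side needs to know the other's network address, only the broker and the topic names.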
Set up the project as a set of Docker Compose components: Kafka cluster, Jenkins controller (with plugin) and a custom agent.
Create a PoC with a command transport implementation to support Kafka, which involves command invocation, RMI, classloading and data streaming.
Make necessary changes in Remoting and Core to make them extensible (if needed).
Make a decision whether Kafka is suitable as a final implementation.
Support security for master-agent connection.
Kafka authentication/authorization (JENKINS-51472, JENKINS-51473).
Agent secrets (JENKINS-51470).
Improve Kafka producer-consumer model to ensure reliability (JENKINS-51942).
Release alpha version and address feedback (JENKINS-51713).
Release 1.0 version of the plugin (JENKINS-52483).
Support remoting-kafka-plugin with Jenkins Configuration-as-code plugin (JENKINS-51471).
Ready-to-fly demo setup (JENKINS-51711).
Basic test automation for the plugin (JENKINS-52540).
Add plugin implementation information to remoting documentation (JENKINS-52514).
Bug fixing and clean up work (JENKINS-52044, JENKINS-52079, JENKINS-52422).
We have set up a ready-to-fly demo for this plugin. You can run the demo by following these instructions. Features in the demo:
Docker Compose starts preconfigured master and agent instances, which connect automatically using the Kafka launcher.
Kafka is secured and encrypted with SSL.
There are a few demo jobs in the instance so that a user can launch a job on the agent.
Kafka Manager is available at localhost:9000 to support monitoring of the Kafka cluster.
Cloud API implementation (JENKINS-51474).
Chunking capabilities for Kafka channel (JENKINS-51709).
Stop bundling remoting in Remoting Kafka Agent (JENKINS-51944).
Consumer pooling, NIO options (JENKINS-52199).
Support multiple Kafka hosts to achieve fault-tolerant communication (JENKINS-52542).
Agent recovery to continue running jobs after disconnection from Kafka (JENKINS-52954).
Make Zookeeper configuration optional to support ad-hoc topic creation (JENKINS-52870).
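As an illustration of what the chunking item in the list above could involve, here is a minimal sketch. It is one possible approach rather than the planned design: `ChunkingSketch`, `split` and `join` are hypothetical names, and a real implementation would also need sequence numbers and an end-of-message marker so the receiver can reassemble chunks that arrive interleaved with other traffic. The motivation is that Kafka brokers enforce a configurable maximum message size, so a large serialized command would have to be sent as several smaller messages.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Splits a payload into bounded chunks and joins them back (hypothetical helper). */
class ChunkingSketch {

    /** Cuts the payload into consecutive chunks of at most maxChunkSize bytes. */
    static List<byte[]> split(byte[] payload, int maxChunkSize) {
        if (maxChunkSize <= 0) {
            throw new IllegalArgumentException("maxChunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        int start = 0;
        while (start < payload.length) {
            int length = Math.min(maxChunkSize, payload.length - start);
            chunks.add(Arrays.copyOfRange(payload, start, start + length));
            start += length;
        }
        return chunks;
    }

    /** Concatenates chunks in the order given, restoring the original payload. */
    static byte[] join(List<byte[]> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }
}
```

The sender would publish each element returned by `split` as a separate message, and the receiver would call `join` once it has collected all chunks of a command in order.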