Version: 0.1.4

Source CRD configurations

This document lists CRD configurations available for Pulsar source connectors. The source CRD configurations consist of source connector configurations and the common CRD configurations.

Source configurations

This table lists source configurations.

| Field | Description |
| --- | --- |
| name | The name of a source connector. |
| classname | The class name of a source connector. |
| tenant | The tenant of a source connector. |
| ClusterName | The Pulsar cluster of a source connector. |
| Replicas | The number of instances that you want to run for this source connector. By default, replicas is set to 1. |
| MaxReplicas | The maximum number of Pulsar instances that you want to run for this source connector. When maxReplicas is greater than replicas, the source controller automatically scales the source connector based on CPU usage. By default, maxReplicas is set to 0, which means that auto-scaling is disabled. |
| SourceConfig | The map to a ConfigMap specifying the configuration of a source connector. |
| ProcessingGuarantee | The processing guarantees (delivery semantics) applied to the source connector. Available values: ATLEAST_ONCE, ATMOST_ONCE, EFFECTIVELY_ONCE. |
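
The auto-scaling rule and the allowed processing guarantees in the table above can be restated as a small check. The following is a minimal Python sketch and not part of Function Mesh: the helper names `autoscaling_enabled` and `validate_processing_guarantee` are hypothetical, and only the rule (maxReplicas greater than replicas), the defaults, and the three values are taken from the table.

```python
# Hypothetical helpers that only restate the rules from the table above.
PROCESSING_GUARANTEES = ("ATLEAST_ONCE", "ATMOST_ONCE", "EFFECTIVELY_ONCE")


def autoscaling_enabled(replicas: int = 1, max_replicas: int = 0) -> bool:
    """Return True when maxReplicas is greater than replicas.

    The defaults mirror the documented ones: replicas=1 and maxReplicas=0,
    which means auto-scaling is disabled.
    """
    return max_replicas > replicas


def validate_processing_guarantee(value: str) -> str:
    """Return the value unchanged if it is one of the documented guarantees."""
    if value not in PROCESSING_GUARANTEES:
        raise ValueError(
            f"unknown processing guarantee {value!r}; "
            f"expected one of {', '.join(PROCESSING_GUARANTEES)}"
        )
    return value


if __name__ == "__main__":
    print(autoscaling_enabled())       # documented defaults
    print(autoscaling_enabled(1, 5))   # maxReplicas greater than replicas
    print(validate_processing_guarantee("ATLEAST_ONCE"))
```

A check like this could run before a manifest is submitted, so that a misspelled guarantee is caught early instead of being rejected by the cluster.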

Images

This section describes image options available for Pulsar source CRDs.

Base runner

The base runner is the base image for the other runners and is located at ./pulsar-functions-base-runner. The base runner image contains basic tool-chains such as /pulsar/bin, /pulsar/conf, and /pulsar/lib, so that the pulsar-admin CLI tool works properly and can support Apache Pulsar Packages.

Runner images

Function Mesh uses runner images as the images of Pulsar connectors. Each runner image contains only the tool-chains and libraries required by its runtime.

Pulsar connectors support using the Java runner image as their image. The Java runner is based on the base runner and contains the Java function instance used to run Java functions or connectors. The Java runner image, streamnative/pulsar-functions-java-runner, is hosted on Docker Hub and is automatically updated to align with Apache Pulsar releases.

Output

The output defines the output topics of a Pulsar Function. This table lists the options available for the output.

| Name | Description |
| --- | --- |
| Topics | The output topic of a Pulsar Function. If none is specified, no output is written. |
| SinkSerdeClassName | The map of output topics to SerDe class names (as a JSON string). |
| SinkSchemaType | The built-in schema type or custom schema class name to be used for messages sent by the function. |
| ProducerConf | The producer specifications. Available options: MaxPendingMessages (the maximum number of pending messages); MaxPendingMessagesAcrossPartitions (the maximum number of pending messages across partitions); UseThreadLocalProducers (whether thread-local producers are used); CryptoConfig (the cryptography configurations of the producer); BatchBuilder (the batch builder; a key-based batcher is supported). |
| CustomSchemaSinks | The map of output topics to Schema class names (as a JSON string). |
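
Two of the options above take a map encoded "as a JSON string". The following is a minimal Python sketch of what such an encoding could look like. The helper name `encode_topic_map`, the topic name, and the class name are hypothetical and serve only as an illustration.

```python
import json


def encode_topic_map(topic_to_class: dict) -> str:
    """Serialize a map of output topics to class names as a JSON string."""
    return json.dumps(topic_to_class, sort_keys=True)


# Hypothetical topic and SerDe class names.
serde_map = {
    "persistent://public/default/output-topic": "com.example.CustomSerDe",
}
encoded = encode_topic_map(serde_map)
print(encoded)
```

The resulting string is what would be supplied as the value of an option such as SinkSerdeClassName or CustomSchemaSinks.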

Resources

When you specify a function or connector, you can optionally specify how much of each resource it needs. The resources available to specify are CPU and memory (RAM).

If the node where a Pod is running has enough of a resource available, it's possible (and allowed) for a Pod to use more resources than its request for that resource. However, a Pod is not allowed to use more than its resource limit.
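
The relationship between a request and a limit can be made concrete with a small example. The following Python sketch is only an illustration of the rule stated above: the function name `classify_usage` and the numeric values are hypothetical, and real Kubernetes quantities (for example, `100m` or `1G`) would need parsing that is omitted here.

```python
def classify_usage(usage: float, request: float, limit: float) -> str:
    """Classify a Pod's usage of one resource, for example CPU cores."""
    if request > limit:
        # Kubernetes rejects a request that is larger than the limit.
        raise ValueError("request must not exceed limit")
    if usage <= request:
        return "within request"
    if usage <= limit:
        return "above request, allowed if the node has spare capacity"
    return "above limit, not allowed"


if __name__ == "__main__":
    # Hypothetical CPU values in cores: request 0.1, limit 0.2.
    for usage in (0.05, 0.15, 0.25):
        print(usage, "->", classify_usage(usage, request=0.1, limit=0.2))
```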

Secrets

In Function Mesh, secrets are defined through a secretsMap. To use a secret, a Pod must reference it. Pods consume the entries of a secretsMap as environment variables.

To use a secret as an environment variable in a Pod, follow these steps.

  1. Create a secret or use an existing one. Multiple Pods can reference the same secret.
  2. In your Pod definition, for each container that needs to consume the value of a secret key, add an environment variable for each secret key that you want to consume.
  3. Modify your image and/or command line so that the program looks for values in the specified environment variables.
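
Step 3 above amounts to reading the secret values from the process environment. The following is a minimal Python sketch of that step, assuming the program inside the container is written in Python. The helper name `load_secrets` and the variable names `APP_USERNAME` and `APP_PASSWORD` are hypothetical; the names to use are the ones defined for the container in step 2.

```python
import os


def load_secrets(names, environ=None):
    """Read required secret values from environment variables.

    `environ` defaults to the process environment; passing a plain dict
    makes the function easy to try out without touching real variables.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in names if name not in env]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in names}


if __name__ == "__main__":
    # Stand-in environment with hypothetical values, for demonstration only.
    fake_env = {"APP_USERNAME": "alice", "APP_PASSWORD": "s3cr3t"}
    secrets = load_secrets(["APP_USERNAME", "APP_PASSWORD"], fake_env)
    print(sorted(secrets))  # print the names only, never the values
```

Failing fast on a missing variable makes a misconfigured secret reference visible at startup instead of at first use.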

Packages

Function Mesh supports running Pulsar connectors in Java.

| Field | Description |
| --- | --- |
| jarLocation | Path to the JAR file for the connector. |

Cluster location

In Function Mesh, the Pulsar cluster is defined through a ConfigMap. Pods can consume ConfigMaps as environment variables. The Pulsar cluster ConfigMap defines the Pulsar cluster URLs.

| Field | Description |
| --- | --- |
| webServiceURL | The Web service URL of the Pulsar cluster. |
| brokerServiceURL | The broker service URL of the Pulsar cluster. |
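
A ConfigMap that carries the two fields above could be generated as follows. This is a minimal Python sketch: the helper name `pulsar_cluster_configmap`, the ConfigMap name, and both URLs are hypothetical, while the `webServiceURL` and `brokerServiceURL` keys come from the table and the surrounding structure is the standard Kubernetes ConfigMap layout.

```python
import json


def pulsar_cluster_configmap(name, web_service_url, broker_service_url):
    """Build a Kubernetes ConfigMap manifest holding the Pulsar cluster URLs."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": {
            "webServiceURL": web_service_url,
            "brokerServiceURL": broker_service_url,
        },
    }


# Hypothetical name and URLs; replace them with those of your cluster.
manifest = pulsar_cluster_configmap(
    "test-pulsar",
    "http://pulsar-broker.example.svc:8080",
    "pulsar://pulsar-broker.example.svc:6650",
)
print(json.dumps(manifest, indent=2))
```

Because Kubernetes accepts JSON manifests as well as YAML, the printed output can be saved to a file and applied with `kubectl apply -f`.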

Pod specifications

Function Mesh supports customizing the Pods that run connectors. This table lists the sub-fields available for the pod field.

| Field | Description |
| --- | --- |
| Labels | The labels attached to a Pod. |
| NodeSelector | A map of key-value pairs. For a Pod to run on a node, the node must have each of the indicated key-value pairs as labels. |
| Affinity | The scheduling constraints of a Pod. |
| Tolerations | The tolerations of a Pod. |
| Annotations | The annotations attached to a Pod. |
| SecurityContext | The security context for a Pod. |
| TerminationGracePeriodSeconds | The amount of time that Kubernetes gives a Pod to shut down before terminating it. |
| Volumes | A list of volumes that can be mounted by containers belonging to a Pod. |
| ImagePullSecrets | An optional list of references to secrets in the same namespace, used for pulling any of the images used by a Pod. |
| InitContainers | The initialization containers belonging to a Pod. A typical use case is using an initialization container to download a remote JAR to a local path. |
| Sidecars | The sidecar containers that run together with the main function container in a Pod. |
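
The NodeSelector rule in the table above (a node must carry every listed key-value pair as a label) can be expressed in a few lines. The following Python sketch only illustrates that matching rule and is not how the Kubernetes scheduler is implemented. The function name `node_matches` and the label values are hypothetical.

```python
def node_matches(node_labels: dict, node_selector: dict) -> bool:
    """Return True if the node has every selector key-value pair as a label."""
    return all(
        node_labels.get(key) == value for key, value in node_selector.items()
    )


if __name__ == "__main__":
    # Hypothetical node labels and selector.
    labels = {"disktype": "ssd", "zone": "us-east-1a"}
    print(node_matches(labels, {"disktype": "ssd"}))
    print(node_matches(labels, {"disktype": "ssd", "zone": "us-west-2b"}))
```

An empty selector matches every node, because there are no pairs that the node could fail to satisfy.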