Version: 0.3.0

Pulsar Function CRD configurations

This document lists CRD configurations available for Pulsar Functions. The CRD configurations for Pulsar Functions consist of Function configurations and common CRD configurations.

Function configurations

This table lists Pulsar Function configurations.

| Field | Description |
| --- | --- |
| name | The name of a Pulsar Function. |
| classname | The class name of a Pulsar Function. |
| tenant | The tenant of a Pulsar Function. |
| namespace | The Pulsar namespace of a Pulsar Function. |
| clusterName | The Pulsar cluster of a Pulsar Function. |
| replicas | The number of instances of this Pulsar Function to run. By default, replicas is set to 1. |
| maxReplicas | The maximum number of instances to run for this Pulsar Function. When the value of maxReplicas is greater than the value of replicas, the Functions controller automatically scales the Pulsar Function based on CPU usage. By default, maxReplicas is set to 0, which indicates that auto-scaling is disabled. |
| timeout | The message timeout in milliseconds. |
| deadLetterTopic | The topic to which all messages that were not processed successfully are sent. This parameter is not supported in Python Functions. |
| funcConfig | Pulsar Functions configurations in YAML format. |
| logTopic | The topic to which the logs of a Pulsar Function are produced. |
| autoAck | Whether or not the framework acknowledges messages automatically. This field is required. You can set it to true or false. |
| maxMessageRetry | How many times to process a message before giving up. |
| processingGuarantee | The processing guarantees (delivery semantics) applied to the function. Available values: atleast_once, atmost_once, effectively_once. |
| forwardSourceMessageProperty | Configure whether to pass message properties to a target topic. |
| retainOrdering | Configure whether the function consumes and processes messages in order. |
| retainKeyOrdering | Configure whether to retain the key order of messages. |
| subscriptionName | The subscription name of the Pulsar Function, if you want a specific subscription name for the input-topic consumer. |
| cleanupSubscription | Configure whether to clean up subscriptions. |
| subscriptionPosition | The subscription position. |
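
As a rough illustration of how these fields fit together, the fragment below sets several of them on a Function resource. It is a sketch rather than a complete manifest: placing the fields directly under spec is an assumption, and the resource name, topic names, and numeric values are hypothetical.

```yaml
apiVersion: compute.functionmesh.io/v1alpha1
kind: Function
metadata:
  name: example-function            # hypothetical name
spec:
  replicas: 1                       # start with one instance
  maxReplicas: 3                    # greater than replicas, so CPU-based auto-scaling applies
  autoAck: true                     # required field
  timeout: 30000                    # message timeout in milliseconds (hypothetical value)
  maxMessageRetry: 5                # hypothetical value
  deadLetterTopic: persistent://public/default/example-dlq   # hypothetical topic
  processingGuarantee: atleast_once
  logTopic: persistent://public/default/example-logs         # hypothetical topic
  retainOrdering: false
  subscriptionName: example-sub     # hypothetical subscription name
  cleanupSubscription: true
```

With maxReplicas left at its default of 0, the same fragment would run a fixed number of instances and auto-scaling would stay disabled.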

Annotations

In Kubernetes, an annotation defines an unstructured Key Value Map (KVM) that can be set by external tools to store and retrieve metadata. The annotations field must be a map of string keys and string values, and annotation values must pass Kubernetes annotation validation. For details, see the Kubernetes documentation on Annotations.

This example shows how to use an annotation to mark an object as unmanaged. The controller then skips unmanaged objects in its reconciliation loop.

apiVersion: compute.functionmesh.io/v1alpha1
kind: Function
metadata:
  annotations:
    compute.functionmesh.io/managed: "false"

Images

This section describes image options available for Pulsar Function, source, sink and Function Mesh CRDs.

Base runner

The base runner is the base image for the other runners. It is located at ./pulsar-functions-base-runner. The base runner image contains basic tool-chains, such as /pulsar/bin, /pulsar/conf and /pulsar/lib, to ensure that the pulsar-admin CLI tool works properly to support Apache Pulsar Packages.

Runner images

Function Mesh uses runner images as the images of Pulsar functions and connectors. Each runner image contains only the tool-chains and libraries required for its runtime.

This table lists available Function runtime runner images.

| Type | Description |
| --- | --- |
| Java runner | The Java runner is based on the base runner and contains the Java function instance to run Java functions or connectors. The streamnative/pulsar-functions-java-runner Java runner is located at Docker Hub and is automatically updated to align with Apache Pulsar releases. |
| Python runner | The Python runner is based on the base runner and contains the Python function instance to run Python functions. You can build your own Python runner to customize Python dependencies. The streamnative/pulsar-functions-python-runner Python runner is located at Docker Hub and is automatically updated to align with Apache Pulsar releases. |
| Golang runner | The Golang runner provides all the tool-chains and dependencies required to run Golang functions. The streamnative/pulsar-functions-go-runner Golang runner is located at Docker Hub and is automatically updated to align with Apache Pulsar releases. |

Image pull policies

When the Function Mesh Operator creates a container, it uses the imagePullPolicy option to determine whether the image should be pulled prior to starting the container. There are three possible values for the imagePullPolicy option:

| Value | Description |
| --- | --- |
| Always | Always pull the image. |
| Never | Never pull the image. |
| IfNotPresent | Only pull the image if the image does not already exist locally. |
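
A minimal sketch of selecting a runner image and a pull policy is shown below. The image key and the placement of both keys under spec are assumptions, and <tag> is a placeholder that must be replaced with a tag that actually exists on Docker Hub.

```yaml
spec:
  # Placeholder tag: replace <tag> with a published runner image tag.
  image: "streamnative/pulsar-functions-java-runner:<tag>"
  # One of Always, Never, or IfNotPresent.
  imagePullPolicy: IfNotPresent
```

IfNotPresent avoids repeated pulls of an unchanged image, while Always is the safer choice when the same tag may be republished.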

State storage

Function Mesh provides the following fields for Stateful functions in the CRD definition.

| Field | Description |
| --- | --- |
| statefulConfig | The state storage configuration for the Stateful Functions. |
| statefulConfig.pulsar.serviceUrl | The service URL that points to the state storage service. By default, the state storage service is the BookKeeper table service. |
| statefulConfig.pulsar.javaProvider | (Optional) The state storage configuration for the Java runtime. Use it to override the default configuration, for example, to switch to a backend service other than the BookKeeper table service. |
| statefulConfig.pulsar.javaProvider.className | The Java class name of the state storage provider implementation. The class must implement the org.apache.pulsar.functions.instance.state.StateStoreProvider interface. If no class is specified, org.apache.pulsar.functions.instance.state.BKStateStoreProviderImpl is used. |
| statefulConfig.pulsar.javaProvider.config | The configurations that are passed to the state storage provider. |
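
The nesting of these fields can be sketched as follows. The service URL, the provider class name, and the config entry are hypothetical values chosen only to show the structure, and placing statefulConfig under spec is an assumption.

```yaml
spec:
  statefulConfig:
    pulsar:
      # Hypothetical address of the state storage service.
      serviceUrl: "bk://example-bookkeeper:4181"
      # Optional: omit javaProvider to keep the default provider.
      javaProvider:
        # Hypothetical class; it would need to implement StateStoreProvider.
        className: "com.example.state.CustomStateStoreProvider"
        config:
          exampleKey: "exampleValue"   # hypothetical provider setting
```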

Input

The input topics of a Pulsar Function. The following table lists options available for the Input.

| Field | Description |
| --- | --- |
| topics | The configuration of the topic from which messages are fetched. |
| customSerdeSources | The map of input topics to SerDe class names (as a JSON string). |
| customSchemaSources | The map of input topics to Schema class names (as a JSON string). |
| sourceSpecs | The map of source specifications to consumer specifications. |

Consumer specifications include these options:

- SchemaType: the built-in schema type or custom schema class name to be used for messages fetched by the function.
- SerdeClassName: the SerDe class to be used for messages fetched by the function.
- IsRegexPattern: configure whether the input topic adopts a Regex pattern.
- SchemaProperties: the schema properties for messages fetched by the function.
- ConsumerProperties: the consumer properties for messages fetched by the function.
- ReceiverQueueSize: the size of the consumer receive queue.
- cryptoConfig: cryptography configurations of the consumer.
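
A hedged sketch of an input block follows. The topic name and queue size are hypothetical, keying sourceSpecs by topic name is an assumption, and the lower camel-case spelling of the consumer options (isRegexPattern, receiverQueueSize) is also an assumption that should be checked against the installed CRD.

```yaml
spec:
  input:
    topics:
      - persistent://public/default/example-input   # hypothetical topic
    sourceSpecs:
      # Assumed: one consumer specification per input topic.
      "persistent://public/default/example-input":
        isRegexPattern: false      # the key above is a plain topic name, not a pattern
        receiverQueueSize: 1000    # hypothetical consumer receive queue size
```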

Output

The output topics of a Pulsar Function. This table lists options available for the Output.

| Name | Description |
| --- | --- |
| topics | The output topic of a Pulsar Function. If none is specified, no output is written. |
| sinkSerdeClassName | The map of output topics to SerDe class names (as a JSON string). |
| sinkSchemaType | The built-in schema type or custom schema class name to be used for messages sent by the function. |
| producerConf | The producer specifications. |
| customSchemaSinks | The map of output topics to Schema class names (as a JSON string). |

The producerConf field supports these options:

- maxPendingMessages: the maximum number of pending messages.
- maxPendingMessagesAcrossPartitions: the maximum number of pending messages across partitions.
- useThreadLocalProducers: configure whether to use thread-local producers.
- cryptoConfig: cryptography configurations of the producer.
- batchBuilder: the batch builder of the producer, which supports the key-based batcher.
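
The producer options can be sketched in an output block like the one below. All values are hypothetical. The singular topic key is an assumption (the table lists the field as topics), so verify the key name against the installed CRD before relying on it.

```yaml
spec:
  output:
    # Assumed key name; hypothetical topic.
    topic: persistent://public/default/example-output
    producerConf:
      maxPendingMessages: 1000                    # hypothetical limit
      maxPendingMessagesAcrossPartitions: 50000   # hypothetical limit
      useThreadLocalProducers: false
```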

Resources

When you specify a function or connector, you can optionally specify how much of each resource it needs. The resources available to specify are CPU and memory (RAM).

If the node where a Pod is running has enough of a resource available, it is possible (and allowed) for the Pod to use more of that resource than its request specifies. However, a Pod is not allowed to use more than its resource limit.
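
The request and limit semantics described above can be sketched as follows. The fragment assumes a resources key under spec that follows the usual Kubernetes requests/limits layout, and the quantities are hypothetical.

```yaml
spec:
  resources:
    requests:
      cpu: "0.1"     # amount reserved for scheduling
      memory: 1G
    limits:
      cpu: "0.2"     # hard ceiling; usage may exceed the request but not this
      memory: 2G
```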

Secrets

Function Mesh provides the secretsMap field for Function, Source, and Sink in the CRD definition. You can refer to secrets created in the same namespace, and the controller includes the referenced secrets. The secrets are provided by EnvironmentBasedSecretsProvider and can be read through context.getSecret() in Pulsar functions and connectors.

The secretsMap field is defined as a Map struct with String keys and SecretReference values. The key indicates the name of the environment variable in the container, and the SecretReference is defined as below.

| Field | Description |
| --- | --- |
| path | The name of the secret in the Pod's namespace to select from. |
| key | The key of the secret to select from. It must be a valid secret key. |

Suppose that there is a Kubernetes Secret named credential-secret defined as below:

apiVersion: v1
stringData:
  username: foo
  password: bar
kind: Secret
metadata:
  name: credential-secret
type: Opaque

To use it in Pulsar Functions in a secure way, you can define the secretsMap in the Custom Resource:

secretsMap:
  username:
    path: credential-secret
    key: username
  password:
    path: credential-secret
    key: password

Then, in the Pulsar Functions and Connectors, you can call context.getSecret("username") to get the secret value (foo).

Authentication

Function Mesh provides the tlsSecret and authSecret fields for Function, Source and Sink in the CRD definition. You can configure TLS encryption and/or TLS authentication using the following configurations.

  • TLS Secret

    | Field | Description |
    | --- | --- |
    | tlsAllowInsecureConnection | Allow insecure TLS connection. |
    | tlsHostnameVerificationEnable | Enable hostname verification. |
    | tlsTrustCertsFilePath | The path of the TLS trust certificate file. |
  • Authentication Secret

    | Field | Description |
    | --- | --- |
    | clientAuthenticationPlugin | The client authentication plugin. |
    | clientAuthenticationParameters | The client authentication parameters. |
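
One way to picture these settings is as two Kubernetes Secrets whose keys are the fields listed above, which a Function would then name in its tlsSecret and authSecret fields. Storing the fields as Secret keys is an assumption, and the Secret names, certificate path, and token placeholder are hypothetical. The plugin shown is Pulsar's token authentication class.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: example-tls-secret            # hypothetical name
type: Opaque
stringData:
  tlsAllowInsecureConnection: "false"
  tlsHostnameVerificationEnable: "true"
  tlsTrustCertsFilePath: /etc/tls/ca.crt   # hypothetical path
---
apiVersion: v1
kind: Secret
metadata:
  name: example-auth-secret           # hypothetical name
type: Opaque
stringData:
  clientAuthenticationPlugin: org.apache.pulsar.client.impl.auth.AuthenticationToken
  # Placeholder: replace <your-token> with a real token.
  clientAuthenticationParameters: "token:<your-token>"
```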

Packages

Function Mesh supports running Pulsar Functions in Java, Python and Go. This table lists fields available for running Pulsar Functions in different languages.

| Field | Description |
| --- | --- |
| jarLocation | Path to the JAR file for the function. It is only available for Pulsar functions written in Java. |
| goLocation | Path to the Go executable file for the function. It is only available for Pulsar functions written in Go. |
| pyLocation | Path to the Python file for the function. It is only available for Pulsar functions written in Python. |
| extraDependenciesDir | The directory that contains the dependencies of the JAR package. |

Cluster location

In Function Mesh, the Pulsar cluster is defined through a ConfigMap. Pods can consume ConfigMaps as environment variables or as configuration files in a volume. The Pulsar cluster ConfigMap defines the Pulsar cluster URLs.

| Field | Description |
| --- | --- |
| webServiceURL | The Web service URL of the Pulsar cluster. |
| brokerServiceURL | The broker service URL of the Pulsar cluster. |
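
A Pulsar cluster ConfigMap with these two fields might look like the sketch below. The ConfigMap name and the broker host names are hypothetical, and the ports are the conventional Pulsar defaults rather than values taken from this document.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-pulsar               # hypothetical name
data:
  # Hypothetical in-cluster service addresses.
  webServiceURL: http://example-broker.default.svc.cluster.local:8080
  brokerServiceURL: pulsar://example-broker.default.svc.cluster.local:6650
```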

Pod specifications

Function Mesh supports customizing the Pod that runs a function instance. This table lists sub-fields available for the pod field.

| Field | Description |
| --- | --- |
| labels | Specify labels attached to a Pod. |
| nodeSelector | Specify a map of key-value pairs. For a Pod to run on a node, the node must have each of the indicated key-value pairs as labels. |
| affinity | Specify the scheduling constraints of a Pod. |
| tolerations | Specify the tolerations of a Pod. |
| annotations | Specify the annotations attached to a Pod. |
| securityContext | Specify the security context for a Pod. |
| terminationGracePeriodSeconds | The amount of time that Kubernetes gives a Pod to shut down gracefully before terminating it. |
| volumes | A list of volumes that can be mounted by containers belonging to a Pod. |
| imagePullSecrets | An optional list of references to secrets in the same namespace for pulling any of the images used by a Pod. |
| serviceAccountName | Specify the name of the service account that is used to run Pulsar Functions or connectors. |
| initContainers | Initialization containers belonging to a Pod. A typical use case is using an initialization container to download a remote JAR to a local path. |
| sidecars | Sidecar containers that run together with the main function container in a Pod. |
| builtinAutoscaler | Specify the built-in autoscaling rules. If you configure the builtinAutoscaler field, you do not need to configure the autoScalingMetrics and autoScalingBehavior options, and vice versa. |
| autoScalingMetrics | Specify how to scale based on customized metrics defined in Pulsar Functions. For details, see MetricSpec v2 autoscaling. |
| autoScalingBehavior | Configure the scaling behavior of the target in both up and down directions (scaleUp and scaleDown fields respectively). If not specified, the default Kubernetes scaling behaviors are adopted. For details, see HorizontalPodAutoscalerBehavior v2 autoscaling. |

The builtinAutoscaler field supports these rules:

- CPU-based autoscaling: auto-scale the number of Pods based on the CPU usage (80%, 50%, or 20%).
- Memory-based autoscaling: auto-scale the number of Pods based on the memory usage (80%, 50%, or 20%).
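
A sketch of a pod block that combines a few of these sub-fields is shown below. All names and values are hypothetical. The rule name AverageUtilizationCPUPercent80 is an assumption about how the 80% CPU rule is spelled, so confirm it against the installed CRD.

```yaml
spec:
  pod:
    labels:
      team: example                    # hypothetical label
    nodeSelector:
      disktype: ssd                    # node must carry this label
    tolerations:
      - key: example-key               # hypothetical taint key
        operator: Exists
        effect: NoSchedule
    terminationGracePeriodSeconds: 30
    serviceAccountName: example-sa     # hypothetical service account
    imagePullSecrets:
      - name: example-registry-secret  # hypothetical secret
    # Assumed rule name for CPU-based autoscaling at 80% usage.
    # Do not combine with autoScalingMetrics or autoScalingBehavior.
    builtinAutoscaler:
      - AverageUtilizationCPUPercent80
```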