Pulsar Function CRD configurations
This document lists CRD configurations available for Pulsar Functions. The CRD configurations for Pulsar Functions consist of Function configurations and common CRD configurations.
This table lists Pulsar Function configurations.
|The function name is a string of up to |
|The class name of a Pulsar Function.|
|The tenant of a Pulsar Function.|
|The Pulsar namespace of a Pulsar Function.|
|The Pulsar cluster of a Pulsar Function.|
|The number of instances that you want to run for a function. If it is set to |
|Configure whether to show the precise parallelism. If it is set to |
|The minimum number of instances that you want to run for a function. If it is set to |
|The image of the init container that is used to download a package from Pulsar if the download path is specified. By default, the |
|The image that is used to remove the subscriptions created or used by a function when the function is deleted. If no clean-up image is set, the runner image will be used.|
|The maximum number of instances that you want to run for this Pulsar function. When the value of the |
|The message timeout in milliseconds.|
|The topic where all messages that were not processed successfully are sent. This parameter is not supported in Python Functions.|
|Pulsar Functions configurations in YAML format.|
|The topic to which the logs of a Pulsar Function are produced.|
|Whether or not the framework acknowledges messages automatically. This field is required. You can set it to |
|The maximum number of times to retry processing a message before giving up.|
|The processing guarantees (delivery semantics) applied to the function. Available values: |
|Configure whether to pass message properties to a target topic.|
|The function consumes and processes messages in order. When you set |
|Configure whether to retain the key order of messages. When you set |
|The subscription name of a Pulsar Function, if you want a specific subscription name for the input-topic consumer.|
|Configure whether to clean up subscriptions that are created or used by a function when the function is deleted.|
|The subscription position.|
|The configurations about the Pulsar cluster. For details, see messaging.|
|A list of claims that a Pod is allowed to reference. It provides stable storage using PersistentVolumes provisioned by a PersistentVolume Provisioner. This property can only be set when you first create the function; it cannot be modified when you update the resource.|
|Configure whether and how PVCs are deleted during the lifecycle of a StatefulSet. Available options are |
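As an illustration of how several of these options fit together, here is a minimal sketch of a Function resource. The field names (`className`, `tenant`, `namespace`, `clusterName`, `replicas`, `maxReplicas`, `autoAck`, `logTopic`, `pulsar.pulsarConfig`) are assumptions based on the descriptions in the table, and the resource name, topic, and ConfigMap name are hypothetical. The input, output, and runtime sections are omitted, so this fragment is not deployable as is.

```yaml
apiVersion: compute.functionmesh.io/v1alpha1
kind: Function
metadata:
  name: function-sample            # hypothetical resource name
  namespace: default
spec:
  # Field names below are assumed from the table descriptions.
  className: org.example.MyFunction   # hypothetical function class
  tenant: public
  namespace: default
  clusterName: test-pulsar         # hypothetical Pulsar cluster name
  replicas: 1
  maxReplicas: 5                   # upper bound for autoscaling
  autoAck: true                    # described above as a required field
  logTopic: persistent://public/default/logging-function-logs
  pulsar:
    pulsarConfig: test-pulsar      # hypothetical ConfigMap holding the cluster URLs
```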
In Kubernetes, an annotation defines an unstructured Key Value Map (KVM) that can be set by external tools to store and retrieve metadata.
The `annotations` field must be a map of string keys and string values. Annotation values must pass Kubernetes annotations validation. For details, see Kubernetes documentation on Annotations.
This example shows how to use an annotation to make an object unmanaged. The controller skips unmanaged objects in its reconciliation loop.
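A minimal sketch of such an annotation follows. It assumes that the annotation key is `compute.functionmesh.io/managed`, so verify the key against the Function Mesh release you run. The resource name is hypothetical.

```yaml
apiVersion: compute.functionmesh.io/v1alpha1
kind: Function
metadata:
  name: function-sample            # hypothetical resource name
  annotations:
    # Assumed annotation key. Annotation values must be strings, so quote it.
    compute.functionmesh.io/managed: "false"
```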
This section describes image options available for Pulsar Function, source, sink and Function Mesh CRDs.
The base runner is an image base for other runners. The base runner is located at `./pulsar-functions-base-runner`. The base runner image contains basic tool-chains like `/pulsar/lib` to ensure that the `pulsar-admin` CLI tool works properly to support Apache Pulsar Packages.
Function Mesh uses runner images as the images of Pulsar functions and connectors. Each runner image contains only the tool-chains and libraries that its runtime needs.
This table lists available Function runtime runner images.
|Java runner||The Java runner is based on the base runner and contains the Java function instance to run Java functions or connectors. The |
|Python runner||The Python runner is based on the base runner and contains the Python function instance to run Python functions. You can build your own Python runner to customize Python dependencies. The |
|Golang runner||The Golang runner provides all the tool-chains and dependencies required to run Golang functions. The |
Image pull policies
When the Function Mesh Operator creates a container, it uses the `imagePullPolicy` option to determine whether the image should be pulled prior to starting the container. There are three possible values for the `imagePullPolicy` option:
|`Always`||Always pull the image.|
|`Never`||Never pull the image.|
|`IfNotPresent`||Only pull the image if the image does not already exist locally.|
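For example, the following sketch sets the pull policy on a function. It assumes that `image` and `imagePullPolicy` sit directly under `spec`, and the image reference is a hypothetical placeholder.

```yaml
spec:
  image: registry.example.com/my-function-runner:1.0.0   # hypothetical image
  # Pull only when the image is not already cached on the node.
  imagePullPolicy: IfNotPresent
```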
Function Mesh provides Pulsar cluster configurations in the Function, Source, and Sink CRDs. You can configure TLS encryption, TLS authentication, and OAuth2 authentication using the following configurations.
The TLS configurations and `tlsSecret` are mutually exclusive. If you configure the TLS configurations, the TLS Secret does not take effect.
|The authentication configurations of the Pulsar cluster. Currently, you can only configure generic authentication or OAuth2 authentication through this field. For other authentication methods, you can configure them using the |
|The name of the authentication ConfigMap that stores authentication configurations of the Pulsar cluster.|
|The authentication configurations for removing subscriptions and intermediate topics. You can configure generic authentication or OAuth2 authentication through this field. If not provided, the `authConfig` will be used. |
|The name of the ConfigMap that stores Pulsar cluster configurations.|
|The TLS configurations of the Pulsar cluster.|
|The name of the TLS ConfigMap that stores TLS configurations of the Pulsar cluster.|
Function Mesh provides the following fields for Stateful functions in the CRD definition.
|The state storage configuration for the Stateful Functions.|
|The service URL that points to the state storage service. By default, the state storage service is the BookKeeper table service.|
|(Optional) The state storage configuration for the Java runtime. Use it to override the default configuration, for example, to switch to a backend service other than the BookKeeper table service.|
|The Java class name of the state storage provider implementation. The class must implement the |
|The configurations that are passed to the state storage provider.|
Window function configurations
Function Mesh provides the following fields for window functions in the CRD definition.
|Optional. The runner class name of the implemented window function. By default, the value is the same as the |
|Optional. The late data topic for the late tuple messages. The late data topic must be defined when specifying a timestamp extractor class (|
|Optional. The maximum lag duration (in milliseconds) of the window function. By default, it is set to 0.|
|Optional. The number of messages before the window slides.|
|Optional. The time duration (in milliseconds) after which the window slides.|
|Optional. The timestamp extractor class name. It should be set to |
|Optional. The watermark interval (in milliseconds) of the window function. By default, it is set to 1000 ms.|
|Optional. The number of messages per window.|
|Optional. The time duration (in milliseconds) of the window.|
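As a sketch of a count-based sliding window, the fragment below keeps 10 messages per window and slides every 5 messages. The field names `windowConfig`, `windowLengthCount`, and `slidingIntervalCount` are assumptions inferred from the descriptions above.

```yaml
spec:
  windowConfig:                  # assumed field name
    windowLengthCount: 10        # number of messages per window
    slidingIntervalCount: 5      # number of messages before the window slides
```

With these values, each message falls into two consecutive windows.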
The input topics of a Pulsar Function. The following table lists options available for the
|The configuration of the topic from which messages are fetched.|
|The map of input topics to SerDe class names (as a JSON string).|
|The map of input topics to Schema class names (as a JSON string).|
|The map of source specifications to consumer specifications. Consumer specifications include these options: |
The output topics of a Pulsar Function. This table lists options available for the
|The output topic of a Pulsar Function (If none is specified, no output is written).|
|The map of output topics to SerDe class names (as a JSON string).|
|The built-in schema type or custom schema class name to be used for messages sent by the function.|
|The producer specifications. Available options: |
|The map of output topics to Schema class names (as a JSON string).|
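A minimal sketch of the input and output sections follows. It assumes the field names `input.topics`, `output.topic`, and `typeClassName`, and the topic names are hypothetical.

```yaml
spec:
  input:
    topics:                                        # topics to consume from
      - persistent://public/default/input-topic    # hypothetical topic
    typeClassName: java.lang.String
  output:
    # If no output topic is specified, no output is written.
    topic: persistent://public/default/output-topic   # hypothetical topic
    typeClassName: java.lang.String
```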
When you specify a function or connector, you can optionally specify how much of each resource it needs. The resources available to specify are CPU and memory (RAM).
If the node where a Pod is running has enough of a resource available, it's possible (and allowed) for the Pod to use more of that resource than its `request` specifies. However, a Pod is not allowed to use more than its resource `limit`.
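The following sketch shows requests and limits for a function. It assumes the `resources` field follows the standard Kubernetes `requests`/`limits` layout, and the quantities are illustrative only.

```yaml
spec:
  resources:
    requests:          # amount the scheduler reserves for the Pod
      cpu: "0.1"
      memory: 1G
    limits:            # hard ceiling the Pod cannot exceed
      cpu: "0.2"
      memory: 1.1G
```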
Function Mesh provides the `secretsMap` field for Function, Source, and Sink in the CRD definition. You can refer to secrets created in the same namespace, and the controller can include the referenced secrets. The secrets are provided by `EnvironmentBasedSecretsProvider`, which backs `context.getSecret()` in Pulsar functions and connectors.
The `secretsMap` field is defined as a `Map` struct with `String` keys and `SecretReference` values. The key indicates the environment variable in the container, and the `SecretReference` is defined as below.
|The name of the secret in the Pod's namespace to select from.|
|The key of the secret to select from. It must be a valid secret key.|
Suppose that there is a Kubernetes Secret named `credential-secret` defined as below:
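A sketch of such a Secret is shown here. The values are hypothetical placeholders, and `stringData` is used so that they can be written in plain text instead of base64.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: credential-secret
type: Opaque
stringData:
  username: example-user        # placeholder value
  password: example-password    # placeholder value
```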
To use it in Pulsar Functions in a secure way, you can define the `secretsMap` in the Custom Resource:
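A sketch of the mapping follows. It assumes that the two `SecretReference` fields are named `path` (the Secret name) and `key` (the key inside the Secret).

```yaml
spec:
  secretsMap:
    username:                    # name the function passes to context.getSecret()
      path: credential-secret    # assumed field: name of the Secret
      key: username              # assumed field: key inside the Secret
    password:
      path: credential-secret
      key: password
```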
Then, in Pulsar functions and connectors, you can call `context.getSecret("username")` to get the secret value.
Function Mesh supports running Pulsar Functions in Java, Python and Go. This table lists fields available for running Pulsar Functions in different languages.
|The path to the JAR file for the function. It is only available for Pulsar functions written in Java.|
|It specifies JVM options to better configure JVM behaviors, including |
|The path to the executable file for the function. It is only available for Pulsar functions written in Go.|
|The path to the Python file or package for the function. It is only available for Pulsar functions written in Python.|
|It specifies the dependent directory for the JAR package.|
By default, the log level for Pulsar functions is `info`. Function Mesh supports setting multiple log levels for Pulsar functions.
The log levels are only available for the Go runtime 2.11 or higher.
|Log level||Description||Java runtime||Python runtime||Go runtime|
|Nothing will be logged.||✔||✗||✗|
|The logs that contain the most detailed messages.||✔||✔||✔|
|The logs that are used for interactive investigation during development. These logs primarily contain information useful for debugging and have no long-term value.||✔||✔||✔|
|The logs that highlight an abnormal or unexpected event in the function, but do not cause the function to stop.||✔||✔||✔|
|The logs that highlight when the function is stopped due to a failure. These indicate a failure in the current activity, not an application-wide failure.||✔||✔||✔|
|The logs that contain fatal errors. It indicates that the function is unusable.||✔||✔||✔|
|All events are logged.||✔||✗||✗|
|It indicates the function is in panic.||✗||✗||✔|
For details about how to set log levels and produce logs for Pulsar functions, see produce function logs.
Log rotation policies
As more logs are written, the log file grows in size. Function Mesh supports log rotation to avoid large files that are difficult to open. You can set the log rotation policy based on time or on log file size.
|Rotate the log file daily.|
|Rotate the log file weekly.|
|Rotate the log file monthly.|
|Rotate the log file at every 10 MB.|
|Rotate the log file at every 50 MB.|
|Rotate the log file at every 100 MB.|
For details about how to set a log rotation policy, see set log rotation policies.
In Function Mesh, the Pulsar cluster is defined through a ConfigMap. Pods can consume ConfigMaps as environment variables or as files in a volume. The Pulsar cluster ConfigMap defines the Pulsar cluster URLs.
|The Web service URL of the Pulsar cluster.|
|The broker service URL of the Pulsar cluster.|
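A sketch of such a ConfigMap is shown below. The key names `webServiceURL` and `brokerServiceURL` are assumptions inferred from the two descriptions above. The ConfigMap name and host names are hypothetical.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: test-pulsar              # hypothetical name, referenced from the function spec
data:
  # Assumed key names. Replace the hosts with those of your cluster.
  webServiceURL: http://pulsar-broker.example.svc.cluster.local:8080
  brokerServiceURL: pulsar://pulsar-broker.example.svc.cluster.local:6650
```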
To enable health checks, you need to create a PVC and a PV, and bind the PVC to the PV.
With the Kubernetes liveness probe, Function Mesh supports monitoring and acting on the state of Pods (Containers) to ensure that only healthy Pods serve traffic. Implementing health checks using probes provides Function Mesh a solid foundation, better reliability, and higher uptime.
failureThreshold: 3 # example value
initialDelaySeconds: 10
periodSeconds: 10
successThreshold: 1
# Other configs
failureThreshold: specify the times to restart a failed probe before giving up the probe. By default, it is set to
initialDelaySeconds: specify how long to wait (in seconds) before performing the first liveness probe.
periodSeconds: specify how often (in seconds) to perform a liveness probe.
successThreshold: specify the minimum consecutive successes for the probe to be considered successful after having failed. By default, it is set to
For more information about probe types, probe check mechanisms, and probe parameters, see Kubernetes documentation on Pod lifecycle and configure probes.
A security context defines privilege and access control settings for a Pod. By default, Function Mesh applies the following `PodSecurityContext` to every function's Pod.
Apart from the `PodSecurityContext`, Function Mesh also applies the following `SecurityContext` to the function's container so that the Pod complies with the restricted Pod Security Standard.
|A special supplemental group that applies to all containers in a Pod.|
|Define the behavior of changing ownership and permission of the volume before being exposed inside a Pod. This field only applies to volume types that support |
|The Group ID (GID) that is used to run the entry point of the container process. If it is unset, the runtime default is used.|
|Indicate that the container must run as a non-root user. If it is set to |
|The User ID (UID) that is used to run the entry point of the container process.|
|The SELinux context that is applied to a container.|
|The seccomp options that are used by a container.|
|A list of groups that are applied to the first process running in each container, in addition to the container's primary GID, the |
|Sysctls hold a list of namespaced sysctls used for the Pod.|
|The Windows-specific settings that are applied to all containers.|
|Control whether a process can gain more privileges than its parent process.|
|The capabilities to add/drop when running a container.|
|Run the container in privileged or unprivileged mode.|
|The type of proc mount that is used by a container.|
|Mount the container's root filesystem as read-only.|
Function Mesh supports customizing the Pod running function instance. This table lists sub-fields available for the
|Specify labels attached to a Pod.|
|Specify the liveness probe properties for a Pod. For details, see health checks.|
|Specify a map of key-value pairs. For a Pod running on a node, the node must have each of the indicated key-value pairs as labels.|
|Specify the scheduling constraints of a Pod.|
|Specify the tolerations of a Pod.|
|Specify the annotations attached to a Pod.|
|Specify the security context for a Pod. For details, see [security context](#security-context).|
|The amount of time that Kubernetes gives a Pod to shut down gracefully before forcibly terminating it.|
|A list of volumes that can be mounted by containers belonging to a Pod.|
|An optional list of references to secrets in the same namespace for pulling any of the images used by a Pod.|
|Specify the name of the service account that is used to run Pulsar Functions or connectors in the Function Mesh Worker service.|
|The initialization containers belonging to a Pod. A typical use case could be using an initialization container to download a remote JAR to a local path.|
|Sidecar containers run together with the main function container in a Pod.|
|Specify the built-in autoscaling rules. |
If you configure the
|Specify how to scale based on customized metrics defined in connectors. For details, see MetricSpec v2beta2 autoscaling.|
|Configure the scaling behavior of the target in both up and down directions (|
|Configure the behavior of the Vertical Pod Autoscaling (VPA). It contains two fields: |
|Specify the environment variables to expose on the containers. It is a key/value map. You can either use the |