Postprocessing Service Configuration

Table of Contents

Introduction
General Prerequisites
Post-Processing Functionality
Additional Prerequisites for the postprocessing Service
Post-Processing Steps
- Virus Scanning
- Delay
Custom Post-Processing Steps
- Prerequisites
- Workflow
CLI Commands
- Resume Post-Processing
Storing
Event Bus Configuration
Configuration
- Environment Variables
- YAML Example

Introduction

The Infinite Scale postprocessing service handles the coordination of asynchronous post-processing steps.

General Prerequisites

To use the postprocessing service, an event system needs to be configured for all services. By default, Infinite Scale ships with a preconfigured nats service.

Post-Processing Functionality

The storageprovider service (storage-users) can be configured to initiate asynchronous post-processing by setting the STORAGE_USERS_OCIS_ASYNC_UPLOADS environment variable to true. If so, post-processing is initiated once a file has been uploaded and all bytes have been received.

The postprocessing service will then coordinate configured post-processing steps like scanning the file for viruses. During post-processing, the file will be in a processing state where only a limited set of actions are available.

While a file is in the processing state, users cannot access it.

When all postprocessing steps have completed successfully, the file will be made accessible to users.
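
As a minimal sketch of the setting described above, asynchronous uploads could be enabled in the environment of the storage-users service like this. Only the variable name and the value true come from the text; how the environment is passed to the service depends on your deployment.

```shell
# Enable asynchronous uploads so that post-processing is initiated
# once all bytes of an uploaded file have been received.
export STORAGE_USERS_OCIS_ASYNC_UPLOADS=true
```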

Additional Prerequisites for the postprocessing Service

Once post-processing has been enabled, configuring any post-processing step will require the requested services to be enabled and preconfigured. For example, to use the virusscan step, one needs to have an enabled and configured antivirus service.

Post-Processing Steps

The postprocessing service is configured through a list of post-processing steps, which are performed in the order in which they appear in the POSTPROCESSING_STEPS envvar. This envvar expects a comma-separated list of steps to execute. The steps currently known to the system are virusscan, policies and delay. Custom steps can be added, but each one needs an existing service that processes it.
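
The following sketch illustrates how a comma-separated POSTPROCESSING_STEPS value maps to an execution order. The helper list_steps is hypothetical and only exists for illustration; it is not part of Infinite Scale.

```shell
# Example value: run the virus scan first, then the delay step.
export POSTPROCESSING_STEPS="virusscan,delay"

# list_steps (hypothetical helper) prints the steps of a
# comma-separated list, numbered in their order of appearance.
list_steps() {
  printf '%s\n' "$1" | tr ',' '\n' | awk 'NF { printf "%d. %s\n", ++n, $0 }'
}

list_steps "$POSTPROCESSING_STEPS"
```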

Virus Scanning

To enable virus scanning as a post-processing step after uploading a file, the environment variable POSTPROCESSING_STEPS needs to contain the word virusscan at one location in the list of steps. As a result, each uploaded file gets scanned for viruses as part of the post-processing steps. Note that the antivirus service must be enabled and configured for this to work.

Delay

The delay step is for development purposes only and is NOT RECOMMENDED on production systems. Setting the environment variable POSTPROCESSING_DELAY to a non-zero duration adds a delay step with the configured amount of time. Infinite Scale continues post-processing the file after the configured delay. Use the environment variable POSTPROCESSING_STEPS and the keyword delay if you have multiple post-processing steps and want to define their order. If POSTPROCESSING_DELAY is set but the keyword delay is not contained in POSTPROCESSING_STEPS, the delay step is executed as the last post-processing step even though it is not listed. In this case, a log entry is written on service startup to notify the admin about the situation. That log entry can be avoided by adding the keyword delay to POSTPROCESSING_STEPS.
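
The ordering rule for the delay step can be sketched as follows. The function effective_steps is a hypothetical illustration of the behavior described above, not code from the service, and it simplifies duration handling by treating only an empty value, 0 and 0s as zero.

```shell
# effective_steps STEPS DELAY (hypothetical helper)
# Prints the step list as it would effectively be executed.
effective_steps() {
  steps=$1
  delay=$2
  # If delay is listed explicitly, its position is kept as configured.
  case ",$steps," in
    *,delay,*) printf '%s\n' "$steps"; return 0 ;;
  esac
  case "$delay" in
    ""|0|0s)
      # No delay configured: the list stays unchanged.
      printf '%s\n' "$steps" ;;
    *)
      # Delay configured but not listed: it runs as the last step.
      if [ -n "$steps" ]; then
        printf '%s\n' "$steps,delay"
      else
        printf 'delay\n'
      fi ;;
  esac
}

effective_steps "virusscan" "30s"        # delay is appended
effective_steps "delay,virusscan" "30s"  # configured order is kept
```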

Custom Post-Processing Steps

By using the envvar POSTPROCESSING_STEPS, custom post-processing steps can be added. Any word can be used as step name but be careful not to conflict with existing keywords like virusscan and delay. In addition, if a keyword is misspelled or the corresponding service either does not exist or does not follow the necessary event communication, the postprocessing service will wait forever to get the required response to proceed and therefore does not continue with any other processing.
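
Before adding a custom step name, it can help to check it against the keywords the system already uses. The sketch below assumes the keyword set virusscan, policies and delay as listed in the description of POSTPROCESSING_STEPS; the function is_reserved_step is hypothetical.

```shell
# is_reserved_step NAME (hypothetical helper)
# Succeeds if NAME collides with a step keyword known to the system.
is_reserved_step() {
  case "$1" in
    virusscan|policies|delay) return 0 ;;
    *) return 1 ;;
  esac
}

for name in customstep virusscan; do
  if is_reserved_step "$name"; then
    echo "$name: reserved keyword, choose another name"
  else
    echo "$name: usable as custom step name"
  fi
done
```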

Prerequisites

To use custom post-processing steps, you need a custom service listening to the configured event system. For more information, see General Prerequisites.

Workflow

When defining a custom postprocessing step (e.g. "customstep"), the postprocessing service will eventually send an event during postprocessing. The event will be of type StartPostprocessingStep with its field StepToStart set to "customstep". When the service defined as custom step receives this event, it can safely execute its actions. The postprocessing service will wait until the custom service has finished its work. The event contains further information (filename, executing user, size, …) and also the tokens and URLs required to download the file in case byte inspection is necessary.

Once the service defined as custom step has finished its work, it should send an event of type PostprocessingFinished via the configured events system back to the postprocessing service. This event needs to contain a FinishedStep field set to "customstep". It also must contain the outcome of the step, which can be one of the following:

delete: Abort postprocessing, delete the file.
abort: Abort postprocessing, keep the file.
retry: There was a problem that was most likely temporary and may be solved by trying again after some backoff duration. Retry runs automatically and is defined by the backoff behavior as described below.
continue: Continue postprocessing, this is the success case.
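
The four outcomes can be summarized as a simple lookup. The function outcome_effect is a hypothetical illustration of the list above; it does not talk to the event system.

```shell
# outcome_effect OUTCOME (hypothetical helper)
# Prints what the postprocessing service does for a given outcome.
outcome_effect() {
  case "$1" in
    delete)   echo "abort postprocessing, delete the file" ;;
    abort)    echo "abort postprocessing, keep the file" ;;
    retry)    echo "retry the step after a backoff" ;;
    continue) echo "continue postprocessing" ;;
    *)        echo "unknown outcome: $1" >&2; return 1 ;;
  esac
}

outcome_effect retry
```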

The backoff behavior mentioned in the retry outcome can be configured using the POSTPROCESSING_RETRY_BACKOFF_DURATION and POSTPROCESSING_MAX_RETRIES environment variables. The backoff duration is calculated using the following formula after each failure: backoff_duration = POSTPROCESSING_RETRY_BACKOFF_DURATION * 2^(number of failures - 1). This means that the wait time before the next attempt doubles with each failure, and the total is bounded by the maximum number of retries. Steps that still don’t succeed after the maximum number of retries will be automatically moved to the abort state.
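
To make the formula concrete, the following sketch computes the backoff in seconds for a given number of failures. It assumes the default base of 5s; the function backoff_seconds is hypothetical and works on whole seconds only.

```shell
# backoff_seconds BASE_SECONDS FAILURES (hypothetical helper)
# Implements BASE_SECONDS * 2^(FAILURES - 1) with integer arithmetic.
backoff_seconds() {
  base=$1
  failures=$2
  factor=1
  i=1
  while [ "$i" -lt "$failures" ]; do
    factor=$((factor * 2))
    i=$((i + 1))
  done
  echo $((base * factor))
}

# Waiting times after the first four failures with a 5s base.
for n in 1 2 3 4; do
  echo "failure $n: wait $(backoff_seconds 5 "$n")s"
done
```

With the default values of 5s and 14 retries, the wait before the last retry would be 5 * 2^13 = 40960 seconds.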

CLI Commands

Resume Post-Processing

If not noted otherwise, commands with the restart option can also use the resume option. This changes behaviour slightly.

restart: When restarting an upload, all steps for open items will be restarted, except if otherwise defined.
resume: When resuming an upload, processing will continue unfinished items from their last completed step.

If post-processing fails in one step due to an unforeseen error, current uploads will not be resumed automatically. A system administrator can instead run CLI commands to resume the failed upload manually, which is at minimum a two-step process.

For details on the storage-users command which provides many options see the Manage Unfinished Uploads documentation.

Depending on whether you want to restart/resume all failed uploads or only defined ones, different commands are used.

First, list ongoing upload sessions to identify possibly failed ones.
Note that a failed upload session can never be identified with certainty, because failures can have various causes. You need to apply additional criteria, such as free space on disk or a failed service like antivirus, to declare an upload as failed.
```
ocis storage-users uploads sessions
```

All failed uploads
If you want to restart/resume all failed uploads, rerun the command with the relevant flag. Note that this is the preferred command to handle failed processing steps:
```
ocis storage-users uploads sessions --resume
```

Particular failed uploads
Use the postprocessing command to resume defined failed uploads. For postprocessing steps, the default is to resume. Note that at the moment, resume is an alias for restart to keep old functionality. restart is subject to change and will most likely be removed in a later version.
- Defined by ID
  If you want to resume only a specific upload, use the postprocessing resume command with the ID selected:

  ```
  ocis postprocessing resume -u <uploadID>
  ```
- Defined by step
  Alternatively, instead of restarting one specific upload, a system admin can also resume all uploads that are currently in a specific step.

  Examples:

  ```
  ocis postprocessing resume                 # Resumes all uploads where postprocessing is finished, but upload is not finished.
  ocis postprocessing resume -s "finished"   # Equivalent to the above.
  ocis postprocessing resume -s "virusscan"  # Resume all uploads currently in virusscan step.
  ```

Storing

The postprocessing service needs to store some metadata about uploads to be able to orchestrate post-processing. In distributed deployments it is recommended to use a persistent store, see below for more details.

The postprocessing service can use a configured store via the global OCIS_PERSISTENT_STORE environment variable.

Note that for each global environment variable, an independent service-based one might be available additionally. For precedences see Environment Variable Notes. Check the configuration section below. Supported stores are:

Store Type	Description
`memory`	Basic in-memory store. Will not survive a restart. Usually the default for caches. See the store environment variable for which one is used.
`nats-js-kv`	Stores data using key-value-store feature of NATS JetStream. Usually the default for stores, see the store environment variable for which one is used.
`redis-sentinel`	Stores data in a configured Redis Sentinel cluster.
`noop`	Stores nothing. Useful for testing. Not recommended in production environments.

The postprocessing service can only be scaled if not using the memory store and the stores are configured identically over all instances!

If you have used one of the deprecated stores of a former version, you should reconfigure to use one of the supported ones as the deprecated stores will be removed in a later version.

Store specific notes

When using redis-sentinel:
The Redis master to use is configured via e.g. OCIS_PERSISTENT_STORE_NODES in the form of <sentinel-host>:<sentinel-port>/<redis-master> like 10.10.0.200:26379/mymaster.
When using nats-js-kv:
- It is recommended to set OCIS_PERSISTENT_STORE_NODES to the same value as OCIS_EVENTS_ENDPOINT. That way the store uses the same nats instance as the event bus. See the Event Bus Configuration for more details.
- Authentication can be added, if configured, via OCIS_PERSISTENT_STORE_AUTH_USERNAME and OCIS_PERSISTENT_STORE_AUTH_PASSWORD.
- It is possible to set OCIS_CACHE_DISABLE_PERSISTENCE to instruct nats to not persist cache data on disc.
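
As a sketch of the redis-sentinel notes above, the store could be selected and the node value taken apart as shown below. The address is the example value from the text; the function split_sentinel_node is hypothetical and only demonstrates the <sentinel-host>:<sentinel-port>/<redis-master> format. The commented lines show the nats-js-kv alternative and assume OCIS_EVENTS_ENDPOINT is already set.

```shell
# Select the Redis Sentinel store and point it at the example node.
export OCIS_PERSISTENT_STORE=redis-sentinel
export OCIS_PERSISTENT_STORE_NODES=10.10.0.200:26379/mymaster

# split_sentinel_node NODE (hypothetical helper)
# Splits <sentinel-host>:<sentinel-port>/<redis-master> into its parts.
split_sentinel_node() {
  node=$1
  hostport=${node%%/*}   # everything before the first slash
  master=${node#*/}      # everything after the first slash
  host=${hostport%:*}    # everything before the last colon
  port=${hostport##*:}   # everything after the last colon
  printf 'host=%s port=%s master=%s\n' "$host" "$port" "$master"
}

split_sentinel_node "$OCIS_PERSISTENT_STORE_NODES"

# Alternative for nats-js-kv: reuse the event bus endpoint.
# export OCIS_PERSISTENT_STORE=nats-js-kv
# export OCIS_PERSISTENT_STORE_NODES="$OCIS_EVENTS_ENDPOINT"
```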

Event Bus Configuration

The Infinite Scale event bus can be configured by a set of environment variables.

In an orchestrated installation, for example with Docker or Kubernetes, the event bus must be an external service for scalability, such as a Redis Sentinel cluster or a key-value-store NATS JetStream. Both named stores are supported and also used in Caching and Persistence. The store used is not part of the Infinite Scale installation and must be separately provided and configured.
Note that from a configuration point of view, caching and persistence are independent of the event bus configuration.

Note that for each global environment variable, a service-based one might be available additionally. For precedences see Environment Variable Notes. Check the configuration section below.

The following list of environment variables for configuring the event bus is not exhaustive:

Envvar	Description
`OCIS_EVENTS_ENDPOINT`	The address of the event system.
`OCIS_EVENTS_CLUSTER`	The clusterID of the event system. Mandatory when using NATS as event system.
`OCIS_EVENTS_ENABLE_TLS`	Enable TLS for the connection to the events broker.
`OCIS_INSECURE`	Whether to verify the server TLS certificates.
`OCIS_EVENTS_AUTH_USERNAME`	The username to authenticate with the events broker.
`OCIS_EVENTS_AUTH_PASSWORD`	The password to authenticate with the events broker.
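
A minimal sketch of an event bus configuration using these variables might look like this. The endpoint and cluster values are the defaults listed for the postprocessing service; the user name and password are placeholders and must be replaced with the credentials of your own event system.

```shell
# Address and cluster ID of the event system (default values).
export OCIS_EVENTS_ENDPOINT=127.0.0.1:9233
export OCIS_EVENTS_CLUSTER=ocis-cluster

# Use TLS for the connection to the events broker.
export OCIS_EVENTS_ENABLE_TLS=true

# Placeholder credentials, assumed for illustration only.
export OCIS_EVENTS_AUTH_USERNAME=events-user
export OCIS_EVENTS_AUTH_PASSWORD=change-me
```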

Configuration

Environment Variables

The postprocessing service is configured via the following environment variables. Read the Environment Variable Types documentation for important details. Column IV shows with which release the environment variable has been introduced.

Environment variables for the postprocessing service
Name	IV	Type	Default Value	Description
`OCIS_TRACING_ENABLED` `POSTPROCESSING_TRACING_ENABLED`	5.0	bool	false	Activates tracing.
`OCIS_TRACING_TYPE` `POSTPROCESSING_TRACING_TYPE`	5.0	string		The type of tracing. Defaults to '', which is the same as 'jaeger'. Allowed tracing types are 'jaeger', 'otlp' and '' as of now.
`OCIS_TRACING_ENDPOINT` `POSTPROCESSING_TRACING_ENDPOINT`	5.0	string		The endpoint of the tracing agent.
`OCIS_TRACING_COLLECTOR` `POSTPROCESSING_TRACING_COLLECTOR`	5.0	string		The HTTP endpoint for sending spans directly to a collector, i.e. http://jaeger-collector:14268/api/traces. Only used if the tracing endpoint is unset.
`OCIS_LOG_LEVEL` `POSTPROCESSING_LOG_LEVEL`	pre5.0	string		The log level. Valid values are: 'panic', 'fatal', 'error', 'warn', 'info', 'debug', 'trace'.
`OCIS_LOG_PRETTY` `POSTPROCESSING_LOG_PRETTY`	pre5.0	bool	false	Activates pretty log output.
`OCIS_LOG_COLOR` `POSTPROCESSING_LOG_COLOR`	pre5.0	bool	false	Activates colorized log output.
`OCIS_LOG_FILE` `POSTPROCESSING_LOG_FILE`	pre5.0	string		The path to the log file. Activates logging to this file if set.
`POSTPROCESSING_DEBUG_ADDR`	pre5.0	string	127.0.0.1:9255	Bind address of the debug server, where metrics, health, config and debug endpoints will be exposed.
`POSTPROCESSING_DEBUG_TOKEN`	pre5.0	string		Token to secure the metrics endpoint.
`POSTPROCESSING_DEBUG_PPROF`	pre5.0	bool	false	Enables pprof, which can be used for profiling.
`POSTPROCESSING_DEBUG_ZPAGES`	pre5.0	bool	false	Enables zpages, which can be used for collecting and viewing in-memory traces.
`OCIS_PERSISTENT_STORE` `POSTPROCESSING_STORE`	pre5.0	string	nats-js-kv	The type of the store. Supported values are: 'memory', 'redis-sentinel', 'nats-js-kv', 'noop'. See the text description for details.
`OCIS_PERSISTENT_STORE_NODES` `POSTPROCESSING_STORE_NODES`	pre5.0	[]string	[127.0.0.1:9233]	A list of nodes to access the configured store. This has no effect when 'memory' store is configured. Note that the behaviour how nodes are used is dependent on the library of the configured store. See the Environment Variable Types description for more details.
`POSTPROCESSING_STORE_DATABASE`	pre5.0	string	postprocessing	The database name the configured store should use.
`POSTPROCESSING_STORE_TABLE`	pre5.0	string		The database table the store should use.
`OCIS_PERSISTENT_STORE_TTL` `POSTPROCESSING_STORE_TTL`	pre5.0	Duration	0s	Time to live for events in the store. See the Environment Variable Types description for more details.
`OCIS_PERSISTENT_STORE_AUTH_USERNAME` `POSTPROCESSING_STORE_AUTH_USERNAME`	5.0	string		The username to authenticate with the store. Only applies when store type 'nats-js-kv' is configured.
`OCIS_PERSISTENT_STORE_AUTH_PASSWORD` `POSTPROCESSING_STORE_AUTH_PASSWORD`	5.0	string		The password to authenticate with the store. Only applies when store type 'nats-js-kv' is configured.
`OCIS_EVENTS_ENDPOINT` `POSTPROCESSING_EVENTS_ENDPOINT`	pre5.0	string	127.0.0.1:9233	The address of the event system. The event system is the message queuing service. It is used as message broker for the microservice architecture.
`OCIS_EVENTS_CLUSTER` `POSTPROCESSING_EVENTS_CLUSTER`	pre5.0	string	ocis-cluster	The clusterID of the event system. The event system is the message queuing service. It is used as message broker for the microservice architecture. Mandatory when using NATS as event system.
`OCIS_INSECURE` `POSTPROCESSING_EVENTS_TLS_INSECURE`	pre5.0	bool	false	Whether the ocis server should skip the client certificate verification during the TLS handshake.
`OCIS_EVENTS_TLS_ROOT_CA_CERTIFICATE` `POSTPROCESSING_EVENTS_TLS_ROOT_CA_CERTIFICATE`	pre5.0	string		The root CA certificate used to validate the server’s TLS certificate. If provided POSTPROCESSING_EVENTS_TLS_INSECURE will be seen as false.
`OCIS_EVENTS_ENABLE_TLS` `POSTPROCESSING_EVENTS_ENABLE_TLS`	pre5.0	bool	false	Enable TLS for the connection to the events broker. The events broker is the ocis service which receives and delivers events between the services.
`OCIS_EVENTS_AUTH_USERNAME` `POSTPROCESSING_EVENTS_AUTH_USERNAME`	5.0	string		The username to authenticate with the events broker. The events broker is the ocis service which receives and delivers events between the services.
`OCIS_EVENTS_AUTH_PASSWORD` `POSTPROCESSING_EVENTS_AUTH_PASSWORD`	5.0	string		The password to authenticate with the events broker. The events broker is the ocis service which receives and delivers events between the services.
`POSTPROCESSING_WORKERS`	6.7	int	3	The number of concurrent go routines that fetch events from the event queue.
`POSTPROCESSING_STEPS`	pre5.0	[]string	[]	A list of postprocessing steps processed in order of their appearance. Currently supported values by the system are: 'virusscan', 'policies' and 'delay'. Custom steps are allowed. See the documentation for instructions. See the Environment Variable Types description for more details.
`POSTPROCESSING_DELAY`	pre5.0	Duration	0s	After uploading a file but before making it available for download, a delay step can be added. Intended for development purposes only. If a duration is set but the keyword 'delay' is not explicitly added to 'POSTPROCESSING_STEPS', the delay step will be processed as last step. In such a case, a log entry will be written on service startup to remind the admin about that situation. See the Environment Variable Types description for more details.
`POSTPROCESSING_RETRY_BACKOFF_DURATION`	5.0	Duration	5s	The base for the exponential backoff duration before retrying a failed postprocessing step. See the Environment Variable Types description for more details.
`POSTPROCESSING_MAX_RETRIES`	5.0	int	14	The maximum number of retries for a failed postprocessing step.

YAML Example

Note the file shown below must be renamed and placed in the correct folder according to the Configuration File Naming conventions to be effective.
See the Notes for Environment Variables if you want to use environment variables in the yaml file.

```
# Autogenerated
# Filename: postprocessing-config-example.yaml

tracing:
  enabled: false
  type: ""
  endpoint: ""
  collector: ""
log:
  level: ""
  pretty: false
  color: false
  file: ""
debug:
  addr: 127.0.0.1:9255
  token: ""
  pprof: false
  zpages: false
store:
  store: nats-js-kv
  nodes:
  - 127.0.0.1:9233
  database: postprocessing
  table: ""
  ttl: 0s
  username: ""
  password: ""
postprocessing:
  events:
    endpoint: 127.0.0.1:9233
    cluster: ocis-cluster
    tls_insecure: false
    tls_root_ca_certificate: ""
    enable_tls: false
    username: ""
    password: ""
  workers: 3
  steps: []
  delayprocessing: 0s
  retry_backoff_duration: 5s
  max_retries: 14
```