grove.outputs package¶

Writes log data collected by Grove to supported output destinations.

class grove.outputs.BaseOutput[source]¶

Bases: ABC

The basis for all Grove output handlers.

Bases: BaseSettings

Defines the configuration directives required by all output handlers.

serialize(data: List[Any], metadata: Dict[str, Any] = {}) → bytes[source]¶

Implements serialization of log entries to gzipped NDJSON.

Parameters:

data – A list of log entries to serialize to JSON.
metadata – Metadata to append to each log entry before serialization. If not specified, no metadata is added.

Returns:

Log data serialized as gzipped NDJSON (as bytes).

Raises:

DataFormatException – Cannot serialize the input to JSON.
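The output format described above can be illustrated with the standard library alone. This is a minimal sketch, not Grove's implementation: the function name `to_gzipped_ndjson` is hypothetical, entries are assumed to be dictionaries, and merging the metadata as top-level keys of each entry is an assumption made for the example.

```python
import gzip
import json
from typing import Any, Dict, List, Optional


def to_gzipped_ndjson(
    data: List[Dict[str, Any]],
    metadata: Optional[Dict[str, Any]] = None,
) -> bytes:
    """Serialize entries as newline-delimited JSON, then gzip the result."""
    metadata = metadata or {}
    lines = []
    for entry in data:
        # Assumption: metadata is merged as top-level keys of each entry.
        lines.append(json.dumps({**entry, **metadata}))
    return gzip.compress("\n".join(lines).encode("utf-8"))


entries = [{"id": 1}, {"id": 2}]
blob = to_gzipped_ndjson(entries, {"source": "example"})

# Round trip: decompress and split back into one JSON document per line.
decoded = gzip.decompress(blob).decode("utf-8").splitlines()
```

Each element of `decoded` is an independent JSON document, which is what lets downstream consumers process the file line by line.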

setup()[source]¶

Implements logic to set up any required clients, sockets, or connections.

If not required for the given output handler, this may be a no-op.

abstract submit(data: bytes, connector: str, identity: str, operation: str, part: int = 0, suffix: str | None = None, descriptor: str | None = None)[source]¶

Implements the logic required to write collected log data to the given backend.

Parameters:

data – Log data to write.
connector – Name of the connector which retrieved the data.
identity – Identity the data was collected for.
operation – Operation the collected logs are associated with.
part – Number indicating which part of the same log stream this file contains data for. This is used to indicate that the logs are from the same collection, but have been broken into smaller files for downstream processing.
suffix – An optional suffix to allow propagation of file type information or other relevant features.
descriptor – An optional and arbitrary descriptor associated with the log data. This may be used by handlers for construction / specification of file paths, URLs, or database tables.
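The contract above (an optional `setup` plus a mandatory `submit`) can be sketched as follows. To keep the example runnable without Grove installed, `ExampleBaseOutput` is a hypothetical stand-in for `grove.outputs.BaseOutput`, and `InMemoryHandler` and its key layout are invented purely for illustration.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class ExampleBaseOutput(ABC):
    """Hypothetical stand-in for grove.outputs.BaseOutput."""

    def setup(self) -> None:
        """No-op unless a handler needs clients, sockets, or connections."""

    @abstractmethod
    def submit(
        self,
        data: bytes,
        connector: str,
        identity: str,
        operation: str,
        part: int = 0,
        suffix: Optional[str] = None,
        descriptor: Optional[str] = None,
    ) -> None:
        """Write collected log data to the backend."""


class InMemoryHandler(ExampleBaseOutput):
    """Hypothetical handler that keeps submitted data in a dictionary."""

    def setup(self) -> None:
        self.store: Dict[str, bytes] = {}

    def submit(self, data, connector, identity, operation,
               part=0, suffix=None, descriptor=None) -> None:
        # Illustrative key layout only; real handlers define their own.
        key = (
            f"{descriptor or ''}{connector}/{identity}/{operation}/"
            f"part-{part}{suffix or ''}"
        )
        self.store[key] = data


handler = InMemoryHandler()
handler.setup()
handler.submit(
    b"{}", "example_connector", "example_identity", "all",
    part=1, suffix=".json", descriptor="logs/",
)
```

Because `submit` is abstract, instantiating a subclass that does not implement it raises `TypeError`, which catches incomplete handlers early.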

Submodules¶

grove.outputs.aws_s3 module¶

Grove AWS S3 output handler.

class grove.outputs.aws_s3.Handler[source]¶

Bases: BaseOutput

This output handler allows Grove to write collected logs to an AWS S3 bucket.

Bases: Configuration

Defines environment variables used to configure the AWS S3 handler.

This should also include any appropriate default values for fields which are not required.

class Config[source]¶

Bases: object

Allow environment variable override of configuration fields.

This also enforces a prefix for all environment variables for this handler. As an example, the field bucket would be set using the environment variable GROVE_OUTPUT_AWS_S3_BUCKET.

case_insensitive = True¶

env_prefix = 'GROVE_OUTPUT_AWS_S3_'¶

assume_role_arn: str | None¶

aws_access_key_id: str | None¶

aws_secret_access_key: str | None¶

aws_session_token: str | None¶

bucket: str¶

bucket_region: str | None¶
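The mapping from field names to environment variables described above can be sketched as follows. The helper `env_name` is hypothetical and only mirrors the documented prefix rule; the values assigned are placeholders, not recommended settings.

```python
import os

ENV_PREFIX = "GROVE_OUTPUT_AWS_S3_"  # Value of Config.env_prefix shown above.


def env_name(field: str, prefix: str = ENV_PREFIX) -> str:
    """Return the environment variable name for a configuration field."""
    # Upper case is conventional; the handler is documented as case insensitive.
    return f"{prefix}{field.upper()}"


# Placeholder values for illustration only.
os.environ[env_name("bucket")] = "example-bucket"
os.environ[env_name("bucket_region")] = "us-east-1"
```

Only `bucket` is typed as a plain `str` in the field list; the other fields accept `None`, which suggests they may be left unset.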

setup()[source]¶

Sets up access to S3.

This handler also attempts to assume a configured role, if required, to allow cross-account use.

Raises:

ConfigurationException – There was an issue with configuration.
AccessException – An issue occurred when accessing S3.

submit(data: bytes, connector: str, identity: str, operation: str, part: int = 0, kind: str | None = '.json.gz', descriptor: str | None = 'logs/')[source]¶

Persists captured data to an S3 compatible object store.

Parameters:

data – Log data to write to S3.
connector – Name of the connector which retrieved the data.
identity – Identity the data was collected for.
operation – Operation the collected logs are associated with.
part – Number indicating which part of the same log stream this file contains data for. This is used to indicate that the logs are from the same collection, but have been broken into smaller files for downstream processing.
kind – An optional file suffix to use for objects written.
descriptor – An optional path to prepend to the output S3 key.

Raises:

AccessException – An issue occurred when accessing S3.

grove.outputs.local_file module¶

Grove local file path output handler.

class grove.outputs.local_file.Handler[source]¶

Bases: BaseOutput

Bases: Configuration

Defines environment variables used to configure the local file handler.

This should also include any appropriate default values for fields which are not required.

class Config[source]¶

Bases: object

Allow environment variable override of configuration fields.

This also enforces a prefix for all environment variables for this handler. As an example, the field path would be set using the environment variable GROVE_OUTPUT_LOCAL_FILE_PATH.

case_insensitive = True¶

env_prefix = 'GROVE_OUTPUT_LOCAL_FILE_'¶

path: str¶

setup()[source]¶

Sets up access to the local filesystem path.

This also checks that an output directory is configured, and that it is initially accessible and writable.

Raises:

AccessException – There was an issue accessing the provided file path.
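The kind of check described above (the directory exists and is writable) might look like the following. This is a sketch under stated assumptions rather than the handler's actual code: `check_output_path` is a hypothetical helper, and `ExampleAccessError` stands in for Grove's `AccessException`.

```python
import os
import tempfile


class ExampleAccessError(Exception):
    """Hypothetical stand-in for Grove's AccessException."""


def check_output_path(path: str) -> None:
    """Raise if the path is not an existing, writable directory."""
    if not os.path.isdir(path):
        raise ExampleAccessError(f"Output path is not a directory: {path}")
    # W_OK allows creating files; X_OK allows traversing the directory.
    if not os.access(path, os.W_OK | os.X_OK):
        raise ExampleAccessError(f"Output path is not writable: {path}")


with tempfile.TemporaryDirectory() as tmp:
    check_output_path(tmp)  # A fresh temporary directory passes the check.

    missing = os.path.join(tmp, "does-not-exist")
    try:
        check_output_path(missing)
        missing_rejected = False
    except ExampleAccessError:
        missing_rejected = True
```

A check at setup time only confirms the initial state; writes can still fail later, which is why `submit` documents its own `AccessException`.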

submit(data: bytes, connector: str, identity: str, operation: str, part: int = 0, kind: str | None = '.json.gz', descriptor: str | None = 'logs/')[source]¶

Persists captured data to a local file path.

Parameters:

data – Log data to write.
connector – Name of the connector which retrieved the data.
identity – Identity the data was collected for.
operation – Operation the collected logs are associated with.
part – Number indicating which part of the same log stream this file contains data for. This is used to indicate that the logs are from the same collection, but have been broken into smaller files for downstream processing.
kind – An optional file suffix to use for files written.
descriptor – An optional path to prepend to the output file path.

Raises:

AccessException – An issue occurred when writing data.

grove.outputs.local_stdout module¶

Grove stdout output handler.

class grove.outputs.local_stdout.Handler[source]¶

Bases: BaseOutput

serialize(data: List[Any], metadata: Dict[str, Any] = {}) → bytes[source]¶

Serialize data to a standard format (NDJSON).

Parameters:

data – A list of log entries to serialize to JSON.
metadata – Metadata to append to each log entry before serialization. If not specified, no metadata is added.

Returns:

Log data serialized as NDJSON.

Raises:

DataFormatException – Cannot serialize the input to JSON.

submit(data: bytes, connector: str, identity: str, operation: str, part: int = 0, kind: str | None = 'json', descriptor: str | None = 'raw')[source]¶

Print captured data to stdout.

Parameters:

data – Log data to write.
connector – Name of the connector which retrieved the data.
identity – Identity the data was collected for.
operation – Operation the collected logs are associated with.
part – Number indicating which part of the same log stream this file contains data for. This is used to indicate that the logs are from the same collection, but have been broken into smaller files for downstream processing.
kind – The format of the data being output.
descriptor – An arbitrary descriptor to identify the data being output.

grove.outputs.remote_http module¶

Grove remote HTTP output handler.

class grove.outputs.remote_http.Handler[source]¶

Bases: BaseOutput

Bases: Configuration

Defines environment variables used to configure the remote HTTP handler.

This should also include any appropriate default values for fields which are not required.

class Config[source]¶

Bases: object

Allow environment variable override of configuration fields.

This also enforces a prefix for all environment variables for this handler. As an example, the field url would be set using the environment variable GROVE_OUTPUT_REMOTE_HTTP_URL.

case_insensitive = True¶

env_prefix = 'GROVE_OUTPUT_REMOTE_HTTP_'¶

headers: str | None¶

insecure: bool¶

retries: int¶

timeout: int¶

url: str¶

serialize(data: List[Any], metadata: Dict[str, Any] = {}) → bytes[source]¶

Implements serialization of log entries to NDJSON.

Parameters:

data – A list of log entries to serialize to JSON.
metadata – Metadata to append to each log entry before serialization. If not specified, no metadata is added.

Returns:

Log data serialized as NDJSON (as bytes).

Raises:

DataFormatException – Cannot serialize the input to JSON.

setup()[source]¶

Parses and sets up HTTP headers.

This method parses pipe-delimited HTTP headers from the environment. This is not perfect, but options are relatively limited when using environment variables while retaining compatibility across runtimes.
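Pipe-delimited header parsing can be sketched as follows. The exact input format is not specified here, so the `Name: value|Name: value` layout is an assumption, and `parse_headers` is a hypothetical helper rather than the handler's own method.

```python
from typing import Dict


def parse_headers(raw: str) -> Dict[str, str]:
    """Parse 'Name: value|Name: value' into a dictionary of headers."""
    headers: Dict[str, str] = {}
    for item in raw.split("|"):
        if not item.strip():
            continue  # Tolerate empty segments such as a trailing pipe.
        # Split on the first colon only, so values may contain colons.
        name, sep, value = item.partition(":")
        if not sep:
            raise ValueError(f"Malformed header: {item!r}")
        headers[name.strip()] = value.strip()
    return headers


parsed = parse_headers("Authorization: Bearer example|X-Example: 1")
```

One limitation of any such scheme is that a header value containing a pipe character cannot be represented without an escaping rule.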

submit(data: bytes, connector: str, identity: str, operation: str, part: int = 0, kind: str | None = None, descriptor: str | None = None)[source]¶

Performs an HTTP POST with the body containing collected logs as NDJSON.

Parameters:

data – Log data to POST.
connector – Name of the connector which retrieved the data.
identity – Identity the data was collected for.
operation – Operation the collected logs are associated with.
part – Number indicating which part of the same log stream this file contains data for.
kind – Currently not used by this output plugin.
descriptor – Currently not used by this output plugin.

Raises:

AccessException – An issue occurred when writing data.
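An HTTP POST of NDJSON with a timeout and retries, corresponding loosely to the `timeout` and `retries` fields above, can be sketched with the standard library. None of this is taken from the handler's source: the function names, the `application/x-ndjson` content type, the backoff schedule, and the example URL are all assumptions.

```python
import time
import urllib.error
import urllib.request
from typing import Dict, Optional


def build_request(
    url: str,
    body: bytes,
    headers: Optional[Dict[str, str]] = None,
) -> urllib.request.Request:
    """Build a POST request carrying NDJSON in the body."""
    # Assumption: NDJSON content type; caller-supplied headers take precedence.
    merged = {"Content-Type": "application/x-ndjson", **(headers or {})}
    return urllib.request.Request(url, data=body, headers=merged, method="POST")


def post_with_retries(
    request: urllib.request.Request,
    retries: int = 3,
    timeout: int = 10,
) -> int:
    """Send the request, retrying on failure, and return the HTTP status."""
    last_error: Optional[Exception] = None
    for attempt in range(retries + 1):
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                return response.status
        except urllib.error.URLError as err:
            # URLError covers connection failures and, through its HTTPError
            # subclass, error status codes returned by the server.
            last_error = err
            if attempt < retries:
                time.sleep(min(2 ** attempt, 30))  # Capped exponential backoff.
    raise ConnectionError(f"POST failed after {retries + 1} attempts") from last_error


# Building the request performs no network I/O; the URL is a placeholder.
request = build_request(
    "https://example.invalid/ingest",
    b'{"id": 1}\n{"id": 2}',
    {"Authorization": "Bearer example"},
)
```

Calling `post_with_retries(request)` would send the data. Retrying on every `URLError` is a simplification, since client errors such as 4xx responses usually will not succeed on a second attempt.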