grove.outputs package

Provides collected Grove log output to supported destinations.

class grove.outputs.BaseOutput[source]

Bases: ABC

The basis for all Grove output handlers.

class Configuration(_env_file: str | PathLike | List[str | PathLike] | Tuple[str | PathLike, ...] | None = '<object object>', _env_file_encoding: str | None = None, _env_nested_delimiter: str | None = None, _secrets_dir: str | PathLike | None = None, **values: Any)[source]

Bases: BaseSettings

Defines the configuration directives required by all output handlers.

serialize(data: List[Any], metadata: Dict[str, Any] = {}) bytes[source]

Implements serialization of log entries to gzipped NDJSON.

Parameters:
  • data – A list of log entries to serialize to JSON.

  • metadata – Metadata to append to each log entry before serialization. If not specified, no metadata will be added.

Returns:

Log data serialized as gzipped NDJSON (as bytes).

Raises:

DataFormatException – Cannot serialize the input to JSON.
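
As an illustration of the format this method produces (not Grove's own implementation), the following sketch builds gzipped NDJSON by hand; the entries, the metadata, and the way metadata is merged into each entry are all assumptions:

    import gzip
    import json

    # Hypothetical log entries and metadata; the merge shape is an assumption.
    entries = [{"id": 1, "action": "login"}, {"id": 2, "action": "logout"}]
    metadata = {"connector": "example_connector"}

    # One JSON document per line (NDJSON), then gzip the whole payload.
    ndjson = "\n".join(json.dumps({**entry, **metadata}) for entry in entries)
    payload = gzip.compress(ndjson.encode("utf-8"))

    # Decompressing recovers the newline delimited JSON for inspection.
    print(gzip.decompress(payload).decode("utf-8"))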

setup()[source]

Implements logic to set up any required clients, sockets, or connections.

If not required for the given output handler, this may be a no-op.

abstract submit(data: bytes, connector: str, identity: str, operation: str, part: int = 0, suffix: str | None = None, descriptor: str | None = None)[source]

Implements logic required to write collected log data to the given backend.

Parameters:
  • data – Log data to write.

  • connector – Name of the connector which retrieved the data.

  • identity – Identity the collected data was collected for.

  • operation – Operation the collected logs are associated with.

  • part – Number indicating which part of the same log stream this file contains data for. This is used to indicate that the logs are from the same collection, but have been broken into smaller files for downstream processing.

  • suffix – An optional suffix to allow propagation of file type information or other relevant features.

  • descriptor – An optional and arbitrary descriptor associated with the log data. This may be used by handlers for construction / specification of file paths, URLs, or database tables.
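
A minimal sketch of a custom output handler built against this interface, assuming only the names documented above; the destination file path is hypothetical, and real handlers would also define a Configuration and perform error handling:

    from typing import Optional

    from grove.outputs import BaseOutput


    class Handler(BaseOutput):
        """Illustrative handler which appends submitted payloads to one local file."""

        def setup(self):
            # Nothing to initialise for this example, so setup is a no-op.
            pass

        def submit(
            self,
            data: bytes,
            connector: str,
            identity: str,
            operation: str,
            part: int = 0,
            suffix: Optional[str] = None,
            descriptor: Optional[str] = None,
        ):
            # Hypothetical destination; a real handler would derive a path, key,
            # or table from the connector, identity, operation, part, suffix,
            # and descriptor values.
            with open("/tmp/grove-example.ndjson.gz", "ab") as fout:
                fout.write(data)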

Submodules

grove.outputs.aws_s3 module

Grove AWS S3 output handler.

class grove.outputs.aws_s3.Handler[source]

Bases: BaseOutput

This output handler allows Grove to write collected logs to an AWS S3 bucket.

class Configuration(_env_file: str | PathLike | List[str | PathLike] | Tuple[str | PathLike, ...] | None = '<object object>', _env_file_encoding: str | None = None, _env_nested_delimiter: str | None = None, _secrets_dir: str | PathLike | None = None, *, bucket: str, aws_access_key_id: str | None = None, aws_secret_access_key: str | None = None, aws_session_token: str | None = None, assume_role_arn: str | None = None, bucket_region: str | None = 'us-east-1', **values: Any)[source]

Bases: Configuration

Defines environment variables used to configure the AWS S3 handler.

This should also include any appropriate default values for fields which are not required.

class Config[source]

Bases: object

Allow environment variable override of configuration fields.

This also enforces a prefix for all environment variables for this handler. As an example, the field bucket would be set using the environment variable GROVE_OUTPUT_AWS_S3_BUCKET.

case_insensitive = True
env_prefix = 'GROVE_OUTPUT_AWS_S3_'
assume_role_arn: str | None
aws_access_key_id: str | None
aws_secret_access_key: str | None
aws_session_token: str | None
bucket: str
bucket_region: str | None
setup()[source]

Sets up access to S3.

This handler will also attempt to assume a configured role to allow cross-account use, if required.

Raises:

AccessException – An issue occurred when setting up access to S3.

submit(data: bytes, connector: str, identity: str, operation: str, part: int = 0, kind: str | None = '.json.gz', descriptor: str | None = 'logs/')[source]

Persists captured data to an S3-compatible object store.

Parameters:
  • data – Log data to write to S3.

  • connector – Name of the connector which retrieved the data.

  • identity – Identity the collected data was collected for.

  • operation – Operation the collected logs are associated with.

  • part – Number indicating which part of the same log stream this file contains data for. This is used to indicate that the logs are from the same collection, but have been broken into smaller files for downstream processing.

  • kind – An optional file suffix to use for objects written.

  • descriptor – An optional path to append to the beginning of the output S3 key.

Raises:

AccessException – An issue occurred when accessing S3.
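
A hedged usage sketch for this handler, assuming the Configuration is read from the environment when the handler is instantiated and that AWS credentials are otherwise available to the runtime; the bucket name and log entry are hypothetical:

    import os

    from grove.outputs.aws_s3 import Handler

    # Configuration fields map to environment variables via the documented prefix.
    os.environ["GROVE_OUTPUT_AWS_S3_BUCKET"] = "example-grove-logs"
    os.environ["GROVE_OUTPUT_AWS_S3_BUCKET_REGION"] = "us-east-1"

    handler = Handler()
    handler.setup()

    # serialize() is inherited from BaseOutput, so the payload is gzipped NDJSON.
    payload = handler.serialize(data=[{"event": "example"}])
    handler.submit(
        data=payload,
        connector="example_connector",
        identity="example@example.com",
        operation="all",
    )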

grove.outputs.local_file module

Grove local file path output handler.

class grove.outputs.local_file.Handler[source]

Bases: BaseOutput

class Configuration(_env_file: str | PathLike | List[str | PathLike] | Tuple[str | PathLike, ...] | None = '<object object>', _env_file_encoding: str | None = None, _env_nested_delimiter: str | None = None, _secrets_dir: str | PathLike | None = None, *, path: str, **values: Any)[source]

Bases: Configuration

Defines environment variables used to configure the local file handler.

This should also include any appropriate default values for fields which are not required.

class Config[source]

Bases: object

Allow environment variable override of configuration fields.

This also enforces a prefix for all environment variables for this handler. As an example, the field path would be set using the environment variable GROVE_OUTPUT_LOCAL_FILE_PATH.

case_insensitive = True
env_prefix = 'GROVE_OUTPUT_LOCAL_FILE_'
path: str
setup()[source]

Sets up access to the local filesystem path.

This also checks that an output directory is configured, and that it is initially accessible and writable.

Raises:

AccessException – There was an issue accessing the provided file path.

submit(data: bytes, connector: str, identity: str, operation: str, part: int = 0, kind: str | None = '.json.gz', descriptor: str | None = 'logs/')[source]

Persists captured data to a local file path.

Parameters:
  • data – Log data to write.

  • connector – Name of the connector which retrieved the data.

  • identity – Identity the collected data was collected for.

  • operation – Operation the collected logs are associated with.

  • part – Number indicating which part of the same log stream this file contains data for. This is used to indicate that the logs are from the same collection, but have been broken into smaller files for downstream processing.

  • kind – An optional file suffix to use for files written.

  • descriptor – An optional path to append to the beginning of the output file path.

Raises:

AccessException – An issue occurred when writing data.
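
A similar hedged sketch for this handler, again assuming configuration is read from the environment on instantiation; the output directory is hypothetical, and the exact file layout beneath it is left to the handler:

    import glob
    import gzip
    import os

    from grove.outputs.local_file import Handler

    os.environ["GROVE_OUTPUT_LOCAL_FILE_PATH"] = "/tmp/grove"

    handler = Handler()
    handler.setup()  # Raises AccessException if the path is not writable.

    payload = handler.serialize(data=[{"event": "example"}])
    handler.submit(
        data=payload,
        connector="example_connector",
        identity="example@example.com",
        operation="all",
    )

    # The handler decides the file layout, so glob for the written objects.
    for path in glob.glob("/tmp/grove/**/*.json.gz", recursive=True):
        with open(path, "rb") as fin:
            print(path, gzip.decompress(fin.read()).decode("utf-8"))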

grove.outputs.local_stdout module

Grove stdout output handler.

class grove.outputs.local_stdout.Handler[source]

Bases: BaseOutput

serialize(data: List[Any], metadata: Dict[str, Any] = {}) bytes[source]

Serializes data to a standard format (NDJSON).

Parameters:
  • data – A list of log entries to serialize to JSON.

  • metadata – Metadata to append to each log entry before serialization. If not specified, no metadata will be added.

Returns:

Log data serialized as NDJSON.

Raises:

DataFormatException – Cannot serialize the input to JSON.

submit(data: bytes, connector: str, identity: str, operation: str, part: int = 0, kind: str | None = 'json', descriptor: str | None = 'raw')[source]

Prints captured data to stdout.

Parameters:
  • data – Log data to write.

  • connector – Name of the connector which retrieved the data.

  • identity – Identity the collected data was collected for.

  • operation – Operation the collected logs are associated with.

  • part – Number indicating which part of the same log stream this file contains data for. This is used to indicate that the logs are from the same collection, but have been broken into smaller files for downstream processing.

  • kind – The format of the data being output.

  • descriptor – An arbitrary descriptor to identify the data being output.
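
A short hedged sketch for this handler, assuming no configuration is required (none is documented); this can be useful when checking a connector locally:

    from grove.outputs.local_stdout import Handler

    handler = Handler()
    handler.setup()

    # serialize() here overrides the base class and produces plain, uncompressed
    # NDJSON, so the submitted bytes print as readable JSON lines.
    payload = handler.serialize(data=[{"event": "example"}])
    handler.submit(
        data=payload,
        connector="example_connector",
        identity="example@example.com",
        operation="all",
    )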

grove.outputs.remote_http module

Grove remote HTTP output handler.

class grove.outputs.remote_http.Handler[source]

Bases: BaseOutput

class Configuration(_env_file: str | PathLike | List[str | PathLike] | Tuple[str | PathLike, ...] | None = '<object object>', _env_file_encoding: str | None = None, _env_nested_delimiter: str | None = None, _secrets_dir: str | PathLike | None = None, *, url: str, retries: int = 5, headers: str | None = None, timeout: int = 10, insecure: bool = False, **values: Any)[source]

Bases: Configuration

Defines environment variables used to configure the remote HTTP handler.

This should also include any appropriate default values for fields which are not required.

class Config[source]

Bases: object

Allow environment variable override of configuration fields.

This also enforces a prefix for all environment variables for this handler. As an example, the field url would be set using the environment variable GROVE_OUTPUT_REMOTE_HTTP_URL.

case_insensitive = True
env_prefix = 'GROVE_OUTPUT_REMOTE_HTTP_'
headers: str | None
insecure: bool
retries: int
timeout: int
url: str
serialize(data: List[Any], metadata: Dict[str, Any] = {}) bytes[source]

Implements serialization of log entries to NDJSON.

Parameters:
  • data – A list of log entries to serialize to JSON.

  • metadata – Metadata to append to each log entry before serialization. If not specified, no metadata will be added.

Returns:

Log data serialized as NDJSON (as bytes).

Raises:

DataFormatException – Cannot serialize the input to JSON.

setup()[source]

Parses and sets up HTTP headers.

This method parses pipe-delimited HTTP headers from the environment. This is not perfect, but we’re relatively limited when using environment variables while wishing to retain compatibility across runtimes.
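
The exact header format is not specified here; the following illustrative sketch, which is not Grove's own parsing code, assumes each header is written as Name: Value with headers separated by pipes:

    # Assumed format: "Name: Value|Name: Value".
    raw = "Authorization: Bearer example-token|Content-Type: application/x-ndjson"

    headers = {}
    for pair in raw.split("|"):
        name, _, value = pair.partition(":")
        headers[name.strip()] = value.strip()

    print(headers)
    # {'Authorization': 'Bearer example-token', 'Content-Type': 'application/x-ndjson'}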

submit(data: bytes, connector: str, identity: str, operation: str, part: int = 0, kind: str | None = None, descriptor: str | None = None)[source]

Performs an HTTP POST with the body containing collected logs as NDJSON.

Parameters:
  • data – Log data to POST.

  • connector – Name of the connector which retrieved the data.

  • identity – Identity the collected data was collected for.

  • operation – Operation the collected logs are associated with.

  • part – Number indicating which part of the same log stream this file contains data for.

  • kind – Currently not used by this output plugin.

  • descriptor – Currently not used by this output plugin.

Raises:

AccessException – An issue occurred when writing data.
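
A hedged end-to-end sketch for this handler, assuming the Configuration is read from the environment on instantiation; the URL and header value are hypothetical:

    import os

    from grove.outputs.remote_http import Handler

    os.environ["GROVE_OUTPUT_REMOTE_HTTP_URL"] = "https://logs.example.com/ingest"
    os.environ["GROVE_OUTPUT_REMOTE_HTTP_HEADERS"] = "Authorization: Bearer example-token"
    os.environ["GROVE_OUTPUT_REMOTE_HTTP_RETRIES"] = "3"

    handler = Handler()
    handler.setup()  # Parses the pipe-delimited headers from the environment.

    # submit() POSTs the serialized NDJSON body to the configured URL.
    payload = handler.serialize(data=[{"event": "example"}])
    handler.submit(
        data=payload,
        connector="example_connector",
        identity="example@example.com",
        operation="all",
    )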