airflow.providers.amazon.aws.transfers.sftp_to_s3

Classes

SFTPToS3Operator

Transfer files from an SFTP server to Amazon S3.

Module Contents

class airflow.providers.amazon.aws.transfers.sftp_to_s3.SFTPToS3Operator(*, s3_bucket, s3_key, sftp_path, sftp_conn_id='ssh_default', sftp_remote_host='', sftp_filenames=None, s3_filenames=None, use_temp_file=True, fail_on_file_not_exist=True, replace=False, encrypt=False, gzip=False, acl_policy=None, aws_conn_id='aws_default', s3_conn_id=None, **kwargs)

Bases: airflow.providers.common.compat.sdk.BaseOperator

Transfer files from an SFTP server to Amazon S3.

See also

For more information on how to use this operator, take a look at the guide: SFTP to Amazon S3 transfer operator
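
As a rough illustration of how the operator might be configured, the sketch below builds one set of arguments for a single-file transfer and one for a whole directory. The task IDs, bucket name, and paths are placeholders, and build_task is a hypothetical helper that only works where the Amazon and SFTP providers are installed.

```python
from __future__ import annotations

# All task IDs, bucket names, and paths below are placeholders.
SINGLE_FILE_KWARGS = {
    "task_id": "sftp_single_file_to_s3",
    "sftp_conn_id": "ssh_default",
    "sftp_path": "/upload/report.csv",  # full path to one remote file
    "s3_bucket": "example-bucket",
    "s3_key": "incoming/report.csv",  # full key of the uploaded object
    "replace": True,  # overwrite the key if it already exists
}

MULTI_FILE_KWARGS = {
    "task_id": "sftp_directory_to_s3",
    "sftp_conn_id": "ssh_default",
    "sftp_path": "/upload",  # directory that holds the files
    "sftp_filenames": "*",  # move every file in that directory
    "s3_bucket": "example-bucket",
    "s3_key": "incoming/",  # must end with "/" for multiple files
    "fail_on_file_not_exist": False,  # skip instead of failing
}


def build_task(kwargs: dict):
    """Hypothetical helper: instantiate the operator from a kwargs dict."""
    # Imported lazily so the configuration above can be inspected
    # without Airflow installed.
    from airflow.providers.amazon.aws.transfers.sftp_to_s3 import SFTPToS3Operator

    return SFTPToS3Operator(**kwargs)
```

Inside a DAG definition, build_task(SINGLE_FILE_KWARGS) would then return a task that can be wired to other tasks as usual.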

Parameters:
  • sftp_conn_id (str) – The ID of the Airflow connection used to connect to the SFTP server.

  • sftp_remote_host (str) – The remote host of the SFTP server. Overrides host in Connection.

  • sftp_path (str) – The remote path on the SFTP server. For a single file it must be the full path to the file. For multiple files it is the path of the directory where the files are located.

  • sftp_filenames (str | list[str] | None) – Only used if you want to move multiple files. You can pass a list of exact filenames present in the SFTP path, or a prefix that all files must match. Use "*" to move all files within the SFTP path.

  • aws_conn_id (str) – The Airflow connection used for AWS credentials. If this is None or empty, the default boto3 behaviour is used. When Airflow runs in a distributed manner, that default boto3 configuration must be maintained on each worker node.

  • s3_bucket (str) – The target S3 bucket to which the file is uploaded.

  • s3_key (str) – The target S3 key. For a single file it must include the file path. For multiple files it must end with "/".

  • s3_filenames (str | list[str] | None) – Only used if you want to move multiple files and name them differently from the originals on the SFTP server. It can be a list of filenames or a string prefix that replaces the sftp prefix.

  • use_temp_file (bool) – If True, the file is first copied to local storage and then uploaded. If False, the file is streamed from SFTP to S3.

  • fail_on_file_not_exist (bool) – If True, the operator fails when the file does not exist. If False, the operator does not fail and skips the transfer. Default is True.

  • replace (bool) – If True, overwrite the S3 key if it already exists.

  • encrypt (bool) – If True, the file will be encrypted on the server-side by S3.

  • gzip (bool) – If True, the file will be compressed locally before upload.

  • acl_policy (str | None) – Canned ACL policy for the file being uploaded to S3.
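
The multi-file parameters above can be read as a mapping from remote files to S3 keys. The sketch below is one interpretation of that description, not the provider's implementation: plan_transfers is a hypothetical helper, and the way a string s3_filenames replaces the sftp_filenames prefix is an assumption based on the wording of the parameter text.

```python
from __future__ import annotations


def plan_transfers(
    remote_files: list[str],
    sftp_path: str,
    s3_key: str,
    sftp_filenames: str | list[str],
    s3_filenames: str | list[str] | None = None,
) -> list[tuple[str, str]]:
    """Return (remote path, S3 key) pairs. Hypothetical helper, not provider code."""
    if not s3_key.endswith("/"):
        raise ValueError('for multiple files, s3_key must end with "/"')

    if isinstance(sftp_filenames, list):
        # Exact names: keep only those present in the remote directory.
        selected = [name for name in sftp_filenames if name in remote_files]
        old_prefix = ""
    elif sftp_filenames == "*":
        # Wildcard: take every file in the directory.
        selected = list(remote_files)
        old_prefix = ""
    else:
        # Prefix: keep the files whose names start with it.
        selected = [name for name in remote_files if name.startswith(sftp_filenames)]
        old_prefix = sftp_filenames

    if s3_filenames is None:
        # Keep the original names.
        targets = selected
    elif isinstance(s3_filenames, list):
        if len(s3_filenames) != len(selected):
            raise ValueError("s3_filenames must have one entry per selected file")
        targets = s3_filenames
    else:
        # Assumed behaviour: swap the SFTP prefix for the S3 prefix.
        targets = [s3_filenames + name[len(old_prefix):] for name in selected]

    base = sftp_path.rstrip("/")
    return [(f"{base}/{src}", s3_key + dst) for src, dst in zip(selected, targets)]
```

For example, with remote files report_a.csv, report_b.csv, and notes.txt in /upload, passing sftp_filenames="report_" and s3_filenames="archive_" with s3_key="incoming/" would plan uploads to incoming/archive_a.csv and incoming/archive_b.csv under this reading.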

template_fields: collections.abc.Sequence[str] = ('s3_key', 'sftp_path', 's3_bucket', 'sftp_filenames', 's3_filenames')
sftp_conn_id = 'ssh_default'
sftp_path
sftp_remote_host = ''
s3_bucket
s3_key
aws_conn_id = 'aws_default'
sftp_filenames = None
s3_filenames = None
use_temp_file = True
fail_on_file_not_exist = True
replace = False
encrypt = False
gzip = False
acl_policy = None
static get_s3_key(s3_key)

Parse the correct format for S3 keys regardless of how the S3 url is passed.
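
One way such normalization could work is to keep only the path component of the value and drop any leading slash. The function below is a hypothetical stand-in written for illustration, under the assumption that the bucket part of a full S3 URL is discarded. It is not taken from the provider source.

```python
from urllib.parse import urlsplit


def normalize_s3_key(s3_key: str) -> str:
    """Hypothetical stand-in: keep the path part and drop leading slashes."""
    return urlsplit(s3_key).path.lstrip("/")


# A plain key, a key with a leading slash, and a full S3 URL
# all reduce to the same relative key.
print(normalize_s3_key("incoming/report.csv"))
print(normalize_s3_key("/incoming/report.csv"))
print(normalize_s3_key("s3://example-bucket/incoming/report.csv"))
```

Under these assumptions, all three calls print incoming/report.csv.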

execute(context)

Override this method when deriving an operator.

The main method to execute the task. The context is the same dictionary that is used when rendering Jinja templates.

Refer to get_template_context for more context.
