AMAZON ATHENA

type: athena

AWS serverless query service for analyzing data in S3 using standard SQL. Pay-per-query with no infrastructure to manage.

PREREQUISITES

Driver: pyathena, installed automatically when you run:

dvt sync

CONFIGURATION FIELDS

| FIELD | TYPE | REQUIRED | DEFAULT | DESCRIPTION |
|---|---|---|---|---|
| type | string | yes | | Must be `athena` |
| region_name | string | yes | us-east-1 | AWS region |
| database | string | yes | | Athena database (Glue catalog) |
| s3_staging_dir | string | yes | | S3 path for query results (e.g., s3://my-bucket/athena-results/) |
| aws_access_key_id | string | no | | AWS access key (or use IAM role) |
| aws_secret_access_key | string | no | | AWS secret key (or use IAM role) |
| threads | integer | no | 4 | Number of parallel threads |

PROFILES.YML EXAMPLE

my_project:
  target: athena_dev
  outputs:
    athena_dev:
      type: athena
      region_name: us-east-1
      database: analytics
      s3_staging_dir: s3://my-bucket/athena-results/
      aws_access_key_id: "{{ env_var('AWS_ACCESS_KEY_ID') }}"
      aws_secret_access_key: "{{ env_var('AWS_SECRET_ACCESS_KEY') }}"
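
To check these settings outside of a project run, they can be tried against Athena directly with the pyathena driver. The following is a minimal sketch, not the adapter's own code: the helper names `build_connect_kwargs` and `check_connection` are hypothetical, and the mapping of `database` to pyathena's `schema_name` argument is an assumption about how the values line up.

```python
def build_connect_kwargs(profile: dict) -> dict:
    """Translate the profile fields on this page into pyathena.connect() arguments.

    Hypothetical helper for illustration; the adapter may pass them differently.
    """
    required = ("region_name", "database", "s3_staging_dir")
    missing = [name for name in required if not profile.get(name)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    if not profile["s3_staging_dir"].startswith("s3://"):
        raise ValueError("s3_staging_dir must be an s3:// path")

    kwargs = {
        "region_name": profile["region_name"],
        # Assumption: the Athena database corresponds to pyathena's schema_name.
        "schema_name": profile["database"],
        "s3_staging_dir": profile["s3_staging_dir"],
    }
    # Static keys are optional. When they are omitted, the default AWS
    # credential chain (for example an IAM role) is used instead.
    for key in ("aws_access_key_id", "aws_secret_access_key"):
        if profile.get(key):
            kwargs[key] = profile[key]
    return kwargs


def check_connection(profile: dict) -> int:
    """Run SELECT 1 against Athena and return the single value it yields."""
    # Imported lazily so build_connect_kwargs works without the driver installed.
    from pyathena import connect

    cursor = connect(**build_connect_kwargs(profile)).cursor()
    cursor.execute("SELECT 1")
    return cursor.fetchone()[0]
```

Calling `check_connection` with the values from the example above (region_name, database, s3_staging_dir) issues a real query, so it needs valid AWS credentials and write access to the staging location. When the two key fields are left out, credential lookup falls back to the environment or an attached IAM role.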

SOURCES.YML EXAMPLE

sources:
  - name: data_lake
    connection: athena_dev
    database: raw_data
    tables:
      - name: clickstream
      - name: server_logs

INCREMENTAL STRATEGIES

  • Append
  • Delete+Insert
  • Merge

KNOWN LIMITATIONS

  • Serverless — query latency includes startup time
  • Results staged to S3 before retrieval