AMAZON ATHENA
type: athena
AWS serverless query service for analyzing data in S3 using standard SQL. Pay-per-query with no infrastructure to manage.
PREREQUISITES
Driver: pyathena — installed automatically by:
dvt sync
CONFIGURATION FIELDS
| FIELD | TYPE | REQUIRED | DEFAULT | DESCRIPTION |
|---|---|---|---|---|
| type | string | yes | — | Must be `athena` |
| region_name | string | yes | us-east-1 | AWS region |
| database | string | yes | — | Athena database (Glue catalog) |
| s3_staging_dir | string | yes | — | S3 path for query results (e.g., s3://my-bucket/athena-results/) |
| aws_access_key_id | string | no | — | AWS access key (or use IAM role) |
| aws_secret_access_key | string | no | — | AWS secret key (or use IAM role) |
| threads | integer | no | 4 | Number of parallel threads |
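The table above can be turned into a quick pre-flight check before running anything against Athena. The following is a minimal sketch in plain Python, not part of dvt or pyathena: the function name `validate_athena_output` and the `ATHENA_FIELDS` mapping are hypothetical, and the rule that both access-key fields must be set together is an assumption rather than something the table states.

```python
from urllib.parse import urlparse

# Field rules transcribed from the table above: name -> (type, required, default).
# This mapping is illustrative; it is not read from dvt itself.
ATHENA_FIELDS = {
    "type": (str, True, None),
    "region_name": (str, True, "us-east-1"),
    "database": (str, True, None),
    "s3_staging_dir": (str, True, None),
    "aws_access_key_id": (str, False, None),
    "aws_secret_access_key": (str, False, None),
    "threads": (int, False, 4),
}


def validate_athena_output(output: dict) -> dict:
    """Return a copy of `output` with table defaults applied, or raise ValueError."""
    resolved = dict(output)
    errors = []

    for name, (expected_type, required, default) in ATHENA_FIELDS.items():
        # region_name is listed as required but also has a default;
        # this sketch simply applies the default when the field is absent.
        if name not in resolved and default is not None:
            resolved[name] = default
        value = resolved.get(name)
        if value is None:
            if required:
                errors.append(f"missing required field: {name}")
            continue
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            errors.append(f"{name} must be of type {expected_type.__name__}")

    if resolved.get("type") not in (None, "athena"):
        errors.append("type must be 'athena'")

    staging = resolved.get("s3_staging_dir")
    if isinstance(staging, str):
        parsed = urlparse(staging)
        if parsed.scheme != "s3" or not parsed.netloc:
            errors.append("s3_staging_dir must look like s3://bucket/prefix/")

    # Assumption: the two key fields are only meaningful as a pair.
    has_id = bool(resolved.get("aws_access_key_id"))
    has_secret = bool(resolved.get("aws_secret_access_key"))
    if has_id != has_secret:
        errors.append("set both aws_access_key_id and aws_secret_access_key, or neither")

    if errors:
        raise ValueError("; ".join(errors))
    return resolved


if __name__ == "__main__":
    # An output that omits the key fields, as the table allows when an IAM role is used.
    print(validate_athena_output({
        "type": "athena",
        "database": "analytics",
        "s3_staging_dir": "s3://my-bucket/athena-results/",
    }))
```

Collecting all problems before raising means a misconfigured output reports every issue in one pass instead of one per run.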
PROFILES.YML EXAMPLE
my_project:
  target: athena_dev
  outputs:
    athena_dev:
      type: athena
      region_name: us-east-1
      database: analytics
      s3_staging_dir: s3://my-bucket/athena-results/
      aws_access_key_id: "{{ env_var('AWS_ACCESS_KEY_ID') }}"
      aws_secret_access_key: "{{ env_var('AWS_SECRET_ACCESS_KEY') }}"
SOURCES.YML EXAMPLE
sources:
  - name: data_lake
    connection: athena_dev
    database: raw_data
    tables:
      - name: clickstream
      - name: server_logs
INCREMENTAL STRATEGIES
- ✓ Append
- ✓ Delete+Insert
- ✓ Merge
KNOWN LIMITATIONS
- ⚠ Serverless: query latency includes startup time
- ⚠ Results are staged to S3 (under s3_staging_dir) before retrieval