
Description
We are seeking a Data Engineer with hands-on AI/ML and LLM project experience in AWS to join the AWS AI/GenAI Solutions Team within the IRS Advanced Analytics Program (AAP). This role is responsible for designing and maintaining data pipelines and feature engineering workflows that directly enable LLM/GenAI and AI/ML model development, training, and deployment on AWS services such as SageMaker and Bedrock.
As part of the AAP common services mission, the Data Engineer will deliver secure, scalable, and reusable AWS-native data engineering solutions that simplify onboarding for IRS mission teams. The ideal candidate combines expertise in AWS data services with AI/ML-focused engineering to ensure mission teams can build and operationalize models efficiently.
Key Responsibilities
- Design, build, and optimize data pipelines in AWS (Glue, Lambda, Step Functions, S3, RDS, Redshift) to support AI/ML and LLM workloads.
- Implement data ingestion, transformation, and feature engineering workflows that feed SageMaker and Bedrock models.
- Collaborate with mission data scientists to ensure datasets are structured and optimized for LLM fine-tuning, inference, and prompt engineering.
- Integrate pipelines into CI/CD workflows for automated, repeatable, and compliant model operations.
- Apply security and governance controls (IAM roles, encryption, audit logging) to protect sensitive IRS data.
- Develop and maintain data validation, schema enforcement, and monitoring routines to ensure reliability and compliance.
- Work with MLOps/SRE engineers to align pipelines with model lifecycle operations (staging, promotion, retraining).
- Partner with the Product Manager and Chief Architect to align AWS data engineering capabilities with AAP roadmap milestones.
Qualifications
Required Qualifications
- Bachelor's degree in Computer Science, Data Engineering, or a related field.
- 10+ years of data engineering experience on AWS, including AI/ML-focused use cases.
- Hands-on expertise with AWS data services (Glue, Lambda, S3, Redshift, RDS, Step Functions).
- Strong proficiency in Python, SQL, and data transformation frameworks.
- Experience delivering feature engineering and data prep for SageMaker/Bedrock model development.
- Familiarity with CI/CD integration and IaC (Terraform, CloudFormation).
- Awareness of AI/ML lifecycle data needs (training, fine-tuning, inference, retraining).
Desired Skills
- Certifications: AWS Certified Data Analytics Specialty, AWS Certified Machine Learning Specialty, or Solutions Architect Associate/Professional.
- Experience working with LLM-specific pipelines (prompt data preparation, response validation, fine-tuning datasets).
- Familiarity with federal compliance frameworks (FedRAMP, NIST 800-53) and embedding compliance into AWS data workflows.
- Exposure to Trustworthy AI practices (bias detection, data lineage, explainability).
- Strong collaboration skills for working with architects, AI/LLM engineers, and mission data scientists.
Target salary range: $160,001 - $200,000. The estimate displayed represents the typical salary range for this position based on experience and other factors.