
Description
We are seeking a skilled Databricks Administrator with strong infrastructure, SRE/MLOps, and DevSecOps expertise to manage workspace provisioning, automation, and governance within the IRS Advanced Analytics Platform (AAP). This role ensures that Databricks workspaces are secure, cost-optimized, and operationally resilient, while enabling AI/ML model development and deployment.
As part of the Infrastructure team, this administrator will own workspace lifecycle automation, observability, compliance-aligned controls, and operational reliability while partnering with security, architecture, and AI/ML teams to deliver end-to-end platform enablement.
Key Responsibilities
- Provision, configure, and manage Databricks workspaces across AWS-hosted infrastructure.
- Implement Terraform automation for user roles, clusters, jobs, repos, and workspace policies.
- Manage SSO, IAM, and secure access controls, ensuring compliance with enterprise standards.
- Collaborate with security and networking teams on VPC design, workspace isolation, and policy enforcement.
- Support integration of Databricks with CI/CD pipelines to enable model deployment, promotion, and rollback.
- Enable compliance-aware pipelines with audit-ready tracking across environments.
- Build observability and monitoring frameworks for Databricks workloads, including metrics, logging, and alerting.
- Lead incident response and root cause analysis to ensure uptime and reliability.
- Monitor workspace usage, performance, and cost, optimizing configurations for scalability and efficiency.
- Partner with architects, product managers, and AI/ML teams to align workspace capabilities with roadmap direction and customer needs.
Qualifications
Required Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field.
- 5+ years of experience in cloud infrastructure or platform operations (AWS, Databricks, Spark).
- Must be a U.S. Citizen with the ability to obtain and maintain a Public Trust Security Clearance.
- Strong hands-on experience with Databricks administration, including clusters, jobs, repos, and governance (Unity Catalog).
- Proficiency in Terraform and infrastructure-as-code for provisioning and lifecycle automation.
- Background in monitoring, observability, and incident response within cloud platforms.
- Familiarity with MLOps practices (MLflow, CI/CD pipelines, model lifecycle management).
Desired Skills
- Active IRS Clearance highly desired.
- Certifications: Databricks Administrator Associate, Databricks Data Engineer, or AWS ML Specialty.
- Understanding of federal compliance frameworks (FedRAMP, NIST 800-53, SOC 2).
- Exposure to Trustworthy AI practices and auditability requirements.
- Experience with Docker, Kubernetes, or serverless automation for extended integration.
- Strong collaboration and communication skills to work across infrastructure, security, and AI/ML engineering teams.
Target salary range: $120,001 - $160,000. The estimate displayed represents the typical salary range for this position based on experience and other factors.