Location: Santa Clara, CA
Experience: 10+ Years
Employment Type: Full-Time / Contract
Position Overview:
We are seeking a highly experienced AWS Databricks Data Engineer to join our data engineering team in Santa Clara. The ideal candidate will have deep expertise in Databricks, AWS, PySpark, SQL, and large-scale data pipeline development. This role focuses on designing and optimizing modern cloud-based data platforms that support analytics, BI, and enterprise reporting use cases.
You will collaborate with cross-functional teams and business stakeholders to deliver scalable, secure, and high-performance data solutions built on a Lakehouse architecture.
Key Responsibilities:
- Design and maintain scalable ETL/ELT pipelines using Databricks on AWS
- Develop high-performance data transformations using PySpark and SQL
- Implement and optimize a Lakehouse (Medallion) architecture for batch data processing
- Integrate data from S3, databases, and AWS-native services
- Optimize Spark workloads for performance, cost, and scalability
- Implement data governance and access controls using Unity Catalog
- Deploy and manage jobs using Databricks Workflows and CI/CD pipelines
- Collaborate with business and analytics teams to deliver reliable, production-ready datasets
Required Technical Skills:
- Strong expertise in Databricks:
  - Delta Lake
  - Unity Catalog
  - Lakehouse Architecture
  - Workflows
  - Delta Live Tables (DLT) pipelines
  - Table Triggers
  - Databricks Runtime
- Advanced proficiency in PySpark and SQL
- Experience designing and rebuilding batch-heavy data pipelines
- Strong knowledge of Medallion Architecture
- Expertise in performance tuning and Spark optimization
- Experience with Databricks Workflows and job orchestration
- Working understanding of Databricks Genie enablement concepts
- Experience with CI/CD and Git-based development
- Strong AWS fundamentals:
  - IAM
  - Networking basics
  - S3
  - AWS Glue Data Catalog
Preferred Qualifications:
- Experience with Spark Structured Streaming
- Knowledge of real-time or near real-time data solutions
- Advanced Databricks Runtime configurations
- Experience with GitLab CI/CD pipelines
- Exposure to scalable enterprise data architectures
Certifications (Optional):
- Databricks Certified Data Engineer (Associate/Professional)
- AWS Certified Data Engineer (Associate) or AWS Certified Solutions Architect