Data Engineer – AWS Databricks

Location: Santa Clara, CA
Experience: 10+ Years
Employment Type: Full-Time / Contract

Position Overview:

We are seeking a highly experienced AWS Databricks Data Engineer to join our data engineering team in Santa Clara. The ideal candidate will have deep expertise in Databricks, AWS, PySpark, SQL, and large-scale data pipeline development. This role focuses on designing and optimizing modern cloud-based data platforms that support analytics, BI, and enterprise reporting use cases.

You will collaborate with cross-functional teams and business stakeholders to deliver scalable, secure, and high-performance data solutions built on a Lakehouse architecture.

Key Responsibilities:

  • Design and maintain scalable ETL/ELT pipelines using Databricks on AWS
  • Develop high-performance data transformations using PySpark and SQL
  • Implement and optimize Lakehouse (Medallion) architecture for batch data processing
  • Integrate data from S3, databases, and AWS-native services
  • Optimize Spark workloads for performance, cost, and scalability
  • Implement data governance and access controls using Unity Catalog
  • Deploy and manage jobs using Databricks Workflows and CI/CD pipelines
  • Collaborate with business and analytics teams to deliver reliable, production-ready datasets

Required Technical Skills:

  • Strong expertise in Databricks:
    • Delta Lake
    • Unity Catalog
    • Lakehouse Architecture
    • Workflows
    • Delta Live Tables (DLT)
    • Table Triggers
    • Databricks Runtime
  • Advanced proficiency in PySpark and SQL
  • Experience designing and rebuilding batch-heavy data pipelines
  • Strong knowledge of Medallion Architecture
  • Expertise in performance tuning and Spark optimization
  • Experience with Databricks Workflows & orchestration
  • Working familiarity with Databricks Genie enablement concepts
  • Experience with CI/CD and Git-based development
  • Strong AWS fundamentals:
    • IAM
    • Networking basics
    • S3
    • Glue Catalog

Preferred Qualifications:

  • Experience with Spark Structured Streaming
  • Knowledge of real-time or near real-time data solutions
  • Advanced Databricks Runtime configurations
  • Experience with GitLab CI/CD pipelines
  • Exposure to scalable enterprise data architectures

Certifications (Optional):

  • Databricks Certified Data Engineer (Associate/Professional)
  • AWS Certified Data Engineer or AWS Certified Solutions Architect

Job Type: Full-Time / Contract
Job Location: Santa Clara

Apply for this position
