Data Engineer – Databricks + DevOps

Location: Santa Clara, CA
Experience: 8+ Years
Employment Type: Full-time / Contract

Position Overview:

NovaSoft is seeking a highly experienced Databricks Data Engineer with strong DevOps expertise to design, implement, and optimize large-scale Lakehouse architectures on AWS.

This role requires a deep architectural understanding of how to separate compute from serving layers, low-latency data and API access strategies, and multi-terabyte data processing. The ideal candidate combines hands-on engineering excellence with technical leadership, bringing a true player-coach mindset.

You will work closely with cross-functional teams to build scalable, secure, automated, and high-performance data platforms using modern DevOps practices.

Key Responsibilities:

  • Design and implement scalable Databricks Lakehouse architectures on AWS
  • Build and optimize ETL/ELT pipelines using PySpark, Spark, and SQL
  • Implement Delta Lake best practices (partitioning, optimization, schema evolution)
  • Develop and manage CI/CD pipelines and automated deployments using DevOps tools
  • Optimize Spark workloads for performance, cost efficiency, and low-latency access
  • Implement data governance and security using Unity Catalog
  • Collaborate with cross-functional teams and provide technical leadership

Technical Skills (Required):

  • Strong hands-on experience with:
    • Databricks (Delta Lake, Unity Catalog, Delta Live Tables, Workflows, Runtime)
    • PySpark, Spark, Advanced SQL
    • Lakehouse & Medallion Architecture
  • AWS expertise including:
    • S3, IAM, Glue / Glue Catalog
    • Lambda
    • Secrets Manager
    • (Kinesis is a plus)
  • DevOps expertise:
    • Git-based workflows
    • CI/CD pipelines
    • Databricks Asset Bundles
    • Terraform (preferred)
  • Experience handling multi-terabyte workloads
  • Strong understanding of performance tuning, partitioning, and storage optimization

Preferred Experience:

  • Structured Streaming / real-time data pipelines
  • Advanced Databricks runtime configuration
  • Real-time or near real-time data solutions
  • Exposure to GitLab CI/CD pipelines

Certifications (Optional):

  • Databricks Certified Data Engineer (Associate / Professional)
  • AWS Certified Data Engineer or Solutions Architect

Job Type: Full-time / Contract
Job Location: Santa Clara
