
Remote · Part-Time · India
Skills
Python (Programming Language)
Data Warehousing
Data Engineering
Machine Learning
Databases
Data Science
Extract, Transform, Load (ETL)
Apache Spark
Big Data
Written Communication
About the Role
Here is the required skill set:
Title: Data Engineer
Key Responsibilities:
- Design & build data platform tooling, services, and integrations
- Develop and maintain data lake and lakehouse architectures
- Build real-time & batch data pipelines using Apache Spark and Databricks
- Integrate streaming architectures with software products
- Implement ELT/ETL processes and data transformation frameworks
- Enable data governance, security, quality, and observability
- Collaborate with analysts, data scientists, and business stakeholders
- Participate in architecture reviews, code reviews, and DevOps integration
- Troubleshoot and optimize complex distributed systems
- Resolve data incidents and ensure high system reliability
🧠 Core Technical Skills
- Languages: Python (preferred), Scala, Java, SQL
- Frameworks: Apache Spark (strong hands-on required)
- Platforms: Databricks, AWS/Azure/GCP
- Architecture: Data Lake, Lakehouse, Streaming
- Pipelines: ELT/ETL, orchestration, distributed data processing
- APIs: RESTful, gRPC
- Streaming: Kafka, event-driven/messaging systems
- DevOps: GitHub Actions, CI/CD, Pytest, mocking/testing best practices
- Observability: Data/system monitoring tools, debugging, alerting
- Data Modeling: OLTP, document/graph stores, best practices
- Fundamentals: strong understanding of the SDLC and software design patterns
- Governance: data governance and compliance experience (required)
Tech Stack:
- Databricks
- AWS
- Kafka
- PySpark
- GitHub Actions
- Terraform
- Data governance (HIPAA)