
Hybrid
Full-Time
Bengaluru, Karnataka, India
About the Role
Velotio Technologies is a product engineering company working with innovative startups and enterprises. We are a certified Great Place to Work® and recognized as one of the best companies to work for in India. We have provided full-stack product development for 110+ startups across the globe, building products in the cloud-native, data engineering, B2B SaaS, IoT & Machine Learning space. Our team of 400+ elite software engineers solves hard technical problems while transforming customer ideas into successful products.
Requirements
Design and build scalable data infrastructure with efficiency, reliability, and consistency to meet rapidly growing data needs
Build the applications required for optimal extraction, cleaning, transformation, and loading data from disparate data sources and formats using the latest big data technologies
Build ETL/ELT pipelines and work with other data infrastructure components, like Data Lakes, Data Warehouses, and BI/reporting/analytics tools
Work with various cloud services like AWS, GCP, and Azure to implement highly available, horizontally scalable data processing and storage systems, and to automate manual processes and workflows
Implement processes and systems to monitor data quality, ensuring data is always accurate, reliable, and available to the stakeholders and other business processes that depend on it
Work closely with different business units and engineering teams to develop a long-term data platform architecture strategy that fosters data-driven decision-making practices across the organization
Help establish and maintain a high level of operational excellence in data engineering
Evaluate, integrate, and build tools to accelerate Data Engineering, Data Science, Business Intelligence, Reporting, and Analytics as needed
Practice test-driven development by writing unit and integration tests
Contribute to design documents and the engineering wiki
You will enjoy this role if you...
Like building elegant, well-architected software products with enterprise customers
Want to learn to leverage public cloud services & cutting-edge big data technologies, like Spark, Airflow, Hadoop, Snowflake, and Redshift
Work collaboratively as part of a close-knit team of geeks, architects, and leads
Desired Skills & Experience:
2+ years of data engineering or equivalent knowledge and ability
2+ years of software engineering or equivalent knowledge and ability
Strong proficiency in at least one of the following programming languages: Python, Scala, or Java
Experience designing and maintaining at least one type of database (Object Store, Columnar, In-memory, Relational, Tabular, Key-Value Store, Triple-store, Tuple-store, Graph, and other related database types)
Good understanding of star/snowflake schema designs
Extensive experience working with big data technologies like Spark, Hadoop, Hive
Experience building ETL/ELT pipelines and working on other data infrastructure components like BI/reporting/analytics tools
Experience working with workflow orchestration tools like Apache Airflow, Oozie, Azkaban, NiFi, Airbyte, etc
Experience building production-grade data backup/restore strategies and disaster recovery solutions
Hands-on experience implementing batch and stream data processing applications using technologies like AWS DMS, Apache Flink, Apache Spark, AWS Kinesis, Kafka, etc
Knowledge of best practices in developing and deploying applications that are highly available and scalable
Experience with or knowledge of Agile Software Development methodologies
Excellent problem-solving and troubleshooting skills
Process-oriented with excellent documentation skills
Bonus points if you:
Have hands-on experience using one or more cloud service providers like AWS, GCP, or Azure, and have worked with specific products like EMR, Glue, Dataproc, Databricks, Data Studio, etc
Have hands-on experience working with Redshift, Snowflake, BigQuery, Azure Synapse, or Athena, and understand the inner workings of these cloud storage systems
Have experience building data lakes, scalable data warehouses, and data marts
Have familiarity with tools like Jupyter Notebooks, Pandas, NumPy, SciPy, scikit-learn, Seaborn, Spark ML, etc
Have experience building and deploying Machine Learning models to production at scale
Possess excellent cross-functional collaboration and communication skills
Benefits
Our Culture:
We have an autonomous and empowered work culture that encourages individuals to take ownership and grow quickly
Flat hierarchy with fast decision making and a startup-oriented "get things done" culture
A strong, fun & positive environment with regular celebrations of our success. We pride ourselves on creating an inclusive, diverse & authentic environment
At Velotio, we embrace diversity. Inclusion is a priority for us, and we are eager to foster an environment where everyone feels valued. We welcome applications regardless of ethnicity or cultural background, age, gender, nationality, religion, disability or sexual orientation.