
Remote
Contract
India
About the Role
ESSENTIAL JOB FUNCTIONS:
Design, develop, and monitor large-scale data pipelines that power executive dashboards, operational reporting, and machine learning algorithms.
Consume and process data from a variety of sources (RDBMS, APIs, FTP servers, and cloud storage systems) and file formats (Excel, CSV, XML, JSON, Parquet).
Use advanced data modeling skills to design and develop dimension and fact tables supporting a near real-time enterprise data model.
Create and maintain documentation pertaining to data systems (configurations, test plans, functional specs, etc.)
Use consultative skills to better understand and mature customer requirements while identifying and resolving potential design issues.
Perform duties and responsibilities specific to department functions and activities, and any other tasks assigned by the reporting manager.
Must know:
GCP, Dataproc, Spark, SQL, Python.
4-5 years of experience as a Data Engineer
Strong SQL skills
Python experience (ideally with pandas DataFrames)
Experience building data pipelines (hands-on data extraction and transformation)
Experience working within GCP
NICE TO HAVE SKILLS AND EXPERIENCE
SAS Experience
Spark/Scala Experience
Experience with healthcare data and/or enterprise-level data
Timing:
5 hours of work per day, after 6:00 PM