Sr. Data Engineer
About this role
WHAT YOU’LL DO:
• Build and operate production-grade data platforms that support IMO’s terminology-driven products, analytics, and machine learning use cases
• Design, develop, and maintain data pipelines for batch and incremental processing using modern lakehouse and cloud-native patterns
• Work extensively with cloud data platforms (AWS + Databricks) to ingest, transform, and serve structured and semi-structured data at scale
• Model data intentionally, developing well-documented, analytics- and product-ready data models that balance usability, performance, and correctness
• Apply strong software engineering practices to data work, including version control, testing, CI/CD, and infrastructure-as-code
• Collaborate directly with product, analytics, and AI teams to translate requirements into scalable technical solutions
• Improve reliability, performance, and cost-efficiency of data systems through monitoring, observability, and continuous optimization
• Design for data quality and trust, implementing automated checks, validation frameworks, and lineage-aware workflows
• Contribute to platform evolution, helping shape standards around orchestration, data modeling, environments, and deployment
• Operate in an Agile environment, taking ownership of deliverables and proactively identifying risks and opportunities
• Mentor and support other engineers, leading by example in code quality, problem decomposition, and technical decision-making
• Continuously learn and apply industry best practices in data engineering, analytics engineering, and AI data foundations
WHAT YOU’LL NEED:
• Bachelor’s degree in a relevant technical field and 5+ years of professional experience, or 7+ years of equivalent hands-on experience
• Demonstrated experience building and supporting end-to-end data platforms in a production environment
• Strong programming experience in Python and SQL, with an engineering mindset toward maintainability and testing
• Deep experience with cloud-based data platforms, especially AWS (e.g., S3, EC2, RDS, IAM) and Databricks / Spark-based processing
• Strong SQL skills, including complex transformations and performance-aware query design
• Hands-on experience with data orchestration frameworks (e.g., Airflow or equivalent)
• Experience designing and optimizing data models for analytics, reporting, and downstream applications
• Familiarity with CI/CD practices and infrastructure-as-code (e.g., Git, Terraform)
• Comfort working with large, complex, and evolving datasets, including managing schema change and metadata
• Strong analytical, debugging, and root-cause analysis skills
• Clear written and verbal communication skills, including documenting designs and tradeoffs
• A proactive, ownership-oriented mindset and the ability to work effectively across teams
PREFERRED EXPERIENCE:
• Experience with analytics engineering tools and patterns (e.g., dbt or similar transformation frameworks)
• Familiarity with data observability, monitoring, and cost management in cloud environments
• Experience supporting AI / ML data pipelines or feature engineering workflows
• Exposure to streaming or near-real-time data processing concepts
• Experience with healthcare, clinical, or regulated data domains
• Familiarity with metadata management, data catalogs, and lineage concepts
• AWS certifications (Data Engineer, Solutions Architect, or AI/ML)