How you will contribute:
- Refine and maintain our data infrastructure to support real-time analysis of hundreds of millions of users.
- Continuously evolve our data models and schemas based on business and engineering requirements.
- Own the data pipeline that surfaces 40B+ daily events to all teams, and the tools we use to improve data quality.
- Support warehousing and analytics customers who rely on our data pipeline for analysis, modeling, and reporting.
Qualifications:
- 2+ years of experience writing clean, maintainable, and well-tested code.
- Experience with Python and/or Scala.
- Familiarity with large-scale, distributed real-time processing tools such as Kafka, Flink, or Spark.
- Familiarity with ETL pipeline design, implementation, and maintenance.
- Bonus points for experience with (or desire to learn) Kubernetes.
- Excellent communication skills for collaborating with stakeholders across engineering, data science, and product.