Lead Data Engineer
- 15+ years of experience in defining data architecture solutions and establishing common data capabilities for enterprises
- Proven experience in creating actionable Data and Analytics strategies for Compliance, Risk, and Financial Intelligence business functions
- Experience in defining technology blueprints and roadmaps, collaboratively designing solutions, and enabling architecture capabilities
- Experience in tool selection, conducting rapid PoCs, and recommending technologies appropriate to the use case
- 6+ years of experience building distributed solutions in Spark, MapReduce, and other MPP systems with associated data models and datastores (e.g., Redshift, Cassandra, HBase, Parquet)
- 2+ years of experience working with the AWS Cloud data engineering stack, including EC2, S3, EMR, Kinesis, Glue, and other AWS services
- Hands-on experience with Apache NiFi, Kafka, Python, and Spark, preferably on AWS
- Experience with structured/unstructured/semi-structured data ingestion and processing
- Experience with automation and deployment (Jenkins, CloudFormation, Chef, etc.)
- Experience writing high-quality code in Python and at least one other OOP language (Java, Scala, C++, Go, etc.)
- Experience working with relational database systems (RDBMS), including strong familiarity with SQL
- Solid experience optimizing Hive queries using partitioning and bucketing techniques, which control data distribution, to improve performance
- Experience working with UNIX shell scripts
- Production development of event-based applications using frameworks such as Kinesis, Kafka, Spark Streaming, or similar
- Familiarity with machine learning techniques and with continuous deployment pipelines and tools
- Desire to work across internal teams to identify requirements and iterate on solutions
- Ability to debug complex production issues across various levels of the tech stack
- Bachelor's degree or above in Computer Science or a related field preferred