Responsibilities:
Design and implement distributed data processing
pipelines using Spark, Hive, Python, and other tools and languages prevalent in
the Hadoop ecosystem. Ability to design and implement end-to-end solutions.
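For illustration, a minimal PySpark batch pipeline in this vein might look like the following sketch (the Hive table names and columns are hypothetical):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # Minimal batch pipeline: read a Hive table, aggregate, write results back.
    spark = (SparkSession.builder
             .appName("example-pipeline")
             .enableHiveSupport()
             .getOrCreate())

    events = spark.table("raw.events")  # hypothetical source table
    daily = (events
             .groupBy(F.to_date("event_ts").alias("event_date"))
             .agg(F.count("*").alias("event_count")))
    daily.write.mode("overwrite").saveAsTable("curated.daily_event_counts")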
Experience with NoSQL technologies such as
HBase, DynamoDB, and Cassandra. Familiarity with distributed stream processing
frameworks for fast and big data, such as Apache Spark, Flink, and Kafka Streams.
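As a sketch of the stream processing side, Spark Structured Streaming can consume a Kafka topic roughly as follows (this assumes the spark-sql-kafka connector is available; the broker address and topic name are hypothetical):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("example-stream").getOrCreate()

    # Subscribe to a Kafka topic as a streaming DataFrame.
    stream = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")
              .option("subscribe", "events")
              .load())

    # Kafka values arrive as bytes; cast to string before further processing.
    messages = stream.selectExpr("CAST(value AS STRING) AS value")

    # Write to the console sink for demonstration purposes.
    query = (messages.writeStream
             .format("console")
             .outputMode("append")
             .start())
    query.awaitTermination()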
Build utilities, user-defined functions (UDFs), and
frameworks to support common data flow patterns.
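A user-defined function of the kind mentioned above could be sketched in PySpark like this (the normalization logic is a made-up example):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.appName("example-udf").getOrCreate()

    # Hypothetical UDF that normalizes free-form country codes.
    @F.udf(returnType=StringType())
    def normalize_country(code):
        return code.strip().upper() if code else None

    df = spark.createDataFrame([(" us",), ("GB",), (None,)], ["country"])
    df.withColumn("country", normalize_country("country")).show()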
Work with architecture/engineering leads and
other teams to ensure quality solutions are implemented, and engineering best
practices are defined and adhered to.
Experience with business rule management systems
such as Drools.
Qualifications:
MS/BS degree in computer science or a related
discipline
Minimum of 5 years' experience in software development
Minimum of 1-2 years' experience with Big Data
technologies