Specialization
Data Science / Machine Learning
They are looking for a Big Data Engineer who will work on collecting, storing, processing, and analyzing huge data sets. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them. You will also be responsible for integrating them with the architecture used across their clients.
Stack
Hadoop, MapReduce, HDFS, GCP, Storm, Spark Streaming, Pig, Hive, Impala, Spark, SQL, Linux, ETL, Kafka, RabbitMQ, Lambda
Responsibilities
- Select and integrate any Big Data tools and frameworks required to provide requested capabilities.
- Develop and maintain data pipelines that implement ETL processes, monitor their performance, and advise on any necessary infrastructure changes.
- Translate complex technical and functional requirements into detailed designs.
- Investigate and analyze alternative solutions for data storage, processing, etc. to ensure the most streamlined approaches are implemented.
- Serve as a mentor to junior staff by conducting technical training sessions and reviewing project outputs.
Skills and qualifications
- Strong understanding of data warehousing and data modeling techniques.
- Proficient understanding of distributed computing principles (Hadoop v2, MapReduce, HDFS).
- Strong data engineering skills on the GCP cloud platform.
- Experience with building stream-processing systems, using solutions such as Storm or Spark Streaming.
- Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala.
- Experience with Spark, SQL, and Linux.
- Knowledge of various ETL techniques and frameworks, such as Flume, Apache NiFi, or dbt.
- Experience with various messaging systems, such as Kafka or RabbitMQ.
- Good understanding of Lambda Architecture, along with its advantages and drawbacks.
Sofya Reznikova, Tech Recruiter