About the company
Our client processes more than one billion social profiles, enriching them with several classification and segmentation algorithms to help customers such as Twitter, Telefónica, DHL, and Puma define their marketing campaigns, identify influencers, and discover new market niches.
Their challenge is to optimise the intake, enrichment, and storage of that growing data asset and to bring insight computation as close as possible to real time. To that end, they run a pipeline that ingests data from an enrichment system mainly written in Node.js. This system consumes many third-party APIs and applies algorithms to infer new characteristics. The pipeline keeps a data lake of more than 100 TB up to date, which is used to generate customer reports with Spark and Scala.
Most of the Data Engineering code base is written in Scala, Spark, and TypeScript. This codebase is used both for maintaining the data lake and for consuming it.
Their current focus is migrating and refactoring the legacy parts of the batch-oriented ETL pipeline that feeds the data lake towards a more real-time, event-oriented architecture based on Kafka.
What are they looking for?
They would like you to be able to read, refactor, and test Spark code following good practices, and to have experience with Kafka architecture patterns and the best strategies for implementing them at scale. Knowledge of data lake, data warehouse, and Delta Lake concepts is a plus, as is keeping up with trends in the data engineering field.
They are looking for someone with experience in Scala and Spark, or at least Java with an interest in Scala. Experience with functional programming is not required, since they write their code using an OO approach. Experience with Terraform and AWS is a definite plus, as is the flexibility to learn the different technologies involved in their projects. While none of these requirements is set in stone, they would like you to feel comfortable working with real-time technologies like Kafka (Kafka Streams and ksqlDB) and Spark Streaming.
They believe in pair programming, and despite being remote they spend a great part of the day pairing, so they hope you feel comfortable with this practice.
Their stack is wide: Scala, Spark, Node.js (ES6 and TypeScript), Python, React, MongoDB, MySQL, RabbitMQ, Redis, and AWS (SNS, SQS, API Gateway, Cognito, Lambda, Redshift, Aurora, DynamoDB). However, mastering them all is not a requirement; they are more interested in the principles behind them.
They invest their time and support in helping each other towards continuous learning, so it is very important that you want to keep learning and practising the skills the profession demands. Practise, practise, practise!
Working remotely has many advantages but also requires extra effort in communication and responsibility, so they consider the following skills essential: self-management, fluid communication, respect, and inclusiveness.
What do they offer?
- 40K€-60K€ salary.
- Monthly stipend for co-working spaces.
- Learning days. You can learn during working hours.
- Training budget, including unlimited access to the SafariBooks and Coursera catalogues.
- 100% remote and flexible schedule.
- Local and national bank holidays.
- A day off on your birthday.
- Rewards for hitting quarterly targets.
- Quarterly engineering meetups: 3 days of retrospective sessions, hacking, team building, and leisure in Córdoba.
- Yearly global all-hands: 3-4 days of team building and leisure sessions for the whole company.
- English lessons.
- Furniture/Accessories Allowance via Hofy.
- 2 Duvet Days per year.
- Allowance towards therapy sessions.
- Laptop of your choice (Mac or PC).
- Free 1Password Families Plan account.