Data Engineer in Advanced Analytics
Reference number: 942
Last update: 04-09-2019, 16:29
Region: Brussels
Sector: Finance
Start: 03 Jun 2019    End: 03 Jun 2020
ASAP, 12 months
Job type: Contract
Job description
The Advanced Analytics team is currently looking for a new Data Engineer who will:

Collect, clean, prepare and load the necessary data, structured or unstructured, onto Hadoop, the Big Data analytics platform, so that it can be used by the data scientists to create insights and answer business challenges.

Act as a liaison between the team and other stakeholders, whether in ADM or in CT, and help support the Hadoop cluster and the compatibility of all the different software that runs on the platform (Spark, R, Python, …)

Experiment with new tools and technologies related to data extraction, exploration or processing (e.g. OCR engines)

Depending on your skills, you may also be involved in the analytical aspects of data science projects.
What you’ll do
  • Identify the most appropriate data sources to use for a given purpose and understand their structures and contents, if necessary with the help of SMEs
  • Extract structured and unstructured data from the source systems (relational databases, data warehouses, document repositories, file systems, …), prepare that data (cleanse, re-structure, aggregate, …) and load it onto Hadoop
  • Actively support data scientists in the data exploration and data preparation phases. Where data quality issues are detected, liaise with the data supplier to do root cause analysis
  • Work under the supervision of a senior data engineer
  • Where a use case is meant to become a production application, contribute to the design, build and launch activities
  • Ensure the maintenance and support of production applications (watch duty)
  • Liaise with CT teams to address infrastructure issues and to ensure that the components and software used on the platform are all consistent
  • Where the skills allow for it, perform advanced data analysis on a selection of business use cases, supported by data scientists
Required Skills
  • Strong verbal and written communication skills, good customer relationship skills
  • Experience with understanding and creating data flows, with data architecture, with ETL/ELT development (MS SQL Server SSIS, DataStage, …) and with processing structured and unstructured data
  • Proven experience with using data stored in RDBMSs and experience with, or a good understanding of, NoSQL databases
  • Ability to write performant SQL statements
  • Ability to analyze data, to identify issues like gaps and inconsistencies and to do root cause analysis
  • Knowledge of Java
  • Experience delivering scripts
  • Experience in working with customers to identify and clarify requirements
  • Ability to design solutions that are fit for purpose whilst keeping options open for future needs
  • Knowledge of R, Python and Scala
  • Understanding of the Hadoop ecosystem including Hadoop file formats like Parquet and ORC
  • Experience with open source technologies used in Big Data analytics like Spark, Pig, Hive, HBase, Kafka, …
  • Ability to write MapReduce & Spark jobs
  • Knowledge of Cloudera
  • Knowledge of IBM mainframe
Interested? Send us your resume
To apply for this job, please complete the form below and attach your resume. This instantly places your information into our database. Once we have received your information, we will be in touch by e-mail or phone. If you have not heard from us after 3 working days, please call us!

Thank you for your interest in working with Harvey Nash and we look forward to assisting you in your job search!

Only PDF, max. 10MB
