Job Details

Position: Data Scientist/Data Engineer (Independent Contractors Only)

Data Scientist/Data Engineer/Data Analyst 

[PySpark, Oracle/SQL, Linux, Hadoop Cloudera, IBM Spectrum Conductor, IBM Spectrum Scale]

This role is a mix of Data Analyst and Data Engineer responsibilities

  • They will be responsible for working on the bank’s regulatory data processes within its applications
  • These applications capture the data and are used for earnings, reporting, auditing, etc. This cycle runs every quarter

The applications layer is built predominantly in Python

  • Python is used for data processing and time-series work (Pandas will be used)
  • They will not be doing research-type work; the work is done at production scale
    • Previous experience working with Python and/or R and SAS in this production capacity is expected
  • They use typical big data technologies (notably Hadoop) in a big data environment with, for example, filesets containing billions of rows
  • There is existing SAS code that is sent to Python and consumed through Spark
  • The next step is to introduce converting that code into PySpark
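As an illustration of the kind of Pandas time-series processing described above, here is a minimal sketch of rolling a daily series up to the quarterly reporting cycle. The column names, figures, and the roll-up itself are hypothetical assumptions for illustration, not details from the role:

```python
import pandas as pd

# Hypothetical daily series; in the actual role this data would come
# from the bank's regulatory data pipeline, not be generated inline.
dates = pd.date_range("2024-01-01", periods=180, freq="D")
df = pd.DataFrame({"date": dates, "balance": [float(i) for i in range(180)]})

# Bucket each day into its calendar quarter, mirroring the quarterly cycle
df["quarter"] = df["date"].dt.to_period("Q")

# Aggregate daily figures per quarter
quarterly = df.groupby("quarter")["balance"].agg(["sum", "mean", "count"])
print(quarterly)
```

In production, the same group-and-aggregate shape carries over almost directly to PySpark (`groupBy(...).agg(...)`), which is one reason converting this style of code to PySpark is a natural next step.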

There is a lot of plumbing work involved in getting data from one side to the other

  • They will be working in different cloud environments, so software engineering experience is beneficial

Data worked with is confidential and classified as nonpublic information

  • Previous experience working with data of this kind is preferred

Mandatory skills needed in Beeline:

  • Linux
  • Hadoop Cloudera
  • IBM Spectrum Conductor
  • IBM Spectrum Scale
  • Python

Top Must-Have Skills:

  • Python (Pandas required)
  • IBM Spectrum Conductor & Spectrum Scale
  • Linux (expert level)
  • Hadoop and Cloud Platforms
  • Large data center environments

Candidate Requirements:

  • PySpark
  • Oracle/SQL
  • Strong technical skills
  • Strong communication
  • Analytical skills