PySpark Engineer
Citi
Responsibilities:
- Engineering degree with 1-2 years of experience in Big Data systems, Hive, Hadoop, Spark (Python/Scala), and cloud-based data management technologies
- Hands-on experience in Unix scripting, Python, and Scala programming, along with strong experience in SQL
- Comfortable working with completely unstructured, undocumented code and turning it into best-in-class code by redesigning costly compute and data processes and aligning them with best development standards
- Experienced in working with large and multiple datasets and data warehouses, with the ability to pull data using relevant programs and coding
- Well versed in the necessary data preprocessing and application engineering skills
- At least 3 years of experience designing software systems with intense computational needs across real-time and batch processing
- Experience with and understanding of supervised and unsupervised machine learning techniques
- Exposure to data ingestion and ETL tools such as Talend, modeling tools, performance management tooling such as Pepperdata, and the Cloudera stack is a plus
- Knowledge of data management, data governance, data security and regulatory practices
- Ability to identify, clearly articulate, and solve complex business problems and present them to management in a structured, simplified form
- Experience working in an onsite/offsite delivery model
- Experience in Credit Cards and Retail Banking
- Excellent communication and interpersonal skills
- Strong process/project management skills
- Multiple stakeholder management
- Control orientation and risk awareness
Qualifications:
- Fast learner with a desire to excel and a willingness to partner and solve problems in complex environments, placing business objectives at the center of all activity
- Experience in performance tuning and code re-engineering is preferred
- Experience in broad IT architecture and design across data and channels is preferred
- Experience in query tuning and automation technologies (AutoSys, Jenkins, ServiceNow) is preferred
- Exposure to container technology and machine learning is a plus
Education:
- Bachelor's/University degree or equivalent experience
This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
------------------------------------------------------
Job Family Group:
Decision Management
------------------------------------------------------
Job Family:
Data/Information Management
------------------------------------------------------
Time Type:
Full time
------------------------------------------------------
Most Relevant Skills
Please see the requirements listed above.
------------------------------------------------------
Other Relevant Skills
Python (Programming Language), Spark SQL.
------------------------------------------------------
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.