Responsibilities
- Design and implement scalable, reliable distributed data processing frameworks and analytical infrastructure.
- Collaborate with a team to define, design, and implement data integration, management, storage, consumption, backup, and recovery solutions that ensure the high performance of the organization's enterprise data.
- Develop Structured Query Language (SQL), Data Definition Language (DDL), and Python (or equivalent) scripts to support data pipeline development, troubleshooting, data validation, and performance tuning.
- Knowledge of the Python programming language is required.
- Familiarity with Kafka, PySpark, Hadoop, Databricks, Azure Data Factory, and similar technologies.
- An understanding of data warehousing solutions, relational database theory, and NoSQL databases is a big plus.
- Familiarity with Linux is a big plus.
- Excellent interpersonal, communication, and organizational skills are required.