Sample big data engineer job description
At [Company X], we’re looking for a big data engineer who loves solving complex problems across a full spectrum of technologies. The ideal candidate is excited by experimentation and looking for a new challenge that stretches their talents. The big data engineer will help ensure that our technological infrastructure operates seamlessly in support of business objectives.
Objectives of this role
- Develop and implement pipelines that extract, transform, and load data into an information product that helps the organization reach its strategic goals
- Focus on ingesting, storing, processing, and analyzing large datasets
- Create scalable, high-performance web services for tracking data
- Translate complex technical and functional requirements into detailed designs
- Investigate alternatives for data storage and processing to ensure implementation of the most streamlined solutions
- Serve as a mentor for junior staff members by conducting technical training sessions and reviewing project outputs
Responsibilities
- Develop and maintain data pipelines using ETL processes
- Take responsibility for Apache Hadoop development and implementation
- Work closely with the data science team to implement data analytics pipelines
- Help define data governance policies and support data-versioning processes
- Maintain security and data privacy, working closely with the data protection officer
- Analyze a vast number of data stores to uncover insights
Required skills and qualifications
- Experience with Python, Spark, and Hive
- Understanding of data-warehousing and data-modeling techniques
- Knowledge of industry-standard visualization and analytics tools (e.g., Tableau, R)
- Strong data engineering skills on the Azure cloud platform
- Experience with streaming frameworks such as Kafka
- Knowledge of Core Java, Linux, SQL, and at least one scripting language
- Good interpersonal skills and a positive attitude
Preferred skills and qualifications
- Degree in computer science, mathematics, or engineering
- Expertise in ETL methodology for corporate-wide solution design using DataStage