ACT, Inc. Data Engineer II - Research in Iowa City, Iowa

Overview

The Research Data Engineer II takes the lead from the Senior Data Engineer to establish data pipelines that optimize ACT’s data assets, specifically for research and data scientist exploratory analytics and modeling and for algorithmic use in products. The position helps the business to leverage data as a strategic asset, including building automated processes and metadata in support of data governance. The Research Data Engineer II is a data steward within the data lake, and collaborates with enterprise data architecture and enterprise data engineering to define business rules, critical data elements, and data standards, and to codify metadata relevant to data science and research applications.

Responsibilities

Typical work-related activities include:

  • Process structured, unstructured, and semi-structured data into a form suitable for analysis by data scientists, research scientists, and learning scientists

  • Work closely with scientists to understand the questions they are asking of the data

  • As a data steward, define business rules, critical data elements, and data standards, and codify metadata in the data lake, ensuring data is interpretable by those working in the data lake

  • Meet established data quality metric targets such as completeness, currency, accuracy, lineage, accessibility, timeliness, validity, integrity, precision, and representation

  • Implement and support data visualization

  • Build automated processes and metadata in support of data governance

  • Write code that is easy to understand, test, and maintain

  • Assist our engineering team in integrating data pipelines into our production systems

  • Build and document performant data pipelines; work with performance engineering to optimize code as needed

  • Keep pace with ever-changing data storage and wrangling tooling, including data-in-motion, cloud-distributed, fog, and edge data uses

  • Contribute to a culture of high achievement, industry leadership, innovation, and accountability

Qualifications

Education:

  • Bachelor’s degree in computer science, engineering, data mining, or related field required.

  • Or an equivalent combination of education and experience from which comparable knowledge and abilities can be acquired

Experience:

  • Minimum of three years’ demonstrated success building data pipelines for analytic purposes

  • Experience with software development tools such as GitHub and Jira required

  • Experience in education or assessment industries preferred

  • Expert in data modeling, with advanced knowledge of and experience writing and tuning SQL; Postgres and Amazon RDS preferred

  • Experience integrating data from multiple data sources and processing large amounts of structured and unstructured data; Spark preferred; Apache Atlas/Apache Ranger experience helpful

  • Experience with NoSQL databases preferred

  • Experience with scalable, real-time messaging systems preferred; Kafka preferred

  • Experience handling datasets exceeding 250GB preferred

Knowledge, Skills and Abilities:

  • Demonstrated skills preparing data for analytic purposes for users of Python, R, and/or SAS

  • Strong knowledge of data flows, data architecture, ETL and processing of structured, unstructured, and semi-structured data

  • Knowledge of general computer science principles, including distributed computing and object-oriented design and development

  • Good oral and written communication skills

  • Exceptional collaborator and team member

  • Demonstrated eagerness to learn new techniques

  • Agile mindset

Job ID: 2018-1320

Number of Openings: 1

Category: Research

Travel: Up to 25% Travel

ACT is an Equal Opportunity Employer/Minorities/Females/Protected Veterans/Disabled.