H2O.ai Data Engineer - Data Annotation (EMEA) in United States
H2O.ai is the leader in Enterprise AI/Machine Learning with a mission to democratize AI for everyone. H2O.ai is transforming the use of AI with its category-creating visionary machine learning platform. More than 25,000 companies use our open-source platform in mission-critical use cases for Finance, Insurance, Healthcare, Retail, Telco, Sales and Marketing. Our commercial DriverlessAI platform builds on this with a "use AI to do AI" approach, offering an easier, faster and more cost-effective way of implementing data science that delivers quantifiable value to the business. H2O.ai partners with leading technology companies such as NVIDIA, IBM, AWS, Intel, Microsoft and Google. To learn more about how H2O.ai is driving AI Transformation, visit www.h2o.ai.
One of the most important components of training state-of-the-art machine learning models is the targets the models are trained to learn. In this role, you will be responsible for gathering and managing AI annotations for a variety of document-specific machine learning models. This can include creating new annotations on documents or annotation sets that have not been attempted before, or improving existing models that have already been trained with labels. Existing practices are in place, and we see an opportunity for a dedicated focus on increasing the speed and accuracy of annotations.
Own the data annotation pipeline, ensuring quality of data annotation for the entire team.
Create annotation documentation for annotators, understanding and forecasting difficult scenarios.
Work directly with members of the data science team on annotation objectives, and understand the AI pipelines well enough to convert existing predictions into initial annotations.
Work with a small annotation team to provide guidance, answer questions, manage annotation reviews, and manage timelines.
Balance the tradeoff between annotation complexity and speed.
Analyze the annotation pipeline to ensure consistency, flag items for additional review, and provide insights to the data science team.
Fluent in Python, specifically pandas.
Experience working with large datasets.
Experience with standard code management software (GitHub, Azure DevOps, etc.).
Bachelor's degree or higher in computer science, engineering, or a comparable discipline.
Location: Czechia, Slovakia, Poland, Hungary, Germany, Austria, France, Netherlands