Our team has an immediate 12-month internship opening for a Researcher.
Responsibilities:
• Contribute to dataset quality analysis using machine learning methods for the OpenDataology open source project, which provides data governance for AI datasets used to train commercial machine learning models.
• Research, design and implement automated dataset quality evaluation metrics and tools for analyzing data quality anti-patterns.
• Research, design and implement automated dataset provenance and lineage analysis tools.
• Contribute to publishing research papers in top-tier SE and AI venues (e.g., ICSE, FSE, ASE, TSE, TOSEM, ICLR, ICML, NeurIPS) and to high-impact intellectual property (e.g., patents).
• Actively engage in the OSS project OpenDataology and make regular contributions.
Job requirements
What you’ll bring to the team:
• Currently enrolled in a Master's or Ph.D. program in Computer Science, Electrical and Computer Engineering, Statistics, Applied Mathematics, or a related field.
• Experience conducting research in at least one of the following areas of software engineering: software engineering for AI, AI for software engineering, software analytics, open source licensing.
• Experience developing software and conducting data analysis with Python or R.
• Experience with and understanding of end-to-end development of AI/ML models.
• Publications in top-tier software engineering conferences and journals (e.g., ICSE, FSE, ASE, TSE, TOSEM, EMSE) are an asset.
• Experience working with open source communities is an asset.
• Good communication skills and a willingness to collaborate.
