ARTPARK Data Engineering Internship
Location: Bangalore
Type: Internship
Duration: Approximately three months between May and August 2025. The period may be extended based on a candidate’s performance and/or interest.
About the Role
ARTPARK’s One-Health team tackles interconnected challenges in human, animal, and environmental health through collaborative and interdisciplinary efforts.
Working with city, state, and national governments, we support data-driven public health responses to endemic, epidemic, and climate-related threats through innovative solutions leveraging statistical and AI/ML-based approaches.
In this role, you will have the opportunity to engage with leading experts in disease modelling, climate-health systems, engineering, and public health, both nationally and internationally, in a dynamic and highly motivated environment.
Key Responsibilities:
- Data extraction and cataloguing:
- Extract climate, image, or epidemiological data from diverse sources. The data may be spatio-temporal, complex, semi-structured, or unstructured.
- Integrate data into a coherent, harmonised format ready for use by advanced computational models.
- Data and model pipeline development:
- Develop and automate a robust, scalable data pipeline.
- Develop data access mechanisms and policies.
- Ensure streamlined and reliable data flow.
- Enable computational and simulation modellers to seamlessly access and utilise the data in their models without manual intervention and facilitate real-time processing.
- Exhaustive cataloguing and documentation for data and modelling work:
- Catalogue details of data sources, extraction processes, and any standards and processes used.
- Document all data analyses, model details, results, plots, evaluations, insights, etc.
- Work with and support data analysts, data scientists, and/or computational epidemiologists.
- Leverage state-of-the-art techniques (AI, ML, LLM, etc.) for production-grade data extraction and models.
Experience & Skills:
- The candidate should be pursuing a bachelor's or master's degree in computer science, engineering, mathematics, or a related quantitative scientific discipline.
- Programming experience, preferably in Python, is required. Candidates who are strong programmers but unfamiliar with Python will be expected to ramp up on Python quickly.
- Experience with processing raw datasets into clean, structured formats is highly desirable. The candidate may have gained such experience via coursework or projects.
- Attention to detail while working on data and/or models is highly desirable.
- Experience with AI/ML techniques is desirable but not required.
- Experience with AWS, GitHub, databases, etc. is desirable.