Problem

India faces a significant school dropout problem, particularly in secondary and higher secondary education. Dropouts stem from a complex interplay of social, economic, health, and academic factors—but education systems often lack tools to detect these issues early. The challenge lies in the lack of insight into which students are at risk of dropping out, and an inefficient response mechanism to trigger timely, targeted interventions. This data gap and reactive approach limit the effectiveness of retention efforts, especially for underserved communities.

Why is it important to solve this?

Systemic Impact on Education

Student dropouts undermine the entire education ecosystem, affecting teacher morale, school performance metrics, and community educational outcomes. Each dropout represents a failure of the system to deliver on its promise of inclusive, quality education.

Economic Consequences

Dropouts face significantly reduced lifetime earning potential, perpetuating cycles of poverty and limiting economic mobility. At a macro level, high dropout rates reduce the skilled workforce, hampering economic growth and competitiveness.

Social Equity

Dropout rates are often highest among marginalized communities, exacerbating existing inequalities. Addressing this issue is crucial for achieving Sustainable Development Goal 4 (Quality Education) and upholding the Right to Education Act 2009.

Long-term Societal Cost

The cumulative impact of educational dropouts includes increased social welfare costs, reduced tax revenue, and diminished civic participation, creating a burden on society that compounds over generations.

Our Solution

The Early Warning System (EWS) is a machine learning-powered solution designed to identify students at risk of dropping out—from Grade 3 to Grade 8—across Gujarat.

The system processes a full academic year’s data on attendance, test scores, school management, and socio-economic status for over 1 crore students. Using a tailored model for each grade, the system predicts a probability of dropout for every student and identifies the key risk factors driving that prediction. These results are shared through the government’s Child Tracking System (CTS), which is used by school teachers and field officers.

Through the dashboard, educators and administrators can access:

A list of at-risk students

Get a prioritized list of students likely to drop out based on AI predictions

Top Risk Factors per Student

View the 3 most influential predictors driving each student’s dropout risk

Tailored Intervention Guidance

Actionable suggestions (e.g., parent meetings, transport access, tutoring, scholarships)


The solution ensures data security, explainability, and semi-annual updates with refined model performance. It is designed to be scalable and integrated with existing education systems to maximize retention and learning outcomes.

The Bigger Picture

This isn’t just a data collection tool—it’s a foundation for sovereign AI development. Open-sourcing this platform means anyone building privacy-respecting, India-first AI solutions can adopt it, extend it, and deploy it confidently.

Who can use it

Primary End Users

School Teachers

Access risk lists, view individual student profiles with predictors and observations, and implement classroom-level interventions.

School Principals

Monitor school-wide dropout patterns, coordinate strategies, and oversee implementation of recommended actions.

Cluster Resource Coordinators (CRCs)

Support schools across clusters by guiding teachers, verifying interventions (e.g., attendance follow-ups), and tracking school-level implementation.

Other System Users

Vidya Samiksha Kendra, Gujarat

Policy implementation partner and primary data hub for monitoring and decision-making.

State Education Department

Uses aggregate insights to inform policies, allocate resources, and drive systemic reforms.

District Education Officers

Monitor trends at the district level and coordinate cross-school interventions.

Technical Teams / System Admins

Oversee AI model performance, system updates, and integration with CTS and SMA.

Extended Stakeholders

NGOs in Education

Utilize insights for targeted program design and school engagement strategies.

Academic & Research Institutions

Access anonymized data to study trends, develop models, and support policy research.

Other State Governments

Can adopt and localize the system for their school networks.


International Development Agencies

Use the platform as a model for scalable dropout prevention globally.

Key Features & Functionality

Early, Actionable Insights

Predicts dropout risk before it happens—empowering timely, targeted interventions by educators and administrators

AI-Driven

Models trained on attendance, enrollment, and semester-end scores deliver higher precision than the baseline across grades, tailored to each student’s context

Integrated & Scalable

Seamlessly plugged into existing platforms in Gujarat like the Child Tracking System (CTS) and School Monitoring Application (SMA)

Support for Frontline Decision-Makers

Provides clear, ranked predictors and intervention suggestions—no technical background needed

Cost & Resource Efficiency

Helps governments focus limited resources on students who need the most support, avoiding broad, untargeted efforts

Equity & Inclusion Focused

Surfaces systemic and social risk factors—supporting targeted outreach for students from vulnerable backgrounds

Explainable & Transparent

SHAP-based explainability ensures model decisions are interpretable by way of predictor groups and highlighted top driving features

Performance Indicators

1 crore+

Students covered across Grades 3-8 in Gujarat

3+ academic years

Model trained and validated using over three years of longitudinal education data

~50 features per student

Inputs include attendance, test scores, school metrics, and socio-economic indicators

Semi-annual prediction cycles

Two prediction rounds per year support timely intervention and tracking

Student-level intervention tracking


Downloadable action templates enable customized follow-up for each at-risk student

Precision-recall performance

At 20% recall, precision exceeds 60%; at 40% recall, precision stays above 30%, demonstrating robustness in identifying high-risk students
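As a rough illustration of how a figure such as "precision at 20% recall" can be read off a ranked list of predictions, here is a minimal sketch. The function name `precision_at_recall` and the toy labels and scores are assumptions made for this example and are not taken from the project; ties between scores are not treated specially.

```python
def precision_at_recall(y_true, scores, target_recall):
    """Precision at the smallest top-k cut-off whose recall reaches target_recall."""
    total_positives = sum(y_true)
    if total_positives == 0:
        raise ValueError("need at least one positive label")
    # Rank students from highest to lowest predicted risk.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    for k, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        if true_positives / total_positives >= target_recall:
            return true_positives / k
    return true_positives / len(ranked)


# Toy data, made up for illustration: 1 = dropped out, 0 = stayed in school.
labels = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
risk = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]

p_at_25 = precision_at_recall(labels, risk, 0.25)
p_at_50 = precision_at_recall(labels, risk, 0.50)
```

Reading the metric this way matches how a prioritized list is used in practice: a school works down the list from the top, so precision at a given recall describes how many flagged students in that portion of the list are truly at risk.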

Feedback-driven model refinement

Continual updates informed by inputs from CRCs and teachers in the field

Technical Architecture


AI Models

CatBoost Models

Grade-specific models trained for tabular data with categorical variables.

Model Calibration

Isotonic regression ensures reliable risk probabilities.
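To make the calibration idea concrete, the following is a minimal pure-Python sketch of isotonic regression using the pool-adjacent-violators algorithm. It is not the project's implementation (a library such as scikit-learn would typically be used, and that may interpolate between points where this sketch uses a plain step function); the function names and toy data are assumptions for illustration.

```python
def fit_isotonic(scores, labels):
    """Pool-adjacent-violators fit; returns [(upper_score, calibrated_value), ...]."""
    blocks = []  # each block: [sum of labels, count, highest score in block]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score])
        # Merge backwards while the previous block's mean exceeds the latest one,
        # so the fitted values end up non-decreasing in the raw score.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def calibrate(step_function, score):
    """Look up the calibrated probability for a raw score (no interpolation)."""
    for upper, value in step_function:
        if score <= upper:
            return value
    return step_function[-1][1]


# Toy raw scores and outcomes, made up for illustration.
raw_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
outcomes = [0, 1, 0, 0, 1, 1]
steps = fit_isotonic(raw_scores, outcomes)
calibrated = calibrate(steps, 0.25)
```

The fitted values are the observed dropout rates within pooled score ranges, which is why a calibrated score can be read as a probability rather than only as a ranking.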

Explainability Engine

SHAP values identify top risk factors for every prediction.
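Once per-feature contributions are available for a student, picking the top risk factors is a small ranking step. The sketch below assumes the contributions arrive as a plain dictionary and that only features pushing the risk upward are reported; both choices, along with the feature names and values, are assumptions for illustration rather than the project's actual logic.

```python
def top_risk_factors(contributions, k=3):
    """Return the k features that raise the predicted risk the most."""
    # Positive contributions raise the predicted risk; negative ones lower it.
    risk_raising = [(name, value) for name, value in contributions.items() if value > 0]
    risk_raising.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in risk_raising[:k]]


# Hypothetical contributions for one student (for example, SHAP values).
student_contributions = {
    "attendance_rate": 0.42,
    "distance_to_school_km": 0.18,
    "last_semester_score": 0.25,
    "school_pupil_teacher_ratio": -0.05,
    "mid_day_meal_available": -0.10,
}

top_three = top_risk_factors(student_contributions)
```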


Data Pipelines

Data Ingestion

Pulls from enrollment, attendance & assessment systems.

Data Cleaning & Validation

Handles 1+ crore records; ensures completeness and integrity.

Feature Engineering

Transforms raw inputs into approximately 50 ML-ready features per student.
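As an illustration of what one feature-engineering step might look like, the sketch below turns daily presence flags into two summary features. The feature names (`attendance_rate`, `longest_absence_streak`) and the input shape are assumptions for this example; the source does not list the actual features.

```python
def attendance_features(daily_presence):
    """Summarize a list of daily presence flags (True = present)."""
    days = len(daily_presence)
    if days == 0:
        # No records: the rate is unknown rather than zero.
        return {"attendance_rate": None, "longest_absence_streak": 0}
    longest = current = 0
    for present in daily_presence:
        # Reset the streak on a present day, extend it on an absent day.
        current = 0 if present else current + 1
        longest = max(longest, current)
    return {
        "attendance_rate": sum(daily_presence) / days,
        "longest_absence_streak": longest,
    }


# Toy attendance record for eight school days, made up for illustration.
features = attendance_features([True, False, False, True, False, True, True, True])
```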

Prediction & Analytics Engine

Risk Scoring

Generates dropout probability (0–100%) per student.

Predictor Grouping

Factors categorized into 6 actionable domains.

Intervention Mapping

Links predictors to guidance in downloadable playbooks.
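The link from predictor domains to intervention guidance can be pictured as a simple lookup. In the sketch below, the domain names are hypothetical (the source states that there are six domains but does not name them), and the suggestions reuse the example interventions mentioned earlier on this page.

```python
# Hypothetical domain names; only the intervention examples come from this page.
PLAYBOOK = {
    "attendance": "Schedule a parent meeting",
    "academics": "Arrange tutoring",
    "access": "Review transport access",
    "socio_economic": "Check scholarship eligibility",
}

# Fallback text used when a domain has no entry in this toy playbook.
DEFAULT_GUIDANCE = "Refer to the downloadable playbook"


def suggest_interventions(top_domains):
    """Return one suggestion per predictor domain, in the order given."""
    return [PLAYBOOK.get(domain, DEFAULT_GUIDANCE) for domain in top_domains]


suggestions = suggest_interventions(["attendance", "access", "health"])
```

Keeping the mapping as data rather than logic means field feedback can change the guidance without touching the models.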

Integration

CTS Sync

Uploads results into the Child Tracking System.

SMA Sync

Enables CRC workflows via School Monitoring App.

Unified Flow

Connects data, prediction, and action across platforms.

User Interfaces

CTS Dashboard

For teachers/principals to view at-risk lists and log interventions.

SMA App (Mobile/Web)

Used by CRCs to track visits and field-level observations.

Admin BI Dashboard

Provides statewide insights for monitoring and planning.

Data Security & Storage

Secure Cloud Hosting

Deployed on AWS/GCP with end-to-end encryption.

PII Anonymization

Used during training; reversible post-prediction.
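One common way to obtain identifiers that are anonymous during training yet can be re-linked after prediction is keyed pseudonymization with a separately stored lookup table. The sketch below shows that general technique using Python's standard library; it is an assumption about how such a step could work, not a description of the project's actual mechanism, and the key and student IDs are made up.

```python
import hashlib
import hmac


def pseudonymize(student_id, secret_key):
    """Derive a stable token from a student ID with HMAC-SHA256."""
    return hmac.new(secret_key, student_id.encode("utf-8"), hashlib.sha256).hexdigest()


def build_lookup(student_ids, secret_key):
    """Map token -> original ID; stored apart from the training data."""
    return {pseudonymize(sid, secret_key): sid for sid in student_ids}


# Example key and IDs for illustration only; a real key must be kept secret.
key = b"example-key-not-for-production"
ids = ["GJ-0001", "GJ-0002"]

lookup = build_lookup(ids, key)
token = pseudonymize("GJ-0001", key)  # what the training data would contain
original = lookup[token]              # re-linking after prediction
```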

Role-Based Access

Granular control for field staff, admins, and analysts.

Technical Foundation

Cloud Infrastructure

  • Model development, storage, and serving hosted on AWS and Google Cloud Platform (GCP) with secure data access protocols.


Third-party Libraries & Tools

  • CatBoost for classification

  • SHAP for model explainability

  • Optuna for hyperparameter tuning

  • scikit-learn, NumPy, Pandas for data wrangling

  • Plotly, matplotlib for visualizations

Government Datasets

  • Student enrollment and attendance (U-DISE+, CTS - Samagra Shiksha, Vidya Samiksha Kendra)

  • SAT scores and learning outcome-based assessments

  • School infrastructure and geographic data

Platform Integrations

  • Child Tracking System (CTS): Existing government platform for student tracking

  • School Monitoring Application (SMA): Government platform used by Cluster Resource Coordinators

  • U-DISE+: Unified District Information System for Education Plus (data source)

How to Use

Prerequisites


System Requirements

  • Python 3.11+

  • GPUs were used for hyperparameter tuning; training is also possible on CPUs.

  • 8 vCPUs were used.

Usage Guide

Follow these steps to use the system

  • SHAP is used to quantify contributions of features to a model’s predictions

  • Contributions of features are grouped into predictors which are used to form guidelines for interventions
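The grouping described above can be sketched as a sum of per-feature contributions within each predictor group. The feature names, group names, and values below are hypothetical, and summing is an assumption about how the grouping is done; it is a natural choice because SHAP contributions are additive.

```python
from collections import defaultdict

# Hypothetical mapping from model features to predictor groups.
FEATURE_TO_PREDICTOR = {
    "attendance_rate": "attendance",
    "days_absent_last_month": "attendance",
    "last_semester_score": "academics",
    "distance_to_school_km": "access",
}


def group_contributions(contributions, feature_to_predictor):
    """Sum feature contributions per predictor group, largest first."""
    totals = defaultdict(float)
    for feature, value in contributions.items():
        # Features without a mapping fall into a catch-all group.
        totals[feature_to_predictor.get(feature, "other")] += value
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Toy contributions for one student, made up for illustration.
student = {
    "attendance_rate": 0.25,
    "days_absent_last_month": 0.25,
    "last_semester_score": 0.125,
    "distance_to_school_km": -0.0625,
}

ranked_predictors = group_contributions(student, FEATURE_TO_PREDICTOR)
```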

Contribution Guidelines

We welcome contributions! Please read our contribution guidelines before submitting PRs

How to Contribute

  1. Fork this repository

  2. Create a feature branch (git checkout -b feature-<feature-name>)

  3. Make your changes and test thoroughly

  4. Submit a pull request with clear documentation

  5. Use the Issues tab to report bugs or request new features

Opportunities for collaboration

We encourage

  • Government partners to integrate EWS into school systems like CTS/SMA and scale it across states.

  • Schools and teachers to validate predictors, provide field-level insights, and implement interventions.

  • NGOs and civil society to support on-ground execution and outreach to at-risk students.

  • Researchers and academic institutions to enhance models, analyze impact, and study dropout trends.

Inner-Source Info

This project is licensed under the Apache License 2.0, a permissive open-source license that allows commercial use, modification, distribution, and private use. It requires preserving copyright and license notices, includes an express patent grant from contributors, and permits redistribution of derivative works under different terms without mandating source code disclosure.

Interested in Forking?

Reach out to us for more information on the source code, repository links and detailed usage guides—we'll email them to you!

Contact Us


Contributors

Team or Contributors

Digvijay Bhandari

Associate ML Scientist - II

Arvind

Machine Learning Scientist

Makarand Tapaswi

Pr ML Scientist

Manoj Karnik

Group Product Manager

Nirmit Zaveri

Product Manager

Contact Persons

Nirmit Zaveri

Product Manager

Email ID:

community.kiran@wadhwaniai.org

Acknowledgement

We acknowledge with gratitude the collaborative partnership that has made this Early Warning System (EWS) for school dropout prevention possible. This initiative has been developed at the behest of Vidya Samiksha Kendra - Samagra Shiksha, Department of Education, State of Gujarat, and in collaboration with UNICEF.


This project represents a pioneering collaboration between Vidya Samiksha Kendra, Gujarat, Wadhwani AI, and UNICEF to harness Machine Learning and Artificial Intelligence in addressing school dropouts. Through this partnership, we have published data-driven insights to enhance student retention and ensure every child stays in school and learns effectively.


We extend sincere appreciation to Vidya Samiksha Kendra, specifically the MIS Department, for providing comprehensive student data and program support, without which this transformative project would not have been realized.

This EWS demonstrates the power of collaborative innovation in education, uniting government institutions, technology and program partners for Gujarat's children.

Wadhwani AI © 2025. All rights reserved.