Problem
India faces a significant school dropout problem, particularly in secondary and higher secondary education. Dropouts stem from a complex interplay of social, economic, health, and academic factors—but education systems often lack tools to detect these issues early. The challenge lies in the lack of insight into which students are at risk of dropping out, and an inefficient response mechanism to trigger timely, targeted interventions. This data gap and reactive approach limit the effectiveness of retention efforts, especially for underserved communities.
Why is it important to solve this?
Systemic Impact on Education
Student dropouts undermine the entire education ecosystem, affecting teacher morale, school performance metrics, and community educational outcomes. Each dropout represents a failure of the system to deliver on its promise of inclusive, quality education.
Economic Consequences
Dropouts face significantly reduced lifetime earning potential, perpetuating cycles of poverty and limiting economic mobility. At a macro level, high dropout rates reduce the skilled workforce, hampering economic growth and competitiveness.
Social Equity
Dropout rates are often highest among marginalized communities, exacerbating existing inequalities. Addressing this issue is crucial for achieving Sustainable Development Goal 4 (Quality Education) and upholding the Right to Education Act 2009.
Long-term Societal Cost
The cumulative impact of educational dropouts includes increased social welfare costs, reduced tax revenue, and diminished civic participation, creating a burden on society that compounds over generations.
Our Solution
The Early Warning System (EWS) is a machine learning-powered solution designed to identify students at risk of dropping out—from Grade 3 to Grade 8—across Gujarat.
The system processes a whole academic year’s data on attendance, test scores, school management, and socio-economic status for over 1 crore students. Using a tailored model for each grade, the system predicts a probability of dropout for every student and identifies the key risk factors driving that prediction. These results are shared through the government’s Child Tracking System (CTS) used by school teachers and field officers.
Through the dashboard, educators and administrators can access:
A list of at-risk students
Get a prioritized list of students likely to drop out based on AI predictions
Top Risk Factors per Student
View the 3 most influential predictors driving each student’s dropout risk
Tailored Intervention Guidance
The solution ensures data security and explainability, and provides semi-annual updates with refined model performance. It is designed to be scalable and to integrate with existing education systems to maximize retention and learning outcomes.
The Bigger Picture
This isn’t just a data collection tool—it’s a foundation for sovereign AI development. Open-sourcing this platform means anyone building privacy-respecting, India-first AI solutions can adopt it, extend it, and deploy it confidently.
Who can use it
Primary End Users
School Teachers
Access risk lists, view individual student profiles with predictors and observations, and implement classroom-level interventions.
School Principals
Monitor school-wide dropout patterns, coordinate strategies, and oversee implementation of recommended actions.
Cluster Resource Coordinators (CRCs)
Support schools across clusters by guiding teachers, verifying interventions (e.g., attendance follow-ups), and tracking school-level implementation.
Other System Users
Vidya Samiksha Kendra, Gujarat
Policy implementation partner and primary data hub for monitoring and decision-making.
State Education Department
Uses aggregate insights to inform policies, allocate resources, and drive systemic reforms.
District Education Officers
Monitor trends at the district level and coordinate cross-school interventions.
Technical Teams / System Admins
Oversee AI model performance, system updates, and integration with CTS and SMA.
Extended Stakeholders
NGOs in Education
Utilize insights for targeted program design and school engagement strategies.
Academic & Research Institutions
Access anonymized data to study trends, develop models, and support policy research.
Other State Governments
Can adopt and localize the system for their school networks.
International Development Agencies
Use the platform as a model for scalable dropout prevention globally.
Key Features & Functionality
Early, Actionable Insights
Predicts dropout risk before it happens—empowering timely, targeted interventions by educators and administrators
AI-Driven
Models trained on attendance, enrollment, and semester-end scores deliver higher precision than the baseline across grades, tailored to each student’s context
Integrated & Scalable
Seamlessly plugged into existing platforms in Gujarat like the Child Tracking System (CTS) and School Monitoring Application (SMA)
Support for Frontline Decision-Makers
Provides clear, ranked predictors and intervention suggestions—no technical background needed
Cost & Resource Efficiency
Helps governments focus limited resources on students who need the most support, avoiding broad, untargeted efforts
Equity & Inclusion Focused
Surfaces systemic and social risk factors—supporting targeted outreach for students from vulnerable backgrounds
Explainable & Transparent
SHAP-based explainability makes model decisions interpretable by way of predictor groups and highlighted top driving features
Performance Indicators
1 crore+
Students covered across Grades 3-8 in Gujarat
3+ academic years
Model trained and validated using over three years of longitudinal education data
~50 features per student
Inputs include attendance, test scores, school metrics, and socio-economic indicators
Semi-annual prediction cycles
Two prediction rounds per year support timely intervention and tracking
Downloadable action templates enable customized follow-up for each at-risk student
Precision-recall performance
At 20% recall, precision exceeds 60%; at 40% recall, precision remains above 30%, demonstrating robustness in identifying high-risk students
Feedback-driven model refinement
Continual updates informed by inputs from CRCs and teachers in the field
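The precision-recall figures above can be read as "precision at a fixed recall". The following is a minimal sketch of how such a number could be computed from labels and predicted risk scores. The function name and the toy data are illustrative assumptions, not the project's evaluation code, and tied scores are not treated specially.

```python
def precision_at_recall(y_true, y_score, target_recall):
    """Precision at the smallest top-k cut-off whose recall reaches target_recall.

    y_true: 1 for a student who dropped out, 0 otherwise.
    y_score: predicted dropout risk, higher means riskier.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank students from highest to lowest predicted risk.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    true_pos = 0
    for k, (_, label) in enumerate(ranked, start=1):
        true_pos += label
        if true_pos / total_pos >= target_recall:
            return true_pos / k
    return total_pos / len(ranked)


# Toy data (hypothetical): ten students, five of whom dropped out.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 1]
print(precision_at_recall(labels, scores, 0.6))
```

Flagging the top four students in this toy ranking captures three of the five dropouts, so recall is 60% and precision is 75%.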
Technical Architecture

Created by Prijun Koirala
from Noun Project
AI Models
CatBoost Models
Grade-specific models trained for tabular data with categorical variables.
Model Calibration
Isotonic regression ensures reliable risk probabilities.
Explainability Engine
SHAP values identify top risk factors for every prediction.
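To illustrate the calibration step, here is a small pure-Python sketch of isotonic regression using the pool-adjacent-violators idea. It is not the project's code: a real pipeline would more likely rely on a library implementation, and the function names, the step-function lookup, and the toy data are assumptions made for illustration.

```python
def fit_isotonic(scores, labels):
    """Fit a non-decreasing step function from raw scores to observed rates.

    Returns a list of (upper_score, calibrated_value) steps in ascending order.
    """
    blocks = []  # each block: [sum of labels, count, highest score in block]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def calibrate(steps, score):
    """Look up the calibrated probability for a raw model score."""
    for upper, value in steps:
        if score <= upper:
            return value
    return steps[-1][1]  # scores above the training range get the top value


# Toy data (hypothetical): raw model scores and observed dropout outcomes.
raw = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
outcome = [0, 1, 0, 0, 1, 1]
steps = fit_isotonic(raw, outcome)
print(calibrate(steps, 0.25))
```

The fitted mapping never decreases as the raw score increases, which is the property that lets a calibrated output be read as a risk probability.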
Data Pipelines
Data Ingestion
Pulls from enrollment, attendance & assessment systems.
Data Cleaning & Validation
Handles 1+ crore records; ensures completeness and integrity.
Feature Engineering
Transforms raw inputs into approximately 50 ML-ready features per student.
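As an illustration of the feature engineering step, the sketch below derives a few attendance features from daily records. The feature names, the input layout, and the choice to skip missing days are assumptions for this example. The source does not list the actual features.

```python
def attendance_features(daily_flags):
    """Summarize one student's daily attendance into model-ready features.

    daily_flags: list with True (present), False (absent), or None (no record).
    """
    recorded = [flag for flag in daily_flags if flag is not None]
    if not recorded:
        return {"attendance_rate": None, "longest_absence_streak": 0,
                "missing_days": len(daily_flags)}
    longest = current = 0
    for present in recorded:
        # Assumption: days with no record neither extend nor break a streak.
        current = 0 if present else current + 1
        longest = max(longest, current)
    return {
        "attendance_rate": sum(recorded) / len(recorded),
        "longest_absence_streak": longest,
        "missing_days": len(daily_flags) - len(recorded),
    }


print(attendance_features([True, False, False, None, True, False]))
```

Counting missing days as a separate feature, rather than as absences, keeps data-quality gaps distinguishable from real non-attendance.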
Prediction & Analytics Engine
Risk Scoring
Generates dropout probability (0–100%) per student.
Predictor Grouping
Factors categorized into 6 actionable domains.
Intervention Mapping
Links predictors to guidance in downloadable playbooks.
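The grouping and mapping steps above can be sketched as follows, assuming per-feature contribution values (for example, SHAP values) are already available for a student. The feature names, group names, and guidance text are hypothetical placeholders. The source states that there are 6 domains but does not name them.

```python
from collections import defaultdict

# Hypothetical feature-to-group mapping (the real system uses 6 domains).
FEATURE_TO_GROUP = {
    "days_absent_last_term": "attendance",
    "longest_absence_streak": "attendance",
    "math_score": "academic",
    "language_score": "academic",
    "distance_to_school_km": "access",
}

# Hypothetical guidance text per predictor group.
GUIDANCE = {
    "attendance": "Follow up with the family about recent absences.",
    "academic": "Arrange remedial support in weak subjects.",
    "access": "Check transport or distance-related barriers.",
}


def top_predictors(contributions, k=3):
    """Sum per-feature contributions by group and return the top-k risk drivers."""
    totals = defaultdict(float)
    for feature, value in contributions.items():
        totals[FEATURE_TO_GROUP.get(feature, "other")] += value
    # Keep only groups that push the risk upward, largest first.
    rising = [(group, total) for group, total in totals.items() if total > 0]
    rising.sort(key=lambda item: item[1], reverse=True)
    return rising[:k]


student = {
    "days_absent_last_term": 0.30,
    "longest_absence_streak": 0.10,
    "math_score": 0.15,
    "language_score": -0.05,
    "distance_to_school_km": 0.05,
}
for group, _ in top_predictors(student):
    print(group, "->", GUIDANCE.get(group, "No guidance available."))
```

Summing contributions within a group is one simple aggregation choice. It relies on the contributions being additive, as SHAP values are.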
Integration
CTS Sync
Uploads results into the Child Tracking System.
SMA Sync
Enables CRC workflows via School Monitoring App.
Unified Flow
Connects data, prediction, and action across platforms.
User Interfaces
CTS Dashboard
For teachers/principals to view at-risk lists and log interventions.
SMA App (Mobile/Web)
Used by CRCs to track visits and field-level observations.
Admin BI Dashboard
Provides statewide insights for monitoring and planning.
Data Security & Storage
Secure Cloud Hosting
Deployed on AWS/GCP with end-to-end encryption.
PII Anonymization
Used during training; reversible post-prediction.
Role-Based Access
Granular control for field staff, admins, and analysts.
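The source does not describe how PII anonymization is implemented. One common approach that matches "used during training; reversible post-prediction" is keyed pseudonymization with a lookup table stored separately from the training data. The sketch below assumes that approach. The function name, the key, and the sample IDs are hypothetical.

```python
import hashlib
import hmac


def pseudonymize(student_ids, secret_key):
    """Replace student IDs with keyed hashes and build a reverse lookup.

    Returns (tokens, lookup). Only the tokens go into the training data.
    The lookup stays in a restricted store and is used after prediction
    to attach results back to real students.
    """
    tokens = []
    lookup = {}
    for student_id in student_ids:
        token = hmac.new(secret_key, student_id.encode("utf-8"),
                         hashlib.sha256).hexdigest()
        tokens.append(token)
        lookup[token] = student_id
    return tokens, lookup


# Hypothetical key and IDs. A real key would come from a secrets manager.
tokens, lookup = pseudonymize(["S001", "S002"], b"demo-key-not-for-production")
print(lookup[tokens[0]])
```

Because the hash is keyed, someone who sees only the tokens cannot recompute them from guessed IDs without the key.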
Technical Foundation
Cloud Infrastructure
Model development, storage, and serving hosted on AWS and Google Cloud Platform (GCP) with secure data access protocols.
Data Sources
Draws on student enrollment, attendance, and assessment records, together with school infrastructure and geographic data
Third-party Libraries & Tools
CatBoost for classification
SHAP for model explainability
Optuna for hyperparameter tuning
scikit-learn, NumPy, Pandas for data wrangling
Plotly, matplotlib for visualizations
Government Datasets
Student enrollment and attendance (U-DISE+, CTS - Samagra Shiksha, Vidya Samiksha Kendra)
SAT scores and learning outcome-based assessments
School infrastructure and geographic data
Platform Integrations
Child Tracking System (CTS): Existing government platform for student tracking
School Monitoring Application (SMA): Government platform used by Cluster Resource Coordinators
U-DISE+: Unified District Information System for Education Plus (data source)
How to Use
Prerequisites
(Languages, libraries, system requirements)
System Requirements
Python 3.11+
GPUs were used for hyperparameter tuning; training is also possible on CPUs.
8 vCPUs were used.
Usage Guide
Follow these steps to use the system
SHAP is used to quantify contributions of features to a model’s predictions
Contributions of features are grouped into predictors which are used to form guidelines for interventions
Contribution Guidelines
We welcome contributions! Please read our contribution guidelines before submitting PRs
How to Contribute
Fork this repository
Create a feature branch (git checkout -b feature-<feature-name>)
Make your changes and test thoroughly
Submit a pull request with clear documentation
Use Issues tab to report bugs or request new features
Opportunities for collaboration
We encourage contributions from
Government partners to integrate EWS into school systems like CTS/SMA and scale it across states.
Schools and teachers to validate predictors, provide field-level insights, and implement interventions.
NGOs and civil society to support on-ground execution and outreach to at-risk students.
Researchers and academic institutions to enhance models, analyze impact, and study dropout trends.
Open-Source Info
This project is licensed under the Apache License 2.0, a permissive open-source license that allows commercial use, modification, distribution, and private use. It requires preserving copyright and license notices, includes a patent grant from contributors, and permits redistribution of derivative works under different terms without mandating source code disclosure.
Contributors
Team or Contributors
Digvijay Bhandari
Associate ML Scientist - II
Arvind
Machine Learning Scientist
Makarand Tapaswi
Pr ML Scientist
Manoj Karnik
Group Product Manager
Nirmit Zaveri
Product Manager
Contact Persons
Nirmit Zaveri
Product Manager
Email ID:
community.kiran@wadhwaniai.org
Acknowledgement
We acknowledge with gratitude the collaborative partnership that has made this Early Warning System (EWS) for school dropout prevention possible. This initiative has been developed at the behest of Vidya Samiksha Kendra - Samagra Shiksha, Department of Education, State of Gujarat, and in collaboration with UNICEF.
This project represents a pioneering collaboration between Vidya Samiksha Kendra, Gujarat, Wadhwani AI, and UNICEF to harness Machine Learning and Artificial Intelligence in addressing school dropouts. Through this partnership, we have published data-driven insights to enhance student retention and ensure every child stays in school and learns effectively.
We extend sincere appreciation to Vidya Samiksha Kendra, specifically the MIS Department, for providing comprehensive student data and program support, without which this transformative project would not have been realized.
This EWS demonstrates the power of collaborative innovation in education, uniting government institutions, technology and program partners for Gujarat's children.
Wadhwani AI © 2025. All rights reserved.




