LibSenti: Library Review Sentiment Predictor & Analyst

Live at https://libsenti.streamlit.app/

LibSenti is an end-to-end application that uses machine learning and natural language processing (NLP) to classify the sentiment polarity (Positive, Neutral, or Negative) of student-submitted reviews of IIT and NIT libraries. It combines a fine-tuned sentiment analysis model with an interactive Streamlit application for real-time predictions, data visualization, and institutional insights.

The project focuses on handling class imbalance and improving neutral sentiment detection, both of which are common challenges in real-world sentiment analysis systems.

Key Features

Real-time review prediction through an interactive interface
Visualization of sentiment probabilities for each prediction
Unigram word cloud generation for individual institutions
Bigram comparison between institutions for phrase-level insights
Sentiment distribution comparison using pie charts
IIT vs NIT aggregate sentiment analysis
Highlighted best and worst user experiences based on sentiment
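
The bigram comparison listed above can be illustrated with a small standard-library sketch. The tokenizer, the `bigram_counts` helper, and the sample reviews are assumptions made for illustration; the application's actual preprocessing is not shown in this README.

```python
import re
from collections import Counter

def bigram_counts(reviews):
    """Count adjacent word pairs (bigrams) across a list of review strings."""
    counts = Counter()
    for text in reviews:
        # Simple lowercase word tokenizer (an assumption, not the app's tokenizer).
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(zip(tokens, tokens[1:]))
    return counts

# Hypothetical sample reviews, not taken from the project dataset.
institute_a = ["The reading room is quiet", "Quiet reading room and helpful staff"]
institute_b = ["Helpful staff but slow wifi", "Slow wifi in the reading room"]

counts_a = bigram_counts(institute_a)
counts_b = bigram_counts(institute_b)

# Phrases that appear in reviews of both institutions.
shared = set(counts_a) & set(counts_b)
```

Comparing `counts_a.most_common(n)` with `counts_b.most_common(n)` then surfaces phrases that are distinctive to one institution, while `shared` holds the phrases common to both.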

Model Details

Architecture: RoBERTaForSequenceClassification (roberta-base)
Dataset: IITs and NITs library reviews labeled as Positive, Neutral, or Negative
Frameworks: PyTorch, Hugging Face Transformers
Accuracy: ~95.3% (best checkpoint selected by weighted F1-score)
Achieved balanced performance across all classes with minimal bias toward dominant classes
Deployed as a live web application using Streamlit Cloud with model hosting on Hugging Face Hub

Training Strategy

Manual class weighting to address imbalance
Oversampling of the neutral class to improve recall
Label smoothing (0.1) for better generalization on ambiguous samples
Fine-tuned using Hugging Face Trainer with early stopping and best model selection based on weighted F1-score
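
Early stopping with best-model selection can be sketched in plain Python as follows. With the Hugging Face Trainer this is typically configured through `EarlyStoppingCallback` and `load_best_model_at_end`, but the project's actual configuration is not shown here. The patience value and the F1 history below are hypothetical.

```python
def select_best_epoch(f1_per_epoch, patience=2):
    """Return (best_epoch, best_f1, stopped_at) under patience-based early stopping.

    Training stops once the metric has failed to improve for `patience`
    consecutive evaluations; the best epoch seen so far is kept.
    """
    best_epoch, best_f1 = -1, float("-inf")
    epochs_without_gain = 0
    stopped_at = len(f1_per_epoch) - 1
    for epoch, f1 in enumerate(f1_per_epoch):
        if f1 > best_f1:
            best_epoch, best_f1 = epoch, f1
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                stopped_at = epoch
                break
    return best_epoch, best_f1, stopped_at

# Hypothetical weighted F1 per evaluation epoch (not the project's results).
history = [0.70, 0.78, 0.84, 0.83, 0.84, 0.80]
result = select_best_epoch(history, patience=2)
```

In this made-up history the metric peaks at epoch 2, fails to improve for two evaluations, and training stops at epoch 4 with the epoch-2 checkpoint retained.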

Project Structure

LibSenti/
├── app.py
├── requirements.txt
├── README.md
├── assets/
│   ├── unigram_wordclouds/
│   ├── bigram_wordclouds/
│   ├── piecharts/
│   └── iit_vs_nit_sentiment_comparison.png
├── data/
│   ├── raw_reviews.csv
│   └── sentiment_iit+nit.csv

Sentiment Categories

Sentiment	Class ID
Negative	0
Neutral	1
Positive	2
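
The mapping above translates directly into lookup dictionaries. The sketch below is illustrative only: the names `ID2LABEL`, `LABEL2ID`, and `decode_prediction` are assumptions, not identifiers taken from the project code.

```python
# Class IDs as listed in the table above.
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

def decode_prediction(scores):
    """Map a list of three class scores to the label with the highest score."""
    if len(scores) != len(ID2LABEL):
        raise ValueError("expected one score per class")
    best_id = max(range(len(scores)), key=scores.__getitem__)
    return ID2LABEL[best_id]

predicted = decode_prediction([0.1, 0.7, 0.2])
```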

Class imbalance is handled using a combination of:

Manual class weighting in the loss function
Oversampling of the neutral class
Label smoothing to reduce overconfidence
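
As a rough illustration of the oversampling step, the sketch below duplicates randomly chosen rows of one class until it reaches a target count. The `oversample_class` helper, the target count, and the sample rows are hypothetical; the project's actual sampling ratio is not stated in this README. Oversampling is normally applied to the training split only, so that duplicated rows do not leak into evaluation data.

```python
import random
from collections import Counter

def oversample_class(rows, target_label, target_count, seed=42):
    """Duplicate random rows of `target_label` until it has `target_count` rows."""
    rng = random.Random(seed)
    pool = [row for row in rows if row[1] == target_label]
    if not pool:
        raise ValueError("no rows with the target label")
    extra = [rng.choice(pool) for _ in range(max(0, target_count - len(pool)))]
    return rows + extra

# Hypothetical (text, class_id) rows; class ID 1 is Neutral.
rows = [("great library", 2)] * 6 + [("too noisy", 0)] * 4 + [("it is okay", 1)] * 2
balanced = oversample_class(rows, target_label=1, target_count=6)
counts = Counter(label for _, label in balanced)
```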

Model Training

Base Model: roberta-base
Framework: Hugging Face Transformers with PyTorch

Training Strategy

Weighted cross-entropy loss for imbalance handling
Manually tuned class weights
Neutral class oversampling to improve F1-score
Label smoothing (0.1) for ambiguous sentiment handling
Maximum sequence length = 512 for improved context understanding
Learning rate of 8e-6 for stable fine-tuning
Early stopping to prevent overfitting
Best model selected based on weighted F1-score
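
To make the interaction between class weights and label smoothing concrete, here is a pure-Python sketch of one common formulation of the per-sample loss: the smoothed target puts `1 - smoothing` on the true class and spreads `smoothing` uniformly over all classes, with each class term scaled by its weight. The function names and the example weights are assumptions; the weights actually used in this project are not listed here.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def weighted_smoothed_ce(logits, target, class_weights, smoothing=0.1):
    """Per-sample weighted cross-entropy with uniform label smoothing."""
    log_probs = log_softmax(logits)
    n_classes = len(logits)
    # Term for the true class, scaled by that class's weight.
    target_term = -class_weights[target] * log_probs[target]
    # Uniform term: weighted average of the negative log-probabilities.
    uniform_term = -sum(w * lp for w, lp in zip(class_weights, log_probs)) / n_classes
    return (1.0 - smoothing) * target_term + smoothing * uniform_term

# Hypothetical weights that up-weight the Neutral class (ID 1).
weights = [1.0, 2.0, 1.0]
loss = weighted_smoothed_ce([0.2, 1.5, -0.3], target=1, class_weights=weights)
```

With `smoothing=0` and unit weights the function reduces to the ordinary cross-entropy, which is a convenient sanity check.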

Evaluation Metrics

Accuracy
Weighted F1-score
Precision and Recall (per class)
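
The metrics above can be computed from scratch as in the following sketch, where weighted F1 is the support-weighted average of the per-class F1 scores. The helper names and the toy labels are illustrative assumptions and are not results from this project.

```python
def per_class_metrics(y_true, y_pred, labels=(0, 1, 2)):
    """Return precision, recall, F1, and support for each class ID."""
    metrics = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall,
                          "f1": f1, "support": tp + fn}
    return metrics

def weighted_f1(metrics):
    """Average the per-class F1 scores, weighted by class support."""
    total = sum(m["support"] for m in metrics.values())
    if not total:
        return 0.0
    return sum(m["f1"] * m["support"] for m in metrics.values()) / total

# Toy labels for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
metrics = per_class_metrics(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
overall_f1 = weighted_f1(metrics)
```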

Run Training

python train_model.py

This Script Performs

Data preprocessing
Tokenization
Model training
Class balancing
Model saving to ./saved_model/

Performance Insights

Strong, balanced performance across all sentiment classes (F1 ≈ 0.93–0.96)
Neutral class F1 improved from ~0.82 to ~0.96

Key Challenge

Neutral sentiment classification is inherently difficult due to semantic ambiguity.

Solution Approach

Data balancing
Label smoothing
Increased context window (512 tokens)

Streamlit Application

To launch the application:

streamlit run app.py

Components

Review classifier for real-time sentiment prediction
Probability visualization for model confidence
Word cloud comparison (unigram and bigram)
Sentiment distribution charts for institutions
IIT vs NIT comparative analysis
Highlighted user experiences
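
The probability visualization relies on converting the model's raw class scores (logits) into probabilities. A minimal sketch of that conversion follows, assuming a three-way softmax over classes in ID order 0, 1, 2; the `probability_table` helper and the example logits are hypothetical and not taken from app.py.

```python
import math

# Labels in class-ID order: 0 = Negative, 1 = Neutral, 2 = Positive.
LABELS = ("Negative", "Neutral", "Positive")

def softmax(logits):
    """Convert logits to probabilities that sum to 1 (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def probability_table(logits):
    """Pair each label with its probability, sorted from most to least likely."""
    probs = softmax(logits)
    return sorted(zip(LABELS, probs), key=lambda item: item[1], reverse=True)

# Hypothetical logits for a single review.
table = probability_table([-1.2, 0.3, 2.5])
```

The resulting list of `(label, probability)` pairs is the kind of structure a bar chart of model confidence could be drawn from.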

Future Improvements

Integrate explainability tools such as LIME or SHAP
Expand dataset to include more institutions
Add metadata-based filtering (date, role, etc.)
Implement topic modeling for trend analysis
Introduce sentiment timeline visualization
Add user feedback loop for model improvement
Support multilingual sentiment analysis

Author

Aman Srivastava
amansri345@gmail.com
