Emrul Hasan | Machine Learning, NLP & Recommender Systems

Multimodal LLM for news media bias detection towards Responsible AI

In this project, we developed NewsMediaBias-Plus, a novel multimodal dataset for analyzing media bias and disinformation in news articles and their associated visual content. The project aims to mitigate bias in political news, particularly in North American contexts. The news articles were scraped from several mainstream outlets, including CBC, BBC, and The New York Times. We applied LLMs (the Llama and Mistral series) for text annotation and multimodal models (CLIP, the LLaVA series, MiniCPM, PaliGemma, and Phi3 Vision) for image annotation. We also fine-tuned PaliGemma and Llama (3 and 3.1) and developed several benchmark models.

Project details

Dataset link

MRRRec: Multi-criteria Rating and Review based Recommendation Model

In this project, we develop a novel multi-criteria recommendation system that integrates multi-criteria rating features with textual review features. Criteria explicitly defined by the business are referred to as explicit criteria, whereas implicit criteria are those discussed in the review text. The model captures implicit criteria ratings from textual reviews using an attention mechanism and integrates them with the explicit criteria ratings. The combined features are then processed by a Deep Neural Network (DNN) to estimate overall ratings or preferences. The proposed model outperforms existing baselines in terms of Mean Squared Error (MSE), Mean Absolute Error (MAE), precision, recall, and F1 score.
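The idea of attending over review text and then joining the result with explicit ratings can be illustrated with a minimal sketch. This is not the MRRRec implementation: the dot-product scoring, the function names (`attention_pool`, `combine_features`), and the toy vectors are assumptions made for illustration, and the DNN that would consume the combined features is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(token_vecs, query):
    """Weight each token vector by its dot-product score against a query
    vector and return the weighted sum plus the attention weights.
    (Hypothetical helper; a trained model would learn the query.)"""
    scores = [sum(t * q for t, q in zip(vec, query)) for vec in token_vecs]
    weights = softmax(scores)
    dim = len(query)
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, token_vecs))
        for i in range(dim)
    ]
    return pooled, weights


def combine_features(review_vec, explicit_ratings):
    """Concatenate the pooled review features with explicit criteria ratings.
    The result would be the input to a downstream DNN (not shown)."""
    return list(review_vec) + list(explicit_ratings)


# Toy example: two 2-dimensional token vectors, a query, two explicit ratings.
pooled, weights = attention_pool([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
features = combine_features(pooled, [4.0, 3.5])
```

In this toy run the first token receives the larger attention weight because it aligns with the query, so the pooled vector leans toward it.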

Article Link

Source Code

Aspect-Aware Multi-Criteria Recommendation Model with Aspect Representation

We propose a novel multi-criteria recommendation model that leverages criteria ratings in a two-step process. The first step fine-tunes BERT, with an additional aspect representation layer, to predict criteria ratings. In the second step, the predicted ratings are aggregated by a deep neural network to estimate overall ratings. The model was evaluated on three datasets (TripAdvisor, RateBeer, and BeerAdvocate) and outperformed several baselines.

Article Link

Multicriteria Recommendation System by Leveraging Predefined, Implicit, and Undefined Criteria

We propose a novel multi-criteria recommendation model that utilizes predefined, implicit, and undefined criteria. We use a semantic similarity-based sentence clustering method to identify the predefined and implicit criteria and a sentiment analyzer to estimate their ratings. The semantic similarity between each sentence in the review and each predefined criterion is calculated using cosine similarity, and the sentence is assigned to the most similar criterion. A sentence is considered to express an opinion on an undefined criterion if its similarity score with every predefined criterion is lower than a predefined threshold. Ratings are computed for each extracted implicit criterion and for the undefined criterion based on the review content. Finally, all three types of criteria are fed into an aggregation model to make the final rating prediction for the recommendation system. The proposed method outperforms several baselines on the TripAdvisor and BeerAdvocate datasets.
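The sentence-to-criterion assignment described above can be sketched as follows. This is a minimal illustration, not the project's code: the embeddings are toy vectors standing in for real sentence embeddings, and the function name `assign_criterion`, the criterion names, and the threshold value of 0.5 are assumptions.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def assign_criterion(sentence_vec, criteria_vecs, threshold=0.5):
    """Return (criterion_name, score) for the most similar predefined criterion,
    or ("undefined", score) when the best score falls below the threshold."""
    best_name, best_score = None, -1.0
    for name, vec in criteria_vecs.items():
        score = cosine_similarity(sentence_vec, vec)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return "undefined", best_score
    return best_name, best_score


# Toy embeddings (hypothetical): each criterion occupies its own axis.
criteria = {"cleanliness": [1.0, 0.0, 0.0], "location": [0.0, 1.0, 0.0]}
label_a, _ = assign_criterion([0.9, 0.1, 0.0], criteria)  # close to "cleanliness"
label_b, _ = assign_criterion([0.0, 0.0, 1.0], criteria)  # matches no criterion
```

In practice the threshold would be tuned on held-out data, since it controls how many sentences fall into the undefined criterion.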

Accommodation Review Ranking for Tourism Recommendation

This project was our entry to the RecTour Challenge, offered by Booking.com as part of a workshop at the RecSys 2024 conference in Italy. We were challenged to rank accommodation reviews based on user preferences. Our team placed second in the competition and published a challenge paper at RecSys 2024. We propose an accommodation review ranking methodology that creates user and item profiles from the available features. The user and item profiles are encoded with a sentence transformer for feature extraction, and reviews are then ranked by the similarity between user profiles and item features. Ranking performance is assessed with MRR@10 (Mean Reciprocal Rank) and Precision@10. Our results demonstrate that, beyond helpfulness votes, leveraging additional features (e.g., accommodation type, review title, and positive aspects of reviews) significantly improves performance.
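For readers unfamiliar with the two metrics, here is a minimal sketch of how MRR@10 and Precision@10 are commonly computed. It uses the standard textbook definitions; the challenge's official evaluation script may differ in details, and the function names and toy review IDs are assumptions.

```python
def reciprocal_rank_at_k(ranked_ids, relevant_ids, k=10):
    """1 / rank of the first relevant item within the top k, or 0.0 if none appears."""
    for rank, item in enumerate(ranked_ids[:k], start=1):
        if item in relevant_ids:
            return 1.0 / rank
    return 0.0


def mrr_at_k(all_ranked, all_relevant, k=10):
    """Mean reciprocal rank over a collection of queries (here, users)."""
    if not all_ranked:
        return 0.0
    total = sum(
        reciprocal_rank_at_k(ranked, relevant, k)
        for ranked, relevant in zip(all_ranked, all_relevant)
    )
    return total / len(all_ranked)


def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k positions occupied by relevant items."""
    hits = sum(1 for item in ranked_ids[:k] if item in relevant_ids)
    return hits / k


# Toy example with hypothetical review IDs for two users.
rankings = [["r3", "r1", "r2"], ["r7", "r5"]]
relevant = [{"r1"}, {"r7"}]
score = mrr_at_k(rankings, relevant, k=10)
```

Here the first user's relevant review sits at rank 2 and the second user's at rank 1, so the mean reciprocal rank is the average of 0.5 and 1.0.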

Project Link

Source Code