TY - CONF
A1 - Isawasan, Pradeep
A1 - Asmawi, Muhammad Akmal Hakim Ahmad
A1 - Ong, Song Quan
A1 - Ooi, Boonyaik Yaik
A1 - Savita, K. S.
PB - Institute of Electrical and Electronics Engineers Inc.
N2 - Short-form comments on TikTok's Beauty & Personal Care videos are rich yet noisy signals of consumer sentiment. This study deploys a transformer-based pipeline that couples GPT-derived sentence embeddings with BERTopic to surface salient discussion themes. We scraped 3,912 videos from the 20 highest-revenue beauty influencers and retrieved 34,597 engagement-ranked comments posted between January and August 2024. After a GPT-powered normalisation step that expands slang, translates Malay abbreviations, and converts emoji to text, comments were embedded with OpenAI's text-embedding-3-small model, reduced via UMAP, clustered using HDBSCAN, and distilled into topics with BERTopic. Topic quality was evaluated with four coherence metrics (c_v, u_mass, c_uci, c_npmi) to ensure semantic consistency. The model revealed two coherent, non-overlapping themes: product-usage experiences (e.g., frequency of use, perceived results) and product-feature commentary (e.g., scent, packaging, variants). A c_v score of 0.620 indicates strong interpretability despite the brevity and informality of TikTok discourse. These findings show that embedding-based topic modeling can unearth actionable insights: influencers should foreground authentic usage outcomes, while brands can boost engagement by highlighting sensory cues in visual storytelling. The study demonstrates the viability of LLM-enhanced analytics for short social media text and provides a replicable framework for future TikTok commerce research, noting limitations and opportunities for multi-category, cross-regional, and multimodal extensions. © 2025 IEEE.
SP - 444
UR - https://www.scopus.com/inward/record.uri?eid=2-s2.0-105023663181&doi=10.1109%2FAiDAS67696.2025.11213658&partnerID=40&md5=29a5999867ca6a8f374e658707f42355
SN - 9798331586034
Y1 - 2025///
ID - scholars20415
N1 - Cited by: 0
EP - 448
AV - none
KW - Embeddings; Beauty care; Beauty discourse; BERTopic; Comprehensive analysis; GPT; Noisy signals; Personal care; TikTok; Topic Modeling; Semantics
TI - Comprehensive Analysis of Beauty Community Discourse on TikTok Through GPT Embeddings and BERTopic Modeling
ER -