Semaglutide on TikTok: Popularity Doesn't Track Quality
Primary source: PUBMED 42077173
The central finding of a new cross-platform study (PMID: 42077173) is both unsurprising and worth stating plainly: on TikTok and Bilibili, the semaglutide videos that draw the most engagement are not the ones that score highest on quality. The correlation between likes and quality scores is r = 0.151. That is barely above noise. My position is that this is not a content problem. It is a ranking problem, and the distinction matters for anyone thinking about how to fix it.
What We Already Knew
Semaglutide (Ozempic, Wegovy) has become one of the most discussed drugs on social media in recent memory. The GLP-1 receptor agonist’s clinical weight-loss efficacy, established in trials including STEP 1 (NCT03548935, Novo Nordisk-funded, published as PMID: 33567185), generated genuine public interest. That interest predictably migrated to short-video platforms, where format constraints make nuance structurally difficult.
Prior research on health information quality on social media had already established that engagement and accuracy diverge. Studies on COVID-19 content (PMID: 32323939) and cancer-related videos (PMID: 31710518) documented the same pattern: misinformation travels faster and farther than corrections. The question for semaglutide specifically was whether platform type mattered and whether creator category predicted reliability.
What the Study Actually Found
Zeng and colleagues examined 200 semaglutide-related videos, split between TikTok and Bilibili, applying three validated assessment instruments: the JAMA benchmark criteria, the Global Quality Scale, and the DISCERN tool. The dataset is modest, but the methodology is defensible for a descriptive cross-sectional study of this kind.
The headline findings, with specifics:
Platform parity. No significant difference emerged between TikTok and Bilibili on overall quality scores. Both averaged moderate quality. The hypothesis that one platform’s content ecosystem would produce better health information than the other does not hold here.
Creator category matters. Non-professional organizations scored higher on JAMA criteria than individual creators. Medical information videos substantially outperformed personal experience content. This is the clearest actionable signal in the data: the format and source type of a video are better proxies for reliability than anything the engagement metrics tell you.
Engagement is a poor proxy. Likes correlated with quality at r = 0.151. Longer videos showed higher reliability scores but did not attract more engagement. In other words, the content that takes time to be accurate is systematically underrewarded by the platforms’ own ranking signals.
Overall quality was moderate, not catastrophically poor. The study does not document a landscape of pure misinformation. What it documents is a gap between what the platform surfaces and what earns quality marks.
My Reading of the Evidence
The r = 0.151 correlation deserves to be the number this study is remembered for. It is weak enough that you could almost describe engagement and quality as independent variables. That is not a quirk of semaglutide or of these two platforms. It is how the attention economy works when the optimization target is watch time and sharing, not accuracy.
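To put a number on how weak that is, here is a quick back-of-envelope sketch. It assumes the coefficient was computed across all 200 videos and can be read like an ordinary correlation coefficient; the paper's exact test and sample for this statistic are not restated here, so treat `n` as my assumption.

```python
import math

r = 0.151  # likes-vs-quality correlation reported in the study
n = 200    # ASSUMPTION: the correlation was computed over all 200 videos

# Coefficient of determination: the share of variance the two measures have in common.
r_squared = r ** 2

# Standard t approximation for testing whether r differs from zero,
# with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

print(f"r^2 = {r_squared:.4f} ({r_squared:.1%} of variance shared)")
print(f"t   = {t_stat:.2f} on {n - 2} degrees of freedom")
```

Under those assumptions, likes and quality share roughly 2.3% of their variance, and the test statistic comes out at about 2.15. A correlation of this size can clear a conventional significance threshold and still be close to useless as a guide to which video to trust.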
What I find more interesting than the quality scores themselves is the creator-category finding. Non-professional organizations outperforming individual creators on JAMA criteria suggests that institutional accountability, even lightweight accountability like an organization’s name being attached to content, correlates with something. The JAMA benchmark’s “authorship” and “attribution” criteria reward content that identifies its author’s credentials and cites its sources. Individual creators who are sharing personal weight-loss experiences simply are not building content to that standard, and they have no particular reason to.
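It helps to see how mechanical the JAMA benchmark is. It is a four-item checklist (authorship, attribution, disclosure, currency), conventionally scored one point per criterion met. The sketch below is my illustration of that tally, not the study's rubric: the `VideoAudit` field names and both example videos are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VideoAudit:
    """Hypothetical reviewer checklist for one video (field names are illustrative)."""
    names_author_and_credentials: bool  # authorship
    cites_sources: bool                 # attribution
    discloses_conflicts: bool           # disclosure
    shows_dates: bool                   # currency


def jama_score(audit: VideoAudit) -> int:
    """One point per benchmark criterion met, so the total ranges from 0 to 4."""
    return sum([
        audit.names_author_and_credentials,
        audit.cites_sources,
        audit.discloses_conflicts,
        audit.shows_dates,
    ])


# Two made-up examples, not videos from the study.
personal_story = VideoAudit(False, False, False, True)
org_explainer = VideoAudit(True, True, False, True)

print(jama_score(personal_story), jama_score(org_explainer))
```

A personal-experience video can be entirely honest and still score near the bottom, because nothing in the checklist asks whether the story is true. It asks whether the content is sourced, dated, and attributable.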
The medical information versus personal experience gap is similarly structural. Someone narrating their own semaglutide journey has a different goal than someone synthesizing clinical data. Conflating them as comparable “health information” is where platforms and public health communicators often go wrong.
What This Evidence Does Not Settle
The study is cross-sectional and the sample is 200 videos. That captures a moment, not a trend. Semaglutide content on TikTok is not static. The volume of videos, the creator mix, and the algorithmic weighting of health content all shift. A longitudinal replication would tell us whether the moderate-quality picture is improving or degrading as the compound becomes more culturally embedded.
The DISCERN tool was developed for written materials and has been adapted, but not purpose-built, for short-form video. It may not fully capture the specific misinformation risks of video: visual before/after framing, anecdotal narrative structure, selective disclosure of side effects. A purpose-built instrument for pharmaceutical short-video content would give future studies sharper teeth.
I also want to be precise about what “moderate quality” means and does not mean. The study does not report what proportion of videos contained specific false claims about semaglutide dosing, eligibility, or side-effect profiles. The quality instruments used measure process features (attribution, currency, balance) rather than factual accuracy claim by claim. A video can score reasonably on DISCERN and still contain a claim that would concern an endocrinologist. That gap between process quality and propositional accuracy is the measurement challenge the field has not yet solved.
The Platform Architecture Problem
The authors recommend algorithmic changes to prioritize credentialed creators and evidence-based content. That is the correct prescription. It is also the prescription the field has been writing for a decade, and platforms have not filled it. The reason is not ignorance. Health content moderation at scale is genuinely hard and genuinely expensive, and the engagement signal that currently drives rankings is cheap and self-generating.
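As a toy illustration of what "prioritize evidence-based content" means at the ranking layer, consider a feed that blends an engagement signal with a quality signal. Everything here is hypothetical: the `Video` fields, the 0-to-1 scales, the weights, and the two example videos are mine, and no platform's real ranking function is being described.

```python
from typing import NamedTuple


class Video(NamedTuple):
    title: str
    engagement: float  # hypothetical normalized engagement signal, 0..1
    quality: float     # hypothetical normalized quality score, 0..1


def rank(videos, quality_weight):
    """Sort by a weighted blend; quality_weight=0 gives engagement-only ranking."""
    def blended(v):
        return (1 - quality_weight) * v.engagement + quality_weight * v.quality
    return sorted(videos, key=blended, reverse=True)


# Made-up feed: a popular personal story and a less popular, better-sourced explainer.
feed = [
    Video("my 3-month journey", engagement=0.9, quality=0.3),
    Video("clinic explainer", engagement=0.4, quality=0.9),
]

engagement_only = [v.title for v in rank(feed, quality_weight=0.0)]
blended_order = [v.title for v in rank(feed, quality_weight=0.5)]

print(engagement_only)
print(blended_order)
```

The arithmetic is trivial, which is the point. The hard and expensive part is producing a trustworthy `quality` value for every video at scale, while the `engagement` value arrives for free.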
What would actually shift the picture is one of two things. Either a platform builds, or licenses, a real-time health-claim verification layer with sufficient coverage to demote inaccurate content before it accumulates engagement. Or a regulatory body in a major market imposes a liability structure that makes the current hands-off approach costly enough to change. The EU’s Digital Services Act creates some pressure in this direction, but pharmaceutical-specific enforcement has been limited so far.
Until one of those things happens, the Zeng et al. finding is a baseline, not an anomaly. Semaglutide is the high-interest compound right now. The next GLP-1 with a viral cultural moment will produce the same picture.