Much of China’s most promising technical research struggles to connect with international audiences—not because of language, but because of how it’s written.

I’ve lately been reading a lot of tech-focused research from various Chinese universities, ever since DeepSeek AI became a thing. What struck me early on was DeepSeek’s claim to superior language capabilities in both Chinese and English — particularly for users whose native language isn’t English. That includes me, of course.

My problem is: I’m never quite sure what large sections of this material are actually trying to say. Reading it feels heavy, like trudging through molasses. I need a coffee every two minutes just to recover and take a deep breath.

It’s written in a dense, almost impenetrable style. Sentences begin in acronym soup, dive into unexplained methods, and end with claims that are either highly abstract or just... random. Often it feels like you need a specialized degree just to parse the sentences.

I don’t consider myself the dumbest person around — but I often find it genuinely difficult to even understand the basics of what I’m reading. That means rereading the same paragraph five times, only to realize: I still don’t know what they did, what they found, or why it matters.

And yet, China’s researchers are no less smart. Educated. Often rigorous. So how did we end up in a situation where global talent writes things the global audience can’t absorb?

I assume the fact they’re publishing in English and distributing via Western platforms means they want to be heard. But it’s not working (at least for me).

At this point, I can’t help but wonder if flooding the internet with incomprehensible research papers is China’s secret cyber strategy to overwhelm the West’s cognitive bandwidth. (Just kidding. Probably.)

古德云:「守口如瓶,防意如城。」
眾生的嘴巴也容易造惡業。時時刻刻檢討、反省自己,
不要起貪心、瞋心、癡心、慢心、疑心,不要生邪見,這就是防意如城。

(An old sage said: “Guard your mouth like a sealed bottle; defend your mind like a fortress.” Our mouths easily create bad karma. Examine and reflect on yourself at every moment; do not give rise to greed, anger, delusion, arrogance, or doubt, and do not form wrong views. That is defending the mind like a fortress.)

My fortress is under siege.

To give you an example:

‘Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media’ comes from researchers at The Hong Kong University of Science and Technology — though not in Hong Kong proper, but in Guangzhou, where HKUST runs another campus in collaboration with the Guangzhou municipal government. It’s available at https://arxiv.org/abs/2412.18148v2.

And the entire paper is written in this style:

“Shapley Value is originally introduced in [49] and recently apply to machine learning interpretation. It quantifies the impact of each feature by perturbing the input value and observing the contributions in the prediction. We follow [48] for implementation.”

[49] is a reference (and so is [48]), but I’ll only torture you with this once:

[49] Lloyd S. Shapley. A value for n-person games. Contri-
bution to the Theory of Games, 2, 1953. 14

Contri-
bution?

I mean, okay. Let’s not dwell on it and move on.

Actually, no. Such a line break in "Contri-
bution" is typically a sign that it’s copied straight out of a bad PDF-to-text conversion, where the word was hyphenated at the line break in the original typeset version. And it also means they didn’t engage with their sources.
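Incidentally, this artifact is trivial to clean up, which makes leaving it in even less forgivable. A minimal sketch (the regex is a simplification and can over-join genuinely hyphenated compounds):

```python
import re

# Text as it often comes out of a bad PDF-to-text conversion: the word
# was hyphenated at a line break in the original typeset version.
raw = "introduced as a Contri-\nbution to the Theory of Games"

# Re-join words that were split by a hyphen at a line break.
clean = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
print(clean)  # "introduced as a Contribution to the Theory of Games"
```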

Since when do we cite with nothing but bracketed numbers in the text? And why stop there — let’s also randomly drop letters and write in 00100110?

“14” is not the page number in the cited work where you could find whatever they’re referencing (in this case, the so-called Shapley Value). Instead, it’s the page in their own article where they use this reference. Since Shapley introduced this concept in 1953, it’s unlikely he called it that himself, don’t you think? Hence, you can’t quote it like this.

Back to “14” — I have no idea why anyone would go to the references and use that number to return to the main text. In their defense, this nonsense didn’t originate in Chinese academic circles. I assume it comes from Cornell University, which runs arXiv (where they published it). Curse you, Cornell, for this!

In other words, normally a reference like this would mean page 14 in Shapley’s work — but here, it means their own page 14.

This is not for the reader’s benefit. It helps automated tools or indexing systems (like Semantic Scholar) track where references appear. But for a human reader? 💥 Useless.
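My best guess at where that trailing page number comes from: it looks like hyperref’s page back-reference feature, which many LaTeX templates switch on. A sketch of the kind of preamble that produces it (an assumption on my part; I haven’t seen the paper’s actual source):

```latex
% hyperref's pagebackref option appends, to every bibliography entry,
% the page number(s) of the text that cites it -- which is exactly the
% stray "14" after the Shapley entry.
\usepackage[pagebackref=true]{hyperref}
```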

What’s the “2”? I guess it’s volume 2. Well, I don’t have to guess — because I know:

Lloyd S. Shapley (1953), A Value for n-Person Games. In: Kuhn, H. and Tucker, A., Eds., Contributions to the Theory of Games II, Princeton University Press, Princeton, pp. 307–317.

Do you know why I’m going on about this?

Because — why on earth does this paper cite something from 1953 when all the other references are from 2023 and 2024? Not only that — they even reference documents as late as February 2025, even though the article was published on 24 December 2024 at 04:04:54 UTC (that’s noon in Guangzhou). Santa came early in the form of a PDF. Is this the 21st-century version of giving misbehaving children (researchers) lumps of coal? Probably not, since Christmas isn’t a public holiday in China.

Here’s the actual problem:

This is a paper about AI-generated text detection on social media.

Why are we citing 1953 game theory?

Shapley values are sometimes used in AI explainability — but they don’t use his original work here. What they use is something else. This is just name-dropping. It doesn’t add credibility, because they don’t make a clear argument.

They say they follow [48], which is:

M. Scott, Lee Su-In, et al. A unified approach to interpret- ing model predictions. Advances in Neural Information Processing Systems.

“interpret-
ing” — again!

Also, who is “M. Scott”? And who are the mysterious “et al”? The only valid reference I can find is Scott M. Lundberg and Su-In Lee — writing something titled A Unified Approach to Interpreting Model Predictions. Guess what?

Scott and Su-In did it by themselves. So — no et al.

In serious academic work, this is not acceptable. It’s the kind of thing you might forgive from an undergrad on a tight deadline — but not from a published academic paper aiming to measure global AI influence.

But we are not done yet.

“It quantifies the impact of each feature by perturbing the input value and observing the contributions in the prediction.”

What does that even mean? “Observing the contributions in the prediction.” These are English words. I know the meaning of each word. But I cannot understand this.

“In the prediction” implies that the contributions are inside the prediction — like hidden ingredients in a dish. But a prediction is just an output — there’s nothing “in” it to observe. It violates natural semantic flow. It confuses subject-object relationships.

I had to ask for help (from AI), and after several attempts at reverse-engineering the sentence, this is where we landed:

“We apply Shapley values, as defined by Lundberg and Lee, to estimate how much each feature (e.g., length, sentiment, complexity) changes the model’s output — by perturbing inputs and measuring the resulting change in prediction.”

That’s clear and understandable. This sentence is not the exception — it is typical of the entire body of work. And based on my reading so far, it’s characteristic of the writing style adopted in much Chinese academic tech research.
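To make the clarified sentence concrete, here is a toy computation of exact Shapley values. Everything in it is illustrative: the linear “model,” the feature names, and the weights are my assumptions, not the paper’s setup (the paper delegates to the implementation of Lundberg and Lee).

```python
from itertools import combinations
from math import factorial

# Toy "model": a weighted sum of three made-up features.
def model(x):
    return 3.0 * x["length"] + 2.0 * x["sentiment"] + 1.0 * x["complexity"]

def shapley_values(model, instance, baseline):
    """Exact Shapley values: each feature's average marginal
    contribution over all subsets of the other features, with absent
    features set to their baseline value (the "perturbing")."""
    features = list(instance)
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for subset in combinations(others, k):
                present = set(subset)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_f = {g: instance[g] if g in present or g == f else baseline[g]
                          for g in features}
                without_f = {g: instance[g] if g in present else baseline[g]
                             for g in features}
                total += weight * (model(with_f) - model(without_f))
        result[f] = total
    return result

instance = {"length": 1.0, "sentiment": 1.0, "complexity": 1.0}
baseline = {"length": 0.0, "sentiment": 0.0, "complexity": 0.0}
print(shapley_values(model, instance, baseline))
# For a linear model, each Shapley value equals weight * (value - baseline),
# and the values sum to model(instance) - model(baseline).
```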

And we’re still not done.
The Monkey King was warned in Journey to the West about the horrors awaiting him in the Ten Courts of Hell.
So you, dear reader, deserve a warning too:

“此处血海滔天,莫要多看。”(Here the sea of blood floods the sky — best not look too long.)

You were warned.
But if you’re as nosy as I am… then come along.

The following content is NC-17. The code needs no explanation — I assume you know what it stands for.


In the revised February version, the authors added three more “unseen” LLMs to test the robustness of their detector (OSM-Det). These were:

  • GLM-4-Flash
  • Gemini 1.5-Flash
  • Gemini 2.0-Flash

They report new detection results for these models in Appendix Table A8, showing how well their detector performs on text generated by them.

But… What They Didn’t Change:

Despite adding new LLMs, the main findings — especially the AI Attribution Rates (AAR) for social media platforms — remained exactly the same:

  • Medium: 1.77% → 37.03%
  • Quora: 2.06% → 38.95%
  • Reddit: 1.31% → 2.45%

These numbers are identical in both versions.

So What’s the Problem?

New LLMs ≠ new data samples?
One possibility: they added new models but didn’t re-run the platform-wide detection — which would undermine the whole claim that they’re measuring real-world AI usage accurately.

OSM-Det might have been trained on different models than those used for social media estimation.
This is hinted at but never clarified. If the AAR rates were estimated using a fixed version of OSM-Det trained only on AIGTBench models, then the addition of new LLMs in Table A8 may have no effect on the overall AAR — but they don’t explain this clearly.

No changelog, no disclaimer.
There’s no indication of whether they re-ran the social media analysis with the updated model evaluation, or whether those results are frozen from the earlier version.
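If the AAR really is computed by a frozen detector over a frozen sample, the unchanged numbers are mechanically inevitable. A hypothetical sketch (function, detector, and posts are my stand-ins, not the paper’s OSM-Det or data):

```python
# A platform-level AI Attribution Rate (AAR) is, in essence, the
# fraction of posts that a *fixed* detector flags as AI-generated.
def attribution_rate(detector, posts):
    flagged = sum(1 for post in posts if detector(post))
    return flagged / len(posts)

# A frozen detector over a frozen sample yields the same AAR no matter
# how many extra "unseen" LLMs are later evaluated in an appendix:
# adding models to the evaluation changes nothing unless the detector
# or the post sample changes.
frozen_detector = lambda post: "as an ai language model" in post.lower()
posts = [
    "Great recipe, thanks!",
    "As an AI language model, I cannot browse the web.",
    "lol same",
]
print(attribution_rate(frozen_detector, posts))  # 1 of 3 posts flagged
```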

Their platform-level AI usage numbers are presented as authoritative — yet despite a major update to the models considered “unseen,” nothing changed in the real-world detection results.

This Calls Into Question:

  • The sensitivity of their detector to different LLM outputs
  • Whether they updated their dataset or not
  • And most importantly: how meaningful their reported AARs actually are

It’s a real methodological red flag — and disqualifies the entire work from being treated as serious science.

Unfortunately, this is more common than we’d like.
I only checked one footnote because of 1953.
Who knows what other Leichen im Keller they have. (This German expression means skeletons in the closet, but we hide corpses instead of skeletons — and we use basements, not closets. Which makes perfect sense, since many American homes don’t have basements… but we do in Germany.)

A Few Closing Comments

The writing style that leads to this dense, formal, and structurally confusing academic tone is counterproductive. Some recurring issues:

  • A tendency toward hyper-formality, passive constructions, and abstract language
  • Key terms introduced without context, or defined after they’ve already been used
  • Heavy use of acronyms and self-invented terminology (e.g. “OSM-Det,” “SM-D,” “AAR”) with no intuitive grounding
  • Sentences that pile up without advancing understanding — the words are there, but the meaning isn’t

Many of these papers become unreadable to non-specialists — and, frankly, to many specialists as well. You can see the effect in how often (or how little) such work is actually cited.

Western readers expect a more intuitive flow:
intro → question → method → findings → discussion → limitations

In contrast, many Chinese-authored papers start in the middle, bury the context, or overload the reader with formalism. That needs to stop. If this is what their foreign English teachers taught them — it’s time to change horses.

Personally, I think it’s a pity.
I believe the authors put genuine effort into providing an English version for the benefit of people like me — not for themselves. And yet, I can’t get much out of it.

In the Chinese academic and cultural context, clarity is not always a virtue — harmony and deference often are.
But science needs clarity — in fact, pursuing it ruthlessly is the only way to make real progress.

And if Chinese research is to influence thinking outside China, it needs to learn to speak in a voice the world can hear.

That’s my 2 cents of wisdom.

And if you’re wondering what some Western readers feel when trying to understand dense AI papers from China, here’s how that experience might translate back into Chinese:

于理解之维度起始之端,常困于不解之定义,然本无端也。以比拟之名论实,或失语于象外而非之中。

Feels like it means something, right? But what?
Welcome to my world. 😅

And I do think this article should be used to teach at HKUST:

The Werner Paper – How to Speak English Like a German

Yup, that’s good!

PS:

  • “于理解之维度…” = "At the starting edge of understanding dimensions…"
  • “常困于不解之定义” = "Often trapped in undefined definitions…"
  • “然本无端也” = "Yet perhaps there was never a beginning…"
  • “以比拟之名论实” = "Using analogy to debate reality…"
  • “或失语于象外而非之中” = "…one may lose language outside the symbol, not within."


This post was first published on Substack 29 March 2025.