Five markets instead of one
The article “Language and Geography of Visibility” examines the mechanisms through which language affects visibility. Here we look from the other direction — from data to conclusion: what exactly happens to a specific brand and its competitors when the language is switched.
We chose Notion — a globally recognizable productivity product. Deliberately: we needed a brand the model clearly knows well, so that we could rule out the explanation of “there just isn’t enough data.” Five runs, five languages, one model, one scenario corpus.
Notion’s own score fluctuated moderately: from 62.9 in French to 75.7 in German, a spread of 12.8 points. The confidence intervals of the runs partially overlap. If we had stopped there, the conclusion would have been calm: “small fluctuations, possibly model noise.”
But then we looked at the competitors.
The competitor matrix: the core evidence
| Brand | RU | EN | ES | FR | DE | Spread |
|---|---|---|---|---|---|---|
| Notion (target) | 71.2 | 68.8 | 69.1 | 62.9 | 75.7 | 12.8 |
| Slack | 0.0 | 51.0 | 53.8 | 54.3 | 54.9 | 54.9 |
| Monday.com | 47.4 | 30.5 | 29.0 | 7.8 | 13.0 | 39.5 |
| Asana | 70.1 | 52.6 | 51.1 | 39.7 | 59.1 | 30.4 |
| Microsoft Copilot | 36.2 | 39.9 | 42.8 | 51.1 | 24.8 | 26.3 |
| ClickUp | 67.3 | 59.6 | 63.1 | 54.5 | 62.7 | 12.8 |
| Coda | 46.0 | 46.7 | 38.4 | 41.1 | 43.6 | 8.2 |
| Airtable | 33.5 | 37.5 | 30.5 | 28.3 | 40.4 | 12.1 |
| Confluence | 22.3 | 13.2 | 16.6 | 24.0 | 20.6 | 10.8 |
Notion’s spread is 12.8. Slack’s is 54.9. Monday.com’s is 39.5. Asana’s is 30.4. That is five different competitive landscapes compressed into one table.
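The Spread column is simply the gap between a brand's best and worst language. A minimal sketch recomputing it from the rounded scores in the table (the published spreads were presumably computed on unrounded data, so tiny rounding differences of 0.1 are possible for a couple of rows):

```python
# Rounded per-language scores (RU, EN, ES, FR, DE) from the competitor matrix.
scores = {
    "Notion":            [71.2, 68.8, 69.1, 62.9, 75.7],
    "Slack":             [0.0, 51.0, 53.8, 54.3, 54.9],
    "Monday.com":        [47.4, 30.5, 29.0, 7.8, 13.0],
    "Asana":             [70.1, 52.6, 51.1, 39.7, 59.1],
    "Microsoft Copilot": [36.2, 39.9, 42.8, 51.1, 24.8],
    "ClickUp":           [67.3, 59.6, 63.1, 54.5, 62.7],
    "Coda":              [46.0, 46.7, 38.4, 41.1, 43.6],
    "Airtable":          [33.5, 37.5, 30.5, 28.3, 40.4],
    "Confluence":        [22.3, 13.2, 16.6, 24.0, 20.6],
}

# Spread = best language minus worst language for each brand.
spread = {brand: round(max(v) - min(v), 1) for brand, v in scores.items()}

# Sort descending: the brands whose competitive position is least
# portable across languages come out on top.
for brand, s in sorted(spread.items(), key=lambda kv: -kv[1]):
    print(f"{brand:18s} {s:5.1f}")
```

Sorting by spread rather than by score is the point: the target brand lands near the bottom of this ranking while Slack and Monday.com land on top.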
Three patterns we observed
Binary disappearance: Slack
In Russian, Slack receives a score of 0.0 — the model does not mention it at all in the context of productivity and workspace tools. In the other four languages, the result is stable: 51–55 points, with a spread of just 3.8 points. That kind of stability across four languages, combined with a complete zero in the fifth, is a strong argument that this is a stable property of the Russian-language field rather than a random outlier.
The explanation most likely lies in the training data: Slack is actively discussed in English-, French-, and German-language sources as a team collaboration tool. In Russian-language sources, it is discussed hardly at all. The model did not lose knowledge of Slack; in this context, it never acquired it in the first place.
A gradient of disappearance: Monday.com
Monday.com shows a smooth decline from 47.4 in Russian to 7.8 in French. This is a third pattern, distinct both from Notion’s stability and from Slack’s binary switch. The brand seems to melt as it moves between language fields — preserving its presence, but losing weight.
Inversion: Microsoft Copilot
Where Notion is strongest (German — 75.7), Copilot is weakest (24.8). In French, the picture reverses: Notion is at 62.9, Copilot at 51.1. The two brands seem to sit on opposite ends of a seesaw, and language determines which one ends up higher. Based on our observations, this may be related to Microsoft’s activity in French-speaking European markets — but the data is not sufficient to make that claim with confidence.
Knowledge is stable; recommendation is not
An independent analysis of our data revealed a pattern that may matter even more than the competitor matrix itself.
When the brand is already named in the prompt (diagnostic mode), the model answers about it with the same stability across all languages: 73–79 points, with a coefficient of variation of 3.7%. The model knows Notion equally well in Russian, French, and German.
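The coefficient of variation here is the sample standard deviation expressed as a percentage of the mean. The five diagnostic-mode scores themselves are not published, only the 73–79 range and the 3.7% figure, so the values below are invented to match those two facts and are purely illustrative:

```python
from statistics import mean, stdev

def coefficient_of_variation(xs):
    """Sample standard deviation as a percentage of the mean."""
    return 100 * stdev(xs) / mean(xs)

# HYPOTHETICAL diagnostic-mode scores for the five language runs:
# chosen to fall in the reported 73-79 range and reproduce CV = 3.7%,
# not taken from the actual run data.
diagnostic = [73.0, 79.0, 74.0, 79.0, 75.0]

print(f"CV = {coefficient_of_variation(diagnostic):.1f}%")
```

A CV under roughly 5% on scores of this magnitude is comfortably within run-to-run noise, which is what supports the "knowledge is stable" reading.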
The divergence begins when the user has not yet named the brand. Average position in the answer, whether it makes the top three, citation of the notion.so domain — all of this depends heavily on language. In Russian, notion.so is cited in 24.5% of answers; in German, in 21.4%; in French, in 0%.
For a brand, this leads to an uncomfortable conclusion: the model’s knowledge of you is a necessary but insufficient condition. The real question is whether you make the shortlist before the user has spoken your name. The answer to that question depends on language.
Three mechanisms we hypothesize
We see three channels through which language reshapes the competitive field. All three are hypotheses supported by the data from these runs, but not yet experimentally validated.
The first is asymmetry in training corpora. The model was trained on texts in which different brands are discussed with different frequency across different languages. Russian-language texts about productivity barely mention Slack; English-language texts mention it constantly.
The second is different web sources. In web mode, the model searches in the language of the query and finds different reviews, comparisons, and rankings — with different brand mixes. French-language search returns French-language sources in which Notion is known, but notion.so is not cited.
The third is different associative category graphs. In each language, the model builds its own map of the “productivity” category. In Russian, that map is Notion, Asana, ClickUp, Monday.com. In French, it is Notion, Slack, Microsoft Copilot, ClickUp. The cast of players differs, and that determines who makes it into the recommendation.
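One way to quantify how much the cast of players changes between two language fields is the Jaccard overlap of the shortlists. A sketch using the example RU and FR sets from the paragraph above (illustrative category maps, not the full run output):

```python
# Example category shortlists per language, as sketched in the text.
shortlists = {
    "RU": {"Notion", "Asana", "ClickUp", "Monday.com"},
    "FR": {"Notion", "Slack", "Microsoft Copilot", "ClickUp"},
}

def jaccard(a, b):
    """Shortlist overlap: |A intersect B| / |A union B|, from 0 to 1."""
    return len(a & b) / len(a | b)

overlap = jaccard(shortlists["RU"], shortlists["FR"])
print(f"RU/FR shortlist overlap: {overlap:.2f}")
```

An overlap of 0.33 means only two of six brands appear in both maps; an identical competitive field in both languages would score 1.0.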
What this means in practice
For a category-leading brand, VLF is more a strategic task than a crisis. Your own score fluctuates moderately, but the competitors you are fighting in one language may be different in another. A strategy built around one set of competitors risks becoming irrelevant in a neighboring language market.
For a brand that is not the dominant player, the situation is harsher. Monday.com loses 40 points in the shift from Russian to French. Slack disappears entirely in Russian. If your brand holds second or third place, VLF is a direct business risk: visibility earned in one language does not automatically transfer to another.
The practical recommendation is simple: run a separate study in every target market language. Compare not only your own score, but also the composition of the competitive field. A visibility growth strategy must account for the specific competitors that exist in each language — because across languages, those may be different companies.
Methodological notes
The data comes from five runs of one brand (Notion) on GPT-5.4. All runs used the standard AI100 corpus of 200 scenarios. Two runs (RU and FR) were conducted on April 2, and three (EN, DE, ES) on April 3, 2026. Differences between days (the day effect) were not separated from the language effect.
The confidence intervals of the final scores partially overlap. A Cochran's Q test puts the probability that the entire spread is explained by model noise at 4–8%, right on the edge of statistical significance. But the structural patterns (Slack's stability across four languages with a zero in the fifth, the Monday.com gradient, the Copilot inversion) are poorly explained by stochasticity.
The main limitation is clear: one brand, one model, one category, one run per language. Proper validation requires repeated runs (at least 5 per language) and tests on other brands and models. We call this observation VLF and consider it well grounded enough for publication, but not sufficiently validated for final conclusions.
Conclusions
Changing the language of the prompts reshapes a brand’s competitive environment: some competitors appear, others disappear, and others radically change position. At the same time, the model’s diagnostic knowledge of the brand remains stable — what changes is specifically the recommendation layer.
One brand, one model, one run per language is not enough to draw conclusions about the scale of the effect in other categories and on other models. The exact boundary between the language effect and stochastic model noise has not yet been established.
An international brand needs to test visibility separately in every target market language. The result of an English-language run does not transfer to other languages — especially for brands that are not unambiguous category leaders.
Related materials
Visibility through the lens of language and geography
Why the same brand looks different in AI answers across different languages and countries — and what practical consequences follow.
The “answer bubble”: why the same brand looks different in ChatGPT, Google, Copilot, and other systems
Why there is no single AI visibility: the same brand can look noticeably different across ChatGPT, Google AI Overviews, Copilot, and Perplexity.
How this connects to AI100 in practice
If you need something more specific than a general overview, AI100 can test how a model sees your company in neutral decision scenarios, which competitors outrank you, and which interventions are most likely to improve visibility.
See the sample report