Wikipedia is old enough to rent a car

AI companies are ponying up on Wikipedia's 25th birthday, but no one knows what's in store for the next quarter-century of free knowledge.

Wikipedia's birthday

Wikipedia mascot by BaduFerreira, CC BY 4.0, via Wikimedia Commons

🔔 Wiki Briefing

A quarter-century of free knowledge: Wikipedia turns 25

“The world has changed a lot since 2001, and will change even faster in the years to come. It’s more important than ever that we hold onto the value of high-quality, human knowledge and the perspective that it brings.”

So said Wikipedia cofounder Jimmy Wales at the platform’s 25th birthday celebration last week. In the last quarter-century, the free encyclopedia has gone from a fringe nerd community that gets shoutouts in “Weird Al” Yankovic songs to a central knowledge hub, as essential to training AI models as it is to ending fights when games of Six Degrees of Kevin Bacon go sideways.

Though the Wikimedia Foundation’s coffers look decidedly plump, the big anniversary still prompted plenty of people to ponder Wikipedia’s future even as they pontificated on its past. The Financial Times, Wired, and others set out to answer the question no one has: what happens next?

LLMs may be siphoning off readers with their generated topic summaries, or maybe not; it seems to depend on who you ask. Even so, there’s no question that AI is placing an outsized strain on Wikipedia’s servers, or that editors kindly but firmly rejected using LLMs in any capacity to generate content on English Wikipedia.

What the future holds for the site is anyone’s guess, but it did get a little birthday gift from most of the companies breaking down its virtual doors. The Foundation announced Thursday that it had signed Meta, Amazon, Microsoft, and others to deals to use its Enterprise product instead of scraping the public servers. That should significantly ease the servers’ burden and, instead of costing the Foundation money, help secure Wikipedia financially for the future.

Whether it gets another 25 years, only time will tell.


📰 In the News

Ditching the middleman

One thing has remained consistent over Wikipedia’s 25 years: its search functionality. Querying Wikipedia may still work the way it did on the Web when the site was born, but many people do not honor the old ways. With today’s Gen AI search results, conversational, easy-to-digest answers are very much in vogue.

The Wikimedia Foundation-led Information Retrieval Working Group aims to find out if there is a way to keep Wikipedia from getting cut out of those conversations. Google’s Q&A-style results are a type of semantic search, which makes use of machine learning algorithms that have developed an understanding of how language works and can resolve queries phrased like a question.
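For the curious, here is a minimal sketch of the difference between matching keywords and matching meaning. It is a toy, not how Google or the Working Group actually does it: the three-number "embeddings" are hand-picked for illustration (real systems use vectors with hundreds of dimensions produced by a trained model), and the function names are made up for this example.

```python
import math

# Toy, hand-picked vectors standing in for learned embeddings (assumption:
# a real system would get these from a language model, not from a person).
ARTICLES = {
    "Kevin Bacon": [0.9, 0.1, 0.0],
    "Boston Tea Party": [0.0, 0.9, 0.2],
    "Extravehicular activity": [0.1, 0.0, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; closer to 1 means more alike."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def keyword_search(query, titles):
    """Old-style search: return titles that share a word with the query."""
    words = set(query.lower().split())
    return [t for t in titles if words & set(t.lower().split())]


def semantic_search(query_vector, articles):
    """Rank titles by how close their vectors are to the query's vector."""
    return sorted(
        articles,
        key=lambda title: cosine_similarity(query_vector, articles[title]),
        reverse=True,
    )


# Hypothetical question and its hand-picked vector (again, an assumption).
QUERY_TEXT = "who has walked in space?"
QUERY_VECTOR = [0.05, 0.05, 0.9]

keyword_hits = keyword_search(QUERY_TEXT, ARTICLES)       # no shared words
semantic_ranking = semantic_search(QUERY_VECTOR, ARTICLES)  # closest vector first
```

The keyword search comes up empty because no title shares a word with the question, while the vector comparison still puts the spacewalk article on top. That gap is roughly what semantic search is meant to close.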

While major search engine companies have built up this capability, Wikipedia has lagged behind. Those search engines are so effective that most people go back to Google to find their next Wikipedia article instead of using internal “Wikilinks.” Getting people to stick around on Wikipedia (and hopefully become editors themselves) is critical, and part of that involves making it easier to find information. Without that, the site spirals into outdated irrelevance and loses its ubiquity, and usefulness, to LLMs.

Semantic search is one way the team is looking to combat this ping-pong effect. The effort is still in its infancy and likely years away from any kind of serious implementation, if other initiatives are any indicator. Still, it builds on other recent efforts, like turning Wikidata into a vector database. AI is coming for Wikipedia in one way or another, it seems.


📚 Research Report

AI translations dooming small languages

Have you ever played a game of Telephone where phrases devolve into nonsensical blather? If so, you know all too well the struggle some smaller-language Wikipedias are facing. Volunteer editors working on Wikipedias in four African languages said between 40% and 60% of the articles in their editions were poor machine translations. The news is even worse for Greenlandic, where AI translation and a lack of native speakers got bad enough for the community to shut down its entire language edition.

MIT Technology Review published a report in September about the “doom spiral” phenomenon for less common languages. The spiral starts when models scrape Wikipedia to “learn” new languages for AI translation tools. Non-fluent editors then use these tools and incorrect translations are put on Wikipedia, expanding the bad knowledge base. The model scrapes again and the spiral swirls, degrading its understanding of those languages even further while ever more articles are (incorrectly) translated.

At least for now, this is one problem AI doesn’t seem ready to solve.


🧩 Wikipedia Facts

Wikipedia has been edited from outer space! The first article to have been edited beyond the atmosphere? Fittingly, the list of spacewalks from 2015 to 2024, updated by astronaut Christina Koch.


💡 Tips & Tricks

You can now make your own conspiracy board-style page to track your Wikipedia rabbit holes, complete with digital yarn. Wikiboard allows users to browse Wikipedia from its interface, tracking the path they take through the encyclopedia to connect Kermit to the Boston Tea Party. Try it here.
