Who holds the Holy Grail of cybernetics?

Palantir is a kind of machine Sherlock Holmes: adaptable to specific tasks, expensive, fast, and able to handle data of any volume and kind. Is the moment approaching when soulless machines will take human lives into their hands?

Conversations about artificial intelligence and neural networks have become everyday occurrences, and so has their widespread use by the general public.

AI text models are everywhere: they are pushed into advertising, used for entertainment, and set to replace professions that do not require deep expertise or skill. Yet they are essentially incapable of solving serious tasks; in practice their functionality is limited to generating idle dialogue or solving school-level math problems.

Text models are limited, and their development prospects are rather vague. By contrast, there exists an AI product of a completely different class, used by very serious clients with no less serious goals. Not everyone has heard of it: it is available only to governments and transnational corporations. Those who have heard of it generally know only that the system is closely tied to the U.S. government and belongs to the PayPal mafia, the informal circle of oligarchs, technocrats, neo-reactionaries, and cyber-fascists among the founders of the famous payment platform of the same name.

EVOLUTION OF THE SYSTEM FAR FROM PUBLIC EYE

As you may have guessed, this refers to Peter Thiel, his company Palantir Technologies, and their line of products with geeky names like Gotham, Metropolis, and others, which the public often mistakes for a single product, a sign of how little the general public knows or understands about what it actually is.

Palantir Technologies’ products are positioned as artificial intelligence tools. They emerged as early as 2003, before the boom of text models, and are radically different in architecture: they are much closer to the concept of artificial intelligence as its founders envisioned it in the 1970s and 1980s. Interestingly, the evolution of such systems quietly continued all along: scientific papers were published (which no one outside a narrow circle of experts read), software prototypes were built (used only by scientists in labs), new machine languages appeared (which none of the Habr users wrote in). Eventually, evolution triumphed, and when hardware became powerful enough to run such models (around the start of the text-neural-network boom, with the development of GPUs after 2010), everything was already in place for a product rollout.

AI PRODUCT FOR SELECTED CLIENTS

There is a clear turning point: until 2011, Palantir Technologies’ contracts with the U.S. government amounted to an almost invisible $5–10 million, solely for maintaining technical processes. Between 2011 and 2013 the total suddenly reached nearly $400 million, in 2014–2016 almost $700 million, and by 2019 it had grown to $1.5 billion. Why this happened is clear: computers had become powerful enough to run such a model. Its adoption is now spreading not by the day but by the hour.

Why aren’t such AI systems available to the general public? First, they are not designed for entertainment: Palantir Technologies’ products work completely differently from primitive text models. Second, Palantir Technologies does not sell its software to random buyers. Its use is extremely expensive (hundreds of millions per deployment), and the issue isn’t even the cost of the hardware it runs on. Palantir products are customized for specific tasks: although the framework is universal (most of the development effort went into making it integrable with a wide range of systems), implementing it for a particular task is neither a one-day job nor a job for a single expert.

TERRIFYING POTENTIAL OF THE TECHNOLOGY

And finally, the third reason you will probably never see it in action: it tries not to be publicly visible, and it succeeds very well. There is a website, a Medium blog, plenty of official information about HOW Palantir works, but nowhere and never (except perhaps among DeepWeb researchers) will you find official data about WHERE and FOR WHAT exactly it works. Most examples of Palantir’s practical applications are deeply immoral by commonly accepted ethical standards, and some are theoretically even illegal, or at least sit in a gray zone. People do not like the very idea of being governed by soulless machines that can overturn their lives with a single decision, and that is exactly what Palantir products are built for.

The essence of how this AI system works is that Palantir can ingest virtually any volume of any kind of data (audio, video, documents, etc.). It will analyze them and create an ontology – a logically connected model of reality built on those data. After that, it is possible to ask it questions and get answers, if they can be analytically derived from the ontology. Palantir is a kind of machine Sherlock Holmes, but very fast and capable of working with any volume and set of data.
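
To make the idea concrete, here is a minimal sketch in Python of such a pipeline: records are ingested as entities with attributes, linked into a small graph, and then queried deterministically. Every class, method, and field name here is invented for illustration; this is not Palantir’s actual API, just the general shape of an ingest/ontology/query system.

    # Toy "ingest -> ontology -> query" pipeline. Purely illustrative;
    # all names are invented and have nothing to do with Palantir's real interfaces.

    class Ontology:
        def __init__(self):
            self.entities = {}          # id -> {attribute: value}
            self.links = []             # (source_id, relation, target_id)

        def ingest(self, record):
            # Store a raw record (a dict with an "id") as an entity with attributes.
            self.entities[record["id"]] = {k: v for k, v in record.items() if k != "id"}

        def link(self, source, relation, target):
            # Record a typed relationship between two known entities.
            self.links.append((source, relation, target))

        def who(self, relation, target):
            # Answer "which entities are connected to `target` by `relation`?"
            return [s for s, r, t in self.links if r == relation and t == target]

    onto = Ontology()
    onto.ingest({"id": "doc1", "type": "invoice", "amount": 1200})
    onto.ingest({"id": "person1", "type": "person", "name": "J. Smith"})
    onto.link("person1", "signed", "doc1")
    print(onto.who("signed", "doc1"))   # ['person1']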

Naturally, the potential of this technology is frightening: every person leaves hundreds of digital and documented traces of their life. Given sufficient intent, Palantir can uncover hidden interconnections and draw conclusions about anyone, which in turn can lead to consequences far more unpleasant than a loan denial. An example is El Salvador, previously mentioned on the main channel: judging by all available data, it was machine algorithms that drew up the lists for the mass arrests.

PALANTIR IS NOT A PIONEER

It would be worth discussing the functioning of similar AI systems in more detail – because Palantir is not a pioneer in this field.

In the theory of true artificial intelligence, an expert system is a program that emulates the decision-making ability of a human expert (unlike ChatGPT, which in principle does not understand what it is talking about). Expert systems are designed to solve complex problems by reasoning over large knowledge bases, represented mainly as “if–then” rules. During the 1980s they were among the first truly successful forms of AI software and were considered its future, until evolution shifted toward text models.

Every expert system is divided into two subsystems: the knowledge base, an ontology that represents facts and rules in a formal notation, and the inference engine, which applies the rules to known facts to derive new facts and may also include explanation and debugging capabilities.
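
A minimal sketch of that two-part structure, assuming the simplest possible representation: facts as strings, rules as “if these facts hold, add that fact.” This is the textbook forward-chaining scheme rather than any particular commercial engine, and the example facts are invented.

    # Minimal forward-chaining inference engine: a knowledge base of facts
    # and if-then rules, plus an engine that derives new facts until nothing changes.

    facts = {"animal has feathers", "animal can fly"}

    rules = [
        ({"animal has feathers"}, "animal is a bird"),
        ({"animal is a bird", "animal can fly"}, "animal is a flying bird"),
    ]

    def infer(facts, rules):
        # Apply every rule whose conditions are satisfied until a fixed point is reached.
        derived = set(facts)
        changed = True
        while changed:
            changed = False
            for conditions, conclusion in rules:
                if conditions <= derived and conclusion not in derived:
                    derived.add(conclusion)
                    changed = True
        return derived

    print(infer(facts, rules))
    # {'animal has feathers', 'animal can fly', 'animal is a bird', 'animal is a flying bird'}

The loop keeps firing rules until no new fact can be added, which is exactly the “derive new facts from known facts” behavior described above.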

Machines that can think like humans have been the Holy Grail of cybernetics since its inception in the 1940s (ironically, Turing was seriously mistaken: as practice has shown, machines that can pass his famous conversation test are no smarter than toasters and pass it essentially at random, without understanding what they are writing).

FEIGENBAUM’S STEP FORWARD

Doctors showed the greatest interest in expert systems at the time: support in diagnosis could significantly improve the healthcare available to people. In theory, a machine trained by expert physicians could work with patients around the clock without fatigue, as efficiently as a living person or better, including remotely. Moreover, doctors themselves had been using standardized questionnaires and diagnostic forms for years. It seemed the ideal field: socially beneficial and already well formalized.

Early diagnostic systems from the 1950s used patient symptoms and laboratory test results as input to produce a diagnosis. However, it turned out that standard decision-making methods, such as flowcharts, statistical pattern matching, or probability theory, performed very poorly in practice.

A new approach emerged in 1965 within Stanford’s heuristic programming project led by Edward Feigenbaum, known as the “father of expert systems.” The idea that “intelligent systems derive their power from the knowledge they possess, not from the specific formalism and inference schemes they use,” as Feigenbaum put it, was a significant step forward at the time, since previous research had focused on general-purpose heuristics rather than domain knowledge.

ONTOLOGY AS A WORKING PRINCIPLE

Feigenbaum was one of the first to propose the approach that later became known as ontology. It is based on the idea of a conceptual schema: a semantic network of concepts connected by certain rules. An ontology is an exhaustive formalization of a particular domain, a conceptual schema and data structure containing all the relevant classes of objects, their relationships, and the rules, theorems, and constraints accepted in that domain. Only after some 20 years of research were universal algorithms developed for formalizing an arbitrary domain and representing it as an ontology.

In fact, it’s much simpler than it sounds – people still manage to perform logical analysis of information (albeit far weaker than machines), despite the fact that their neuroarchitecture is not at all suited for formal reasoning (unlike processor architecture). The human brain is more like ChatGPT than Palantir: there are far fewer Sherlock Holmeses among us than there are people babbling loosely connected nonsense.

So the ambitious task of describing the entire world to a machine, with all its relationships and constraints, is actually not that hard. As humans who can think logically demonstrate, effective reasoning does not require knowing millions of facts and billions of connections between them; if it did, the human brain, which unlike a computer cannot process such volumes of data, would not be able to reason at all.

INSTANCE–CONCEPT–ATTRIBUTE–RELATIONSHIP

An ontology consists of instances, concepts, attributes, and relationships. Instances, or individuals, are the basic elements of the database, the lowest level of the ontology. They can represent entities of any kind (a number, a letter, a tone, a term, a frame, etc.). Concepts, or classes, are abstract groups, collections, or sets of objects. They may include instances, other classes, or a combination of both. For example, the class “alphabet” contains individual letters as its instances.

All classes in an ontology necessarily form a taxonomy, a hierarchy of concepts based on inclusion relationships. For example: “letter” – “word” – “sentence” – “paragraph” – “page” – “book.” Objects in an ontology must have attributes. Every attribute has at least a name and a value and stores information specific to its object. An attribute value may itself be complex, for example a list of classes, each with its own sets of objects carrying their own attributes. Say, the attribute “engine” of the object “car” may contain the class “engine parts,” broken down hierarchically to the last gasket.
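
A minimal sketch of these building blocks in Python, reusing the text’s own examples (the letter-to-book taxonomy and the car engine). Plain dictionaries are an assumption made for brevity; real ontologies use dedicated formats such as RDF or OWL.

    # Toy ontology building blocks: concepts (classes), instances, and attributes.

    # The taxonomy from the text: each concept is included in the next one up.
    concepts = {
        "letter":    {"contained_in": "word"},
        "word":      {"contained_in": "sentence"},
        "sentence":  {"contained_in": "paragraph"},
        "paragraph": {"contained_in": "page"},
        "page":      {"contained_in": "book"},
        "book":      {"contained_in": None},
    }

    # Instances belong to a concept and carry their own attributes; an attribute
    # value may itself be complex (here, "engine" holds a nested class of parts).
    instances = {
        "A": {"concept": "letter", "position_in_alphabet": 1},
        "my_car": {
            "concept": "car",
            "engine": {"class": "engine parts", "members": ["piston", "valve", "gasket"]},
        },
    }

    print(concepts["letter"]["contained_in"])   # word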

Another key concept in an ontology is the relationship (or dependency). A relationship is an attribute whose value is another object. Suppose we have two books and one is a review of the other: the relationship is then an attribute of the second book, named “review of,” whose value is a link to the first book. There are special mathematical models for describing such things, such as relational algebra, which underlies typical DBMSs (database management systems) and the SQL language. Naturally, dozens and even hundreds of description languages have been built on top of these models.
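
The same book-review example as a tiny sketch: the relationship is simply an attribute whose value refers to another object. The titles are invented, and an in-memory dictionary stands in for what a relational DBMS would store as linked rows.

    # A relationship as an attribute whose value is a reference to another object.
    # Deliberately naive; a real system would keep this in a relational DBMS.

    book = {"title": "A Study of Ontologies", "year": 1999}

    review = {
        "title": "Critical notes on 'A Study of Ontologies'",
        "year": 2001,
        "review_of": book,          # the relationship: its value is the other object
    }

    # Following the relationship is just an attribute lookup:
    print(review["review_of"]["title"])   # A Study of Ontologies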

AUTOMATIC AND 100% ACCURATE ANSWERS

Now we can see that a well-built ontology needs nothing else: it is a completely self-sufficient, cross-referenced, hierarchical system that can be asked questions, and the answers are produced automatically and 100% accurately (unlike the statistical inference of ChatGPT).
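
A small self-contained illustration of that kind of querying, using the letter-to-book hierarchy from above: the answer is derived by following the inclusion links, and the derivation itself can be returned as an explanation trace. This is a hedged sketch, not any particular product’s query language.

    # Deterministic question answering over a tiny ontology, with an explanation trace.
    # The answer is derived by rule-following, not by statistical guessing.

    contained_in = {"letter": "word", "word": "sentence", "sentence": "paragraph",
                    "paragraph": "page", "page": "book"}

    def is_part_of(smaller, larger):
        # Answer "is `smaller` ultimately part of `larger`?" and show the chain.
        trace, current = [smaller], smaller
        while current in contained_in:
            current = contained_in[current]
            trace.append(current)
            if current == larger:
                return True, trace
        return False, trace

    answer, proof = is_part_of("letter", "book")
    print(answer)   # True
    print(proof)    # ['letter', 'word', 'sentence', 'paragraph', 'page', 'book']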

This is the principle on which the first medical and biological models were built, such as the expert systems MYCIN, Internist-I, and CADUCEUS. The first languages for creating and working with expert systems were the famous Lisp and Prolog: the former took hold mostly in the U.S., the latter in Europe and Japan. One of the first tests of their power was the legal system APES, written in Prolog in 1986, which encoded the British Nationality Act. As one might guess, it could automatically draw conclusions about the legal grounds for acquiring citizenship.
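
To give a flavor of what such a legal encoding looks like, here is a toy citizenship rule in the same spirit, written in Python for consistency with the earlier sketches rather than in Prolog, and simplified far beyond the real Act; the field names are invented for illustration and this is not the actual APES knowledge base.

    # A toy citizenship rule in the spirit of the British Nationality Act work.
    # Grossly simplified and invented for illustration; the real APES encoding
    # was a set of Prolog clauses covering the whole Act.

    def acquires_citizenship_by_birth(person):
        # Hypothetical rule: born in the UK after commencement of the Act,
        # to a parent who is a citizen or settled in the UK.
        return (
            person["born_in_uk"]
            and person["born_after_commencement"]
            and (person["parent_is_citizen"] or person["parent_is_settled"])
        )

    applicant = {
        "born_in_uk": True,
        "born_after_commencement": True,
        "parent_is_citizen": False,
        "parent_is_settled": True,
    }

    print(acquires_citizenship_by_birth(applicant))   # True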

During the 1980s, expert systems were extremely popular – universities offered courses in this field, and two-thirds of Fortune 500 companies used some of these programs. Numerous companies developed them: Symbolics, Intellicorp, Inference Corporation, Aion Corporation, Neuron Data, Exsys, VP-Expert, and many others.

Expert systems were applied everywhere. For example, the program SID (Synthesis of Integral Design), written in Lisp in 1982, generated 93% of the logic circuits for the new VAX 9000 processor. Interestingly, here is how it worked: its input was a set of rules describing how to design a microprocessor, written by several expert engineers. SID expanded those rules and generated synthesis procedures far larger and more varied than the original rules. The combination produced a design whose quality surpassed both the abilities of the experts themselves and the hand-designed chips that preceded it.

THE WORLD MOVED ON…

The expert-system boom is strongly reminiscent of today’s neural-network boom, except that their actual usefulness was far greater. What happened to them? Exactly what will happen to modern neural networks. First, the expert systems of that era hit the hardware performance ceiling of their time; second, the gap between expectations and what they actually delivered was too large. Their capabilities were vast, but the people who worked with them dreamed of robots with whom they could discuss Nietzsche (ironically, they wanted ChatGPT, even though they did not yet know what it would look like).

Meanwhile, mathematicians were not sitting idle: they demonstrated that all models of this type are subject to fundamental logical and mathematical limitations and that there are combinatorial problems they cannot solve. All of this led to a significant cooling of interest in expert systems in the early 1990s. The world moved on because expert systems did not live up to their overhyped promises.

RENAISSANCE OF EXPERT SYSTEMS – PALANTIR FOLLOWED THE TREND

In the 2000s, after a decade of dormancy, expert systems enjoyed a renaissance. Palantir simply followed that trend; SAP, Siebel, and Oracle were doing the same at the time, embedding sets of business logic and rules into their software products. The field of data mining emerged: extracting knowledge from raw data and transforming it into an ontology. A discipline called “knowledge engineering” appeared to describe how this is done. By 2020, a large number of expert systems were operating worldwide, from the infamous Chinese social credit system to the criminal profiling used by U.S. police.

Naturally, transnational companies also use such expert systems: for market analysis, product design, tax optimization, employee monitoring, performance ranking (as at Amazon), and calculating bonuses and penalties; hedge funds use them for investment operations. So why does so little seem to be known about all this? Expert systems have been quietly enveloping our society for 15–20 years now, but they are too expensive and too narrowly specialized for the general public to notice them.