UK AISI and the 2025 Unscientific International Scientific Report on the Safety Security of Advanced AI
UK Technology Secretary Called for Scientific Investigation of AI. What He Got Was Unverifiable Opinion, Undefined Categories, and Imagined Risks — An Ethically Compromised Report.
The UK AI Safety Institute (UK AISI) is… phew… hm… well, at least we can blame Sunak for it. Not that Labour needs the reminder. The 2024 policy paper introducing AISI welcomes the reader with this, in big bold letters:
This was published under the 2022 to 2024 Sunak Conservative government.
I don’t think they would be quite so dramatic about it if they actually thought something good could come out of it. That’s just a hunch, of course—I wouldn’t know either way. And to be fair, I’d make the same joke if the roles were reversed, given my basically objective neutrality on everything.
The former minister kicks things off with:
“The release of ChatGPT was a Sputnik moment for humanity.”
Was ChatGPT not only built by humans, but by friendly humans in a country with which the UK enjoys, as far as I recall, friendly relations? I vaguely remember NATO being ‘sputniked’—but the Warsaw Pact was certain of its superiority and not exactly surprised they had a lead in space technology. And I’m fairly confident Khrushchev didn’t feel the need to launch a Институт безопасности Спутника-1 (Sputnik-1 Safety Institute).
And now, more big news:
The AI Safety Institute has been renamed the AI Security Institute.
Technology Secretary Peter Kyle delivered this rebranding moment at the Munich Security Conference in February 2025.
“We’re going to create one of the biggest clusters of AI innovation in the world and deliver a new era of prosperity and wealth creation for our country. This is a once-in-a-generation opportunity. If we can seize it, we will close the door on a decade of slow growth and stagnant productivity. Of—[?? taxes? logic? conjugation? our?]—taxes that are just too high. We will deliver new jobs that put more money in working people’s pockets.”
Then—after some additional visionary murmurings—he introduces a concern:
“But none of that is possible unless we can mitigate the risks that AI presents. After all, businesses will only use these technologies if they can trust them. Security and innovation go hand in hand.”
Sure. Then comes the ‘but’:
“Last year, attackers used live deepfake technology during a video call to mimic bank officials. They stole $25 million.”
And another ‘but’:
“Make no mistake, I’m talking about risks to our people, their way of life, and the sovereignty and stability which underpins it. That is why today, I am renaming our AI Safety Institute as the AI Security Institute. This change brings us into line with what most people would expect an institute like this to be doing.”
I have never heard anyone express a preference over whether AISI calls itself the ‘I’ of Safety or Security. But he does say one thing that matters:
“They are not politicians—nor should they be. They are scientists [...]”
By insisting the Institute is
“not deciding what counts as bias or discrimination,”
Kyle may be trying to escape the current AI safety/security circus—reduced to moral pageantry, identity metrics, and dollhouse playing.
He may be signalling: We want this to be real. Not political theater. So when he says, “They are scientists,” I think what he means is: “I need them to be scientists. I need them to produce something I can act on.”
But they aren’t. They’re policy theater producers in imaginary lab coats. And I say that based on evidence: this is a scientific claim, fully and wholeheartedly made in accordance with the scientific method.
The minister also said this:
“Earlier this month, the UK set out plans to make it illegal to own AI tools optimised to make images of child sexual abuse.”
Let me be absolutely clear: I am not arguing that this is a desirable use of AI. It’s not.
But I cannot understand the legal or moral basis for criminalization assuming the AI model is not trained on material involving real abuse.
If the output is synthetically generated—meaning:
It is not a photograph,
It does not depict a real person,
It was not derived from abusive source material,
then what, exactly, is the harm?
The image may be disturbing. It may even suggest pathological intent. But that is not the same as harm. If no real person was involved, and the content is synthetically generated, then the offense lies not in consequence but in appearance.
Is Labour now enforcing Victorian morality codes? If so, we should add this to the AI risk register: the UK legal system reformatting itself into Orwell’s 1984, in the name of protecting us from fiction.
An unjustified exception — like criminalizing content without victims — corrupts the entire legal foundation. You cannot carve out logic-free zones in law and expect the rest of the system to remain coherent.
Intentions do not rescue this. A mistake made “for the right reasons” is still a mistake. And when codified into law, it becomes a structural flaw with lasting damage. Maybe I am missing something and I recognise this is a serious issue. But so is freedom.
The 2025 Unscientific International Scientific Report on the Safety Security of Advanced AI
He then cites from the 2025 International AI Safety Report [when will we rebrand it to the AI Security Report?], "led by Yoshua Bengio", which warns us that
"without the checks and balances of people directing them, we must consider the possibility that risks won’t just come from malicious actors misusing AI models, but from the models themselves [..]
Because losing oversight and control of advanced AI systems, particularly Artificial General Intelligence (AGI), would be catastrophic.
It must be avoided at all costs [..]
But success is not a given.”
Now I can assure Minister Kyle that there is zero risk that the currently known AI LLM architecture could deliver AGI. And yes, I can say this confidently, because I use the scientific method to understand how AI actually works. But as your $25 million heist example shows, you don’t need super-advanced AI to siphon off some money. You don’t need AI at all.
The International AI Safety Report 2025 makes expansive claims about systemic and catastrophic AI risks — but admits in nearly every case that no evidence exists. Here’s what the first ten pages show:
1. Catastrophic and Existential Risks
They cite risks like loss of control, large-scale social disruption, or extinction-level events. But then immediately acknowledge:
“We lack conclusive evidence about whether these outcomes are likely.”
This is a bait-and-deflate move: present a threat, then withdraw all responsibility for proving it.
2. Systemic Risk
They define “systemic risks” vaguely: disruption to economic, political, and social systems — and cite examples like:
Labor displacement,
Misinformation,
Loss of epistemic trust.
But again, they offer no methodology, no causal modeling, no falsifiable prediction. It’s just narrative scaffolding.
“These risks are complex, indirect, and highly uncertain.”
We may also call it progress, like the steam engine.
3. Evaluation and Evidence
The report concedes:
“There is currently no agreed-upon standard for evaluating whether a model is safe.”
This is damning. They admit:
No criteria,
No metrics,
No testable conditions.
Yet still proceed to recommend international coordination, regulatory mechanisms, and research redirection — all without a single operational benchmark.
4. Claims of Misuse
They warn about:
Cyber-attacks,
Bioweapons,
Manipulative persuasion,
Autonomous weapons.
Yet each is prefaced by conditional language:
“It is unclear how likely these capabilities are to emerge.”
“There is no consensus on their feasibility or timeline.”
Then there is this:
“This report highlights that frontier AI remains a field of active scientific inquiry, with experts continuing to disagree on its trajectory and the scope of its impact.”
No, it does not.
It shows that we lack scientific investigation entirely. What it presents is not inquiry but aggregated superstitions and opinions, unverifiable speculation, and consensus theatre.
This work disqualifies its contributors from claiming expertise. It took over 90 contributors to produce a document whose core message is:
We don’t know what’s happening,
We can’t define the risks,
We haven’t agreed on methods,
But we’re extremely concerned.
This isn’t a scientific report. It’s a committee-generated mood board — and I can produce more substantive content in ten seconds by refusing to pretend ignorance is insight.
“Privacy risks: General-purpose AI can cause or contribute to violations of user privacy. For example, sensitive information that was in the training data can leak unintentionally when a user interacts with the system.”
There should be no sensitive or private information in the training data. This is not a technical risk — it’s a procedural failure. If the model leaks private data, the problem isn’t the model. It’s negligent data governance. This is not specific to AI. Any system trained on unvetted data might reveal something it shouldn’t. That’s not a novel threat. That’s called malpractice, and we know how to deal with such a risk.
“However, so far, researchers have not found evidence of widespread privacy violations associated with general-purpose AI.”
Glad we talked.
By this standard of reasoning, one might also write:
There is currently no evidence that AI systems will autonomously establish a colony on Mars, construct a more equitable society, and then trigger a global labor crisis as humans attempt to migrate en masse to Martian utopia. However, we should not be complacent about the possibility of such a risk emerging. I agree.
They write:
“Developers still understand little about how their general-purpose AI models operate... The inner workings of these models are largely inscrutable, including to the model developers.”
If that is the case, it belongs on page one. And the report should end there.
Because if you cannot describe the object of inquiry, then nothing you say about risks, safeguards, or governance has any epistemic relevance.
This is not a minor limitation. It is a category error: issuing predictions, policy recommendations (which, across 300 pages, you claim you are not making!), and international coordination plans for an object you admit you cannot observe, test, or explain.
This violates the most basic precondition of the scientific method:
Know what you are studying.
By their own admission: There is no mechanistic understanding in this report. The authors repeatedly concede they do not know how the systems they’re describing actually work.
They show no grasp of causality — only references to emergent behaviors from stochastic training. But this is conceptually empty. Stochastic does not mean unknowable. Financial systems, weather patterns, and epidemiology all operate on probabilistic models — yet we analyze them, predict outcomes, and manage risk.
The model is probabilistic, but that is no excuse for non-explanation. Offering it as one only demonstrates that the authors are either disingenuous or incompetent.
Behavior is not specified or modeled — it is retrospectively guessed at. Failure modes are not derived from systems theory, adversarial design, or empirical testing. They are imagined, often with dramatic bias, without grounding, and frequently absurd.
This means every downstream claim — about safety, security, alignment, interpretability, control — is built on non-knowledge.
This is not precaution. It is ritualized speculation, dressed in scientific language. That distinction matters.
When you present unverifiable opinion, undefinable categories, and imagined risks as if they were the outcome of systematic inquiry, you are not doing science — you are performing legitimacy.
To call this a scientific report under those conditions is not just an exaggeration. It is knowingly misleading.
And to publish it with institutional backing, without falsifiability, without metrics, without causal models, and without internal dissent is not merely a failure of rigor — it is ethically compromised.
You cannot use the language of science to license power, and then retreat into ambiguity when precision is demanded.
That’s not caution. That’s fraud.
And to Minister Kyle: this is the opposite of the required conduct of scientists. Science demands testable claims, causal modeling, and epistemic discipline.
This is the uncomfortable truth.
“‘General-purpose AI’ refers to artificial intelligence models or systems that can perform a wide range of tasks rather than being specialised for one specific function. While all AI operates on a fundamental input-to-output basis [..]”
While all AI operates on a fundamental input-to-output basis — I mean, all arithmetic involves numbers. That’s a tautology, not a scientific observation. And did all 90 experts suddenly agree on this one point when they couldn’t agree on anything else? Why is there no caveat like: we have found evidence that this is the case?
On the same page they say:
“Examples of general-purpose AI include:
[..] Image generators (13), such as DALL-E 3 (14*) and Stable Diffusion-3 (15*).
[..] Video generators such as SORA (16*), Pika (17), and Runway (17) ..”
If it can only do images, it is by definition not a general-purpose AI.
I mean? What? That’s the writing of amateurs, operating beyond their technical grasp.
“General-purpose AI distinguishes itself by its ability to handle a diverse range of tasks, e.g. summarising text, generating images, or writing computer code ..”
No, they don’t.
If you ask GPT-4: “Generate a picture of a robot,” the process looks like this:
Your text prompt is tokenized → GPT-4 gets a string of tokens.
GPT-4 outputs a new text string (e.g., a DALL·E-compatible image prompt).
That string is passed to a separate model (like DALL·E 3).
DALL·E 3 processes it and renders the image.
At no point does the LLM generate an image.
It generates a prompt that another AI interprets. That AI is trained on image data — not text tokens.
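For the record, here is roughly what that pipeline looks like in code. This is a minimal sketch, not any vendor’s actual implementation: `llm_complete` and `diffusion_render` are hypothetical stand-ins for two separate hosted models. The only thing the language model ever produces is a string.

```python
# Hypothetical sketch of a "GPT makes a picture" pipeline: two separate models.

def llm_complete(prompt: str) -> str:
    """Stand-in for the language model: tokens in, tokens out. It can only return text."""
    return "A chrome humanoid robot in a sunlit workshop, photorealistic, 35mm"

def diffusion_render(image_prompt: str) -> bytes:
    """Stand-in for a separate image model, trained on image data, that turns text into pixels."""
    return b"...PNG bytes..."

def generate_picture(user_request: str) -> bytes:
    # Step 1: the LLM rewrites the request as an image prompt -- still just text.
    image_prompt = llm_complete(f"Write an image-generation prompt for: {user_request}")
    # Step 2: a different system renders the image. The LLM never touches a pixel.
    return diffusion_render(image_prompt)

picture = generate_picture("Generate a picture of a robot")
```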
Same goes for code:
If you use GPT to "write Solidity code," it generates text.
If you then run that code or visualize it, that’s handled by another system (IDE, interpreter, compiler, etc.).
This has massive consequences for reasoning.
They don’t know anything and call this a science report!
They lump GPT-4 and the ERNIE family into the same category of “general-purpose AI.” Never mind that GPT-4 is documented, benchmarked (to some extent), and tested independently across institutions. ERNIE, developed by Baidu, is not independently auditable. That doesn’t make its claims false — but it does make them unverifiable.
And in any serious scientific report, such differences in data quality, transparency, and reproducibility must be acknowledged. You cannot equate outputs from closed, state-aligned ecosystems with those from (relatively) open academic or commercial labs — not without collapsing the standards of verification.
This is not a geopolitical argument. It’s a scientific one: if you cannot verify the evidence, you cannot include it as equivalent.
“Researchers have observed modest further advancements towards AI capabilities that are likely necessary for commonly discussed loss of control scenarios to occur. These include capabilities for autonomously using computers, programming, gaining unauthorised access to digital systems, and identifying ways to evade human oversight.”
No, that’s wrong. Anthropic dreams up these fantasies and claims that when their model generates text in the form of an email, it is equivalent to the AI autonomously using systems and actually sending emails — which it cannot do. All the while, they are prompting it, and the AI simply responds. Nothing is autonomous, because the AI has no agency to do anything.
“General-purpose AI models are developed via a process called ‘deep learning’. Deep learning is a paradigm of AI development focused on building computer systems that learn from examples. Instead of programming specific rules into systems, researchers feed these systems examples [..]”
You cannot bundle LLMs and diffusion models under a single explanatory umbrella like this. Saying "general-purpose AI models are developed via a process called ‘deep learning’" is technically true, but epistemically useless — it's like saying "all vehicles move via engines."
LLMs (like GPT-4) and diffusion models (like Stable Diffusion) are:
Architecturally distinct,
Trained on different data types,
Optimized under different loss functions,
And responsive to prompts in entirely different ways.
Grouping them as "deep learning systems" erases these distinctions and prevents any serious discussion of risk, behavior, or interpretability. And let’s be clear: a diffusion model trained on pictures will not develop bioweapon manufacturing skills — unless bioweapons can be assembled by drawing intricate baroque illustrations.
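If you want to see how different the two training objectives are, here is a toy contrast, assuming PyTorch and placeholder tensors in place of real models: an LLM minimises cross-entropy on next-token prediction over text, while a diffusion model minimises mean-squared error on the noise added to images.

```python
# Toy contrast of the two objectives (PyTorch). The "model outputs" are random
# placeholders; only the shape of the data and the loss function matter here.
import torch
import torch.nn.functional as F

vocab_size = 1000

# --- LLM-style objective: next-token prediction over text tokens ---
tokens = torch.randint(0, vocab_size, (4, 128))        # batch of token sequences
logits = torch.randn(4, 128, vocab_size)               # stand-in for model output
llm_loss = F.cross_entropy(logits[:, :-1].reshape(-1, vocab_size),
                           tokens[:, 1:].reshape(-1))  # predict token t+1 from token t

# --- Diffusion-style objective: predict the noise added to an image ---
images = torch.randn(4, 3, 64, 64)                     # batch of images
noise = torch.randn_like(images)
noisy_images = images + noise                          # (simplified noising step)
predicted_noise = torch.randn_like(images)             # stand-in for model output
diff_loss = F.mse_loss(predicted_noise, noise)

print(llm_loss.item(), diff_loss.item())
```

Different data, different loss, different failure modes. Lumping them together tells you nothing.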
Also: citing AlphaGo's 2016 Go victory as the solidification of “general-purpose AI” is pure historical distortion. AlphaGo was narrow, domain-trained, and non-generative. It was not general-purpose — and it did not operate like modern LLMs or diffusion systems.
This paragraph is not a summary of deep learning. This is sci-fi comedy.
“Today’s general-purpose AI models are neural networks, which are inspired by the animal brain.”
Which animal? Chicken? This analogy is decades old and serves no explanatory function. The resemblance is superficial at best:
LLMs are matrix multipliers, not neurons.
They operate on tokens and gradients, not synapses and neurotransmitters.
There is no sense of embodiment, no sensory grounding, no autonomous learning.
“Hidden layers - Source: International AI Safety Report”
So they are evidencing themselves. Way to go. If it’s boxed and labeled “hidden,” then it’s not hidden. It’s a standard term in multilayer perceptrons (MLPs) and feedforward neural networks. It refers to layers that are neither the input nor the final output. There is nothing mysterious about it; it is a historical term and does not signify hidden AI scheming.
“Deep learning works by processing data through ‘layers’ of interconnected mathematical nodes [..] As information flows from one layer of neurons to the next, the model refines its representations. [..] The strength of each connection between nodes is often called a ‘weight’.”
No. This is completely confusing because they describe two mutually exclusive systems — classical neural nets and Transformers — as if they’re the same model. One is built on fixed-layer activations, the other on dynamic attention over tokens. Merging them into a single explanation creates a model that doesn’t exist.
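For anyone who wants that distinction made concrete, here is a toy sketch, assuming PyTorch, untrained layers, and toy dimensions: a feedforward network applies the same fixed matrices to every input, while attention computes its mixing pattern from the tokens themselves at inference time.

```python
# Toy contrast: fixed feedforward layers vs. token-dependent attention (PyTorch).
import torch
import torch.nn as nn

d = 32                          # toy embedding size
x = torch.randn(1, 10, d)       # one sequence of 10 token embeddings

# Classical feedforward net: the same fixed matrices applied to every input.
mlp = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))
mlp_out = mlp(x)

# Transformer-style attention: the mixing pattern is computed from the tokens
# themselves (queries against keys), so it changes with every prompt.
attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)
attn_out, attn_weights = attn(x, x, x)   # attn_weights depends on x, not only on training
print(mlp_out.shape, attn_out.shape, attn_weights.shape)
```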
And by the way, all your definitions are wrong too.
“Weights: Model parameters that represent the strength of connection between nodes in a neural network. Weights play an important part in determining the output of a model in response to a given input and are iteratively updated during model training to improve its performance.”
Weights are not “connections” in any meaningful sense outside of legacy metaphors. In modern models — especially Transformers — weights are just numerical parameters in matrices used for computing projections, attention, or layer transformations.
They don’t “connect nodes” in any biological or spatial sense.
Also, not all weights are iteratively updated during training — some are frozen, others are initialized with structure. And “improve its performance” is vague: weights are updated to minimize a loss function, not to achieve some general notion of performance.
Weights aren’t just an important part — they’re the core of the model. In deep learning, the weights are the model. Everything else — architecture, layers, tokenization — is scaffolding around the learned weight values. That’s what’s trained, stored, and deployed.
Calling them “connections” and “important” undersells the whole system. It’s not a feature — it’s the substance.
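A minimal PyTorch sketch of the point, with a toy two-layer model standing in for the real thing: what gets counted, saved, and shipped is the tensor of learned parameters. The architecture code is just the recipe for using them.

```python
# The weights *are* the model: they are what gets counted, saved, and deployed.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512))

# "Weights" here are just matrices of numbers, not wires between neurons.
num_params = sum(p.numel() for p in model.parameters())
print(f"{num_params:,} learned parameters")   # ~2.1 million for this toy model

# Deployment means shipping these tensors; the architecture is only the recipe.
torch.save(model.state_dict(), "toy_weights.pt")
restored = nn.Sequential(nn.Linear(512, 2048), nn.GELU(), nn.Linear(2048, 512))
restored.load_state_dict(torch.load("toy_weights.pt"))
```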
“Developers feed models massive amounts of diverse data – such as text, code, and images – to instil general knowledge. Pre-training produces a ‘base model’. This is a highly compute-intensive process.”
Maybe, maybe not.
You can pre-train a small language model on a laptop. Not every model is GPT-4, and not every training run melts a data center. This line conflates scale with essence, and turns infrastructure footprint into a defining property — which it’s not.
LLMs are trained on text — they don’t need to be shown a picture of a cat. They learn from how “cat” appears in language.
Diffusion models need images paired with text (e.g., a photo labeled “cute cat”) to learn the visual pattern. They require millions of labeled examples.
Treating both as if they’re trained the same way is false. It hides the fact that LLMs learn from structure, while diffusion models learn from correlation across modalities. Totally different epistemics.
“During pre-training, developers present general-purpose AI models with large amounts of data, which allows the model to learn patterns. At the beginning of the training process, an untrained model produces random outputs. However, through exposure to millions or billions of examples – such as pictures, texts, or audio – the model gradually learns facts and patterns which allow it to make sense of information in context.”
At the beginning, the model doesn’t "produce random outputs" like a child guessing — it just hasn't optimized any weights yet. And during training, it doesn’t “learn” like a conscious entity. It optimizes parameters to minimize prediction error — full stop.
You could show a plank of wood a billion pictures; it wouldn’t “gradually learn.” A model only “learns” because we program an optimizer to adjust weights based on pattern correlation. It’s gradient descent, not comprehension.
And "make sense of information in context" is anthropomorphic nonsense. The model doesn't "make sense" of anything — it aligns statistical patterns across training data.
This is not learning yet. It’s pattern compression under probabilistic guidance. In the case of LLMs, the model constructs a dataset that describes concepts (tokens) statistically, using vectors — each concept is defined through its relation to other concepts. This representation is initially entirely self-referential — there is no grounding, no external validation, just internal coherence across text.
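Mechanically, “learning” is nothing more than the loop below: compute a prediction error, nudge the weights to shrink it, repeat. This is a toy bigram model over random stand-in tokens, assuming PyTorch; nothing in the loop resembles comprehension.

```python
# "Learning" = an optimizer lowering a loss by adjusting weights. Nothing else.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab = 50
toy_corpus = torch.randint(0, vocab, (256,))     # stand-in for tokenized text

# A tiny bigram model: a table of logits for "which token tends to follow which".
logits_table = nn.Parameter(torch.zeros(vocab, vocab))
optimizer = torch.optim.SGD([logits_table], lr=0.5)

for step in range(200):
    current, nxt = toy_corpus[:-1], toy_corpus[1:]
    loss = F.cross_entropy(logits_table[current], nxt)  # prediction error
    optimizer.zero_grad()
    loss.backward()                                     # gradients w.r.t. the weights
    optimizer.step()                                    # nudge weights downhill

# The table now encodes co-occurrence statistics of the corpus: no grounding,
# no comprehension, just a loss that got smaller.
```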
“In some ways, fine-tuning a general-purpose AI can be compared to teaching a student through practice and feedback.”
No. Fine-tuning is marginal. It doesn’t teach the AI “skills” — it teaches it to conform to tone, style, and social expectations. It aligns outputs with human preferences or policies. Comparing it to “teaching a student” is false. The model isn’t gaining competence — it’s being nudged not to sound flippant, or off-brand.
The core capabilities are set in pre-training. Fine-tuning just shapes the surface.
And can I point out that your description started with:
“General-purpose AI models are not programmed in the traditional sense.”
So if they are not programmed, then why do you think what used to be programmed is now rebranded as fine-tuning? Do we program, or do we not?
“General-purpose AI systems are themselves increasingly being used to help fine-tune other general-purpose models.”
This is like saying “autocorrect is used to write spelling software.” It’s not false, but it’s epistemically empty. What does “help fine-tune” mean?
“In practice, fine-tuning is typically an iterative process in which developers will alternate between fine-tuning and testing runs until their tests show that the system meets desired specifications.”
What specifications? This isn’t deterministic software. You can’t specify exact behavior — you can only influence distributions. Fine-tuning doesn’t guarantee outputs. It nudges tendencies.
So what does “meets specifications” mean here? That it produces fewer offensive completions? That it mimics polite tone in 8 out of 10 prompts?
This is not software engineering. It’s statistical conditioning. Calling it “meeting specifications” implies a level of control that doesn’t exist — and misleads anyone assuming this works like traditional code.
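In practice, “meeting specifications” can only mean something like the following sketch: sample many completions and measure what fraction clears a behavioural filter. `generate` and `violates_policy` are hypothetical stand-ins; the point is that the result is a rate with error bars, never a guarantee.

```python
# "Meets specifications" for a stochastic model can only mean a measured rate.
import random

def generate(prompt: str) -> str:
    """Hypothetical stand-in for sampling one completion from a fine-tuned model."""
    return random.choice(["polite answer", "polite answer", "flippant answer"])

def violates_policy(text: str) -> bool:
    """Hypothetical stand-in for whatever behavioural check the developer runs."""
    return "flippant" in text

samples = [generate("some evaluation prompt") for _ in range(1000)]
violation_rate = sum(violates_policy(s) for s in samples) / len(samples)

# The best you can claim is a statistic on a prompt set, e.g. "<5% violations",
# never "the system will not do X".
print(f"violation rate: {violation_rate:.1%}")
```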
“This is often known as a ‘system card’.”
No — “system card” is not an industry standard. It’s a term popularized by Anthropic, and mostly used in their own publications. Most labs still refer to model cards, capability evaluations, or safety overviews.
And the “system cards” Anthropic produces are often vague, selective, and non-reproducible — high on narrative, low on verification and misleading in unimaginable ways.
So referring to this as a common or final deployment step inflates a practice that lacks rigor, standardization, or accountability.
It’s not science. It’s structured self-promotion. So what is this? Do we reprint press releases?
And by the way, a deployed model drifts relative to the world over time; it is not stable forever. So this deployment picture misses the most crucial point.
“Most experts agree that general-purpose AI is currently not capable of...”
Then, immediately:
“General-purpose AI agents can increasingly act and plan autonomously by controlling computers.”
“Markedly improved at tests of scientific reasoning and programming.”
You can’t have both. Either AI:
Cannot independently execute complex tasks, or
Is now planning, reasoning, coding, and researching autonomously.
They’re trying to maintain incapability for safety framing while implying capability for economic hype.
And note the bait-and-switch:
They say AI agents “struggle with work that requires many steps” — but then point to improvements via “chains of thought.”
That’s not agency. That’s prompt engineering for intermediate outputs. The system still completes text — it doesn’t plan, adapt, or revise unless instructed.
This whole passage is a layered contradiction:
Deny capacity for liability,
Assert capacity for investment value,
Invoke scientific reasoning without mechanism,
Celebrate “chains of thought” as if the model thought.
Chain of thought does not change how the AI reasons. This is editorializing, and I cannot believe that our 90 experts understand so little about how AI technically works.
“API Access to Fine-Tuning | Users can fine-tune the model for their specific needs | GPT-4o (OpenAI); Enterprise software with customisation APIs (e.g. Salesforce Development Platform)”
No, nope.
OpenAI does not currently allow public fine-tuning of GPT-4 models. Customization through API parameters (like system prompts or retrieval augmentation) is not fine-tuning — it’s prompt engineering.
So the statement is:
Technically inaccurate,
Misleading about actual capabilities,
Confusing customization with parameter-level model adjustment.
This confuses interface-level control with architectural modification — and it misleads policymakers into thinking these systems are widely adaptable when they’re not.
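The distinction the report blurs is easy to state in code. In the sketch below (with `hosted_chat` as a hypothetical stand-in for any vendor endpoint, and a toy linear layer standing in for a base model), “customisation” edits only the text sent to a frozen remote model, while fine-tuning is a gradient step that changes the weights themselves, which requires access to those weights.

```python
# Customisation vs. fine-tuning: one edits the *input*, the other edits the *weights*.
import torch
import torch.nn as nn
import torch.nn.functional as F

# --- "Customisation" / prompt engineering: the model is untouched. ---
def hosted_chat(messages):
    """Hypothetical stand-in for a vendor's hosted chat endpoint."""
    return "...completion from a remote, frozen model..."

reply = hosted_chat([
    {"role": "system", "content": "You are a terse enterprise assistant."},  # just text
    {"role": "user", "content": "Summarise this account history."},
])

# --- Fine-tuning: parameters actually change. Requires access to the weights. ---
toy_model = nn.Linear(16, 4)                 # stand-in for a base model you control
before = toy_model.weight.clone()
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
loss = F.cross_entropy(toy_model(x), y)
loss.backward()
with torch.no_grad():
    for p in toy_model.parameters():
        p -= 0.01 * p.grad                   # the weights are now different numbers
assert not torch.equal(before, toy_model.weight)
```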
The report warns that pre-training and fine-tuning are “highly resource-intensive.”
Then casually claims that users on Salesforce can fine-tune models.
You don’t get resource-intensive fine-tuning through a dashboard button. And you certainly don’t get it without compute infrastructure, technical expertise, and access to base weights. And how would Salesforce know about model weights?
They claim:
“Hosted Access: Users can only interact through a specific application or interface”
e.g., Midjourney, Google Docs
Then:
“API Access to Model: Users can send requests to the model programmatically”
e.g., Claude 3.5, Squarespace
Access via APIs is still hosted. You don’t get “access to the model” — you send tokens to a remote server.
Using ChatGPT in a browser or via API are both mediated by OpenAI’s infrastructure. There is no local model execution. It’s all hosted. The browser is my external application. And yes — I can use Gemini in Google Docs from my browser.
An API is just a data interface. I can speak to an API using a browser — that’s still API access, just wrapped in a UI. The interface may change, but the underlying connection is the same: a request to a hosted model via the internet.
And claiming Squarespace as an example of “API access to the model” is meaningless. That’s a frontend builder, not a protocol.
This is terminological fiction:
“Hosted” and “API” are not opposites.
Claude, ChatGPT, Midjourney — all accessed via hosted APIs.
Midjourney started with Discord-only access — Discord was the front end, API the transport, and Midjourney servers hosted the model.
Embedding a model’s response into a doc editor or form does not alter the architecture. It’s still a remote call. The model still lives somewhere else.
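Strip away the UI and every one of these “access types” reduces to the same thing: an HTTP request to someone else’s server. A sketch with a made-up endpoint; a chat window in a browser or a plug-in in a doc editor issues the structural equivalent of this call.

```python
# "Hosted" and "API" access are the same architecture: a request to a remote model.
# The endpoint below is made up; substitute any vendor's URL and auth header.
import json
import urllib.request

payload = json.dumps({"prompt": "Hello", "max_tokens": 64}).encode()
req = urllib.request.Request(
    "https://api.example-model-vendor.com/v1/generate",   # hypothetical endpoint
    data=payload,
    headers={"Content-Type": "application/json", "Authorization": "Bearer <key>"},
)
# Whether this request is assembled by a script, a web app, a doc editor, or a
# Discord bot, the model still runs on the vendor's hardware, not the user's.
# response = urllib.request.urlopen(req)   # not executed here -- the endpoint is fictional
```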
So what they’re describing isn’t access type — it’s user experience metaphors, framed as if they define system boundaries.
This is not technical classification. It’s a glossary of UI sensations.
It is not science. And it shouldn’t be in a scientific report.
“Open-weight: Weights Available For Download | Users can download and run the model locally”
Yes, “open-weight” means downloadable. But good luck running GPT-4 — even if you had the weights. These models are too large for consumer hardware. They require specialized infrastructure: multi-GPU clusters, optimized inference stacks, and massive RAM.
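The arithmetic is simple. As an illustrative assumption, take a 70-billion-parameter open-weight model stored in 16-bit precision:

```python
# Back-of-the-envelope memory for the weights alone (ignoring activations and KV cache).
params = 70e9          # illustrative assumption: a 70B-parameter open-weight model
bytes_per_param = 2    # fp16 / bf16
weight_bytes = params * bytes_per_param
print(f"{weight_bytes / 1e9:.0f} GB just to hold the weights")   # ~140 GB

# A high-end consumer GPU has 24 GB of VRAM; even aggressive 4-bit quantisation
# (~35 GB) still does not fit, before any working memory is counted.
```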
So “open-weight” doesn’t mean “runnable by the user.” It means legally accessible under a license, with no guarantee of usability.
The fact that small models can run on a laptop doesn’t make them “AI” in the same sense as GPT-4. They’re algorithms informed by AI training methods, not full general-purpose systems.
This is what gets lost in the term “LLM.” It now stretches from multi-billion-parameter models running on supercomputers to lightweight pattern matchers running offline.
This drift in terminology dilutes meaning. Not every model trained on text is an “LLM” in the original sense — and not every “LLM” has comparable reasoning, context retention, or generalization ability.
We should call these models what they are: trained transformers, not general-purpose AI.
“Impact of open-weight general-purpose AI models on AI risks discusses how the open release of model weights affects risks. There are some emerging industry best practices that focus on release and deployment strategies for general-purpose AI. Possible release strategies include releasing the model in stages to learn from real-world evidence prior to full release, providing cloud-based or API (application programming interface) access to have greater ability to prevent misuses.”
Nonsense.
You can’t run Skynet on a laptop. Most “open-weight” models are too small, too constrained, and too hardware-limited to pose systemic threats.
The real risk doesn’t come from the AI. It comes from scale, infrastructure, and integration. Staging releases, cloud APIs, gated access — these aren’t “safety practices.” They’re platform control mechanisms, often wrapped in safety rhetoric to justify vendor dominance.
“Risk management strategies from other domains can be applied to general-purpose AI. Common risk management tools in other safety-critical industries such as biosafety and nuclear safety include planned audits and inspection [..] Although translating best practices from other domains to general-purpose AI can be difficult, there is some guidance on ways it can be done.”
You can’t “audit” a model like you audit a nuclear plant. You can’t “inspect” the training process like you inspect a BSL-4 lab. What do you inspect — the optimizer’s loss curve?
You need to develop appropriate tools. But first you need to learn how it actually works before you do that.
We don’t know how, but we’re confident it could be meaningful.
This is not risk management.
You can't borrow tools from biosafety or nuclear regulation unless you can map:
The system architecture,
The failure modes,
The measurable outputs,
The intervention points.
And they say they don’t know.
I stopped reading at page 37, but I read a few definitions here and there and it’s all the same. So here is a quick fact-based summary.
General! What Now?
Autonomous AI, as it exists today, is widely discussed in theoretical, marketing, and ethical contexts. But from a systems engineering perspective, it fails to meet the minimum structural requirements for high-stakes operational domains. This is not a moral debate. It is a specification failure.
Conditions (AI as it exists today):
No awareness of rules
No internal model of its own decisions
No capacity for refusal
No declarative trace
No assurance of consistency
No concept of context beyond token prediction
No capability to compute with numerical precision
What AI can do:
Limited but very useful reasoning, with no potential to reach consciousness and no potential to reach 'AGI' status (if you hear otherwise, they are categorically wrong, just saying)
Perform transformations over symbolic representations
Deliver bias-free perception in structured tasks (pattern recognition)
There is no domain — legal, medical, military, financial, or civic — where autonomous AI satisfies the structural constraints necessary for unmonitored deployment. Not because it is dangerous in the abstract, but because it cannot meet the operational requirements.
And the constraint cannot be removed without losing AI's reasoning capability. Its probabilistic nature — the very feature that enables symbolic flexibility and contextual adaptation — is what prevents deterministic reliability because it is not anchored in the real world and cannot identify alignment errors.
A human can make a mistake — and (sometimes) know it. An AI can make the same mistake — and do it again, faster, with confidence. That’s not risk. That’s unbounded non-agency.
You don’t need ethics to say no. You just need the spec.
And that means you can’t let an LLM run any critical infrastructure directly. That’s it. Because it cannot know when it makes a mistake. Under the current architecture.
And that is why calling driverless AV systems “AI” is rhetorical nonsense. They are traditional software stacks with probabilistic components.
And AISI - I will come back to you in Part 2
#AutonomousAI #artificialintelligence #myndOS