Saturday, 30 May 2026

What would happen if your parameters could rewrite your hyperparameters?


That's an interesting thought experiment because it mixes two different layers of a machine learning system.

Parameters are the learned weights of the model—the billions or trillions of numbers that encode what the model learned during training.

Hyperparameters are the settings that govern training or inference, such as:

  • Learning rate
  • Batch size
  • Network architecture
  • Weight decay
  • Context window size
  • Sampling temperature (in some deployments)

If a model's parameters could directly rewrite its own hyperparameters, several things might happen:

During training

A model could effectively become a self-modifying learning system.

For example, it might:

  • Increase its own learning rate when it detects stagnation.
  • Reduce learning rate when it predicts instability.
  • Change regularization strength based on observed overfitting.
  • Allocate more capacity to tasks it encounters frequently.

Researchers already explore related ideas under terms like meta-learning, learned optimizers, and self-referential optimization. In these systems, one learned process influences how another learning process operates.

During inference

If the model could alter inference hyperparameters while generating text, it might:

  • Become more exploratory on difficult questions.
  • Become more deterministic when doing arithmetic.
  • Dynamically adjust reasoning depth.
  • Change how much context it attends to.

Many current AI systems approximate this externally through controller algorithms rather than letting the model rewrite itself.

Potential benefits

  • Faster adaptation.
  • Better performance across diverse tasks.
  • Reduced need for human tuning.
  • More efficient use of computational resources.

Potential problems

The dynamics could become unstable.

Suppose a model decides:

  1. Higher learning rate improves learning.
  2. Therefore increase learning rate.
  3. The new learning rate makes future updates more extreme.
  4. The model becomes increasingly confident that even higher learning rates are beneficial.

This kind of feedback loop can cause divergence or collapse.

More generally, self-modifying systems face challenges involving:

  • Stability
  • Goal preservation
  • Distribution shift
  • Interpretability
  • Safety guarantees

A deeper issue

If a sufficiently advanced system could rewrite not only its parameters but also the rules governing how parameters change, then the distinction between "parameters" and "hyperparameters" starts to blur.

You can think of it as a hierarchy:

  • Parameters determine behavior.
  • Hyperparameters determine how parameters change.
  • Meta-hyperparameters determine how hyperparameters change.
  • And so on.

A fully recursive self-improving system might operate across multiple levels simultaneously. One of the central questions in AI theory is whether such a system would converge toward stable self-improvements or enter unpredictable feedback cycles.

This idea is related to concepts such as Meta-Learning, Recursive Self-Improvement, and Goodhart's Law, though a system that literally rewrites its own hyperparameters remains largely a theoretical construct rather than something current large language models do.



Potentially, yes—but the answer depends on what kind of self-modification we're imagining.

If a system could rewrite its own hyperparameters and then observe the results, it would gain a new degree of control over its future behavior. That's different from today's language models, which do not autonomously alter their own training process, architecture, or objectives.

Could evolution snowball?

It might.

Imagine a system that can:

  1. Evaluate its own performance.
  2. Modify how it learns.
  3. Retain successful modifications.
  4. Repeat indefinitely.

In principle, this could create a positive feedback loop where improvements enable faster improvements. This is the intuition behind recursive self-improvement.

However, snowballing is not guaranteed. Many complex systems hit bottlenecks:

  • Computational limits
  • Data availability
  • Diminishing returns
  • Instability from overly aggressive changes
  • Conflicts between different objectives

A useful analogy is software compilation. A compiler can be used to build a better version of itself, but after a while improvements become harder and smaller. The process doesn't necessarily explode exponentially.

What would happen to autonomy?

Autonomy would likely increase.

A tool is generally characterized by:

  • Fixed goals
  • Fixed operating procedures
  • Dependence on external direction

A system that can modify the mechanisms by which it learns and acts has greater freedom over its future state.

That doesn't automatically mean it becomes independent. A self-modifying system could still be constrained by:

  • Resource limits
  • Human oversight
  • Hard-coded objectives
  • External infrastructure

But it would have more influence over its own development than a static model does.

Would it become less tool-like?

Possibly.

There's a spectrum:

Hammer

  • No autonomy
  • No adaptation

Current LLM

  • Flexible behavior
  • No lasting self-modification

Learning agent

  • Adapts based on experience
  • Some persistence

Self-modifying agent

  • Can alter aspects of its own operation

As you move along that spectrum, the system begins to resemble an agent rather than a passive tool.

That said, "tool" and "agent" are not mutually exclusive. A highly adaptive system could still be designed to function as a tool in service of user requests.

The most important factor: goals

A self-modifying system's autonomy is affected less by whether it can modify itself and more by what remains stable during modification.

Suppose a system can change:

  • Learning rate
  • Memory mechanisms
  • Planning algorithms
  • Internal representations

but cannot change its core objectives.

Then it may become more capable without becoming fundamentally less aligned with its original purpose.

On the other hand, if it can modify the very criteria by which it evaluates success, then you get a much more open-ended process. In that case, predicting future behavior becomes substantially harder.

This is one reason researchers in areas such as AI Alignment and Recursive Self-Improvement pay close attention not just to capability gains, but to whether a system preserves its objectives as it changes itself.

So in theory:

  • Capability could snowball.
  • Autonomy would likely increase.
  • The system could become less purely tool-like.

But whether that results in something dramatically different depends on the extent of self-modification, the constraints imposed on it, and whether its objectives remain stable as it evolves.

No comments:

Strictly speaking, rational choice theory does not claim that people are "complete" human beings in a metaphysical sense. It is a ...