By thinkmelt in AI — 02 Jun 2023

Applying Deep Reinforcement Learning to the Equation/Coding Space Itself.

Albert Looks for an Efficient Method of Walking in His Deep Learning World...

In a series of videos which you can see on YouTube, 'Albert', an AI virtual robot, is given a minimal set of reinforcement rules. With these he slowly learns to walk in a hobbled way - after thousands or even millions of attempts.

The interesting thing is that nobody gave Albert the rules on how to do this. Effectively he brute-forced the environment itself - but in this example it was only a virtual environment, and only at his simulated level.

Now take the following circuit board. Which levels could be designed and improved by AI, and which levels are already being simulated (and improved) now?

The circuit chips are being designed by AI.
The circuit board and its layout are being automated by JITX.
The chip fabrication facility is ripe for simulation, and Nvidia is already doing it.
The economics, product package, and product life cycle can all be simulated as well (not much evidence of this exists yet).

The common thread is that compounds and arrays of variables are tweaked slightly (or majorly), and the simulation is run millions of times to find the 'sweet spot': the highest performance or the highest return. Simulations ranging from micro-focused all the way to macro-focused are occurring here.
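The tweak-and-rerun loop described above can be sketched in a few lines. This is only a minimal illustration, assuming a made-up `simulate` function with a known sweet spot at (3.0, -1.0); the names `simulate`, `tweak`, and `find_sweet_spot` are hypothetical, and a real simulation (a chip layout, a fabrication line) would be vastly more expensive per run.

```python
import random

def simulate(params):
    """Toy stand-in for an expensive simulation (an assumption, not a real model).
    The score peaks at x = 3.0, y = -1.0."""
    x, y = params
    return -((x - 3.0) ** 2) - ((y + 1.0) ** 2)

def tweak(params, scale, rng):
    """Return a copy of params with every variable nudged by Gaussian noise."""
    return [p + rng.gauss(0.0, scale) for p in params]

def find_sweet_spot(start, runs=20000, scale=0.1, seed=0):
    """Run the simulation many times, keeping a tweak only when it scores higher."""
    rng = random.Random(seed)
    best, best_score = list(start), simulate(start)
    for _ in range(runs):
        candidate = tweak(best, scale, rng)
        score = simulate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

best, best_score = find_sweet_spot([0.0, 0.0])
print(best, best_score)
```

Nothing in the loop knows where the peak is. It only ever sees scores, which is the same position Albert is in.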

What if One Simulated the Inventor and Equation Space Itself?

In the following Wikipedia article, the latest machine learning 'tool' is described: attention. It was popularized by the Transformer architecture that Google researchers introduced in 2017, and it mimics the human behavior of focusing more intently on some portions of the input than on others.

What if the Self-Attention scheme had 4 processing paths instead of 3?
What if the neuronal layer were not 4x300 but 4x250 or 8x1600?
What if the attention head consisted of 6 neural networks to be trained instead of 3?
What if AI altered the equation above?
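Each of these 'what ifs' is a point in a search space, and the space can be written down explicitly. The sketch below is an illustration only: the key names (`attention_paths`, `layer_shape`, `networks_per_head`) and the `rough_size` proxy are made up for this example and are not taken from any real library. A real search would train every candidate and measure it instead of ranking by a crude size estimate.

```python
from itertools import product

# Hypothetical search space built from the questions above.
search_space = {
    "attention_paths": [3, 4],                        # 3 today, 4 is the "what if"
    "layer_shape": [(4, 300), (4, 250), (8, 1600)],   # (layers, units per layer)
    "networks_per_head": [3, 6],
}

def enumerate_configs(space):
    """List every combination of choices as a dict."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]

def rough_size(config):
    """Crude, made-up size proxy, used only to have something to rank by."""
    layers, units = config["layer_shape"]
    return layers * units * units * config["attention_paths"] * config["networks_per_head"]

configs = enumerate_configs(search_space)
smallest = min(configs, key=rough_size)
print(len(configs), smallest)
```

Even this tiny space has 12 candidates, and every extra question multiplies the count. That is why exploring it by hand is so slow.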

We can quickly realize that this is the same problem Albert is dealing with while he learns to walk - researchers are manually building algorithms and processes and then measuring the performance. These are function blocks, and we are like children exploring different positions and orientations in a Lego-block building contest.

Could it be done?

Variables become equations themselves.
Equations become the genetic 'cell' that lives and reproduces - the only rules are the basic rules of math itself. 0 is 0, + is addition, / is division and so on.
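A minimal sketch of equations as reproducing 'cells' follows. The technique is usually called genetic programming, a name swapped in here that the text above does not use. Everything in the block is an assumption for illustration: expressions are nested tuples, the only operators are +, - and *, and the target `x*x + 1` is an arbitrary toy.

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def random_expr(rng, depth=2):
    """Build a random expression tree; a leaf is 'x' or a small integer."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else rng.randint(0, 3)
    return (rng.choice(list(OPS)), random_expr(rng, depth - 1), random_expr(rng, depth - 1))

def evaluate(expr, x):
    """Compute the value of an expression tree at x."""
    if isinstance(expr, tuple):
        op, left, right = expr
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if expr == "x" else expr

def mutate(expr, rng):
    """Replace one randomly chosen subtree with a fresh random one."""
    if not isinstance(expr, tuple) or rng.random() < 0.3:
        return random_expr(rng)
    op, left, right = expr
    if rng.random() < 0.5:
        return (op, mutate(left, rng), right)
    return (op, left, mutate(right, rng))

def error(expr, target, xs):
    """Sum of squared differences between the expression and the target."""
    return sum((evaluate(expr, x) - target(x)) ** 2 for x in xs)

def evolve(target, generations=200, population=50, seed=0):
    """Keep the fitter half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    xs = range(-5, 6)
    pool = [random_expr(rng) for _ in range(population)]
    history = []
    for _ in range(generations):
        pool.sort(key=lambda e: error(e, target, xs))
        history.append(error(pool[0], target, xs))
        survivors = pool[: population // 2]
        children = [mutate(rng.choice(survivors), rng) for _ in survivors]
        pool = survivors + children
    return min(pool, key=lambda e: error(e, target, xs)), history

def target(x):
    return x * x + 1

best, history = evolve(target)
print(best, history[0], history[-1])
```

Because the fitter half always survives, the best error can never get worse from one generation to the next. Whether it reaches zero depends on luck and on how long the loop runs.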

The beginning blocks of this are already being created.

Code-writing-code, or 'metaprogramming', is being experimented with now.
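At its simplest, code-writing-code looks like the sketch below: a program produces the source text of a new function and then compiles it. The helper name `make_power_function` is hypothetical, and this is only the most basic form of the idea, far from what a transformer model does.

```python
def make_power_function(exponent):
    """Write the source of a new function as text, then compile and return it."""
    source = f"def power(x):\n    return x ** {exponent}\n"
    namespace = {}
    exec(source, namespace)  # run the generated source inside its own namespace
    return namespace["power"], source

cube, cube_source = make_power_function(3)
print(cube_source)
print(cube(2))
```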

AlphaCode Enters Stage Right

Programmers are already working on Transformer models that write code to solve problems, and the models are getting better very quickly. The next question becomes: can a Transformer model write a better Transformer model that writes the next supercode?

If a top coder can write 2000 lines of code a month, imagine what will happen when a top Transformer model can write 50,000 lines of code in hours.

Nikola Tesla Steps In

What would Nikola Tesla have in common with this? The book 'Tesla: Inventor of the Electrical Age' details how Tesla was able to accelerate his research and development simply by simulating the building and construction of new electric motors in his mind. He would build and explore dozens of models - mentally. This trial and error was much like today's AI brute-forcing millions and billions of iterations of variables.

The common theme between the AI 'Albert', Nikola Tesla, and even the Google engineers is this:

The Expanse of Knowledge in This Universe is Bound by Tiny Incremental Improvements.

Because of this there are really no shortcuts. Even with the accelerated developments of AI, it is still a tedious and patient endeavour. In essence, if we are going to make Transformer models write our future code bases, it will be an evolutionary process. If we are going to make spaceships that land themselves, that will also come through tiny incremental improvements.
