Beyond Trial and Error: A Call to r/YC's Visionaries

I have been working in the field of AI (on the technical side) for the last 4 years, and I have always been interested in optimization techniques like reinforcement learning (RL) and genetic algorithms (GAs). I have already applied to W24 with a somewhat doable idea (not an OpenAI wrapper :p). However, there is one idea I have always been intrigued by. I have tried and failed at it multiple times. It doesn't matter to me who does it, but someone needs to, especially now that we have LLMs.

To understand the problem, you need to comprehend how people get things done and the psychology behind it. For any task, we define a set of processes. These processes have some underlying rules. For example,

task = 'get the butter'

processes = 'stand up', 'move your leg', 'walk towards the fridge', 'open the fridge', etc.

rules (for standing up) = you must apply x amount of force on the ground, your body posture must be a certain way, etc.
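The task/process/rules hierarchy above could be written down as plain data. A minimal sketch, using only the names from the example:

```python
# A task decomposes into processes, and each process has underlying rules.
task = {
    "name": "get the butter",
    "processes": [
        {
            "name": "stand up",
            "rules": [
                "apply x amount of force on the ground",
                "keep your body posture a certain way",
            ],
        },
        {"name": "walk towards the fridge", "rules": []},
        {"name": "open the fridge", "rules": []},
    ],
}
```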

These rules are generally traits learned through trial and error. In every business and every profession, your brain's task is to figure out these processes and learn the rules. However, our minds are not designed to find the optimum rule; out of infinitely many possible rules, we stick with the first one that works, and may only adopt a better one 20 years later. The question is: can we build a machine that generates and returns the optimum rules for a given process? I would love to discuss it with all the entrepreneurial geniuses of r/YC.
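To make the "machine that searches for optimum rules" idea concrete, here is a toy GA sketch. It assumes a rule can be encoded as a parameter vector (e.g. force, posture angle, balance for "standing up") and that a simulator can score it; `score_rule` and its hidden optimum are hypothetical stand-ins, not anything real:

```python
import random

def score_rule(rule):
    # Hypothetical simulator: scores a candidate rule by closeness to a
    # hidden optimum (e.g. ideal force, posture angle, balance).
    optimum = [0.5, 0.3, 0.9]
    return -sum((r - o) ** 2 for r, o in zip(rule, optimum))

def evolve(pop_size=50, generations=100, mutation=0.1):
    # Simple genetic algorithm: keep the best half, mutate them into
    # children, repeat. Instead of sticking with the first rule that
    # works, the population keeps searching for a better one.
    population = [[random.random() for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score_rule, reverse=True)
        survivors = population[: pop_size // 2]
        children = [
            [g + random.gauss(0, mutation) for g in random.choice(survivors)]
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=score_rule)

best_rule = evolve()
```

The hard part, of course, is that real processes don't come with a ready-made `score_rule`; that gap is exactly what the discussion below turns to.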
 
@ethomson92 Well, I am highly curious now. Maybe not, but on second thought, I think you are right. If a machine gets the ability to optimize processes, it will basically be able to come up with new, better rules for a process, or in other words 'IDEAS'! And these simulations will be 'IMAGINATION'! HOLY F**K, an AGI! OK, let's drop this now...
 
@anythingbutnormal It is quite a broad question you've put out there, and it depends entirely on what the goal is and how well-defined the environment is for the problem. If you're talking specifically about reinforcement learning, I have also wondered about and had an interest in the topic. If you haven't already, you should check out this video:

LMK if you’re keen to chat
 
@lauriesinglemom This is really good. But it's also similar to most RL and GA problems: you provide a reward function to the agents and they maximize it. I'm talking about the scenarios where we have to make the reward function ourselves. That's where LLMs come in: think of something like AutoGPT, but for devising a reward function based on what the process is and the properties of the performing agent. Once you have a good enough reward function, you can run the simulation and arrive at the optimum rules.
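That two-stage pipeline (LLM devises the reward function, then a simulation maximizes it) could be sketched like this. `ask_llm_for_reward` is a hypothetical placeholder for whatever LLM call you would actually make; here it is hard-coded so the sketch runs:

```python
import random

def ask_llm_for_reward(process, agent_properties):
    # Placeholder for stage 1: in practice, an LLM would turn the process
    # description and agent properties into a scoring function. Hard-coded here.
    def reward(action):
        # Reward actions whose force is close to this agent's ideal force.
        return -abs(action["force"] - agent_properties["ideal_force"])
    return reward

def optimize(reward, trials=200):
    # Stage 2, a crude "simulation": random search over candidate actions,
    # keeping the one the reward function scores highest.
    best, best_r = None, float("-inf")
    for _ in range(trials):
        action = {"force": random.uniform(0, 100)}
        r = reward(action)
        if r > best_r:
            best, best_r = action, r
    return best

reward = ask_llm_for_reward("stand up", {"ideal_force": 42.0})
optimum_rule = optimize(reward)
```

The design point is just the separation of concerns: the LLM only has to produce a scorer, and any off-the-shelf optimizer (GA, RL, random search) can do the maximization.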
 