Void Main Lab

Our Research


Why

At Void Main Lab, we believe the next frontier of AI is creation.
Human creativity is bounded by experience and data. AI, however, has the potential to learn, explore, and invent in ways humans cannot.

While Large Language Models (LLMs) are powerful, they are fundamentally imitation-based: once trained, their weights are fixed, and they stop learning from new experience. True innovation requires systems that:

  • Learn from experience rather than static datasets
  • Discover new knowledge through reinforcement and exploration
  • Invent beyond imitation, generating ideas and tools not found in human data

We align with the vision of Richard Sutton, a pioneer of modern reinforcement learning, who emphasizes that intelligence must be grounded in learning through experience.

Our research is therefore focused not on building larger LLMs, but on AI-native systems that continuously improve, adapt, and create.


Research Principles

  • Openness: Knowledge must be shared to grow.
  • Learning by Experience: Real progress comes from reinforcement, not imitation.
  • Play and Create: Automate the trivial, and let machines explore and invent.
  • Co-Evolution: Human-AI collaboration drives lasting breakthroughs.

Our Core Research Areas

1. Synthetic Data Generation

We design systems that generate high-quality synthetic data, enabling AI to learn from self-created experiences rather than being limited to human-collected datasets. Synthetic data expands the boundaries of what AI can imagine and invent.

2. Long-Context Learning

We research methods that let AI retain information, reason, and plan over long horizons, allowing agents to build deep memory, sustain narratives, and manage complex workflows that span hours, days, or even lifetimes.

3. Reinforcement Learning

At the heart of our research is reinforcement learning — training agents through interaction, feedback, and discovery. We focus on scalable RL methods that enable AI to adapt in open-ended environments and evolve new strategies.
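
To make the loop of interaction, feedback, and discovery concrete, here is a minimal sketch of an epsilon-greedy agent on a toy multi-armed bandit. It is an illustration only, not a description of our own methods; the function name `run_bandit`, the arm means, and the noise model are all assumptions chosen for the example.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a hypothetical multi-armed bandit.

    true_means: hidden average reward of each arm (an assumption of this
    toy environment; the agent never reads it directly).
    Returns the agent's reward estimates and how often it pulled each arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm

    for _ in range(steps):
        # Exploration: with probability epsilon, try a random arm.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            # Exploitation: pick the arm with the best current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Feedback: the environment returns a noisy reward.
        reward = rng.gauss(true_means[arm], 1.0)

        # Learning: incremental update of the running mean.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_bandit([0.0, 1.0, 3.0])
    print("estimates:", [round(e, 2) for e in estimates])
    print("pull counts:", counts)
```

The agent is never told which arm is best. It finds out by acting and observing rewards, which is the same principle, in miniature, that scalable RL methods apply to open-ended environments.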

4. Reward Design

Reward signals define what AI values. We explore advanced reward architectures that balance creativity, safety, and utility — enabling agents to optimize for outcomes that align with human goals while discovering new solutions.
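
As one simple illustration of balancing creativity, safety, and utility, the sketch below combines the three into a single scalar reward. It is a hypothetical example rather than our actual reward architecture: the names `RewardWeights` and `combined_reward`, the weighted-sum form, and the default weights are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; the values are illustrative, not tuned."""
    utility: float = 1.0          # how much task success counts
    novelty: float = 0.3          # bonus for new, creative solutions
    safety_penalty: float = 10.0  # cost subtracted on a safety violation


def combined_reward(utility, novelty, safety_violation, w=RewardWeights()):
    """Weighted sum of utility and novelty, minus a large safety penalty.

    utility and novelty are assumed to be scores in [0, 1];
    safety_violation is a boolean flag from a hypothetical safety check.
    """
    reward = w.utility * utility + w.novelty * novelty
    if safety_violation:
        # The penalty is large enough that no unsafe outcome can
        # outscore a safe one while scores stay within [0, 1].
        reward -= w.safety_penalty
    return reward


if __name__ == "__main__":
    print(combined_reward(utility=1.0, novelty=0.5, safety_violation=False))
    print(combined_reward(utility=1.0, novelty=0.5, safety_violation=True))
```

A weighted sum is the simplest possible choice. Its main limitation is that the weights fix the trade-off in advance, which is one reason richer reward designs are worth exploring.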