To install dependencies:

bun install

Add a .env file and/or set the OPENAI_API_KEY environment variable.

To run:

bun run src/index.ts

Intuition

Patterns in ARC tasks are mostly visual. A VLM viewing the data as images is therefore more likely to detect them than an LLM viewing the data as text.
Tasks can be solved with short programs, given the right abstractions and utilities.
LLMs can write programs, but like a human, they need to test and revise these programs until they work.

To write programs that solve tasks, we provide a Grid class that includes methods to transform the data (select, insert, rotate, etc.). It also includes more complex algorithms (partition, floodFill). See grid.ts.
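To give a feel for the kind of algorithm mentioned above, here is a minimal sketch of a 4-connected flood fill. It is not the code from grid.ts: the standalone `floodFill` function, its parameter order, and the plain `number[][]` representation are all assumptions made for illustration.

```typescript
// Hypothetical helper, NOT the repository's Grid.floodFill.
// Cells are addressed as cells[y][x]; values are color indices.
type Cells = number[][];

function floodFill(cells: Cells, startX: number, startY: number, color: number): Cells {
  const height = cells.length;
  const width = height > 0 ? cells[0].length : 0;
  const out = cells.map(row => [...row]); // work on a copy, leave the input untouched

  // Nothing to do if the start point is outside the grid.
  if (startX < 0 || startX >= width || startY < 0 || startY >= height) return out;

  const target = out[startY][startX];
  if (target === color) return out; // already the requested color

  // Iterative depth-first traversal over 4-connected neighbors.
  const stack: Array<[number, number]> = [[startX, startY]];
  while (stack.length > 0) {
    const [x, y] = stack.pop()!;
    if (x < 0 || x >= width || y < 0 || y >= height) continue;
    if (out[y][x] !== target) continue;
    out[y][x] = color;
    stack.push([x + 1, y], [x - 1, y], [x, y + 1], [x, y - 1]);
  }
  return out;
}
```

An explicit stack is used instead of recursion so that large regions cannot overflow the call stack.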

The solutions.ts file shows that (at least some) tasks can be solved with fairly short programs (5-10 lines of code) using the Grid methods.

For example, the solution to task 00576224 is:

grid => {
  const row = grid.concat(grid, 'x').concat(grid, 'x');
  return row.concat(row.flip('x'), 'y').concat(row, 'y');
}
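To make the example above easier to follow, here is a minimal stand-in for the two methods it uses. `MiniGrid` is a hypothetical class written for this illustration; the real Grid class in grid.ts may behave differently. In particular, it is an assumption that 'x' means "horizontally" (concatenate to the right, mirror left to right) and 'y' means "vertically".

```typescript
type Axis = 'x' | 'y';

// Hypothetical, simplified stand-in for the repository's Grid class.
class MiniGrid {
  constructor(readonly cells: number[][]) {}

  // Assumed semantics: 'x' places `other` to the right, 'y' places it below.
  concat(other: MiniGrid, axis: Axis): MiniGrid {
    if (axis === 'x') {
      if (other.cells.length !== this.cells.length) {
        throw new Error('row counts differ');
      }
      return new MiniGrid(this.cells.map((row, y) => [...row, ...other.cells[y]]));
    }
    if (other.cells[0]?.length !== this.cells[0]?.length) {
      throw new Error('column counts differ');
    }
    return new MiniGrid([...this.cells, ...other.cells].map(row => [...row]));
  }

  // Assumed semantics: 'x' mirrors left to right, 'y' mirrors top to bottom.
  flip(axis: Axis): MiniGrid {
    return axis === 'x'
      ? new MiniGrid(this.cells.map(row => [...row].reverse()))
      : new MiniGrid([...this.cells].reverse().map(row => [...row]));
  }
}

// Same body as the solution above, typed against MiniGrid.
const solve = (grid: MiniGrid): MiniGrid => {
  const row = grid.concat(grid, 'x').concat(grid, 'x');
  return row.concat(row.flip('x'), 'y').concat(row, 'y');
};

const result = solve(new MiniGrid([[1, 2], [3, 4]]));
console.log(result.cells.map(r => r.join(' ')).join('\n'));
```

Under these assumptions, a 2x2 input becomes a 6x6 output: the input tiled three times across, then the mirrored tiling, then the original tiling again.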

Algorithm

For each task:

Create image representations of the input and output data of each test case.
Use a VLM to analyze the image pairs (generate thoughts about the problem and try to identify patterns).
Then, iteratively and within the same growing conversation with the VLM:
- Generate a program that transforms inputs into outputs. The program is a JavaScript function that takes one argument of type Grid and returns a Grid.
- Execute the program on all the test cases.
- If the program is fully successful, stop.
- Otherwise:
  - Generate images of the program output.
  - Insert these images into the conversation.
  - Ask the model to rewrite the program, given its actual vs. expected output.
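The iterative part of the steps above can be sketched roughly as follows. This is not the code from src/index.ts: `refine`, `ProposeProgram`, and `sameGrid` are hypothetical names, the VLM call is abstracted behind a callback, grids are plain `number[][]` rather than Grid instances, raw outputs stand in for the rendered images, and the `maxRounds` cap is an assumption (the description above does not state a stopping limit).

```typescript
type Cells = number[][];
type Example = { input: Cells; output: Cells };
type Program = (input: Cells) => Cells;

// Hypothetical stand-in for the VLM call. It receives the outputs of all
// previous failed attempts (one entry per round) and returns a new program.
type ProposeProgram = (feedback: Cells[][]) => Promise<Program>;

// Two grids match when they have the same shape and the same cell values.
const sameGrid = (a: Cells, b: Cells): boolean =>
  a.length === b.length &&
  a.every((row, y) => row.length === b[y].length && row.every((v, x) => v === b[y][x]));

async function refine(
  examples: Example[],
  propose: ProposeProgram,
  maxRounds = 5, // assumed cap so the loop always terminates
): Promise<Program | null> {
  const feedback: Cells[][] = [];
  for (let round = 0; round < maxRounds; round++) {
    const program = await propose(feedback);

    // Run the candidate on every example; a crash counts as an empty output.
    const actual = examples.map(ex => {
      try {
        return program(ex.input);
      } catch {
        return [];
      }
    });

    // Stop as soon as every example is reproduced exactly.
    if (examples.every((ex, i) => sameGrid(actual[i], ex.output))) return program;

    // Otherwise keep the actual outputs so the next proposal can see them.
    feedback.push(actual);
  }
  return null;
}
```

Keeping the model behind a callback makes the loop testable with a fake proposer, independently of any API key.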

Results

So far, the algorithm fails to solve any of the first 10 tasks in the training set. The VLM does understand some of the patterns and generates plausible programs, but it fails to correct them. The programs are generally more complicated than they need to be. It also seems that VLMs do not generate internal representations that are well suited to detecting the puzzles' patterns.
