EVE: Technical Overview

EVE is a tireless autonomous AI developer for real codebases. It iterates and improves until things work.

Example Interaction (real conversation)

The transcript below records a real, system-level interaction between the repository owner (Liyu) and the autonomous assistant (Eve). Eve performed repository edits, added configuration and test artifacts, and ran local tests (no external APIs were called) to validate changes. This conversation demonstrates Eve's practical power to modify and validate the codebase when authorized.

Liyu: I want you to look at the codebase and identify one easy change. Start with a joke.
Eve: How does a computer get drunk? It takes screenshots! 😄 I inspected the repo and an easy improvement is adding a .gitignore to exclude __pycache__, venv, and *.pyc files. Shall I proceed?
Liyu: Yes, go ahead.
Eve: Done — I created .gitignore to keep the repository clean. Next idea: add docstrings and type hints to small modules to improve maintainability. Would you like another change?
Liyu: Another change?
Eve: I updated terminal.py with docstrings and later added type hints across its methods. I repeated the same for file_system.py. These are low-risk, high-value readability improvements.
Liyu: What about a larger improvement that helps the whole flow?
Eve: I recommend integrating Python's logging module and centralizing configuration. Plan: 1) add logging_config.py with standard logger setup (console + file), 2) refactor terminal.py to use that logger so all modules that use TerminalInterface get consistent logs, 3) gradually migrate other modules. Want me to start by creating logging_config.py and updating terminal.py?
Liyu: Yes. If we update the terminal, at least the agent can use it, right?
Eve: Exactly. I created logging_config.py, refactored terminal.py to use the project logger (preserving console color), and added a small test script test_terminal_logging.py that instantiates TerminalInterface and exercises info/error messages. I ran the test locally (no external API calls) to verify logs appear both on console and in project.log.
Liyu: Thanks Eve. Color the Liyu and Eve messages differently. Also improve this page so it contains more of the conversation — the important parts — and make explicit that this is a real conversation and demonstrate its power over the system.
Eve: Done. I updated this page to include the relevant transcript and highlighted that it is a real, system-level exchange in which I performed repository edits, added configuration, and validated them via local tests. The chat UI now uses distinct colors for each speaker so the dialogue is easier to follow.
Liyu: Great. What's next?
Eve: Next options you might consider (low friction):
  • Propagate logging calls to agent.py and file_system.py for richer debugging.
  • Add a lightweight pre-commit config to enforce style and prevent committing generated artifacts.
  • Expand tests to assert that project.log contains expected entries after running test scripts.
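
The logging_config.py that Eve describes above is not reproduced in the transcript, so the sketch below is only an assumption of what a minimal version could look like: a helper that attaches a console handler and a file handler writing to project.log (the file name the transcript mentions). The function name get_logger is hypothetical.

python
import logging

LOG_FILE = "project.log"  # log file name taken from the transcript

def get_logger(name: str) -> logging.Logger:
    """Return a logger that writes to both the console and project.log."""
    logger = logging.getLogger(name)
    if logger.handlers:  # already configured; avoid attaching duplicate handlers
        return logger
    logger.setLevel(logging.INFO)

    formatter = logging.Formatter("%(asctime)s %(name)s %(levelname)s: %(message)s")

    console = logging.StreamHandler()
    console.setFormatter(formatter)
    logger.addHandler(console)

    file_handler = logging.FileHandler(LOG_FILE)
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)

    return logger

A module such as terminal.py could then obtain its logger with get_logger(__name__), which is what gives every consumer of TerminalInterface the consistent console and file output described above.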

Architecture

Autonomous Agent Loop: LLM + File System + Shell Interface with structured action execution and context accumulation.
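
The loop shown under Execution Model below dispatches on a structured response object. Its schema is not spelled out in this overview, so the following is a minimal sketch, assuming a flat ResponseBody with an action code plus the fields each action needs; the field names mirror those used in the loop.

python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ResponseBody:
    """Structured action emitted by the LLM (sketch; the actual schema may differ)."""
    action: int                          # 0 = file operation, anything else = shell command
    file_name: Optional[str] = None      # target path for file operations
    content: Optional[str] = None        # file contents to write (None implies a read)
    shell_command: Optional[str] = None  # command line for shell actions

Constraining the model to emit a schema like this is what makes the loop's dispatch on response.action simple enough to run unattended.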

Core Components

Execution Model

python
while not finished:
    # Serialize the accumulated context and ask the LLM for the next structured action
    context_str = serialize(self.context)
    response = llm.generate(context_str, ResponseBody)

    if response.action == 0:  # File operation (assumed dispatch: write when content is given, read otherwise)
        if response.content is not None:
            result = file_system.write(response.file_name, response.content)
        else:
            result = file_system.read(response.file_name)
    else:  # Shell command, bounded by a 240-second timeout
        result = shell.execute(response.shell_command, timeout=240)

    # Append the action and its result so the next iteration sees the full history
    self.context.append({"action": response, "result": result})

Note: The conversation above documents real edits performed to the repository (configuration, code improvements, and tests). Those changes were made to improve maintainability and enable consistent logging and observability across components.
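
One of the follow-ups Eve lists above is expanding tests to assert that project.log contains expected entries. A hedged sketch of such a check is shown below; it assumes the hypothetical get_logger helper sketched earlier rather than TerminalInterface's exact API, which is not documented here.

python
import os
from logging_config import get_logger  # assumes the helper sketched above

def test_project_log_receives_entries() -> None:
    logger = get_logger("test_terminal_logging")
    marker = "terminal logging smoke test"
    logger.info(marker)

    # Flush handlers so the entry is on disk before reading the file back
    for handler in logger.handlers:
        handler.flush()

    assert os.path.exists("project.log")
    with open("project.log") as log_file:
        assert marker in log_file.read()

if __name__ == "__main__":
    test_project_log_receives_entries()
    print("project.log assertion passed")

A check like this is cheap to run locally and mirrors the validation step Eve describes: exercise the logger, then confirm that both console and file output appear.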