No description

Find a file

jayjader f6fb910ec0 README.md: mention system context diagrams in `## development`		2026-06-06 21:29:16 +02:00
.idea	first commit	2026-04-10 04:01:31 +02:00
debugging	client.py: Try recovering from the API server closing the existing session connection because of idling by wrapping calls to `requests.Session.post` with a helper method on `RawClient`.	2026-04-13 23:02:09 +02:00
docs	docs/: Add system diagrams using the C4 model approach in plantuml. Add index.md that displays all the diagrams together, and CONTRIBUTING.md with instructions on how to (re-)build the diagrams.	2026-06-06 20:48:03 +02:00
prompts	prompts/summarize_files_in_current_working_directory.md: Tweak the prompt so that very small models (for example, qwen3.5:0.8b) actually read some file contents instead of just guessing from the files' names.	2026-04-10 15:58:52 +02:00
src/jayjaders_llm_harness	Convert project to src-layout. Also reformat some files. Change project name (declared in `pyproject.toml`) from "jayjaders-agent-harness" to "jayjaders_llm_harness".	2026-05-09 15:47:27 +02:00
tests	Convert project to src-layout. Also reformat some files. Change project name (declared in `pyproject.toml`) from "jayjaders-agent-harness" to "jayjaders_llm_harness".	2026-05-09 15:47:27 +02:00
.gitignore	docs/: Add system diagrams using the C4 model approach in plantuml. Add index.md that displays all the diagrams together, and CONTRIBUTING.md with instructions on how to (re-)build the diagrams.	2026-06-06 20:48:03 +02:00
LICENSE	Initial commit	2026-04-10 01:58:12 +00:00
pyproject.toml	pyproject.toml: Bump dev dependency version (mypy: >=1.20.2 -> >=2.00	2026-05-09 16:07:54 +02:00
README.md	README.md: mention system context diagrams in `## development`	2026-06-06 21:29:16 +02:00

README.md

llm-agent-harness

A harness for running an LLM-based agent, suited to my needs:

uses self-hosted LLMs behind an OpenAI-compatible API (explicitly supported: ollama, lm-studio, lemonade)
terminal-first
human access to and control over conversation/session messages
bash command execution tool is sandboxed inside a container

Quick Start
Installation
Usage
- Examples
- Interactive vs One-Shot
Environment Variables
Models
Development
Sandboxing

Quick Start

# Create and activate the virtualenv
python3 -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install .
# Create the required container volume (if using podman)
podman volume create llm-harness-tool-workspace
# Launch in interactive mode
start-harness-jayjader

Installation

Python / the harness itself

The harness uses Python 3.14. You need to create a virtualenv and install the harness's dependencies inside of it. The simplest way to do this is to run the following shell commands from the root directory of the harness's source code:

python3 -m venv .venv
source .venv/bin/activate
pip install .

If you prefer to use uv, do the following:

uv venv .venv
source .venv/bin/activate
uv pip install .

Additional prerequisites

Feature	Preconditions
`tmux` integration	- `tmux` installed - harness process started inside an existing `tmux` client session
`execute_sandboxed_command` tool	- container runtime engine installed (`podman` or `docker`) - a named volume created that can be mounted (ex: `podman volume create llm-harness-tool-workspace`)

Usage

$ start-harness-jayjader -h
usage: start-harness-jayjader [-h] [--model-name MODEL_NAME] [--stream] [--session-name SESSION_NAME] [--resume-session RESUME_SESSION | --clone-session CLONE_SESSION]
                              [--interactive | --one-shot] [--test-connection | --refresh-models | --prompt PROMPT | --prompt-file PROMPT_FILE]

Jayjader's LLM Harness

options:
  -h, --help            show this help message and exit
  --model-name, --model, -m MODEL_NAME
                        The name of the model to use for inference. If omitted, the last model used by the harness will be used.
  --stream, -s          Have the API stream responses (default: False)
  --session-name, -n SESSION_NAME
                        Name for the new session
  --resume-session, -r RESUME_SESSION
                        Resume an existing session by its ID
  --clone-session, -c CLONE_SESSION
                        Clone an existing session and continue from it
  --interactive         Run in interactive mode (default when no mode specified)
  --one-shot            Run in one shot mode
  --test-connection     Test connection to inference provider API
  --refresh-models      Refresh the list of models the API provides
  --prompt, -p PROMPT   The exact prompt to be sent to the LLM as the "user message"
  --prompt-file, -P PROMPT_FILE
                        The path to a file whose contents will be sent to the LLM as the "user message"

Example uses:

`start-harness-jayjader`

Starts the harness in interactive mode. If the list of available models is not known, the user will be asked to confirm the refresh of this list with the declared inference provider. The previous model (name stored in the harness's data directory) will be used. If no previous model is known, the user is prompted to choose from the list of known models. The session name will be the current date and time as a string.

`start-harness-jayjader --interactive`

Same as no args.

`start-harness-jayjader --test-connection`, `start-harness-jayjader --interactive --test-connection`

Tests the connection to the inference provider, then same as no args.

`start-harness-jayjader --one-shot --test-connection`

Tests the connection to the inference provider, then exits.

`start-harness-jayjader --refresh-models`, `start-harness-jayjader --interactive --refresh-models`

Refreshes the list of models available at the inference provider, then same as no args.

`start-harness-jayjader --one-shot --refresh-models`

Refreshes the list of models available at the inference provider, then exits.

`start-harness-jayjader --one-shot --prompt <message>`

Runs the agentic inference loop on "<message>", then exits. The previous model (stored in the harness's data directory) will be used. The session name will be the current date and time as a string. If the list of available models is not known or empty, or the previous used model is not known or not among the list of available models, a warning is printed to stderr. The harness then exits without running inference.

`start-harness-jayjader --one-shot --prompt-file <file path>`

Same as --one-shot --promt <message>, but the agentic inference loop is run on the contents of <file path>.

`start-harness-jayjader --session-name <name>`, `start-harness-jayjader --interactive --session-name <name>`

Same as no args, but the session name will be "<name>" instead of the current date and time.

`start-harness-jayjader --one-shot --session-name <name> --prompt-file <file path>`, `start-harness-jayjader --one-shot --session-name <name> --prompt <message>`

Same as --one-shot --prompt <message> or --one-shot --prompt-file <file path>, but the session name will be "<name>" instead of the current date and time.

`start-harness-jayjader --resume-session`, `start-harness-jayjader --interactive --resume-session`

Resumes an existing/prior session instead of starting a new (empty) session. Otherwise, same as no args.

`start-harness-jayjader --one-shot --resume-session --prompt <message>`, `start-harness-jayjader --one-shot --resume-session --prompt-file <file path>`

Resumes an existing/prior session instead of starting a new (empty) session. Otherwise, same as --one-shot --prompt <message> or --one-shot --prompt-file <file path>.

`start-harness-jayjader --clone-session <name>`, `start-harness-jayjader --interactive --clone-session <name>`

Clones an existing/prior session instead of starting a new (empty) session. The resulting session is named "<name>-clone". Otherwise, same as no args.

`start-harness-jayjader --one-shot --clone-session <name> --prompt <message>`, `start-harness-jayjader --one-shot --clone-session <name> --prompt-file <file path>`

Clones an existing/prior session instead of starting a new (empty) session. The resulting session is named "<name>-clone". Otherwise, same as --one-shot --prompt <message> or --one-shot --prompt-file <file path>.

`start-harness-jayjader --clone-session <existing name> --session-name <clone name>`, `start-harness-jayjader --interactive --clone-session <existing name> --session-name <clone name>`

Clones the existing/prior session named "<existing name>" instead of starting a new (empty) session. The resulting session is named "<clone name>". Otherwise, same as no args.

`start-harness-jayjader --one-shot --clone-session <existing name> --session-name <clone name> --prompt <message>`, `start-harness-jayjader --one-shot --clone-session <existing name> --session-name <clone name> --prompt-file <file path>`

Clones the existing/prior session named "<existing name>" instead of starting a new (empty) session. The resulting session is named "<clone name>". Otherwise, same as --one-shot --prompt <message> or --one-shot --prompt-file <file path>.

Interactive vs One-Shot

The harness runs the same "agentic loop" (inference → tool calls → more inference until a final answer) in both modes. The main difference is whether you stay inside the harness afterward. While in interactive mode you input the prompt once the harness has started, in one-shot mode you must provide the prompt up-front, as a command-line argument to the harness.

Interactive Mode (default)

Start in interactive mode by simply running the harness, or optionally with the --interactive flag. You will be greeted with a > prompt where you can type a message and run commands (e.g. /model, /refresh-models, /session new). After each assistant response, you are returned to the same prompt where you can run a command or continue the conversation with the LLM.

Arguments for Interactive Mode

Argument	Behavior
`--model-name <name>`	Sets the initial model used for inference
`--stream`	Enables streaming responses by default during interactions
`--session-name <name>`	Sets a custom name for the session
`--resume-session <name>`	Chooses an existing/prior session to be resumed
`--clone-session <name>`	Chooses an existing/prior session to be cloned. The cloned session is then resumed, leaving the original session unaltered.
`--test-connection`	Tests the connection to the inference provider before entering interactive mode. If the test fails, the harness exits instead.
`--refresh-models`	Refreshes the list of models available at the inference provider before entering interactive mode

Commands in Interactive Mode

Command	Description	Autocomplete Available?	Requires Tmux?
`/model <name>`	Switch which model is used for the next generation(s)	✅	❌
`/session new`	Start a new session with zero message history. The session name will be the current date and time.	N/A	❌
`/session rename <name>`	Rename the current session	N/A	❌
`/session resume <name>`	Resume an existing session	✅	❌
`/session clone`	Clone the current session into a new one	N/A	❌
`/session show-messages`	Display all the messages in the current session	N/A	❌
`/session edit`	Open the session log file in `nvim`	N/A	✅
`/stream`	Toggle streaming inference responses on/off	N/A	❌
`/continue`	Trigger inference with the current session state	N/A	❌
`/editor`	Edit the prompt in `nvim` before sending	N/A	✅
`/refresh-models`	Refresh the list of available models from the current inference provider	N/A	❌
`/test-connection`	Test connectivity to the inference provider	N/A	❌
`/promptfile <path>`	Load the contents of a file in `<root harness data dir>/prompts` as the next prompt and start inference	✅	❌

One-Shot Mode

Start in one-shot mode by specifying the --one-shot flag. In one-shot mode, the harness needs some work to do: you must either specify a command (--refresh-models, test-connection) or provide a prompt to trigger inference. You can either write your prompt inline (e.g. --prompt "What do you know about LLMs?") or give the harness a path to a file containing your prompt (e.g. --prompt-file ./my_prompt.txt). After the command is executed or the agentic loop finishes, the harness exits with shell code 0.

Arguments for One-Shot Mode

Argument	Behavior
`--model-name <name>`	Sets the model used for inference
`--stream`	Enables streaming responses
`--session-name <name>`	Sets a custom name for the inference session
`--resume-session <name>`	Chooses an existing/prior session to be resumed for inference
`--clone-session <name>`	Chooses an existing/prior session to be cloned and resumed for inference. The original session is not modified.
`--test-connection`	Tests the connection and then exits
`--refresh-models`	Refreshes the list of models available at the inference provider and then exits
`--prompt <message>`	Sets the prompt used for inference to "`<message>`", runs the agentic inference loop, and then exits.
`--prompt-file <file path>`	Sets the prompt used for inference to the contents of `<file path>`, runs the agentic inference loop, and then exits.

Differences between one-shot and interactive modes

Situation	One-Shot Mode	Interactive Mode
No model specified as command-line argument and no previous model saved in state directory	The harness prints a warning message to `stderr` and then exits.	The user is prompted to chose one of the known models to be used before they can input a prompt or run a command.
The list of known models is empty and `--refresh-models` is not present as a command-line argument	The harness prints a warning to `stderr` and then exits.	The user is asked if the harness can refresh the model list. If they refuse, the harness exits. If they accept, the list is fetched from the inference provider. If the previous model is among the list and no model was chosen via command-line argument, that model is chosen by the harness. If the model chosen via command-line argument is among the list, that model is chosen by the harness. Otherwise, the harness prompts the user to chose from the list before allowing them to input a prompt or run commands.

Many of the commands in interactive mode have a similar command-line flag that can be used in one-shot mode, though their syntax and behavior are not always exactly equivalent.

Interactive Command	One-Shot Flag	Behavior Differences
`/test-connection`	`--test-connection`	In interactive mode, the connection test is run and then the harness properly enters interactive mode. In one-shot mode, the connection test is run and then the harness exits.
`/refresh-models`	`--refresh-models`	In interactive mode, the model list is refreshed and then the harness properly enters interactive mode. In one-shot mode, the model list is refreshed and then the harness exits.
`/stream`	`--stream`	In interactive mode, this toggles between on and off. In one-shot mode, streaming is on if the flag is present, off otherwise.
`/promptfile <path>`	`--prompt-file <path>`	In interactive mode, paths are considered relative to `<root harness data dir>/prompts`. In one-shot mode, paths are absolute or relative to the current working directory.
`/session resume <name>`	`--resume-session <name>`	(None)
`/session clone`	`--clone-session <name>`	In interactive mode, this clones the current session. In one-shot mode, this clones the given named session.

The main difference in options is with renaming an existing session. In interactive mode, you can use /session rename <name> to change the name of the current session. One-shot mode has no direct equivalent; you cannot change the name of an existing session. The closest approximation is to combine --clone-session and --session-name flags. Note that this is equivalent to running the following two commands in interactive mode in succession: /session clone, then /session rename <name>.

Environment Variables

The following settings can be overridden by setting environment variables and/or preparing a .env file (python-dotenv is used to load the values found in any surrounding .env file):

Env var name	Setting / Description	Expected value format	Default value
`HARNESS_ROOT_DIRS_NAME`	the name used for the harness state and data directories	any valid directory name	`"jayjaders-llm-harness"`
`XDG_STATE_HOME`	the parent directory in which the harness creates its state directory (`HARNESS_ROOT_DIRS_NAME`) to store state that is to be resumed on next run (like the current model)	any valid path	`~/.local/state` on Linux, `~/Library/Application Support` on macOS
`XDG_DATA_HOME`	the parent directory in which the harness creates its data directory (`HARNESS_ROOT_DIRS_NAME`) to store longer-term harness data (like the session message logs and reusable prompts)	any valid path	`~/.local/share` on Linux, `~/Library/Application Support` on macOS
`OLLAMA_API_URL`	host address of the inference provider used	any valid url	`"http://localhost:11434"`
`HARNESS_INFERENCE_PROVIDER`	the software running as inference provider	`"ollama"` or `"lm-studio"` or `"lemonade"`	`"ollama"`
`SANDBOX_CONTAINER_RUNTIME_EXECUTABLE`	the container engine used on the host machine for sandboxing command execution	`"podman"` or `"docker"`	`"podman"`
`SANDBOX_CONTAINER_IMAGE`	the container image used for sandboxing command execution	any non-empty string that resolves to a container image	`"alpine"`
`SANDBOX_CONTAINER_WORKSPACE_VOLUME_NAME`	the container volume used for receiving and persisting file writes caused by sandboxed command execution	any non-empty string that resolves to an existing container volume	`"llm-harness-tool-workspace"`

Models

The harness will attempt to read the list of available models from the API on first startup. It saves this list in a file named models.jsonl and reads from it on subsequent startups. The harness will force you to choose a model if it cannot find the model that was used when it last ran. To force a refresh of the list of available models from within a running instance of the harness, use the /refresh-models command. To force a refresh at the start of interactive mode, pass the --refresh-models flag. To just refresh the list of models and then exit, pass the --refresh-models flag in one-shot mode.

The harness will attempt to determine which models support tool-calling when fetching/refreshing the list, depending on the declared type for the inference provider.

Local Files

The harness stores information in several files on-disk:

Location / File Name	Information Stored
`$XDG_STATE_HOME/$HARNESS_ROOT_DIRS_NAME/models.jsonl`	The list of models available at the current inference provider
`$XDG_STATE_HOME/$HARNESS_ROOT_DIRS_NAME/previous_model_name`	The name of the last model used by the harness for inference
`$XDG_STATE_HOME/$HARNESS_ROOT_DIRS_NAME/logs/<YYYY-MM-DD_HH-MM-SS>`	The logs for harness runs, with the date time being when the harness is started
`$XDG_DATA_HOME/$HARNESS_ROOT_DIRS_NAME/sessions/session_<name>.jsonl`	The past messages for each session - these are the single source of truth for session contents as well as which sessions the harness knows about

Development

You might find it useful to install the additional dev dependencies after creating the virtualenv, and to install the project in "editable" mode:

.venv/bin/pip install -e '.[dev]'

"Editable" mode allows you to test source code modifications without re-running the installation command, while the dev dependencies are required for:

running tests (with pytest)
(re)formatting source code (with ruff)

Check out ./docs/index.md for some diagrams that will give you an overview of the system.

Sandboxing

The harness relies on a docker or podman container to sandbox the arbitrary shell command execution tool that it proposes to LLMs. In theory, any OCI-compatible engine should work, but for now the only supported tools are those that allow running as if they were podman (the exact invocation the harness performs is along the lines of <engine> run -u <user id> -v <volume name>:<mountpoint inside container> -w <working directory> <image name> and can be found in ./tools.py). The harness also relies on a named volume being mounted inside the running container to provide a writable directory to the agent; not only allowing the agent to write to files and the like but, equally importantly, letting the user access any files created by the agent's tool calls once the sandboxing container shuts down.

README.md

llm-agent-harness

Table of Contents

Quick Start

Installation

Python / the harness itself

Additional prerequisites

Usage

Example uses:

start-harness-jayjader

start-harness-jayjader --interactive

start-harness-jayjader --test-connection, start-harness-jayjader --interactive --test-connection

start-harness-jayjader --one-shot --test-connection

start-harness-jayjader --refresh-models, start-harness-jayjader --interactive --refresh-models

start-harness-jayjader --one-shot --refresh-models

start-harness-jayjader --one-shot --prompt <message>

start-harness-jayjader --one-shot --prompt-file <file path>

start-harness-jayjader --session-name <name>, start-harness-jayjader --interactive --session-name <name>

start-harness-jayjader --one-shot --session-name <name> --prompt-file <file path>, start-harness-jayjader --one-shot --session-name <name> --prompt <message>

start-harness-jayjader --resume-session, start-harness-jayjader --interactive --resume-session

start-harness-jayjader --one-shot --resume-session --prompt <message>, start-harness-jayjader --one-shot --resume-session --prompt-file <file path>

start-harness-jayjader --clone-session <name>, start-harness-jayjader --interactive --clone-session <name>

start-harness-jayjader --one-shot --clone-session <name> --prompt <message>, start-harness-jayjader --one-shot --clone-session <name> --prompt-file <file path>

start-harness-jayjader --clone-session <existing name> --session-name <clone name>, start-harness-jayjader --interactive --clone-session <existing name> --session-name <clone name>

start-harness-jayjader --one-shot --clone-session <existing name> --session-name <clone name> --prompt <message>, start-harness-jayjader --one-shot --clone-session <existing name> --session-name <clone name> --prompt-file <file path>