Skip to content

Load Function

The load function is the core equation that decides which agent gets the resource next.

The Equation

L(i) = alpha * (Qi / max Qj) + (1 - alpha) * (Dmax_i / max Dmax_j)
Term What it is
Qi Weighted queue depth -- sum of task.weight in agent i's queue
Dmax_i Age of the oldest waiting task (measured in ticks)
alpha Tradeoff: 0.0 = latency-first, 0.5 = throughput-first

Both terms are normalized across all competing agents. An agent with Q=10 when everyone else has Q=10 scores the same as Q=1 when everyone else has Q=1. Relative load, not absolute.

The agent with the highest L(i) gets the resource next.

Ticks

A tick is one unit of work completed -- not wall clock time. Each release() increments the tick counter and ages all waiting tasks by 1.

  • Under heavy load: ticks fire fast, priority shifts quickly
  • Under low load: ticks fire slowly, priority barely shifts
  • No contention: no scoring needed, immediate grant

Priority only shifts when there's actual contention. This is by design -- scheduling overhead is zero when the system isn't loaded.

Alpha

Alpha controls the tradeoff between latency (serve longest-waiting agents) and throughput (serve deepest-backlog agents).

Setting alpha Behavior Use when
"latency" 0.0 Serve longest-waiting agents first Webhooks, user-facing requests
"balanced" 0.25 Default Most workloads
"throughput" 0.5 Serve deepest-backlog agents first Batch processing, ETL
from loco import AsyncLOCOScheduler

# Using the string API
scheduler = AsyncLOCOScheduler(agents, resource, optimize_for="balanced")

# Using the numeric API
scheduler = AsyncLOCOScheduler(agents, resource, alpha=0.25)

Adaptive Alpha

With auto_tune=True, the scheduler adjusts alpha automatically based on observed wait-time variance:

  • High wait-time variance across agents: alpha nudges down (toward latency/fairness)
  • Growing queue depths: alpha nudges up (toward throughput)
  • Clamped to [0.0, 0.5] safe range
scheduler = AsyncLOCOScheduler(agents, resource, auto_tune=True)

Task Weight

Task weight is a cost proxy set at submit time. The scheduler uses it for queue depth scoring.

Model tier Typical weight
haiku / gpt-4o-mini / gemini-flash 1.0
sonnet / gpt-4o / gemini-pro 2.0--3.0
opus / o1 5.0

With the convenience API:

await loco.wrap(fn, agent_id="analyst", weight=2.0, ...)

With the full API:

await scheduler.submit_task("analyst", Task(weight=2.0))

Framework adapters compute weight automatically from model name and prompt length.

Scoring at Grant Time

A critical design choice: scores are computed when a resource slot becomes available (grant time), not when the agent first calls acquire() (request time).

This prevents priority inversion. An agent that has been waiting for 10 ticks will have a higher Dmax than one that just arrived, even if the newcomer has a deeper queue. The waiting agent's urgency grows over time.