Nahom Zewdu

Nuvom: Task Scheduling for Python

Challenge

Asynchronous systems usually need a task scheduler, but existing solutions like Celery are feature-rich to the point of being complex to operate. They target large, distributed deployments rather than developers who want something simple. The challenge: can we build a task scheduler that handles the common patterns (retry logic, scheduling, monitoring) without the operational overhead?

Additionally, many schedulers assume a particular broker, most often Redis. What if you wanted to swap storage backends? What if you wanted to understand exactly how scheduling works without diving into thousands of lines of abstraction?

Design Decisions

Dynamic Scheduling with Multiple Modes

Instead of building separate primitives for one-off tasks, periodic jobs, and cron expressions, we unified them around a single scheduling interface. This simplified the API and made the mental model clearer: everything is a task with an execution time. The scheduler executes tasks when their time arrives.
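The "everything is a task with an execution time" model can be sketched roughly as follows. This is an illustrative sketch, not Nuvom's actual API: the names ScheduledTask, Scheduler, schedule, and run_due are hypothetical. A one-off task has no interval, a periodic task is re-queued with a new execution time after it runs, and a cron expression would only need a function that computes the next run time (not shown here).

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(order=True)
class ScheduledTask:
    # Hypothetical record: ordering uses run_at first, then seq as a tie-breaker.
    run_at: float
    seq: int
    func: Callable[[], object] = field(compare=False)
    interval: Optional[float] = field(default=None, compare=False)  # None means one-off


class Scheduler:
    """Hypothetical scheduler: a min-heap of tasks keyed by execution time."""

    def __init__(self) -> None:
        self._heap: List[ScheduledTask] = []
        self._counter = itertools.count()

    def schedule(self, func: Callable[[], object], run_at: float,
                 interval: Optional[float] = None) -> None:
        task = ScheduledTask(run_at, next(self._counter), func, interval)
        heapq.heappush(self._heap, task)

    def run_due(self, now: float) -> list:
        """Run every task whose execution time has arrived and return the results."""
        results = []
        while self._heap and self._heap[0].run_at <= now:
            task = heapq.heappop(self._heap)
            results.append(task.func())
            if task.interval is not None:
                # A periodic job is the same task with a new execution time.
                self.schedule(task.func, task.run_at + task.interval, task.interval)
        return results
```

With this shape, the three modes differ only in how the next execution time is computed, so the run loop stays the same for all of them.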

Pluggable Backends

By designing the storage and queue layers as interfaces, we avoided locking users into Redis. Tasks are serialized with msgpack, which is typically more compact than JSON and usually faster to encode and decode. The abstraction adds little overhead in exchange for considerable flexibility.
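One way to express a pluggable queue layer is a small structural interface that any backend can satisfy. The sketch below is an assumption about the shape, not Nuvom's real interface: QueueBackend, InMemoryQueue, submit, and fetch are hypothetical names. To keep it runnable with only the standard library, json stands in for msgpack; with msgpack installed, msgpack.packb and msgpack.unpackb would take the place of the encode and decode helpers.

```python
import json
from collections import deque
from typing import Deque, Optional, Protocol


class QueueBackend(Protocol):
    """Hypothetical interface: anything with these two methods can be a backend."""

    def enqueue(self, payload: bytes) -> None: ...

    def dequeue(self) -> Optional[bytes]: ...


class InMemoryQueue:
    """Simplest possible backend; a Redis-backed class would expose the same methods."""

    def __init__(self) -> None:
        self._items: Deque[bytes] = deque()

    def enqueue(self, payload: bytes) -> None:
        self._items.append(payload)

    def dequeue(self) -> Optional[bytes]:
        return self._items.popleft() if self._items else None


def encode(task: dict) -> bytes:
    # Stand-in serializer (assumption): json instead of msgpack.
    return json.dumps(task).encode("utf-8")


def decode(payload: bytes) -> dict:
    return json.loads(payload.decode("utf-8"))


def submit(backend: QueueBackend, task: dict) -> None:
    """Serialize a task and hand it to whichever backend was configured."""
    backend.enqueue(encode(task))


def fetch(backend: QueueBackend) -> Optional[dict]:
    """Return the next task, or None when the queue is empty."""
    payload = backend.dequeue()
    return decode(payload) if payload is not None else None
```

Because submit and fetch depend only on the interface, swapping the backend means passing a different object and nothing else changes.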

Static Task Discovery via AST Introspection

Rather than requiring users to register tasks explicitly, we statically inspect the Python AST to find all decorated task functions. You just define your tasks, and the CLI discovers them automatically.

Resilient Worker Pool with Fault Isolation

Workers are ephemeral processes. If a worker dies, another picks up its work. Failed tasks are retried with exponential backoff, and a failure doesn't poison the rest of the pool: each task fails independently.
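Retry with exponential backoff can be sketched as below. The base delay, cap, and retry count are illustrative values, and backoff_delay and run_with_retries are hypothetical helpers, not Nuvom's actual functions or defaults. The delay doubles with each attempt up to a cap, and once retries are exhausted the exception is raised for that one task only.

```python
import time
from typing import Callable


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def run_with_retries(func: Callable[[], object], max_retries: int = 3,
                     sleep: Callable[[float], None] = time.sleep) -> object:
    """Call `func`, retrying on any exception with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                # Retries exhausted: the error propagates for this task alone.
                raise
            sleep(backoff_delay(attempt))
```

Passing sleep as a parameter is a design choice made for this sketch so the retry logic can be exercised without real waiting. A production version would typically also add random jitter to the delay so that many failing tasks do not retry at the same moment.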

Tradeoffs

Simplicity vs. Completeness

We chose simplicity. Nuvom doesn't handle all edge cases, but it handles the common ones well. For unusual requirements, you're better served by a more complete system.

Performance vs. Clarity

The code prioritizes readability. This means it's not the fastest task scheduler, but you can understand how it works by reading the source. That clarity matters more than squeezing out extra throughput.

Flexibility vs. Conventions

Pluggable backends add flexibility, but they also add API surface area. We tried to minimize this by having sensible defaults while allowing power users to customize.

Operational Overhead

Monitoring is built in through Prometheus metrics. Task execution time, queue depth, and failure rates are all observable out of the box.

Outcome

Nuvom successfully demonstrates that you don't need thousands of lines of code to build a useful task scheduler. The result is a system that solo developers can understand, deploy, and maintain without becoming experts in distributed systems.

The project supports the hypothesis that, for developer tools, simplicity is more valuable than feature completeness. Users should be able to read the source code and understand what's happening.

Technical Stack

Python · Pydantic · Redis · msgpack · Prometheus · Typer