2023 · Workforce-management SaaS · CTO engagement

Exiqtive

A multi-tenant workforce-management SaaS — Django + DRF backend, React frontend, Terraform-managed AWS. We've held the CTO chair since 2023, architecting all three layers and growing the engineering team into the system as it scaled.

Role
CTO · architecture · backend · frontend · infra
Duration
Since 2023, ongoing
Team
1 Hazenfield CTO
Context

Exiqtive (a.k.a. Clear Intent) is a workforce-management SaaS for distributed operations teams — accounts, employees, organisational structure, badges and checkpoints, performance scorecards, and a readiness signal that cascades bottom-up through the org tree. Multi-tenant, OAuth2-authenticated, real-time over WebSocket, with forty-two notification workflows wiring it to the people who care.

We've held the CTO chair since the project began in 2023. That meant architecting all three layers at once (backend on Django + DRF + Celery + Channels, frontend on React with a Storybook-anchored design system, infrastructure on Terraform-managed AWS), at first single-handedly, then as the foundation a growing team of engineers plugged into.

The shape we chose was deliberate. A service layer rather than fat views. Async cascades through Celery with `transaction.on_commit`, so the next step runs only after the database has committed the previous one. Multi-tenant isolation enforced at the row level by Postgres RLS, not just by the application layer. OAuth2 scopes gating the endpoints. A parallel v1 / v2 API surface so paying customers never lose a contract. Tests fast enough to run on every push: 2,220 of them, in about two and a half minutes.

Two and a half years in, the architectural shape still holds. A dedicated DevOps engineer eventually took over the day-to-day of the infrastructure we set up; the architecture, the boundaries between apps, the notification taxonomy, the test investments — all still ours to evolve.

Scope

What we built.

01 · Architecture (3 layers)

Backend, frontend and AWS infrastructure architected as a single coherent system, before the team grew to fit it.

02 · common

Core Django app: accounts, employees, organisational structure, content management, notifications. ~275 migrations.

03 · proficiency

Badge / checkpoint / responsibility lattice — the heart of the readiness model. 61 KB of signal handlers; ~194 migrations.

04 · performance

KPIs, performance periods, scorecards.

05 · assignments

Async assignment processing — no models, only Celery tasks; relief processes with Redis locks and progress tracking.

06 · OAuth2 + scopes

django-oauth-toolkit with read / write / admin scopes per resource family; 8 DRF permission classes layered on top.

07 · Multi-tenant via RLS

Postgres row-level security policies + RLSMiddleware. Tenant leakage stops at the database.

08 · API v1 / v2 in parallel

Customer contracts age in place — both versions are first-class citizens of the codebase and the test suite.

09 · WebSocket layer

Django Channels with OAuth2-authenticated tenant connections; channel routing supports `/ws/account/<id>/invites/`.

10 · Notification taxonomy

42 Knock workflows defined in code, used across the cascade and the activity layer.

11 · Test infrastructure

2,220 tests; `--keepdb --parallel N`; in-memory broker / channel / cache for the test environment. 13 min serial → ~2.5 min parallel.

12 · Terraform AWS estate

VPC, ALB, ECR, bastion, SES, RDS, multi-environment (dev / staging / prod / v1). Operationally owned by a dedicated DevOps engineer today; we still review.

Approach

What the work looked like, in four pieces.

01

Architect across three layers

Held the CTO chair from day one. Backend on Django + DRF + Celery + Channels, frontend on React with a Storybook-anchored design system, infra on Terraform-managed AWS (VPC, ALB, ECR, bastion, SES, RDS, multi-environment). All three came from the same hand, deliberately, so the seams between them would be where we decided rather than where the org chart left them. A dedicated DevOps engineer later took over the day-to-day of the infrastructure; the architectural shape stays.

02

Multi-tenant by row, not by app

Tenant isolation is enforced at the Postgres level by RLS, not just at the application layer. A misconfigured permission class can't accidentally leak across accounts — the database refuses. RLSMiddleware sets the session variable before every request, OAuth2 scopes gate the endpoints, and eight permission classes layer on top.

03

Async cascades that respect the DB

Readiness cascades bottom-up through five levels of the org tree (responsibility → role → position assignment → employee → account). Each level uses Celery + `transaction.on_commit` so the next stage only fires after the previous write has committed. Retries with backoff; Redis locks to prevent concurrent recalcs of the same account.
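The "Redis locks to prevent concurrent recalcs" part could look roughly like the sketch below. It is an illustration, not the project's code: the key format, the TTL and the name `account_recalc_lock` are assumptions. It relies only on two redis-py client calls, `set(..., nx=True, ex=...)` and `eval(...)`, so any client object exposing those works.

```python
import uuid
from contextlib import contextmanager

# Delete the key only if it still holds our token, so a lock that expired
# and was re-acquired by another worker is never released by mistake.
RELEASE_SCRIPT = """
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
end
return 0
"""


@contextmanager
def account_recalc_lock(client, account_id, ttl_seconds=300):
    """Yield True if this worker holds the recalc lock for the account.

    `client` is a redis-py style client. Key name and TTL are hypothetical.
    """
    key = f"readiness:recalc:account:{account_id}"
    token = uuid.uuid4().hex
    # SET NX EX: succeeds only when the key does not exist yet, and expires
    # on its own if the worker dies before releasing it.
    acquired = bool(client.set(key, token, nx=True, ex=ttl_seconds))
    try:
        yield acquired
    finally:
        if acquired:
            client.eval(RELEASE_SCRIPT, 1, key, token)
```

A task would wrap its work in `with account_recalc_lock(client, account_id) as held:` and skip or retry when `held` is false. The token check on release matters because the TTL can expire while a slow recalculation is still running.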

04

Test fast or test never

A 2,220-test suite that took thirteen minutes to run serially was a tax we couldn't afford to pay daily. We invested in parallel test infrastructure: `--keepdb --parallel N`, an in-memory broker, channel layer and cache for the test environment, and a faster password hasher. The result: two and a half minutes on an 8-core machine, with 80% coverage gated in CI.

Engineering highlights

A handful of the solves we are proudest of.

01

RLS-enforced multi-tenancy

Multi-tenant isolation lives at the Postgres level: RLSMiddleware sets the session variable before every request; row-level security policies on the tenanted tables filter automatically. A bug in a permission class or a forgotten queryset filter cannot leak across accounts — the database stops it. Belt and braces with the application-layer permission classes.
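A minimal sketch of the middleware idea follows, under stated assumptions. The setting name `app.current_account_id`, the `request.account_id` attribute, the example policy and the class name are hypothetical; the source only says that a session variable is set before every request. The database connection is injected so the sketch runs on its own; in Django one would pass `django.db.connection`.

```python
# Hypothetical setting name, not taken from the project.
TENANT_SETTING = "app.current_account_id"

# Illustrative policy that such a variable could feed
# (table and column names are assumptions).
POLICY_SQL = """
ALTER TABLE employees ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON employees
    USING (account_id = current_setting('app.current_account_id')::integer);
"""


class RLSMiddlewareSketch:
    """Set the tenant session variable around each request."""

    def __init__(self, get_response, connection):
        self.get_response = get_response
        self.connection = connection  # e.g. django.db.connection

    def _set_tenant(self, value):
        # set_config() accepts bound parameters, unlike a plain SET statement.
        with self.connection.cursor() as cursor:
            cursor.execute(
                "SELECT set_config(%s, %s, false)", [TENANT_SETTING, value]
            )

    def __call__(self, request):
        account_id = getattr(request, "account_id", None)
        self._set_tenant("" if account_id is None else str(account_id))
        try:
            return self.get_response(request)
        finally:
            # Clear the value so a reused connection never carries a
            # previous request's tenant into the next one.
            self._set_tenant("")
```

With a policy like the one above, an empty value fails the integer cast, so a request that never set a tenant gets an error rather than another account's rows.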

02

Five-level readiness cascade

Readiness flows bottom-up through `responsibility → role → position assignment → employee → account`. Each level is a Celery task that fires only after the previous write commits, via `transaction.on_commit`. Retries with `max_retries=3, default_retry_delay=60`. Helpers like `trigger_readiness_cascade_from_responsibility(id)` are the public entry points; signals call them, never the underlying tasks directly.
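One way to picture a single cascade step is the sketch below. All names in it (`CASCADE_LEVELS`, `recalc_and_cascade`, the injected callables) are hypothetical. In a Django project `on_commit` would be `django.db.transaction.on_commit` and `enqueue` would wrap a Celery task's `.delay`, with the task declared along the lines of `@shared_task(bind=True, max_retries=3, default_retry_delay=60)`. They are injected here so the ordering logic can be read and run on its own.

```python
from functools import partial
from typing import Optional

# Bottom-up order from the text above.
CASCADE_LEVELS = [
    "responsibility",
    "role",
    "position_assignment",
    "employee",
    "account",
]


def next_level(level: str) -> Optional[str]:
    """Return the level above `level`, or None at the top of the tree."""
    index = CASCADE_LEVELS.index(level)
    if index + 1 < len(CASCADE_LEVELS):
        return CASCADE_LEVELS[index + 1]
    return None


def recalc_and_cascade(level, object_id, *, recalc, parent_of, on_commit, enqueue):
    """Recalculate one level, then schedule the level above after commit.

    recalc(level, object_id)    -> writes the new readiness value
    parent_of(level, object_id) -> id of the object one level up
    on_commit(callback)         -> e.g. django.db.transaction.on_commit
    enqueue(level, object_id)   -> e.g. a wrapper around some_task.delay
    """
    recalc(level, object_id)
    upper = next_level(level)
    if upper is None:
        return  # "account" is the last stop
    parent_id = parent_of(level, object_id)
    # The next stage is only registered here; it is queued once the
    # surrounding transaction has committed.
    on_commit(partial(enqueue, upper, parent_id))
```

The design point is that nothing is enqueued while the write is still uncommitted, so the worker that picks up the next stage never reads stale rows.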

03

API versioning that lets contracts age

Customers paying for the v1 contract should not be migrated on our schedule. Each app carries `urls/v1.py`, `urls/v2.py`, `urls/shared.py` plus parallel serializer and viewset trees. A `CustomVersioning` class reads `?version=` and dispatches. Both versions are first-class citizens of the test suite until a contract sunsets.
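The dispatch can be sketched in plain Python as below. This is not the project's `CustomVersioning` class: the default version, the employee fields and the function names are assumptions for illustration. DRF ships a `QueryParameterVersioning` scheme that a real class would likely build on.

```python
from typing import Callable, Dict, Mapping

ALLOWED_VERSIONS = ("v1", "v2")
DEFAULT_VERSION = "v1"  # assumption: requests without ?version= stay on v1


class UnsupportedVersion(ValueError):
    """Raised when ?version= names a version the API does not serve."""


def determine_version(query_params: Mapping[str, str]) -> str:
    """Read ?version= from the query string, falling back to the default."""
    version = query_params.get("version", DEFAULT_VERSION)
    if version not in ALLOWED_VERSIONS:
        raise UnsupportedVersion(f"unsupported API version: {version!r}")
    return version


def serialize_employee_v1(employee: dict) -> dict:
    # Hypothetical v1 contract: a single display name.
    return {
        "id": employee["id"],
        "name": f'{employee["first_name"]} {employee["last_name"]}',
    }


def serialize_employee_v2(employee: dict) -> dict:
    # Hypothetical v2 contract: structured name fields.
    return {
        "id": employee["id"],
        "first_name": employee["first_name"],
        "last_name": employee["last_name"],
    }


SERIALIZERS: Dict[str, Callable[[dict], dict]] = {
    "v1": serialize_employee_v1,
    "v2": serialize_employee_v2,
}


def serialize_employee(employee: dict, query_params: Mapping[str, str]) -> dict:
    """Pick the serializer that matches the requested version."""
    return SERIALIZERS[determine_version(query_params)](employee)
```

Keeping one serializer per version means a v2 change cannot alter what a v1 customer receives.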

04

OAuth2 with scope-gated DRF permissions

django-oauth-toolkit for issuance, an `OAuth2ScopesPermission` class on every viewset, and scopes split read / write / admin per resource family — `employees:read`, `employees:write`, `assignments:recalc`. WebSocket connections authenticate the same way through `OAuth2TokenAuthMiddleware`. The frontend never sees a session cookie.
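The core check behind a scope-gated permission can be sketched as a few pure functions. The names below are hypothetical and the mapping from HTTP method to scope is an assumption; how `admin` scopes and action-specific scopes such as `assignments:recalc` are resolved is not stated in the source, so they are left out.

```python
from typing import Set

# HTTP methods that only read state.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def required_scope(resource: str, method: str) -> str:
    """Map a resource family and HTTP method to the scope it needs."""
    action = "read" if method.upper() in SAFE_METHODS else "write"
    return f"{resource}:{action}"


def parse_scopes(scope_string: str) -> Set[str]:
    """OAuth2 scope strings are space-delimited (RFC 6749, section 3.3)."""
    return set(scope_string.split())


def token_allows(scope_string: str, resource: str, method: str) -> bool:
    """True when the token's scopes cover this request."""
    return required_scope(resource, method) in parse_scopes(scope_string)
```

For example, `token_allows("employees:read", "employees", "POST")` is false, because a write needs `employees:write`. A DRF permission class would run a check like this in `has_permission`, reading the scope string from the validated access token.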

05

Parallel test infrastructure, 13 min → 2.5 min

The first cost we cut: a 2,220-test suite at 13 minutes serial. `--keepdb --parallel N` with N clone databases, `ALWAYS_EAGER` for Celery in tests, `InMemoryChannelLayer` for Channels, `LocMemCache` for cache, a faster password hasher. On an 8-core machine the same suite finishes in two and a half minutes. Coverage gate at 80% in CI.
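A test settings module along these lines would produce the setup described. It is a sketch: the `CELERY_` prefix assumes Celery is configured from Django settings with `namespace="CELERY"`, and `MD5PasswordHasher` is one common choice of faster hasher, which the source does not name.

```python
# Hypothetical test settings fragment; the real module layout is not in the source.

# Run Celery tasks inline in the calling process and surface their exceptions.
CELERY_TASK_ALWAYS_EAGER = True
CELERY_TASK_EAGER_PROPAGATES = True
CELERY_BROKER_URL = "memory://"  # in-memory transport, no Redis needed

# Channels: in-process channel layer instead of a Redis-backed one.
CHANNEL_LAYERS = {
    "default": {"BACKEND": "channels.layers.InMemoryChannelLayer"},
}

# Per-process local-memory cache.
CACHES = {
    "default": {"BACKEND": "django.core.cache.backends.locmem.LocMemCache"},
}

# A cheap hasher for tests only; never use it outside the test environment.
PASSWORD_HASHERS = ["django.contrib.auth.hashers.MD5PasswordHasher"]

# Invocation, with N set to the number of cores:
#   python manage.py test --keepdb --parallel 8
```

`--keepdb` skips recreating the test database between runs, and `--parallel` clones it once per worker process, which is where the N clone databases come from.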

06

Terraform AWS estate, multi-environment

Four environments declared in a single Terraform tree (`clear_intent`, `clear_intent_stg`, `clear_intent_prod`, `clear_intent_v1`), with shared global resources (ECR, DNS, SES) at the top and per-env stacks below; backend, frontend, admin and Storybook are each managed there. Bastion for ad-hoc access, ALB for traffic, ECR for images, RDS for state. Built single-handedly at first; operated by a dedicated DevOps engineer today, with us still reviewing the changes.

Outcomes

A few shapes, in their raw form.

2.5+ yrs
CTO engagement, ongoing
3 layers
Backend · frontend · infra
2,220
Tests, ~2.5 min in parallel
42
Knock notification workflows

Stack
Django + DRF · Python 3.13 · PostgreSQL 15 · Redis 7 · Celery · Channels · OAuth2 · React · Terraform · AWS
