Brief-to-Doc content pipeline for a B2C software publisher

A multi-stage Python pipeline that compresses a brief-to-Doc cycle from roughly a workday to 15–20 minutes — most of it Claude working. Three article-type renderers, a four-language translation track, two copywriters using it daily, each in their own voice. Three months of solo build; finished as of May 2026.

build time
~3 months iteration · finished May 2026

In one paragraph

A content team at a consumer-software publisher was writing comparison guides, tool roundups, and how-to articles by hand: read a Russian/Ukrainian internal brief, mentally separate the marketing wishlist from the technical truth, draft in a chat-style AI with a lot of back-and-forth, then format the result in Google Docs by hand (bullet punctuation, screenshot filenames, callout capitalisation, table of contents — the works). One article was a workday. This pipeline compresses that to 15–20 minutes of mostly Claude working, plus 5–20 minutes of human polish. Three months of iteration. Two copywriters using it daily, each configured with their own voice profile, sharing one renderer. Four-language translation track running off the same English source. Finished — stable, daily-use, no further architectural changes planned.

3 months · solo iteration
15–20 min · brief → Doc · was a workday
4 languages · same renderer · per-language glossaries
2 writers · same pipeline · own voices

The problem in business terms

A consumer-software publisher’s content team writes comparison guides, listicles (“best X tools for Y”), and how-to articles in English for the company’s owned blog and for syndication to external tech publications. Each article starts as an internal brief — a Markdown document with three sections: a marketing wishlist (what the campaign wants the article to say), a structural template (which headings, which screenshots), and a technical-truth block (what the engineers verified the software actually does). The brief is in Russian and Ukrainian; the article ships in English, then in four more languages.
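
To make the three-section brief concrete, here is a minimal sketch of splitting such a Markdown brief into its parts before any drafting happens. The English heading names, the output keys, and the `split_brief` function are assumptions for illustration; the real briefs are in Russian and Ukrainian, and the source does not describe how the pipeline's own normaliser is implemented.

```python
import re
from typing import Dict, List

# Hypothetical heading names mapped to hypothetical output keys.
SECTION_KEYS = {
    "marketing wishlist": "marketing_wishlist",
    "structural template": "structural_template",
    "technical truth": "technical_truth",
}


def split_brief(markdown: str) -> Dict[str, str]:
    """Split a brief into its three sections, keyed by section name.

    Text under an unrecognised heading is ignored in this sketch.
    """
    sections: Dict[str, List[str]] = {key: [] for key in SECTION_KEYS.values()}
    current = None
    for line in markdown.splitlines():
        heading = re.match(r"^#{1,6}\s+(.*?)\s*$", line)
        if heading:
            current = SECTION_KEYS.get(heading.group(1).lower())
            continue
        if current is not None:
            sections[current].append(line)
    result = {key: "\n".join(lines).strip() for key, lines in sections.items()}
    missing = [key for key, text in result.items() if not text]
    if missing:
        # Refuse to continue rather than draft from an incomplete brief.
        raise ValueError(f"brief is missing sections: {', '.join(missing)}")
    return result
```

Keeping the technical-truth block under its own key is what would let a later stage give it precedence over the marketing wishlist.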

Before this pipeline, the brief-to-Doc round-trip was a full workday. Most of it wasn’t drafting — most of it was formatting: full stops at the end of every bullet, the right screenshot filenames, the correct callout casing, a proper native-Docs table of contents, the right paragraph spacing. The kind of consistency a tired writer skips and an editor flags two days later.

The pipeline replaces that round-trip with a five-stage flow. The writer drops a brief into an inputs/briefs/inbox/ folder; the pipeline normalises it, validates it, renders the draft following the right template for that article type, validates the doc-model structure, and publishes a fully-styled Google Doc to the right Drive folder. What used to take a workday now takes about twenty minutes of pipeline time, and the result is formatted more consistently than a hand-formatted draft.

The non-obvious move is that the pipeline is parameterised by the writer’s voice profile, not by the article. Two copywriters use the same pipeline; each gets drafts that sound like them, because each has their own author-profile.json declaring tone, voice, and preferred first-person phrases. The renderer reads the profile at draft time.

how an article gets from brief to published Doc · before vs after the pipeline
without the pipeline
  • writer reads the RU/UK brief, mentally splits marketing wishlist from technical truth
  • drafts in a chat AI semi-manually — long back-and-forth per section
  • formats the result in Google Docs by hand (bullets, callouts, screenshot filenames, TOC)
  • translations per language done by hand, glossary lives in the writer's head
  • ~1 workday brief-to-Doc; consistency drops when the writer is tired
with the pipeline
  • writer drops brief.md into an inbox folder
  • normalise → validate → render → validate-format → publish runs as one command
  • 15–20 min of mostly Claude working + 5–20 min of human polish
  • four languages roll off the same English source through versioned glossaries
  • consistency is enforced by the pipeline; the writer keeps the parts that need taste

What makes this different from “a script that calls an LLM and saves a Doc”

Plenty of teams have wired Claude or Gemini to a Google Doc and called it a content pipeline. What separates that from a system two writers can use daily, for months, without drift is a small set of decisions about what the pipeline owns and what it deliberately doesn’t.

Author voice is config, not code.

The most valuable structural decision — and the one that turns this from “Max’s personal tool” into a shared utility — is that every writer has their own author-profile.json. The file declares the author’s name, voice description, preferred first-person phrases, and tone. The renderer reads the profile before drafting; without one, it falls back to a neutral house style.

Two copywriters share the same pipeline today. Each gets drafts that sound like them. There is no per-writer fork of the renderer, no per-writer prompt tweak that ages out of sync. The voice is a file in the writer’s home directory; everything else is shared.
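
A minimal sketch of how a renderer might load a per-writer profile and fall back to house style when the file is missing. The JSON field names (`name`, `voice`, `tone`, `first_person_phrases`), the `HOUSE_STYLE` values, and both function names are assumptions; the source only says the profile declares name, voice description, preferred first-person phrases, and tone.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Tuple


@dataclass(frozen=True)
class AuthorProfile:
    name: str
    voice: str
    tone: str
    first_person_phrases: Tuple[str, ...] = ()


# Hypothetical neutral fallback used when a writer has no profile file.
HOUSE_STYLE = AuthorProfile(
    name="House style", voice="neutral, plain English", tone="neutral"
)


def load_author_profile(path: Path) -> AuthorProfile:
    """Return the writer's profile, or HOUSE_STYLE if the file does not exist."""
    if not path.is_file():
        return HOUSE_STYLE
    data = json.loads(path.read_text(encoding="utf-8"))
    return AuthorProfile(
        name=data["name"],
        voice=data["voice"],
        tone=data["tone"],
        first_person_phrases=tuple(data.get("first_person_phrases", [])),
    )


def voice_instructions(profile: AuthorProfile) -> str:
    """Turn a profile into the voice section of a drafting prompt."""
    lines = [
        f"Write as {profile.name}.",
        f"Voice: {profile.voice}.",
        f"Tone: {profile.tone}.",
    ]
    if profile.first_person_phrases:
        lines.append(
            "Preferred first-person phrases: " + "; ".join(profile.first_person_phrases)
        )
    return "\n".join(lines)
```

Because the renderer only ever sees an `AuthorProfile`, adding a writer means adding a file, not touching this code.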

author voice as config vs hardcoded into the prompt · why a second writer didn't require a second pipeline

| criterion | voice in the prompt (typical) | voice as per-writer config (chosen) |
| --- | --- | --- |
| adding a second writer | fork the prompt · diverge over time | new profile.json |
| shared bug fix | cherry-pick across forks | one commit, both writers benefit |
| voice tuning | touches code review | writer edits their own file |
| house-style enforcement | leaks into voice prompts | lives in shared rule docs |
| neutral-fallback path | missing | profile → house style |

For the business: the cost of onboarding the next writer is the cost of writing one JSON file. House style stays consistent; voice stays personal. The line between “what the company owns” and “what the writer owns” is in the filesystem.

Editorial rules live in git, not in someone’s head.

Roughly a dozen Markdown reference files sit under the pipeline’s references/ directory: core formatting, core style, author-tone guidance, per-article-type templates (general guide, listicle, comparison), brief-normalisation rules, technical-precedence rules, delivery-mapping rules, screenshot rules, per-language translation glossaries, an 8-point translation QA checklist. Claude reads these at runtime, every run, every article.
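
One way the runtime read could look, as a sketch: gather the shared rule docs plus the single template that matches the article type, and join them into one context string. The directory layout (`*.md` at the top level, `templates/<type>.md`) and the function names are assumptions, not the repository's actual structure.

```python
from pathlib import Path
from typing import List


def collect_reference_files(root: Path, article_type: str) -> List[Path]:
    """Shared rule docs in a stable order, followed by one article-type template."""
    shared = sorted(root.glob("*.md"))
    template = root / "templates" / f"{article_type}.md"
    if not template.is_file():
        raise FileNotFoundError(f"no template for article type {article_type!r}")
    return [*shared, template]


def build_rules_context(root: Path, article_type: str) -> str:
    """Concatenate the reference files, labelling each with its relative path."""
    parts = []
    for path in collect_reference_files(root, article_type):
        label = path.relative_to(root).as_posix()
        body = path.read_text(encoding="utf-8").strip()
        parts.append(f"## {label}\n{body}")
    return "\n\n".join(parts)
```

Reading the files on every run, rather than caching them, is what lets an edit to a rule doc show up in the very next draft.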

For the business: editorial conventions are versioned, diffable, and durable. Style decisions don’t live in someone’s memory or a Notion page that gets stale; they live next to the code, change through pull requests, and propagate to every draft on the next run. A new hire reads the references; they don’t need a handover meeting.

Linear pipeline with checkable artifacts at every step.

The pipeline is not one prompt that does everything. It is five stages, each writing a file the next stage reads:

brief → publish-ready Doc · five stages · each writes an inspectable artifact
  1. normalise: RU/UK brief → brief.normalized.json
  2. validate brief: block if comparison source spreadsheet inaccessible
  3. render: article.md + article.doc.json by article type
  4. validate format: doc-model structure check before publish
  5. publish: Google Doc in the right Drive folder · dry-run mode

Every stage’s output is a file. Every stage can be re-run independently. The validator blocks the pipeline when the brief references a spreadsheet the writer can’t access; the format validator catches broken doc structure before it lands in Drive; the publisher’s --dry-run mode proves the Drive call would succeed before it actually performs it.
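
The stage-and-artifact pattern described above can be sketched as a small runner in which each stage is a function that writes a file and returns its path. Everything here is illustrative: `run_pipeline`, `StageBlocked`, and the two toy stages are hypothetical stand-ins, and the real normaliser and validator do far more.

```python
import json
from pathlib import Path
from typing import Callable, List, Optional, Tuple

# A stage receives the working directory and returns the artifact it wrote.
Stage = Callable[[Path], Path]


class StageBlocked(RuntimeError):
    """Raised to stop the run before any later stage executes."""


def run_pipeline(
    work_dir: Path, stages: List[Tuple[str, Stage]], start_at: Optional[str] = None
) -> List[Path]:
    """Run stages in order; start_at re-runs from one stage using existing artifacts."""
    names = [name for name, _ in stages]
    if start_at is not None and start_at not in names:
        raise ValueError(f"unknown stage: {start_at}")
    first = 0 if start_at is None else names.index(start_at)
    artifacts = []
    for name, stage in stages[first:]:
        artifact = stage(work_dir)
        if not artifact.is_file():
            raise StageBlocked(f"{name} did not write {artifact.name}")
        artifacts.append(artifact)
    return artifacts


def normalise(work_dir: Path) -> Path:
    """Toy stage 01: wrap the raw brief text in a JSON artifact."""
    raw = (work_dir / "brief.md").read_text(encoding="utf-8")
    out = work_dir / "brief.normalized.json"
    out.write_text(json.dumps({"raw": raw}, ensure_ascii=False), encoding="utf-8")
    return out


def validate_brief(work_dir: Path) -> Path:
    """Toy stage 02: block the run when the normalised brief is empty."""
    artifact = work_dir / "brief.normalized.json"
    data = json.loads(artifact.read_text(encoding="utf-8"))
    if not data["raw"].strip():
        raise StageBlocked("normalised brief is empty")
    return artifact
```

Because every stage reads its input from disk, `run_pipeline(work_dir, stages, start_at="validate")` re-checks an existing artifact without repeating the normalise step.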

For the business: the pipeline is debuggable, not magical. When a draft comes back wrong, the writer can read the normalised brief, diff it against last week’s, re-render with a different profile, and publish to a test folder — without anyone needing to understand the whole pipeline. Each artifact is the contract between two stages.
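
As an illustration of what a format gate can check, here is a sketch of a validator over a simplified doc model. The block schema (a list of dicts with `type` and `text`), the allowed types, and the specific rules are assumptions; the source does not document the real article.doc.json format.

```python
from typing import Dict, List

# Hypothetical block types for a simplified doc model.
ALLOWED_TYPES = {"heading", "paragraph", "bullet", "callout", "image"}


def validate_doc_model(blocks: List[Dict[str, str]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means publishable."""
    errors = []
    if not blocks or blocks[0].get("type") != "heading":
        errors.append("document must start with a heading")
    for index, block in enumerate(blocks):
        kind = block.get("type")
        text = block.get("text", "")
        if kind not in ALLOWED_TYPES:
            errors.append(f"block {index}: unknown type {kind!r}")
            continue
        if kind != "image" and not text.strip():
            errors.append(f"block {index}: empty {kind}")
        elif kind == "bullet" and not text.rstrip().endswith((".", "!", "?")):
            # The formatting rule that a tired writer tends to skip.
            errors.append(f"block {index}: bullet lacks end punctuation")
    return errors
```

Returning every problem at once, instead of raising on the first, lets the writer fix a draft in a single pass before re-running the publish stage.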

Translation is a human-in-the-loop sub-pipeline, not an API call.

The pipeline ships drafts in four languages — German, Spanish, Russian, Ukrainian — off the same English source. There is no external translation API wired in. Claude translates block-by-block following a per-language glossary that lives in references/translations/{de,es,ru,uk}.md, and an 8-point QA checklist runs before staging.

The decision keeps quality control with the writer who knows the source brief. A wrong feature name in Spanish would be caught by the per-language glossary; a drift in brand voice would be caught by the checklist. The pipeline is honest about what it can and can’t guarantee on its own.
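
A per-language glossary also makes some checks mechanical. The sketch below parses a glossary and flags translated blocks where a required term was not used. The `- source => target` line format and both function names are assumptions; the source does not show the glossary files or say whether any checklist item is automated, and the writer still signs off either way.

```python
import re
from typing import Dict, List


def parse_glossary(markdown: str) -> Dict[str, str]:
    """Read lines of the assumed form '- source term => target term'."""
    glossary = {}
    for line in markdown.splitlines():
        match = re.match(r"^\s*[-*]\s*(.+?)\s*=>\s*(.+?)\s*$", line)
        if match:
            glossary[match.group(1)] = match.group(2)
    return glossary


def glossary_violations(
    source_block: str, translated_block: str, glossary: Dict[str, str]
) -> List[str]:
    """List glossary terms present in the source whose required rendering is absent."""
    violations = []
    source_lower = source_block.lower()
    translated_lower = translated_block.lower()
    for term, required in glossary.items():
        if term.lower() in source_lower and required.lower() not in translated_lower:
            violations.append(f"{term!r} should be rendered as {required!r}")
    return violations
```

A plain substring match like this misses inflected forms, so it can only narrow down what the human reviewer looks at, not replace the review.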

human-in-the-loop translation vs API translation · why no DeepL / Google Translate · why glossaries instead

| criterion | machine translation API (typical) | model + glossary + QA checklist (chosen) |
| --- | --- | --- |
| brand voice continuity | lost on every render | preserved · same model · same profile |
| domain-specific terms | frozen in API's training | per-language glossary, editable |
| QA surface | API output · trust it | 8-point checklist · the writer signs |
| vendor lock | bound to one translation provider | model-agnostic · provider swap is a prompt change |
| incremental edit | full retranslate | block-by-block, regenerate one section |

For the business: language is a liability surface that scales with distribution. The pipeline treats it as such — the writer is in the loop for every shipping language, the conventions are documented, the final QA is human.

Two publishing paths, deliberately separate

Articles ship to two places: the company’s own blog (with inline screenshots, callouts, a Docs-native table of contents) and external syndication partners (plain styling, fixed Drive folder, 1.5-line spacing baked in). One script could have handled both — and would have become an unreadable knot of conditionals six months later. Two scripts, no conditionals.

in-house publish · own blog · full styling
  1. screenshot staging: copy from writer's Drive → temp public folder
  2. insert + style: callouts · inline images · TOC · paragraph spacing
  3. cleanup: delete temp folder · log Drive link
external publish · syndication partners · plain styling
  1. render plain: no inline screenshots · no callouts
  2. fixed Drive folder: external articles by Max Sushchuk
  3. format defaults: 1.5-line spacing · spaceBelow paragraph styling

For the business: each surface gets exactly the styling it needs. Adding a third syndication target is a new script, not a new flag.
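
A sketch of the "separate paths, no conditionals" idea: each surface declares its own ordered plan, and a shared executor either performs the steps or, in dry-run mode, only reports them. The `Step` type, the plan contents, and `execute` are hypothetical; they are shown in one block for brevity where the real tool uses two scripts. This dry run only lists the plan; it does not verify that the Drive call would succeed.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Step:
    action: str
    detail: str


def in_house_plan(title: str) -> List[Step]:
    """Steps for the company blog: full styling with inline screenshots."""
    return [
        Step("stage_screenshots", f"copy screenshots for {title!r} to a temporary folder"),
        Step("insert_and_style", "callouts, inline images, TOC, paragraph spacing"),
        Step("cleanup", "delete the temporary folder and log the Drive link"),
    ]


def external_plan(title: str) -> List[Step]:
    """Steps for syndication partners: plain styling, fixed folder."""
    return [
        Step("render_plain", f"render {title!r} without screenshots or callouts"),
        Step("place_in_folder", "create the Doc in the fixed external folder"),
        Step("format_defaults", "1.5-line spacing and spaceBelow paragraph styling"),
    ]


def execute(
    plan: List[Step], perform: Callable[[Step], None], dry_run: bool = False
) -> List[str]:
    """Run each step through perform, or just describe it when dry_run is set."""
    log = []
    for step in plan:
        if dry_run:
            log.append(f"would run {step.action}: {step.detail}")
        else:
            perform(step)
            log.append(f"ran {step.action}")
    return log
```

Under this shape, a third syndication target would be a third plan function; neither existing plan gains a branch.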

A note on the platform contract

The pipeline depends on a small CLI, gws, for talking to Google Workspace. Rather than rely on the user’s shell config being right, the pipeline’s Python scripts set GOOGLE_WORKSPACE_CLI_CONFIG_DIR themselves, defaulting to the right profile and allowing an override through an environment variable. The writer never has to remember to activate the right profile in their shell.

The small lesson: the pipeline owns its own platform contract. The writer doesn’t have to know that gws is profile-scoped; the pipeline takes care of it. A new machine, a new shell, a new collaborator — the pipeline still works.
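
A minimal sketch of a script owning that contract: build the environment for the CLI subprocess in code, with a default config directory and an optional override. `GOOGLE_WORKSPACE_CLI_CONFIG_DIR` comes from the text above; the default path and the override variable name `CONTENT_PIPELINE_GWS_CONFIG_DIR` are assumptions for illustration.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional

# Hypothetical default profile location and override variable name.
DEFAULT_PROFILE_DIR = Path.home() / ".config" / "gws" / "content-pipeline"
OVERRIDE_VAR = "CONTENT_PIPELINE_GWS_CONFIG_DIR"


def cli_environment(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the CLI config dir pinned.

    The override variable wins when set; otherwise the default profile is used.
    The caller's shell configuration is never consulted for the profile.
    """
    env = dict(os.environ if base is None else base)
    env["GOOGLE_WORKSPACE_CLI_CONFIG_DIR"] = env.get(
        OVERRIDE_VAR, str(DEFAULT_PROFILE_DIR)
    )
    return env
```

The returned mapping would be passed as `env=` to `subprocess.run` whenever a script invokes the CLI, so the profile choice travels with the pipeline rather than with the shell.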

Governance for a personal tool

The pipeline lives in two branches in the repo: v1.x is the shared baseline, frozen for the second writer; v2.x is the personal fork the author iterates on. The README warns collaborators not to follow the personal branch. The collaborator’s pipeline does not break when the author experiments.

This is governance applied to a tool used by two people. That is unusual at this scale, and it is the reason both writers are still using it three months in, without anyone having to babysit it.

What’s shipped

Renderers: general-guide template (linear steps, troubleshooting sections), listicle template (per-tool meta cards, transposed comparison table, compatibility row, methodology section), comparison template (feature matrices, verdict paragraphs, blocking validation when the source spreadsheet is inaccessible).

Editorial reference docs (~12 files): core formatting, core style, author tone, per-article-type templates (general, listicle, comparison), brief normalisation, technical precedence, delivery mapping, screenshot rules, self-checks, per-language translation glossaries, translation checklist.

Translation track: four languages (German, Spanish, Russian, Ukrainian) with per-language glossaries, a reviewer-metadata staging script, and an 8-point QA checklist before publish.

Publishing: two distinct scripts — in-house with inline screenshots and full styling; external/guest-article with plain styling and fixed Drive folder. Both with --dry-run mode.

Onboarding: a /setup slash command walks new users through Google Workspace auth, author-profile creation, and config setup. README + a project CLAUDE.md cover the pipeline contract and the known gotchas. Three .template.json configs in git; the personal copies are gitignored.

Tests: the load-bearing paths — spreadsheet-dependent comparison drafting, translation staging — are covered.

Adoption: two writers, daily use, three months in. Same pipeline, two voices, no divergence.

What this says about the builder

The interesting thing about this pipeline isn’t that a copywriter automated their own work — that is something many writers attempt and most abandon after a week. The interesting thing is that the pipeline survived contact with a second user.

The author-profile abstraction exists because a colleague needed to use the same pipeline in their own voice — not as a hypothetical, but in practice. The v1.x baseline was frozen for the collaborator so that experiments in the personal v2.x fork would not break their workflow. The README explicitly warns collaborators not to follow the personal track. The pipeline ships with a /setup slash command and its own onboarding doc. None of these were needed when the tool only served the builder. All of them appeared once it served two.

The vocabulary borrowed from elsewhere is the giveaway — normalise, validate, dry-run, schema, fixture, profile, contract. These are software-engineering words applied to an editorial workflow most copywriters treat as inherently manual. The result is a pipeline a software engineer would recognise as well-factored: linear stages, checkable artifacts, validation gates, dual publishing paths kept deliberately separate rather than overloaded into one. Editorial conventions are versioned in git like any other dependency.

The other tell is that this case is finished. The pipeline is stable; two writers use it daily; no further architectural changes are planned. In a portfolio where every other system is in progress, this is the one that moved from project to infrastructure. A content team treats it the way they used to treat their text editor: they don’t think about it, they just write.

Three months. One builder. Two writers, daily. Four languages, one renderer. A brief-to-Doc compression from a workday to twenty minutes.


This case describes architecture and patterns. The publisher’s name, the colleague’s name, the syndication targets’ names, the repository URL, and any internal Drive paths that would identify either are deliberately out of scope by policy.