All tools / match_voice · 2 credits · synthesize

What is match_voice?

match_voice is a Hooklayer MCP tool that takes 3+ reference samples (creator video URLs or text) and returns a quantified voice profile — type-token ratio, filler rate, average sentence length, top recurring 2-3-grams — plus a rewritten version of your draft in that voice and a reusable prompt_instructions string you can apply to future scripts.

Math, not vibes. The voice_metrics block returns deterministic numbers: vocab_diversity_ttr (moving-window 100), filler_rate_per_100_words (matched against a list of 18 filler tokens), avg_sentence_length_words, signature_phrases[] (top 5 recurring 2-3-grams with counts). Agents cite "TTR 0.62, signature: 'real talk' ×4" instead of "Excited Discovery Evangelist."
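To make the metric definitions concrete, here is a minimal sketch of how numbers like these can be computed from raw text. The field names mirror the API output above; the FILLERS set is an illustrative subset (the tool matches 18 filler tokens, and the full list is not published here), and `voice_metrics` is a hypothetical helper, not the service's actual implementation.

```python
import re
from collections import Counter

# Illustrative subset only; the real tool matches 18 filler tokens.
FILLERS = {"um", "uh", "like", "literally", "honestly", "basically", "okay"}

def voice_metrics(text: str) -> dict:
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    filler_hits = sum(1 for w in words if w in FILLERS)
    # Count recurring 2-grams and 3-grams, like signature_phrases[]
    grams = Counter(
        " ".join(words[i:i + n])
        for n in (2, 3)
        for i in range(len(words) - n + 1)
    )
    return {
        "total_words": len(words),
        "total_sentences": len(sentences),
        "avg_sentence_length_words": round(len(words) / max(len(sentences), 1), 1),
        "filler_rate_per_100_words": round(100 * filler_hits / max(len(words), 1), 1),
        "signature_phrases": [p for p, c in grams.most_common(5) if c > 1],
    }
```

Because every field is a count or a ratio, rerunning the computation on the same samples yields the same numbers, which is what lets an agent cite them.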

Banned-phrase detection. The rewriter is prompted to avoid 26 corporate-speak phrases (weaponize, leverage, framework, comprehensive, "Hey guys"). A post-hoc scanner returns quality_warnings.phrases[] when any survive, downgrading quality.level. Catches when the rewrite slips into LinkedIn voice.
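The post-hoc scan is simple enough to sketch. BANNED below is an illustrative subset of the 26 phrases, and `scan_banned` is a hypothetical stand-in for the tool's internal scanner; the output shape mirrors the quality_warnings and quality fields documented below.

```python
# Illustrative subset only; the real scanner checks 26 phrases.
BANNED = ["weaponize", "leverage", "framework", "comprehensive", "hey guys"]

def scan_banned(rewrite: str) -> dict:
    low = rewrite.lower()
    hits = [p for p in BANNED if p in low]
    # Any surviving banned phrase downgrades quality.level
    level = "partial" if hits else "full"
    return {"quality_warnings": {"phrases": hits}, "quality": {"level": level}}
```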

Reusable prompt_instructions. Returns a system-prompt-ready string you can paste into other generation calls to maintain the same voice without re-running match_voice. Saves credits on subsequent scripts in the same voice.
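One way to picture the reuse pattern: run match_voice once, cache the returned string, and prepend it to later generation prompts. `build_generation_prompt` is a hypothetical helper for illustration; only the prompt_instructions field itself comes from the API.

```python
def build_generation_prompt(prompt_instructions: str, new_draft: str) -> str:
    """Prepend a cached match_voice prompt_instructions string to a new draft."""
    return (
        f"{prompt_instructions}\n\n"
        f"Rewrite the following script in this voice:\n{new_draft}"
    )
```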

Inputs & outputs

Endpoint: POST /api/v1/voice/match

Inputs

  • draft (string, required)

    The text to rewrite in the target voice

  • reference_samples (string[], required)

    At least 3 reference samples — TikTok/YouTube/Instagram URLs or raw text. URLs auto-extract transcripts.

Output fields

  • voice_profile

    Qualitative: energy_level, humor_style, personality, vocabulary_level, sentence_length, signature_moves, favorite_words, avoided_words, audience_relationship

  • voice_metrics

    Deterministic numbers: total_words, total_sentences, avg_sentence_length_words, vocab_diversity_ttr, filler_rate_per_100_words, signature_phrases[]

  • rewritten_draft

    Your draft rewritten in the target voice

  • prompt_instructions

    Reusable system-prompt string for future generations

  • quality_warnings

    Banned phrases that survived the rewrite (if any), with an explanatory note

  • quality

    level: full | partial | degraded. Drops to partial when banned phrases survive the rewrite, degraded when metrics are unavailable.

cURL

curl -X POST https://hooklayer.dev/api/v1/voice/match \
  -H "Authorization: Bearer hl_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "draft": "I want to teach you about index funds.",
    "reference_samples": [
      "https://www.tiktok.com/@humphreytalks/video/7188273048459857195",
      "https://www.tiktok.com/@humphreytalks/video/7194567891234567890",
      "https://www.tiktok.com/@humphreytalks/video/7201234567890123456"
    ]
  }'
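The same call, sketched in Python with only the standard library. `build_request` is a hypothetical helper; the endpoint, headers, and payload shape are taken from the cURL example above.

```python
import json
import urllib.request

def build_request(api_key: str, draft: str, samples: list) -> urllib.request.Request:
    payload = {"draft": draft, "reference_samples": samples}
    return urllib.request.Request(
        "https://hooklayer.dev/api/v1/voice/match",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# req = build_request("hl_live_...", "I want to teach you about index funds.", urls)
# with urllib.request.urlopen(req) as resp:
#     result = json.load(resp)
# Save result["prompt_instructions"] to reuse the voice without another call.
```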

Example prompts

Paste any of these into Claude Desktop (with Hooklayer connected) to see the live response.

Match a specific creator voice (with metrics)

Use Hooklayer match_voice with these 3 Humphrey Yang video URLs as reference_samples and "I'm going to teach you about index funds" as the draft. Show me the FULL voice_metrics block — vocab_diversity_ttr, filler_rate_per_100_words, avg_sentence_length_words, and the top signature_phrases with their counts. Then show the rewritten_draft and tell me if quality_warnings flagged any corporate phrases.

Expected output: Returns voice_metrics with TTR ~0.62, filler rate ~1.4/100w, signature phrases with counts. Rewritten draft uses Humphrey's signature "okay, yeah, bro" cadence. quality_warnings empty if clean.

Save the prompt_instructions for reuse

Run Hooklayer match_voice on the @humphreytalks reference samples above. Capture the prompt_instructions field — I want to save it so I can write 10 more scripts in his voice WITHOUT calling match_voice again. Show me the exact prompt_instructions string verbatim.

Expected output: Returns a reusable system-prompt-ready string like "Write with clear, confident authority using simple dialogue scenarios..." that the agent can apply to future generations.

Detect corporate-speak slips

Use Hooklayer match_voice with a casual-creator reference set and this corporate draft: "I want to leverage this comprehensive framework to weaponize the curiosity gap and elevate your strategic positioning." If quality_warnings flags banned phrases, list them and retry the call with the same inputs — does the rewrite improve on retry?

Expected output: Demonstrates banned-phrase detection. quality_warnings.phrases lists "leverage", "comprehensive", "framework", "weaponize", "elevate", "strategic." quality.level drops to partial. Retry returns cleaner output.

Frequently asked

Why does match_voice return numbers AND qualitative labels?

Both layers matter. Qualitative labels (energy_level: 5/10, humor_style: dry) give agents and humans a quick read. The deterministic voice_metrics (TTR 0.62, filler rate 1.4/100w, signature_phrases with counts) are the reproducible signature — you can cite them mathematically and they're consistent across runs on the same samples. The eval feedback was explicit: "voice DNA labels are vibes, not systems" — adding voice_metrics fixed that.

What does vocab_diversity_ttr mean?

Type-token ratio. Number of unique words / total words, computed over moving 100-token windows (so long samples aren't penalized). 0.7+ = highly varied vocabulary. 0.4-0.6 = average. <0.4 = very repetitive. Useful for matching creators whose voice is intentionally simple vs intentionally varied.
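A minimal sketch of the moving-window approach, assuming the windowed ratios are averaged (the exact aggregation the tool uses is not specified here):

```python
def moving_window_ttr(words: list, window: int = 100):
    """Average unique-word ratio over sliding 100-token windows."""
    if not words:
        return None
    if len(words) <= window:
        return round(len(set(words)) / len(words), 2)
    ratios = [
        len(set(words[i:i + window])) / window
        for i in range(len(words) - window + 1)
    ]
    return round(sum(ratios) / len(ratios), 2)
```

Windowing is what keeps long samples from being penalized: raw TTR falls as text grows (common words inevitably repeat), while the per-window ratio stays comparable across sample lengths.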

How many reference samples do I need?

3 is the minimum required. 5-7 gives a meaningfully more stable voice signature. URLs auto-extract transcripts (Whisper, ~3s each). Text samples skip extraction. The metrics computation requires at least 40 characters of total text — too-short samples return null voice_metrics with quality.level = degraded.

What about creators who use code-switching or multiple registers?

match_voice surfaces the DOMINANT voice across the samples. If a creator code-switches (e.g. casual on TikTok, formal on YouTube), pass platform-consistent reference samples or you'll get an averaged voice that fits neither register cleanly. Best practice: 3+ samples from the same content type on the same platform.

Can I match my own voice to apply to AI-generated drafts?

Yes — that's a common use case. Pass 3+ samples of YOUR existing scripts/captions as reference_samples, then run match_voice on each AI-generated draft to rewrite it in your voice. The prompt_instructions field is especially useful here — save it once and reuse for every draft without re-extracting your voice.

How is this different from a voice clone like ElevenLabs?

ElevenLabs clones acoustic / audio voice (tone, accent, pitch). match_voice clones written voice (word choice, sentence cadence, filler rhythm, signature phrases). They're complementary — match_voice gives you the script in the creator's linguistic voice; ElevenLabs reads it in their acoustic voice. Different layer.

Try match_voice in 30 seconds.

100 free credits at signup. No card. Works in Claude Desktop, Cursor, n8n, or any MCP client.