Baoyu YouTube Transcript: Extract Clean Subtitles, No API Key

A Claude Code skill that turns any YouTube URL into a clean, chaptered document with timestamps, speaker labels, and a cover image, with no API key needed. Input a link, get a structured Markdown transcript. The secret: it uses YouTube's InnerTube API, an internal but publicly accessible endpoint that returns subtitle data without requiring Google API keys or OAuth.

*Source: 宝玉 xp Weibo announcement. Skill: baoyu-youtube-transcript (installed via baoyu utility skills)*

What It Does

Input: YouTube URL (any format)
         ↓
┌─────────────────────────────────┐
│   baoyu-youtube-transcript      │
│                                 │
│  1. Fetch subtitles (InnerTube) │
│  2. Smart sentence processing   │
│  3. Chapter detection           │
│  4. Speaker identification      │
│  5. Cover image extraction      │
│  6. Format as Markdown/SRT      │
└─────────────────────────────────┘
         ↓
Output: Clean chaptered document
        with timestamps + speakers

Supported Input Formats

| Format | Example |
| --- | --- |
| Full URL | https://www.youtube.com/watch?v=abc123 |
| Short link | https://youtu.be/abc123 |
| Embed link | https://www.youtube.com/embed/abc123 |
| Shorts link | https://www.youtube.com/shorts/abc123 |
| Video ID | abc123 |
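
The input formats listed above all reduce to the same video ID. As a minimal sketch of how such normalization could work (the function name `extract_video_id` is hypothetical, and the skill's actual parser is not shown in the source), the standard library's URL parser is enough:

```python
import re
from urllib.parse import parse_qs, urlparse

# Hypothetical helper for illustration; not the skill's real implementation.
# Real YouTube IDs are 11 characters long, but the length is not enforced
# here so that the placeholder "abc123" from the examples is accepted too.
_BARE_ID = re.compile(r"^[A-Za-z0-9_-]+$")


def extract_video_id(value: str) -> str:
    """Return the video ID from a YouTube URL or from a bare ID."""
    value = value.strip()
    if _BARE_ID.match(value):
        return value  # already a bare video ID

    parsed = urlparse(value)
    host = (parsed.hostname or "").lower()
    parts = [p for p in parsed.path.split("/") if p]

    if host == "youtu.be" and parts:
        return parts[0]  # short link: the ID is the first path segment
    if host == "youtube.com" or host.endswith(".youtube.com"):
        if parsed.path == "/watch":
            ids = parse_qs(parsed.query).get("v")
            if ids:
                return ids[0]  # full URL: the ID is the "v" query parameter
        if len(parts) >= 2 and parts[0] in ("embed", "shorts"):
            return parts[1]  # embed and Shorts links: /embed/<id>, /shorts/<id>
    raise ValueError(f"Unrecognized YouTube URL or ID: {value!r}")
```

Calling `extract_video_id` on any of the five example inputs returns `abc123`; anything else raises `ValueError` rather than guessing.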

Smart Sentence Processing

YouTube's raw subtitles are fragmented: word-by-word chunks with misaligned timestamps. This skill fixes that:

| YouTube Raw | This Skill |
| --- | --- |
| Word-by-word fragments | Complete sentences by punctuation |
| Arbitrary time splits | Time allocated by character length |
| No sentence boundaries | Split at periods, questions, exclamations |
| Generic handling | Special CJK processing for Chinese/Japanese/Korean |

The result: natural, readable text, not the choppy fragments you get from YouTube's auto-generated captions.
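
To make the idea concrete, here is a rough sketch of sentence regrouping with time allocated by character length. It is an illustration under stated assumptions, not the skill's actual algorithm: the `Segment` type, the `merge_into_sentences` function, and the `joiner` parameter are all hypothetical. Each fragment's duration is spread evenly over its characters, and the resulting character stream is cut at sentence-ending punctuation.

```python
from dataclasses import dataclass

SENTENCE_END = set(".?!。？！")  # Latin and CJK sentence-ending punctuation


@dataclass
class Segment:
    text: str
    start: float  # seconds
    end: float    # seconds


def merge_into_sentences(fragments, joiner=" "):
    """Regroup caption fragments into sentences with interpolated timestamps."""
    # Step 1: spread each fragment's duration evenly across its characters.
    timed_chars = []  # list of (char, start, end)
    for frag in fragments:
        text = frag.text.strip()
        if not text:
            continue
        if timed_chars:
            text = joiner + text  # assumption: pass joiner="" for CJK text
        step = (frag.end - frag.start) / len(text)
        for i, ch in enumerate(text):
            timed_chars.append((ch, frag.start + i * step, frag.start + (i + 1) * step))

    # Step 2: cut the character stream at sentence-ending punctuation.
    sentences, current = [], []
    for item in timed_chars:
        current.append(item)
        if item[0] in SENTENCE_END:
            _flush(current, sentences)
            current = []
    _flush(current, sentences)  # trailing text without final punctuation
    return sentences


def _flush(items, out):
    # Drop leading whitespace so a sentence starts at its first real character.
    while items and items[0][0].isspace():
        items = items[1:]
    if items:
        text = "".join(ch for ch, _, _ in items)
        out.append(Segment(text, items[0][1], items[-1][2]))
```

This naive splitter also cuts at decimal points and ellipses, so a production version would need extra rules for those cases.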

Key Features

| Feature | Detail |
| --- | --- |
| No API key | Uses the InnerTube API; no Google API key, no OAuth |
| Caching | First fetch caches raw data; format changes are instant |
| Multi-language | Supports all YouTube subtitle languages |
| Translation | Can specify preferred language or translate to another |
| Markdown output | Timestamps, chapters, speakers (default) |
| SRT export | Standard subtitle file format for video editing |
| Cover image | Extracts video thumbnail automatically |
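
For readers unfamiliar with the SRT export mentioned above, the format is simple: a running index, a `start --> end` line with millisecond timestamps, then the text, with a blank line between cues. The sketch below shows one way to produce it from timed sentences; the function names are hypothetical and the skill's own exporter may differ.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """Render (start, end, text) tuples as the contents of an SRT file."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)  # a blank line separates consecutive cues
```

For example, `srt_timestamp(3661.5)` yields `01:01:01,500`.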

Caching Mechanism

The first run caches four files, so subsequent format or parameter changes are instant:

| Cached File | Content |
| --- | --- |
| meta.json | Video metadata (title, duration, channel) |
| transcript-raw.json | Raw subtitle segments from YouTube |
| transcript-sentences.json | Processed natural sentences |
| cover.jpg | Video thumbnail |

Add --refresh to force re-fetch from YouTube.
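
The cache decision can be pictured as a simple check on those four files. This is a sketch only: the source does not say where the cache lives or how the skill validates it, so the per-video `cache_dir` and the function name `needs_fetch` are assumptions.

```python
from pathlib import Path

# File names taken from the list of cached files; the directory layout is assumed.
CACHE_FILES = (
    "meta.json",
    "transcript-raw.json",
    "transcript-sentences.json",
    "cover.jpg",
)


def needs_fetch(cache_dir: Path, refresh: bool = False) -> bool:
    """Return True when YouTube has to be contacted again."""
    if refresh:
        return True  # mirrors --refresh: ignore whatever is cached
    # Any missing file means the cache is incomplete and must be rebuilt.
    return any(not (cache_dir / name).is_file() for name in CACHE_FILES)
```

When `needs_fetch` returns `False`, a format change (for example Markdown to SRT) only has to re-render from the cached JSON, which is why it is instant.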

How InnerTube Works

Regular YouTube API:
  Google Cloud Console → Create project → Enable API →
  Generate key → Quota limits → OAuth for some features
  = 30 minutes setup

InnerTube API:
  HTTP request → Get subtitle data
  = No setup, no key, no quota

InnerTube is YouTube's internal API, which the skill uses to fetch subtitle/caption data. It's publicly accessible (YouTube's own player uses it) but has no official documentation. The skill wraps this into a clean interface.
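
Because InnerTube is undocumented, the following is only a sketch of what such a request might look like, based on values commonly seen in the wild rather than on the skill's source. The endpoint path, the `WEB` client name, the client version string, and the response field names are all assumptions that YouTube can change without notice. The code builds the request and parses a response but deliberately does not send anything.

```python
import json
import urllib.request

# Assumption: commonly observed player endpoint; not officially documented.
INNERTUBE_PLAYER_URL = "https://www.youtube.com/youtubei/v1/player"


def build_player_request(video_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a video's player data."""
    payload = {
        # Assumption: client name and version are illustrative placeholders.
        "context": {"client": {"clientName": "WEB", "clientVersion": "2.20240101.00.00"}},
        "videoId": video_id,
    }
    return urllib.request.Request(
        INNERTUBE_PLAYER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def caption_tracks(player_response: dict) -> list:
    """List caption tracks from a player response (field names are assumed)."""
    renderer = player_response.get("captions", {}).get(
        "playerCaptionsTracklistRenderer", {}
    )
    return [
        {"language": track.get("languageCode"), "url": track.get("baseUrl")}
        for track in renderer.get("captionTracks", [])
    ]
```

Sending the request with `urllib.request.urlopen` and decoding the JSON body would give the dictionary that `caption_tracks` expects; each track URL then points at the raw subtitle data.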

How LearnAI Team Could Use This

  • Turn YouTube lectures, conference talks, tutorials, and Chinese/English AI commentary videos into searchable study notes and wiki drafts.
  • Build lesson materials from public video sources while preserving timestamps for review and citation.
  • Extract transcripts for translation workflows, such as English lectures → Chinese notes for bilingual learners.

Real-World Use Cases

This skill is particularly powerful when combined with other tools:

| Workflow | How |
| --- | --- |
| /mywiki from YouTube | Extract transcript → research topic → create wiki entry |
| Lecture notes | Pull a professor's YouTube lecture → structured notes with timestamps |
| Translation | Extract English transcript → translate to Chinese for study |
| Content analysis | Get transcript → analyze with Claude for key points |
| Voice-Pro pipeline | Extract transcript → translate → dub into another language |

Installation

# Install directly
npx skills add jimliu/baoyu-skills --skill baoyu-youtube-transcript

# Or via Claude Code plugin marketplace
/plugin
# Search for baoyu-skills or utility-skills
