
PaddySpeaks

Where ancient wisdom meets the architecture of tomorrow


How I Extracted 91 LinkedIn Articles in Under an Hour

My content was trapped inside LinkedIn. Claude helped me break it free.

I have been writing on LinkedIn for six years. Philosophy, technology, data architecture, AI, the Bhagavad Gita applied to modern work: 91 articles spanning that whole stretch of my thinking. And all of it was locked inside LinkedIn’s walled garden. No export button. No “download all.” No API for your own content. Your writing, their platform, their rules.

So I decided to break it all free. And with Claude as my AI co-pilot, I did it in under an hour.

★ ★ ★

The Problem: Your Content, Their Platform

LinkedIn does offer a data export through their settings. You go to Settings > Data Privacy > Get a copy of your data, select “Articles,” and wait. What you get back is a zip file containing raw HTML files — one per article. The markup is LinkedIn’s proprietary format: inconsistent class names, embedded styles, metadata scattered across custom elements.

It is your data, technically. But it is not usable data. Not without serious work.

The Raw Numbers

96 HTML files exported from LinkedIn. Proprietary markup. No consistent structure. Dates in different formats. Images pointing to expired CDN links. Zero documentation on the format.

★ ★ ★

The Solution: Claude + Python + GitHub Pages

Here is what I did, step by step, with Claude doing most of the heavy lifting:

Step 1: Parse LinkedIn’s HTML

I asked Claude to write a custom Python HTML parser that could handle LinkedIn’s messy export format. The parser extracts:

  • Title from the first <h1> tag
  • Publication date from LinkedIn’s custom metadata elements
  • Hero image URL (if present)
  • Full article body — while stripping out LinkedIn’s wrapper cruft

The LinkedInHTMLParser class handles all of LinkedIn’s quirks — nested divs, empty paragraphs, hashtag spam at the bottom of articles, and inconsistent date formatting.
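The article does not include the parser source, but a minimal sketch with Python’s standard html.parser shows the shape of the approach. The LinkedInHTMLParser name comes from the article; the specific tags and attributes handled here are assumptions, not LinkedIn’s actual export format:

```python
from html.parser import HTMLParser

class LinkedInHTMLParser(HTMLParser):
    """Pull the title, hero image, and body text out of an article export."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.hero_image = None
        self.body_parts = []
        self._in_h1 = False
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h1" and self.title is None:
            self._in_h1 = True
        elif tag == "img" and self.hero_image is None:
            self.hero_image = attrs.get("src")  # first image = hero image
        elif tag in ("p", "blockquote"):
            self._in_body = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False
        elif tag in ("p", "blockquote"):
            self._in_body = False

    def handle_data(self, data):
        text = data.strip()
        if not text:  # skip whitespace-only runs (empty paragraphs)
            return
        if self._in_h1:
            self.title = text
        elif self._in_body:
            self.body_parts.append(text)

parser = LinkedInHTMLParser()
parser.feed("<h1>My Article</h1><img src='hero.jpg'><p>First paragraph.</p>")
```

A real version would also normalise LinkedIn’s inconsistent date formats and drop the trailing hashtag runs, but the event-driven structure stays the same.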

Step 2: Clean, Categorize, and Transform

Raw parsing was not enough. Claude helped me build a processing pipeline that:

  • Cleans the content — strips empty paragraphs, removes LinkedIn hashtags, eliminates broken image references
  • Categorizes every article — using keyword matching against four categories: AI & Future, Philosophy, Technology, and Life
  • Generates URL-friendly slugs from article titles
  • Estimates reading time (word count ÷ 200 words per minute)
  • Extracts first paragraphs for meta descriptions and subtitle cards
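None of these transforms needs a library. A hedged sketch of the slug, reading-time, and category helpers — the keyword lists and the naive substring matching are illustrative, not the actual pipeline:

```python
import re

# Illustrative keyword lists, not the real ones; "Life" is the fallback.
CATEGORIES = {
    "AI & Future": ["ai", "artificial intelligence", "llm", "claude"],
    "Philosophy": ["gita", "wisdom", "stoic", "dharma"],
    "Technology": ["data mesh", "architecture", "cloud", "python"],
}

def slugify(title: str) -> str:
    """Lowercase, collapse everything non-alphanumeric into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def reading_time(text: str, wpm: int = 200) -> int:
    """Word count divided by 200 wpm, rounded up, at least 1 minute."""
    words = len(text.split())
    return max(1, -(-words // wpm))  # ceiling division

def categorize(text: str) -> str:
    """Pick the category whose keywords occur most often in the text."""
    lowered = text.lower()
    scores = {
        cat: sum(lowered.count(kw) for kw in kws)
        for cat, kws in CATEGORIES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "Life"
```

Keyword matching is crude, but across a corpus written by one author with recurring themes it works surprisingly well.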

Step 3: Generate a Beautiful Website

Claude did not just dump the content into plain HTML. It generated a complete, styled website with:

  • A homepage with horizontal-scrolling article cards
  • Individual article pages with reading progress bars
  • A consistent design system — custom CSS with serif typography, warm tones, and a literary aesthetic
  • Category navigation so readers can browse by topic
  • Responsive design that works on mobile, tablet, and desktop
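With no frameworks involved, each article page can come straight out of a template string. A minimal sketch — the template markup and field names here are assumptions, not the actual generator:

```python
from string import Template

PAGE = Template("""<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="description" content="$description">
  <title>$title · PaddySpeaks</title>
  <link rel="stylesheet" href="/style.css">
</head>
<body>
  <article>
    <h1>$title</h1>
    <p class="meta">$category · $reading_time min read</p>
    $body
  </article>
</body>
</html>
""")

def render_article(article: dict) -> str:
    """Fill the shared page template with one parsed article's fields."""
    return PAGE.substitute(article)

html = render_article({
    "title": "A Sample Post",
    "description": "First paragraph as meta description.",
    "category": "Philosophy",
    "reading_time": 4,
    "body": "<p>Hello.</p>",
})
```

One template, one stylesheet, a loop over 91 parsed articles — that is the whole “static site generator.”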

Step 4: Deploy on GitHub Pages

The entire site is static HTML and CSS. No frameworks. No build tools. No dependencies. Push to GitHub, enable Pages, point a custom domain, done. paddyspeaks.com was live.

  • 91 articles extracted
  • <1 hour total time spent
  • 0 frameworks used
★ ★ ★

What Surprised Me

The speed was obvious. But what really surprised me was the quality of judgment Claude demonstrated:

  • It correctly categorized articles about the Bhagavad Gita under Philosophy and articles about data mesh under Technology — without me labeling a single one
  • It handled edge cases like articles with no date, duplicate filenames, and malformed HTML without crashing
  • It designed a consistent aesthetic across 91 pages that looked like a human designer had spent days on it
  • It knew when to ask and when to just make a reasonable decision and move forward

This was not “AI generates slop.” This was a genuine collaboration. I provided the vision and direction. Claude handled the tedious, complex, error-prone work of parsing, transforming, and generating 91 perfectly formatted pages.

★ ★ ★

The Bigger Point

This is what AI should be. Not replacing your thinking — amplifying your ability to act on it.

I have been meaning to liberate my writing from LinkedIn for years. The thought of manually copying 91 articles, formatting them, building a site, categorizing everything — it always felt like a weekend project that would stretch into weeks. So I never did it.

With Claude, I went from “I should really do this someday” to “it is live” in under an hour.

The Takeaway

Your content should live where you control it. Not inside a platform that can change its algorithm, paywall your reach, or shut down your profile tomorrow. If you have years of writing trapped in LinkedIn, Medium, or anywhere else — get it out. AI makes the migration trivially easy now.

★ ★ ★

How You Can Do This Too

  1. Export your LinkedIn data — Settings > Data Privacy > Get a copy of your data > Select “Articles”
  2. Use Claude to write a parser for the HTML exports (or use mine — the code is open source on GitHub)
  3. Generate your site — let Claude handle the HTML/CSS generation with a design you love
  4. Deploy on GitHub Pages — free hosting, custom domain support, zero maintenance

The entire pipeline is two Python scripts and a CSS file. No React. No Next.js. No npm install that downloads half the internet. Just HTML, CSS, and your words — finally free.

★ ★ ★

Six years of writing. Ninety-one articles. One hour. One AI. Zero excuses left for keeping your content locked inside someone else’s platform.

Your words deserve a home you own.