<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Ryan's Substack]]></title><description><![CDATA[Sharing my side projects and thoughts on AI, data science, software engineering, and more]]></description><link>https://blog.ryanbbrown.com</link><image><url>https://substackcdn.com/image/fetch/$s_!Cg07!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69232400-f62b-4cd1-bf66-3f0d591b24d7_622x622.png</url><title>Ryan&apos;s Substack</title><link>https://blog.ryanbbrown.com</link></image><generator>Substack</generator><lastBuildDate>Tue, 05 May 2026 12:04:09 GMT</lastBuildDate><atom:link href="https://blog.ryanbbrown.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Ryan Brown]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[ryanbbrown@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[ryanbbrown@substack.com]]></itunes:email><itunes:name><![CDATA[Ryan Brown]]></itunes:name></itunes:owner><itunes:author><![CDATA[Ryan Brown]]></itunes:author><googleplay:owner><![CDATA[ryanbbrown@substack.com]]></googleplay:owner><googleplay:email><![CDATA[ryanbbrown@substack.com]]></googleplay:email><googleplay:author><![CDATA[Ryan Brown]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[An LLM wiki won't change your life]]></title><description><![CDATA[Why all the Karpathy-inspired Obsidian posts are missing the point]]></description><link>https://blog.ryanbbrown.com/p/an-llm-wiki-wont-change-your-life</link><guid isPermaLink="false">https://blog.ryanbbrown.com/p/an-llm-wiki-wont-change-your-life</guid><dc:creator><![CDATA[Ryan Brown]]></dc:creator><pubDate>Sun, 03 May 2026 16:29:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/d10e362b-fb09-48f5-97f0-514856febc7b_1376x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>After <a href="https://x.com/karpathy/status/2039805659525644595">Karpathy&#8217;s tweet</a> about LLM wikis, Obsidian has been blowing up. I&#8217;ve read maybe a dozen posts by now, and they all promise the same thing.</p><p>It sounds great: drop your articles, papers, and notes into a folder, point an agent at it, and watch it compile a living wiki&#8212;interlinked entity pages, summaries, contradictions flagged, the whole thing self-maintaining.</p><p>Let me save you some time. It&#8217;s not as useful as it seems.</p><p>In 2022, before LLMs were any good at editing markdown, I got deep into Obsidian. I graduated college early and had time before starting my first job, so I spent four or five months building the perfect knowledge base to help me prepare. Then I started the job and never opened it again.</p><p>Obviously, things are different now with the state of AI. The burden of maintenance is entirely gone; all the articles are right about how easy and effective it is to create an impressive knowledge base.</p><p>But cost was never the bottleneck. They&#8217;re hyping up a solution to the wrong problem.</p><p>An LLM can connect note A to notes B and C in your vault and answer questions across hundreds of curated sources. 
None of that changes whether you&#8217;ve internalized any of it. AI can&#8217;t be in the room when a coworker says something at lunch that should connect to one of those notes&#8212;the connection has to happen live, drawing on knowledge that&#8217;s actually in your head. People have been writing <a href="https://bulletjournal.com/blogs/bulletjournalist/i-deleted-my-second-brain">&#8220;I deleted my second brain&#8221;</a> pieces for years; having AI build it for you doesn&#8217;t fix the underlying problem that storing something isn&#8217;t the same as understanding it.</p><p>In fairness to Karpathy, read <a href="https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f">his original gist</a> if you haven&#8217;t already. It&#8217;s narrower than the posts that followed. He&#8217;s explicit that the pattern is for &#8220;topics of research interest&#8221;: going deep on a bounded subject, building a companion wiki for a book, doing competitive analysis. An LLM wiki is great for that work.</p><p>Even outside those use cases, I&#8217;m not saying this trend is a bad thing. It&#8217;s fun to set up, the results are impressive, and you still gain some knowledge just from curating the sources. But nobody in the recent craze talks about what to actually do with the wiki once it exists.</p><p>I do use Obsidian, but <a href="https://readwise.io/read">Readwise Reader</a> has been my primary interface for the past couple of months. It&#8217;s an AI-friendly tool (with an Obsidian integration) that makes it easy to ingest, consume, and engage with content from anywhere&#8212;Substack, Twitter, personal blogs, company news pages, etc.</p><p>My personal knowledge system is still a work-in-progress, but it&#8217;s pretty light and is focused primarily on understanding. 
I&#8217;d rather slowly read and digest twenty articles than build a wiki of five hundred that I never truly understand.</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/karpathy/status/2049907410303865030?s=20&quot;,&quot;full_text&quot;:&quot;This is the the quote I've been citing a lot recently.&quot;,&quot;username&quot;:&quot;karpathy&quot;,&quot;name&quot;:&quot;Andrej Karpathy&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1296667294148382721/9Pr6XrPB_normal.jpg&quot;,&quot;date&quot;:&quot;2026-04-30T17:43:06.000Z&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{&quot;full_text&quot;:&quot;you can outsource your thinking\nbut you cannot outsource your understanding&quot;,&quot;username&quot;:&quot;yacineMTB&quot;,&quot;name&quot;:&quot;kache&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1901438455927668736/FjhhhN0b_normal.jpg&quot;},&quot;reply_count&quot;:641,&quot;retweet_count&quot;:3946,&quot;like_count&quot;:43048,&quot;impression_count&quot;:1639541,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div>]]></content:encoded></item><item><title><![CDATA[I reverse-engineered Kindle to build AI audiobooks]]></title><description><![CDATA[The absurd lengths I went to for 8-minute audiobook snippets]]></description><link>https://blog.ryanbbrown.com/p/i-reverse-engineered-kindle-to-build</link><guid isPermaLink="false">https://blog.ryanbbrown.com/p/i-reverse-engineered-kindle-to-build</guid><dc:creator><![CDATA[Ryan Brown]]></dc:creator><pubDate>Sun, 25 Jan 2026 13:26:18 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/723a7e19-38ec-4a49-bbae-2b5e568f36a2_1264x848.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Imagine this:</p><p>You usually read on a <a href="https://www.amazon.com/All-new-Amazon-Kindle-Paperwhite-glare-free/dp/B0CFPJYX7P?th=1">Kindle Paperwhite</a>, but on the way to dinner you only have your phone with you, so you use the Kindle app on your phone while on the subway. You get off the subway and have an 8 minute walk to the restaurant, during which you can no longer read.</p><p>Wouldn&#8217;t it be nice to listen to an audiobook on that walk? But you don&#8217;t usually listen to audiobooks, and you only really need it for a few minutes, so you don&#8217;t want to buy the whole audiobook. Plus, your book isn&#8217;t <a href="https://help.audible.com/s/article/listen-with-whispersync-for-voice?language=en_US#:~:text=eBook%20purchase%2C%20too.-,How%20do%20I%20know%20if%20a%20title%20has%20Whispersync%20for%20Voice%3F,-You%20can%20tell">Whispersync-for-Voice enabled</a>, so even if you had the audiobook and listened, your progress wouldn&#8217;t sync&#8212;you&#8217;d have to manually find the right spot in the book.</p><p>This very relatable scenario is just my life, so I built an app to generate audiobook snippets on-demand with AI&#8212;and they always sync.</p><h2>1.0 Decoding Kindle</h2><p>My original idea was to have an app that would automatically swipe/screenshot the Kindle app and use OCR to extract text, but Apple doesn&#8217;t allow that level of device control (for good reason). 
So I had to find a way to programmatically get the content of the book and sync reading progress.</p><h3>1.1 Content Deobfuscation</h3><p>After a bit of searching, I discovered that Kindle ebook files download as <code>.tar</code> and have to be deobfuscated&#8212;the actual page images are embedded as a series of XOR&#8217;d glyphs rather than plain text. I got lucky on timing; PixelMelt had just posted <a href="https://blog.pixelmelt.dev/kindle-web-drm/">an article</a> on how to reverse engineer the obfuscation.</p><p>I did this project in November and had been hearing <a href="https://steipete.me/posts/just-talk-to-it">good things about Codex</a> so thought this would be a nice opportunity to give it a try; PixelMelt didn&#8217;t publicly share the code so I&#8217;d have to recreate it. I gave both Codex and Claude Code the article content and let them run with it. Both failed pretty miserably<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>, but Codex failed <em>less miserably</em>, so I decided to switch over for the rest of the project.</p><p>Amidst its failure Claude did uncover <a href="https://shkspr.mobi/blog/2025/10/improving-pixelmelts-kindle-web-deobfuscator/">a more recent article</a> with improvements to the original deobfuscation methodology&#8212;render all the glyphs, stitch them together into their page structure, screenshot the page, OCR the screenshot. Armed with this new approach + some more intentional instructing, Codex got the job done pretty quickly.</p><h3>1.2 Kindle Access</h3><p>That solved extracting text once I had the <code>.tar</code> file, but I still needed to be able to programmatically download the file and update reading position in the book. I pretty quickly found the <a href="https://github.com/Xetera/kindle-api">kindle-api</a> library for access to the Kindle Web API, but there were two problems:</p><ol><li><p>Only books you buy on the Kindle store show up in the web reader, but I purchase most of my books elsewhere</p></li><li><p>The library only supported retrieving high-level book info</p></li></ol><p>I spent ~45 minutes with <a href="https://github.com/mitmproxy/mitmproxy">mitmproxy</a> to try to reverse engineer the Kindle Mac reader&#8217;s API, but it was a lot harder to figure out than the web API and I didn&#8217;t want to spend the whole project just reverse engineering, so I decided that being limited to books bought on the Kindle store was fine.</p><p>The original repo has you log into the web reader, inspect element, and manually copy cookies (valid for a year) from the network requests. The endpoints I needed had additional shorter-lived tokens, though, so I needed a better way to collect the info.</p><p>Rather than trying to automate with puppeteer/playwright and beat Amazon&#8217;s captcha, I just put an embedded browser in the iOS app&#8212;the user logs in manually, and the app grabs the required cookies from the network requests.</p><h2>2.0 Building the App</h2><p>Now that I had all the pieces, time to put it together!</p><h3>2.1 Architecture</h3><p>I used Fastify for the server and SwiftUI for the iOS app, with <code>WKWebView</code> + injected JavaScript to capture Amazon credentials from network requests. 
The server calls my Python deobfuscation code as a subprocess and relies on my <a href="https://github.com/ryanbbrown/kindle-api">fork of kindle-api</a> for Kindle access.</p><p>The initial version was minimal: download the book content, convert it to text, and serve text chunks to the app. Before integrating text-to-speech and reading progress, I had to make a decision about data storage.</p><p>There are a lot of artifacts to manage&#8212;raw <code>.tar</code> files, extracted JSONs, page images, OCR text, audio files. I could have set up S3 or a database, but that felt like overkill for a self-hosted single-user app. Instead, I went with the simplest option: store everything in a local data folder within the server. If the server dies, so does my data, but for relatively ephemeral audiobook snippets that&#8217;s probably fine.</p><h3>2.2 Audio &amp; Position Sync</h3><p>I used ElevenLabs for text-to-speech; they have a <a href="https://elevenlabs.io/docs/api-reference/text-to-speech/convert-with-timestamps">with-timestamps endpoint</a> that provides the character-level timing information needed to sync reading progress with audio position. I stored a mapping of audio timestamps to Kindle position IDs at 5-second intervals, so the app knows where to update progress as playback advances.</p><p>I then added a frame-synced listener to the audio playback that checks every frame whether the current time has crossed a benchmark checkpoint. When it does, it hits the <code>/books/:asin/progress</code> endpoint with the corresponding Kindle position ID, syncing the progress back to Amazon.</p><h2>3.0 Debugging</h2><p>Unsurprisingly, getting the above functionality working wasn&#8217;t as painless as I made it sound.</p><h3>3.1 Oh you want another book?</h3><p>Up to this point, I had been testing on the only book that I had in the Kindle web viewer: <em>Wind and Truth</em> by Brandon Sanderson. I figured it was time to expand, so I got a free sample book and ran it through the app, but the content download failed. I compared sample vs. free vs. paid books, but only Wind and Truth was working. Looking at full book metadata didn&#8217;t reveal anything.</p><p>Codex actually wasn&#8217;t much help here; I ended up discovering the issue myself: <code>renderRevision</code>, one of the required arguments for the content download endpoint, was hardcoded to the value for <em>Wind and Truth</em>. I guess it slipped under my AI-slop radar. It&#8217;s not the <em>craziest</em> thing; most of the args for the download endpoint are hardcodes (like fontSize, height, width, margins, etc.), and there&#8217;s also a kindleSessionId parameter that&#8217;s just a randomly generated UUID.</p><h3>3.2 Corruption</h3><p>Once I had that fixed, the download worked, but I ran into another error: the <code>.tar</code> download was incomplete and didn&#8217;t always have the JSON files we needed to reconstruct the text.</p><p>Codex did some analysis of the file contents and discovered that all the files were present in the <code>.tar</code>, but some of them were corrupt and didn&#8217;t get properly extracted. It thought the issue was on Amazon&#8217;s side, but I wasn&#8217;t convinced. After some testing, I discovered that copying the raw cURL request made by the browser worked, but making a request from my server with the exact same parameters would fail.</p><p>That narrowed it down to the way my server sent requests vs. a raw cURL, and I finally figured it out: the TLS proxy. 
All requests from my server were being proxied through an external <a href="https://github.com/bogdanfinn/tls-client-api">tls-client-api</a>, which is required due to changes in Amazon&#8217;s TLS fingerprinting in July 2023 (discovered by the original developer of kindle-api). The proxy mimics Chrome&#8217;s TLS fingerprint so Amazon&#8217;s servers accept the requests&#8212;requests from standard Node.js libraries get rejected because their TLS handshake doesn&#8217;t match a real browser.</p><p>I hadn&#8217;t changed any of the TLS proxy code, and the default setting was to treat every response as text. This worked fine for the original kindle-api functionality of fetching book data, but breaks with tarballs. It basically tries to shove the binary tarball into a UTF-8 string before returning it to our server, which then has to encode it (sometimes unsuccessfully) before writing to disk. Once I added a setting to treat the <code>.tar</code> responses as binary, everything worked perfectly.</p><h2>4.0 Deployment &amp; Polish</h2><p>There was still a lot of functionality to add, but the core pipeline was working, so I deployed the app to make sure there wasn&#8217;t anything I failed to consider from a deployment perspective.</p><h3>4.1 Server Deployment</h3><p>Deploying the server to <a href="http://fly.io/">fly.io</a> ended up being pretty easy. The Dockerfile uses a multi-stage build to install three runtimes&#8212;Node (for the server), Python (for text deobfuscation), and Go (for the TLS client)&#8212;but otherwise nothing fancy.</p><p>As an aside, if you haven&#8217;t used <a href="http://fly.io/">fly.io</a> I would <strong>highly</strong> recommend. It&#8217;s not as fully featured as AWS or Azure but is so much easier to work with and makes deployment a breeze for side projects.</p><h3>4.2 iOS Signing</h3><p>There were a few options for bundling the iOS app and actually testing it on my phone vs. the iPhone simulator on Mac. Since I&#8217;m not planning to publish this to the app store and didn&#8217;t want to pay Apple $99/year for a developer account, I went with the free option&#8212;sign the app every 7 days when the provisioning profile expires.</p><p>There&#8217;s a ton of permission/certificate stuff to go through on Mac that I hadn&#8217;t done before, but eventually got it up and running! 
On-demand audiobooks for Kindle that properly sync my reading position.</p><p>I was also tired of the empty app icon so I quickly had ChatGPT generate one for me; it looked exactly how you would expect when asking AI to make a &#8220;minimal app icon for AI generated audiobooks&#8221;:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-1YC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ae0aed9-e73f-40d1-b4e3-24f7dd69c02c_180x180.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-1YC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ae0aed9-e73f-40d1-b4e3-24f7dd69c02c_180x180.png 424w, https://substackcdn.com/image/fetch/$s_!-1YC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ae0aed9-e73f-40d1-b4e3-24f7dd69c02c_180x180.png 848w, https://substackcdn.com/image/fetch/$s_!-1YC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ae0aed9-e73f-40d1-b4e3-24f7dd69c02c_180x180.png 1272w, https://substackcdn.com/image/fetch/$s_!-1YC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ae0aed9-e73f-40d1-b4e3-24f7dd69c02c_180x180.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-1YC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ae0aed9-e73f-40d1-b4e3-24f7dd69c02c_180x180.png" width="150" height="150" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ae0aed9-e73f-40d1-b4e3-24f7dd69c02c_180x180.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:180,&quot;width&quot;:180,&quot;resizeWidth&quot;:150,&quot;bytes&quot;:46227,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.ryanbbrown.com/i/185680596?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ae0aed9-e73f-40d1-b4e3-24f7dd69c02c_180x180.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-1YC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ae0aed9-e73f-40d1-b4e3-24f7dd69c02c_180x180.png 424w, https://substackcdn.com/image/fetch/$s_!-1YC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ae0aed9-e73f-40d1-b4e3-24f7dd69c02c_180x180.png 848w, https://substackcdn.com/image/fetch/$s_!-1YC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ae0aed9-e73f-40d1-b4e3-24f7dd69c02c_180x180.png 1272w, 
https://substackcdn.com/image/fetch/$s_!-1YC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ae0aed9-e73f-40d1-b4e3-24f7dd69c02c_180x180.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>4.3 Post-Deployment Fixes</h3><p>The first thing I noticed after deploying is that the app slowed down a <strong>lot</strong> when using the remote server hosted on fly vs. when I was testing locally. Turns out Tesseract OCR is very fast on my Macbook M4 Pro but not so fast on a tiny 1x shared CPU machine. Easy enough to switch over to using an API from <a href="https://ocr.space/">ocr.space</a>, although I had to change from their default engine to &#8220;Engine 2&#8221; to handle the multi-column text and maintain proper reading order.</p><p>Other updates:</p><ul><li><p><strong>Logging</strong>: I was getting by with the default logging on the Fastify server / TLS proxy but it was getting a bit painful, so I added proper logging</p></li><li><p><strong>Security</strong>: Someone could theoretically clone my repo, point their app at my server, and use up my TTS credits, so I added a simple server API key (login would have been overkill)</p></li><li><p><strong>iOS App</strong>: I cleaned up the app code to use MVVM architecture and cleaned up the UI a bit</p></li></ul><h2>5.0 Additional Features</h2><p>Time for a few quality-of-life improvements.</p><h3>5.1 Cartesia + LLM Preprocessing</h3><p>I was getting close to the credit limit for ElevenLabs and their cheapest subscription didn&#8217;t allow for usage-based pricing, so it was time to add another TTS provider. <a href="https://cartesia.ai/sonic">Cartesia</a> had a really impressive demo on their site and is cheaper than ElevenLabs, so I decided to give it a try. They provide time benchmarks at the word level rather than the character level, but that was fine given that my app only sends progress updates every 5 seconds.</p><p>To take advantage of Cartesia&#8217;s emotion and pause features, I added an LLM layer that puts emotion tags and pauses in the text (along with cleaning up the occasional spacing mistake from OCR). I was really excited about emotion, but it only seemed to work with shorter texts where the entire thing has the same emotion. From the <a href="https://docs.cartesia.ai/build-with-cartesia/sonic-3/ssml-tags">documentation</a>:</p><blockquote><p>Emotion control is highly experimental, particularly when emotion shifts occur mid-generation.</p></blockquote><p>Given that these are long book passages with emotion shifts throughout, it&#8217;s not surprising that it didn&#8217;t work<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>. The OCR cleanup + pauses still helped a bit but it takes a while (even with small models like gpt-5-nano), so I left it in as an optional step.</p><h3>5.2 Partial Generation</h3><p>I was tired of always generating the full ~8 minutes worth of content, so I added duration selection (1-8 minutes). This had some interesting implications.</p><p>First, since the downloaded content covers a fixed range (typically 8 minutes worth), I needed to slice the text to the appropriate length for the requested duration. 
I used a simple linear proportion&#8212;if you request 5 minutes, you get ~5/8 of the text&#8212;and aligned the slice to the nearest sentence boundary so audio doesn&#8217;t cut off mid-sentence.</p><p>Second, there&#8217;s the question of overlap: what happens if you generate 5 minutes starting at position 50000, then later request 5 minutes starting at position 52000? For simplicity, I store multiple audio artifacts per provider per chunk, and only reuse an existing artifact if it <strong>fully covers</strong> the requested range. So if you have audio covering 50000-55000 and request 52000-57000, it will regenerate (the existing artifact doesn&#8217;t cover past 55000). This avoids the complexity of stitching audio files together.</p><h3>5.3 Finishing Touches</h3><p>A few final additions:</p><ul><li><p><strong>Library</strong>: Delete or re-play previously generated audiobooks</p></li><li><p><strong>Resume</strong>: Use current Kindle position to automatically seek to the right spot when resuming playback of a partially-complete audiobook</p></li><li><p><strong>UI:</strong> Proper overhaul to feel more like an iOS app with different tabs; everything was previously on one screen for easier debugging</p></li></ul><h2>The final result</h2><p>So finally, after much strife, I achieved what I sought after: on-demand AI audiobooks that would sync properly with my Kindle.</p><div id="youtube2-zyrIhUSQ_O0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;zyrIhUSQ_O0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/zyrIhUSQ_O0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Was it worth it? 
Let&#8217;s break it down:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!81tP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F224b7200-b954-45a4-872b-39b53ff7fe6c_982x354.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!81tP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F224b7200-b954-45a4-872b-39b53ff7fe6c_982x354.png 424w, https://substackcdn.com/image/fetch/$s_!81tP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F224b7200-b954-45a4-872b-39b53ff7fe6c_982x354.png 848w, https://substackcdn.com/image/fetch/$s_!81tP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F224b7200-b954-45a4-872b-39b53ff7fe6c_982x354.png 1272w, https://substackcdn.com/image/fetch/$s_!81tP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F224b7200-b954-45a4-872b-39b53ff7fe6c_982x354.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!81tP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F224b7200-b954-45a4-872b-39b53ff7fe6c_982x354.png" width="568" height="204.75763747454175" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/224b7200-b954-45a4-872b-39b53ff7fe6c_982x354.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:354,&quot;width&quot;:982,&quot;resizeWidth&quot;:568,&quot;bytes&quot;:71479,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.ryanbbrown.com/i/185680596?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F224b7200-b954-45a4-872b-39b53ff7fe6c_982x354.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!81tP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F224b7200-b954-45a4-872b-39b53ff7fe6c_982x354.png 424w, https://substackcdn.com/image/fetch/$s_!81tP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F224b7200-b954-45a4-872b-39b53ff7fe6c_982x354.png 848w, https://substackcdn.com/image/fetch/$s_!81tP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F224b7200-b954-45a4-872b-39b53ff7fe6c_982x354.png 1272w, https://substackcdn.com/image/fetch/$s_!81tP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F224b7200-b954-45a4-872b-39b53ff7fe6c_982x354.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>I think the table speaks for itself. 
</p><p>(See <a href="https://github.com/ryanbbrown/kindle-storyteller">here</a> for the full codebase)</p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>Will emphasize that most of this project was completed in November; coding agents have continued to improve rapidly since then, and parts of it would likely be quite a bit easier if I were to do it today. The journey was still a lot of fun, though.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>I could technically stitch together the results from a bunch of different API calls with different emotions, but that seemed like it would be take a lot of integration effort so I didn&#8217;t pursue.</p></div></div>]]></content:encoded></item><item><title><![CDATA[AI will never master PowerPoint]]></title><description><![CDATA[Why visual editors are the wrong interface for LLMs]]></description><link>https://blog.ryanbbrown.com/p/ai-will-never-master-powerpoint</link><guid isPermaLink="false">https://blog.ryanbbrown.com/p/ai-will-never-master-powerpoint</guid><dc:creator><![CDATA[Ryan Brown]]></dc:creator><pubDate>Tue, 16 Dec 2025 13:18:22 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6d7da414-d6ab-4734-b0d1-dc3ccd8c6fd8_1264x848.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>PowerPoint&#8217;s modern XML-based format was introduced in 2007, long before anyone imagined computers could reason about layout and make presentations for us. The core assumption was simple: a human will drag shapes/text around a canvas and visually confirm that things look right.</p><p>That assumption no longer holds.</p><p>Progress has been made, but AI will never truly master PowerPoint because PowerPoint was built to be manipulated by hand, not generated programmatically. This is just one example of a broader trend: <strong>formats that are purpose-built for AI to create and edit will eventually displace those designed for humans alone</strong>.</p><h2>Why powerpoint is hard for AI</h2><p>Consider a task that takes two lines of CSS but requires AI to perform five separate calculations in PowerPoint: making items equally spaced.</p><h3>HTML/CSS: Represents Intent</h3><pre><code><code>.container {
  display: flex;
  justify-content: space-evenly;
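  /* the browser computes every gap; resizing the container re-spaces the items automatically */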
}</code></code></pre><p>This code says: <strong>&#8220;arrange the items inside this container with equal spacing between them.&#8221;</strong> The browser figures out all the pixel math. If you resize the container, the spacing recalculates automatically. The AI doesn&#8217;t need to know the container width, the number of items, or do any arithmetic. It just states the intent.</p><h3>PowerPoint: Represents Outcomes</h3><p>Under the hood, a .pptx file is just a ZIP archive containing XML files&#8212;specifically, a format called Office Open XML (OOXML). Every shape, every text box, every image is defined in XML with absolute positioning:</p><pre><code><code>&lt;p:sp&gt;
  &lt;p:spPr&gt;
    &lt;a:xfrm&gt;
      &lt;a:off x="1676400" y="914400"/&gt;    &lt;!-- Position in EMUs --&gt;
      &lt;a:ext cx="1219200" cy="1219200"/&gt; &lt;!-- Size in EMUs --&gt;
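      &lt;!-- 914400 EMUs = 1 inch; the file stores only literal coordinates, never layout intent --&gt;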
    &lt;/a:xfrm&gt;
  &lt;/p:spPr&gt;
&lt;/p:sp&gt;
</code></code></pre><p>To create &#8220;equally spaced&#8221; items, the AI must:</p><ol><li><p>Read the slide width from the slide size XML element</p></li><li><p>Read each shape&#8217;s width from the <code>&lt;a:ext&gt;</code> XML element shown above</p></li><li><p>Calculate spacing: (slide_width - sum(shape_widths)) / (n + 1)</p></li><li><p>Calculate each shape&#8217;s x position: spacing + sum(prev_shape_widths) + (i * spacing)</p></li><li><p>Write each x position to the <code>&lt;a:off&gt;</code> XML element shown above</p></li></ol><p>&#8220;Equally spaced&#8221; and &#8220;arbitrarily placed&#8221; require the exact same amount of work; they&#8217;re just different numbers. There&#8217;s no semantic distinction in the data model.</p><p>You could create friendlier wrappers and APIs around the XML content, but that&#8217;s a clunky workaround rather than a proper solution to the problem.</p><h2>What&#8217;s the alternative?</h2><h3>1. AI-native presentation software</h3><p>Products that are built with AI in mind can design their internal object model in a way that makes it easy for AI to create and modify.</p><p><a href="https://gamma.app/">Gamma</a> is the best example of this approach. Recently valued at $2.1B with $100M in ARR, they&#8217;ve grown rapidly and have a very different product&#8212;instead of absolutely-positioned shapes, they use scrollable, block-based &#8216;cards&#8217; that feel more like web documents.</p><p>They don&#8217;t provide a ton of visibility into their internal representation, but the focus on structured content + strong AI features make me confident that it&#8217;s significantly easier for AI to work with than PowerPoint&#8217;s OOXML.</p><h3>2. PowerPoint as an export format</h3><p>Even if you&#8217;re set on creating a .pptx file, using PowerPoint isn&#8217;t necessarily the best way for AI to get there.</p><p>Anthropic&#8217;s <a href="https://github.com/anthropics/skills/tree/main/skills/pptx">Claude Code skill</a> has the best output of any general-purpose AI I&#8217;ve seen, and it creates slides by writing HTML that gets converted into a PowerPoint file using a ~1000 line conversion script built on PptxGenJS, Playwright, and Sharp. The data model that the AI actually works with is HTML, <em>not</em> PowerPoint<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>.</p><p>If this pattern grows, PowerPoint could end up as a legacy export format rather than the primary place people actually do work.</p><h3>3. Code-native presentations</h3><p>If LLMs are great at writing HTML, why convert to .pptx at all? <a href="https://revealjs.com/">Reveal.js</a> is an open-source HTML presentation framework that makes slides first-class web documents. 
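A deck is just a web page of nested <code>&lt;section&gt;</code> elements. Here&#8217;s a minimal sketch (the CDN paths assume the standard reveal.js 4.x <code>dist</code> layout) showing a slide with speaker notes and a vertical stack:</p><pre><code><code>&lt;!doctype html&gt;
&lt;html&gt;
  &lt;head&gt;
    &lt;link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/reveal.js/dist/reveal.css"&gt;
    &lt;link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/reveal.js/dist/theme/black.css"&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;div class="reveal"&gt;
      &lt;div class="slides"&gt;
        &lt;section&gt;
          &lt;h2&gt;Title slide&lt;/h2&gt;
          &lt;aside class="notes"&gt;Speaker notes live here.&lt;/aside&gt;
        &lt;/section&gt;
        &lt;!-- nested sections become a vertical stack for drilling into a subtopic --&gt;
        &lt;section&gt;
          &lt;section&gt;&lt;h2&gt;Topic&lt;/h2&gt;&lt;/section&gt;
          &lt;section&gt;&lt;h2&gt;Subtopic detail&lt;/h2&gt;&lt;/section&gt;
        &lt;/section&gt;
      &lt;/div&gt;
    &lt;/div&gt;
    &lt;script src="https://cdn.jsdelivr.net/npm/reveal.js/dist/reveal.js"&gt;&lt;/script&gt;
    &lt;script&gt;Reveal.initialize();&lt;/script&gt;
  &lt;/body&gt;
&lt;/html&gt;</code></code></pre><p>Because the deck is plain markup, an agent can generate or revise it the same way it edits any other HTML file. 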
The advantages:</p><ul><li><p>Makes AI creation and iteration easy with plain HTML+CSS and <a href="https://github.com/astefanutti/decktape">DeckTape</a> for quick slide screenshots</p></li><li><p>Includes the core features presenters expect: speaker notes, slide transitions, progressive content reveal, PDF export</p></li><li><p>Offers new features that traditional software can&#8217;t match: syntax-highlighted code with line-by-line reveal, vertical slide stacks for drilling into subtopics, and embedded interactive content</p></li></ul><p>The one thing Reveal.js lacks out of the box is an interactive visual editor, but <a href="http://Slides.com">Slides.com</a> attempts to fill that gap<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a>&#8212;it&#8217;s a GUI built on top of Reveal.js. This inverts PowerPoint&#8217;s approach: instead of building a visual editor first and treating the data model as an afterthought, you start with an AI-friendly format and layer human editing on top.</p><p>I put together a quick Claude Code skill for Reveal.js (<a href="https://github.com/ryanbbrown/revealjs-skill">revealjs-skill</a>) and put it head to head vs. Anthropic&#8217;s pptx skill to generate a pitch deck for a fictional legal AI startup. Here&#8217;s the <a href="https://github.com/ryanbbrown/revealjs-skill/blob/main/examples/prompt.md">prompt I used</a>, the <a href="https://ryanbbrown.com/revealjs-skill/examples/pptx/presentation.pdf">pptx skill output</a>, and the <a href="https://ryanbbrown.com/revealjs-skill/examples/revealjs/presentation.pdf">revealjs skill output</a>.</p><p>I&#8217;m a little biased but would argue that the Reveal.js output looks better, and it took less time to generate; see the footnotes for a full breakdown<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>.</p><h2>Beyond PowerPoint</h2><p>Presentations aren&#8217;t the only category where legacy formats aren&#8217;t ideal for AI; this pattern appears across productivity software.</p><h3>Documents</h3><p>Word&#8217;s .docx format has a similar structure to PowerPoint: complex XML with formatting specified in separate blocks per sentence/word/character. Word focuses on storing the visual rendering details of the content, which aligns with a human-designed editor, while Markdown has been used in codebases for decades and much more concisely stores the content + structure.</p><p>Markdown-like editors (Notion, Obsidian, GitHub) are much more common than presentation editors and it&#8217;s generally an easier format to convert to, so there&#8217;s less friction&#8212;Markdown is already a go-to for AI-generated reports or documents, with exports to Word becoming less common.</p><h3>Spreadsheets</h3><p>Continuing down the Microsoft Office suite, the same pattern is emerging for Excel.</p><p><a href="https://univer.ai/">Univer</a> is an open-source framework for building &#8220;AI-native spreadsheets&#8221; and currently leads the <a href="https://spreadsheetbench.github.io/">SpreadsheetBench leaderboard</a> at 68.9%, beating Microsoft Copilot&#8217;s Agent Mode (57.2%) and Claude (42.9%) by a significant margin.</p><p>Unlike add-ons that layer AI directly on top of Excel, Univer is a totally independent spreadsheet engine built from the ground up for natural language interaction. 
It can import and export .xlsx files, but that&#8217;s legacy compatibility rather than a primary format.</p><h3>Business Intelligence</h3><p>Tableau and Power BI have historically dominated the dashboarding market, but their formats have limitations similar to PowerPoint. Tableau&#8217;s .twb files are XML documents encoding visual layouts with absolute coordinates, and Power BI&#8217;s .pbix files bundle similar positional data with a proprietary query language. Both were designed around drag-and-drop interfaces where humans visually arrange charts on a canvas.</p><p>A growing category of &#8220;BI-as-code&#8221; tools take a different approach. <a href="https://evidence.dev/">Evidence.dev</a> uses SQL for queries and Markdown for layout; <a href="https://www.lightdash.com/">Lightdash</a> defines metrics in YAML and integrates with dbt. Instead of encoding &#8220;place this bar chart at coordinates (1200, 400),&#8221; these tools let AI write SQL queries and declare charts in Markdown or YAML&#8212;formats it already understands.</p><h2>Conclusion</h2><p>In the short term, building to extend and automate legacy human tools makes a lot of sense; it&#8217;s where everyone is already doing work. In the long term, though, if AI is really going to transform and potentially automate knowledge work as we know it&#8212;why in the world would it be using PowerPoint?</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.ryanbbrown.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Ryan's Substack! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>To edit existing files, Claude does modify the OOXML directly, but results have been <a href="https://www.smithstephen.com/p/how-to-get-claude-to-create-powerpoint#:~:text=The%20results%20were%20less%20than%20desirable">mixed</a>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p><a href="http://slides.com/">Slides.com</a> has some limitations when trying to edit AI-generated Reveal.js, but it still proves the concept and is a step in the right direction.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>The Reveal.js skill took 6m 23s, whereas the pptx skill took 9m 52s. 
That&#8217;s the happy path for pptx; another time where I tried it (from the web UI) it took over 15 minutes because it was having issues with footnotes getting cut off, and the cycle of html &#8594; pptx &#8594; screenshot is quite slow.</p><p>Stylistic choices aside, there are some objective issues with the pptx skill&#8217;s output:</p><ul><li><p>I had to manually open it in PowerPoint and print to pdf for the charts to work</p></li><li><p>The chart on page 7 is awkwardly small</p></li><li><p>The box on page 8 is completely empty</p></li><li><p>There&#8217;s no margin between the bottom of the content and bottom of the slide on page 8</p></li><li><p>The spacing/kerning on the text is a bit off throughout (e.g. &#8220;ae&#8221; and &#8220;ez&#8221; in &#8220;Mich<strong>ae</strong>l Rodrigu<strong>ez</strong>&#8221; on page 10)</p></li><li><p>Some of the text in the bar chart on page 11 (bottom left) has poor visibility/contrast</p></li></ul></div></div>]]></content:encoded></item><item><title><![CDATA[Thoughts on the AI bubble]]></title><description><![CDATA[Understanding tech cycles and what's different about AI]]></description><link>https://blog.ryanbbrown.com/p/thoughts-on-the-ai-bubble</link><guid isPermaLink="false">https://blog.ryanbbrown.com/p/thoughts-on-the-ai-bubble</guid><dc:creator><![CDATA[Ryan Brown]]></dc:creator><pubDate>Thu, 13 Nov 2025 23:04:49 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/13e02c5b-36c8-4ebb-8f07-cd44dc3642d2_1376x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>There&#8217;s been <a href="https://www.thevccorner.com/p/coatue-ai-report-18-charts">a</a> <a href="https://www.derekthompson.org/p/this-is-how-the-ai-bubble-will-pop">lot</a> <a href="https://danielmiessler.com/blog/revisiting-the-ai-bubble">of</a> <a href="https://www.goldmansachs.com/pdfs/insights/goldman-sachs-research/ai-in-a-bubble/report.pdf">discussion</a> recently about the <a href="https://www.wired.com/story/ai-bubble-will-burst/">AI</a> <a href="https://insights.som.yale.edu/insights/this-is-how-the-ai-bubble-bursts">bubble</a>, usually involving a claim about whether AI is actually in a bubble and predictions on how it might pop.</p><p>Instead of doing that, I&#8217;m going to synthesize insights from across the discussion in one place&#8212;why tech bubbles form, how they pop, what feels different about AI in that context, and how to respond.</p><p>AI is going through the same fundamental cycle as past technologies, but it&#8217;s unique because it has two distinct bubble mechanisms: inflated valuations and infrastructure overbuilds. Regardless of whether the bubble pops, there&#8217;s going to be real disillusionment, but in the long run AI will still be a transformative technology.</p><p>Let&#8217;s dive in.</p><h2>Why bubbles?</h2><p>Things feel a little crazy right now&#8212;valuations, marketing, rush to put &#8220;AI&#8221; on everything&#8212;and that&#8217;s to be expected. 
Virtually every transformative technology goes through a hype cycle, and we&#8217;re squarely in the &#8220;peak of inflated expectations&#8221; as defined by Gartner.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tJjf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bd99b-cfc9-4798-83ff-0e6e41eef3e4_1540x1094.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tJjf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bd99b-cfc9-4798-83ff-0e6e41eef3e4_1540x1094.png 424w, https://substackcdn.com/image/fetch/$s_!tJjf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bd99b-cfc9-4798-83ff-0e6e41eef3e4_1540x1094.png 848w, https://substackcdn.com/image/fetch/$s_!tJjf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bd99b-cfc9-4798-83ff-0e6e41eef3e4_1540x1094.png 1272w, https://substackcdn.com/image/fetch/$s_!tJjf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bd99b-cfc9-4798-83ff-0e6e41eef3e4_1540x1094.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tJjf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bd99b-cfc9-4798-83ff-0e6e41eef3e4_1540x1094.png" width="416" height="295.42857142857144" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d36bd99b-cfc9-4798-83ff-0e6e41eef3e4_1540x1094.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1034,&quot;width&quot;:1456,&quot;resizeWidth&quot;:416,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tJjf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bd99b-cfc9-4798-83ff-0e6e41eef3e4_1540x1094.png 424w, https://substackcdn.com/image/fetch/$s_!tJjf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bd99b-cfc9-4798-83ff-0e6e41eef3e4_1540x1094.png 848w, https://substackcdn.com/image/fetch/$s_!tJjf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bd99b-cfc9-4798-83ff-0e6e41eef3e4_1540x1094.png 1272w, https://substackcdn.com/image/fetch/$s_!tJjf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd36bd99b-cfc9-4798-83ff-0e6e41eef3e4_1540x1094.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The AI companies recognize this and are using <a href="https://jenson.org/hype/">hype as a business tool</a> to allow them to continue raising huge amounts of money for as long as possible. The goal: survive the crash and initiate the slope of enlightenment by making a breakthrough before funding runs out.</p><p>AI has two additional elements that bring the hype levels even higher:</p><ol><li><p>It requires expensive infrastructure buildout, which brings in Perez&#8217;s <a href="https://carlotaperez.org/wp-content/downloads/books/btr-en/PEREZ_TRFC_Ch%207.pdf">theory of installation-deployment</a>: <strong>installation</strong> involves a finance-led frenzy and over-investment in infrastructure that ends in a crash; after a turning point and realignment, <strong>deployment</strong> sees economy-wide diffusion and productivity gains</p></li><li><p>It&#8217;s the first technology where it feels like there might be a path to significantly compress the trough: AI could spur a research productivity flywheel that shortens iteration cycles through faster coding and discovery</p></li></ol><h2>How bubbles pop</h2><p>Eventually the hype dies as the technology fails to live up to inflated expectations; a crash follows and we enter the &#8220;trough of disillusionment&#8221;. 
This crash often comes with the popping of a bubble, which can occur in two distinct ways<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>:</p><ol><li><p><strong>Valuation bubbles</strong> pop when overpriced companies collapse due to weak fundamentals</p></li><li><p><strong>Infrastructure bubbles</strong> pop when overbuilt capacity far exceeds demand</p></li></ol><p>The dot-com era is a classic example of a valuation bubble, and the 1990s telecom build-out followed an infrastructure bubble.</p><p>Looking at these past technology cycles<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> helps illuminate how AI has potential for both.</p><h2>A brief history</h2><p>The two kinds of bubbles correspond with different technology cycle types: infrastructure buildouts and software-led tech.</p><h4>Infrastructure Buildouts</h4><ul><li><p><strong>Telecommunications (late 1990s)</strong>: Telecom companies expected unlimited demand but built too fast and capacity went unused; the cheap bandwidth that remained was used to power the cloud + mobile revolutions</p></li><li><p><strong>Railroad (early 1870s)</strong>: Subsidized &#8220;essential&#8221; railroads were built far too quickly, and with profit a long way out, funding dried up; the important lines survived and the US benefitted heavily from them</p></li></ul><h4>Software-led Tech</h4><ul><li><p><strong>Dot-com</strong> <strong>(mid 1990s-2002)</strong>: Hype for the internet outweighed the need for presently strong financials, but shallow business models eventually failed; it wasn&#8217;t until later that massively profitable e-commerce, SaaS, and ads businesses emerged</p></li><li><p><strong>Mobile Phones</strong> <strong>(early 2000s)</strong><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a>: Having &#8220;the internet in your pocket&#8221; was supposed to &#8220;change everything&#8221;, but the technology wasn&#8217;t ready; several years later the real revolution actually began with the iPhone 3GS</p></li></ul><p>Not every hype cycle has an associated major bubble pop (e.g. mobile phones), but those that do are either an infrastructure bubble <strong>or</strong> a valuation bubble. AI is unique because there&#8217;s clear potential for both; inflated company valuations could crash and/or we could have a significant amount of excess data center capacity.</p><h2>Is it a bubble?</h2><p>I don&#8217;t think we&#8217;re in a bubble yet, but the current direction of travel raises concern<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-4" href="#footnote-4" target="_self">4</a>.</p><p>The analysis I find most helpful for understanding the current state + trend is Exponential View&#8217;s <a href="https://www.exponentialview.co/p/is-ai-a-bubble">&#8220;gauge&#8221; framework</a>. 
It uses five indicators that can be monitored and compared across current + historical situations to evaluate bubble risk; two reds = trouble, three reds = imminent trouble.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wJN9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220d08c5-6e82-4d26-a5a4-2ca45173402e_916x576.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wJN9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220d08c5-6e82-4d26-a5a4-2ca45173402e_916x576.png 424w, https://substackcdn.com/image/fetch/$s_!wJN9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220d08c5-6e82-4d26-a5a4-2ca45173402e_916x576.png 848w, https://substackcdn.com/image/fetch/$s_!wJN9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220d08c5-6e82-4d26-a5a4-2ca45173402e_916x576.png 1272w, https://substackcdn.com/image/fetch/$s_!wJN9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220d08c5-6e82-4d26-a5a4-2ca45173402e_916x576.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wJN9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220d08c5-6e82-4d26-a5a4-2ca45173402e_916x576.png" width="430" height="270.3930131004367" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/220d08c5-6e82-4d26-a5a4-2ca45173402e_916x576.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:576,&quot;width&quot;:916,&quot;resizeWidth&quot;:430,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wJN9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220d08c5-6e82-4d26-a5a4-2ca45173402e_916x576.png 424w, https://substackcdn.com/image/fetch/$s_!wJN9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220d08c5-6e82-4d26-a5a4-2ca45173402e_916x576.png 848w, https://substackcdn.com/image/fetch/$s_!wJN9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220d08c5-6e82-4d26-a5a4-2ca45173402e_916x576.png 1272w, https://substackcdn.com/image/fetch/$s_!wJN9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F220d08c5-6e82-4d26-a5a4-2ca45173402e_916x576.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container 
restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We&#8217;re already seeing slowdowns in progression of AI and these indicators have worsened from September to now<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-5" href="#footnote-5" target="_self">5</a>, so unless a major breakthrough comes soon it&#8217;s unlikely that the technology will live up to inflated expectations. We probably <a href="https://x.com/karpathy/status/1979644538185752935">have significant work left to do</a> and will go through a proper trough of disillusionment&#8212;which could either be associated with a bubble pop or a slow deflation.</p><p>If there is an imminent pop, it feels more likely to be from a valuation bubble than an infrastructure bubble. Data center construction will take years and capacity can be used in ways that don&#8217;t depend on demand (model training, continuous learning) while valuations and financial performance respond much more quickly.</p><h2>The playbook</h2><p>Bubble or not, I believe that AI will ultimately be a generational technology&#8212;even if it takes longer than expected and doesn&#8217;t look exactly like what we&#8217;re working with today. 
The path might vary but the destination will be the same, so individuals and companies should take action accordingly.</p><p>Some of my personal recommendations:</p><h4>For individuals</h4><ul><li><p>Become extremely proficient at using AI tools, regardless of your domain&#8212;it might be more relevant than you think (<a href="https://www.ai-supremacy.com/p/everybody-is-using-claude-code-for-more-than-code-ai-2025">Claude Code is for more than just code</a>)</p></li><li><p>Develop <a href="https://shyamal.me/blog/age-of-taste/">taste</a>&#8212;for design, for software, for writing, for analysis&#8212;discerning the quality of AI-generated [insert_artifact] will be a valuable skill for a long time</p></li><li><p><a href="https://ryanbbrown.com/learning-to-learn/">Learn to learn</a>, and learn often + diversely&#8212;learning is one of the highest-leverage things you can do, and being able to learn effectively ensures you stay relevant regardless of how AI disruption shakes out</p></li></ul><h4>For companies</h4><ul><li><p>Don&#8217;t ignore AI transformation, but start with real business problems rather than &#8220;shiny object&#8221; gen AI initiatives; there&#8217;s still a ton of uncaptured value in core data infrastructure + predictive AI</p></li><li><p>Build a culture that&#8217;s as strongly AI-native as possible (within industry-specific compliance boundaries); <a href="https://www.firstround.com/ai/shopify">Shopify</a> is a nice north star</p></li><li><p>Remember <a href="http://www.incompleteideas.net/IncIdeas/BitterLesson.html">the bitter lesson</a>; start simple and build solutions that scale as models improve, rather than being made obsolete by the next update</p></li></ul><p></p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>The split between infrastructure and valuation bubbles refers to the mechanics through which bubbles form and pop; Stratechery put together a <a href="https://stratechery.com/2025/the-benefits-of-bubbles/">much deeper analysis</a> that splits bubbles by benefit, which can span across the two types I laid out.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>I found it helpful to understand past cycles in more detail than I have space to show in this article; see my extended notes <a href="https://ryanbbrown.com/historical-tech-cycles/">here</a>.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>Mobile phones arguably involved some infrastructure building as well, but infrastructure wasn&#8217;t the primary mechanism for hype.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-4" href="#footnote-anchor-4" class="footnote-number" contenteditable="false" target="_self">4</a><div class="footnote-content"><p>This position is (intentionally) more about timing than a binary yes/no; as mentioned in the intro, my intent isn&#8217;t to make a super strong claim.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-5" href="#footnote-anchor-5" class="footnote-number" contenteditable="false" 
target="_self">5</a><div class="footnote-content"><p>Full breakdown of the changes from September to now (<a href="https://boomorbubble.ai/">live dashboard</a>):<strong><br>Industry Strain</strong> <strong>(Capex/GDP)</strong>: 0.9% (green) &#8594; 0.8% (green)<br><strong>Industry Strain (Investment/Revenue)</strong>: 6x (<s>yellow</s>) &#8594; 8.5x (red)<br><strong>Revenue Growth (doubling time)</strong>: 1 year (green) &#8594; 1 year (green)<br><strong>Valuation Heat (Price/Earnings)</strong>: 32 (<s>green</s>) &#8594; 35 (yellow)<br><strong>Funding Quality (Composite index)</strong>: 1.1 (<s>green</s>) &#8594; 1.4 (yellow)</p></div></div>]]></content:encoded></item></channel></rss>