Yek: The Fast Code Processor for LLM Projects

Hey there! Ever found yourself struggling to feed your large codebase into an LLM? You know, that moment when you’re trying to get GPT to understand your entire project, but you’re not quite sure how to package it all up efficiently?

Well, I’ve been there, and that’s exactly why I want to tell you about this amazing tool called Yek that’s been a total game-changer for me.

What’s Yek, and Why Should You Care?

You know how we often need to feed our code repositories into LLMs for analysis or documentation? Yek is a fast, Rust-based command-line tool that does the heavy lifting for you: it reads the text files in your repository and packages them into chunks an LLM can take in.

It’s like having a super-efficient assistant that knows exactly how to package your code for LLM consumption.

Let me share a quick story: Just the other day, I was working on a project where I needed to analyze a massive codebase with ChatGPT. Using traditional methods would have been a nightmare, but Yek processed everything in seconds. For context, the project's own benchmarks report it running about 230 times faster than Repomix, a similar tool!

Key Features That Make Yek Special

Smart File Selection

  • Automatically follows your `.gitignore` rules (so no more accidentally processing those `node_modules`!)
  • Uses Git history to figure out which files are actually important (pretty clever, right?)
  • Intelligently skips binary files and other stuff you don’t need
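To make the "skips binary files" idea concrete, here is a minimal Python sketch of one common heuristic: treat a file as binary if its first few kilobytes contain a NUL byte. This is an assumption for illustration only. The post doesn't describe how Yek actually detects binaries, and the names `looks_binary` and `text_files` are hypothetical.

```python
from pathlib import Path


def looks_binary(path: Path, sample_size: int = 8192) -> bool:
    """Heuristic (assumed, not Yek's documented method): a NUL byte in the
    first sample_size bytes marks the file as binary."""
    with path.open("rb") as handle:
        sample = handle.read(sample_size)
    return b"\x00" in sample


def text_files(root: Path) -> list[Path]:
    """Return regular files under root that do not look binary, sorted by path."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and not looks_binary(p)
    )
```

Note that this sketch does not read `.gitignore`, so a real tool would filter ignored paths first.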

Effortless Processing

  • Splits content into manageable chunks based on either tokens or byte size
  • Handles multiple directories in a single command
  • Configurable through a simple `yek.toml` file
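If you're curious what chunking by byte size can look like, here's a rough Python sketch. It greedily packs whole files into chunks that stay under a byte limit. It's my own illustration of the idea, not Yek's implementation, and `chunk_by_bytes` is a hypothetical name.

```python
def chunk_by_bytes(
    files: list[tuple[str, str]], max_bytes: int
) -> list[list[tuple[str, str]]]:
    """Greedily pack (path, content) pairs into chunks of at most max_bytes.

    Files are never split: a single file larger than max_bytes still gets
    a chunk of its own.
    """
    chunks: list[list[tuple[str, str]]] = []
    current: list[tuple[str, str]] = []
    current_size = 0
    for path, content in files:
        size = len(content.encode("utf-8"))  # measure bytes, not characters
        if current and current_size + size > max_bytes:
            chunks.append(current)           # close the chunk that is full
            current, current_size = [], 0
        current.append((path, content))
        current_size += size
    if current:
        chunks.append(current)               # flush the final partial chunk
    return chunks
```

Keeping files whole is a design choice in this sketch: an LLM usually copes better with complete files than with a file cut in half at a chunk boundary.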

Getting Started is Super Easy

Want to give it a try? Installation is a breeze:

For Mac or Linux users:

```bash
curl -fsSL https://bodo.run/yek.sh | bash
```

For Windows users:

```powershell
irm https://bodo.run/yek.ps1 | iex
```

Let’s say you have a typical web application structure like this:

```text
.
├── README.md
├── package.json
├── src
│   ├── components
│   │   ├── Header.js
│   │   └── Footer.js
│   ├── pages
│   │   ├── index.js
│   │   └── about.js
│   └── styles
│       └── main.css
└── tests
    └── components.test.js
```

The simplest way to use Yek? Just run:

```bash
yek
```

That’s it! Seriously! It’ll process everything and tell you where it saved the output. But here’s where it gets really cool – Yek is smart enough to prioritize important files to appear last in the output, which is perfect for LLMs since they tend to pay more attention to content that appears later.
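Here's a tiny Python sketch of that "important files last" ordering, using paths from the example project above. The scores are made up for illustration. Yek derives its own priorities (from Git history, among other things), and `order_for_llm` is a hypothetical helper, not part of Yek.

```python
def order_for_llm(priorities: dict[str, int]) -> list[str]:
    """Sort paths so the highest-priority file comes last; ties break by path."""
    return sorted(priorities, key=lambda path: (priorities[path], path))


# Hypothetical scores: a higher number means a more important file.
scores = {
    "src/pages/index.js": 8,
    "README.md": 2,
    "src/components/Header.js": 8,
    "tests/components.test.js": 1,
    "src/styles/main.css": 3,
}

for path in order_for_llm(scores):
    print(path)
```

With these scores, the test file is printed first and `src/pages/index.js` last, so the most important code sits closest to the end of the prompt.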

Pro Tips from Personal Experience

  • Piping to Clipboard: On macOS, I often use `yek src/ | pbcopy` to get the output right to my clipboard. Super convenient!
  • Token-Based Chunking: If you’re working with specific LLM token limits, try `yek --max-size 128K --tokens src/` to chunk by token count instead of bytes.
  • Custom Output Location: Need the files somewhere specific? Just use `--output-dir` like this: `yek --max-size 100KB --output-dir /tmp/yek src/`
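If you call Yek from a script rather than typing it by hand, a small helper can assemble the command for you. The Python sketch below only uses the three flags mentioned in these tips (`--max-size`, `--tokens`, `--output-dir`). The helper name `build_yek_command` is hypothetical, and flag behavior may differ between Yek versions, so check `yek --help` on your install.

```python
import shlex


def build_yek_command(paths, max_size=None, tokens=False, output_dir=None):
    """Build an argv list for yek from the flags shown in this post."""
    command = ["yek"]
    if max_size is not None:
        command += ["--max-size", max_size]
    if tokens:
        command.append("--tokens")
    if output_dir is not None:
        command += ["--output-dir", output_dir]
    command += list(paths)  # directories to process go last
    return command


argv = build_yek_command(["src/"], max_size="100KB", output_dir="/tmp/yek")
print(shlex.join(argv))
# yek --max-size 100KB --output-dir /tmp/yek src/
```

Passing the resulting list to `subprocess.run(argv, check=True)` would run it, assuming `yek` is on your `PATH`.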

Limitations (Let’s Be Honest)

  • It’s primarily designed for text-based files – it won’t help with binary file analysis
  • While it’s blazing fast, extremely large repositories might still need to be processed in chunks
  • The token counting is approximate – if you need exact token counts, double-check them with the tokenizer of the model you’re targeting
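To give a feel for why token counts end up approximate, here's a Python sketch of a common rule of thumb: roughly four characters per token for English text. This is not how Yek counts tokens (the post doesn't say), the ratio is an assumption that varies by model and by language, and source code often tokenizes less efficiently than prose.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate: character count divided by an assumed ratio,
    rounded up so the estimate errs on the high side."""
    return math.ceil(len(text) / chars_per_token)
```

An estimate like this is fine for deciding how many chunks you need, but not for squeezing right up against a hard context limit.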

Yek is perfect for you if:

  • You regularly work with LLMs and need to feed them codebase content
  • You’re tired of manually selecting and formatting files for LLM processing
  • You want smart prioritization of your code files

Give it a try – I think you’ll be as impressed as I am with how it streamlines the whole process of preparing code for LLM analysis. And hey, if you run into any issues, the GitHub community is pretty active and helpful!
