AI Context

Generate AI-friendly markdown context files from GitHub repositories, YouTube videos, webpages, or local source code.


A command-line tool that produces context files from various sources to make interactions with LLM apps (such as ChatGPT and Claude) easier. It can process multiple sources and writes the context as markdown optimized for use by AI models.

Quickstart

ai-context -u "https://github.com/tanq16/ai-context" # single URL
ai-context -f urllist.file                           # URL file

Features

  • Local Directory Processing
    • this is mainly for locally available code bases (directories or already cloned git repos)
    • the context file includes the directory structure and all file contents
  • GitHub Repository Processing
    • this clones the provided GitHub link and processes it the same way as Local Directory Processing
    • the clone is temporary, so no cleanup is needed
  • YouTube Transcript Processing
    • this downloads the transcript for a given YouTube video link and preserves time segments
  • WebPage Processing
    • this converts the HTML of a webpage to markdown
    • it also downloads all images from the page and stores them locally with UUID names
    • the images in the markdown refer to the local paths

Installation

  • Binary
    • Download the latest release for your platform and OS from the releases page
    • Binaries are built via GitHub Actions for macOS, Linux, and Windows for both AMD64 (x86_64) and ARM64 (incl. Apple Silicon) architectures
    • Use the releases page to download a specific version if needed
  • Go Install
    • Run the following command (requires Go v1.22+):
    go install github.com/tanq16/ai-context@latest
    • For specific versions, prefer the binaries or the local build process, as I haven't implemented Go binary versioning for the project
  • Local Build
    • To build locally, do the following:
    # Clone
    git clone https://github.com/tanq16/ai-context.git && \
    cd ai-context
    # Build
    go build .

Usage

# Process a single path (local directory) with additional ignore patterns
ai-context -u /path/to/directory -i "tests,docs,*doc.*"

# Process one URL (GitHub repo, YouTube video, or webpage URL)
# Quote the URL so the shell does not interpret the "?" character
ai-context -u "https://www.youtube.com/watch?v=video_id"

# Make a list of paths
cat << EOF > listfile
../notif
/working/cybernest
https://github.com/assetnote/h2csmuggler
https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html
EOF

# Process everything concurrently
ai-context -f listfile

Warning

For a directory path (in URL or listfile mode), the path must start with / (absolute) or with ./ or ../ (relative). For the current directory, always use ./ so that the regex matches correctly.
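To illustrate why the prefix matters, here is a minimal sketch of prefix-based input classification. The function name `classifyInput` and every pattern below are assumptions made for illustration; the tool's actual expressions are not shown in this README and may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns for illustration only; the real tool may use others.
var (
	dirPattern     = regexp.MustCompile(`^(/|\./|\.\./)`)
	githubPattern  = regexp.MustCompile(`^https?://(www\.)?github\.com/[^/]+/[^/]+`)
	youtubePattern = regexp.MustCompile(`^https?://(www\.)?(youtube\.com/watch\?v=|youtu\.be/)`)
	webPattern     = regexp.MustCompile(`^https?://`)
)

// classifyInput returns a coarse input type based on the leading characters.
// More specific patterns are checked before the generic webpage pattern.
func classifyInput(s string) string {
	switch {
	case dirPattern.MatchString(s):
		return "dir"
	case githubPattern.MatchString(s):
		return "github"
	case youtubePattern.MatchString(s):
		return "youtube"
	case webPattern.MatchString(s):
		return "webpage"
	default:
		return "unknown"
	}
}

func main() {
	inputs := []string{
		"./",
		"../notif",
		"/working/cybernest",
		"notif", // no ./ prefix, so it is not recognized as a directory
		"https://github.com/assetnote/h2csmuggler",
		"https://www.youtube.com/watch?v=video_id",
		"https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html",
	}
	for _, in := range inputs {
		fmt.Printf("%s => %s\n", in, classifyInput(in))
	}
}
```

Under these assumed patterns, a bare name such as `notif` or `.` matches nothing, which is the kind of failure the warning above is meant to prevent.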

Output

  • The tool creates a local folder called context and writes all gathered context into .md files in that folder.
  • The filenames have the following syntax: TYPE-PATHNAME.md (for example, gh-ffuf_ffuf.md).
  • In listfile mode, each path produces its own context file.
  • Images are downloaded to the context/images directory and named with UUIDs.
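The naming scheme can be sketched for the GitHub case, the only one for which this README gives an example. The helper `githubContextName` is hypothetical, and the rule of joining owner and repository with an underscore is inferred from the single example `gh-ffuf_ffuf.md`; the tool's real sanitization may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// githubContextName derives an output path of the form context/gh-OWNER_REPO.md
// from a GitHub repository URL. Hypothetical helper, not the tool's own code.
func githubContextName(repoURL string) (string, error) {
	u, err := url.Parse(repoURL)
	if err != nil {
		return "", err
	}
	// Drop surrounding slashes and an optional ".git" suffix.
	p := strings.TrimSuffix(strings.Trim(u.Path, "/"), ".git")
	parts := strings.Split(p, "/")
	if len(parts) < 2 || parts[0] == "" || parts[1] == "" {
		return "", fmt.Errorf("expected owner/repo in %q", repoURL)
	}
	name := "gh-" + parts[0] + "_" + parts[1] + ".md"
	return path.Join("context", name), nil
}

func main() {
	name, err := githubContextName("https://github.com/ffuf/ffuf")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```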

Command Line Options

  • -u, --url: provide a path (GitHub repo, YouTube video, WebPage link, or relative/absolute directory path) to process
  • -f, --file: provide a file with a list of paths (URLs or directory paths) to process
  • -i, --ignore: add additional patterns to ignore during processing (comma-separated)
  • -t, --threads: (optional) number of workers for file processing; default value is 5
  • --debug: verbose logging (helpful if something isn't working as expected)
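The effect of a worker limit such as -t can be pictured with a small fixed-size worker pool. This is a sketch under assumptions: `processAll` and the `handle` callback are hypothetical names, and the tool's actual concurrency code may be organized differently.

```go
package main

import (
	"fmt"
	"sync"
)

// processAll applies handle to every item using at most `workers` goroutines.
// Results keep the order of the input slice.
func processAll(items []string, workers int, handle func(string) string) []string {
	if workers < 1 {
		workers = 1
	}
	results := make([]string, len(items))
	jobs := make(chan int)
	var wg sync.WaitGroup

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker pulls indices until the channel is closed.
			// Every index is written by exactly one goroutine, so no lock is needed.
			for i := range jobs {
				results[i] = handle(items[i])
			}
		}()
	}

	for i := range items {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	items := []string{"../notif", "/working/cybernest", "https://github.com/assetnote/h2csmuggler"}
	// 5 mirrors the documented default of the -t flag.
	out := processAll(items, 5, func(s string) string { return "processed " + s })
	for _, line := range out {
		fmt.Println(line)
	}
}
```

With a fixed pool, slow items (for example, pages with many images) occupy a worker until they finish, which is one reason a large run can appear stalled.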

Tip

  • Run head -n 200 context/FILE.md (or use 500 lines) to view the content tree of the processed code base or directory and see what has been included. Then refine your -i flag arguments to ignore additional patterns.
  • When processing a large number of items, a run can look stalled due to thread limits and image download times; use --debug to enable verbose logs and confirm that it is still running.

Default Ignores

The tool automatically ignores common files and directories that typically don't add value to the context, including:

  • Version control files (.git, .gitignore)
  • Dependencies (node_modules, vendor)
  • Compiled files (*.exe, *.dll)
  • Media files (images, videos, audio)
  • Documentation files
  • Lock files (package-lock.json, yarn.lock)
  • Build artifacts and caches

For a full list, see aicontext/ignores.go.
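As a rough illustration of how comma-separated patterns like those passed to -i could be applied, the sketch below matches each path segment against glob patterns with the standard library's `path.Match`. The helpers `parseIgnores` and `shouldIgnore` are hypothetical, and the matching semantics in aicontext/ignores.go may differ.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// parseIgnores splits a comma-separated flag value into trimmed patterns.
func parseIgnores(flagValue string) []string {
	var patterns []string
	for _, p := range strings.Split(flagValue, ",") {
		if p = strings.TrimSpace(p); p != "" {
			patterns = append(patterns, p)
		}
	}
	return patterns
}

// shouldIgnore reports whether any segment of a slash-separated relative path
// matches any pattern. Malformed patterns are skipped.
func shouldIgnore(relPath string, patterns []string) bool {
	for _, segment := range strings.Split(relPath, "/") {
		for _, pat := range patterns {
			if ok, err := path.Match(pat, segment); err == nil && ok {
				return true
			}
		}
	}
	return false
}

func main() {
	patterns := parseIgnores("tests,docs,*doc.*")
	for _, p := range []string{"tests/main_test.go", "src/apidoc.md", "src/main.go"} {
		fmt.Printf("%s ignored=%v\n", p, shouldIgnore(p, patterns))
	}
}
```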

Acknowledgments

This project takes inspiration from and references:

  • repomix: inspiration for turning code into context
  • innertube: inspiration for code to get transcript from YouTube video
  • html-to-markdown: used to convert HTML to MD