gh-find
A find(1)-like utility for GitHub repositories
gh-find searches for files across GitHub repositories from the command line. It offers intuitive glob patterns, many filtering options, and sensible defaults.
Features
- Glob patterns (**/*.go, *.test.js) instead of regex
- Concurrent search across multiple repositories
- Automatic caching reduces API calls
- Filter by repository type (sources, forks, archives, mirrors)
- Filter by file type (-t), extension (-e), or size
- Case-insensitive (-i) and full-path (-p) matching
Installation
As a GitHub CLI extension
gh extension install jparise/gh-find
From source
git clone https://github.com/jparise/gh-find
cd gh-find
go build
gh extension install .
Examples
Basic Usage
# Find all files in a repository
gh find cli/cli
# Find Go files in a specific repository
gh find "*.go" cli/cli
# Search multiple repositories
gh find "*.go" cli/cli golang/go
# Search all repos under an owner (user or organization)
gh find "*.md" torvalds
Pattern Matching
# Case-insensitive search
gh find -i "readme*" cli
# Match against full paths (e.g., find tests)
gh find -p "**/*_test.go" golang/go
# Exclude patterns
gh find "*.js" -E "*.test.js" -E "*.spec.js" facebook/react
Filtering
# Filter by extension
gh find -e go -e md cli
# Filter by type (files only, no directories)
gh find -t f "README*" cli
# Filter by type (executables)
gh find -t x "*.sh" cli/cli
# Filter by size (files over 50KB)
gh find --min-size 50k "*.go" golang/go
# Include forks and archives
gh find --repo-types sources,forks,archives "*.md" cli
Usage
gh-find [<pattern>] <repository>... [flags]
Arguments
pattern - Glob pattern (optional, defaults to * for single repo)
repository - One or more repositories to search:
owner - All repos for a user or organization
owner/repo - Specific repository
gh find cli/cli # Single repo: defaults to "*"
gh find "*.go" cli/cli # Single repo: explicit pattern
gh find "*.go" cli golang/go # Multiple repos: pattern required
Pattern Matching
Patterns match basename (filename) by default. Use -p/--full-path for full path matching.
# Basename (default)
gh find "*.go" cli/cli # Matches any .go file
gh find "main.go" cli/cli # Matches main.go in any directory
# Full path
gh find -p "cmd/**/*.go" cli/cli # Only .go files in cmd/
Glob syntax: * (any chars), ** (with /), ? (single char), [abc] (char set), {a,b} (alternatives)
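The difference between basename and full-path matching can be sketched in a few lines of Go. This is only an illustration, not gh-find's actual matcher: the `matches` helper is hypothetical, and it uses the standard library's `path.Match`, which supports `*`, `?`, and `[abc]` but not `**` or `{a,b}`.

```go
package main

import (
	"fmt"
	"path"
)

// matches reports whether pattern matches p. With fullPath false, only the
// final path element (the basename) is compared, mirroring the default
// described above; with fullPath true, the whole path is compared.
//
// Assumption: path.Match stands in for the real matcher. It has no support
// for ** or {a,b}, and its * never crosses a "/".
func matches(pattern, p string, fullPath bool) bool {
	target := p
	if !fullPath {
		target = path.Base(p)
	}
	ok, err := path.Match(pattern, target)
	return err == nil && ok
}

func main() {
	fmt.Println(matches("*.go", "cmd/gh/main.go", false))      // true: compared against "main.go"
	fmt.Println(matches("*.go", "cmd/gh/main.go", true))       // false: * stops at "/"
	fmt.Println(matches("cmd/*/*.go", "cmd/gh/main.go", true)) // true: pattern spells out the directories
}
```

In this sketch a pattern without a slash can never match a nested file in full-path mode, which is why full-path patterns usually need a directory prefix or `**`.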
Options
Pattern Matching
-i, --ignore-case - Case-insensitive pattern matching
-p, --full-path - Match pattern against full path instead of basename
-t, --type type - Filter by file type (can be specified multiple times for OR matching)
- Valid types: f/file, d/dir/directory, l/symlink, x/executable, s/submodule
- Examples: -t f (files only), -t f -t d (files or directories)
-e, --extension ext - Filter by file extension (can be specified multiple times)
-E, --exclude pattern - Exclude files matching pattern (can be specified multiple times)
--min-size size - Minimum file size (e.g., 1M, 500k, 1GB)
--max-size size - Maximum file size (e.g., 5M, 1GB)
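As a rough illustration of how size arguments such as 500k, 1M, or 1GB could be turned into byte counts, here is a minimal Go sketch. The `parseSize` function is hypothetical, and two details are assumptions rather than documented behavior: suffixes are treated as case-insensitive, and k/M/G are taken as powers of 1024 (gh-find may use powers of 1000 instead).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSize converts strings like "500k", "1M", or "1GB" into bytes.
// Assumption: binary multiples (1k = 1024 bytes); an optional trailing
// "B" is ignored; a bare number is a byte count.
func parseSize(s string) (int64, error) {
	u := strings.ToLower(strings.TrimSpace(s))
	u = strings.TrimSuffix(u, "b")
	mult := int64(1)
	switch {
	case strings.HasSuffix(u, "k"):
		mult, u = 1<<10, strings.TrimSuffix(u, "k")
	case strings.HasSuffix(u, "m"):
		mult, u = 1<<20, strings.TrimSuffix(u, "m")
	case strings.HasSuffix(u, "g"):
		mult, u = 1<<30, strings.TrimSuffix(u, "g")
	}
	n, err := strconv.ParseInt(u, 10, 64)
	if err != nil || n < 0 {
		return 0, fmt.Errorf("invalid size %q", s)
	}
	return n * mult, nil
}

func main() {
	for _, s := range []string{"50k", "1M", "1GB"} {
		n, err := parseSize(s)
		fmt.Println(s, n, err)
	}
}
```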
Repository Filtering
--repo-types type[,type...] - Repository types to include when expanding owners (default: sources)
- Valid types: sources, forks, archives, mirrors, all
- Only filters owner expansion (e.g., cli). Does NOT filter explicitly specified repos (e.g., cli/archived-fork)
- See Repository Filtering for details
-j, --jobs N - Maximum concurrent API requests (default: 10)
- Increase for faster searches: -j 20
- Decrease if hitting rate limits: -j 5
Caching
--no-cache - Bypass cache, always fetch fresh data
--cache-dir path - Override cache directory (default: ~/.cache/gh/)
--cache-ttl duration - Cache time-to-live (default: 24h, e.g., 1h, 30m)
Output
-c, --color mode - Colorize output: auto, always, never (default: auto)
--hyperlink mode - Hyperlink output: auto, always, never (default: auto)
auto only enables hyperlinks when color is also enabled
Repository Filtering
Default: source repositories only (excludes forks, archives, mirrors).
# Include forks and archives
gh find --repo-types sources,forks,archives "*.go" cli
# Everything
gh find --repo-types all "*.js" facebook
Note: --repo-types only filters owner expansion (cli → all repos). Explicitly specified repos (cli/archived-fork) are always included.
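The rule in the note above can be sketched as follows. This is a simplified model, not gh-find's implementation: the `repo` type, the `resolve` function, and the `sampleOwner` lister are hypothetical, and each repository is assumed to carry exactly one type label.

```go
package main

import (
	"fmt"
	"strings"
)

// repo is a hypothetical, simplified repository record.
type repo struct {
	name string // "owner/repo"
	kind string // "sources", "forks", "archives", or "mirrors"
}

// resolve turns command-line repository arguments into the list to search.
// Arguments containing "/" are explicit repositories and bypass the type
// filter; bare owners are expanded and filtered by the selected types.
func resolve(args []string, types map[string]bool, listOwner func(owner string) []repo) []string {
	var out []string
	for _, arg := range args {
		if strings.Contains(arg, "/") {
			out = append(out, arg) // explicit owner/repo: never filtered
			continue
		}
		for _, r := range listOwner(arg) {
			if types["all"] || types[r.kind] {
				out = append(out, r.name)
			}
		}
	}
	return out
}

// sampleOwner is made-up data standing in for an API call that lists
// an owner's repositories.
func sampleOwner(owner string) []repo {
	return []repo{
		{owner + "/app", "sources"},
		{owner + "/app-fork", "forks"},
		{owner + "/old", "archives"},
	}
}

func main() {
	defaults := map[string]bool{"sources": true}
	fmt.Println(resolve([]string{"cli", "cli/archived-fork"}, defaults, sampleOwner))
	// [cli/app cli/archived-fork]
}
```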
Caching
API responses are cached automatically to improve performance and reduce rate limit usage.
- What: All GET requests (repository lists, file trees)
- Where: ~/.cache/gh/ (configurable with --cache-dir)
- TTL: 24 hours (configurable with --cache-ttl)
# Bypass cache for fresh results
gh find --no-cache "*.go" cli/cli
# Custom cache location or TTL
gh find --cache-dir /tmp/cache "*.go" cli
gh find --cache-ttl 1h "*.go" cli
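The TTL values shown here (24h, 1h, 30m) happen to be valid Go duration strings, so the freshness rule can be illustrated with `time.ParseDuration`. The `usable` function below is a hypothetical sketch of the decision, assuming a cached entry is served only while its age is strictly less than the TTL and never when --no-cache is set; the real cache may decide differently.

```go
package main

import (
	"fmt"
	"time"
)

// usable reports whether a cached response of the given age may be served.
// Assumption: an entry expires once its age reaches the TTL, and noCache
// (modeling --no-cache) always forces a fresh request.
func usable(age, ttl string, noCache bool) (bool, error) {
	if noCache {
		return false, nil
	}
	a, err := time.ParseDuration(age)
	if err != nil {
		return false, err
	}
	t, err := time.ParseDuration(ttl)
	if err != nil {
		return false, err
	}
	return a < t, nil
}

func main() {
	ok, err := usable("23h", "24h", false)
	fmt.Println(ok, err) // true <nil>
	ok, err = usable("25h", "24h", false)
	fmt.Println(ok, err) // false <nil>
	ok, err = usable("10m", "1h", true)
	fmt.Println(ok, err) // false <nil>
}
```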
Control concurrency with -j/--jobs (default: 10):
gh find -j 20 "*.go" myorg # Faster (consumes rate limit more quickly)
gh find -j 5 "*.go" myorg # Slower (conserves rate limit)
GitHub API rate limits:
- Authenticated: 5,000 requests/hour
- Unauthenticated: 60 requests/hour
Each search uses 1 request per repository (plus 1 for listing repos). Cached requests don't count against limits.
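To make the arithmetic concrete, the sketch below estimates the request cost of an uncached search and how many such searches fit in an hourly limit. The function names are hypothetical, and the model follows the statement above literally (one request per repository, plus one for listing); it does not account for any extra requests the tool might make.

```go
package main

import "fmt"

const (
	authenticatedLimit   = 5000 // requests/hour
	unauthenticatedLimit = 60   // requests/hour
)

// requestsFor estimates the uncached API requests for one search:
// one per repository, plus one to list repositories when an owner
// is expanded.
func requestsFor(repos int, expandsOwner bool) int {
	n := repos
	if expandsOwner {
		n++
	}
	return n
}

// searchesPerHour returns how many uncached searches of the given cost
// fit within an hourly limit.
func searchesPerHour(limit, perSearch int) int {
	if perSearch <= 0 {
		return 0
	}
	return limit / perSearch
}

func main() {
	cost := requestsFor(120, true) // an owner with 120 repositories
	fmt.Println(cost)                                        // 121
	fmt.Println(searchesPerHour(authenticatedLimit, cost))   // 41
	fmt.Println(searchesPerHour(unauthenticatedLimit, cost)) // 0
}
```

Under these assumptions, a single uncached search of a 120-repository owner already exceeds the unauthenticated limit, which is one reason to run gh-find authenticated and leave caching enabled.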
Known Limitations
GitHub's Git Trees API truncates responses for repositories with >100,000 files or >7MB of tree data. In that case, results are partial and gh-find prints a warning:
Warning: username/repo has >100k files, results incomplete
Troubleshooting
"No repositories match the filter" - Check --repo-types. Default excludes forks, archives, mirrors. Try --repo-types all.
"Pattern matching not working" - Patterns match basename by default. Use -p for full path matching:
gh find -p "cmd/*.go" cli/cli # Full path
gh find "*.go" cli/cli # Basename
"Rate limit exceeded" - Wait for reset, use cache (enabled by default), or reduce concurrency with -j 5.
"Failed to get owner type" - The user or organization doesn't exist, or you don't have access to it. Verify with gh api users/username.
License
This software is released under the terms of the MIT License.