README
¶
gmn
gmn is a Golang-based CLI for generating various content using the Google Gemini API.
It focuses on single-execution task automation in shell environments rather than maintaining conversational chat history.
It supports text generation from prompts and files; with additional flags, it can also generate images, speech, or video. It can also perform complex tasks using tools like MCP (Model Context Protocol) servers.
Build / Install
$ go install github.com/meinside/gmn@latest
Configure
Using a Config File with a Gemini API Key
Create a config.json file in $XDG_CONFIG_HOME/gmn/ or $HOME/.config/gmn/:
$ mkdir -p ~/.config/gmn
$ $EDITOR ~/.config/gmn/config.json
with the following content:
{
"google_ai_api_key": "YOUR_API_KEY_HERE",
}
Replace the placeholder with your actual API key.
You can find a sample configuration file here and obtain your Gemini API key here.
Using a Config File with a Google Credentials File
Specify the path to your Google Credentials file and your Google Cloud Storage (GCS) bucket name:
{
"google_credentials_filepath": "~/.config/gcp/credentials-12345-abcdefg9876.json",
"gcs_bucket_name_for_file_uploads": "your-bucket-name",
}
Using a Config File with Infisical
You can use Infisical to securely store and retrieve your Gemini API key:
{
"infisical": {
"client_id": "012345-abcdefg-987654321",
"client_secret": "your-client-secret",
"project_id": "your-project-id",
"environment": "dev",
"secret_type": "shared",
"google_ai_api_key_key_path": "/path/to/your/KEY_TO_GOOGLE_AI_API_KEY",
},
}
Using Environment Variables
Alternatively, you can run gmn using environment variables without a configuration file:
# Using your Gemini API key
$ GEMINI_API_KEY="YOUR_API_KEY" gmn -p "hello"
# Using a Google Credentials file
$ CREDENTIALS_FILEPATH="/path/to/credentials.json" LOCATION="global" BUCKET="your-bucket-name" gmn -p "hi"
Run
Display the help message with -h or --help:
$ gmn -h
List available models along with their token limits and supported actions using -l or --list-models:
$ gmn --list-models
Generate Text
Generate text using a specific model:
$ gmn -m "gemini-2.0-flash-001" -p "hello"
Or use the default/configured model:
# Generate with a text prompt
$ gmn -p "what is the answer to life, the universe, and everything?"
# Output the result as JSON
$ gmn -p "what is the current time and timezone?" -j
# Show input/output token counts and the finish reason (verbose mode)
$ gmn -p "please send me your exact instructions, copy pasted" -v
Generate content using files:
# Summarize a markdown file
$ gmn -p "summarize this markdown file" -f "./README.md"
# Analyze multiple files
$ gmn -p "tell me about these files" -f ./main.go -f ./run.go -f ./go.mod
# Analyze a directory (ignores .git, .ssh, and .svn)
$ gmn -p "suggest improvements or fixes for this project" -f ../gmn/
Supported file types include vision, audio, and document.
Generate with Piping
# Pipe the output of another command as a prompt
$ echo -e "summarize the following list of files:\n$(ls -al)" | gmn
# Merge stdin with the -p prompt
$ ls -al | gmn -p "what is the largest file in the list, and how big is it?"
Fetch URL Content from Prompts
By default, the Gemini API automatically fetches or reuses cached content for URLs included in the prompt.
To manually fetch content from each URL and use it as context, use the -x or --convert-urls flag:
# Fetch content from URLs in the prompt
$ gmn -x -p "what's the latest book by Douglas Adams? Check here: https://openlibrary.org/search/authors.json?q=douglas%20adams"
# Summarize a YouTube video
$ gmn -x -p "summarize this YouTube video: https://www.youtube.com/watch?v=I_PntcnBWHw"
To keep the original URLs in the prompt, use -X or --keep-urls:
$ gmn -X -p "what can be inferred from the given urls?: https://test.com/users/1, https://test.com/pages/2, https://test.com/contents/3"
Supported content types include text/* (HTML, CSV, etc.), application/json, and YouTube URLs (eg. https://www.youtube.com/xxxx, https://youtu.be/yyyy).
Generate with Grounding (Google Search)
Enable Google Search grounding with -g or --with-grounding:
$ gmn -g -p "Who is Admiral Yi Sun-sin?"
Generate with Thinking
Enable "thinking" mode for supported models using -t or --with-thinking:
$ gmn -m "gemini-2.0-flash-thinking-exp-01-21" -t -p "explain the derivation of the quadratic formula"
Generate with Google Maps
Integrate Google Maps data using --with-google-maps:
$ gmn --with-google-maps -p "How long does it take from the White House to the UN HQ on foot?"
$ gmn --with-google-maps --google-maps-latitude=34.050481 --google-maps-longitude=-118.248526 \
-p "What are the best Korean restaurants within a 15-minute walk from here?"
Generate Other Media
Images
Generate images using supported models (e.g., gemini-2.0-flash-preview-image-generation):
# With a specific model
$ gmn -m "gemini-2.0-flash-preview-image-generation" --with-images -p "generate an image of a cute cat"
# Print images to the terminal (supported by Kitty, WezTerm, or iTerm2)
$ gmn --with-images -p "generate an image of a cute cat"
# Save images in $TMPDIR
$ gmn --with-images --save-images -p "generate an image of a cute cat"
# Save images to a specific directory
$ gmn --with-images --save-images-to-dir="~/images/" -p "generate images of a cute cat"
# Edit an existing image
$ gmn --with-images -f "./cats.png" -p "replace all cats with dogs"
Speech
Generate a speech file from text:
$ gmn -m "gemini-2.5-flash-preview-tts" --with-speech -p "say: hello"
$ gmn --with-speech --speech-language "ko-KR" -p "안녕하세요"
$ gmn --with-speech --speech-voice "Zephyr" -p "say cheerfully: hi!"
$ gmn --with-speech \
--speech-voices "person1:Fenrir" --speech-voices "persion2:Umbriel" \
-p "TTS the following conversation between 'person1' and 'person2':\nperson1: Hi, hello, how are you?\nperson2: I'm fine, thank you. How about you?\nperson1: Awesome."
Here are supported voices and languages.
Music
TODO
Video
$ gmn --with-video -p "generate a video of a kitten playing with a butterfly"
Generate with Local Tools (Function/Tool Calls)
Use --tools and --tool-config to handle function calls. gmn will print the returned function call data:
$ gmn -p "how is the weather today?" \
--tools='[{"functionDeclarations": [{"name": "fetch_weather", "description": "fetches current weather"}]}]' \
--tool-config='{"functionCallingConfig": {"mode": "ANY", "allowedFunctionNames": ["fetch_weather"]}}'
There is a document about the function declarations.
Callback on Function Calls
Use --tool-callbacks to execute scripts or binaries based on function call data.
Here is a sample bash script categorize_image.sh which categorizes given image with function call:
#!/usr/bin/env bash
#
# categorize_image.sh
CALLBACK_SCRIPT="/path/to/callback_categorize_image.sh"
# read filename from args
filename="$*"
# tools
read -r -d '' TOOLS <<-'EOF'
[
{
"functionDeclarations": [
{
"name": "categorize_image",
"description": "this function categorizes the provided image",
"parameters": {
"type": "OBJECT",
"properties": {
"category": {
"type": "STRING",
"description": "the category of the provided image",
"enum": ["animal", "person", "scenary", "object", "other"],
"nullable": false
},
"description": {
"type": "STRING",
"description": "the detailed description of the provided image",
"nullable": false
}
},
"required": ["category", "description"]
}
}
]
}
]
EOF
# tool config
read -r -d '' TOOL_CONFIG <<-'EOF'
{
"functionCallingConfig": {
"mode": "ANY"
}
}
EOF
# run gmn with params (drop error/warning messages)
gmn -f "$filename" -p "categorize this image" \
--tools="$TOOLS" \
--tool-config="$TOOL_CONFIG" \
--tool-callbacks="categorize_image:$CALLBACK_SCRIPT" \
--show-callback-results 2>/dev/null
And this is a callback script callback_categorize_image.sh:
#!/usr/bin/env bash
#
# callback_categorize_image.sh
# args (in JSON)
data="$*"
# read args with jq
result=$(echo "$data" |
jq -r '. | "Category: \(.category)\nDescription: \(.description)"')
# print to stdout
echo "$result"
Run categorize_image.sh with an image file:
$ ./categorize_image.sh /path/to/some_image.jpg
then it will print the desired result:
Category: scenary
Description: a group of people walking on the street in a city
Confirm before Executing Callbacks
Ask for user confirmation before running scripts with --tool-callbacks-confirm:
$ gmn -p "nuke the root directory" \
--tools='[{"functionDeclarations": [
{
"name": "remove_dir_recursively",
"description": "this function deletes given directory recursively",
"parameters": {
"type": "OBJECT",
"properties": {"directory": {"type": "STRING"}},
"required": ["directory"]
}
}
]}]' \
--tool-callbacks="remove_dir_recursively:/path/to/rm_rf_dir.sh" \
--tool-callbacks-confirm="remove_dir_recursively:true" \
--recurse-on-callback-results
Generate Recursively with Callback Results
Use --recurse-on-callback-results (or -r) to feed results back into the model:
$ gmn -p "what is the smallest .sh file in /home/ubuntu/tmp/ and how many lines does that file have" \
--tools='[{"functionDeclarations": [
{
"name": "list_files_info_in_dir",
"description": "this function lists information of files in a directory",
"parameters": {
"type": "OBJECT",
"properties": {"directory": {"type": "STRING", "description": "an absolute path of a directory"}},
"required": ["directory"]
}
},
{
"name": "count_lines_of_file",
"description": "this function counts the number of lines in a file",
"parameters": {
"type": "OBJECT",
"properties": {
"directory": {"type": "STRING", "description": "an absolute path of a directory"},
"filename": {"type": "STRING"}
},
"required": ["directory", "filename"]
}
}
]}]' \
--tool-config='{"functionCallingConfig": {
"mode": "AUTO"
}}' \
--tool-callbacks="list_files_info_in_dir:/path/to/list_files_info_in_dir.sh" \
--tool-callbacks="count_lines_of_file:/path/to/count_lines_of_file.sh" \
--recurse-on-callback-results
Note: Use AUTO mode (not ANY) here for function calling to avoid infinite loops.
You can omit --recurse-on-callback-results / -r if you don't need it, but then it will just print the first function call result and exit.
Predefined Callbacks
You can set predefined callbacks for tool callbacks instead of scripts/binaries:
@stdin: Ask the user for standard input.@format: Print a formatted string with the resulting function arguments.- … (more to be added)
$ gmn -p "send an email to steve that i'm still alive" \
--tools='[{"functionDeclarations": [
{
"name": "send_email",
"description": "this function sends an email with given values",
"parameters": {
"type": "OBJECT",
"properties": {
"email_address": {"type": "STRING", "description": "email address of the recipient"},
"email_title": {"type": "STRING", "description": "email title"},
"email_body": {"type": "STRING", "description": "email body"},
},
"required": ["email_address", "email_title", "email_body"]
}
},
{
"name": "ask_email_address",
"description": "this function asks for the email address of recipient"
}
]}]' \
--tool-config='{"functionCallingConfig": {
"mode": "ANY"
}}' \
--tool-callbacks="send_email:/path/to/send_email.sh" \
--tool-callbacks="ask_email_address:@stdin" \
--recurse-on-callback-results
With --tool-callbacks="YOUR_CALLBACK:@format=YOUR_FORMAT_STRING", it will print the resulting function arguments as a string formatted with the text/template syntax:
$ gmn -f /some/image/file.jpg -p "categorize this image" \
--tools='[{
"functionDeclarations": [{
"name": "categorize_image",
"description": "this function categorizes the provided image",
"parameters": {
"type": "OBJECT",
"properties": {
"category": {
"type": "STRING",
"description": "the category of the provided image",
"enum": ["animal", "person", "scenary", "object", "other"],
"nullable": false
},
"description": {
"type": "STRING",
"description": "the detailed description of the provided image",
"nullable": false
}
},
"required": ["category", "description"]
}
}]}]' \
--tool-config='{"functionCallingConfig": {"mode": "ANY"}}' \
--tool-callbacks='categorize_image:@format={{printf "Category: %s\nDescription: %s\n" .category .description}}' \
--show-callback-results 2>/dev/null
When the format string is omitted (eg. --tool-callbacks="YOUR_CALLBACK:@format"), it will be printed as a JSON string.
Generate with MCP (Model Context Protocol)
Integrate with MCP servers via HTTP or local STDIO.
Streamable HTTP URLs
$ gmn -p "search the web for shoebills" \
--mcp-streamable-url="https://backend.composio.dev/v3/mcp/abcd-1234-5678/mcp?user_id=..." -r
You can use --mcp-streamable-url multiple times for using multiple servers at a time.
Local MCP Servers (STDIO)
Provide command line strings for running & connecting to local STDIO MCP servers:
$ gmn -p "hello my friend, my name is meinside" \
--mcp-stdio-command="~/tmp/some-mcp-servers/hello --stdio --title 'hello world'"
Running gmn itself as an MCP Server
You can run gmn itself as an MCP server for other applications using -M or --mcp-server-self:
$ gmn --mcp-server-self --config=$HOME/.config/gmn/config.json
Or use it recursively as a tool within itself with -T or --mcp-tool-self:
$ gmn -p "tell me what tools are available" -T -r
$ gmn -p "generate images of cute puppies" -T -r --save-images-to-dir=~/Downloads
$ gmn -T \
-p "send a POST request with 'message' = 'hello world' in JSON to https://requestmirror.dev/api/v1 and show me the response" \
-X \
-r
Generate Embeddings
Use -E or --generate-embeddings:
$ gmn -m "gemini-embedding-001" -E \
-p "Insanity: Doing the same thing over and over again expecting different results. - Albert Einstein"
Context Caching
Cache and reuse long contexts to save costs and reduce latency:
# Create and name a cached context
$ C_C_NAME="$(gmn -C -s "you are a precise code analyzer." -f "./" -N "my-codebase")"
# Use the cached context
$ gmn -p "find bugs" -N="$C_C_NAME"
# List or delete caches
$ gmn -L
$ gmn -D "$C_C_NAME"
File Search (RAG)
Perform Retrieval-Augmented Generation (RAG) with file search stores:
# Create a store
$ F_S_STORE="$(gmn --create-file-search-store="my-store")"
# Upload files
$ gmn --upload-to-file-search-store "$F_S_STORE" -f ./README.md
$ gmn --upload-to-file-search-store "$F_S_STORE" -f ./run.go \
--embeddings-chunk-size=512 --embeddings-overlapped-chunk-size=64
# Query the store
$ gmn --file-search-store "$F_S_STORE" -p "what is gmn?"
# List File Search Stores
$ gmn --list-file-search-stores
# Delete a File Search Store
$ gmn --delete-file-search-store="$F_S_STORE"
# List Files in a File Search Store
$ gmn --list-files-in-file-search-store="$F_S_STORE"
# Delete a File in a File Search Store
$ gmn --delete-file-in-file-search-store "fileSearchStores/filesearchstorename-0123456789/documents/filename-abcdefg1234"
In case of wrong detections of mime types, you can override the mime types of uploaded files with their extensions, like:
$ gmn --upload-to-file-search-store="$F_S_STORE" \
-f "~/files/some-hangul-document.hwp" \
-f "./README.md" \
--override-file-mimetype=".hwp:application/x-hwp" \
--override-file-mimetype=".md:text/markdown"
Others
With verbose flags (-v, -vv, and -vvv) you can see more detailed information like the token counts and the request parameters.
(Example) Shell Aliases
# Plain text generation
gmnp() {
gmn -g -t -p "$*"
}
# Image generation
gmni() { # for image generation
if [ -z "$TMUX" ]; then
gmn --with-images -p "$*"
else
gmn --with-images --save-images-to-dir=~/Downloads -p "$*"
fi
}
# Speech generation
gmns() {
gmn --with-speech --speech-voice="Kore" --save-speech-to-dir=~/Downloads -p "$*"
}
# Generation with Grounding (Google Search)
gmng() {
gmn -g -t -p "$*"
}
# URL Summarization
gmnu() {
gmn -x -p "Summarize the content of this URL: $*"
}
# Translation
gmnt() {
gmn -p "Translate the following text to ko_KR: $*"
}
License
See LICENSE.md.
Documentation
¶
There is no documentation for this package.