# jsluice command-line tool
The `jsluice` command-line tool extracts URLs, paths, secrets, and other interesting bits
from JavaScript files.
Values are extracted based not just on how they *look*, but also based on how they are *used*.
That means `jsluice` can find the path in this code:
```javascript
fetch('/api/users?id=' + userId + '&format=json', {
    method: "GET",
    headers: {
        "X-Env": "stage"
    }
})
```
It can also find the method and headers:
```
▶ jsluice urls demo.js | jq
{
  "url": "/api/users?id=EXPR&format=json",
  "queryParams": ["id", "format"],
  "method": "GET",
  "headers": {
    "X-Env": "stage"
  },
  "type": "fetch"
}
```
Because `jsluice` is doing [static analysis](https://en.wikipedia.org/wiki/Static_program_analysis) it
can't know the value of that `userId` variable, but it *does* understand string concatenation. The values
of expressions like this are replaced with `EXPR` by default, but that can be changed with the
`-P`/`--placeholder` flag.
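
To make the placeholder behaviour concrete, here is a minimal JavaScript sketch of the idea. It is not `jsluice`'s actual implementation: the `collapseConcat` and `queryParamNames` helpers and the `{lit}`/`{expr}` part shape are hypothetical names introduced only for illustration.

```javascript
// Hypothetical sketch: collapse a string concatenation into a single string,
// replacing every non-literal part with a placeholder.
function collapseConcat(parts, placeholder = "EXPR") {
  return parts
    .map((part) => ("lit" in part ? part.lit : placeholder))
    .join("");
}

// List the query parameter names of a (possibly relative) URL.
// The base URL is a dummy, used only so that relative paths can be parsed.
function queryParamNames(url) {
  return [...new URL(url, "http://placeholder.invalid").searchParams.keys()];
}

// Models: '/api/users?id=' + userId + '&format=json'
const parts = [
  { lit: "/api/users?id=" },
  { expr: "userId" },
  { lit: "&format=json" },
];

const collapsed = collapseConcat(parts);
console.log(collapsed);                  // /api/users?id=EXPR&format=json
console.log(queryParamNames(collapsed)); // [ 'id', 'format' ]
console.log(collapseConcat(parts, "X")); // /api/users?id=X&format=json
```

The last call mirrors what a custom `-P` value does: only the placeholder text changes, while the literal parts of the URL are kept as they are.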
## Contents
* [Installation](#install)
* [Usage](#usage)
    * [Extracting URLs](#extracting-urls)
        * [Resolving Relative Paths](#resolving-relative-paths)
        * [Including Original Source](#including-original-source)
    * [Extracting Secrets](#extracting-secrets)
        * [Custom Secret Matchers](#custom-secret-matchers)
    * [Printing Syntax Trees](#printing-syntax-trees)
    * [Running Queries](#running-queries)
    * [Formatting JavaScript Source](#formatting-javascript-source)
    * [Using remote files over HTTP](#requesting-files-from-remote-hosts)
    * [Using WARC files](#using-warc-files)
    * [Getting help](#help)
## Install
To install `jsluice` you need [Go](https://go.dev/doc/install).
Once Go is installed and configured, run:
```
▶ go install github.com/antqt/jsluice-mod/cmd/jsluice@latest
```
If everything worked correctly, you should be able to run `jsluice --help` and
see the [help output](#help).
## Usage
Provide `jsluice` with a mode, any options, and a list of JavaScript files, which can be local paths or remote `http://`/`https://` URLs:
```
jsluice <mode> [options] [file...]
```
You can also provide files one-per-line on `stdin`:
```
find . -name '*.js' | jsluice <mode> [options]
```
`jsluice` has five modes:
* `urls` - for extracting URLs and paths
* `secrets` - for finding secrets and so on
* `tree` - for printing syntax trees
* `query` - for running tree-sitter queries
* `format` - for formatting JavaScript source
Output is in [JSONL](https://jsonlines.org/) format. Piping `jsluice` to a tool
like [jq](https://jqlang.github.io/jq/) allows for human-readable formatting,
filtering and further processing.
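
If `jq` isn't enough, JSONL is also easy to consume from a script, because every line is a complete JSON document. The following is a minimal sketch, assuming the output has already been read into a string; `parseJsonl` is a hypothetical helper, and the sample records only reuse field names from the `urls` output.

```javascript
// Parse JSONL text (one JSON document per line) into an array of objects.
// Blank lines are skipped.
function parseJsonl(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Sample input, standing in for the output of `jsluice urls`.
const sample = [
  '{"url":"/api/users?id=EXPR&format=json","queryParams":["id","format"],"method":"GET","type":"fetch"}',
  '{"url":"/api/v1/posts","queryParams":[],"bodyParams":["postId"],"method":"PUT","type":"$.ajax"}',
].join("\n");

const records = parseJsonl(sample);

// Keep only requests that are not plain GETs, and print method and URL.
const nonGet = records.filter((r) => r.method && r.method !== "GET");
for (const r of nonGet) {
  console.log(`${r.method} ${r.url}`); // PUT /api/v1/posts
}
```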
### Extracting URLs
In `urls` mode, `jsluice` extracts URLs and paths from several different places:
* Assignments to document.location, val.href, val.src etc
* Calls to location.replace, window.open, and fetch
* Uses of XMLHttpRequest
* Calls to jQuery's $.get, $.post, and $.ajax
* Any string literal that contains something that looks like a URL
If you want to ignore string-literal matches you can use the `-I`/`--ignore-strings` flag.
When possible, HTTP methods, headers etc are also extracted.
Here's a call to [jQuery](https://jquery.com/)'s `$.ajax` as an example:
```javascript
$.ajax({
    method: "PUT",
    url: "/api/v1/posts",
    data: { postId: 324 },
    headers: {
        "Content-Type": "application/json",
        "x-backend": "prod"
    }},
    function(data, status){
        location.href = data.redirect;
    }
)
```
And the output from `jsluice`:
```
▶ jsluice urls jquery.js | jq
{
  "url": "/api/v1/posts",
  "queryParams": [],
  "bodyParams": [
    "postId"
  ],
  "method": "PUT",
  "headers": {
    "Content-Type": "application/json",
    "x-backend": "prod"
  },
  "type": "$.ajax",
  "filename": "jquery.js"
}
```
#### Resolving Relative Paths
Relative paths can be resolved using a base URL provided with the `-R`/`--resolve-paths` flag.
```
▶ cat location.js
document.location = '../../guestbook.html'
▶ jsluice urls location.js -I -R https://example.com/~tom/photos/2003/ | jq
{
  "url": "https://example.com/~tom/guestbook.html",
  "queryParams": [],
  "bodyParams": [],
  "method": "GET",
  "type": "locationAssignment",
  "filename": "location.js"
}
```
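
The resolution shown above follows the usual rules for relative URLs. As a rough sketch, the same result can be reproduced with JavaScript's standard `URL` class. `resolvePath` is a hypothetical helper, and it is an assumption that `jsluice` agrees with `URL` in every edge case.

```javascript
// Resolve a relative path against a base URL using the standard URL class.
function resolvePath(relative, base) {
  return new URL(relative, base).href;
}

const base = "https://example.com/~tom/photos/2003/";
console.log(resolvePath("../../guestbook.html", base));
// https://example.com/~tom/guestbook.html

// The trailing slash on the base matters: without it, "2003" is treated
// as a file name and dropped before the "../" segments are applied.
console.log(resolvePath("../../guestbook.html", "https://example.com/~tom/photos/2003"));
// https://example.com/guestbook.html
```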
#### Including Original Source
Sometimes it's useful to be able to see the complete source code that a URL was extracted from.
Using the `-S`/`--include-source` flag adds a `source` field to the results containing that source code:
```
▶ jsluice urls location.js -I -S | jq
{
  "url": "../../guestbook.html",
  "queryParams": [],
  "bodyParams": [],
  "method": "GET",
  "type": "locationAssignment",
  "source": "document.location = '../../guestbook.html'",
  "filename": "location.js"
}
```
### Extracting Secrets
The `secrets` mode is for extracting API keys, passwords, and other interesting bits of data.
There are built-in extractors for:
* AWS keys
* GCP keys
* GitHub keys
* Firebase configurations
That's not very many, so you can supply your own in a file specified with the `-p`/`--patterns` flag.
Here's an example of some JavaScript that contains an AWS key:
```javascript
var config = {
    bucket: "examplebucket",
    awsKey: "AKIAIOSFODNN7EXAMPLE",
    awsSecret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    server: "someserver.example.com"
};
```
And the output of `jsluice secrets` when run against that file:
```
▶ jsluice secrets awskey.js | jq
{
  "kind": "AWSAccessKey",
  "data": {
    "key": "AKIAIOSFODNN7EXAMPLE",
    "secret": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
  },
  "filename": "awskey.js",
  "severity": "high",
  "context": {
    "awsKey": "AKIAIOSFODNN7EXAMPLE",
    "awsSecret": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    "bucket": "examplebucket",
    "server": "someserver.example.com"
  }
}
```
The key and associated secret are put in the `data` field with predictable names to ease
the automation of, for example, checking the validity of found secrets.
The entire object in which the secret was found is included in the `context` field.
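
As an illustration of that kind of automation, the sketch below triages findings by severity before anything is done with the `data` field. The `triage` helper and the `SEVERITY_RANK` table are hypothetical, and the ranking of `info`, `low`, `medium`, and `high` from lowest to highest is an assumption.

```javascript
// Assumed severity ranking, lowest to highest.
const SEVERITY_RANK = { info: 0, low: 1, medium: 2, high: 3 };

// Unknown or missing severities are treated as the lowest rank.
const rank = (severity) => SEVERITY_RANK[severity] ?? 0;

// Return findings at or above a minimum severity, most severe first.
// filter() returns a new array, so the input is not reordered.
function triage(findings, minSeverity = "medium") {
  return findings
    .filter((f) => rank(f.severity) >= rank(minSeverity))
    .sort((a, b) => rank(b.severity) - rank(a.severity));
}

// Sample findings, shaped like the output of `jsluice secrets`.
const findings = [
  {
    kind: "genericSecret",
    data: { key: "secret", value: "I quite like PHP" },
    filename: "simple-b64.js",
    severity: "info",
  },
  {
    kind: "AWSAccessKey",
    data: {
      key: "AKIAIOSFODNN7EXAMPLE",
      secret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    },
    filename: "awskey.js",
    severity: "high",
  },
];

for (const f of triage(findings)) {
  console.log(`${f.severity} ${f.kind} ${f.filename} key=${f.data.key}`);
}
// high AWSAccessKey awskey.js key=AKIAIOSFODNN7EXAMPLE
```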
#### Custom Secret Matchers
A JSON file containing an array of pattern objects can be supplied using the `-p`/`--patterns` flag.
Here's an example of a basic patterns file:
```json
[
    {
        "name": "base64",
        "value": "(eyJ|YTo|Tzo|PD[89]|rO0)[%a-zA-Z0-9+/]+={0,2}",
        "severity": "low"
    },
    {
        "name": "genericSecret",
        "key": "(secret|private|key)",
        "value": "[%a-zA-Z0-9+/]+"
    },
    {
        "name": "firebaseConfig",
        "severity": "high",
        "object": [
            {"key": "apiKey", "value": "^AIza.+"},
            {"key": "authDomain"},
            {"key": "projectId"},
            {"key": "storageBucket"}
        ]
    }
]
```
Each pattern can have the following fields:
* `name`, which is used in the output
* `severity`, which should be one of `info`, `low`, `medium`, or `high`
* `value`, a regular expression to match against string values
* `key`, a regular expression to match against key names
* `object`, an array of patterns to match against the keys and values of an entire object
All regular expressions use the [Go regex syntax](https://pkg.go.dev/regexp/syntax).
Here's a somewhat silly example JavaScript file to run the patterns file against:
```javascript
function getConfig(){
    let config = {
        randomStr: "abc123xyz256",
        secret: "I quite like PHP",
    }
    return "eyJsb2wiOiAic29tZSBKU09OISIsICJjb3VudCI6IDEyM30K"
}
```
Running `jsluice secrets` using the above patterns file (saved as `patterns.json`):
```
▶ jsluice secrets -p patterns.json simple-b64.js | jq
{
  "kind": "base64",
  "data": {
    "match": "eyJsb2wiOiAic29tZSBKU09OISIsICJjb3VudCI6IDEyM30K"
  },
  "filename": "simple-b64.js",
  "severity": "low",
  "context": null
}
{
  "kind": "genericSecret",
  "data": {
    "key": "secret",
    "value": "I quite like PHP"
  },
  "filename": "simple-b64.js",
  "severity": "info",
  "context": {
    "randomStr": "abc123xyz256",
    "secret": "I quite like PHP"
  }
}
```
Note that the `base64` matcher worked as expected, but the `genericSecret` matcher
returned a rather different sort of secret than expected. That's because the regular
expression lacks [anchors](https://www.regular-expressions.info/anchors.html):
```
[%a-zA-Z0-9+/]+
```
If you wanted to match against the whole value, the regex could be changed to:
```
^[%a-zA-Z0-9+/]+$
```
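
The effect of the anchors can be checked quickly in any regex engine. The sketch below uses JavaScript's `RegExp` purely for illustration. `jsluice` itself uses Go's regex syntax, but these two simple patterns should behave the same way in both.

```javascript
// The value pattern from the patterns file, without and with anchors.
const unanchored = new RegExp("[%a-zA-Z0-9+/]+");
const anchored = new RegExp("^[%a-zA-Z0-9+/]+$");

const value = "I quite like PHP";

// Unanchored: the pattern is satisfied by any matching substring,
// here the single leading "I".
console.log(unanchored.exec(value)[0]); // I

// Anchored: the whole value must consist of allowed characters,
// so the spaces cause the match to fail.
console.log(anchored.test(value)); // false

// A value made up entirely of allowed characters still matches.
console.log(anchored.test("abc123xyz256")); // true
```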
### Printing Syntax Trees
The `tree` mode prints a textual representation of the syntax tree for each JavaScript file.
This is especially helpful when [writing queries](#running-queries).
The output can be quite long, so here's a tiny example program:
```javascript
console.log("Hello, world!")
```
And the output of `jsluice tree`:
```
▶ jsluice tree hello.js
hello.js:
program
  expression_statement
    call_expression
      function: member_expression
        object: identifier (console)
        property: property_identifier (log)
      arguments: arguments
        string ("Hello, world!")
```
### Running Queries
The `query` mode lets you run [Tree-sitter](https://tree-sitter.github.io/tree-sitter/) queries against JavaScript files.
The query syntax is fully documented [here on the Tree-sitter project site](https://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax).
Just about the simplest query you could run extracts all of the string literals from the input files.
Here's an example file to try it with:
```javascript
const config = {
    stage: false,
    server: "example.com",
    ttl: 3600,
    dns: ["1.1.1.1", "8.8.8.8"],
    paths: {
        "home": "/",
        "blog": "/blog"
    }
}
```
And how to run the query:
```
▶ jsluice query -q '(string) @str' config.js
"example.com"
"1.1.1.1"
"8.8.8.8"
"home"
"/"
"blog"
"/blog"
```
The `@str` part of the query identifies which part of the query should be extracted.
In this case there is only one thing to match in the query, but it is still required.
`jsluice` tries to make the output valid JSONL where possible. Because it understands
objects, arrays, strings, and so on, it's possible to get JSON representations of those things
as output:
```
▶ jsluice query -q '(object) @match' config.js | jq
{
  "dns": [
    "1.1.1.1",
    "8.8.8.8"
  ],
  "paths": {
    "blog": "/blog",
    "home": "/"
  },
  "server": "example.com",
  "stage": false,
  "ttl": 3600
}
{
  "blog": "/blog",
  "home": "/"
}
```
If you don't want that to happen, you can use the `-r`/`--raw-output` flag.
### Formatting JavaScript Source
The `format` mode uses [jsbeautifier-go](https://github.com/ditashi/jsbeautifier-go) to format JavaScript source code:
```
▶ cat testdata/location.min.js
function goToLogin(){location.href="/login/"+document.location.hash.substring(1)} let logout=()=>{document.location.replace("/logout")}
▶ jsluice format testdata/location.min.js
function goToLogin() {
    location.href = "/login/" + document.location.hash.substring(1)
}
let logout = () => {
    document.location.replace("/logout")
}
```
### Requesting files from remote hosts
`jsluice` detects arguments that begin with `http://` or `https://`, fetches the files they point to, and processes them in the same run as any local files.
This means that URLs can be specified as well as local files.
```
▶ jsluice urls demo.js https://example.com/jquery.js | jq
{
  "url": "/api/users?id=EXPR&format=json",
  "queryParams": ["id", "format"],
  "method": "GET",
  "headers": {
    "X-Env": "stage"
  },
  "type": "fetch"
}
{
  "url": "/api/v1/posts",
  "queryParams": [],
  "bodyParams": [
    "postId"
  ],
  "method": "PUT",
  "headers": {
    "Content-Type": "application/json",
    "x-backend": "prod"
  },
  "type": "$.ajax",
  "filename": "jquery.js"
}
```
### Using WARC files
When the `-w`/`--warc` flag is specified, `jsluice` treats the input files as
[WARC](https://iipc.github.io/warc-specifications/specifications/warc-format/warc-1.1-annotated/) files.
```
▶ jsluice urls --warc example.warc.gz | jq
{
  "url": "/blog/admin.php?redirect=/login",
  "queryParams": [
    "redirect"
  ],
  "bodyParams": [],
  "method": "GET",
  "type": "location.replace",
  "filename": "https://example.com/blog/"
}
```
### Help
You can see the `jsluice` help output with the `-h`/`--help` flag.
```
▶ jsluice --help
jsluice - Extract URLs, paths, and secrets from JavaScript files

Usage:
  jsluice <mode> [options] [file...]

Modes:
  urls      Extract URLs and paths
  secrets   Extract secrets and other interesting bits
  tree      Print syntax trees for input files
  query     Run tree-sitter a query against input files

Global options:
  -c, --concurrency int        Number of files to process concurrently (default 1)
  -C, --cookie string          Cookies to use when making requests to the specified HTTP based arguments
  -H, --header string          Headers to use when making requests to the specified HTTP based arguments (can be specified multiple times)
  -P, --placeholder string     Set the expression placeholder to a custom string (default 'EXPR')
  -w, --warc                   Treat the input files as WARC (Web ARChive) files

URLs mode:
  -I, --ignore-strings         Ignore matches from string literals
  -S, --include-source         Include the source code where the URL was found
  -R, --resolve-paths <url>    Resolve relative paths using the absolute URL provided

Secrets mode:
  -p, --patterns <file>        JSON file containing user-defined secret patterns to look for

Query mode:
  -q, --query <query>          Tree sitter query to run; e.g. '(string) @matches'
  -r, --raw-output             Do not JSON-encode query output

Examples:
  jsluice urls -C 'auth=true; user=admin;' -H 'Specific-Header-One: true' -H 'Specific-Header-Two: false' local_file.js https://remote.host/example.js
  jsluice query -q '(object) @m' one.js two.js
  find . -name '*.js' | jsluice secrets -c 5 --patterns=apikeys.json
```