Information about the context of script execution is available in the `env` global object.

## Environment (`env`)

The `env` global object contains properties that provide information about the script execution context.
`env` is populated automatically by the GenAIScript runtime.

### `env.files`

The `env.files` array contains all files within the execution context. The context is defined implicitly
by the user based on:

- `script` `files` option

```js
script({
    files: "**/*.pdf",
})
```

or multiple paths

```js
script({
    files: ["src/*.pdf", "other/*.pdf"],
})
```

- the location in the UI from which the tool is started

- [CLI](/genaiscript/reference/cli) files arguments.

The files are stored in `env.files`, which can be injected into the prompt:

- using `def`

```js
def("FILE", env.files)
```

- filtered with `def` options

```js
def("DOCS", env.files, { endsWith: ".md" })
def("CODE", env.files, { endsWith: ".py" })
```

- directly in a `$` call

```js
$`Summarize ${env.files}.`
```

In this case, the prompt is automatically expanded with a `def` call and the value of `env.files`.

```js
// expanded
const files = def("FILES", env.files, { ignoreEmpty: true })
$`Summarize ${files}.`
```

### `env.vars`

The `vars` property contains the variables that have been defined in the script execution context.

```javascript
// grab locale from variable or default to en-US
const locale = env.vars.locale || "en-US"
```

Read more about [variables](/genaiscript/reference/scripts/variables).

## Definition (`def`)

The `def("FILE", file)` function is a shorthand for generating a fenced variable output.

```js "def"
def("FILE", file)
```

It renders approximately to

````markdown
FILE:

```file="filename"
file content
```
````

or, if the model supports XML tags (see [fence formats](/genaiscript/reference/scripts/fence-formats)):

```markdown
<FILE file="filename">
file content
</FILE>
```
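
As a rough mental model of the two renderings above, the following sketch builds both strings from a file object. `renderFenced` and `renderXml` are hypothetical helpers written for illustration, not GenAIScript APIs, and the file shape (`filename`, `content`) is an assumption. The runtime's exact output may differ.

```javascript
// Hypothetical helpers that mimic the two approximate renderings shown above.
// They are not GenAIScript APIs; the runtime's exact output may differ.
const FENCE = "`".repeat(3)

// Markdown-style rendering: a label followed by a fenced block.
function renderFenced(name, file) {
    return `${name}:\n\n${FENCE}file="${file.filename}"\n${file.content}\n${FENCE}`
}

// XML-style rendering: the content wrapped in a tag named after the variable.
function renderXml(name, file) {
    return `<${name} file="${file.filename}">\n${file.content}\n</${name}>`
}

// Stand-in for a workspace file (assumed shape: filename + content).
const sampleFile = { filename: "notes.md", content: "hello world" }
console.log(renderFenced("FILE", sampleFile))
console.log(renderXml("FILE", sampleFile))
```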

The `def` function can also be used with an array of files, such as `env.files`.

```js "env.files"
def("FILE", env.files)
```

### Language

You can specify the language of the text contained in `def`. This can help GenAIScript optimize the rendering of the text.

```js 'language: "diff"'
// hint that the output is a diff
def("DIFF", gitdiff, { language: "diff" })
```

### Referencing

The `def` function returns a variable name that can be used in the prompt.
The name might be formatted differently to accommodate the model's preference.

```js "const f = "
const f = def("FILE", file)

$`Summarize ${f}.`
```

### File filters

Since a script may be executed on a full folder, it is often useful to filter the files based on

- their extension

```js "endsWith: '.md'"
def("FILE", env.files, { endsWith: ".md" })
```

- or using a [glob](<https://en.wikipedia.org/wiki/Glob_(programming)>):

```js "glob: '**/*.{md,mdx}'"
def("FILE", env.files, { glob: "**/*.{md,mdx}" })
```
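
As a mental model, the `endsWith` option behaves much like filtering the file array by filename suffix before it is rendered. The sketch below shows that idea in plain JavaScript. `keepEndsWith` is a hypothetical helper and the file objects are stand-ins for `env.files` entries (assumed to expose a `filename` property).

```javascript
// Hypothetical helper: keep only the files whose name ends with `suffix`.
// This models the effect of the `endsWith` option; it is not a GenAIScript API.
function keepEndsWith(files, suffix) {
    return files.filter((file) => file.filename.endsWith(suffix))
}

// Stand-in data for env.files (assumed shape: objects with a `filename`).
const sampleFiles = [
    { filename: "README.md" },
    { filename: "src/main.py" },
    { filename: "docs/guide.md" },
]

// Logs the two Markdown files: README.md and docs/guide.md
console.log(keepEndsWith(sampleFiles, ".md").map((file) => file.filename))
```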

:::tip

You can open the completion menu and discover all the options
by pressing **Ctrl+Space** after the curly brace `{` character.

```js
def("FILE", env.files, { // press Ctrl+Space
```

:::

### Empty files

By default, if `def` is used with an empty array of files, it will cancel the prompt. You can override this behavior
by setting `ignoreEmpty` to `true`.

```js "ignoreEmpty: true"
def("FILE", env.files, { endsWith: ".md", ignoreEmpty: true })
```

### Line-based extraction

You can extract content around a specific line number using the `line` option. This is particularly useful when you want to focus on a specific area of interest in large files.

```js "line: 25"
// Focus on line 25 with dynamic context
def("FUNCTION_CODE", fileContent, { line: 25 })
```

The `line` option dynamically calculates the surrounding context based on file size:
- Very small files (≤20 lines): Include most content
- Small files (≤100 lines): 15 lines on each side
- Medium files (≤500 lines): 25 lines on each side
- Large files (≤2000 lines): 50 lines on each side
- Very large files (>2000 lines): 75 lines on each side
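
The thresholds above can be sketched as a small lookup. `contextLinesPerSide` and `lineWindow` are hypothetical helpers, not GenAIScript APIs. For files of 20 lines or fewer, the sketch assumes that "most content" means the whole file.

```javascript
// Hypothetical helper mirroring the documented thresholds.
function contextLinesPerSide(totalLines) {
    if (totalLines <= 20) return totalLines // assumption: include the whole file
    if (totalLines <= 100) return 15
    if (totalLines <= 500) return 25
    if (totalLines <= 2000) return 50
    return 75
}

// Returns the 1-based, inclusive range around `line`, clamped to the file bounds.
function lineWindow(line, totalLines) {
    const side = contextLinesPerSide(totalLines)
    return {
        start: Math.max(1, line - side),
        end: Math.min(totalLines, line + side),
    }
}

// An 80-line file centered on line 25 gets 15 lines on each side.
console.log(lineWindow(25, 80)) // { start: 10, end: 40 }
```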

#### Token budget support

When combined with `maxTokens`, the `line` option performs intelligent token-aware range calculation:

```js "line: 25, maxTokens: 500"
// Focus on line 25 with token budget constraint
def("FUNCTION_CODE", fileContent, { line: 25, maxTokens: 500 })
```

The implementation:
- **Smart Expansion**: Starts with the center line and expands alternately up/down until the token budget is reached
- **Accurate Counting**: Uses precise token estimation for better control
- **Graceful Fallback**: Falls back to file-size-based calculation when no `maxTokens` is specified
- **Budget Overflow**: Returns just the center line if it already exceeds the token budget
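
The expansion strategy can be sketched as follows. This is a minimal illustration under stated assumptions, not the runtime's implementation: `estimateTokens` uses a crude four-characters-per-token heuristic, `expandAroundLine` is a hypothetical helper, and the sketch stops growing in a direction as soon as the next line on that side no longer fits.

```javascript
// Crude token estimate (assumption: ~4 characters per token).
function estimateTokens(text) {
    return Math.ceil(text.length / 4)
}

// Hypothetical helper: grow a window around `line` (1-based), alternating
// up and down, while the estimated total stays within `maxTokens`.
function expandAroundLine(lines, line, maxTokens) {
    const center = line - 1
    let start = center
    let end = center
    let used = estimateTokens(lines[center])
    // Budget overflow: return the center line alone, even if it is too large.
    if (used >= maxTokens) return { lineStart: line, lineEnd: line }

    let goUp = true
    let canUp = start > 0
    let canDown = end < lines.length - 1
    while (canUp || canDown) {
        // Prefer the alternating direction; fall back when that side is exhausted.
        const up = (goUp && canUp) || !canDown
        const idx = up ? start - 1 : end + 1
        const cost = estimateTokens(lines[idx])
        if (used + cost > maxTokens) {
            // The next line on this side does not fit: stop growing that way.
            if (up) canUp = false
            else canDown = false
        } else {
            used += cost
            if (up) start = idx
            else end = idx
            canUp = canUp && start > 0
            canDown = canDown && end < lines.length - 1
        }
        goUp = !goUp
    }
    return { lineStart: start + 1, lineEnd: end + 1 }
}
```

With five lines of one estimated token each, a budget of 3 centered on line 3 yields lines 2 to 4.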

#### Priority rules

Explicit line ranges take precedence over the `line` option:

```js
// lineStart/lineEnd override the line option and maxTokens
def("EXPLICIT_WINS", codeFile, {
    lineStart: 10,
    lineEnd: 20,
    line: 50,
    maxTokens: 100,
}) // Uses lines 10-20
```
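
One way to picture this precedence is as a small decision function. The sketch below is a hypothetical model, not the runtime's code: `pickExtractionMode` is an invented name, and it assumes that setting either `lineStart` or `lineEnd` counts as an explicit range.

```javascript
// Hypothetical model of the precedence between range options.
function pickExtractionMode(options) {
    const { lineStart, lineEnd, line, maxTokens } = options
    // Explicit ranges win over `line` (assumption: either bound is enough).
    if (lineStart !== undefined || lineEnd !== undefined) return "explicit-range"
    // `line` with a budget uses token-aware expansion.
    if (line !== undefined && maxTokens !== undefined) return "line-with-budget"
    // `line` alone falls back to the file-size-based window.
    if (line !== undefined) return "line-by-file-size"
    return "whole-file"
}

console.log(pickExtractionMode({ lineStart: 10, lineEnd: 20, line: 50, maxTokens: 100 }))
// explicit-range
```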

### `maxTokens`

You can limit the number of tokens that `def` adds to the prompt. This is useful when the content is large and the model has a token limit.
Set the `maxTokens` option to a number to cap the tokens included **for each individual file**.

```js "maxTokens: 100"
def("FILE", env.files, { maxTokens: 100 })
```

When used with the `line` option, `maxTokens` controls the total size of the extracted range around the center line rather than truncating individual files.

### Data filters

The `def` function treats data files such as [CSV](/genaiscript/reference/scripts/csv) and [XLSX](/genaiscript/reference/scripts/xlsx) specially. It will automatically convert the data into a
markdown table format to improve tokenization.

- `sliceHead`, keep the top N rows

```js "sliceHead: 100"
def("FILE", env.files, { sliceHead: 100 })
```

- `sliceTail`, keep the last N rows

```js "sliceTail: 100"
def("FILE", env.files, { sliceTail: 100 })
```

- `sliceSample`, keep a random sample of N rows

```js "sliceSample: 100"
def("FILE", env.files, { sliceSample: 100 })
```
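
The three slicing options map onto familiar array operations. The sketch below is a plain JavaScript illustration of what each one keeps. The helper functions are hypothetical stand-ins named after the options, not GenAIScript APIs, and the sampling strategy (a Fisher-Yates shuffle) is an assumption.

```javascript
// Keep the first n rows.
function sliceHead(rows, n) {
    return rows.slice(0, n)
}

// Keep the last n rows (guarding n = 0, since slice(-0) would return everything).
function sliceTail(rows, n) {
    return n > 0 ? rows.slice(-n) : []
}

// Keep n rows picked at random, without repeats (Fisher-Yates shuffle on a copy).
function sliceSample(rows, n) {
    const copy = rows.slice()
    for (let i = copy.length - 1; i > 0; i--) {
        const j = Math.floor(Math.random() * (i + 1))
        const tmp = copy[i]
        copy[i] = copy[j]
        copy[j] = tmp
    }
    return copy.slice(0, n)
}

const sampleRows = [1, 2, 3, 4, 5]
console.log(sliceHead(sampleRows, 2)) // [ 1, 2 ]
console.log(sliceTail(sampleRows, 2)) // [ 4, 5 ]
```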

### Prompt Caching

You can use `cacheControl: "ephemeral"` to specify that the prompt can be cached
for a short amount of time, and enable prompt caching optimization, which is supported (differently) by various LLM providers.

```js 'cacheControl("ephemeral")'
$`...`.cacheControl("ephemeral")
```

```js 'cacheControl: "ephemeral"'
def("FILE", env.files, { cacheControl: "ephemeral" })
```

Read more about [prompt caching](/genaiscript/reference/scripts/prompt-caching).

### Safety: Prompt Injection detection

You can schedule a check for prompt injection or jailbreak attempts with your configured [content safety](/genaiscript/reference/scripts/content-safety) provider.

```js "detectPromptInjection: true"
def("FILE", env.files, { detectPromptInjection: true })
```

### Predicted output

Some models, like OpenAI gpt-4o and gpt-4o-mini, support specifying a [predicted output](https://platform.openai.com/docs/guides/predicted-outputs) (with some [limitations](https://platform.openai.com/docs/guides/predicted-outputs#limitations)). This helps reduce latency for model responses where much of the response is known ahead of time.
This can be helpful when asking the LLM to edit specific files.

Set the `prediction: true` flag to enable it on a `def` call. Note that only a single file can be predicted.

```js
def("FILE", env.files[0], { prediction: true })
```

:::note

This feature disables line number insertion.

:::

## Data definition (`defData`)

The `defData` function offers additional formatting options for converting a data object into a textual representation. It supports rendering objects as YAML, JSON, or CSV (formatted as a Markdown table).

```js
// render to markdown-ified CSV by default
defData("DATA", data)

// render as yaml
defData("DATA", data, { format: "yaml" })
```

The `defData` function also supports options to slice and filter the input rows and columns.

- `headers`, list of column names to include
- `sliceHead`, number of rows or fields to include from the beginning
- `sliceTail`, number of rows or fields to include from the end
- `sliceSample`, number of rows or fields to pick at random
- `distinct`, list of column names to deduplicate the data based on
- `query`, a [jq](https://jqlang.github.io/jq/) query to filter the data

```js
defData("DATA", data, {
    sliceHead: 5,
    sliceTail: 5,
    sliceSample: 100,
})
```
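
To make the effect of `headers` and `distinct` concrete, here is a minimal sketch that deduplicates rows and renders them as a Markdown table. `distinctBy` and `toMarkdownTable` are hypothetical helpers for illustration only. The table layout that `defData` actually emits may differ.

```javascript
// Hypothetical helper: drop rows whose values in `columns` were already seen.
function distinctBy(rows, columns) {
    const seen = new Set()
    return rows.filter((row) => {
        const key = JSON.stringify(columns.map((c) => row[c]))
        if (seen.has(key)) return false
        seen.add(key)
        return true
    })
}

// Hypothetical helper: render rows as a Markdown table, restricted to `headers`.
function toMarkdownTable(rows, headers = Object.keys(rows[0] ?? {})) {
    const head = `| ${headers.join(" | ")} |`
    const separator = `| ${headers.map(() => "---").join(" | ")} |`
    const body = rows.map(
        (row) => `| ${headers.map((h) => String(row[h] ?? "")).join(" | ")} |`
    )
    return [head, separator, ...body].join("\n")
}

const sampleData = [
    { name: "cat", legs: 4 },
    { name: "dog", legs: 4 },
    { name: "bird", legs: 2 },
]
// One row per distinct `legs` value, showing only the `name` column.
console.log(toMarkdownTable(distinctBy(sampleData, ["legs"]), ["name"]))
```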

You can leverage the data filtering functionality
using `parsers.tidyData` as well.

## Diff Definition (`defDiff`)

It is very common to compare two pieces of data and ask the LLM to analyze the differences. Using diffs is a great way
to naturally compress the information since we only focus on differences!

The `defDiff` function takes care of formatting the diff in a way that helps the LLM reason. It behaves similarly to `def` and assigns
a name to the diff.

```js
// diff files
defDiff("DIFF", env.files[0], env.files[1])

// diff strings
defDiff("DIFF", "cat", "dog")

// diff objects
defDiff("DIFF", { name: "cat" }, { name: "dog" })
```

You can leverage the diff functionality using `parsers.diff`.