goapiuse

package module
v0.1.0 Latest
Published: Apr 24, 2026 License: Apache-2.0 Imports: 14 Imported by: 0

README

go-apiuse

Offline index of dominant call patterns ("shapes") for Go stdlib and popular package APIs. Part of the CivNode Training semantic engine.

Training uses the index to answer one question: how is this API actually called in production Go code? The answer powers "Used in production" sidebars, hover popovers, and kata grading hints. It is deterministic, has no runtime dependency on an LLM, and does not execute user code.

Status

v0.1.0. Usable public API, reference tiny_index.bin fixture, ingest tool, full test suite. The full production index artifact is distributed out-of-band (see "Artifact distribution" below).

Install

go get github.com/CivNode/go-apiuse@v0.1.0

Library usage

package main

import (
    "fmt"
    "log"

    goapiuse "github.com/CivNode/go-apiuse"
)

func main() {
    idx, err := goapiuse.Load("/var/lib/civnode/apiuse/latest.bin")
    if err != nil {
        log.Fatal(err)
    }
    // Print the three most common shapes with their share of observed calls.
    for _, u := range idx.Usage("context.WithTimeout", 3) {
        fmt.Printf("%.0f%%  %s\n", u.Frequency*100, u.Pattern)
    }
}

Public surface

type Index struct { /* opaque */ }

func Load(path string) (*Index, error)
func LoadFromFS(fsys fs.FS, path string) (*Index, error)

func (i *Index) Usage(qualName string, topN int) []Usage
func (i *Index) Meta() Meta
func (i *Index) Len() int

type Usage struct {
    Pattern      string
    Frequency    float64
    ExampleRepos []string
}

qualName is the fully-qualified callee. Examples:

  • context.WithTimeout
  • net/http.HandlerFunc
  • sync.Mutex.Lock
  • (builtin).len

Call shape

A shape is a compact, deterministic string describing how a call is written. The goal is aggressive clustering: a corpus of 10 calls to a given API typically collapses to three or fewer shapes.

Shapes encode three facts:

  1. Context. How the result is consumed. short-decl[2], assign[2], return, stmt, arg-of-call, defer, go, control-head, value-spec, composite-elt, operand.
  2. Arity. args=N, with variadic appended when the call spreads its final argument with the ... operator.
  3. Arg categories. One compact token per argument. Type-informed when types are available: context, duration, error, chan, map, slice, func, interface, int-const, float-const, string-const, bool-const. Syntactic fallback when types are unavailable: int-literal, string-literal, ident, nil, addr-of, mul-expr, composite-lit, func-lit, ...

Example output for ctx, cancel := context.WithTimeout(parent, 5*time.Second):

short-decl[2] | args=2 | context, duration

Identifier names are deliberately discarded. Package-qualified selector literals keep the selector path since time.Second vs http.StatusOK carries real semantic weight.
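
The three facts can be read back out of a shape string. The following is a minimal sketch that assumes every shape follows the layout of the single example above (three fields joined by " | ", categories joined by ", "). The Shape type and parseShape helper are hypothetical and not part of the package, and the layout of zero-argument or variadic shapes is not confirmed by this README.

```go
package main

import (
	"fmt"
	"strings"
)

// Shape holds the three fields of a shape string. The type and its field
// names are hypothetical; the package exposes shapes only as strings.
type Shape struct {
	Context    string   // how the result is consumed, e.g. "short-decl[2]"
	Arity      string   // raw arity field, e.g. "args=2"
	Categories []string // one token per argument
}

// parseShape splits a shape on " | ", assuming the layout of the README
// example. It reports false when there are not exactly three fields.
func parseShape(s string) (Shape, bool) {
	parts := strings.Split(s, " | ")
	if len(parts) != 3 {
		return Shape{}, false
	}
	sh := Shape{Context: parts[0], Arity: parts[1]}
	if parts[2] != "" {
		sh.Categories = strings.Split(parts[2], ", ")
	}
	return sh, true
}

func main() {
	sh, ok := parseShape("short-decl[2] | args=2 | context, duration")
	fmt.Println(ok, sh.Context, sh.Arity, sh.Categories)
}
```

Keeping the arity field as a raw string avoids guessing how the variadic marker is attached to it.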

Ingest tool

go install github.com/CivNode/go-apiuse/cmd/go-apiuse-ingest@v0.1.0

go-apiuse-ingest \
    -o index.bin \
    -source "stdlib + top-200 GitHub Go repos, 2026-04-24" \
    /srv/corpus/stdlib \
    /srv/corpus/third-party/...

The ingest uses golang.org/x/tools/go/packages with full type info (NeedTypes | NeedTypesInfo | NeedSyntax). Packages that fail to type-check are logged and skipped; the tool exits non-zero only if every package in the corpus failed. This matters because real-world corpora contain generated code, tagged-build files, and packages whose transitive imports cannot be resolved in isolation; dropping those is far better than failing the whole run.
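
The skip-and-continue policy can be sketched with the standard library alone. This is an illustration, not the ingest tool's code: it type-checks in-memory sources with go/types instead of golang.org/x/tools/go/packages, the source type and the checkAll and exitCode helpers are hypothetical, and treating an empty corpus as success is an assumption.

```go
package main

import (
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
	"log"
	"os"
)

// source is a hypothetical stand-in for one package of the corpus.
type source struct{ name, code string }

// checkAll parses and type-checks each source as its own package. Failures
// are logged and skipped. The sources must not import anything, because no
// Importer is configured in this sketch.
func checkAll(sources []source) (ok, skipped int) {
	for _, s := range sources {
		fset := token.NewFileSet()
		file, err := parser.ParseFile(fset, s.name+".go", s.code, 0)
		if err == nil {
			conf := types.Config{}
			_, err = conf.Check(s.name, fset, []*ast.File{file}, nil)
		}
		if err != nil {
			log.Printf("skipping %s: %v", s.name, err)
			skipped++
			continue
		}
		ok++
	}
	return ok, skipped
}

// exitCode is non-zero only when every package failed. An empty corpus is
// treated as success here, which is an assumption.
func exitCode(ok, skipped int) int {
	if ok == 0 && skipped > 0 {
		return 1
	}
	return 0
}

func main() {
	ok, skipped := checkAll([]source{
		{"a", "package a\nfunc F() int { return 1 }\n"},
		{"b", "package b\nfunc G() int { return \"x\" }\n"}, // type error
	})
	log.Printf("checked=%d skipped=%d", ok, skipped)
	os.Exit(exitCode(ok, skipped))
}
```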

Artifact distribution

The full corpus index is much larger than is appropriate for a Git repo. CivNode ships it as a static asset on civnode-storage (OVHcloud Frankfurt):

s3://civnode-storage/apiuse/index-2026-04-24.bin
s3://civnode-storage/apiuse/latest.json

latest.json is a pointer of the form:

{
  "version": "2026-04-24",
  "bin": "apiuse/index-2026-04-24.bin",
  "sha256": "...",
  "size_bytes": 48234112,
  "built_at": "2026-04-24T02:17:00Z",
  "source": "stdlib + top-200 GitHub Go repos"
}

Release pipeline

  1. Weekly scheduled job on HEXD runs the ingest against the mirrored corpus, producing index-YYYY-MM-DD.bin.
  2. Verify the artifact loads and returns usages for known APIs as a smoke test.
  3. Upload to civnode-storage under apiuse/.
  4. Update apiuse/latest.json atomically.
  5. CivNode's Training backend polls latest.json on startup plus once per day, downloads if the version differs, and caches the result locally.
  6. The browser grabs the artifact once per release and keeps it in IndexedDB.
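
A consumer of latest.json, as in step 5, might decode the pointer, compare versions, and check the download before caching it. The sketch below is hypothetical throughout: the Pointer type and the needsDownload and verify helpers are not part of this module, and it assumes the sha256 field is the lowercase hex digest of the .bin file and that size_bytes is its exact length.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Pointer mirrors the latest.json fields shown above. The Go type itself
// is hypothetical.
type Pointer struct {
	Version   string `json:"version"`
	Bin       string `json:"bin"`
	SHA256    string `json:"sha256"`
	SizeBytes int64  `json:"size_bytes"`
	BuiltAt   string `json:"built_at"`
	Source    string `json:"source"`
}

// needsDownload reports whether the published version differs from the
// locally cached one.
func needsDownload(p Pointer, cachedVersion string) bool {
	return p.Version != cachedVersion
}

// verify checks the downloaded bytes against the size and digest in the
// pointer. It assumes SHA256 is a lowercase hex digest.
func verify(p Pointer, data []byte) error {
	if int64(len(data)) != p.SizeBytes {
		return fmt.Errorf("size mismatch: got %d, want %d", len(data), p.SizeBytes)
	}
	sum := sha256.Sum256(data)
	if hex.EncodeToString(sum[:]) != p.SHA256 {
		return fmt.Errorf("sha256 mismatch")
	}
	return nil
}

func main() {
	data := []byte("stand-in for the index bytes")
	sum := sha256.Sum256(data)
	raw := fmt.Sprintf(`{"version":"2026-04-24","sha256":%q,"size_bytes":%d}`,
		hex.EncodeToString(sum[:]), len(data))

	var p Pointer
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		panic(err)
	}
	fmt.Println(needsDownload(p, "2026-04-17"), verify(p, data))
}
```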

Fixture for tests

testdata/tiny_index.bin ships a hand-built index covering ten snippets across three APIs (context.WithTimeout, net/http.HandlerFunc, sync.Mutex.Lock). Regenerate it with:

REGEN_TINY_INDEX=1 go test -run TestLoad_TinyIndex ./...

Development

make test   # go test ./... -race -count=1
make lint   # gofumpt -l -d . + golangci-lint run ./...
make fuzz   # go test -fuzz=FuzzDecode .
make fmt    # gofumpt -w .

Licence

Apache-2.0. See LICENSE.

Documentation

Overview

Package goapiuse is part of the CivNode Training semantic engine.

It loads an offline index of dominant call patterns ("shapes") for Go stdlib and popular third-party APIs. The index is built offline by the companion cmd/go-apiuse-ingest tool and consumed at runtime by Training to show "used in production" sidebars and hover popovers.

See https://github.com/CivNode/go-apiuse for details.

Constants

This section is empty.

Variables

This section is empty.

Functions

func EncodePublic

func EncodePublic(w io.Writer, art ArtifactV1) error

EncodePublic writes an artifact to w. It is the stable entry point the ingest command uses to serialise the corpus result. External callers outside the CivNode Training toolchain should not rely on this; the gob format may evolve between artifact schema versions.

func ExtractShape

func ExtractShape(sc ShapeContext) string

ExtractShape returns a canonical, compact string describing the call. The goal is aggressive clustering: a corpus of 10 real-world calls to a given API should collapse to 3 or fewer shapes.

The shape encodes three facts:

  • the call's syntactic context (assignment arity, return statement, argument position, bare statement, ...);
  • the argument count;
  • for each argument, a compact category (identifier, literal, call, selector, duration-like, ...).

Identifier names are deliberately discarded. Package-qualified selectors keep the selector path because it is often semantically load-bearing (e.g. time.Second vs http.StatusOK).

func ResolveCallee

func ResolveCallee(call *ast.CallExpr, info *types.Info) string

ResolveCallee returns the fully-qualified name of the callee in call, or the empty string if the callee cannot be resolved (e.g. an unresolved function literal or a dynamic dispatch through an interface with no type information available).

The returned form is:

pkg/path.FuncName           for package-level functions and constructors
pkg/path.TypeName.Method    for methods on named types (pointer or not)
(builtin).FuncName          for Go built-ins like len, cap, make

The method and built-in forms put a period between the receiver (or the (builtin) pseudo-package) and the function name, which keeps them compatible with the string keys the ingest tool emits.
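
The three forms can be derived from go/types objects. The following is a rough sketch, not the package's implementation: calleeName and resolveAll are hypothetical helpers, the demo source and its example.com/demo import path are made up, and cases such as interface dispatch, type conversions, and function literals simply yield the empty string.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// demoSrc is a made-up package with a function, a method, and a builtin call.
const demoSrc = `package demo

type Counter struct{ n int }

func (c *Counter) Inc() { c.n++ }

func New() *Counter { return &Counter{} }

func use() int {
	c := New()
	c.Inc()
	return len("abc")
}
`

// calleeName maps a call to one of the three documented forms, or "".
func calleeName(call *ast.CallExpr, info *types.Info) string {
	var id *ast.Ident
	switch fun := call.Fun.(type) {
	case *ast.Ident:
		id = fun
	case *ast.SelectorExpr:
		id = fun.Sel
	default:
		return ""
	}
	switch obj := info.Uses[id].(type) {
	case *types.Builtin:
		return "(builtin)." + obj.Name()
	case *types.Func:
		if obj.Pkg() == nil {
			return ""
		}
		recv := obj.Type().(*types.Signature).Recv()
		if recv == nil {
			return obj.Pkg().Path() + "." + obj.Name()
		}
		t := recv.Type()
		if p, ok := t.(*types.Pointer); ok {
			t = p.Elem() // pointer and value receivers share one form
		}
		if named, ok := t.(*types.Named); ok {
			return obj.Pkg().Path() + "." + named.Obj().Name() + "." + obj.Name()
		}
	}
	return ""
}

// resolveAll type-checks src and resolves every call in source order.
func resolveAll(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}
	info := &types.Info{Uses: make(map[*ast.Ident]types.Object)}
	conf := types.Config{}
	if _, err := conf.Check("example.com/demo", fset, []*ast.File{file}, info); err != nil {
		return nil, err
	}
	var names []string
	ast.Inspect(file, func(n ast.Node) bool {
		if call, ok := n.(*ast.CallExpr); ok {
			names = append(names, calleeName(call, info))
		}
		return true
	})
	return names, nil
}

func main() {
	names, err := resolveAll(demoSrc)
	fmt.Println(names, err)
}
```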

Types

type ArtifactV1

type ArtifactV1 struct {
	Meta    Meta
	Entries map[string][]Usage
}

ArtifactV1 is the on-disk layout, gob-encoded. A nested struct means future fields can be added without breaking older consumers as long as we only ever add new trailing fields. Exported so the ingest command in cmd/go-apiuse-ingest can construct it directly without duplicating the definition.
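
The compatibility claim rests on how encoding/gob matches struct fields by name and ignores fields in the stream that the receiver lacks. A small sketch with local stand-in types shows the effect. ArtifactNext and its Notes field are hypothetical, the mirrored types carry only a subset of the real fields, and the real artifact stream may be laid out differently.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Local stand-ins for the package types, with a subset of the fields.
type Meta struct {
	Version int
	Source  string
}

type Usage struct {
	Pattern   string
	Frequency float64
}

type ArtifactV1 struct {
	Meta    Meta
	Entries map[string][]Usage
}

// ArtifactNext is a hypothetical future layout with one extra trailing field.
type ArtifactNext struct {
	Meta    Meta
	Entries map[string][]Usage
	Notes   string
}

// roundTrip encodes the newer layout and decodes it with the older struct.
// gob drops the Notes field because ArtifactV1 has no field of that name.
func roundTrip(next ArtifactNext) (ArtifactV1, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(next); err != nil {
		return ArtifactV1{}, err
	}
	var old ArtifactV1
	err := gob.NewDecoder(&buf).Decode(&old)
	return old, err
}

func main() {
	old, err := roundTrip(ArtifactNext{
		Meta:    Meta{Version: 1, Source: "demo corpus"},
		Entries: map[string][]Usage{"context.WithTimeout": {{Pattern: "demo", Frequency: 1}}},
		Notes:   "only newer consumers see this",
	})
	fmt.Println(old.Meta.Source, len(old.Entries), err)
}
```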

type Index

type Index struct {
	// contains filtered or unexported fields
}

Index is an opaque handle over a loaded artifact. Safe for concurrent reads once Load / LoadFromFS has returned.

func Load

func Load(path string) (*Index, error)

Load reads an index from the filesystem. It is a thin wrapper over LoadFromFS using os.DirFS on the parent directory.

func LoadFromFS

func LoadFromFS(fsys fs.FS, path string) (*Index, error)

LoadFromFS reads an index from an fs.FS. This is the preferred form when the artifact is shipped via embed.FS.

func (*Index) Len

func (i *Index) Len() int

Len returns the number of distinct qualified API names in the index.

func (*Index) Meta

func (i *Index) Meta() Meta

Meta returns the artifact metadata. Zero value if the artifact did not carry metadata.

func (*Index) Usage

func (i *Index) Usage(qualName string, topN int) []Usage

Usage returns the top-N shapes for qualName, sorted by descending frequency. If qualName is unknown, it returns nil. If topN <= 0 the full list is returned.
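
The documented contract (nil for unknown names, descending frequency, full list when topN <= 0) can be modelled over a plain map. This is a sketch of the stated semantics only. The topUsages helper and the local Usage type are hypothetical, and the real Index may store its entries pre-sorted.

```go
package main

import (
	"fmt"
	"sort"
)

// Usage is a local stand-in with the two fields the sketch needs.
type Usage struct {
	Pattern   string
	Frequency float64
}

// topUsages models the documented behaviour of (*Index).Usage.
func topUsages(entries map[string][]Usage, qualName string, topN int) []Usage {
	all, ok := entries[qualName]
	if !ok {
		return nil // unknown API name
	}
	// Copy before sorting so the stored slice is never reordered.
	out := make([]Usage, len(all))
	copy(out, all)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Frequency > out[j].Frequency
	})
	if topN > 0 && topN < len(out) {
		out = out[:topN]
	}
	return out
}

func main() {
	entries := map[string][]Usage{
		"sync.Mutex.Lock": {
			{Pattern: "defer", Frequency: 0.125},
			{Pattern: "stmt", Frequency: 0.875},
		},
	}
	fmt.Println(topUsages(entries, "sync.Mutex.Lock", 1))
	fmt.Println(topUsages(entries, "sync.Mutex.Unlock", 1) == nil)
}
```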

type Meta

type Meta struct {
	Version   int    // artifact schema version; current = 1
	BuiltAt   string // RFC3339 timestamp
	Source    string // free-form description of the corpus
	CallCount int    // total number of resolved calls ingested
}

Meta describes the artifact that was loaded. Stored as the first value in the gob stream so that older consumers can still read the entries map.

type ShapeContext

type ShapeContext struct {
	Call   *ast.CallExpr
	Parent ast.Node // immediate parent of Call in the AST walk

	// Info is the type-checked info produced by go/packages. Must include
	// Types + Uses + Defs. A nil Info means "types not available"; the
	// shape will fall back to syntactic classification.
	Info *types.Info
}

ShapeContext is what the ingest tool feeds into ExtractShape. It carries everything the shape extractor needs to decide on a canonical string without holding a reference to a types.Package.
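
The Parent field implies that the AST walk tracks ancestors. One way to do that with ast.Inspect is to keep an explicit stack, then classify the call by its immediate parent. The sketch below is illustrative only: contextOf and contexts are hypothetical helpers, only a handful of the context tokens listed in the README are covered, and the "other" fallback is not one of the package's tokens.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// contextOf classifies a call by its immediate parent node.
func contextOf(call *ast.CallExpr, parent ast.Node) string {
	switch p := parent.(type) {
	case *ast.AssignStmt:
		kind := "assign"
		if p.Tok == token.DEFINE {
			kind = "short-decl"
		}
		return fmt.Sprintf("%s[%d]", kind, len(p.Lhs))
	case *ast.ReturnStmt:
		return "return"
	case *ast.ExprStmt:
		return "stmt"
	case *ast.DeferStmt:
		return "defer"
	case *ast.GoStmt:
		return "go"
	case *ast.CallExpr:
		for _, arg := range p.Args {
			if arg == call {
				return "arg-of-call"
			}
		}
	}
	return "other" // placeholder for contexts this sketch does not cover
}

// contexts parses src and returns the context of every call in source order.
func contexts(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "demo.go", src, 0)
	if err != nil {
		return nil, err
	}
	var stack []ast.Node
	var out []string
	ast.Inspect(file, func(n ast.Node) bool {
		if n == nil { // leaving a node: pop it
			stack = stack[:len(stack)-1]
			return false
		}
		if call, ok := n.(*ast.CallExpr); ok && len(stack) > 0 {
			out = append(out, contextOf(call, stack[len(stack)-1]))
		}
		stack = append(stack, n)
		return true
	})
	return out, nil
}

func main() {
	got, err := contexts("package demo\nfunc f() (int, error)\nfunc g() { a, err := f(); _, _ = a, err }\n")
	fmt.Println(got, err)
}
```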

type Usage

type Usage struct {
	// Pattern is a canonical, compact description of the call shape. For
	// example: "two-result assign, 2 args, arg[1] is duration literal".
	Pattern string

	// Frequency is the fraction (0..1) of all observed calls to the API
	// that match this shape. A value of 1.0 means every call in the
	// corpus used this shape.
	Frequency float64

	// ExampleRepos lists up to five source locations (repo or file paths)
	// where the shape was observed. Purely informational.
	ExampleRepos []string
}

Usage is one call-shape entry for a given fully-qualified API name.
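
Frequency is a simple ratio, so a list of observed shapes for one API can be folded into Usage entries by counting. This is a minimal sketch under that reading. The usagesFromShapes helper and the local Usage type are hypothetical, ExampleRepos is omitted, and the alphabetical tie-break is an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// Usage is a local stand-in with the two fields the sketch needs.
type Usage struct {
	Pattern   string
	Frequency float64
}

// usagesFromShapes counts identical shapes and converts the counts into
// fractions of all observed calls, most frequent first.
func usagesFromShapes(shapes []string) []Usage {
	counts := make(map[string]int)
	for _, s := range shapes {
		counts[s]++
	}
	out := make([]Usage, 0, len(counts))
	for pattern, n := range counts {
		out = append(out, Usage{
			Pattern:   pattern,
			Frequency: float64(n) / float64(len(shapes)),
		})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Frequency != out[j].Frequency {
			return out[i].Frequency > out[j].Frequency
		}
		return out[i].Pattern < out[j].Pattern // assumed tie-break
	})
	return out
}

func main() {
	observed := []string{
		"short-decl[2] | args=2 | context, duration",
		"short-decl[2] | args=2 | context, duration",
		"short-decl[2] | args=2 | context, duration",
		"assign[2] | args=2 | context, duration",
	}
	for _, u := range usagesFromShapes(observed) {
		fmt.Printf("%.0f%%  %s\n", u.Frequency*100, u.Pattern)
	}
}
```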

Directories

Path Synopsis
cmd/go-apiuse-ingest (command)
go-apiuse-ingest walks one or more Go source trees, resolves the callee of every function call, extracts a canonical call-shape for each, and writes a gob-encoded index.bin suitable for consumption by the goapiuse runtime library.
