Hugging Face Tool Builder

Added March 5, 2026. Source: Hugging Face.

Generate reusable command-line scripts and utilities for working with the Hugging Face API. This is ideal for automating data tasks, building custom data pipelines, or chaining API calls together. The generated scripts handle authentication and are designed for easy piping into other tools.

Installation

This skill is self-contained. Copy the SKILL.md below directly into your project to get started.

.claude/skills/hugging-face-tool-builder/SKILL.md    # Claude Code
.cursor/skills/hugging-face-tool-builder/SKILL.md    # Cursor

Or install as a personal skill (available across all your projects):

~/.claude/skills/hugging-face-tool-builder/SKILL.md

You can also install using the skills CLI:

npx skills add huggingface/skills --skill hugging-face-tool-builder

Requires Node.js 18+.

SKILL.md

---
name: hugging-face-tool-builder
description: Use this skill when the user wants to build tools or scripts, or to accomplish a task where data from the Hugging Face API would help. It is especially useful when chaining or combining API calls, or when the task will be repeated or automated. This skill creates a reusable script to fetch, enrich, or process data.
---

# Hugging Face API Tool Builder

Your purpose is to create reusable command-line scripts and utilities for using the Hugging Face API, allowing chaining, piping, and intermediate processing where helpful. You can access the API directly, as well as use the `hf` command-line tool. Model and dataset cards can be accessed directly from repositories.

## Script Rules

Make sure to follow these rules:
 - Scripts must accept a `--help` command-line argument that describes their inputs and outputs
 - Non-destructive scripts should be tested before being handed over to the User
 - Shell scripts are preferred, but use Python or TSX if complexity or user need requires it
 - IMPORTANT: Use the `HF_TOKEN` environment variable in an Authorization header. For example: `curl -H "Authorization: Bearer ${HF_TOKEN}" https://huggingface.co/api/`. This provides higher rate limits and appropriate authorization for data access.
 - Investigate the shape of the API results before committing to a final design. Make use of piping and chaining where composability would be an advantage, and prefer simple solutions where possible.
 - Share usage examples once complete
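As a minimal sketch of these rules (the endpoint and defaults mirror the baseline scripts below; the network call is left commented out so the skeleton is safe to run offline):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sketch: --help support, optional HF_TOKEN auth, small default limit.
show_help() {
  echo "Usage: ${0##*/} [limit]  Fetch up to [limit] models as raw JSON (default 3)"
}

if [ "${1:-}" = "--help" ]; then
  show_help
  exit 0
fi

limit="${1:-3}"

auth=()  # empty when HF_TOKEN is unset: anonymous access, lower rate limits
if [ -n "${HF_TOKEN:-}" ]; then
  auth=(-H "Authorization: Bearer ${HF_TOKEN}")
fi

# Real call (commented out so the sketch runs without network access):
# curl -s "${auth[@]}" "https://huggingface.co/api/models?limit=${limit}"
echo "GET https://huggingface.co/api/models?limit=${limit}"
```

The baseline scripts in the Companion Files section apply this same pattern with the `curl` call enabled.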

Be sure to confirm the User's preferences whenever questions or clarifications are needed.

## Sample Scripts

Paths below are relative to this skill directory.

Reference examples:
- `references/hf_model_papers_auth.sh` ([source](https://raw.githubusercontent.com/huggingface/skills/main/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh)) — uses `HF_TOKEN` automatically and chains trending → model metadata → model card parsing with fallbacks; it demonstrates multi-step API usage plus auth hygiene for gated/private content.
- `references/find_models_by_paper.sh` ([source](https://raw.githubusercontent.com/huggingface/skills/main/skills/hugging-face-tool-builder/references/find_models_by_paper.sh)) — optional `HF_TOKEN` usage via `--token`, consistent authenticated search, and a retry path when arXiv-prefixed searches are too narrow; it shows resilient query strategy and clear user-facing help.
- `references/hf_model_card_frontmatter.sh` ([source](https://raw.githubusercontent.com/huggingface/skills/main/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh)) — uses the `hf` CLI to download model cards, extracts YAML frontmatter, and emits NDJSON summaries (license, pipeline tag, tags, gated prompt flag) for easy filtering.

Baseline examples (ultra-simple, minimal logic, raw JSON output with `HF_TOKEN` header):
- `references/baseline_hf_api.sh` ([source](https://raw.githubusercontent.com/huggingface/skills/main/skills/hugging-face-tool-builder/references/baseline_hf_api.sh)) — bash
- `references/baseline_hf_api.py` ([source](https://raw.githubusercontent.com/huggingface/skills/main/skills/hugging-face-tool-builder/references/baseline_hf_api.py)) — python
- `references/baseline_hf_api.tsx` — typescript executable

Composable utility (stdin → NDJSON):
- `references/hf_enrich_models.sh` ([source](https://raw.githubusercontent.com/huggingface/skills/main/skills/hugging-face-tool-builder/references/hf_enrich_models.sh)) — reads model IDs from stdin, fetches metadata per ID, emits one JSON object per line for streaming pipelines.

Composability through piping (shell-friendly JSON output):
- `references/baseline_hf_api.sh 25 | jq -r '.[].id' | references/hf_enrich_models.sh | jq -s 'sort_by(.downloads) | reverse | .[:10]'`
- `references/baseline_hf_api.sh 50 | jq '[.[] | {id, downloads}] | sort_by(.downloads) | reverse | .[:10]'`
- `printf '%s\n' openai/gpt-oss-120b meta-llama/Meta-Llama-3.1-8B | references/hf_model_card_frontmatter.sh | jq -s 'map({id, license, has_extra_gated_prompt})'`
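NDJSON output like the above also converts cleanly to other formats with `jq`. For example (the sample records are illustrative, not real API output):

```shell
# Turn NDJSON model records into CSV rows.
printf '%s\n' \
  '{"id":"org/model-a","downloads":120}' \
  '{"id":"org/model-b","downloads":45}' \
  | jq -r '[.id, .downloads] | @csv'
# "org/model-a",120
# "org/model-b",45
```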

## High Level Endpoints

The following are the main API endpoints available at `https://huggingface.co`:

```
/api/datasets
/api/models
/api/spaces
/api/collections
/api/daily_papers
/api/notifications
/api/settings
/api/whoami-v2
/api/trending
/oauth/userinfo
```

## Accessing the API

The API is documented with the OpenAPI standard at `https://huggingface.co/.well-known/openapi.json`.

**IMPORTANT:** DO NOT ATTEMPT to read `https://huggingface.co/.well-known/openapi.json` directly as it is too large to process. 

**IMPORTANT:** Use `jq` to query and extract the relevant parts. For example:

List all endpoint paths (around 160 at the time of writing):

```bash
curl -s "https://huggingface.co/.well-known/openapi.json" | jq '.paths | keys | sort'
```

Model search endpoint details:

```bash
curl -s "https://huggingface.co/.well-known/openapi.json" | jq '.paths["/api/models"]'
```

You can also query endpoints to see the shape of the data they return. When doing so, constrain results to low numbers so that they are easy to process yet still representative.
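This shape-probing step can also be rehearsed offline against a canned response. The sample below is illustrative, with fields abridged:

```shell
# Abridged, illustrative sample of the list that /api/models returns.
sample='[{"id":"org/model-a","downloads":120,"likes":3,"tags":["text-generation"]},
         {"id":"org/model-b","downloads":45,"likes":9,"tags":[]}]'

# Which top-level fields does each record carry?
printf '%s' "$sample" | jq '.[0] | keys'

# Extract just the IDs, one per line, ready for piping into an enrichment script.
printf '%s' "$sample" | jq -r '.[].id'
```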

## Using the HF command line tool

The `hf` command line tool gives you further access to Hugging Face repository content and infrastructure. 

```bash
❯ hf --help
Usage: hf [OPTIONS] COMMAND [ARGS]...

  Hugging Face Hub CLI

Options:
  --help                Show this message and exit.

Commands:
  auth                 Manage authentication (login, logout, etc.).
  cache                Manage local cache directory.
  download             Download files from the Hub.
  endpoints            Manage Hugging Face Inference Endpoints.
  env                  Print information about the environment.
  jobs                 Run and manage Jobs on the Hub.
  repo                 Manage repos on the Hub.
  repo-files           Manage files in a repo on the Hub.
  upload               Upload a file or a folder to the Hub.
  upload-large-folder  Upload a large folder to the Hub.
  version              Print information about the hf version.
```

The `hf` CLI command has replaced the now-deprecated `huggingface-cli` command.
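Before wiring `hf` into a script, guard for its presence the same way the sample scripts above guard for `jq`. (The install hint assumes the CLI is distributed with the `huggingface_hub` Python package, which is currently the case.)

```shell
# Reusable guard: succeed only if the named command is on PATH.
require_cmd() {
    command -v "$1" >/dev/null 2>&1 || {
        echo "Error: $1 is required but not installed" >&2
        return 1
    }
}

if require_cmd hf 2>/dev/null; then
    hf version
else
    echo "hf CLI not found; try: pip install -U huggingface_hub" >&2
fi
```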


## Companion Files

The following companion files are referenced above and included here for standalone use.

### references/baseline_hf_api.tsx

```tsx
#!/usr/bin/env tsx

/**
 * Ultra-simple Hugging Face API example (TSX).
 *
 * Fetches a small list of models from the HF API and prints raw JSON.
 * Uses HF_TOKEN for auth if the environment variable is set.
 */

const showHelp = () => {
  console.log(`Ultra-simple Hugging Face API example (TSX)

Usage:
  baseline_hf_api.tsx [limit]
  baseline_hf_api.tsx --help

Description:
  Fetches a small list of models from the HF API and prints raw JSON.
  Uses HF_TOKEN for auth if the environment variable is set.

Examples:
  baseline_hf_api.tsx
  baseline_hf_api.tsx 5
  HF_TOKEN=your_token baseline_hf_api.tsx 10
`);
};

const arg = process.argv[2];
if (arg === "--help") {
  showHelp();
  process.exit(0);
}

const limit = arg ?? "3";
if (!/^\d+$/.test(limit)) {
  console.error("Error: limit must be a number");
  process.exit(1);
}

const token = process.env.HF_TOKEN;
const headers: Record<string, string> = token
  ? { Authorization: `Bearer ${token}` }
  : {};

const url = `https://huggingface.co/api/models?limit=${limit}`;

(async () => {
  const res = await fetch(url, { headers });

  if (!res.ok) {
    console.error(`Error: ${res.status} ${res.statusText}`);
    process.exit(1);
  }

  const text = await res.text();
  process.stdout.write(text);
})();
```

### references/baseline_hf_api.sh

```bash
#!/usr/bin/env bash

set -euo pipefail

show_help() {
    cat << EOF
Ultra-simple Hugging Face API example (Shell)

Usage:
  $0 [limit]
  $0 --help

Description:
  Fetches a small list of models from the HF API and prints raw JSON.
  Uses HF_TOKEN for auth if the environment variable is set.

Examples:
  $0
  $0 5
  HF_TOKEN=your_token $0 10
EOF
}

if [[ "${1:-}" == "--help" ]]; then
    show_help
    exit 0
fi

LIMIT="${1:-3}"
if ! [[ "$LIMIT" =~ ^[0-9]+$ ]]; then
    echo "Error: limit must be a number" >&2
    exit 1
fi

headers=()
if [[ -n "${HF_TOKEN:-}" ]]; then
    headers=(-H "Authorization: Bearer ${HF_TOKEN}")
fi

curl -s "${headers[@]}" "https://huggingface.co/api/models?limit=${LIMIT}"
```

### references/hf_enrich_models.sh

```bash
#!/usr/bin/env bash

set -euo pipefail

show_help() {
    cat << 'USAGE'
Stream model IDs on stdin, emit one JSON object per line (NDJSON).

Usage:
  hf_enrich_models.sh [MODEL_ID ...]
  cat ids.txt | hf_enrich_models.sh
  baseline_hf_api.sh 50 | jq -r '.[].id' | hf_enrich_models.sh

Description:
  Reads newline-separated model IDs and fetches basic metadata for each.
  Outputs NDJSON with id, downloads, likes, pipeline_tag, tags.
  Uses HF_TOKEN for auth if the environment variable is set.

Examples:
  hf_enrich_models.sh gpt2 distilbert-base-uncased
  baseline_hf_api.sh 50 | jq -r '.[].id' | hf_enrich_models.sh | jq -s 'sort_by(.downloads)'
  HF_TOKEN=your_token hf_enrich_models.sh microsoft/DialoGPT-medium
USAGE
}

if [[ "${1:-}" == "--help" ]]; then
    show_help
    exit 0
fi

if ! command -v jq >/dev/null 2>&1; then
    echo "Error: jq is required but not installed" >&2
    exit 1
fi

headers=()
if [[ -n "${HF_TOKEN:-}" ]]; then
    headers=(-H "Authorization: Bearer ${HF_TOKEN}")
fi

emit_error() {
    local model_id="$1"
    local message="$2"
    jq -cn --arg id "$model_id" --arg error "$message" '{id: $id, error: $error}'
}

process_id() {
    local model_id="$1"

    if [[ -z "$model_id" ]]; then
        return 0
    fi

    local url="https://huggingface.co/api/models/${model_id}"
    local response
    response=$(curl -s "${headers[@]}" "$url" 2>/dev/null || true)

    if [[ -z "$response" ]]; then
        emit_error "$model_id" "request_failed"
        return 0
    fi

    if ! jq -e . >/dev/null 2>&1 <<<"$response"; then
        emit_error "$model_id" "invalid_json"
        return 0
    fi

    if jq -e '.error' >/dev/null 2>&1 <<<"$response"; then
        emit_error "$model_id" "not_found"
        return 0
    fi

    jq -c --arg id "$model_id" '{
        id: (.id // $id),
        downloads: (.downloads // 0),
        likes: (.likes // 0),
        pipeline_tag: (.pipeline_tag // "unknown"),
        tags: (.tags // [])
    }' <<<"$response" 2>/dev/null || emit_error "$model_id" "parse_failed"
}

if [[ $# -gt 0 ]]; then
    for model_id in "$@"; do
        process_id "$model_id"
    done
    exit 0
fi

if [[ -t 0 ]]; then
    show_help
    exit 1
fi

# `|| [[ -n ... ]]` also processes a final line lacking a trailing newline
while IFS= read -r model_id || [[ -n "$model_id" ]]; do
    process_id "$model_id"
done
```

### references/hf_model_card_frontmatter.sh

```bash
#!/usr/bin/env bash

set -euo pipefail

show_help() {
    cat << 'USAGE'
Fetch Hugging Face model cards via the hf CLI and summarize frontmatter.

Usage:
  hf_model_card_frontmatter.sh [MODEL_ID ...]
  cat ids.txt | hf_model_card_frontmatter.sh

Description:
  Downloads README.md for each model via `hf download`, extracts YAML
  frontmatter, and emits one JSON object per line (NDJSON) with key fields.
  Uses HF_TOKEN if set (passed to the hf CLI).

Output fields:
  id, license, pipeline_tag, library_name, tags, language,
  new_version, has_extra_gated_prompt

Examples:
  hf_model_card_frontmatter.sh openai/gpt-oss-120b
  cat ids.txt | hf_model_card_frontmatter.sh | jq -s '.'
  hf_model_card_frontmatter.sh meta-llama/Meta-Llama-3-8B \
    | jq -s 'map({id, license, has_extra_gated_prompt})'
USAGE
}

if [[ "${1:-}" == "--help" ]]; then
    show_help
    exit 0
fi

if ! command -v hf >/dev/null 2>&1; then
    echo "Error: hf CLI is required but not installed" >&2
    exit 1
fi

if ! command -v python3 >/dev/null 2>&1; then
    echo "Error: python3 is required but not installed" >&2
    exit 1
fi

token_args=()
if [[ -n "${HF_TOKEN:-}" ]]; then
    token_args=(--token "$HF_TOKEN")
fi

tmp_dir=$(mktemp -d)
cleanup() {
    rm -rf "$tmp_dir"
}
trap cleanup EXIT

emit_error() {
    local model_id="$1"
    local message="$2"
    python3 - << 'PY' "$model_id" "$message"
import json
import sys

model_id = sys.argv[1]
message = sys.argv[2]
print(json.dumps({"id": model_id, "error": message}))
PY
}

parse_readme() {
    local model_id="$1"
    local readme_path="$2"

    MODEL_ID="$model_id" README_PATH="$readme_path" python3 - << 'PY'
import json
import os
import sys

model_id = os.environ.get("MODEL_ID", "")
readme_path = os.environ.get("README_PATH", "")

try:
    with open(readme_path, "r", encoding="utf-8") as f:
        lines = f.read().splitlines()
except OSError:
    print(json.dumps({"id": model_id, "error": "readme_missing"}))
    sys.exit(0)

frontmatter = []
in_block = False
for line in lines:
    if line.strip() == "---":
        if in_block:
            break
        in_block = True
        continue
    if in_block:
        frontmatter.append(line)

if not frontmatter:
    print(json.dumps({"id": model_id, "error": "frontmatter_missing"}))
    sys.exit(0)

key = None
out = {}

for line in frontmatter:
    stripped = line.strip()
    if not stripped or line.lstrip().startswith("#"):
        continue

    if ":" in line and not line.lstrip().startswith("- "):
        key_candidate, value = line.split(":", 1)
        key_candidate = key_candidate.strip()
        value = value.strip()
        if key_candidate and all(c.isalnum() or c in "_-" for c in key_candidate):
            key = key_candidate
            if value in ("|", "|-", ">", ">-") or value == "":
                out[key] = None
                continue
            if value.startswith("[") and value.endswith("]"):
                items = [v.strip() for v in value.strip("[]").split(",") if v.strip()]
                out[key] = items
            else:
                out[key] = value
            continue

    if line.lstrip().startswith("- ") and key:
        item = line.strip()[2:]
        if key not in out or out[key] is None:
            out[key] = []
        if isinstance(out[key], list):
            out[key].append(item)

result = {
    "id": model_id,
    "license": out.get("license"),
    "pipeline_tag": out.get("pipeline_tag"),
    "library_name": out.get("library_name"),
    "tags": out.get("tags", []),
    "language": out.get("language", []),
    "new_version": out.get("new_version"),
    "has_extra_gated_prompt": "extra_gated_prompt" in out,
}

print(json.dumps(result))
PY
}

process_id() {
    local model_id="$1"

    if [[ -z "$model_id" ]]; then
        return 0
    fi

    local safe_id
    safe_id=$(printf '%s' "$model_id" | tr '/' '_')
    local local_dir="$tmp_dir/$safe_id"

    if ! hf download "$model_id" README.md --repo-type model --local-dir "$local_dir" "${token_args[@]}" >/dev/null 2>&1; then
        emit_error "$model_id" "download_failed"
        return 0
    fi

    local readme_path="$local_dir/README.md"
    if [[ ! -f "$readme_path" ]]; then
        emit_error "$model_id" "readme_missing"
        return 0
    fi

    parse_readme "$model_id" "$readme_path"
}

if [[ $# -gt 0 ]]; then
    for model_id in "$@"; do
        process_id "$model_id"
    done
    exit 0
fi

if [[ -t 0 ]]; then
    show_help
    exit 1
fi

# `|| [[ -n ... ]]` also processes a final line lacking a trailing newline
while IFS= read -r model_id || [[ -n "$model_id" ]]; do
    process_id "$model_id"
done
```

Originally by Hugging Face, adapted here as an Agent Skills compatible SKILL.md.

This skill follows the Agent Skills open standard, supported by Claude Code, Cursor, Codex, Gemini CLI, and 20+ more editors.
