Ollama (Local)
Ollama enables running AI models locally on your machine for private, offline security testing.
📸 SCREENSHOT: ollama-model-select.png
Ollama local model selection
Overview
Ollama advantages:
- Complete data privacy
- No API costs
- Offline operation
- No rate limits
- Full control over models
Installation
macOS
```bash
brew install ollama
```
Linux
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
Windows
Download from ollama.com/download
Starting Ollama
Start Server
```bash
ollama serve
```
The server runs on http://localhost:11434 by default.
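To confirm the API is reachable, you can query the version endpoint (the version shown in the output is only illustrative):

```bash
# Should return a small JSON document, e.g. {"version":"0.5.7"}
curl http://localhost:11434/api/version
```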
Verify Installation
```bash
ollama --version
```
Available Models
Recommended Models
| Model | Size | Best For |
|---|---|---|
| llama3.3:70b | 40GB | Best quality, requires high-end GPU |
| llama3.3 | 4.7GB | Good balance of quality and speed |
| codellama:34b | 19GB | Code analysis |
| mistral | 4.1GB | Fast general purpose |
| mixtral:8x7b | 26GB | High quality, efficient |
| deepseek-coder:33b | 19GB | Code security review |
| qwen2.5:72b | 41GB | Excellent reasoning |
Pull Models
```bash
# Recommended for security testing
ollama pull llama3.3

# For code review
ollama pull codellama:34b

# Lightweight option
ollama pull mistral
```
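Once a pull finishes, you can sanity-check what was downloaded; `ollama show` prints the model's parameters, template, and license:

```bash
ollama show llama3.3
```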
List Installed Models
```bash
ollama list
```
Configuration
Basic Setup
{ "provider": { "ollama": { "options": { "baseURL": "http://localhost:11434" } } }, "model": "ollama/llama3.3"}Custom Host
For a remote Ollama server:
{ "provider": { "ollama": { "options": { "baseURL": "http://192.168.1.100:11434" } } }}Usage
Usage

Command Line
```bash
cyberstrike --model ollama/llama3.3
```
In-Session
```bash
/model
# Select Ollama model
```
Model Configuration
Context Length
Increase the context window:
```bash
ollama run llama3.3 --num-ctx 8192
```
Or in a Modelfile:
```
FROM llama3.3
PARAMETER num_ctx 8192
```
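The context size can also be set per request through Ollama's HTTP API `options` field; a minimal sketch (the prompt is only a placeholder):

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.3",
  "prompt": "Summarize common XSS vectors.",
  "stream": false,
  "options": { "num_ctx": 8192 }
}'
```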
GPU Layers

Control GPU usage:
```bash
OLLAMA_NUM_GPU=999 ollama serve  # Use all GPU layers
OLLAMA_NUM_GPU=0 ollama serve    # CPU only
```
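GPU offload can also be adjusted per request through the API's `num_gpu` option (the number of layers to offload; 0 means CPU only). A sketch, with an illustrative prompt:

```bash
# Force CPU-only inference for a single request
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.3",
  "prompt": "test",
  "stream": false,
  "options": { "num_gpu": 0 }
}'
```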
Memory Management
```bash
# Limit VRAM usage
OLLAMA_MAX_VRAM=8G ollama serve
```
Custom Models
Create Modelfile
```
FROM llama3.3

SYSTEM """You are a security testing assistant specialized in:
- Web application vulnerabilities
- Code security review
- Network penetration testing

Always follow OWASP guidelines and report findings with:
- Vulnerability name
- Severity (Critical/High/Medium/Low)
- Evidence
- Remediation steps"""

PARAMETER temperature 0.7
PARAMETER num_ctx 8192
```
Build Custom Model
```bash
ollama create security-assistant -f Modelfile
```
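Before pointing cyberstrike at it, you can smoke-test the new model directly in Ollama (the prompt is just an example):

```bash
ollama run security-assistant "List the OWASP Top 10 categories with one-line descriptions."
```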
Use Custom Model
```bash
cyberstrike --model ollama/security-assistant
```
Performance Optimization
Hardware Requirements
| Model Size | Min RAM | Recommended GPU |
|---|---|---|
| 7B | 8GB | 8GB VRAM |
| 13B | 16GB | 12GB VRAM |
| 34B | 32GB | 24GB VRAM |
| 70B | 64GB | 48GB VRAM |
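Before committing to a large model, it is worth checking what the machine actually has available; a quick check on Linux with an NVIDIA GPU (adjust for your platform):

```bash
# System RAM
free -h

# GPU VRAM (NVIDIA only)
nvidia-smi --query-gpu=name,memory.total --format=csv
```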
Quantization
Use quantized models to reduce memory usage:
```bash
ollama pull llama3.3:q4_0  # 4-bit quantization
ollama pull llama3.3:q8_0  # 8-bit quantization
```
Parallel Requests
Enable concurrent requests:
```bash
OLLAMA_NUM_PARALLEL=4 ollama serve
```
Offline Usage
Pre-download Models
```bash
ollama pull llama3.3
ollama pull codellama
```
Air-Gapped Systems
- Download models on a connected system
- Copy `~/.ollama/models/` to the air-gapped system (see the sketch below)
- Run Ollama without internet access
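A minimal sketch of the transfer, assuming removable media mounted at `/media/usb` (paths are hypothetical):

```bash
# On the connected machine
cp -r ~/.ollama/models /media/usb/ollama-models

# On the air-gapped machine
mkdir -p ~/.ollama
cp -r /media/usb/ollama-models ~/.ollama/models
```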
Docker Deployment
Run in Container
```bash
docker run -d \
  --gpus all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama
```
Pull Models in Container
```bash
docker exec -it ollama ollama pull llama3.3
```
API Compatibility
Ollama exposes an OpenAI-compatible API:
{ "provider": { "openai": { "options": { "baseURL": "http://localhost:11434/v1", "apiKey": "ollama" } } }, "model": "openai/llama3.3"}Troubleshooting
Troubleshooting

Connection Refused
```
Error: Connection refused
```
Ensure Ollama is running:
```bash
ollama serve
```
Out of Memory
```
Error: CUDA out of memory
```
Solutions:
- Use a smaller model
- Use a quantized version
- Reduce the context length
- Set `OLLAMA_NUM_GPU=0` to run on CPU only
Slow Performance
- Enable GPU acceleration (verify with the check below)
- Use quantized models
- Increase `OLLAMA_NUM_PARALLEL`
- Check for thermal throttling
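On recent Ollama versions, `ollama ps` shows whether a loaded model is running on the GPU or has spilled over to the CPU (see the PROCESSOR column):

```bash
ollama ps
```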
Tip
For best results, use `llama3.3:70b` with a high-end GPU. For resource-limited systems, `mistral` provides good quality with lower requirements.
Related Documentation
- Providers Overview - All providers
- Custom Providers - OpenAI-compatible setup
- Configuration - Full options