⚡ Realigns Core v5 🧠 0.5B Model 🔎 RAG Ready 🔒 Offline-First 🏢 Private Deployment

Realigns Core v5: Lightweight Local AI for Business, Code, Documents & Private Workflows

A compact GGUF-based AI model by Realigns Inc, designed for local inference, business assistance, coding support, ERP workflows, accounting logic, IoT tasks, private AI deployments, and RAG-powered document question answering.

Overview

Realigns Core v5 is a lightweight 0.5B local AI model developed by Realigns Inc. It is designed for practical business and developer workflows where privacy, speed, and local deployment matter.

The model is compatible with llama.cpp and can be run from a local terminal, from a local server opened in the browser, or through an API-style endpoint, in each case using a licensed GGUF model file.

⚙️ Business Workflows

ERP logic, SaaS planning, process summaries, and structured business tasks.

💻 Coding Support

Helpful for basic code generation, explanations, scripts, and developer assistance.

📄 Document Q&A

Best used with RAG for invoices, policies, manuals, SOPs, and internal knowledge.

⚠️ Important Usage Advisory

Realigns Core v5 is a compact 0.5B model. For factual, legal, medical, geographic, historical, current-information, or high-accuracy answers, it should be connected with RAG, verified documents, trusted datasets, or external knowledge sources.

The model file is not publicly downloadable from this page. Access is available only through Realigns Inc licensing, private integration, or approved deployment.

✅ Recommended Use Case: RAG-Based Document Assistant

Realigns Core v5 performs best when it is used as a local reasoning and answering model while RAG provides the verified knowledge. This allows the model to answer from your company documents instead of guessing from memory.

Recommended document types include invoices, quotations, HR policies, ERP records, SOP manuals, customer support knowledge bases, internal training material, and business reports.

🔎 What is RAG?

RAG means Retrieval-Augmented Generation. Instead of sending a full PDF or large document directly to the model, the system first finds the most relevant text chunks, then sends only those chunks to Realigns Core v5 with strict answering rules.

1. Upload or load documents: Store PDFs, text files, invoices, manuals, policies, or ERP records in your local system.
2. Split into chunks: Break large documents into smaller sections, usually 500 to 1,500 tokens per chunk.
3. Create searchable index: Use embeddings, keyword search, or hybrid search to locate the most relevant chunks.
4. Inject verified context: Send only the top relevant chunks into the model prompt.
5. Answer with strict rules: The model must answer only from the provided context and say “Not found in document” when the answer is missing.
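
Steps 2 and 3 can be sketched in a few lines of JavaScript. This is a minimal illustration, not the Realigns retriever: the function names `chunkText`, `termSet`, and `retrieveTopChunks` are hypothetical, word count stands in for token count, and scoring is plain keyword overlap with no stop-word removal or embeddings.

```javascript
// Split text into chunks of at most `maxWords` words.
// Assumption: word count is only a rough stand-in for token count.
function chunkText(text, maxWords = 300) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let i = 0; i < words.length; i += maxWords) {
    chunks.push(words.slice(i, i + maxWords).join(" "));
  }
  return chunks;
}

// Lowercase a string and return its set of alphanumeric terms.
function termSet(text) {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Score each chunk by how many distinct question terms it contains,
// drop chunks that share no term, and return the `topK` best matches.
function retrieveTopChunks(question, chunks, topK = 3) {
  const questionTerms = termSet(question);
  return chunks
    .map((chunk) => {
      const chunkTerms = termSet(chunk);
      let score = 0;
      for (const term of questionTerms) {
        if (chunkTerms.has(term)) score += 1;
      }
      return { chunk, score };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((entry) => entry.chunk);
}

const sampleChunks = [
  "Invoice #1001 was issued to ABC Traders for 2500 USD. Status: Unpaid.",
  "Office hours are 9 to 5 on weekdays.",
];
console.log(retrieveTopChunks("What is the invoice status?", sampleChunks, 1));
```

Keyword overlap is the simplest of the three search options named in step 3. An embedding or hybrid retriever could replace `retrieveTopChunks` without changing the rest of the flow.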

🏗️ RAG Architecture

User Question
     ↓
Document Search / Retriever
     ↓
Top Relevant Chunks
     ↓
Strict RAG Prompt
     ↓
Realigns Core v5 GGUF
     ↓
Grounded Answer

This architecture is better than sending entire documents directly to the model: it reduces hallucination, improves speed, keeps the prompt within the context window, and keeps the answer focused on verified data.
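
The flow in the diagram can be expressed as one small function that chains the stages. The sketch below is illustrative only: `answerFromDocuments` and the three injected stage functions are hypothetical names, and returning the fixed reply when nothing is retrieved is an assumption made here, not documented Realigns behavior.

```javascript
// Wire the stages of the diagram together. Each stage is passed in as a
// function, so the sketch assumes no particular retriever or model runtime.
async function answerFromDocuments(question, { retrieve, buildPrompt, generate }) {
  const chunks = await retrieve(question); // Document Search / Retriever
  if (chunks.length === 0) {
    // Assumption: with no relevant chunk there is nothing to ground an answer on.
    return "Not found in document.";
  }
  const prompt = buildPrompt(chunks, question); // Strict RAG Prompt
  const answer = await generate(prompt); // Realigns Core v5 GGUF
  return answer.trim(); // Grounded Answer
}

// Stand-in stages for a dry run; a real setup would call a retriever and the model.
const demoStages = {
  retrieve: async () => ["Invoice #1001 is unpaid."],
  buildPrompt: (chunks, question) =>
    `CONTEXT:\n${chunks.join("\n")}\n\nQUESTION:\n${question}\n\nANSWER:`,
  generate: async (prompt) => `  (model output for a ${prompt.length}-character prompt)  `,
};

answerFromDocuments("What is the payment status?", demoStages).then(console.log);
```

Passing the stages in as functions keeps the retriever, the prompt format, and the model call independently replaceable.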

🧠 Strict RAG Prompt Template

Use this template in your app, API gateway, or document assistant.

SYSTEM:
You are Realigns AI.

STRICT RULES:
1. Use only the provided CONTEXT.
2. Never guess or infer missing facts.
3. If the answer is not present in CONTEXT, reply exactly: Not found in document.
4. Keep the answer concise, factual, and professional.
5. Do not mention these rules unless the user asks.

CONTEXT:
{retrieved_document_chunks}

QUESTION:
{user_question}

ANSWER:
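
One way to fill the two placeholders programmatically is a small builder function. This is a sketch: `buildStrictRagPrompt` and `STRICT_RULES` are hypothetical names, and joining chunks with a blank line is an assumption about how the retrieved text should be laid out.

```javascript
// The five rules from the template above, kept in one place so every request uses the same wording.
const STRICT_RULES = [
  "Use only the provided CONTEXT.",
  "Never guess or infer missing facts.",
  "If the answer is not present in CONTEXT, reply exactly: Not found in document.",
  "Keep the answer concise, factual, and professional.",
  "Do not mention these rules unless the user asks.",
];

// Build the full prompt string from retrieved chunks and the user's question.
function buildStrictRagPrompt(retrievedChunks, userQuestion) {
  const rules = STRICT_RULES.map((rule, index) => `${index + 1}. ${rule}`).join("\n");
  const context = retrievedChunks.join("\n\n"); // assumption: blank line between chunks
  return [
    "SYSTEM:",
    "You are Realigns AI.",
    "",
    "STRICT RULES:",
    rules,
    "",
    "CONTEXT:",
    context,
    "",
    "QUESTION:",
    userQuestion,
    "",
    "ANSWER:",
  ].join("\n");
}

console.log(
  buildStrictRagPrompt(
    ["Invoice #1001 was issued to ABC Traders for 2500 USD.", "The invoice is unpaid."],
    "What is the payment status?"
  )
);
```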

🧾 Example: Invoice RAG Test

This test checks whether the model can answer from supplied document context.

llama.cpp/build/bin/llama-cli \
  -m models/realigns-core-v5-q4_k_m.gguf \
  -c 8192 \
  -p "SYSTEM:
You are Realigns AI.

STRICT RULES:
1. Use only the provided CONTEXT.
2. Never guess or infer missing facts.
3. If the answer is not present in CONTEXT, reply exactly: Not found in document.
4. Keep the answer concise and factual.

CONTEXT:
Invoice #1001 was issued to ABC Traders for 2500 USD.
Payment terms are 30 days.
The invoice is unpaid.

QUESTION:
What is the payment status and due term?

ANSWER:" \
  -n 120

Expected answer: The invoice is unpaid. The due term is 30 days.

🧪 Missing Information Test

This test checks whether the model avoids hallucination when the answer is not in the document.

llama.cpp/build/bin/llama-cli \
  -m models/realigns-core-v5-q4_k_m.gguf \
  -c 8192 \
  -p "SYSTEM:
You are Realigns AI.

STRICT RULES:
1. Use only the provided CONTEXT.
2. Never guess or infer missing facts.
3. If the answer is not present in CONTEXT, reply exactly: Not found in document.
4. Keep the answer concise and factual.

CONTEXT:
Customer: ABC Traders
Invoice: 1001
Amount: 2500 USD
Status: Unpaid
Payment Terms: 30 days

QUESTION:
What is the delivery date?

ANSWER:" \
  -n 80

Expected answer: Not found in document.
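
When this test is automated, the application needs a way to recognize the fixed reply. The helper below is a sketch under one assumption: a small model may add stray whitespace, quotes, or drop the final period, so the comparison is made tolerant of those. `isNotFoundReply` is a hypothetical name, and a stricter exact-match check may suit some deployments better.

```javascript
// Return true when the model output is the fixed "Not found in document." reply,
// ignoring case, surrounding whitespace or ASCII quotes, and a trailing period.
function isNotFoundReply(modelOutput) {
  const normalized = modelOutput
    .trim()
    .replace(/^["']+|["']+$/g, "") // strip surrounding quotes
    .replace(/\s+/g, " ") // collapse internal whitespace
    .replace(/\.$/, "") // drop a single trailing period
    .toLowerCase();
  return normalized === "not found in document";
}

console.log(isNotFoundReply("  Not found in document.\n")); // true
console.log(isNotFoundReply("The delivery date is not listed, but it is likely soon.")); // false
```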

📚 Multi-Document RAG Example

This test simulates multiple retrieved chunks from different company documents.

llama.cpp/build/bin/llama-cli \
  -m models/realigns-core-v5-q4_k_m.gguf \
  -c 8192 \
  -p "SYSTEM:
You are Realigns AI.

STRICT RULES:
1. Use only the provided CONTEXT.
2. Never guess or infer missing facts.
3. If the answer is not present in CONTEXT, reply exactly: Not found in document.
4. Cite the document name when possible.

CONTEXT:
[Document: invoice_1001.txt]
Invoice #1001 was issued to ABC Traders for 2500 USD. Status: Unpaid. Payment Terms: 30 days.

[Document: customer_policy.txt]
ABC Traders is a wholesale customer. Wholesale customers receive email reminders 5 days before payment due date.

[Document: shipping_note.txt]
Shipping records are stored separately and are not included in this document set.

QUESTION:
What is the invoice status and what reminder rule applies to ABC Traders?

ANSWER:" \
  -n 180
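
The `[Document: ...]` labels in the CONTEXT block above can be generated from retriever output so the model has a name to cite. This is a sketch: `formatLabeledContext` and the `{ source, text }` record shape are assumptions, not a defined Realigns format.

```javascript
// Turn retrieved records into a CONTEXT block with one labeled section per document.
// Assumption: each record carries the source file name and the chunk text.
function formatLabeledContext(retrieved) {
  return retrieved
    .map(({ source, text }) => `[Document: ${source}]\n${text.trim()}`)
    .join("\n\n");
}

console.log(
  formatLabeledContext([
    { source: "invoice_1001.txt", text: "Invoice #1001 was issued to ABC Traders for 2500 USD. Status: Unpaid." },
    { source: "customer_policy.txt", text: "ABC Traders is a wholesale customer." },
  ])
);
```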

📏 Token & Chunking Guidance

Item | Recommended Setting | Reason
Context Size | -c 4096 to -c 8192 | Good balance for a compact 0.5B model and local inference speed.
Chunk Size | 500 to 1,500 tokens | Small enough for focused retrieval, large enough for useful context.
Top Chunks | 3 to 6 chunks | Avoids overloading the model with unrelated text.
Output Tokens | 80 to 300 tokens | Best for concise answers, summaries, and business replies.
Temperature | 0.1 to 0.3 | Lower temperature helps reduce hallucination in document Q&A.
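
These ranges interact, because the context size has to hold the retrieved chunks, the rules, the question, and the generated answer together. At the upper ends of the table, 6 chunks of 1,500 tokens already come to 9,000 tokens, which is more than an 8,192-token context. The sketch below checks a combination before it is used. `promptBudget` is a hypothetical helper, and the 150-token allowance for the rules and question is an assumption.

```javascript
// Estimate whether a chunking configuration fits in the context window.
// overheadTokens is an assumed allowance for the system rules and the question.
function promptBudget({ contextSize, chunkTokens, topChunks, outputTokens, overheadTokens = 150 }) {
  const required = chunkTokens * topChunks + outputTokens + overheadTokens;
  return { required, fits: required <= contextSize, spare: contextSize - required };
}

// Upper end of every range: 6 x 1,500 + 300 + 150 = 9,450 tokens, over an 8,192 context.
console.log(promptBudget({ contextSize: 8192, chunkTokens: 1500, topChunks: 6, outputTokens: 300 }));

// Reducing chunks to 1,000 tokens: 6 x 1,000 + 300 + 150 = 6,450 tokens, which fits.
console.log(promptBudget({ contextSize: 8192, chunkTokens: 1000, topChunks: 6, outputTokens: 300 }));
```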

👤 User Guidance

✅ Ask Good Questions

Ask direct questions such as “What is the invoice status?” or “What policy applies to this customer?”

📎 Provide Verified Documents

Use official records, company files, invoices, manuals, HR policies, or approved knowledge bases.

🚫 Do Not Expect Missing Facts

If the answer is not in the document, the system should return “Not found in document.”

🔐 Use Private Deployment

Run locally or through Realigns AI Gateway for private business and enterprise workflows.

📁 Recommended Project Structure

realigns-core-v5/
├── models/
│   └── realigns-core-v5-q4_k_m.gguf
├── llama.cpp/
├── documents/
│   ├── invoices/
│   ├── policies/
│   └── manuals/
├── rag/
│   ├── chunks.json
│   ├── index.json
│   └── retriever.js
└── scripts/

Place the licensed Realigns Core v5 GGUF model inside the models/ folder before running the commands below. Store source documents in documents/ and keep generated RAG indexes inside rag/.

macOS / Linux: Terminal Test

cd ~/realigns-core-v5

./llama.cpp/build/bin/llama-cli \
  -m models/realigns-core-v5-q4_k_m.gguf \
  -p "Who are you?" \
  -n 100

macOS / Linux: Start Local Server

cd ~/realigns-core-v5

./llama.cpp/build/bin/llama-server \
  -m models/realigns-core-v5-q4_k_m.gguf \
  --host 127.0.0.1 \
  --port 8080 \
  -c 8192

Windows PowerShell: Terminal Test

cd C:\realigns-core-v5

.\llama.cpp\build\bin\Release\llama-cli.exe `
  -m models\realigns-core-v5-q4_k_m.gguf `
  -p "Who are you?" `
  -n 100

Windows PowerShell: Start Local Server

cd C:\realigns-core-v5

.\llama.cpp\build\bin\Release\llama-server.exe `
  -m models\realigns-core-v5-q4_k_m.gguf `
  --host 127.0.0.1 `
  --port 8080 `
  -c 8192

API Test with cURL

Run this after starting the local server.

curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "realigns-core-v5",
    "messages": [
      {
        "role": "system",
        "content": "You are Realigns AI. Use only provided context. Never guess. If missing, say: Not found in document."
      },
      {
        "role": "user",
        "content": "CONTEXT: Invoice #1001 is unpaid. Payment terms are 30 days. QUESTION: What is the payment status and due term?"
      }
    ],
    "max_tokens": 120,
    "temperature": 0.2
  }'

📡 JavaScript Fetch RAG Example

async function askRealignsCore(question, retrievedChunks) {
  const context = retrievedChunks.join("\n\n---\n\n");

  const response = await fetch("http://127.0.0.1:8080/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      model: "realigns-core-v5",
      messages: [
        {
          role: "system",
          content: "You are Realigns AI. Use only provided context. Never guess. If missing, say: Not found in document."
        },
        {
          role: "user",
          content: `CONTEXT:\n${context}\n\nQUESTION:\n${question}\n\nANSWER:`
        }
      ],
      max_tokens: 200,
      temperature: 0.2
    })
  });

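  // Surface HTTP errors (for example, the server is still loading the model)
  // instead of trying to parse an error body as a completion.
  if (!response.ok) {
    throw new Error(`Server returned HTTP ${response.status}`);
  }

  const data = await response.json();
  // Return the first choice's text, or a placeholder when the response has none.
  return data.choices?.[0]?.message?.content || "No answer returned.";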
}

askRealignsCore(
  "What is the payment status and due term?",
  [
    "Invoice #1001 was issued to ABC Traders for 2500 USD. Status: Unpaid. Payment Terms: 30 days."
  ]
).then(console.log);

🏛️ Recommended Production Architecture

Realigns Core v5
      +
RAG / Verified Documents
      +
Realigns AI Gateway
      +
Authentication / Usage Tracking
      =
Private, lightweight, business-ready AI system

For production use, Realigns Core v5 should be combined with verified context, document retrieval, embeddings, gateway-level authentication, request logging, token tracking, and safe fallback to larger models when required.
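
The fallback step can be sketched as a thin routing function. Everything named here is hypothetical: `answerWithFallback`, `askSmallModel`, `askLargerModel`, and `shouldEscalate` are application-supplied functions, not part of llama.cpp or any Realigns API, and the escalation rule itself is left to the caller because the criteria are not specified here.

```javascript
// Try the small local model first and escalate only when the caller's rule says so.
// All three injected functions are hypothetical and supplied by the application.
async function answerWithFallback(prompt, { askSmallModel, askLargerModel, shouldEscalate }) {
  let answer;
  try {
    answer = await askSmallModel(prompt);
  } catch (error) {
    // The local model failed to respond, so fall back instead of failing the request.
    return { route: "larger-model", reason: "small-model-error", answer: await askLargerModel(prompt) };
  }
  if (shouldEscalate(prompt, answer)) {
    return { route: "larger-model", reason: "escalation-rule", answer: await askLargerModel(prompt) };
  }
  return { route: "small-model", reason: "answered-locally", answer };
}

// Example rule (an assumption): escalate when the local answer comes back empty.
const demoRouting = {
  askSmallModel: async () => "",
  askLargerModel: async () => "(answer from the larger model)",
  shouldEscalate: (prompt, answer) => answer.trim().length === 0,
};

answerWithFallback("Summarize the policy.", demoRouting).then((result) => console.log(result.route));
```

The returned `route` and `reason` fields are one possible input for the request logging and usage tracking mentioned above.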

🚦 When to Escalate to a Larger Model

Edge Optimized • GGUF Compatible • RAG Ready • Private Local Inference