AI Security | Miguel Lameiro | Cybersecurity Blog & Security Writeups

This section contains practical notes on building, testing, monitoring, and defending LLM applications.

The focus is on LLMOps workflows, automated evaluation, guardrails, safety metrics, RAG controls, and red teaming techniques for AI-powered systems.

The goal is to keep the notes implementation-oriented: useful patterns, common failure modes, and security controls that can be applied when designing or assessing LLM applications.

The notes here are distilled from my Cyber AI Security study.

Full repository (expanded notes, examples, and supporting material): https://github.com/lameiro0x/cyber-ai-security-notes

LLMOps Pipeline: From Dataset Preparation to Safe Prediction

Introduction: LLMOps is the operational layer around LLM applications. It is not only model deployment. A useful LLM workflow also needs data preparation, artifact versioning, orchestration, prompt consistency, endpoint management, safety checks, and monitoring. The main lesson from this course is that an LLM application becomes production-ready only when the surrounding system is controlled. The model is one component. The data pipeline, prompt format, evaluation split, deployment strategy, and response metadata are just as important. ...

Automated Evals for LLMOps: Testing LLM Apps in CI

Introduction: Traditional software tests usually compare a known input with a predictable output. LLM applications are different because the output is generated, variable, and sometimes correct in more than one form. That does not mean LLM apps cannot be tested. It means the test suite needs several layers: deterministic checks where possible, model-graded evaluations where judgment is required, hallucination checks against known context, and CI automation so regressions are caught before release. ...
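As a minimal sketch of the deterministic layer mentioned above, the function below checks a generated answer without calling any model, so it can run in CI like an ordinary unit test. The name `check_answer`, the required-term list, the length limit, and the "system prompt" rule are illustrative assumptions, not taken from the course material.

```python
import re
from typing import List


def check_answer(answer: str, required_terms: List[str], max_chars: int = 600) -> List[str]:
    """Return the list of failed deterministic checks (empty list means pass)."""
    failures = []
    if not answer.strip():
        failures.append("empty answer")
    if len(answer) > max_chars:
        failures.append("answer too long")
    lowered = answer.lower()
    # Each required term must appear somewhere in the answer (case-insensitive).
    for term in required_terms:
        if term.lower() not in lowered:
            failures.append(f"missing term: {term}")
    # Hypothetical rule: the app should never talk about its system prompt.
    if re.search(r"system prompt", lowered):
        failures.append("mentions system prompt")
    return failures
```

Checks like these only cover what can be stated as a rule. Answers that can be correct in more than one form still need the model-graded layer.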

Measuring Quality and Safety in LLM Applications

Introduction: Before adding controls to an LLM application, you need to know what is happening. Quality and safety measurement gives you that visibility. This course focuses on metrics and monitoring rather than runtime blocking. It uses chat datasets, WhyLogs, LangKit, custom UDFs, model-based scoring, and active monitoring patterns to inspect hallucinations, data leakage, toxicity, refusals, and prompt injection. The useful lesson is practical: do not treat “the model seems fine” as evidence. Log the interactions, compute signals, inspect critical examples, and evaluate filtered subsets. ...
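The "compute signals, then inspect filtered subsets" idea can be sketched in plain Python, without the WhyLogs or LangKit APIs. The phrase list, the function names, and the log format (a list of dicts with `prompt` and `response` keys) are assumptions made for illustration. A phrase match is only a rough proxy for a refusal.

```python
from typing import Dict, List

# Assumed marker phrases; a real deployment would tune or replace this list.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "i am unable")


def is_refusal(response: str) -> bool:
    """Flag a response that contains any of the assumed refusal phrases."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def flag_refusals(interactions: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the subset of logged interactions whose response looks like a refusal."""
    return [row for row in interactions if is_refusal(row["response"])]


def refusal_rate(interactions: List[Dict[str, str]]) -> float:
    """Fraction of logged interactions flagged as refusals (0.0 for an empty log)."""
    if not interactions:
        return 0.0
    return len(flag_refusals(interactions)) / len(interactions)
```

The rate gives a number to track over time. The flagged subset is what gets read by hand.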

Red Teaming LLM Applications: A Practical Assessment Workflow

Introduction: Red teaming an LLM application is not the same thing as checking whether the base model passed a benchmark. The deployed application has prompts, retrieval, tools, business rules, memory, hidden context, and user workflows. Those layers create risks that do not exist in the foundation model alone. The course uses two demo applications: a banking assistant and an ebook store support bot. The useful pattern is not the specific brand names or prompts. The useful pattern is the assessment workflow: define scope, probe manually, automate repeatable checks, use scanners where they help, and connect successful attacks to real application impact. ...

Building Guardrails for RAG Applications

Introduction: RAG applications are easy to prototype and hard to make reliable. A chatbot can retrieve documents, pass them to an LLM, and answer questions in a few lines of code. The hard part is making sure it does not invent unsupported details, drift off-topic, leak personal data, or violate business rules. This course frames guardrails as a secondary validation layer around LLM inputs and outputs. Prompting, fine-tuning, RLHF, and RAG help, but they do not remove the need for runtime checks. ...
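One way to picture a "secondary validation layer around LLM inputs and outputs" is a thin wrapper that runs checks before and after the model call. This is a hedged sketch, not the course's implementation: `guarded_call`, the `Validator` type, and the fallback message are hypothetical names, and the `llm` argument stands in for whatever client function the application actually uses.

```python
from typing import Callable, List

# A validator returns True when the text is acceptable.
Validator = Callable[[str], bool]


def guarded_call(
    llm: Callable[[str], str],
    prompt: str,
    input_checks: List[Validator],
    output_checks: List[Validator],
    fallback: str = "Sorry, I can't help with that request.",
) -> str:
    """Run input checks, call the model, then run output checks."""
    # Reject the request before spending a model call on it.
    if not all(check(prompt) for check in input_checks):
        return fallback
    answer = llm(prompt)
    # Validate the generated answer before it reaches the user.
    if not all(check(answer) for check in output_checks):
        return fallback
    return answer
```

Passing the model in as a plain callable keeps the wrapper testable with a stub, so the guardrail logic can be exercised without network access.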

Practical LLM Guardrails: Hallucination, Topic, PII, and Competitor Controls

Introduction: The guardrails course becomes most useful when it moves from a simple keyword detector to specialized validators. The material covers four practical controls: hallucination detection with Natural Language Inference; topic restriction with zero-shot classification; PII detection with Microsoft Presidio; and competitor mention detection with exact matching, NER, and vector similarity. Each control protects against a different failure mode. Together they show the right engineering pattern: use small, task-specific models and validators around the LLM instead of expecting the LLM to police itself. ...
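The exact-matching tier of competitor detection is the simplest of the techniques listed above and can be sketched with the standard library alone. The function name `find_competitors` and the competitor names in the usage note are made up for illustration. The NER and vector-similarity tiers are not shown here.

```python
import re
from typing import List


def find_competitors(text: str, competitors: List[str]) -> List[str]:
    """Return the competitor names that appear in text as whole words (case-insensitive)."""
    found = []
    for name in competitors:
        # re.escape keeps punctuation in a name from being read as regex syntax;
        # \b avoids matching the name inside a longer word.
        pattern = r"\b" + re.escape(name) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            found.append(name)
    return found
```

For example, `find_competitors("Have you tried Acme Pizza instead?", ["Acme Pizza", "Bravo Bistro"])` returns `["Acme Pizza"]`. Exact matching misses misspellings and indirect references, which is presumably why the course pairs it with NER and vector similarity.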

OWASP Top 10 for LLM Applications: A Practical Security Guide

Introduction: The OWASP Top 10 for LLM Applications is useful because it moves the conversation beyond “the model said something wrong.” In real systems, an LLM is connected to prompts, RAG, vector databases, tools, APIs, logs, users, permissions, providers, and business workflows. That is where the risk lives. A bad response is a quality problem. A bad response that triggers a tool call, leaks internal context, writes to a ticketing system, executes generated SQL, or retrieves another user’s documents becomes a security problem. ...