The Enterprise RAG Architect

What Is This Program?

Students move beyond simple "chat with a PDF" scripts to build a production-ready Knowledge Engine. The program focuses on handling high-volume, unstructured document sets (10,000+ pages) while maintaining high accuracy.

Who Is It For?

Data engineers and AI practitioners aged 16+ building scalable enterprise knowledge solutions.

What Will I Learn?

Vector Databases (Pinecone, Milvus), Data Ingestion (Unstructured.io), Orchestration (LlamaIndex), and Evaluation (Ragas/TruLens).

Pre-Requisites

JSON & Dicts, API Fundamentals, Basic Database Logic.

Dates

Session 1: July 6 - July 17, 2026
Session 2: August 3 - August 14, 2026 (EN)

Language

This course is offered in both English and Chinese. Check the session dates for your preferred language.


Tuition & Fees

Tuition: NT$XX,000
Deposit: NT$2,000
Early Bird Deal: Save 15% (Book by March 1st)

Core Tech Stack

Vector Databases

Pinecone, Milvus, or Qdrant for scalable, high-speed similarity search.

Data Ingestion

Unstructured.io or Docling for parsing complex PDFs, tables, and nested headers.

Orchestration

LlamaIndex (specialized for data-heavy RAG) or LangChain.

Evaluation

Ragas or TruLens to mathematically score retrieval accuracy and faithfulness.

Curriculum

Week 1: Data Engineering & Indexing

Day | Topic | Hands-On Activity
01 | Data Ingestion | The Parser Lab: Using Unstructured.io to extract clean text.
02 | Semantic Chunking | Context-Aware Splitting: Recursive Character Splitting.
03 | Metadata Enrichment | The Tagging Engine: Tagging chunks with source data.
04 | Table & Image Parsing | Vision-RAG: Converting diagrams/tables to text summaries.
05 | The Vector Infrastructure | Scaling the Index: Bulk upserts of 10k+ documents.
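
To give a feel for the Day 02 activity, here is a minimal sketch of recursive character splitting in plain Python. It is an illustration only, not the course material or any library's implementation: the function name `recursive_split`, the `max_chars` limit, and the separator order are all assumptions made for this example. The idea is to try coarse separators (paragraph breaks) first and fall back to finer ones (lines, sentences, words) only when a piece is still too long.

```python
def recursive_split(text, max_chars=200, separators=("\n\n", "\n", ". ", " ")):
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper for illustration: coarse separators are tried
    first, and finer ones are used only on pieces that are still too long.
    """
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []

    for i, sep in enumerate(separators):
        if sep not in text:
            continue
        chunks, current = [], ""
        for part in text.split(sep):
            # Try to pack this part into the chunk being built.
            candidate = part if not current else current + sep + part
            if len(candidate) <= max_chars:
                current = candidate
                continue
            # The part does not fit: close the current chunk first.
            if current:
                chunks.append(current)
                current = ""
            if len(part) <= max_chars:
                current = part
            else:
                # Still too long: recurse with the finer separators only.
                chunks.extend(recursive_split(part, max_chars, separators[i + 1:]))
        if current:
            chunks.append(current)
        return [c.strip() for c in chunks if c.strip()]

    # No separator applies: fall back to a hard cut every max_chars characters.
    return [text[j:j + max_chars] for j in range(0, len(text), max_chars)]


sample = "Alpha beta gamma. Delta epsilon.\n\nZeta eta theta iota kappa lambda mu."
for chunk in recursive_split(sample, max_chars=20):
    print(repr(chunk))
```

Note that this simplified version drops the separator at each chunk boundary and adds no overlap between chunks; production splitters usually offer both options.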

Week 2: Advanced Retrieval & Production

Day | Topic | Hands-On Activity
06 | Hybrid Search | Keyword + Semantic: Combining Vector Search with BM25.
07 | Hierarchical Indexing | Parent-Child Retrieval: Searching small chunks, feeding larger context.
08 | Query Transformation | HyDE & Multi-Query: Expanding user questions.
09 | Production Optimization | Reranking & Caching: Cross-Encoders and Redis/GPTCache.
10 | Evaluation & Grounding | The Accuracy Audit: Using Ragas or TruLens.
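
As a taste of the Day 06 topic, the sketch below shows one common way to combine a keyword (BM25) ranking with a vector-search ranking: reciprocal rank fusion (RRF). This is an assumption chosen for illustration; the course may teach a different fusion method, such as weighted score blending. The function name, the document IDs, and the constant `k=60` are hypothetical example values, and the two input lists stand in for results that a real BM25 index and vector database would return.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending fused score; break ties by ID for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for the same query from two retrievers.
bm25_hits = ["doc_a", "doc_b", "doc_c"]    # keyword search
vector_hits = ["doc_c", "doc_a", "doc_d"]  # semantic search

print(reciprocal_rank_fusion([bm25_hits, vector_hits]))
```

Because RRF uses only rank positions, it avoids having to normalize BM25 scores and cosine similarities onto a shared scale.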