transformer

The architecture underlying modern LLMs. Uses self-attention to process sequences in parallel, capturing long-range dependencies without recurrence.

Syntax

Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) * V
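The formula above can be sketched directly in NumPy. This is a minimal illustration of scaled dot-product attention, not the optimized implementation real transformer libraries use; the shapes and random inputs are assumptions for the demo.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Row-wise softmax (subtract the max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted average of value rows

# Toy example: 3 tokens, d_k = 4 (shapes chosen for illustration)
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = attention(Q, K, V)
print(out.shape)  # (3, 4): one d_k-dimensional output per token
```

Because the softmax rows sum to 1, each output token is a convex combination of the value vectors, which is what lets every position attend to every other position in parallel.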

Example

# Using a pretrained transformer (HuggingFace):
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
result = classifier("I love learning about AI!")
# [{"label": "POSITIVE", "score": 0.9998}]

# Text generation:
generator = pipeline("text-generation", model="gpt2")
output = generator("Once upon a time", max_length=50)
# output[0]["generated_text"] holds the continuation (prompt included);
# max_length counts prompt tokens too