# Advanced Docling Options Reference

## Pipeline Options

### Full Configuration Example

```python
from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import (
    PdfPipelineOptions,
    EasyOcrOptions,
    TesseractOcrOptions,
    TableFormerMode,
)
from docling.datamodel.accelerator_options import AcceleratorDevice, AcceleratorOptions

pipeline = PdfPipelineOptions()

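# OCR Settings
pipeline.do_ocr = True
pipeline.ocr_options = EasyOcrOptions(
    # EasyOCR pairs its Chinese, Japanese, and Korean models only with English,
    # so request one of them at a time (use "ch_sim" or "ch_tra" for Chinese).
    lang=["en", "ja"],
    confidence_threshold=0.5,
    use_gpu=True,
)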

# Table Settings
pipeline.do_table_structure = True
pipeline.table_structure_options.mode = TableFormerMode.ACCURATE
pipeline.table_structure_options.do_cell_matching = True

# Feature Enrichment
pipeline.do_code_enrichment = True
pipeline.do_formula_enrichment = True

# Image Generation
pipeline.generate_page_images = True
pipeline.generate_picture_images = True

# Hardware Acceleration
pipeline.accelerator_options = AcceleratorOptions(
    num_threads=4,
    device=AcceleratorDevice.AUTO,  # AUTO picks a device; or force CPU, CUDA, MPS
)

converter = DocumentConverter(
    format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline)}
)
```
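
Once a converter is configured, running it is a short step. The helper below is a minimal sketch, not part of Docling: `convert_to_markdown` and its `out_dir` parameter are hypothetical names introduced for illustration. It assumes only that the object passed in behaves like `DocumentConverter`, that is, `convert()` returns a result whose `document` offers `export_to_markdown()`.

```python
from pathlib import Path


def convert_to_markdown(converter, source, out_dir="results"):
    """Convert one document and write its Markdown export into out_dir.

    `converter` is expected to behave like docling's DocumentConverter.
    This helper is illustrative and not part of the library.
    """
    result = converter.convert(source)
    markdown = result.document.export_to_markdown()

    # Name the output after the source file, e.g. report.pdf -> report.md
    out_path = Path(out_dir) / (Path(str(source)).stem + ".md")
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(markdown, encoding="utf-8")
    return out_path
```

With the configuration above, `convert_to_markdown(converter, "document.pdf")` would write `results/document.md`. Taking the converter as a parameter keeps one configured instance reusable across many files.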

## OCR Language Codes

### EasyOCR Common Languages
- `en` - English
- `ja` - Japanese
- `ch_sim` - Chinese (Simplified)
- `ch_tra` - Chinese (Traditional)
- `ko` - Korean
- `de` - German
- `fr` - French
- `es` - Spanish
- `it` - Italian
- `pt` - Portuguese
- `ru` - Russian
- `ar` - Arabic
- `th` - Thai
- `vi` - Vietnamese
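
Not every combination of these codes can be loaded into one EasyOCR reader. To my understanding, the Chinese (`ch_sim`, `ch_tra`), Japanese, Korean, and Thai models can each be paired only with English. The check below is a rough sketch under that assumption; `check_easyocr_langs` is a hypothetical helper, not an EasyOCR or Docling function, and it does not cover restrictions that apply to other scripts.

```python
# Codes assumed to be combinable only with "en" in a single EasyOCR reader.
ENGLISH_ONLY = {"ch_sim", "ch_tra", "ja", "ko", "th"}


def check_easyocr_langs(langs):
    """Return a list of problems found in an EasyOCR language list.

    An empty list means no conflict was detected by this (incomplete) check.
    """
    unique = list(dict.fromkeys(langs))  # drop duplicates, keep order
    exclusive = [code for code in unique if code in ENGLISH_ONLY]
    others = [code for code in unique if code not in ENGLISH_ONLY and code != "en"]

    problems = []
    if len(exclusive) > 1:
        problems.append(
            f"{exclusive} cannot share one reader; use one of them plus 'en'"
        )
    if exclusive and others:
        problems.append(
            f"'{exclusive[0]}' can only be combined with 'en', not {others}"
        )
    return problems
```

Running the check before constructing `EasyOcrOptions(lang=...)` surfaces a bad combination early instead of at model load time. For documents that mix such scripts, one option is to run separate conversions with different language lists.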

## Multi-Format Converter

```python
from docling.document_converter import (
    DocumentConverter,
    PdfFormatOption,
    WordFormatOption,
    ExcelFormatOption,
    PowerpointFormatOption,
    MarkdownFormatOption,
    HTMLFormatOption,
)
from docling.datamodel.base_models import InputFormat

# `pipeline` is the PdfPipelineOptions instance built in the full configuration example
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline),
        InputFormat.DOCX: WordFormatOption(),
        InputFormat.XLSX: ExcelFormatOption(),
        InputFormat.PPTX: PowerpointFormatOption(),
        InputFormat.MD: MarkdownFormatOption(),
        InputFormat.HTML: HTMLFormatOption(),
    }
)
```

## Image Export Modes

```python
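from docling_core.types.doc import ImageRefMode

# `converter` is a DocumentConverter configured as in the examples above
doc = converter.convert("document.pdf").document

# Placeholder: a marker (`<!-- image -->` by default) replaces each image
doc.save_as_markdown("out_placeholder.md", image_mode=ImageRefMode.PLACEHOLDER)

# Embedded: images are Base64 encoded inside the output file
doc.save_as_html("out.html", image_mode=ImageRefMode.EMBEDDED)

# Referenced: images are written as separate files and linked from the output
doc.save_as_markdown("out_referenced.md", image_mode=ImageRefMode.REFERENCED)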
```

## VLM Pipeline (Vision Language Model)

```bash
# Run the VLM pipeline with the GraniteDocling model instead of the standard PDF pipeline
docling --pipeline vlm --vlm-model granite_docling document.pdf
```

## CLI Full Options

```bash
docling document.pdf \
  --ocr \
  --ocr-engine easyocr \
  --table-mode accurate \
  --device auto \
  --output ./results \
  --to md
```
