Second-order data models

Towards generalisation in domain modelling with Pydantic and Outlines

This series aims to incubate a new paradigm of flexible data models for data ingestion and content generation (in fields such as document processing), which I call "second-order data models".

Users of Pydantic call a schema that describes an object through structured definitions (fields with associated data types) a "data model". What, then, would a "data model for data models" look like? I envision it as a more flexible, general form of data model, and I see LLMs as one way to put such models to practical use rather than leaving them as a thought experiment.
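
To make the idea concrete, here is a minimal sketch of one possible reading of "a data model for data models": a Pydantic model whose instances describe the fields of another model, which is then built at runtime with `pydantic.create_model`. The names `FieldSpec`, `ModelSpec`, `build_model`, and `TYPE_MAP` are hypothetical and chosen for illustration only. The sketch does not use Outlines or an LLM, and it is not necessarily the design the rest of the series arrives at.

```python
from typing import Literal

from pydantic import BaseModel, create_model

# Hypothetical mapping from a type label to a Python type (illustrative only).
TYPE_MAP = {"str": str, "int": int, "float": float, "bool": bool}


class FieldSpec(BaseModel):
    """Describes a single field of a first-order data model."""

    name: str
    kind: Literal["str", "int", "float", "bool"]


class ModelSpec(BaseModel):
    """A model whose instances describe other data models."""

    title: str
    field_specs: list[FieldSpec]


def build_model(spec: ModelSpec) -> type[BaseModel]:
    """Turn a validated description into a concrete Pydantic model class."""
    # `...` (Ellipsis) marks each generated field as required.
    definitions = {f.name: (TYPE_MAP[f.kind], ...) for f in spec.field_specs}
    return create_model(spec.title, **definitions)


# The description is itself validated data, so it could come from a file,
# a user, or (as the series suggests) an LLM.
spec = ModelSpec(
    title="Questionnaire",
    field_specs=[
        FieldSpec(name="respondent", kind="str"),
        FieldSpec(name="age", kind="int"),
    ],
)
Questionnaire = build_model(spec)
answer = Questionnaire(respondent="Ada", age=36)
```

The point of the sketch is that the schema becomes data: `ModelSpec` is validated like any other model, and the first-order model (`Questionnaire`) is derived from it rather than written by hand.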

This series introduces the concept of second-order data models and then lays out examples of how they incorporate both structure and flexibility, allowing for both automation and customisation in a wide variety of fields, from blogs to multimodal AI.

Series Overview

What is a second-order data model: An introduction to the concept of second-order data models.
Second-order data models in theory: Exploring some hypothetical examples as case studies.
A general data model for questionnaires: Dynamic schema for flexible document ingestion.