Series-as-a-Service with httpolars

Seamless DataFrame web IO Polars plugin

  1. Putting web IO in the DataFrame
  2. Worked examples
  3. Development directions
  4. Benchmarking with rate limits

🐻‍❄️📡🧊🧊🧊🧊🧊🛰️

A common scenario in data science is that you want to ingest a dataset served by an API: this is effectively like having a local dataset but holding only its ID field.

To pull the dataset, you therefore start from a set of values, most simply a list of IDs, and 'expand' them with individual API calls (or batched calls, if you're lucky).

Polars is a library for DataFrame operations, both columnar and element-wise: why not treat API calls as just another type of operation?

Part 1 explains how I did exactly that and made a new Polars plugin, httpolars! Part 2 shows usage examples and part 3 contemplates where it could develop next.

In part 4 I investigate what I was most curious about: how fast is httpolars?