Introduction to HerdingCATs
HerdingCATs is a Python library designed to speed up how data analysts explore and interact with open data sources.
Purpose
The project has three simple aims:
- Navigate the open data ecosystem
- Find the data that you need
- Load that data into a format and/or location for further analysis
Installation
The PyPI package is coming soon.
Once available, you can install with:
pip install HerdCats
poetry add HerdCats
uv add HerdCats
Important Notes
HerdingCATs is currently under active development.
Features will change as the project evolves.
Core Concepts
HerdingCATs follows a Session → Explorer → Loader pattern:
- Create a session to connect to an open data source.
- Use an explorer to browse and find the data you need.
- Use a loader to load the data into your preferred format and/or location.
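The three steps above can be sketched in plain Python. This is a minimal illustration of the pattern only: the `Session`, `Explorer`, and `Loader` classes, their methods, and the example URL are hypothetical stand-ins written for this sketch, not the HerdingCATs API, and a small in-memory dictionary takes the place of a real catalogue connection.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for illustration only; NOT the HerdingCATs API.
@dataclass
class Session:
    """Holds the connection details for one open data catalogue."""

    base_url: str
    # An in-memory mapping of dataset name -> rows stands in for a real connection.
    catalogue: dict = field(default_factory=dict)


class Explorer:
    """Browses the catalogue behind a session."""

    def __init__(self, session: Session) -> None:
        self.session = session

    def search(self, term: str) -> list:
        # Return the names of datasets whose name contains the search term.
        term = term.lower()
        return sorted(name for name in self.session.catalogue if term in name.lower())


class Loader:
    """Loads a dataset found by an explorer into a usable format."""

    def __init__(self, session: Session) -> None:
        self.session = session

    def load(self, name: str) -> list:
        # Return a copy of the rows so callers cannot mutate the catalogue.
        return [dict(row) for row in self.session.catalogue[name]]


# 1. Create a session for a (made-up) data source.
session = Session(
    base_url="https://data.example.org",
    catalogue={
        "air-quality-2023": [{"site": "A", "no2": 21}],
        "bus-stops": [{"stop": "X1"}],
    },
)

# 2. Use an explorer to find the data you need.
explorer = Explorer(session)
matches = explorer.search("air")

# 3. Use a loader to pull the data into a format for analysis.
loader = Loader(session)
rows = loader.load(matches[0])
print(matches, rows)
```

Keeping the session separate from the explorer and loader means one connection can be reused for both browsing and loading, which is the point of the pattern.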
Supported Data Sources
HerdingCATs supports multiple open data sources:
- CKAN - Widely used for open data catalogues
- OpenDataSoft - Popular in Europe, especially for energy-related data catalogues
- Bespoke APIs - Including French Government open data and ONS Nomis
See the Supported Catalogues page for a complete list.
More sources are being added all the time.
If you need a data source that is not listed, please raise an issue.