ETL (extract, transform, load) processes are essential for unifying disparate data sources. A basic ETL project starts with gathering user requirements. Some questions to ask are:
- What do you hope to learn from this project?
- How will you use the information?
- How often does the data need to be refreshed?
Next, examine the data source. This ETL example uses the ObservePoint API (https://docs.api.observepoint.com/).
The data we are interested in can be organized into tables in a relational database such as Oracle or PostgreSQL (https://www.postgresql.org/). PostgreSQL was chosen for this project because it is free and works well with Python.
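To make the overall flow concrete before the detailed posts, here is a minimal sketch of the three ETL stages in Python. Everything in it is an assumption made for illustration: the sample payload, the `audits` table, and its columns are hypothetical and do not reflect the actual ObservePoint response format. It also uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a database server.

```python
import json
import sqlite3

# Hypothetical payload for illustration only; the real ObservePoint
# response shape is not shown in this post.
SAMPLE_RESPONSE = (
    '[{"id": 1, "name": "Home page audit", "pages": 25},'
    ' {"id": 2, "name": "Checkout audit", "pages": 8}]'
)


def extract(raw_json):
    """Extract: parse the raw JSON text returned by an API call."""
    return json.loads(raw_json)


def transform(records):
    """Transform: flatten each JSON object into a tuple of column values."""
    return [(r["id"], r["name"], r["pages"]) for r in records]


def load(conn, rows):
    """Load: create the (hypothetical) table if needed and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audits "
        "(id INTEGER PRIMARY KEY, name TEXT, pages INTEGER)"
    )
    conn.executemany(
        "INSERT INTO audits (id, name, pages) VALUES (?, ?, ?)", rows
    )
    conn.commit()


def run_pipeline(raw_json):
    """Run all three stages against an in-memory database."""
    conn = sqlite3.connect(":memory:")  # stand-in for a PostgreSQL connection
    load(conn, transform(extract(raw_json)))
    return conn


if __name__ == "__main__":
    connection = run_pipeline(SAMPLE_RESPONSE)
    for row in connection.execute("SELECT id, name, pages FROM audits ORDER BY id"):
        print(row)
```

Keeping extract, transform, and load as separate functions means each stage can later be swapped out on its own, for example replacing the sample string with a real API call or the SQLite connection with a PostgreSQL one.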
Here are all the posts in this ETL series:
- Install PostgreSQL Database
- Simple Data Model
- Create Tables in PostgreSQL Database
- Getting JSON API data with Python
- Inserting JSON data into Postgres with Python