Now that the table has been created and we have inspected the JSON data returned by the Observepoint API we are finally ready to load the data. First up are a couple of functions used to open and close the database connection. The database connection object will be used in the function created to insert the data. Information used to put together these functions was gathered from observepoint API documentation, postgresqltutorial.com, and the standard psycopg2 documentation.
Opening and closing the connection:
#module name: queryFunc.py
import psycopg2
def openConn():
conn = None
try:
# connect to the PostgreSQL server
print('Connecting to the PostgreSQL database...')
conn = psycopg2.connect(host="localhost",database="YOURDATABASE", user="YOURUSERNAME", password="YOURPASSWORD")
cur = conn.cursor()
cur.execute('SELECT version()')
print('CONNECTION OPEN:PostgreSQL database version:')
# display the PostgreSQL database server version
db_version = cur.fetchone()
print(db_version)
except (Exception, psycopg2.DatabaseError) as error:
print(error)
return conn
def closeConn(conn):
try:
conn.close()
print('Closed connection to the PostgreSQL database')
except (Exception, psycopg2.DatabaseError) as error:
print(error)
finally:
if conn is not None:
conn.close()
print('Database connection closed.')
Loading the data, function used to insert each line:
"""insertJourney()START Connect to the PostgreSQL database server """
try:
sql = """
INSERT INTO journeys (jname, lastcheck, queued, folderid, journeyoptions,
webjourneyrunning, journeyid, status, screenshot, createdat,
domainId, actions, nextcheck, userid, emails)
VALUES(%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
"""
# create a cursor
cur = c.cursor()
cur.execute(sql, j)
c.commit()
#print('inserted')
except (Exception, psycopg2.DatabaseError) as error:
print(error)
Loading the data, function used to call Observepoint and insert each line with the insertJourney function while parsing the returned JSON. The non string column values are formatted with Python so PostgreSQL understands the insert:
print(':loadJourneys() start')
conn = http.client.HTTPSConnection("api.observepoint.com")
payload = "{}"
headers = { 'authorization': "api_key YOURAPIKEY" }
conn.request("GET", "/v2/web-journeys", payload, headers)
res = conn.getresponse()
data = res.read()
de_data = data.decode("utf-8")
jdata = json.loads(de_data)
for rownum, drow in enumerate(jdata, 1):
insertvalues = []
#print('drow:' + str(rownum), end=' ')
#ilist is the list of items to include
ilist=['name','lastCheck','queued','folderId','options','webJourneyRunning','id','status','screenshot','createdAt','domainId','actions','nextCheck','userId','emails']
#dtlist is for dates
dtlist=['lastCheck','createdAt','nextCheck']
#dumplist is for dict
dumplist=['options']
for dfield in drow:
if dfield in ilist:
if dfield in dtlist:
insertvalues.append(drow.get(dfield)[0:-10])
elif dfield in dumplist:
json_string = json.dumps(drow.get(dfield))
insertvalues.append(json_string)
else:
insertvalues.append(drow.get(dfield))
#print('insertvalues:', end='')
#print(insertvalues)
queryFunc.insertJourney(c,insertvalues)
print(':loadJourneys() end')
This is a sample of what it looks like in pgadmin:
All posts in this ETL series:
Install PostgreSQL Database
Simple Data Model
Create Tables in PostgreSQL Database
Getting JSON API data with Python
Inserting JSON data into Postgres with Python