In this article, we’ll demonstrate how to write to a database using the polars write_database function. This function allows users to write data into a variety of databases. Let’s explore the various options available when working with this feature in Polars.
Writing data to a database is a common task in data engineering. Polars makes this task simple and fast!
First things first, we need to install the necessary packages:
pip install polars
pip install SQLAlchemy
pip install connectorx
Once you have these packages installed, you should be ready to go!
We’ll start by importing our call center data from a CSV file using pl.read csv
. This file contains 200 records that we’ll be transferring to our database. We’ll call the head method off of our polars DataFrame to get a quick glance at what this data looks like.
import polars as pl
data = pl.read_csv('call_center_data.csv')
data.head()
shape: (5, 9)
Call_ID | Call_DateTime | Caller_Name | Caller_Phone | Agent_ID | Agent_Name | Department | Call_Duration_Minutes | Call_Type |
---|---|---|---|---|---|---|---|---|
i64 | str | str | str | i64 | str | str | f64 | str |
4894 | “2021-12-11 10:… | “Mary Hogan” | “(257) 649-2898… | 639 | “Jordan Valdez” | “Sales” | 8.34 | “Outbound” |
3768 | “2021-02-20 04:… | “Tammy Moss” | “(670) 301-3280… | 559 | “Carol Elliott” | “Sales” | 15.58 | “Inbound” |
6802 | “2022-06-06 20:… | “Patrick Galvan… | “(614) 519-8534… | 816 | “Stephanie Chur… | “Billing” | 10.07 | “Callback” |
7308 | “2023-08-20 03:… | “Caitlyn Reid” | “(697) 595-9925… | 951 | “Joshua Marquez… | “Technical” | 3.35 | “Callback” |
4059 | “2022-08-03 19:… | “Kimberly Corte… | “(955) 647-8691… | 646 | “Joseph Alvarez… | “Support” | 29.14 | “Inbound” |
When we run the code, we have the first 5 rows of our data displayed. This is what we’ll be inserting into our database.
The next item we need to address is what database we’ll be writing to. Most databases require extensive set up before you can write to them. To make it easy, we’ll use SQLite and generate a new database in the process.
To do this, we need to specify a Uniform Resource Identifier or URI. In this case, we’ll use a SQL Lite database called “New Database”.
uri = "sqlite:///NewDatabase.db"
Next, we’ll create our first table by calling the write database function off of our data variable. Within the function, we’ll pass the table name argument and call our new table call center. Finally, we’ll connect to the database using the uri that we just defined.
data.write_database(table_name='call_center', connection=uri)
When we run this code, the data will be written into our newly specified database and table. To verify this you can install a SQLite viewer tool. I’ll include some links to a few here:
- https://sqliteviewer.app/
- https://marketplace.visualstudio.com/items?itemName=qwtel.sqlite-viewer
- https://sqlitebrowser.org/
So we’ve successfully created a new table! But what if you need to add data to this table or even replace it with new information? Polars provides us with this functionality as well. We can utilize the if_exists
argument to specify what we’d like polars to do if the table already exists in the database.
There are three behaviors:
- Append
- Replace
- Fail (default)
The default behavior is to fail. Let’s demonstrate what happens when we use if_exists
to append and replace data.
First, we can do a full replace of the table. We’ll reuse most of the same code as before, but this time we’ll add the if exists
argument and set to ‘replace’.
data.write_database(table_name='call_center', connection=uri, if_exists='replace')
When we run the code, polars will connect to the database, find the table, delete all previously committed rows, and replace them with the new data. In our case, the new data was exactly the same as the old data. This means that you won’t see any physical evidence that the database has been altered.
Let’s say that instead you want to append or add data to an existing table in the database. We’ll use the same code as before, but this time we’ll set the if exists
argument to ‘append’. This will add our new data as new rows to the existing table, preserving the data already present.
data.write_database(table_name='call_center', connection=uri,if_exists='append')
If you take a look at the database again, you should see that it now has 400 rows of data.
Conclusion
So now you’ve mastered the write database function in Polars. You have the ability to create new tables, replace existing ones, or append data as needed. You’re ready to tackle any data engineering task!