Using the Polars Cast Function

The Polars Cast Function

The Polars Cast Function lets you change column data types on a Polars DataFrame. There are several scenarios where you might need to change a column’s data type:

  • Ensuring Compatibility: Some operations require specific data types.
  • Optimizing Performance: Using efficient data types reduces memory usage.
  • Fixing Incorrect Types: Imported data sometimes has incorrect types (e.g., numeric values stored as strings).

In this article, we’re going to show you how to use this function.

The Original Schema

First, we’ll import the data and display the schema.

import polars as pl

data = pl.read_csv("employees.csv")
data.schema
Schema([('Employee ID', Int64),
        ('First Name', String),
        ('Last Name', String),
        ('Gender', String),
        ('Date of Birth', String),
        ('Department', String),
        ('Position', String),
        ('Salary ($)', Int64),
        ('Email', String),
        ('Phone', String)])

Looking through this schema, you’ll notice a few columns that could use some changes. The Date of Birth column is stored as a String data type, but it should be a Date. The Salary ($) column is also a problem: most salaries are not whole numbers. Our example dataset doesn’t contain any fractional values, which is why the column was read as Int64, but we want to change it to Float64.

Changing Data Types

We’ll use the Polars Cast function to change our data types. We’ll call cast on the DataFrame and pass a dictionary that maps each column name to its new data type.

data.cast({"Date of Birth": pl.Date, "Salary ($)": pl.Float64}).schema
Schema([('Employee ID', Int64),
        ('First Name', String),
        ('Last Name', String),
        ('Gender', String),
        ('Date of Birth', Date),
        ('Department', String),
        ('Position', String),
        ('Salary ($)', Float64),
        ('Email', String),
        ('Phone', String)])

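In the output, you’ll notice that Date of Birth is now a Date and Salary ($) is now a Float64. Keep in mind that cast returns a new DataFrame and leaves the original unchanged, so assign the result to a variable if you want to keep it.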

Final Notes

The Polars Cast function is extremely useful for changing data types. Keep in mind that whenever you cast, the existing values must be convertible to the target type. For example, casting a column to a float data type only works if every value can be interpreted as a number. By default the cast is strict, so a value that cannot be converted raises an error. In any case, the cast function makes it easy to convert data types in your DataFrame.

Learn More

Polars Documentation – Cast

Sample Dataset
