For an introduction to Pandas, NumPy, and vectorized operations, see: Pandas + NumPy Introduction.

Let’s create a DataFrame and see what we can do with it:

from datetime import date, datetime, timedelta
import pandas as pd

# DataFrame == dictionary of Series (one Series per column)
dataframe = pd.DataFrame(
  {
    'OrderID':      [1001, 1002, 1003, 1004, 1005],
    'CustomerID':   [5, 3, 2, 4, 6],
    'OrderDate':    [date(2023,3,2), date(2023,3,15),date(2023,3,30), date(2023,4,2), date(2023,4,11)],
    'ShipDate':     [date(2023,3,6), date(2023,3,16),date(2023,4,1), date(2023,4,6), date(2023,4,15)],
    'CustomerName': ['John', 'Mike','Jane', 'Tom', 'Angela'],
    'ProductID':    [101, 202, 203, 404, 606],
    'Qty':          [1, 2, 1, 4, 10],
    'Price':        [2.15, 2.59, 4.08, 1.55, 3.54]
  }
)
dataframe.info()

To change the data type of a column (casting), use the astype() method.

# Cast the Date columns to datetime data type
dataframe['OrderDate'] = dataframe['OrderDate'].astype('datetime64[ns]')
dataframe['ShipDate']  = dataframe['ShipDate'].astype('datetime64[ns]')
dataframe.info()
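
As a minimal sketch of the same casting idea on a small stand-alone frame (the name `sample` and its values are made up for illustration), astype() also accepts a dictionary that maps column names to types, and pd.to_datetime() is an alternative way to convert a date column:

```python
from datetime import date

import pandas as pd

# Hypothetical sample data, not the DataFrame from the article
sample = pd.DataFrame({
    'Qty': [1, 2, 10],
    'Price': ['2.15', '2.59', '3.54'],  # numbers stored as strings
    'OrderDate': [date(2023, 3, 2), date(2023, 3, 15), date(2023, 4, 11)],
})

# Cast several columns in one call by passing a {column: dtype} dictionary
sample = sample.astype({'Qty': 'float64', 'Price': 'float64'})

# Convert the column of date objects to a pandas datetime column
sample['OrderDate'] = pd.to_datetime(sample['OrderDate'])

print(sample.dtypes)
```

Which of the two conversions to prefer for dates is partly a matter of taste; pd.to_datetime() is the more flexible choice when the input is text in varying formats.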

Calculated columns can be created with vectorized operations, which makes them very fast:

import math
import numpy as np


# Column == Series
seriesQty = dataframe['Qty'] 
seriesPrice = dataframe['Price'] 

# Use Vectorized Operations to create Calculated Columns
seriesLineTotal        = dataframe['Qty'] * dataframe['Price']
dataframe['LineTotal'] = dataframe['Qty'] * dataframe['Price'] 
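
To make the idea concrete, the sketch below computes the same line totals twice on a small made-up frame (`orders` is an illustrative name): once with an explicit Python loop over the rows and once with a single vectorized expression. Both should give the same numbers; the vectorized form simply lets Pandas and NumPy run the loop in compiled code.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not the DataFrame from the article
orders = pd.DataFrame({'Qty': [1, 2, 4], 'Price': [2.15, 2.59, 1.55]})

# Row-by-row version: one Python-level multiplication per row
loop_totals = [qty * price for qty, price in zip(orders['Qty'], orders['Price'])]

# Vectorized version: one expression over the whole columns
vector_totals = orders['Qty'] * orders['Price']

# Compare with a tolerance because the values are floating point
print(np.allclose(vector_totals, loop_totals))
```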

The downside is that ordinary functions that work on single (scalar) values, such as math.ceil(), don’t work on a whole Series. Use the vectorized equivalents from the NumPy library instead, such as np.ceil(). For the same reason, you can’t call the usual datetime and string properties and methods directly on a Series; go through the dt and str accessors first.

# NumPy functions:
# dataframe['LineTotal'] = math.ceil(dataframe['Qty'] * dataframe['Price'])  # Error: only works for scalar values!
dataframe['LineTotal'] = np.ceil(dataframe['Qty'] * dataframe['Price'])

# DateTime accessor:
# dataframe['Year'] = dataframe['OrderDate'].year  # Error: a Series has no year property!
dataframe['Year'] = dataframe['OrderDate'].dt.year
dataframe['LeadTime'] = ( dataframe['ShipDate'] - dataframe['OrderDate'] ).dt.days

# String accessor:
dataframe['Lower']      = dataframe['CustomerName'].str.lower()
dataframe['FirstUpper'] = dataframe['CustomerName'].str[0].str.upper() 

display(dataframe)
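
If a scalar-only function has no vectorized counterpart, one possible fallback is Series.apply(), which calls the function once per element. The sketch below uses a small made-up Series (`totals` is an illustrative name) and math.ceil() purely as a stand-in for such a function. This route is generally slower than a NumPy function, so it is best kept for cases where no vectorized version exists.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical sample data, not the DataFrame from the article
totals = pd.Series([2.15, 5.18, 4.08])

# Element-wise fallback: math.ceil is called once for every value
ceil_apply = totals.apply(math.ceil)

# Vectorized equivalent: one call on the whole Series
ceil_numpy = np.ceil(totals)

print(ceil_apply.tolist())
print(ceil_numpy.tolist())
```

Note that math.ceil() returns integers while np.ceil() keeps the float data type, so the two results hold the same values but with different dtypes.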