How to Read CSV, Analyze Returns, and Plot Stock Data in Python
What is Matplotlib?
Matplotlib is a powerful and widely used data visualization library in Python. It allows you to create a wide range of static, interactive, and animated plots, making it an essential tool for data analysis, data science, and machine learning.
What is Pandas in Python?
Pandas is a powerful and easy-to-use open-source Python library for data manipulation and analysis. It provides fast, flexible, and expressive data structures like Series and DataFrame, making it ideal for working with structured data.
What is NumPy in Python?
NumPy (short for Numerical Python) is a fundamental library for numerical computing in Python. It provides support for multi-dimensional arrays and high-performance mathematical operations on large datasets.

1. Importing Necessary Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
- pandas as pd: Used for working with structured data like tables. Think CSVs, Excel, and DataFrames.
- numpy as np: Essential for numerical operations, like math on arrays.
- matplotlib.pyplot as plt: A plotting library. It's how you'll visualize your data.
2. Reading Data from CSV
df = pd.read_csv("sample.csv", parse_dates=["Date"], index_col="Date")
- Reads data from sample.csv.
- parse_dates=["Date"]: Automatically converts the "Date" column to datetime objects.
- index_col="Date": Makes the "Date" column the index, which is useful for time series plotting.
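To see what these two arguments do without needing the actual file, here is a minimal sketch that reads a small in-memory CSV. The "Close" column name and the prices are made up for illustration; the real sample.csv may contain different columns.

```python
import io

import pandas as pd

# Hypothetical stand-in for sample.csv: the "Close" column and the
# prices below are invented for this example.
csv_text = """Date,Close
2024-01-01,100.0
2024-01-02,102.0
2024-01-03,101.0
"""

# io.StringIO lets read_csv treat the string like a file on disk.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

# The index is now a DatetimeIndex rather than plain strings.
print(type(df.index).__name__)

# Rows can be looked up by date label.
print(df.loc[pd.Timestamp("2024-01-02"), "Close"])
```

Because the index holds real datetime values, pandas can sort, slice, and plot the rows in time order.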
3. Displaying the DataFrame
df
- Typing just df (in an interactive environment like Jupyter) displays the DataFrame. For a long DataFrame, pandas shows only the first and last few rows.
4. Checking Data Info
df.info()
- Gives a summary: number of entries, column names, non-null counts, data types, and memory usage. Very handy for a quick snapshot!
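Alongside df.info(), a few quick checks can confirm the data loaded the way you expect. The sketch below uses a small hand-built DataFrame in place of sample.csv; the "Close" column and its values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data standing in for the contents of sample.csv,
# with one missing price on purpose.
df = pd.DataFrame(
    {"Close": [100.0, 102.0, float("nan"), 101.0]},
    index=pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"]
    ),
)
df.index.name = "Date"

df.info()  # prints the summary of entries, dtypes, and memory usage

# Count missing values per column.
missing = df.isna().sum()

# Confirm the index really is datetime-based and in chronological order.
is_datetime_index = isinstance(df.index, pd.DatetimeIndex)
is_sorted = df.index.is_monotonic_increasing

print(missing)
print(is_datetime_index, is_sorted)
```

Missing prices matter later on, since any calculation involving a missing value also comes out missing.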
5. Plotting the Data
df.plot(figsize=(12, 8), title="Sample Nifty Instrument", fontsize=12)
- Plots all numeric columns against the index (which is your “Date”).
- figsize=(12, 8): Sets the width and height of the figure, in inches.
- title: Adds a title.
- fontsize: Controls the size of the axis tick labels.
6. Display the Plot
plt.show()
- Renders the plot. In a script or console session, this opens the figure window; without it, you might not see anything! Jupyter notebooks usually display the figure inline even without this call.
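df.plot also returns a Matplotlib Axes object, which you can customize further or save to an image instead of showing it on screen. The following is a sketch under a few assumptions: the prices and the "Close" column are invented, the "Price" axis label is just an example, and the Agg backend is selected only so the code runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical prices standing in for sample.csv.
df = pd.DataFrame(
    {"Close": [100.0, 102.0, 101.0, 105.0]},
    index=pd.date_range("2024-01-01", periods=4, freq="D", name="Date"),
)

# df.plot returns the Axes it drew on, which can be adjusted afterwards.
ax = df.plot(figsize=(12, 8), title="Sample Nifty Instrument", fontsize=12)
ax.set_ylabel("Price")

# Save the figure as PNG data in memory. Passing a filename such as
# "plot.png" instead of the buffer would write it to disk.
buf = io.BytesIO()
ax.figure.savefig(buf, format="png")

plt.close(ax.figure)  # free the figure once it is no longer needed
```

Saving the figure is a handy alternative to plt.show() when the code runs on a server or in a scheduled job where no window can open.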
7. Calculating Log Returns
df["returns"] = np.log(df.div(df.shift(1)))
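- df.shift(1): Shifts all rows down by one, so each row lines up with the previous day's data.
- df.div(...): Divides each row by the previous day's row (element-wise).
- np.log(...): Applies the natural log to get logarithmic returns, a common way to measure returns in finance.
- Note: the right-hand side is a whole DataFrame, so this assignment works only when df contains a single price column. The first row has no previous day, so its return is NaN.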
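In formula terms, the log return for a day is ln(price today / price on the previous day). Here is a minimal sketch of the same calculation on one explicitly named column, which also works when the DataFrame has several columns. The "Close" column name and the prices are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices; "Close" is an assumed column name.
df = pd.DataFrame(
    {"Close": [100.0, 102.0, 101.0, 105.0]},
    index=pd.date_range("2024-01-01", periods=4, freq="D", name="Date"),
)

# Log return for each row: ln(price_t / price_{t-1}).
df["returns"] = np.log(df["Close"].div(df["Close"].shift(1)))

# Equivalent formulation: the difference of consecutive log prices.
alt = np.log(df["Close"]).diff()

# The first row has no previous price, so its return is NaN.
print(df)

# Log returns add up over time: their sum equals the log of the
# overall price change from the first row to the last.
total = df["returns"].sum()
print(np.isclose(total, np.log(105.0 / 100.0)))
```

This additive property is one reason log returns are popular: the return over a longer period is simply the sum of the daily values.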
8. Display Updated DataFrame
df
- Now displays the original data plus a new column, returns, which contains the calculated log returns.
Example: Code for SMA Strategies
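import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load the CSV, parse the "Date" column, and use it as the index
df = pd.read_csv("sample.csv", parse_dates=["Date"], index_col="Date")
df  # displays the DataFrame in Jupyter
df.info()  # summary of columns, data types, and memory usage

# Plot every numeric column against the date index
df.plot(figsize=(12, 8), title="Sample Nifty Instrument", fontsize=12)
plt.show()

# Log returns; assumes df has a single price column
df["returns"] = np.log(df.div(df.shift(1)))
df  # now includes the returns column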