Pandas Read CSV
A CSV (Comma-Separated Values) file is one of the most common formats for storing tabular data. Pandas provides an easy-to-use function, read_csv(), to read CSV files into a DataFrame. Once the data is in a DataFrame, it can be filtered, aggregated, and analyzed with the rest of the Pandas API. Let’s explore how to read and handle CSV files using Pandas.
Reading a CSV File
To read a CSV file, you can use the pd.read_csv() function by providing the file path as an argument. The following example demonstrates reading a CSV file containing information about Indian rivers:
import pandas as pd
# Read a CSV file into a DataFrame
df = pd.read_csv("indian_rivers.csv")
# Display the first 5 rows
print(df.head())
Output
| River | Length (km) | Origin | States Covered |
|---|---|---|---|
| Ganga | 2525 | Gangotri Glacier | 11 |
| Godavari | 1465 | Trimbakeshwar | 6 |
| Krishna | 1400 | Mahabaleshwar | 5 |
| Kaveri | 805 | Talakaveri | 4 |
| Brahmaputra | 2900 | Angsi Glacier | 5 |
Explanation: The pd.read_csv() function reads the CSV file indian_rivers.csv into a DataFrame named df. The .head() method returns the first 5 rows of the DataFrame by default, and printing that result makes it easy to preview the dataset. The columns River, Length (km), Origin, and States Covered represent the data fields in the CSV file.
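To try this without a file on disk, here is a minimal sketch that feeds the same rows to read_csv() through an in-memory buffer. The csv_text string is a hypothetical stand-in for indian_rivers.csv, built from the rows shown above; it is not the original file.

```python
import io

import pandas as pd

# Hypothetical stand-in for indian_rivers.csv, using the rows shown above
csv_text = """River,Length (km),Origin,States Covered
Ganga,2525,Gangotri Glacier,11
Godavari,1465,Trimbakeshwar,6
Krishna,1400,Mahabaleshwar,5
Kaveri,805,Talakaveri,4
Brahmaputra,2900,Angsi Glacier,5
"""

# read_csv accepts a file path or any file-like object such as StringIO
df = pd.read_csv(io.StringIO(csv_text))

# head(n) returns the first n rows; without an argument it returns 5
first_three = df.head(3)
print(first_three)

# shape gives (rows, columns); dtypes shows the type inferred for each column
print(df.shape)
print(df.dtypes)
```

Checking shape and dtypes right after loading is a quick way to confirm that the header row was picked up and that numeric columns were parsed as numbers rather than text.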
Specifying Parameters
The read_csv() function provides several parameters to customize the data import process. For example, you can specify a delimiter if the file uses a separator other than commas, skip rows, or select specific columns. Here’s an example:
# Read only two columns; delimiter="," is the default and is written out only for illustration
df = pd.read_csv("indian_rivers.csv", delimiter=",", usecols=["River", "Length (km)"])
# Display the DataFrame
print(df)
Output
| River | Length (km) |
|---|---|
| Ganga | 2525 |
| Godavari | 1465 |
| Krishna | 1400 |
| Kaveri | 805 |
| Brahmaputra | 2900 |
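Explanation: In this example, the usecols parameter selects only the River and Length (km) columns from the CSV file. The delimiter parameter (an alias for sep) sets the field separator. A comma is already the default, so passing it here does not change the result; the parameter matters for files that use another separator, such as a semicolon or a tab.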
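The parameters mentioned in this section (a non-comma separator, skipped rows, and column selection) can be combined in one call. The following is a minimal sketch under stated assumptions: raw_text is a hypothetical semicolon-separated export with one note line above the header, made up for illustration and read from memory rather than from a real file.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated data with one note line above the header
raw_text = """exported for illustration only
River;Length (km);Origin;States Covered
Ganga;2525;Gangotri Glacier;11
Godavari;1465;Trimbakeshwar;6
Krishna;1400;Mahabaleshwar;5
"""

rivers = pd.read_csv(
    io.StringIO(raw_text),
    sep=";",                          # fields are separated by semicolons
    skiprows=1,                       # skip the note line before the header
    usecols=["River", "Length (km)"], # keep only these two columns
)
print(rivers)
```

Here skiprows=1 drops the first line so that the second line is treated as the header, and usecols is applied to the column names found in that header.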
Key Takeaways
- Simple Import: The pd.read_csv() function is used to read CSV files into DataFrames.
- Preview Data: Use .head() to display the first few rows of the DataFrame.
- Customizable Parameters: Parameters like usecols and delimiter allow flexibility in reading specific parts of the data.
- Common Format: CSV is a widely used format for tabular data, making it essential for real-world data analysis tasks.