Pandas Read JSON
JSON (JavaScript Object Notation) is a lightweight data format widely used for transmitting data over the web. Pandas provides the read_json() function to easily import JSON data into a DataFrame, enabling efficient analysis and manipulation. Let’s explore how to read and handle JSON files with Pandas.
Reading a JSON File
To read a JSON file, use the pd.read_json() function by providing the file path as an argument. The following example demonstrates reading a JSON file containing information about popular Indian cities:
import pandas as pd
# Read a JSON file into a DataFrame
df = pd.read_json("indian_cities.json")
# Display the first 5 rows
print(df.head())
Output
| City | State | Population | Area (sq km) |
|---|---|---|---|
| Chennai | Tamil Nadu | 7090000 | 426 |
| Bengaluru | Karnataka | 8443675 | 741 |
| Hyderabad | Telangana | 6809970 | 650 |
| Mumbai | Maharashtra | 12442373 | 603 |
| Delhi | Delhi | 16787941 | 1484 |
Explanation: The pd.read_json() function reads the JSON file indian_cities.json into a DataFrame named df. The .head() method displays the first 5 rows of the DataFrame, showing columns such as City, State, Population, and Area (sq km). This allows for a quick preview of the dataset.
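The example above depends on a file that may not exist on your machine. As a minimal, self-contained sketch, the snippet below first writes a small sample file and then reads it back. The two records and the temporary directory are assumptions made for illustration only; the real contents of indian_cities.json are not shown in this article.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data: the real file's contents are assumed, not known.
cities = [
    {"City": "Chennai", "State": "Tamil Nadu", "Population": 7090000, "Area (sq km)": 426},
    {"City": "Bengaluru", "State": "Karnataka", "Population": 8443675, "Area (sq km)": 741},
]

with tempfile.TemporaryDirectory() as tmp:
    # Write the sample records as a JSON array of objects
    path = Path(tmp) / "indian_cities.json"
    path.write_text(json.dumps(cities), encoding="utf-8")

    # Read the file into a DataFrame while the temporary file still exists
    df = pd.read_json(path)

# Each JSON object becomes one row; the keys become column names
print(df.head())
```

Writing the file inside a temporary directory keeps the sketch from leaving files behind; in practice you would pass the path of your own JSON file directly.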
Handling JSON Structures
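JSON data can be laid out in different ways, for example as an array of objects (one per row) or as a single object keyed by column or by index. The orient parameter tells read_json() which layout to expect. Note that read_json() does not flatten deeply nested objects; pd.json_normalize() is the usual tool for that. Here’s an example: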
# Read a JSON file laid out as a list of records (one object per row)
df = pd.read_json("nested_cities.json", orient="records")
# Display the DataFrame
print(df)
Output
| City | State | Population | Area (sq km) |
|---|---|---|---|
| Chennai | Tamil Nadu | 7090000 | 426 |
| Bengaluru | Karnataka | 8443675 | 741 |
| Hyderabad | Telangana | 6809970 | 650 |
| Mumbai | Maharashtra | 12442373 | 603 |
| Delhi | Delhi | 16787941 | 1484 |
Explanation: The orient parameter specifies how the JSON data is structured. In this example, orient="records" indicates that the file holds a list of JSON objects, each of which becomes one row in the DataFrame. Other accepted values are "split", "index", "columns", "values", and "table", so the same function can read several common JSON layouts.
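To make the effect of orient concrete, here is a small sketch that reads the same two cities from two different layouts, and then flattens a nested layout with pd.json_normalize(). The JSON strings, the row labels row1 and row2, and the Details key are all made up for illustration; they are not the contents of nested_cities.json.

```python
from io import StringIO

import pandas as pd

# Layout 1: a list of objects, one per row (orient="records")
records_json = (
    '[{"City": "Chennai", "Population": 7090000},'
    ' {"City": "Mumbai", "Population": 12442373}]'
)

# Layout 2: an object keyed by row label (orient="index")
index_json = (
    '{"row1": {"City": "Chennai", "Population": 7090000},'
    ' "row2": {"City": "Mumbai", "Population": 12442373}}'
)

# Wrap the strings in StringIO so they are treated as file-like input
df_records = pd.read_json(StringIO(records_json), orient="records")
df_index = pd.read_json(StringIO(index_json), orient="index")

# Layout 3: nested objects (hypothetical structure); json_normalize
# flattens them into dotted column names such as "Details.State"
nested = [
    {"City": "Chennai", "Details": {"State": "Tamil Nadu", "Population": 7090000}},
    {"City": "Mumbai", "Details": {"State": "Maharashtra", "Population": 12442373}},
]
df_nested = pd.json_normalize(nested)

print(df_records)
print(df_index)
print(df_nested)
```

The first two DataFrames hold the same columns; only the row labels differ. The third shows why nested data usually goes through json_normalize() rather than read_json() alone.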
Key Takeaways
- Simple Import: The pd.read_json() function reads JSON files into Pandas DataFrames efficiently.
- Preview Data: Use .head() to display the first few rows of the DataFrame for a quick overview.
- Flexible Handling: The orient parameter tells read_json() how the JSON is laid out, for example as a list of records or as an object keyed by index or column.
- Real-World Application: JSON is widely used in APIs and web data, making this function essential for modern data analysis tasks.