My Personal Journey with APIs: Building Bridges Between Systems¶

Author: Mohammad Sayem Chowdhury
Data Enthusiast & Python Developer

Welcome to my exploration of Application Programming Interfaces (APIs)! As someone who's passionate about data science and system integration, I've found APIs to be one of the most fascinating aspects of modern programming. They're like digital bridges that allow different software systems to communicate seamlessly.

In this notebook, I'll share my personal understanding, practical experiences, and real-world applications of APIs using Python.

Why APIs Matter to Me¶

APIs are the backbone of modern software development. I like to think of them as translators that help different applications speak the same language. Throughout my data science journey, I've discovered that understanding APIs is crucial for:

  • Data Collection: Gathering information from various sources
  • System Integration: Connecting different tools and platforms
  • Automation: Building workflows that work across multiple systems
  • Real-time Analysis: Accessing live data for dynamic insights

In this notebook, I'll demonstrate these concepts through practical examples that I use in my own projects.

What I'll Explore Today¶

  1. Understanding APIs Through Pandas - How I use DataFrames as APIs
  2. REST API Fundamentals - My approach to web-based data access
  3. Real-World NBA Data Analysis - A practical example using sports statistics
  4. Personal Insights & Best Practices - What I've learned from experience
  5. Next Steps in My API Journey - Where to go from here

Estimated time: 20-25 minutes


In [ ]:
# I use nba_api for some of my API experiments
!pip install nba_api
Collecting nba_api
  Downloading nba_api-1.1.11.tar.gz (125 kB)
Requirement already satisfied: requests in e:\anaconda\lib\site-packages (from nba_api) (2.24.0)
Requirement already satisfied: idna<3,>=2.5 in e:\anaconda\lib\site-packages (from requests->nba_api) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in e:\anaconda\lib\site-packages (from requests->nba_api) (2020.6.20)
Requirement already satisfied: chardet<4,>=3.0.2 in e:\anaconda\lib\site-packages (from requests->nba_api) (3.0.4)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in e:\anaconda\lib\site-packages (from requests->nba_api) (1.25.10)
Building wheels for collected packages: nba-api
  Building wheel for nba-api (setup.py): started
  Building wheel for nba-api (setup.py): finished with status 'done'
  Created wheel for nba-api: filename=nba_api-1.1.11-py3-none-any.whl size=251492 sha256=cc6741eb9d02fb3cdc59b9640e69e1b3c93d89a4cfae9f4ea3405e61f5b72c33
  Stored in directory: c:\users\chysa\appdata\local\pip\cache\wheels\96\0a\d6\0e51f16e26a046ed08ce8266c86011c74bf57678cd62ad71b0
Successfully built nba-api
Installing collected packages: nba-api
Successfully installed nba-api-1.1.11

My Perspective: Pandas as a Powerful API¶

One of the first "APIs" I learned to use effectively was Pandas. While most people think of it as just a data analysis library, I've come to appreciate it as a sophisticated API that provides a consistent interface to complex data operations.

My Utility Function: Reshaping API Data¶

When working with API responses, I often receive lists of dictionaries that need to be restructured for analysis. Here's a utility function I've developed that I use frequently in my data projects:

def reshape_data(data):
    """Convert a list of dictionaries into a dictionary of lists."""
    if not data:
        return {}
    
    # Get the keys from the first dictionary in the list
    keys = data[0].keys()
    
    # Initialize a dictionary with empty lists for each key
    reshaped = {key: [] for key in keys}
    
    # Iterate over the list of dictionaries
    for d in data:
        # Append each value to the corresponding list in the reshaped dictionary
        for key in keys:
            reshaped[key].append(d[key])
    
    return reshaped

Here's how I use this function:

# Sample data: list of dictionaries
api_data = [
    {"id": 1, "name": "Alice", "age": 30},
    {"id": 2, "name": "Bob", "age": 25},
    {"id": 3, "name": "Charlie", "age": 35}
]

# Reshape the data
reshaped_data = reshape_data(api_data)

# Output the reshaped data
print(reshaped_data)

The output will be:

{
    'id': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [30, 25, 35]
}

This reshaped format is often easier to work with for analysis, allowing me to quickly access all values for a given key across the original list of dictionaries.
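One reason I like this shape: it's exactly what pd.DataFrame expects, so an API response drops straight into a DataFrame. A quick sketch, assuming the reshape_data and api_data examples above:

import pandas as pd

# The dictionary-of-lists shape maps directly onto DataFrame columns
df = pd.DataFrame(reshape_data(api_data))
print(df)
#    id     name  age
# 0   1    Alice   30
# 1   2      Bob   25
# 2   3  Charlie   35

The cell below is a more fully documented version of the same helper, merge_dicts_to_columns, which I'll reuse later in the NBA example: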

In [ ]:
def merge_dicts_to_columns(dict_list):
    """
    Convert a list of dictionaries into a single dictionary of lists.
    This is particularly useful when working with API responses.
    
    Args:
        dict_list: List of dictionaries with the same keys
    
    Returns:
        Dictionary where each key maps to a list of values
    
    Example:
        input: [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]
        output: {'a': [1, 3], 'b': [2, 4]}
    """
    if not dict_list:
        return {}
    
    # Get keys from the first dictionary
    keys = dict_list[0].keys()
    
    # Initialize result dictionary
    result = {key: [] for key in keys}
    
    # Populate lists for each key
    for d in dict_list:
        for key, value in d.items():
            result[key].append(value)
    
    return result

# Test the function with a simple example
test_data = [{'name': 'Mohammad', 'score': 95}, {'name': 'Alice', 'score': 87}]
print("Test result:", merge_dicts_to_columns(test_data))

Pandas is actually a collection of software components, many of which aren't even written in Python. I appreciate how it brings together different technologies to make data analysis easier.
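You can see the NumPy layer for yourself with a tiny sketch (nothing assumed beyond the standard to_numpy() method):

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})
# The Python-facing DataFrame wraps a C-backed NumPy array underneath
print(type(df.to_numpy()))  # <class 'numpy.ndarray'>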

Why I Appreciate Pandas' Design¶

Pandas is a perfect example of good API design. It's built on top of NumPy (written largely in C), incorporates components from other libraries, and provides a unified Python interface. This modular approach has taught me important lessons about API design:

  • Abstraction: Complex operations are hidden behind simple method calls
  • Consistency: Similar operations work the same way across different data types
  • Extensibility: I can add my own methods and functionality
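The extensibility point is easy to show concretely. Here's a small sketch using pandas' register_dataframe_accessor hook; the accessor name "report" and its ranges() method are my own inventions for illustration, not pandas built-ins:

import pandas as pd

# Register a custom accessor; "report" is a made-up name for this sketch
@pd.api.extensions.register_dataframe_accessor("report")
class ReportAccessor:
    def __init__(self, df):
        self._df = df

    def ranges(self):
        # Spread (max - min) of each numeric column
        return self._df.max(numeric_only=True) - self._df.min(numeric_only=True)

scores = pd.DataFrame({'quarter_1': [85, 92, 78], 'quarter_2': [88, 89, 82]})
print(scores.report.ranges())  # quarter_1: 14, quarter_2: 7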

Let me demonstrate these principles with some examples I use regularly:

In [ ]:
import pandas as pd
import matplotlib.pyplot as plt

Working with My Sample Data¶

I'll start with a simple dataset that represents something I might encounter in my projects:

In [ ]:
# Sample data representing project scores over three quarters
my_project_data = {
    'quarter_1': [85, 92, 78], 
    'quarter_2': [88, 89, 82]
}
print("My project performance data:", my_project_data)

Creating My Data Interface¶

When I create a DataFrame from this dictionary, I'm essentially using Pandas' API to transform raw data into a structured, queryable format. The DataFrame becomes my primary interface for data exploration and analysis:

In [ ]:
# Create DataFrame using Pandas API
my_df = pd.DataFrame(my_project_data)
print("DataFrame type:", type(my_df))
print("\nDataFrame shape:", my_df.shape)
print("\nColumn names:", my_df.columns.tolist())
my_df

Exploring Data Through the API¶

Now I can use various DataFrame methods (API endpoints) to analyze my data. Each method provides a different way to interact with and understand the information:

  • head(): Displays the first few rows of the DataFrame, giving a glimpse of the data structure and content.
  • mean(): Calculates the average of numerical columns, helping to understand the central tendency of the data.
  • describe(): Generates descriptive statistics, such as count, mean, std deviation, min, and max, for numerical columns.
  • info(): Provides a summary of the DataFrame, including the index dtype and columns, non-null values, and memory usage.
  • value_counts(): Returns a Series containing counts of unique values in a column, useful for categorical data analysis.

By using these methods, I can efficiently explore and analyze my DataFrame, gaining valuable insights into my data.
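head(), mean(), and describe() get their own cells below; info() and value_counts() don't, so here's a quick sketch of those two, assuming the my_df DataFrame created above:

# Summary of dtypes, non-null counts, and memory usage
my_df.info()

# Counts of unique values in a single column
print(my_df['quarter_1'].value_counts())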

In [ ]:
# Display the first few rows
print("First few rows:")
my_df.head()

Statistical Analysis Through the API¶

I can quickly compute various statistics using the DataFrame's built-in methods:

In [ ]:
# Calculate various statistics
print("Mean scores per quarter:")
print(my_df.mean())

print("\nStandard deviation:")
print(my_df.std())

print("\nSummary statistics:")
print(my_df.describe())

My Adventure with REST APIs: Real-World Data Analysis¶

Now let's explore REST APIs - my gateway to accessing live data from the internet. REST (Representational State Transfer) APIs allow me to request specific data from servers using HTTP methods. This opens up a world of possibilities for real-time analysis and data collection.
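Before jumping into NBA data, here's a minimal sketch of that request/response cycle using the requests library. The httpbin.org endpoint is just a public echo service I'm using for illustration:

import requests

# A simple GET request; httpbin.org echoes the request back as JSON
response = requests.get("https://httpbin.org/get", params={"team": "Warriors"})

# Check the status code before trusting the payload
if response.status_code == 200:
    payload = response.json()   # parse the JSON body into a dict
    print(payload["args"])      # {'team': 'Warriors'}
else:
    print(f"Request failed with status {response.status_code}")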

My NBA Data Analysis Project¶

For this demonstration, I'll analyze NBA game data to answer a specific question: How did the Golden State Warriors perform against the Toronto Raptors, and did they play better at home or away?

This type of analysis showcases several important API concepts:

  • Making HTTP requests to external services
  • Processing JSON responses
  • Handling large datasets efficiently
  • Converting API data into actionable insights

Let's dive into the implementation:

For this example, I'll query the NBA API for Golden State Warriors games against the Toronto Raptors. My goal is to find out how many points the Warriors won or lost by in each game.

In [ ]:
from nba_api.stats.static import teams
import matplotlib.pyplot as plt
In [ ]:
# Get all NBA teams as a list of dictionaries
nba_teams = teams.get_teams()

I like to look at the first few teams to get a sense of the data structure:

In [ ]:
nba_teams[0:3]

To make things easier, I use my earlier function to convert the list of team dictionaries into a DataFrame:

In [ ]:
dict_nba_team = merge_dicts_to_columns(nba_teams)
df_teams = pd.DataFrame(dict_nba_team)
df_teams.head()

I use the team's nickname to find the unique ID for the Warriors:

In [ ]:
df_warriors = df_teams[df_teams['nickname'] == 'Warriors']
df_warriors

Now I extract the team ID for the Warriors, which I'll use in the API call:

In [ ]:
id_warriors = df_warriors['id'].values[0]
id_warriors
In [ ]:
from nba_api.stats.endpoints import leaguegamefinder
# Uncomment and run locally if you want to make the API call:
# gamefinder = leaguegamefinder.LeagueGameFinder(team_id_nullable=id_warriors)

The API returns a JSON response with all the games. I can convert this to a DataFrame for analysis. (If running in a cloud environment, you may need to download the data instead of making a live API call.)
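If you do make the live call locally, nba_api endpoints also expose the raw JSON alongside get_data_frames(), which I find handy for sanity-checking the response shape first. A sketch, assuming the gamefinder object from the commented-out call above and the NBA stats API's usual resultSets layout:

import json

# Uncomment when running locally with API access:
# raw = gamefinder.get_json()                      # raw JSON string
# parsed = json.loads(raw)
# print(parsed['resultSets'][0]['headers'][:5])    # first few column names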

In [ ]:
# Uncomment and run locally if you want to get the games DataFrame:
# games = gamefinder.get_data_frames()[0]
# games.head()
In [ ]:
# Download a pre-saved DataFrame if you can't access the API directly
!wget https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/Chapter%205/Labs/Golden_State.pkl
In [ ]:
file_name = "Golden_State.pkl"
games = pd.read_pickle(file_name)
games.head()

I split the games into home and away games against the Raptors:

In [ ]:
games_home = games[games['MATCHUP'] == 'GSW vs. TOR']
games_away = games[games['MATCHUP'] == 'GSW @ TOR']

Now I can calculate the average point difference for home and away games:

In [ ]:
games_home['PLUS_MINUS'].mean()
In [ ]:
games_away['PLUS_MINUS'].mean()

Finally, I like to visualize the results to see if the Warriors played better at home or away:

In [ ]:
fig, ax = plt.subplots()
games_away.plot(x='GAME_DATE', y='PLUS_MINUS', ax=ax)
games_home.plot(x='GAME_DATE', y='PLUS_MINUS', ax=ax)
ax.legend(["away", "home"])
plt.show()

My Analysis Results¶

Let me interpret what this data tells us about the Warriors' performance:

In [ ]:
# Let me create a more comprehensive analysis
print("=== My Warriors vs Raptors Analysis ===")
print(f"Home games average point differential: {games_home.mean()['PLUS_MINUS']:.2f}")
print(f"Away games average point differential: {games_away.mean()['PLUS_MINUS']:.2f}")
print(f"\nTotal games analyzed:")
print(f"- Home games: {len(games_home)}")
print(f"- Away games: {len(games_away)}")

# Calculate win percentages
home_wins = len(games_home[games_home['PLUS_MINUS'] > 0])
away_wins = len(games_away[games_away['PLUS_MINUS'] > 0])

print(f"\nWin rates:")
print(f"- Home win rate: {home_wins/len(games_home)*100:.1f}%")
print(f"- Away win rate: {away_wins/len(games_away)*100:.1f}%")
In [ ]:
# Create a more detailed visualization
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

# Plot 1: Point differential over time
games_away.plot(x='GAME_DATE', y='PLUS_MINUS', ax=ax1, label='Away Games', marker='o')
games_home.plot(x='GAME_DATE', y='PLUS_MINUS', ax=ax1, label='Home Games', marker='s')
ax1.axhline(y=0, color='red', linestyle='--', alpha=0.7, label='Break-even')
ax1.set_title('Warriors vs Raptors: Point Differential Over Time')
ax1.set_ylabel('Point Differential')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Plot 2: Performance comparison
categories = ['Home Games', 'Away Games']
avg_scores = [games_home['PLUS_MINUS'].mean(), games_away['PLUS_MINUS'].mean()]
colors = ['#1f77b4', '#ff7f0e']

ax2.bar(categories, avg_scores, color=colors, alpha=0.7)
ax2.axhline(y=0, color='red', linestyle='--', alpha=0.7)
ax2.set_title('Average Point Differential Comparison')
ax2.set_ylabel('Average Point Differential')
ax2.grid(True, alpha=0.3)

# Add value labels on bars
for i, v in enumerate(avg_scores):
    ax2.text(i, v + 0.5, f'{v:.1f}', ha='center', va='bottom', fontweight='bold')

plt.tight_layout()
plt.show()

My Reflections¶

Exploring APIs in Python has helped me understand how data moves between systems and how I can use that data for my own analysis. Whether it's using Pandas as an API or working with a REST API, the process is both powerful and fun. If you have your own API stories or want to collaborate, let me know!

My Key Takeaways from API Development¶

Working with APIs has been transformative for my data science journey. Here are the most important lessons I've learned:

1. APIs are Everywhere¶

  • Pandas DataFrames provide an API for data manipulation
  • REST APIs connect us to vast amounts of real-world data
  • Library functions are essentially APIs for specific functionality

2. Practical Benefits I've Experienced¶

  • Data Access: APIs unlock datasets I could never collect manually
  • Real-time Analysis: Live data enables dynamic insights and decision-making
  • Automation: API integration allows me to build robust data pipelines
  • Scalability: Well-designed APIs handle growing data needs efficiently

3. Best Practices I Follow¶

  • Error Handling: Always prepare for API failures and rate limits (see the sketch after this list)
  • Data Validation: Verify the structure and quality of API responses
  • Documentation: Keep track of API endpoints, parameters, and expected responses
  • Caching: Store frequently-used data to reduce API calls and improve performance
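Here's a small sketch of how I put the error-handling and caching points into practice. The retry logic is generic, and the cache file path is a placeholder, not tied to any real service:

import json
import time
from pathlib import Path

import requests

CACHE_FILE = Path("api_cache.json")  # placeholder cache location

def fetch_with_retries(url, params=None, retries=3, backoff=2.0):
    """GET with simple retry/backoff; returns parsed JSON or None."""
    for attempt in range(retries):
        try:
            response = requests.get(url, params=params, timeout=10)
            if response.status_code == 429:  # rate limited: wait and retry
                time.sleep(backoff * (attempt + 1))
                continue
            response.raise_for_status()
            return response.json()
        except requests.RequestException as exc:
            print(f"Attempt {attempt + 1} failed: {exc}")
            time.sleep(backoff * (attempt + 1))
    return None

def fetch_cached(url, params=None):
    """Serve from a local cache file when possible to save API calls."""
    if CACHE_FILE.exists():
        return json.loads(CACHE_FILE.read_text())
    data = fetch_with_retries(url, params)
    if data is not None:
        CACHE_FILE.write_text(json.dumps(data))
    return data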

4. My API Development Philosophy¶

  • Start simple and build complexity gradually
  • Focus on understanding the data structure before diving into analysis
  • Always respect API rate limits and terms of service
  • Document everything for future reference and collaboration

Next Steps in My API Journey¶

  1. Explore Authentication: Learn OAuth, API keys, and secure connection methods
  2. Build My Own APIs: Create REST APIs using Flask or FastAPI
  3. Advanced Data Processing: Implement streaming data analysis with real-time APIs
  4. Integration Projects: Connect multiple APIs to create comprehensive data solutions
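For step 2, here's roughly where I'd start: a minimal Flask sketch. The /scores route and its payload are invented for illustration, echoing the quarterly data from earlier:

from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/scores")
def scores():
    # Hypothetical payload mirroring the quarterly data used earlier
    return jsonify({"quarter_1": [85, 92, 78], "quarter_2": [88, 89, 82]})

if __name__ == "__main__":
    app.run(debug=True)  # dev server only; use a WSGI server in production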

Connect with Me¶

If you're interested in API development, data science, or want to collaborate on projects involving real-world data analysis, I'd love to connect! APIs have opened up incredible opportunities for innovation and insight.


About This Notebook

This exploration represents my personal journey with APIs and practical data analysis. The examples and insights shared here come from real projects and hands-on experience in the field.

© 2025 Mohammad Sayem Chowdhury. Shared for educational purposes and community learning.