Using SQL Magic in Jupyter: My Approach¶

Author: Mohammad Sayem Chowdhury


To interact with SQL databases in Jupyter, I use the ipython-sql extension. This lets me run SQL queries directly in notebook cells using special "magic" commands (like %sql).

In [ ]:
# If you're running this outside my environment, you may need to install these libraries:
# !pip install ibm_db ibm_db_sa ipython-sql
In [2]:
%load_ext sql

Once the extension is loaded, I connect to my database using a connection string. Make sure to use your own credentials.

In [ ]:
# Format: %sql ibm_db_sa://username:password@hostname:port/dbname?security=SSL
# Enter your connection string below
%sql ibm_db_sa://

You can use %%sql at the top of a cell to run multiple SQL statements at once. Here, I create a table and insert some test data for practice.

In [ ]:
%%sql

CREATE TABLE INTERNATIONAL_STUDENT_TEST_SCORES (
    country VARCHAR(50),
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    test_score INT
);
INSERT INTO INTERNATIONAL_STUDENT_TEST_SCORES (country, first_name, last_name, test_score)
VALUES
('United States', 'Marshall', 'Bernadot', 54),
('Ghana', 'Celinda', 'Malkin', 51),
('Ukraine', 'Guillermo', 'Furze', 53),
('Greece', 'Aharon', 'Tunnow', 48),
('Russia', 'Bail', 'Goodwin', 46),
('Poland', 'Cole', 'Winteringham', 49),
('Sweden', 'Emlyn', 'Erricker', 55),
('Russia', 'Cathee', 'Sivewright', 49),
('China', 'Barny', 'Ingerson', 57),
('Uganda', 'Sharla', 'Papaccio', 55),
('China', 'Stella', 'Youens', 51),
('Poland', 'Julio', 'Buesden', 48),
('United States', 'Tiffie', 'Cosely', 58),
('Poland', 'Auroora', 'Stiffell', 45),
('China', 'Clarita', 'Huet', 52),
('Poland', 'Shannon', 'Goulden', 45),
('Philippines', 'Emylee', 'Privost', 50),
('France', 'Madelina', 'Burk', 49),
('China', 'Saunderson', 'Root', 58),
('Indonesia', 'Bo', 'Waring', 55),
('China', 'Hollis', 'Domotor', 45),
('Russia', 'Robbie', 'Collip', 46),
('Philippines', 'Davon', 'Donisi', 46),
('China', 'Cristabel', 'Radeliffe', 48),
('China', 'Wallis', 'Bartleet', 58),
('Moldova', 'Arleen', 'Stailey', 38),
('Ireland', 'Mendel', 'Grumble', 58),
('China', 'Sallyann', 'Exley', 51),
('Mexico', 'Kain', 'Swaite', 46),
('Indonesia', 'Alonso', 'Bulteel', 45),
('Armenia', 'Anatol', 'Tankus', 51),
('Armenia', 'Anatol', 'Tankus', 51),
('Indonesia', 'Coralyn', 'Dawkins', 48),
('China', 'Deanne', 'Edwinson', 45),
('China', 'Georgiana', 'Epple', 51),
('Portugal', 'Bartlet', 'Breese', 56),
('Azerbaijan', 'Idalina', 'Lukash', 50),
('France', 'Livvie', 'Flory', 54),
('Malaysia', 'Nonie', 'Borit', 48),
('Indonesia', 'Clio', 'Mugg', 47),
('Brazil', 'Westley', 'Measor', 48),
('Philippines', 'Katrinka', 'Sibbert', 51),
('Poland', 'Valentia', 'Mounch', 50),
('Norway', 'Sheilah', 'Hedditch', 53),
('Papua New Guinea', 'Itch', 'Jubb', 50),
('Latvia', 'Stesha', 'Garnson', 53),
('Canada', 'Cristionna', 'Wadmore', 46),
('China', 'Lianna', 'Gatward', 43),
('Guatemala', 'Tanney', 'Vials', 48),
('France', 'Alma', 'Zavittieri', 44),
('China', 'Alvira', 'Tamas', 50),
('United States', 'Shanon', 'Peres', 45),
('Sweden', 'Maisey', 'Lynas', 53),
('Indonesia', 'Kip', 'Hothersall', 46),
('China', 'Cash', 'Landis', 48),
('Panama', 'Kennith', 'Digance', 45),
('China', 'Ulberto', 'Riggeard', 48),
('Switzerland', 'Judy', 'Gilligan', 49),
('Philippines', 'Tod', 'Trevaskus', 52),
('Brazil', 'Herold', 'Heggs', 44),
('Latvia', 'Verney', 'Note', 50),
('Poland', 'Temp', 'Ribey', 50),
('China', 'Conroy', 'Egdal', 48),
('Japan', 'Gabie', 'Alessandone', 47),
('Ukraine', 'Devlen', 'Chaperlin', 54),
('France', 'Babbette', 'Turner', 51),
('Czech Republic', 'Virgil', 'Scotney', 52),
('Tajikistan', 'Zorina', 'Bedow', 49),
('China', 'Aidan', 'Rudeyeard', 50),
('Ireland', 'Saunder', 'MacLice', 48),
('France', 'Waly', 'Brunstan', 53),
('China', 'Gisele', 'Enns', 52),
('Peru', 'Mina', 'Winchester', 48),
('Japan', 'Torie', 'MacShirrie', 50),
('Russia', 'Benjamen', 'Kenford', 51),
('China', 'Etan', 'Burn', 53),
('Russia', 'Merralee', 'Chaperlin', 38),
('Indonesia', 'Lanny', 'Malam', 49),
('Canada', 'Wilhelm', 'Deeprose', 54),
('Czech Republic', 'Lari', 'Hillhouse', 48),
('China', 'Ossie', 'Woodley', 52),
('Macedonia', 'April', 'Tyer', 50),
('Vietnam', 'Madelon', 'Dansey', 53),
('Ukraine', 'Korella', 'McNamee', 52),
('Jamaica', 'Linnea', 'Cannam', 43),
('China', 'Mart', 'Coling', 52),
('Indonesia', 'Marna', 'Causbey', 47),
('China', 'Berni', 'Daintier', 55),
('Poland', 'Cynthia', 'Hassell', 49),
('Canada', 'Carma', 'Schule', 49),
('Indonesia', 'Malia', 'Blight', 48),
('China', 'Paulo', 'Seivertsen', 47),
('Niger', 'Kaylee', 'Hearley', 54),
('Japan', 'Maure', 'Jandak', 46),
('Argentina', 'Foss', 'Feavers', 45),
('Venezuela', 'Ron', 'Leggitt', 60),
('Russia', 'Flint', 'Gokes', 40),
('China', 'Linet', 'Conelly', 52),
('Philippines', 'Nikolas', 'Birtwell', 57),
('Australia', 'Eduard', 'Leipelt', 53)
 * ibm_db_sa://pgs19748:***@6667d8e9-9d4d-4ccb-ba32-21da3bb5aafc.c1ogj3sd0tgtu0lqde00.databases.appdomain.cloud:30376/BLUDB
Done.
99 rows affected.
Out[ ]:
[]

Using Python Variables in SQL¶

You can use Python variables in your SQL statements by adding a : prefix to the variable name. For example, if I have a variable country = "Canada", I can use it in a query like this:

In [5]:
country = "Canada"
%sql select * from INTERNATIONAL_STUDENT_TEST_SCORES where country = :country
 * ibm_db_sa://pgs19748:***@6667d8e9-9d4d-4ccb-ba32-21da3bb5aafc.c1ogj3sd0tgtu0lqde00.databases.appdomain.cloud:30376/BLUDB
Done.
Out[5]:
country first_name last_name test_score
Canada Cristionna Wadmore 46
Canada Wilhelm Deeprose 54
Canada Carma Schule 49

Assigning Query Results to Python Variables¶

You can assign the results of a SQL query to a Python variable using the = operator. For example:

In [6]:
test_score_distribution = %sql SELECT test_score as "TestScore", count(*) as "Frequency" from INTERNATIONAL_STUDENT_TEST_SCORES GROUP BY test_score;
test_score_distribution
 * ibm_db_sa://pgs19748:***@6667d8e9-9d4d-4ccb-ba32-21da3bb5aafc.c1ogj3sd0tgtu0lqde00.databases.appdomain.cloud:30376/BLUDB
Done.
Out[6]:
testscore frequency
38 2
40 1
43 2
44 2
45 8
46 7
47 4
48 14
49 8
50 10
51 8
52 8
53 8
54 5
55 4
56 1
57 2
58 4
60 1

Converting Query Results to DataFrames¶

You can easily convert a SQL query result to a pandas DataFrame using the DataFrame() method. This makes it easy to visualize or further analyze your results.

In [ ]:
dataframe = test_score_distribution.DataFrame()

%matplotlib inline
# If you get a module error saying seaborn not found, uncomment the next line:
# !pip install seaborn==0.9.0
import seaborn

plot = seaborn.barplot(x='testscore',y='frequency', data=dataframe)
No description has been provided for this image

Now you know how to work with SQL databases from within Jupyter notebooks using SQL magic!

In [8]:
%%sql 

-- Feel free to experiment with the data set provided in this notebook for practice:
SELECT country, first_name, last_name, test_score FROM INTERNATIONAL_STUDENT_TEST_SCORES;    
 * ibm_db_sa://pgs19748:***@6667d8e9-9d4d-4ccb-ba32-21da3bb5aafc.c1ogj3sd0tgtu0lqde00.databases.appdomain.cloud:30376/BLUDB
Done.
Out[8]:
country first_name last_name test_score
United States Marshall Bernadot 54
Ghana Celinda Malkin 51
Ukraine Guillermo Furze 53
Greece Aharon Tunnow 48
Russia Bail Goodwin 46
Poland Cole Winteringham 49
Sweden Emlyn Erricker 55
Russia Cathee Sivewright 49
China Barny Ingerson 57
Uganda Sharla Papaccio 55
China Stella Youens 51
Poland Julio Buesden 48
United States Tiffie Cosely 58
Poland Auroora Stiffell 45
China Clarita Huet 52
Poland Shannon Goulden 45
Philippines Emylee Privost 50
France Madelina Burk 49
China Saunderson Root 58
Indonesia Bo Waring 55
China Hollis Domotor 45
Russia Robbie Collip 46
Philippines Davon Donisi 46
China Cristabel Radeliffe 48
China Wallis Bartleet 58
Moldova Arleen Stailey 38
Ireland Mendel Grumble 58
China Sallyann Exley 51
Mexico Kain Swaite 46
Indonesia Alonso Bulteel 45
Armenia Anatol Tankus 51
Indonesia Coralyn Dawkins 48
China Deanne Edwinson 45
China Georgiana Epple 51
Portugal Bartlet Breese 56
Azerbaijan Idalina Lukash 50
France Livvie Flory 54
Malaysia Nonie Borit 48
Indonesia Clio Mugg 47
Brazil Westley Measor 48
Philippines Katrinka Sibbert 51
Poland Valentia Mounch 50
Norway Sheilah Hedditch 53
Papua New Guinea Itch Jubb 50
Latvia Stesha Garnson 53
Canada Cristionna Wadmore 46
China Lianna Gatward 43
Guatemala Tanney Vials 48
France Alma Zavittieri 44
China Alvira Tamas 50
United States Shanon Peres 45
Sweden Maisey Lynas 53
Indonesia Kip Hothersall 46
China Cash Landis 48
Panama Kennith Digance 45
China Ulberto Riggeard 48
Switzerland Judy Gilligan 49
Philippines Tod Trevaskus 52
Brazil Herold Heggs 44
Latvia Verney Note 50
Poland Temp Ribey 50
China Conroy Egdal 48
Japan Gabie Alessandone 47
Ukraine Devlen Chaperlin 54
France Babbette Turner 51
Czech Republic Virgil Scotney 52
Tajikistan Zorina Bedow 49
China Aidan Rudeyeard 50
Ireland Saunder MacLice 48
France Waly Brunstan 53
China Gisele Enns 52
Peru Mina Winchester 48
Japan Torie MacShirrie 50
Russia Benjamen Kenford 51
China Etan Burn 53
Russia Merralee Chaperlin 38
Indonesia Lanny Malam 49
Canada Wilhelm Deeprose 54
Czech Republic Lari Hillhouse 48
China Ossie Woodley 52
Macedonia April Tyer 50
Vietnam Madelon Dansey 53
Ukraine Korella McNamee 52
Jamaica Linnea Cannam 43
China Mart Coling 52
Indonesia Marna Causbey 47
China Berni Daintier 55
Poland Cynthia Hassell 49
Canada Carma Schule 49
Indonesia Malia Blight 48
China Paulo Seivertsen 47
Niger Kaylee Hearley 54
Japan Maure Jandak 46
Argentina Foss Feavers 45
Venezuela Ron Leggitt 60
Russia Flint Gokes 40
China Linet Conelly 52
Philippines Nikolas Birtwell 57
Australia Eduard Leipelt 53

Notebook and analysis by Mohammad Sayem Chowdhury.

All code and explanations are my own. This notebook is shared under the MIT License.