To interact with SQL databases in Jupyter, I use the ipython-sql extension. This lets me run SQL queries directly in notebook cells using special "magic" commands (like %sql).
# If you're running this outside my environment, you may need to install these libraries:
# !pip install ibm_db ibm_db_sa ipython-sql
%load_ext sql
Once the extension is loaded, I connect to my database using a connection string. Make sure to use your own credentials.
# Format: %sql ibm_db_sa://username:password@hostname:port/dbname?security=SSL
# Enter your connection string below
%sql ibm_db_sa://
You can use %%sql at the top of a cell to run multiple SQL statements at once. Here, I create a table and insert some test data for practice.
%%sql
CREATE TABLE INTERNATIONAL_STUDENT_TEST_SCORES (
country VARCHAR(50),
first_name VARCHAR(50),
last_name VARCHAR(50),
test_score INT
);
INSERT INTO INTERNATIONAL_STUDENT_TEST_SCORES (country, first_name, last_name, test_score)
VALUES
('United States', 'Marshall', 'Bernadot', 54),
('Ghana', 'Celinda', 'Malkin', 51),
('Ukraine', 'Guillermo', 'Furze', 53),
('Greece', 'Aharon', 'Tunnow', 48),
('Russia', 'Bail', 'Goodwin', 46),
('Poland', 'Cole', 'Winteringham', 49),
('Sweden', 'Emlyn', 'Erricker', 55),
('Russia', 'Cathee', 'Sivewright', 49),
('China', 'Barny', 'Ingerson', 57),
('Uganda', 'Sharla', 'Papaccio', 55),
('China', 'Stella', 'Youens', 51),
('Poland', 'Julio', 'Buesden', 48),
('United States', 'Tiffie', 'Cosely', 58),
('Poland', 'Auroora', 'Stiffell', 45),
('China', 'Clarita', 'Huet', 52),
('Poland', 'Shannon', 'Goulden', 45),
('Philippines', 'Emylee', 'Privost', 50),
('France', 'Madelina', 'Burk', 49),
('China', 'Saunderson', 'Root', 58),
('Indonesia', 'Bo', 'Waring', 55),
('China', 'Hollis', 'Domotor', 45),
('Russia', 'Robbie', 'Collip', 46),
('Philippines', 'Davon', 'Donisi', 46),
('China', 'Cristabel', 'Radeliffe', 48),
('China', 'Wallis', 'Bartleet', 58),
('Moldova', 'Arleen', 'Stailey', 38),
('Ireland', 'Mendel', 'Grumble', 58),
('China', 'Sallyann', 'Exley', 51),
('Mexico', 'Kain', 'Swaite', 46),
('Indonesia', 'Alonso', 'Bulteel', 45),
('Armenia', 'Anatol', 'Tankus', 51),
('Armenia', 'Anatol', 'Tankus', 51),
('Indonesia', 'Coralyn', 'Dawkins', 48),
('China', 'Deanne', 'Edwinson', 45),
('China', 'Georgiana', 'Epple', 51),
('Portugal', 'Bartlet', 'Breese', 56),
('Azerbaijan', 'Idalina', 'Lukash', 50),
('France', 'Livvie', 'Flory', 54),
('Malaysia', 'Nonie', 'Borit', 48),
('Indonesia', 'Clio', 'Mugg', 47),
('Brazil', 'Westley', 'Measor', 48),
('Philippines', 'Katrinka', 'Sibbert', 51),
('Poland', 'Valentia', 'Mounch', 50),
('Norway', 'Sheilah', 'Hedditch', 53),
('Papua New Guinea', 'Itch', 'Jubb', 50),
('Latvia', 'Stesha', 'Garnson', 53),
('Canada', 'Cristionna', 'Wadmore', 46),
('China', 'Lianna', 'Gatward', 43),
('Guatemala', 'Tanney', 'Vials', 48),
('France', 'Alma', 'Zavittieri', 44),
('China', 'Alvira', 'Tamas', 50),
('United States', 'Shanon', 'Peres', 45),
('Sweden', 'Maisey', 'Lynas', 53),
('Indonesia', 'Kip', 'Hothersall', 46),
('China', 'Cash', 'Landis', 48),
('Panama', 'Kennith', 'Digance', 45),
('China', 'Ulberto', 'Riggeard', 48),
('Switzerland', 'Judy', 'Gilligan', 49),
('Philippines', 'Tod', 'Trevaskus', 52),
('Brazil', 'Herold', 'Heggs', 44),
('Latvia', 'Verney', 'Note', 50),
('Poland', 'Temp', 'Ribey', 50),
('China', 'Conroy', 'Egdal', 48),
('Japan', 'Gabie', 'Alessandone', 47),
('Ukraine', 'Devlen', 'Chaperlin', 54),
('France', 'Babbette', 'Turner', 51),
('Czech Republic', 'Virgil', 'Scotney', 52),
('Tajikistan', 'Zorina', 'Bedow', 49),
('China', 'Aidan', 'Rudeyeard', 50),
('Ireland', 'Saunder', 'MacLice', 48),
('France', 'Waly', 'Brunstan', 53),
('China', 'Gisele', 'Enns', 52),
('Peru', 'Mina', 'Winchester', 48),
('Japan', 'Torie', 'MacShirrie', 50),
('Russia', 'Benjamen', 'Kenford', 51),
('China', 'Etan', 'Burn', 53),
('Russia', 'Merralee', 'Chaperlin', 38),
('Indonesia', 'Lanny', 'Malam', 49),
('Canada', 'Wilhelm', 'Deeprose', 54),
('Czech Republic', 'Lari', 'Hillhouse', 48),
('China', 'Ossie', 'Woodley', 52),
('Macedonia', 'April', 'Tyer', 50),
('Vietnam', 'Madelon', 'Dansey', 53),
('Ukraine', 'Korella', 'McNamee', 52),
('Jamaica', 'Linnea', 'Cannam', 43),
('China', 'Mart', 'Coling', 52),
('Indonesia', 'Marna', 'Causbey', 47),
('China', 'Berni', 'Daintier', 55),
('Poland', 'Cynthia', 'Hassell', 49),
('Canada', 'Carma', 'Schule', 49),
('Indonesia', 'Malia', 'Blight', 48),
('China', 'Paulo', 'Seivertsen', 47),
('Niger', 'Kaylee', 'Hearley', 54),
('Japan', 'Maure', 'Jandak', 46),
('Argentina', 'Foss', 'Feavers', 45),
('Venezuela', 'Ron', 'Leggitt', 60),
('Russia', 'Flint', 'Gokes', 40),
('China', 'Linet', 'Conelly', 52),
('Philippines', 'Nikolas', 'Birtwell', 57),
('Australia', 'Eduard', 'Leipelt', 53)
* ibm_db_sa://pgs19748:***@6667d8e9-9d4d-4ccb-ba32-21da3bb5aafc.c1ogj3sd0tgtu0lqde00.databases.appdomain.cloud:30376/BLUDB Done. 99 rows affected.
[]
Using Python Variables in SQL¶
You can use Python variables in your SQL statements by adding a : prefix to the variable name. For example, if I have a variable country = "Canada", I can use it in a query like this:
country = "Canada"
%sql select * from INTERNATIONAL_STUDENT_TEST_SCORES where country = :country
* ibm_db_sa://pgs19748:***@6667d8e9-9d4d-4ccb-ba32-21da3bb5aafc.c1ogj3sd0tgtu0lqde00.databases.appdomain.cloud:30376/BLUDB Done.
| country | first_name | last_name | test_score |
|---|---|---|---|
| Canada | Cristionna | Wadmore | 46 |
| Canada | Wilhelm | Deeprose | 54 |
| Canada | Carma | Schule | 49 |
Assigning Query Results to Python Variables¶
You can assign the results of a SQL query to a Python variable using the = operator. For example:
test_score_distribution = %sql SELECT test_score as "TestScore", count(*) as "Frequency" from INTERNATIONAL_STUDENT_TEST_SCORES GROUP BY test_score;
test_score_distribution
* ibm_db_sa://pgs19748:***@6667d8e9-9d4d-4ccb-ba32-21da3bb5aafc.c1ogj3sd0tgtu0lqde00.databases.appdomain.cloud:30376/BLUDB Done.
| testscore | frequency |
|---|---|
| 38 | 2 |
| 40 | 1 |
| 43 | 2 |
| 44 | 2 |
| 45 | 8 |
| 46 | 7 |
| 47 | 4 |
| 48 | 14 |
| 49 | 8 |
| 50 | 10 |
| 51 | 8 |
| 52 | 8 |
| 53 | 8 |
| 54 | 5 |
| 55 | 4 |
| 56 | 1 |
| 57 | 2 |
| 58 | 4 |
| 60 | 1 |
Converting Query Results to DataFrames¶
You can easily convert a SQL query result to a pandas DataFrame using the DataFrame() method. This makes it easy to visualize or further analyze your results.
dataframe = test_score_distribution.DataFrame()
%matplotlib inline
# If you get a module error saying seaborn not found, uncomment the next line:
# !pip install seaborn==0.9.0
import seaborn
plot = seaborn.barplot(x='testscore',y='frequency', data=dataframe)
Now you know how to work with SQL databases from within Jupyter notebooks using SQL magic!
%%sql
-- Feel free to experiment with the data set provided in this notebook for practice:
SELECT country, first_name, last_name, test_score FROM INTERNATIONAL_STUDENT_TEST_SCORES;
* ibm_db_sa://pgs19748:***@6667d8e9-9d4d-4ccb-ba32-21da3bb5aafc.c1ogj3sd0tgtu0lqde00.databases.appdomain.cloud:30376/BLUDB Done.
| country | first_name | last_name | test_score |
|---|---|---|---|
| United States | Marshall | Bernadot | 54 |
| Ghana | Celinda | Malkin | 51 |
| Ukraine | Guillermo | Furze | 53 |
| Greece | Aharon | Tunnow | 48 |
| Russia | Bail | Goodwin | 46 |
| Poland | Cole | Winteringham | 49 |
| Sweden | Emlyn | Erricker | 55 |
| Russia | Cathee | Sivewright | 49 |
| China | Barny | Ingerson | 57 |
| Uganda | Sharla | Papaccio | 55 |
| China | Stella | Youens | 51 |
| Poland | Julio | Buesden | 48 |
| United States | Tiffie | Cosely | 58 |
| Poland | Auroora | Stiffell | 45 |
| China | Clarita | Huet | 52 |
| Poland | Shannon | Goulden | 45 |
| Philippines | Emylee | Privost | 50 |
| France | Madelina | Burk | 49 |
| China | Saunderson | Root | 58 |
| Indonesia | Bo | Waring | 55 |
| China | Hollis | Domotor | 45 |
| Russia | Robbie | Collip | 46 |
| Philippines | Davon | Donisi | 46 |
| China | Cristabel | Radeliffe | 48 |
| China | Wallis | Bartleet | 58 |
| Moldova | Arleen | Stailey | 38 |
| Ireland | Mendel | Grumble | 58 |
| China | Sallyann | Exley | 51 |
| Mexico | Kain | Swaite | 46 |
| Indonesia | Alonso | Bulteel | 45 |
| Armenia | Anatol | Tankus | 51 |
| Indonesia | Coralyn | Dawkins | 48 |
| China | Deanne | Edwinson | 45 |
| China | Georgiana | Epple | 51 |
| Portugal | Bartlet | Breese | 56 |
| Azerbaijan | Idalina | Lukash | 50 |
| France | Livvie | Flory | 54 |
| Malaysia | Nonie | Borit | 48 |
| Indonesia | Clio | Mugg | 47 |
| Brazil | Westley | Measor | 48 |
| Philippines | Katrinka | Sibbert | 51 |
| Poland | Valentia | Mounch | 50 |
| Norway | Sheilah | Hedditch | 53 |
| Papua New Guinea | Itch | Jubb | 50 |
| Latvia | Stesha | Garnson | 53 |
| Canada | Cristionna | Wadmore | 46 |
| China | Lianna | Gatward | 43 |
| Guatemala | Tanney | Vials | 48 |
| France | Alma | Zavittieri | 44 |
| China | Alvira | Tamas | 50 |
| United States | Shanon | Peres | 45 |
| Sweden | Maisey | Lynas | 53 |
| Indonesia | Kip | Hothersall | 46 |
| China | Cash | Landis | 48 |
| Panama | Kennith | Digance | 45 |
| China | Ulberto | Riggeard | 48 |
| Switzerland | Judy | Gilligan | 49 |
| Philippines | Tod | Trevaskus | 52 |
| Brazil | Herold | Heggs | 44 |
| Latvia | Verney | Note | 50 |
| Poland | Temp | Ribey | 50 |
| China | Conroy | Egdal | 48 |
| Japan | Gabie | Alessandone | 47 |
| Ukraine | Devlen | Chaperlin | 54 |
| France | Babbette | Turner | 51 |
| Czech Republic | Virgil | Scotney | 52 |
| Tajikistan | Zorina | Bedow | 49 |
| China | Aidan | Rudeyeard | 50 |
| Ireland | Saunder | MacLice | 48 |
| France | Waly | Brunstan | 53 |
| China | Gisele | Enns | 52 |
| Peru | Mina | Winchester | 48 |
| Japan | Torie | MacShirrie | 50 |
| Russia | Benjamen | Kenford | 51 |
| China | Etan | Burn | 53 |
| Russia | Merralee | Chaperlin | 38 |
| Indonesia | Lanny | Malam | 49 |
| Canada | Wilhelm | Deeprose | 54 |
| Czech Republic | Lari | Hillhouse | 48 |
| China | Ossie | Woodley | 52 |
| Macedonia | April | Tyer | 50 |
| Vietnam | Madelon | Dansey | 53 |
| Ukraine | Korella | McNamee | 52 |
| Jamaica | Linnea | Cannam | 43 |
| China | Mart | Coling | 52 |
| Indonesia | Marna | Causbey | 47 |
| China | Berni | Daintier | 55 |
| Poland | Cynthia | Hassell | 49 |
| Canada | Carma | Schule | 49 |
| Indonesia | Malia | Blight | 48 |
| China | Paulo | Seivertsen | 47 |
| Niger | Kaylee | Hearley | 54 |
| Japan | Maure | Jandak | 46 |
| Argentina | Foss | Feavers | 45 |
| Venezuela | Ron | Leggitt | 60 |
| Russia | Flint | Gokes | 40 |
| China | Linet | Conelly | 52 |
| Philippines | Nikolas | Birtwell | 57 |
| Australia | Eduard | Leipelt | 53 |
Notebook and analysis by Mohammad Sayem Chowdhury.
All code and explanations are my own. This notebook is shared under the MIT License.