{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"./images/DLI_Header.png\" width=400/>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Fundamentals of Accelerated Data Science # "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 05 - KNN ##\n",
"\n",
"**Table of Contents**\n",
"<br>\n",
"This notebook uses GPU-accelerated k-nearest neighbors to identify the road nodes nearest to each hospital. It covers the following sections: \n",
"1. [Environment](#Environment)\n",
"2. [Load Data](#Load-Data)\n",
" * [Road Nodes](#Road-Nodes)\n",
" * [Hospitals](#Hospitals)\n",
"3. [K-Nearest Neighbors](#K-Nearest-Neighbors)\n",
" * [Road Nodes Closest to Each Hospital](#Road-Nodes-Closest-to-Each-Hospital)\n",
" * [Viewing a Specific Hospital](#Viewing-a-Specific-Hospital)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Environment ##"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import cudf\n",
"import cuml"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Road Nodes ###\n",
"We begin by reading our road nodes data."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# road_nodes = cudf.read_csv('./data/road_nodes_2-06.csv', dtype=['str', 'float32', 'float32', 'str'])\n",
"road_nodes = cudf.read_csv('./data/road_nodes.csv', dtype=['str', 'float32', 'float32', 'str'])"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"node_id object\n",
"east float32\n",
"north float32\n",
"type object\n",
"dtype: object"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"road_nodes.dtypes"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3121148, 4)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"road_nodes.shape"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>node_id</th>\n",
" <th>east</th>\n",
" <th>north</th>\n",
" <th>type</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>id02FE73D4-E88D-4119-8DC2-6E80DE6F6594</td>\n",
" <td>320608.09375</td>\n",
" <td>870994.0000</td>\n",
" <td>junction</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>id634D65C1-C38B-4868-9080-2E1E47F0935C</td>\n",
" <td>320628.50000</td>\n",
" <td>871103.8125</td>\n",
" <td>road end</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>idDC14D4D1-774E-487D-8EDE-60B129E5482C</td>\n",
" <td>320635.46875</td>\n",
" <td>870983.8750</td>\n",
" <td>junction</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>id51555819-1A39-4B41-B0C9-C6D2086D9921</td>\n",
" <td>320648.68750</td>\n",
" <td>871083.5625</td>\n",
" <td>junction</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>id9E362428-79D7-4EE3-B015-0CE3F6A78A69</td>\n",
" <td>320658.18750</td>\n",
" <td>871162.3750</td>\n",
" <td>junction</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" node_id east north type\n",
"0 id02FE73D4-E88D-4119-8DC2-6E80DE6F6594 320608.09375 870994.0000 junction\n",
"1 id634D65C1-C38B-4868-9080-2E1E47F0935C 320628.50000 871103.8125 road end\n",
"2 idDC14D4D1-774E-487D-8EDE-60B129E5482C 320635.46875 870983.8750 junction\n",
"3 id51555819-1A39-4B41-B0C9-C6D2086D9921 320648.68750 871083.5625 junction\n",
"4 id9E362428-79D7-4EE3-B015-0CE3F6A78A69 320658.18750 871162.3750 junction"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"road_nodes.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Hospitals ###\n",
"Next we load the hospital data."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"hospitals = cudf.read_csv('./data/clean_hospitals_full.csv')"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"OrganisationID int64\n",
"OrganisationCode object\n",
"OrganisationType object\n",
"SubType object\n",
"Sector object\n",
"OrganisationStatus object\n",
"IsPimsManaged object\n",
"OrganisationName object\n",
"Address1 object\n",
"Address2 object\n",
"Address3 object\n",
"City object\n",
"County object\n",
"Postcode object\n",
"Latitude float64\n",
"Longitude float64\n",
"ParentODSCode object\n",
"ParentName object\n",
"Phone object\n",
"Email object\n",
"Website object\n",
"Fax object\n",
"northing float64\n",
"easting float64\n",
"dtype: object"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hospitals.dtypes"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(1226, 24)"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hospitals.shape"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>OrganisationID</th>\n",
" <th>OrganisationCode</th>\n",
" <th>OrganisationType</th>\n",
" <th>SubType</th>\n",
" <th>Sector</th>\n",
" <th>OrganisationStatus</th>\n",
" <th>IsPimsManaged</th>\n",
" <th>OrganisationName</th>\n",
" <th>Address1</th>\n",
" <th>Address2</th>\n",
" <th>...</th>\n",
" <th>Latitude</th>\n",
" <th>Longitude</th>\n",
" <th>ParentODSCode</th>\n",
" <th>ParentName</th>\n",
" <th>Phone</th>\n",
" <th>Email</th>\n",
" <th>Website</th>\n",
" <th>Fax</th>\n",
" <th>northing</th>\n",
" <th>easting</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>17970</td>\n",
" <td>NDA07</td>\n",
" <td>Hospital</td>\n",
" <td>Hospital</td>\n",
" <td>Independent Sector</td>\n",
" <td>Visible</td>\n",
" <td>TRUE</td>\n",
" <td>Walton Community Hospital - Virgin Care Servic...</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>Rodney Road</td>\n",
" <td>...</td>\n",
" <td>51.379997</td>\n",
" <td>-0.406042</td>\n",
" <td>NDA</td>\n",
" <td>Virgin Care Services Ltd</td>\n",
" <td>01932 414205</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>01932 253674</td>\n",
" <td>165810.4688</td>\n",
" <td>510917.5313</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>17981</td>\n",
" <td>NDA18</td>\n",
" <td>Hospital</td>\n",
" <td>Hospital</td>\n",
" <td>Independent Sector</td>\n",
" <td>Visible</td>\n",
" <td>TRUE</td>\n",
" <td>Woking Community Hospital (Virgin Care)</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>Heathside Road</td>\n",
" <td>...</td>\n",
" <td>51.315132</td>\n",
" <td>-0.556289</td>\n",
" <td>NDA</td>\n",
" <td>Virgin Care Services Ltd</td>\n",
" <td>01483 715911</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>158381.3438</td>\n",
" <td>500604.8438</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>18102</td>\n",
" <td>NLT02</td>\n",
" <td>Hospital</td>\n",
" <td>Hospital</td>\n",
" <td>NHS Sector</td>\n",
" <td>Visible</td>\n",
" <td>TRUE</td>\n",
" <td>North Somerset Community Hospital</td>\n",
" <td>North Somerset Community Hospital</td>\n",
" <td>Old Street</td>\n",
" <td>...</td>\n",
" <td>51.437195</td>\n",
" <td>-2.847193</td>\n",
" <td>NLT</td>\n",
" <td>North Somerset Community Partnership Community...</td>\n",
" <td>01275 872212</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>http://www.nscphealth.co.uk</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>171305.7813</td>\n",
" <td>341119.3750</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>18138</td>\n",
" <td>NMP01</td>\n",
" <td>Hospital</td>\n",
" <td>Hospital</td>\n",
" <td>Independent Sector</td>\n",
" <td>Visible</td>\n",
" <td>FALSE</td>\n",
" <td>Bridgewater Hospital</td>\n",
" <td>120 Princess Road</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>...</td>\n",
" <td>53.459743</td>\n",
" <td>-2.245469</td>\n",
" <td>NMP</td>\n",
" <td>Bridgewater Hospital (Manchester) Ltd</td>\n",
" <td>0161 2270000</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>www.bridgewaterhospital.com</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>395944.5625</td>\n",
" <td>383703.5938</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>18142</td>\n",
" <td>NMV01</td>\n",
" <td>Hospital</td>\n",
" <td>Hospital</td>\n",
" <td>Independent Sector</td>\n",
" <td>Visible</td>\n",
" <td>TRUE</td>\n",
" <td>Kneesworth House</td>\n",
" <td>Old North Road</td>\n",
" <td>Bassingbourn</td>\n",
" <td>...</td>\n",
" <td>52.078121</td>\n",
" <td>-0.030604</td>\n",
" <td>NMV</td>\n",
" <td>Partnerships In Care Ltd</td>\n",
" <td>01763 255 700</td>\n",
" <td>reception_kneesworthhouse@partnershipsincare.c...</td>\n",
" <td>www.partnershipsincare.co.uk</td>\n",
" <td>&lt;NA&gt;</td>\n",
" <td>244071.7031</td>\n",
" <td>534945.1875</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 24 columns</p>\n",
"</div>"
],
"text/plain": [
" OrganisationID OrganisationCode OrganisationType SubType \\\n",
"0 17970 NDA07 Hospital Hospital \n",
"1 17981 NDA18 Hospital Hospital \n",
"2 18102 NLT02 Hospital Hospital \n",
"3 18138 NMP01 Hospital Hospital \n",
"4 18142 NMV01 Hospital Hospital \n",
"\n",
" Sector OrganisationStatus IsPimsManaged \\\n",
"0 Independent Sector Visible TRUE \n",
"1 Independent Sector Visible TRUE \n",
"2 NHS Sector Visible TRUE \n",
"3 Independent Sector Visible FALSE \n",
"4 Independent Sector Visible TRUE \n",
"\n",
" OrganisationName \\\n",
"0 Walton Community Hospital - Virgin Care Servic... \n",
"1 Woking Community Hospital (Virgin Care) \n",
"2 North Somerset Community Hospital \n",
"3 Bridgewater Hospital \n",
"4 Kneesworth House \n",
"\n",
" Address1 Address2 ... Latitude \\\n",
"0 <NA> Rodney Road ... 51.379997 \n",
"1 <NA> Heathside Road ... 51.315132 \n",
"2 North Somerset Community Hospital Old Street ... 51.437195 \n",
"3 120 Princess Road <NA> ... 53.459743 \n",
"4 Old North Road Bassingbourn ... 52.078121 \n",
"\n",
" Longitude ParentODSCode \\\n",
"0 -0.406042 NDA \n",
"1 -0.556289 NDA \n",
"2 -2.847193 NLT \n",
"3 -2.245469 NMP \n",
"4 -0.030604 NMV \n",
"\n",
" ParentName Phone \\\n",
"0 Virgin Care Services Ltd 01932 414205 \n",
"1 Virgin Care Services Ltd 01483 715911 \n",
"2 North Somerset Community Partnership Community... 01275 872212 \n",
"3 Bridgewater Hospital (Manchester) Ltd 0161 2270000 \n",
"4 Partnerships In Care Ltd 01763 255 700 \n",
"\n",
" Email \\\n",
"0 <NA> \n",
"1 <NA> \n",
"2 <NA> \n",
"3 <NA> \n",
"4 reception_kneesworthhouse@partnershipsincare.c... \n",
"\n",
" Website Fax northing easting \n",
"0 <NA> 01932 253674 165810.4688 510917.5313 \n",
"1 <NA> <NA> 158381.3438 500604.8438 \n",
"2 http://www.nscphealth.co.uk <NA> 171305.7813 341119.3750 \n",
"3 www.bridgewaterhospital.com <NA> 395944.5625 383703.5938 \n",
"4 www.partnershipsincare.co.uk <NA> 244071.7031 534945.1875 \n",
"\n",
"[5 rows x 24 columns]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"hospitals.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## K-Nearest Neighbors ##\n",
"We are going to use the [k-nearest neighbors](https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm) algorithm to find the *k* nearest road nodes for every hospital. We will first fit a KNN model on the road node coordinates, and then query the fitted model with the hospital locations so that it returns the nearest road nodes.\n",
"\n",
"Create a k-nearest neighbors model `knn` by using the `cuml.NearestNeighbors` constructor, passing it the named argument `n_neighbors` set to 3.\n",
"\n",
"Create a new dataframe `road_locs` using the `road_nodes` columns `east` and `north`. The order of the columns doesn't matter, except that we will need them to remain consistent over multiple operations, so please use the ordering `['east', 'north']`.\n",
"\n",
"Fit the `knn` model with `road_locs` using the `knn.fit` method."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"# Create a GPU nearest-neighbors model that returns 3 neighbors per query by default\n",
"knn = cuml.NearestNeighbors(n_neighbors=3)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<style>#sk-container-id-1 {\n",
" /* Definition of color scheme common for light and dark mode */\n",
" --sklearn-color-text: black;\n",
" --sklearn-color-line: gray;\n",
" /* Definition of color scheme for unfitted estimators */\n",
" --sklearn-color-unfitted-level-0: #fff5e6;\n",
" --sklearn-color-unfitted-level-1: #f6e4d2;\n",
" --sklearn-color-unfitted-level-2: #ffe0b3;\n",
" --sklearn-color-unfitted-level-3: chocolate;\n",
" /* Definition of color scheme for fitted estimators */\n",
" --sklearn-color-fitted-level-0: #f0f8ff;\n",
" --sklearn-color-fitted-level-1: #d4ebff;\n",
" --sklearn-color-fitted-level-2: #b3dbfd;\n",
" --sklearn-color-fitted-level-3: cornflowerblue;\n",
"\n",
" /* Specific color for light theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, white)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, black)));\n",
" --sklearn-color-icon: #696969;\n",
"\n",
" @media (prefers-color-scheme: dark) {\n",
" /* Redefinition of color scheme for dark theme */\n",
" --sklearn-color-text-on-default-background: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-background: var(--sg-background-color, var(--theme-background, var(--jp-layout-color0, #111)));\n",
" --sklearn-color-border-box: var(--sg-text-color, var(--theme-code-foreground, var(--jp-content-font-color1, white)));\n",
" --sklearn-color-icon: #878787;\n",
" }\n",
"}\n",
"\n",
"#sk-container-id-1 {\n",
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
"#sk-container-id-1 pre {\n",
" padding: 0;\n",
"}\n",
"\n",
"#sk-container-id-1 input.sk-hidden--visually {\n",
" border: 0;\n",
" clip: rect(1px 1px 1px 1px);\n",
" clip: rect(1px, 1px, 1px, 1px);\n",
" height: 1px;\n",
" margin: -1px;\n",
" overflow: hidden;\n",
" padding: 0;\n",
" position: absolute;\n",
" width: 1px;\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-dashed-wrapped {\n",
" border: 1px dashed var(--sklearn-color-line);\n",
" margin: 0 0.4em 0.5em 0.4em;\n",
" box-sizing: border-box;\n",
" padding-bottom: 0.4em;\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-container {\n",
" /* jupyter's `normalize.less` sets `[hidden] { display: none; }`\n",
" but bootstrap.min.css set `[hidden] { display: none !important; }`\n",
" so we also need the `!important` here to be able to override the\n",
" default hidden behavior on the sphinx rendered scikit-learn.org.\n",
" See: https://github.com/scikit-learn/scikit-learn/issues/21755 */\n",
" display: inline-block !important;\n",
" position: relative;\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-text-repr-fallback {\n",
" display: none;\n",
"}\n",
"\n",
"div.sk-parallel-item,\n",
"div.sk-serial,\n",
"div.sk-item {\n",
" /* draw centered vertical line to link estimators */\n",
" background-image: linear-gradient(var(--sklearn-color-text-on-default-background), var(--sklearn-color-text-on-default-background));\n",
" background-size: 2px 100%;\n",
" background-repeat: no-repeat;\n",
" background-position: center center;\n",
"}\n",
"\n",
"/* Parallel-specific style estimator block */\n",
"\n",
"#sk-container-id-1 div.sk-parallel-item::after {\n",
" content: \"\";\n",
" width: 100%;\n",
" border-bottom: 2px solid var(--sklearn-color-text-on-default-background);\n",
" flex-grow: 1;\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-parallel {\n",
" display: flex;\n",
" align-items: stretch;\n",
" justify-content: center;\n",
" background-color: var(--sklearn-color-background);\n",
" position: relative;\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-parallel-item {\n",
" display: flex;\n",
" flex-direction: column;\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-parallel-item:first-child::after {\n",
" align-self: flex-end;\n",
" width: 50%;\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-parallel-item:last-child::after {\n",
" align-self: flex-start;\n",
" width: 50%;\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-parallel-item:only-child::after {\n",
" width: 0;\n",
"}\n",
"\n",
"/* Serial-specific style estimator block */\n",
"\n",
"#sk-container-id-1 div.sk-serial {\n",
" display: flex;\n",
" flex-direction: column;\n",
" align-items: center;\n",
" background-color: var(--sklearn-color-background);\n",
" padding-right: 1em;\n",
" padding-left: 1em;\n",
"}\n",
"\n",
"\n",
"/* Toggleable style: style used for estimator/Pipeline/ColumnTransformer box that is\n",
"clickable and can be expanded/collapsed.\n",
"- Pipeline and ColumnTransformer use this feature and define the default style\n",
"- Estimators will overwrite some part of the style using the `sk-estimator` class\n",
"*/\n",
"\n",
"/* Pipeline and ColumnTransformer style (default) */\n",
"\n",
"#sk-container-id-1 div.sk-toggleable {\n",
" /* Default theme specific background. It is overwritten whether we have a\n",
" specific estimator or a Pipeline/ColumnTransformer */\n",
" background-color: var(--sklearn-color-background);\n",
"}\n",
"\n",
"/* Toggleable label */\n",
"#sk-container-id-1 label.sk-toggleable__label {\n",
" cursor: pointer;\n",
" display: block;\n",
" width: 100%;\n",
" margin-bottom: 0;\n",
" padding: 0.5em;\n",
" box-sizing: border-box;\n",
" text-align: center;\n",
"}\n",
"\n",
"#sk-container-id-1 label.sk-toggleable__label-arrow:before {\n",
" /* Arrow on the left of the label */\n",
" content: \"▸\";\n",
" float: left;\n",
" margin-right: 0.25em;\n",
" color: var(--sklearn-color-icon);\n",
"}\n",
"\n",
"#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {\n",
" color: var(--sklearn-color-text);\n",
"}\n",
"\n",
"/* Toggleable content - dropdown */\n",
"\n",
"#sk-container-id-1 div.sk-toggleable__content {\n",
" max-height: 0;\n",
" max-width: 0;\n",
" overflow: hidden;\n",
" text-align: left;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-toggleable__content.fitted {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-toggleable__content pre {\n",
" margin: 0.2em;\n",
" border-radius: 0.25em;\n",
" color: var(--sklearn-color-text);\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-toggleable__content.fitted pre {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {\n",
" /* Expand drop-down */\n",
" max-height: 200px;\n",
" max-width: 100%;\n",
" overflow: auto;\n",
"}\n",
"\n",
"#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {\n",
" content: \"▾\";\n",
"}\n",
"\n",
"/* Pipeline/ColumnTransformer-specific style */\n",
"\n",
"#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-label.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator-specific style */\n",
"\n",
"/* Colorize estimator box */\n",
"#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-estimator.fitted input.sk-toggleable__control:checked~label.sk-toggleable__label {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-label label.sk-toggleable__label,\n",
"#sk-container-id-1 div.sk-label label {\n",
" /* The background is the default theme color */\n",
" color: var(--sklearn-color-text-on-default-background);\n",
"}\n",
"\n",
"/* On hover, darken the color of the background */\n",
"#sk-container-id-1 div.sk-label:hover label.sk-toggleable__label {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"/* Label box, darken color on hover, fitted */\n",
"#sk-container-id-1 div.sk-label.fitted:hover label.sk-toggleable__label.fitted {\n",
" color: var(--sklearn-color-text);\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Estimator label */\n",
"\n",
"#sk-container-id-1 div.sk-label label {\n",
" font-family: monospace;\n",
" font-weight: bold;\n",
" display: inline-block;\n",
" line-height: 1.2em;\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-label-container {\n",
" text-align: center;\n",
"}\n",
"\n",
"/* Estimator-specific */\n",
"#sk-container-id-1 div.sk-estimator {\n",
" font-family: monospace;\n",
" border: 1px dotted var(--sklearn-color-border-box);\n",
" border-radius: 0.25em;\n",
" box-sizing: border-box;\n",
" margin-bottom: 0.5em;\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-0);\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-estimator.fitted {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-0);\n",
"}\n",
"\n",
"/* on hover */\n",
"#sk-container-id-1 div.sk-estimator:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-2);\n",
"}\n",
"\n",
"#sk-container-id-1 div.sk-estimator.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-2);\n",
"}\n",
"\n",
"/* Specification for estimator info (e.g. \"i\" and \"?\") */\n",
"\n",
"/* Common style for \"i\" and \"?\" */\n",
"\n",
".sk-estimator-doc-link,\n",
"a:link.sk-estimator-doc-link,\n",
"a:visited.sk-estimator-doc-link {\n",
" float: right;\n",
" font-size: smaller;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1em;\n",
" height: 1em;\n",
" width: 1em;\n",
" text-decoration: none !important;\n",
" margin-left: 1ex;\n",
" /* unfitted */\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted,\n",
"a:link.sk-estimator-doc-link.fitted,\n",
"a:visited.sk-estimator-doc-link.fitted {\n",
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
"div.sk-estimator:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link:hover,\n",
".sk-estimator-doc-link:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"div.sk-estimator.fitted:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover,\n",
"div.sk-label-container:hover .sk-estimator-doc-link.fitted:hover,\n",
".sk-estimator-doc-link.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"/* Span, style for the box shown on hovering the info icon */\n",
".sk-estimator-doc-link span {\n",
" display: none;\n",
" z-index: 9999;\n",
" position: relative;\n",
" font-weight: normal;\n",
" right: .2ex;\n",
" padding: .5ex;\n",
" margin: .5ex;\n",
" width: min-content;\n",
" min-width: 20ex;\n",
" max-width: 50ex;\n",
" color: var(--sklearn-color-text);\n",
" box-shadow: 2pt 2pt 4pt #999;\n",
" /* unfitted */\n",
" background: var(--sklearn-color-unfitted-level-0);\n",
" border: .5pt solid var(--sklearn-color-unfitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link.fitted span {\n",
" /* fitted */\n",
" background: var(--sklearn-color-fitted-level-0);\n",
" border: var(--sklearn-color-fitted-level-3);\n",
"}\n",
"\n",
".sk-estimator-doc-link:hover span {\n",
" display: block;\n",
"}\n",
"\n",
"/* \"?\"-specific style due to the `<a>` HTML tag */\n",
"\n",
"#sk-container-id-1 a.estimator_doc_link {\n",
" float: right;\n",
" font-size: 1rem;\n",
" line-height: 1em;\n",
" font-family: monospace;\n",
" background-color: var(--sklearn-color-background);\n",
" border-radius: 1rem;\n",
" height: 1rem;\n",
" width: 1rem;\n",
" text-decoration: none;\n",
" /* unfitted */\n",
" color: var(--sklearn-color-unfitted-level-1);\n",
" border: var(--sklearn-color-unfitted-level-1) 1pt solid;\n",
"}\n",
"\n",
"#sk-container-id-1 a.estimator_doc_link.fitted {\n",
" /* fitted */\n",
" border: var(--sklearn-color-fitted-level-1) 1pt solid;\n",
" color: var(--sklearn-color-fitted-level-1);\n",
"}\n",
"\n",
"/* On hover */\n",
"#sk-container-id-1 a.estimator_doc_link:hover {\n",
" /* unfitted */\n",
" background-color: var(--sklearn-color-unfitted-level-3);\n",
" color: var(--sklearn-color-background);\n",
" text-decoration: none;\n",
"}\n",
"\n",
"#sk-container-id-1 a.estimator_doc_link.fitted:hover {\n",
" /* fitted */\n",
" background-color: var(--sklearn-color-fitted-level-3);\n",
"}\n",
"</style><div id=\"sk-container-id-1\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>NearestNeighbors()</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator fitted sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-1\" type=\"checkbox\" checked><label for=\"sk-estimator-id-1\" class=\"sk-toggleable__label fitted sk-toggleable__label-arrow fitted\">&nbsp;NearestNeighbors<span class=\"sk-estimator-doc-link fitted\">i<span>Fitted</span></span></label><div class=\"sk-toggleable__content fitted\"><pre>NearestNeighbors()</pre></div> </div></div></div></div>"
],
"text/plain": [
"NearestNeighbors()"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Use the column order ['east', 'north']; later queries must use the same order\n",
"road_locs = road_nodes[['east', 'north']]\n",
"knn.fit(road_locs)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Road Nodes Closest to Each Hospital ###\n",
"Use the `knn.kneighbors` method to find the 3 closest road nodes to each hospital. `knn.kneighbors` expects 2 arguments: `X`, for which you should use the `easting` and `northing` columns of `hospitals` (in that order, to match the `['east', 'north']` order used when fitting `knn` above), and `n_neighbors`, the number of neighbors to search for, in this case 3.\n",
"\n",
"`knn.kneighbors` will return 2 cudf dataframes, which you should name `distances` and `indices` respectively. Each has one row per hospital and one column per neighbor; `indices` holds row positions in `road_locs`, not `node_id` values."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"distances, indices = knn.kneighbors(hospitals[['easting', 'northing']], 3) # order has to match the knn fit order (east, north)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Viewing a Specific Hospital ###\n",
"We can now use `indices`, `hospitals`, and `road_nodes` to derive information specific to a given hospital. Here we will examine the hospital at index `10`. First we view the hospital's grid coordinates:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"hospital coordinates:\n",
"easting 260713.17190\n",
"northing 56303.21875\n",
"Name: 10, dtype: float64\n"
]
}
],
"source": [
"SELECTED_RESULT = 10\n",
"print('hospital coordinates:\\n', hospitals.loc[SELECTED_RESULT, ['easting', 'northing']], sep='')"
]
},
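{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we view the 3 closest road nodes. Note that `indices` holds row positions in `road_nodes` rather than the string `node_id` values, even though the output below is labeled `node_id`:"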
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"node_id:\n",
"0 118559\n",
"1 118560\n",
"2 118678\n",
"Name: 10, dtype: int64\n"
]
}
],
"source": [
"nearest_road_nodes = indices.iloc[SELECTED_RESULT, 0:3]\n",
"print('node_id:\\n', nearest_road_nodes, sep='')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And finally the grid coordinates for the 3 nearest road nodes, which we can confirm are located in order of increasing distance from the hospital:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"road_node coordinates:\n",
" east north\n",
"118559 260697.859375 56322.710938\n",
"118560 260722.812500 56207.925781\n",
"118678 260540.000000 56105.000000\n"
]
}
],
"source": [
"print('road_node coordinates:\\n', road_nodes.loc[nearest_road_nodes, ['east', 'north']], sep='')"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'status': 'ok', 'restart': True}"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Shut down and restart the kernel (True requests a restart), clearing this notebook's state\n",
"import IPython\n",
"app = IPython.Application.instance()\n",
"app.kernel.do_shutdown(True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Well Done!** Let's move to the [next notebook](3-06_xgboost.ipynb). "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<img src=\"./images/DLI_Header.png\" width=400/>"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.15"
}
},
"nbformat": 4,
"nbformat_minor": 4
}