mirror of
https://github.com/bvanroll/college-datascience.git
synced 2025-08-29 12:02:45 +00:00
eeyyyyyyyyyyyyyy
101
5/.ipynb_checkpoints/Labo5-checkpoint.ipynb
Normal file
@@ -0,0 +1,101 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Lab 5 Data Science: K-NN"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"__Exercise 1__ K-NN classification\n",
"\n",
"Given the following small dataset:\n",
"\n",
"| name        | sweetness | crunchiness | type      |\n",
"|-------------|:---------:|:-----------:|----------:|\n",
"| grapefruit  | 8         | 5           | fruit     |\n",
"| green bean  | 3         | 7           | vegetable |\n",
"| nut         | 3         | 6           | protein   |\n",
"| orange      | 7         | 3           | fruit     |\n",
"\n",
"We now want to decide, for 2 unknown ingredients, which category they belong to: _fruit, vegetable or protein_. These ingredients are:\n",
"\n",
"| name   | sweetness | crunchiness | type |\n",
"|--------|:---------:|:-----------:|-----:|\n",
"| tomato | 6         | 4           | ?    |\n",
"| carrot | 4         | 9           | ?    |\n",
"\n",
"Use K-NN to perform this classification.\n",
"\n",
"* First do this visually: plot the training and test data in the plane and determine the classification by eye. Give data from the same class the same colour.\n",
"\n",
"* Then use the KNeighborsClassifier from the sklearn module to make the predictions. Do this first for $k=1$, then for $k=4$. Can you explain these classifications logically?\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.neighbors import KNeighborsClassifier\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Exercise 2** K-NN regression: the goal is to build a model that predicts a car's fuel consumption (in miles per gallon) from the car's displacement value and its horsepower (the displacement and horsepower columns of _auto.csv_).\n",
"\n",
"1. Read the file _auto.csv_ into a dataframe. Check what exactly the data contains and how large it is.\n",
"\n",
"2. Use the $train\\_test\\_split$ method to split your data into training and test data. Take 30% of the data as test data and 70% as training data.\n",
"\n",
"3. First determine what a good value for $k$ would be in this case. Use \n",
" the elbow method for this. As the error, compute the $mean\\_squared\\_error \\; (mse)$ for each value of $k$ you test. Plot the _elbow_ in a graph (i.e. for each tested $k$, the corresponding $mse$). Take odd values of $k$, as follows: $k\\_waarden = np.arange(1,20,2)$\n",
"The $mse$ is the mean of the squared differences between each predicted value and its actual value:\n",
" \n",
" \\begin{equation}\n",
" mse = \\frac{1}{len(testset)} \\sum_i (y_{i\\;predicted} - y_{i\\;expected})^2\n",
" \\end{equation}\n",
" Fortunately, Python can also compute this for you, via the $mean\\_squared\\_error$ function from $sklearn.metrics$.\n",
"Also be sure to test the effect of the $random\\_state$ parameter in your call to the $train\\_test\\_split$ method that you used above to generate your test set.\n",
"\n",
"4. Now continue with the value of $k$ that gives the minimal error in your graph. Train your model and compute the accuracy (for a regressor, the $R^2$ value returned by its score method) and the mse on your test set. Make a plot showing, for the test data, the predicted and the actual mpg values. For the X-axis, simply use range(1,119), i.e. a numbering of the elements in your test set.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
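To make the reasoning asked for in exercise 1 concrete, the following is a minimal from-scratch sketch of the K-NN vote on the four training ingredients (names given in English, with the original Dutch names in comments). It is not the `KNeighborsClassifier` call the exercise asks for: the helper name `knn_predict` is hypothetical, and Euclidean distance with an unweighted majority vote is an assumption about the intended setup (it matches the scikit-learn defaults, so the library should give the same labels here).

```python
import math
from collections import Counter

# Training data from the exercise 1 table: (sweetness, crunchiness) -> type.
TRAIN = [
    ((8, 5), "fruit"),      # grapefruit (pompelmoes)
    ((3, 7), "vegetable"),  # green bean (groene boon)
    ((3, 6), "protein"),    # nut (noot)
    ((7, 3), "fruit"),      # orange (appelsien)
]


def knn_predict(point, k, train=TRAIN):
    """Return the majority label among the k training points nearest to `point`."""
    # Sort the training points by Euclidean distance to the query point.
    by_distance = sorted(
        train,
        key=lambda item: math.hypot(item[0][0] - point[0], item[0][1] - point[1]),
    )
    # Count the labels of the k nearest points and take the most frequent one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


for name, point in [("tomato", (6, 4)), ("carrot", (4, 9))]:
    for k in (1, 4):
        print(f"{name}, k={k}: {knn_predict(point, k)}")
```

With $k=1$ each ingredient takes the label of its single nearest neighbour (orange for the tomato, green bean for the carrot). With $k=4$ all four training points vote, so the result is always the majority class of the training set (fruit), whatever the query point is; that is why the carrot changes from vegetable to fruit.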
195
5/Labo5.ipynb
Normal file
File diff suppressed because one or more lines are too long
393
5/auto.csv
Normal file
@@ -0,0 +1,393 @@
displacement,horsepower,mpg
307,130,18
350,165,15
318,150,18
304,150,16
302,140,17
429,198,15
454,220,14
440,215,14
455,225,14
390,190,15
383,170,15
340,160,14
400,150,15
455,225,14
113,95,24
198,95,22
199,97,18
200,85,21
97,88,27
97,46,26
110,87,25
107,90,24
104,95,25
121,113,26
199,90,21
360,215,10
307,200,10
318,210,11
304,193,9
97,88,27
140,90,28
113,95,25
232,100,19
225,105,16
250,100,17
250,88,19
232,100,18
350,165,14
400,175,14
351,153,14
318,150,14
383,180,12
400,170,13
400,175,13
258,110,18
140,72,22
250,100,19
250,88,18
122,86,23
116,90,28
79,70,30
88,76,30
71,65,31
72,69,35
97,60,27
91,70,26
113,95,24
97.5,80,25
97,54,23
140,90,20
122,86,21
350,165,13
400,175,14
318,150,15
351,153,14
304,150,17
429,208,11
350,155,13
350,160,12
400,190,13
70,97,19
304,150,15
307,130,13
302,140,13
318,150,14
121,112,18
121,76,22
120,87,21
96,69,26
122,86,22
97,92,28
120,97,23
98,80,28
97,88,27
350,175,13
304,150,14
350,145,13
302,137,14
318,150,15
429,198,12
400,150,13
351,158,13
318,150,14
440,215,13
455,225,12
360,175,13
225,105,18
250,100,16
232,100,18
250,88,18
198,95,23
97,46,26
400,150,11
400,167,12
360,170,13
350,180,12
232,100,18
97,88,20
140,72,21
108,94,22
70,90,18
122,85,19
155,107,21
98,90,26
350,145,15
400,230,16
68,49,29
116,75,24
114,91,20
121,112,19
318,150,15
121,110,24
156,122,20
350,180,11
198,95,20
232,100,19
250,100,15
79,67,31
122,80,26
71,65,32
140,75,25
250,100,16
258,110,16
225,105,18
302,140,16
350,150,13
318,150,14
302,140,14
304,150,14
98,83,29
79,67,26
97,78,26
76,52,31
83,61,32
90,75,28
90,75,24
116,75,26
120,97,24
108,93,26
79,67,31
225,95,19
250,105,18
250,72,15
250,72,15
400,170,16
350,145,15
318,150,16
351,148,14
231,110,17
250,105,16
258,110,15
225,95,18
231,110,21
262,110,20
302,129,13
97,75,29
140,83,23
232,100,20
140,78,23
134,96,24
90,71,25
119,97,24
171,97,18
90,70,29
232,90,19
115,95,23
120,88,23
121,98,22
121,115,25
91,53,33
107,86,28
116,81,25
140,92,25
98,79,26
101,83,27
305,140,17.5
318,150,16
304,120,15.5
351,152,14.5
225,100,22
250,105,22
200,81,24
232,90,22.5
85,52,29
98,60,24.5
90,70,29
91,53,33
225,100,20
250,78,18
250,110,18.5
258,95,17.5
97,71,29.5
85,70,32
97,75,28
140,72,26.5
130,102,20
318,150,13
120,88,19
156,108,19
168,120,16.5
350,180,16.5
350,145,13
302,130,13
318,150,13
98,68,31.5
111,80,30
79,58,36
122,96,25.5
85,70,33.5
305,145,17.5
260,110,17
318,145,15.5
302,130,15
250,110,17.5
231,105,20.5
225,100,19
250,98,18.5
400,180,16
350,170,15.5
400,190,15.5
351,149,16
97,78,29
151,88,24.5
97,75,26
140,89,25.5
98,63,30.5
98,83,33.5
97,67,30
97,78,30.5
146,97,22
121,110,21.5
80,110,21.5
90,48,43.1
98,66,36.1
78,52,32.8
85,70,39.4
91,60,36.1
260,110,19.9
318,140,19.4
302,139,20.2
231,105,19.2
200,95,20.5
200,85,20.2
140,88,25.1
225,100,20.5
232,90,19.4
231,105,20.6
200,85,20.8
225,110,18.6
258,120,18.1
305,145,19.2
231,165,17.7
302,139,18.1
318,140,17.5
98,68,30
134,95,27.5
119,97,27.2
105,75,30.9
134,95,21.1
156,105,23.2
151,85,23.8
119,97,23.9
131,103,20.3
163,125,17
121,115,21.6
163,133,16.2
89,71,31.5
98,68,29.5
231,115,21.5
200,85,19.8
140,88,22.3
232,90,20.2
225,110,20.6
305,130,17
302,129,17.6
351,138,16.5
318,135,18.2
350,155,16.9
351,142,15.5
267,125,19.2
360,150,18.5
89,71,31.9
86,65,34.1
98,80,35.7
121,80,27.4
183,77,25.4
350,125,23
141,71,27.2
260,90,23.9
105,70,34.2
105,70,34.5
85,65,31.8
91,69,37.3
151,90,28.4
173,115,28.8
173,115,26.8
151,90,33.5
98,76,41.5
89,60,38.1
98,70,32.1
86,65,37.2
151,90,28
140,88,26.4
151,90,24.3
225,90,19.1
97,78,34.3
134,90,29.8
120,75,31.3
119,92,37
108,75,32.2
86,65,46.6
156,105,27.9
85,65,40.8
90,48,44.3
90,48,43.4
121,67,36.4
146,67,30
91,67,44.6
97,67,33.8
89,62,29.8
168,132,32.7
70,100,23.7
122,88,35
107,72,32.4
135,84,27.2
151,84,26.6
156,92,25.8
173,110,23.5
135,84,30
79,58,39.1
86,64,39
81,60,35.1
97,67,32.3
85,65,37
89,62,37.7
91,68,34.1
105,63,34.7
98,65,34.4
98,65,29.9
105,74,33
107,75,33.7
108,75,32.4
119,100,32.9
120,74,31.6
141,80,28.1
145,76,30.7
168,116,25.4
146,120,24.2
231,110,22.4
350,105,26.6
200,88,20.2
225,85,17.6
112,88,28
112,88,27
112,88,34
112,85,31
135,84,29
151,90,27
140,92,24
105,74,36
91,68,37
91,68,31
105,63,38
98,70,36
120,88,36
107,75,36
108,70,34
91,67,38
91,67,32
91,67,38
181,110,25
262,85,38
156,92,26
232,112,22
144,96,32
135,84,36
151,90,27
140,86,27
97,52,44
135,84,32
120,79,28
119,82,31
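The elbow search from exercise 2 can be illustrated on this data with a small from-scratch sketch. It is a stand-in, not the scikit-learn workflow the exercise asks for: the hypothetical helpers `split`, `knn_regress` and `mse` play the roles of `train_test_split`, `KNeighborsRegressor` and `mean_squared_error`. To keep the block short it uses only the first 30 data rows of auto.csv, and it assumes unscaled features with Euclidean distance, so the printed errors will differ from those on the full file.

```python
import random

# First 30 data rows of auto.csv as (displacement, horsepower, mpg).
ROWS = [
    (307, 130, 18), (350, 165, 15), (318, 150, 18), (304, 150, 16), (302, 140, 17),
    (429, 198, 15), (454, 220, 14), (440, 215, 14), (455, 225, 14), (390, 190, 15),
    (383, 170, 15), (340, 160, 14), (400, 150, 15), (455, 225, 14), (113, 95, 24),
    (198, 95, 22), (199, 97, 18), (200, 85, 21), (97, 88, 27), (97, 46, 26),
    (110, 87, 25), (107, 90, 24), (104, 95, 25), (121, 113, 26), (199, 90, 21),
    (360, 215, 10), (307, 200, 10), (318, 210, 11), (304, 193, 9), (97, 88, 27),
]


def split(rows, test_fraction=0.3, seed=0):
    """Shuffle with a fixed seed; return (train, test) with test_fraction held out."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def knn_regress(train, x, k):
    """Predict mpg as the mean mpg of the k training rows nearest to x."""
    # Squared Euclidean distance gives the same ordering as the distance itself.
    nearest = sorted(train, key=lambda r: (r[0] - x[0]) ** 2 + (r[1] - x[1]) ** 2)[:k]
    return sum(r[2] for r in nearest) / k


def mse(train, test, k):
    """Mean of the squared differences between predicted and actual mpg."""
    return sum((knn_regress(train, r[:2], k) - r[2]) ** 2 for r in test) / len(test)


train, test = split(ROWS, seed=0)
k_waarden = range(1, 20, 2)  # the same odd values as np.arange(1, 20, 2)
errors = {k: mse(train, test, k) for k in k_waarden}
best_k = min(errors, key=errors.get)

for k, err in errors.items():
    print(f"k={k:2d}  mse={err:.2f}")
print("k with minimal mse:", best_k)
```

Changing `seed` plays the role of `random_state`: a different shuffle puts different rows in the test set, so the error curve and the chosen k can change from one seed to the next.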
|
|
Reference in New Issue
Block a user