college-datascience/project/ProjectDataScienceBeppeVanrolleghem.ipynb
2019-05-29 11:46:47 +02:00


{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Data science project\n",
"\n",
"## Preface\n",
"\n",
"Unfortunately, I could not spend as much time on this assignment as I wanted. I misread the assignment and worked on the wrong problem for most of the time I put into it. That other project is included and can be found in the notebook \"VoorspellenVanSignaalSterkteADVPositie\".\n",
"\n",
"## Reading the data\n",
"\n",
"We start by reading the data as a list of lines. A few `if` statements let us choose which datasets to load.\n"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Time=12/03 06:08:53& Sender=44:6E:E5:C5:8F:4F& Location=gang@0.61875;0.13758& WifiInfo=ODISEE@88-1d-fc-30-d4-40:-74,campusroam@88-1d-fc-30-d4-43:-74,ODISEE@88-1d-fc-30-d5-50:-72,eduroam@88-1d-fc-30-d4-42:-74,eduroam@88-1d-fc-30-d5-52:-72,campusroam@88-1d-fc-30-d5-53:-73,ODISEEGuest@88-1d-fc-30-d4-41:-75,ODISEEGuest@88-1d-fc-30-d5-51:-73,CiscoC5976@58-6d-8f-19-14-38:-82,rechts@58-6d-8f-19-10-fc:-59,ODISEE@88-1d-fc-41-dc-50:-81,eduroam@88-1d-fc-41-dc-52:-81,campusroam@88-1d-fc-41-dc-53:-67,eduroam@88-1d-fc-2c-c0-02:-78,campusroam@88-1d-fc-2c-c0-03:-71,ODISEE@88-1d-fc-2c-c0-00:-77,telenet-5467D@dc-53-7c-85-46-82:-87,ODISEEGuest@88-1d-fc-41-dc-51:-80,ODISEEGuest@88-1d-fc-2c-c0-01:-73,CiscoC5959@58-6d-8f-19-13-f4:-81,TELENETHOMESPOT@02-53-7c-85-46-83:-86\n",
"\n"
]
}
],
"source": [
"import matplotlib.pyplot as plt\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"lines = []\n",
"\n",
"# Toggle each block with True/False to choose which datasets are loaded.\n",
"if True:\n",
"    with open(\"DataScienceData01.txt\", \"r\") as infile:\n",
"        lines = infile.readlines()\n",
"if True:\n",
"    with open(\"DataScienceData02.txt\", \"r\") as infile:\n",
"        lines.extend(infile.readlines())\n",
"\n",
"if False:\n",
"    with open(\"DataScienceData03.txt\", \"r\") as infile:\n",
"        lines.extend(infile.readlines())\n",
"\n",
"# Show the second raw line as a sanity check.\n",
"print(lines[1])\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Each line has to be split several times to arrive at the final dataset.\n",
"This is done by the `dataParse2` function.\n",
"\n",
"It splits the data and parses it into dictionary objects of the following form (JSON):\n",
"```json\n",
"[\n",
"    {\n",
"        \"Sender\": \"\",\n",
"        \"Time\": \"\",\n",
"        \"location\": \"\",\n",
"        \"x\": \"\",\n",
"        \"y\": \"\",\n",
"        \"px\": \"\",\n",
"        \"py\": \"\",\n",
"        \"xmax\": \"\",\n",
"        \"ymax\": \"\",\n",
"        \"WifiInfo\": [\n",
"            {\n",
"                \"ssid\": \"\",\n",
"                \"mac\": \"\",\n",
"                \"routerId\": \"\",\n",
"                \"signal\": \"\"\n",
"            },\n",
"            ...\n",
"        ]\n",
"    },\n",
"    ...\n",
"]\n",
"```\n",
"These are then put into a DataFrame."
]
},
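{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a small illustration of the `WifiInfo` part of this structure, the sketch below parses a comma-separated list of `ssid@mac:signal` entries and sorts them by signal strength. The helper name `parse_wifi_entry` and the shortened sample string are assumptions made for this example; the router id rule (MAC without dashes, minus its last four characters) mirrors the parsing code in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def parse_wifi_entry(entry):\n",
"    # Hypothetical helper: turn 'ssid@aa-bb-cc-dd-ee-ff:-74' into a dict.\n",
"    ssid, rest = entry.strip().split('@', 1)\n",
"    mac, signal = rest.rsplit(':', 1)\n",
"    # Router id: MAC without dashes, minus the last four characters.\n",
"    router_id = mac.replace('-', '')[:-4]\n",
"    return {'ssid': ssid, 'mac': mac, 'routerId': router_id, 'signal': float(signal)}\n",
"\n",
"\n",
"# Shortened sample in the same format as the WifiInfo field.\n",
"sample = 'eduroam@88-1d-fc-30-d4-42:-74,rechts@58-6d-8f-19-10-fc:-59'\n",
"entries = sorted(\n",
"    (parse_wifi_entry(e) for e in sample.split(',')),\n",
"    key=lambda e: e['signal'],\n",
"    reverse=True,\n",
")\n",
"print(entries[0]['ssid'], entries[0]['routerId'], entries[0]['signal'])  # rechts 586d8f19 -59.0"
]
},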
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" Sender Time \\\n",
"0 44:6E:E5:C5:8F:4F 1900-03-12 06:08:41 \n",
"1 44:6E:E5:C5:8F:4F 1900-03-12 06:08:53 \n",
"2 44:6E:E5:C5:8F:4F 1900-03-12 06:09:03 \n",
"3 44:6E:E5:C5:8F:4F 1900-03-12 06:09:17 \n",
"4 44:6E:E5:C5:8F:4F 1900-03-12 06:09:41 \n",
"\n",
" WifiInfo location px \\\n",
"0 [{'ssid': 'rechts', 'mac': '58-6d-8f-19-10-fc'... gang 0.65625 \n",
"1 [{'ssid': 'rechts', 'mac': '58-6d-8f-19-10-fc'... gang 0.61875 \n",
"2 [{'ssid': 'rechts', 'mac': '58-6d-8f-19-10-fc'... gang 0.26250 \n",
"3 [{'ssid': 'rechts', 'mac': '58-6d-8f-19-10-fc'... gang 0.63333 \n",
"4 [{'ssid': 'rechts', 'mac': '58-6d-8f-19-10-fc'... gang 0.63958 \n",
"\n",
" py x xmax y ymax \n",
"0 0.04449 186.37500 284 49.51737 1113 \n",
"1 0.13758 175.72500 284 153.12654 1113 \n",
"2 0.13826 74.55000 284 153.88338 1113 \n",
"3 0.31006 179.86572 284 345.09678 1113 \n",
"4 0.49555 181.64072 284 551.54715 1113 \n"
]
}
],
"source": [
"from datetime import datetime\n",
"\n",
"# Every router id seen so far; a router's position in this list is its numeric id.\n",
"wifiSignals = []\n",
"\n",
"\n",
"def dataParse2(l):\n",
"    # Parse one raw log line into a dictionary.\n",
"    objs = l.split(\"& \")\n",
"    dic = {}\n",
"    for obj in objs:\n",
"        # Split on the first '=' only, so a value that contains '=' stays intact.\n",
"        items = obj.split(\"=\", 1)\n",
"        title = items[0]\n",
"        data = items[1].split(\",\")\n",
"        if len(data) == 1:\n",
"            data = data[0]\n",
"        if title == \"Time\":\n",
"            # The log has no year, so strptime falls back to 1900.\n",
"            dic[title] = datetime.strptime(data, \"%d/%m %H:%M:%S\")\n",
"            continue\n",
"        if title == \"Location\":\n",
"            # Format: <room>@<x fraction>;<y fraction>\n",
"            temp = data.split(\"@\")\n",
"            naam = temp[0].lower()\n",
"            x, y = temp[1].split(\";\")\n",
"            dic[\"location\"] = naam\n",
"            # The room image gives the pixel size used to scale the fractions.\n",
"            img = plt.imread(naam + '.png')\n",
"            height, width, channels = img.shape\n",
"            dic[\"x\"] = float(x) * width\n",
"            dic[\"y\"] = float(y) * height\n",
"            dic[\"px\"] = float(x)\n",
"            dic[\"py\"] = float(y)\n",
"            dic[\"xmax\"] = width\n",
"            dic[\"ymax\"] = height\n",
"            continue\n",
"        if title == \"WifiInfo\":\n",
"            # A single entry arrives as a plain string; wrap it so the loop sees entries.\n",
"            entries = data if isinstance(data, list) else [data]\n",
"            appendable = []\n",
"            for f in entries:\n",
"                append = {}\n",
"                # Format: <ssid>@<mac>:<signal>\n",
"                temp = f.replace(\"\\n\", '').split('@')\n",
"                append[\"ssid\"] = temp[0]\n",
"                temp = temp[1].split(\":\")\n",
"                append[\"mac\"] = temp[0]\n",
"                # Router id: MAC without dashes, minus the last four characters.\n",
"                append[\"routerId\"] = \"\".join(temp[0].split('-'))[:-4]\n",
"                if append[\"routerId\"] not in wifiSignals:\n",
"                    wifiSignals.append(append[\"routerId\"])\n",
"                append[\"signal\"] = float(temp[1])\n",
"                appendable.append(append)\n",
"            # Strongest signal first.\n",
"            dic[title] = sorted(appendable, key=lambda k: k[\"signal\"], reverse=True)\n",
"            continue\n",
"        dic[title] = data\n",
"    return dic\n",
"\n",
"\n",
"data = []\n",
"for l in lines:\n",
"    data.append(dataParse2(l))\n",
"\n",
"d = pd.DataFrame(data)\n",
"print(d.head())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Selecting the data\n",
"\n",
"Once the data is loaded, it helps to visualize it to get a feel for what we are working with. The measurement points are drawn as a scatter plot on top of the room images. \n"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"lokalen = d.location.unique().tolist()\n",
"\n",
"# One figure per room: measurement points drawn over the room image.\n",
"for i in lokalen:\n",
"    temp = d.loc[d[\"location\"] == i]\n",
"    plt.scatter(temp.x, temp.y)\n",
"    img = plt.imread(i + '.png')\n",
"    plt.imshow(img)\n",
"    plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As the plots show, some measurement points lie very close together. These are filtered: when nearby points also carry similar WiFi info, the redundant ones are removed."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"\n",
"def removeIrelevant(df, minSampleSize=50):\n",
"    # Keep a row only when CloseToOthers does not reject it.\n",
"    # minSampleSize is currently unused.\n",
"    rdf = []\n",
"    for i, v in df.iterrows():\n",
"        rdf = CloseToOthers(v, rdf)\n",
"    return pd.DataFrame(rdf)\n",
"\n",
"\n",
"def CloseToOthers(i, df, SpacePerc=.1):\n",
"    # df is the list of rows kept so far; i is the candidate row.\n",
"    tdf = pd.DataFrame(df)\n",
"    if \"location\" in tdf:\n",
"        l = tdf.loc[tdf[\"location\"] == i[\"location\"]]\n",
"        for index, dataframe in l.iterrows():\n",
"            temp = abs(dataframe.px - i.px)\n",
"            temp2 = abs(dataframe.py - i.py)\n",
"            # Drop the candidate when a kept point in the same room lies within\n",
"            # SpacePerc on both axes and lists more access points.\n",
"            if temp <= SpacePerc and temp2 <= SpacePerc and len(dataframe[\"WifiInfo\"]) > len(i[\"WifiInfo\"]):\n",
"                return df\n",
"    df.append(i)\n",
"    return df\n"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"oud: 217 \n",
"Nieuw: 203\n"
]
}
],
"source": [
"g = removeIrelevant(d)\n",
"print(\"oud: {} \\nNieuw: {}\".format(len(d), len(g)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Training data\n",
"\n",
"With the data filtered, we can prepare the training data. A helper function for evaluating the models is defined as well.\n",
"\n",
"The x values are the two strongest reachable routers for each measurement. The y values are the coordinates of the measurement point plus a number that depends on the room. That number is the room index multiplied by 10, so that it cannot interfere with the fractional coordinates (floats between 0 and 1). In testing, this worked better than trying to predict three separate y values. To decode a prediction, take the tens (the first one or two digits) of each returned value as the room; what remains is a fraction (ideally between 0 and 1) that must be multiplied by the width or height of the room.\n",
"\n",
"After that, several scalers were tried to get the best result; the `Normalizer` gave the best test results."
]
},
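{
"cell_type": "markdown",
"metadata": {},
"source": [
"The target encoding described above can be sketched as a round trip. This is a minimal illustration, not part of the original pipeline: the names `encode_target`, `decode_target` and `ROOM_STRIDE` are hypothetical, and reading the room from the x value alone is an assumption, since a real prediction may place x and y in different rooms."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import math\n",
"\n",
"ROOM_STRIDE = 10  # same multiplier as in prepTraining\n",
"\n",
"\n",
"def encode_target(room_index, px, py, stride=ROOM_STRIDE):\n",
"    # Fold the room index into both fractional coordinates.\n",
"    return (px + room_index * stride, py + room_index * stride)\n",
"\n",
"\n",
"def decode_target(value_x, value_y, stride=ROOM_STRIDE):\n",
"    # Assumption: the room is read from the x value only.\n",
"    room_index = int(math.floor(value_x / stride))\n",
"    px = value_x - room_index * stride\n",
"    py = value_y - room_index * stride\n",
"    return room_index, px, py\n",
"\n",
"\n",
"encoded = encode_target(room_index=3, px=0.25, py=0.5)\n",
"print(encoded)                  # (30.25, 30.5)\n",
"print(decode_target(*encoded))  # (3, 0.25, 0.5)"
]
},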
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.model_selection import train_test_split\n",
"from sklearn.preprocessing import StandardScaler\n",
"from sklearn.preprocessing import MinMaxScaler\n",
"from sklearn.preprocessing import MaxAbsScaler\n",
"from sklearn.preprocessing import Normalizer\n",
"from sklearn.model_selection import cross_val_score\n",
"from sklearn.model_selection import KFold\n",
"\n",
"\n",
"def prepTrainingOLD(df, l=7):\n",
"    # Earlier variant: three targets (scaled room index, px, py) and always two features.\n",
"    x = []\n",
"    y = []\n",
"    # Scalers that were tried:\n",
"    #scaler = MinMaxScaler(feature_range=(0,1))\n",
"    #scaler = StandardScaler()\n",
"    #scaler = MaxAbsScaler()\n",
"    scaler = Normalizer()\n",
"    for _, dataframe in df.iterrows():\n",
"        tx = []\n",
"        for ap in sorted(dataframe[\"WifiInfo\"], key=lambda k: k[\"signal\"], reverse=True):\n",
"            # NOTE: tx holds integer indices, so this string test is always true\n",
"            # and repeated routers are not skipped.\n",
"            if ap[\"routerId\"] not in tx:\n",
"                tx.append(wifiSignals.index(ap[\"routerId\"]))\n",
"            if len(tx) >= 2:\n",
"                break\n",
"        x.append(tx)\n",
"        ty = (lokalen.index(dataframe[\"location\"])/len(lokalen), dataframe[\"px\"], dataframe[\"py\"])\n",
"        y.append(ty)\n",
"    fx = pd.DataFrame(x).fillna(0)\n",
"    fy = pd.DataFrame(y)\n",
"    xtrain, xtest, ytrain, ytest = train_test_split(fx, fy)\n",
"    scaler.fit(xtrain)\n",
"    xtrain = scaler.transform(xtrain)\n",
"    xtest = scaler.transform(xtest)\n",
"    return xtrain, xtest, ytrain, ytest\n",
"\n",
"\n",
"def prepTraining(df, scaler=None, l=2):\n",
"    # Features: list indices of the l strongest routers.\n",
"    # Targets: (px, py), each shifted by 10 * room index.\n",
"    x = []\n",
"    y = []\n",
"    if scaler is None:\n",
"        scaler = Normalizer()\n",
"    for _, dataframe in df.iterrows():\n",
"        tx = []\n",
"        for ap in sorted(dataframe[\"WifiInfo\"], key=lambda k: k[\"signal\"], reverse=True):\n",
"            # NOTE: same always-true membership test as in prepTrainingOLD,\n",
"            # left unchanged so the recorded scores stay reproducible.\n",
"            if ap[\"routerId\"] not in tx:\n",
"                tx.append(wifiSignals.index(ap[\"routerId\"]))\n",
"            if len(tx) >= l:\n",
"                break\n",
"        x.append(tx)\n",
"        room = lokalen.index(dataframe[\"location\"])\n",
"        ty = (dataframe[\"px\"] + room*10, dataframe[\"py\"] + room*10)\n",
"        y.append(ty)\n",
"    fx = pd.DataFrame(x).fillna(0)\n",
"    fy = pd.DataFrame(y)\n",
"    xtrain, xtest, ytrain, ytest = train_test_split(fx, fy, random_state=3)\n",
"    scaler.fit(xtrain)\n",
"    xtrain = scaler.transform(xtrain)\n",
"    xtest = scaler.transform(xtest)\n",
"    return xtrain, xtest, ytrain, ytest\n",
"\n",
"\n",
"def score(mod, cv=3):\n",
"    # Reads the module-level xtest / ytest. Cross-validation refits the model on folds of the test split.\n",
"    kfold = KFold(n_splits=3, shuffle=True, random_state=2)\n",
"    cvScores = cross_val_score(mod, xtest, ytest, cv=cv)\n",
"    kfoldScores = cross_val_score(mod, xtest, ytest, cv=kfold)\n",
"    print(\"Model score {}\\nCrosValScore {}\\nMean {}\\n\\n\".format(mod.score(xtest, ytest), cvScores, cvScores.mean()))\n",
"    print(\"Kfold:\\nScore: {}\\nMean: {}\".format(kfoldScores, kfoldScores.mean()))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Models\n",
"\n",
"![alt text](https://scikit-learn.org/stable/_images/sphx_glr_plot_classifier_comparison_001.png \"Forms of plotting\")\n",
"\n",
"\n",
"### Linear regression\n",
"\n",
"The simplest model, mainly because it has few parameters to tune compared with the other models used here. It therefore reflects the quality of the training set quite directly, which is why it serves as the baseline for this assignment."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model score 0.053549619325076785\n",
"CrosValScore [0.08107485 0.05643525 0.26221046]\n",
"Mean 0.1332401864126653\n",
"\n",
"\n",
"Kfold:\n",
"Score: [0.25199567 0.00721587 0.08458426]\n",
"Mean: 0.11459860042677367\n"
]
}
],
"source": [
"from sklearn.linear_model import LinearRegression\n",
"\n",
"\n",
"# Module-level split; score() reads xtest and ytest from here.\n",
"xtrain, xtest, ytrain, ytest = prepTraining(d)\n",
"\n",
"\n",
"def LinReg():\n",
"    # Same split as above, because prepTraining uses a fixed random_state.\n",
"    xtrain, xtest, ytrain, ytest = prepTraining(d)\n",
"    lr = LinearRegression().fit(xtrain, ytrain)\n",
"    score(lr)\n",
"    return lr\n",
"\n",
"model = LinReg()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Gaussian process\n",
"\n",
"This model was explored in more depth in the other notebook mentioned in the preface. Being able to choose which kernels the model uses seemed very interesting, and I wanted to see the effect on the results.\n",
"\n",
"\n",
"#### White kernel\n",
"\n",
"On its own this kernel is of little use. It is a white-noise kernel: it models independent noise on each observation, so by itself it cannot generalize from one point to another. It becomes useful in combination with other kernels, where it accounts for the noise in the signal.\n",
"\n",
"\n",
"#### DotProduct and RBF kernels\n",
"\n",
"In these experiments the DotProduct kernel gave more rigid, decision-tree-like results, while the RBF kernel gave smoother, more varied ones. This can also be seen in the notebook where they were originally implemented, but it seemed interesting to implement them in this assignment as well.\n",
"\n",
"Unfortunately, I did not manage to suppress the white noise that this kernel generates. Good results are achievable with the RBF and DotProduct kernels on their own, but as soon as the white noise is added it dominates. This shows in the test scores, which are exactly the same as those of the WhiteKernel on its own.\n",
"\n",
"![alt text](https://scikit-learn.org/stable/_images/sphx_glr_plot_gpc_xor_001.png \"Kernels\")\n",
"\n",
"\n"
]
},
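{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the idea of combining kernels concrete, the sketch below sums an RBF kernel and a WhiteKernel in scikit-learn's `GaussianProcessRegressor`. It is only an illustration on synthetic data (a noisy sine curve), not the WiFi dataset, and the starting values for `length_scale` and `noise_level` are assumptions. After fitting, `kernel_` shows the hyperparameters the optimizer settled on, which is one way to check how much of the signal was attributed to noise."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.gaussian_process import GaussianProcessRegressor\n",
"from sklearn.gaussian_process.kernels import RBF, WhiteKernel\n",
"\n",
"# Synthetic data (assumption): a sine curve with Gaussian noise.\n",
"rng = np.random.RandomState(0)\n",
"X = np.linspace(0, 5, 40).reshape(-1, 1)\n",
"y = np.sin(X).ravel() + rng.normal(scale=0.1, size=X.shape[0])\n",
"\n",
"# The RBF term models the smooth signal, the WhiteKernel term the independent noise.\n",
"kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)\n",
"gpr = GaussianProcessRegressor(kernel=kernel, random_state=0).fit(X, y)\n",
"\n",
"print(gpr.kernel_)       # fitted kernel with optimized hyperparameters\n",
"print(gpr.score(X, y))   # R^2 on the training data"
]
},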
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"---\n",
"GeneratingOptimalAlphaRBF\n",
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-2.33373797e-05]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 47, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-3.67252169e-05]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 47, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Optimal alpha for rbf kernel 1.275\n",
"\n",
"\n",
"\n",
"---\n",
"GeneratingOptimalAlphaDotProduct\n",
"\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([84.]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 78, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-10.]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 45, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([348.]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 48, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([20.5]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 63, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([11.125]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 47, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([15.]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 42, 'nit': 1, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-38.25]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 46, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.25]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 47, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([9.375]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 47, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([5.35546875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 71, 'nit': 5, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([63.90625]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 48, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-8.34375]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 56, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([20.6875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 68, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([65.1875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 50, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-3.3125]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 47, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.75]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 42, 'nit': 1, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-24.]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 51, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([2.609375]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 75, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.453125]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 45, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.296875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 87, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-0.8125]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 51, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([3.]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 59, 'nit': 6, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([2.609375]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 49, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-30.1640625]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 59, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([18.078125]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 46, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.71875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 46, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-23.609375]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 88, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.1171875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 45, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.671875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 42, 'nit': 1, 'warnflag': 2}\n",
" ConvergenceWarning)\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([1.41796875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 50, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-14.09765625]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 58, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([26.01953125]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 54, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.10546875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 47, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-26.19140625]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 49, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-16.40234375]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 67, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-0.6796875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 45, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-6.94140625]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 117, 'nit': 6, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-11.21679688]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 87, 'nit': 6, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-2.37890625]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 50, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-0.22265625]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 47, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([2.03515625]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 99, 'nit': 5, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-0.69921875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 50, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.7578125]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 73, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-2.99804688]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 50, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-4.85742188]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 62, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.59667969]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 44, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.58984375]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 28, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-4.69726562]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 43, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-1.68945312]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 82, 'nit': 6, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.3359375]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 49, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-2.52978516]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 69, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.96875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 48, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.04345703]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 43, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-0.72167969]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 74, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-1.03710938]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 62, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.6171875]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 47, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.61621094]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 79, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.97216797]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 49, 'nit': 4, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.68652344]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 67, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.66503906]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 64, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.76708984]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 62, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.73803711]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 45, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.48022461]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 54, 'nit': 3, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([0.17260742]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 43, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Optimal alpha for dotproduct kernel 0.04\n",
"\n",
"\n",
"\n",
"---\n",
"Generating white kernel noise level for rbf\n",
"\n",
"Optimal noise level for rbf 1\n",
"\n",
"\n",
"\n",
"---\n",
"Generating white kernel noise for DotProd\n",
"\n",
"Optimal noise level for dot 1\n",
"\n",
"\n",
"\n",
"---\n",
"White Kernel\n",
"Model score -1.3040243750461273\n",
"CrosValScore [-1.80544992 -1.09822479 -1.12201902]\n",
"Mean -1.3418979083128233\n",
"\n",
"\n",
"Kfold:\n",
"Score: [-1.59551739 -1.24587679 -1.11763551]\n",
"Mean: -1.3196765647323825\n",
"\n",
"\n",
"\n",
"\n",
"---\n",
"DotProduct\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-10.]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 45, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-10.]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 45, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n",
"/home/beppe/.local/lib/python3.7/site-packages/sklearn/gaussian_process/gpr.py:480: ConvergenceWarning: fmin_l_bfgs_b terminated abnormally with the state: {'grad': array([-10.]), 'task': b'ABNORMAL_TERMINATION_IN_LNSRCH', 'funcalls': 45, 'nit': 2, 'warnflag': 2}\n",
" ConvergenceWarning)\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model score 0.05304424748634964\n",
"CrosValScore [0.08041413 0.05850993 0.26145889]\n",
"Mean 0.1334609854955265\n",
"\n",
"\n",
"Kfold:\n",
"Score: [0.25166842 0.00850081 0.08625134]\n",
"Mean: 0.11547352452721255\n",
"\n",
"\n",
"\n",
"\n",
"---\n",
"Rbf\n",
"Model score 0.0485491868165403\n",
"CrosValScore [0.06244162 0.10407524 0.32431861]\n",
"Mean 0.16361182171401337\n",
"\n",
"\n",
"Kfold:\n",
"Score: [ 0.26362672 -0.25184796 0.09040504]\n",
"Mean: 0.034061269824504775\n",
"\n",
"\n",
"\n",
"\n",
"---\n",
"DotWhite\n",
"Model score -0.013242764071176399\n",
"CrosValScore [-0.10300226 -0.03773465 -0.00324282]\n",
"Mean -0.04799324044247397\n",
"\n",
"\n",
"Kfold:\n",
"Score: [-0.04027678 -0.00891338 -0.00084781]\n",
"Mean: -0.01667932332465106\n",
"\n",
"\n",
"\n",
"\n",
"---\n",
"RbfWhite\n",
"Model score -1.2725604583700336\n",
"CrosValScore [-1.79311122 -1.08885394 -1.1123998 ]\n",
"Mean -1.3314549845918533\n",
"\n",
"\n",
"Kfold:\n",
"Score: [-1.58380539 -1.23608991 -1.10878677]\n",
"Mean: -1.3095606923592664\n"
]
}
],
"source": [
"from sklearn.gaussian_process import GaussianProcessRegressor\n",
"from sklearn.gaussian_process.kernels import RBF, DotProduct, WhiteKernel\n",
"\n",
"def GaussProc(alpha=.08, kernel=RBF()):\n",
"    # Fit a GP regressor on the training split. The returned score is the mean\n",
"    # 3-fold cross-validation score, for which cross_val_score refits clones of\n",
"    # the model on folds of the held-out split (xtest, ytest).\n",
"    xtrain, xtest, ytrain, ytest = prepTraining(d)\n",
"    gp = GaussianProcessRegressor(kernel=kernel,alpha=alpha).fit(xtrain, ytrain)\n",
"    #print(\"Model score:{}\".format(gp.score(xtest, ytest)))\n",
"    return gp, cross_val_score(gp, xtest, ytest, cv = 3).mean()\n",
"\n",
"\n",
"# Grid search over alpha for the RBF kernel. lastScore starts at 0, so a candidate\n",
"# is only accepted if its mean CV score is positive; otherwise optimal stays at 1.\n",
"lastScore = 0\n",
"optimal = 1\n",
"kern = RBF()\n",
"pr = False  # set to True to print each new best score\n",
"print(\"---\\nGeneratingOptimalAlphaRBF\\n\")\n",
"\n",
"for i in np.arange(0.005,5, .005):\n",
"    model, sc = GaussProc(alpha=i, kernel=kern)\n",
"    if sc > lastScore:\n",
"        lastScore = sc\n",
"        optimal = i\n",
"        if pr:\n",
"            print(\"Last Score: {}\\nOptimal num: {}\".format(lastScore, optimal))\n",
"\n",
"rbfAlpha = optimal\n",
"print(\"Optimal alpha for rbf kernel {}\\n\\n\\n\".format(rbfAlpha))\n",
"\n",
"\n",
"\n",
"# Same grid search over alpha, now for the DotProduct kernel.\n",
"print(\"---\\nGeneratingOptimalAlphaDotProduct\\n\")\n",
"\n",
"lastScore = 0\n",
"optimal = 1\n",
"kern = DotProduct()\n",
"\n",
"for i in np.arange(0.005,5, .005):\n",
"    model, sc = GaussProc(alpha=i, kernel=kern)\n",
"    if sc > lastScore:\n",
"        lastScore = sc\n",
"        optimal = i\n",
"        if pr:\n",
"            print(\"Last Score: {}\\nOptimal num: {}\".format(lastScore, optimal))\n",
"\n",
"dotAlpha = optimal\n",
"print(\"Optimal alpha for dotproduct kernel {} with score {}\\n\\n\\n\".format(dotAlpha, lastScore))\n",
"\n",
"\n",
"\n",
"# Grid search over the WhiteKernel noise level added to RBF, using the alpha found above.\n",
"print(\"---\\nGenerating white kernel noise level for rbf\\n\")\n",
"kern = RBF()\n",
"\n",
"lastScore = 0\n",
"optimal = 1\n",
"\n",
"for i in np.arange(0.005,5, .005):\n",
"    model, sc = GaussProc(alpha=rbfAlpha, kernel=kern+WhiteKernel(noise_level=i))\n",
"    if sc > lastScore:\n",
"        lastScore = sc\n",
"        optimal = i\n",
"        if pr:\n",
"            print(\"Last Score: {}\\nOptimal num: {}\".format(lastScore, optimal))\n",
"\n",
"rbfWhite = optimal\n",
"\n",
"print(\"Optimal noise level for rbf {} with score {}\\n\\n\\n\".format(rbfWhite, lastScore))\n",
"\n",
"# Grid search over the WhiteKernel noise level added to DotProduct.\n",
"print(\"---\\nGenerating white kernel noise for DotProd\\n\")\n",
"\n",
"kern = DotProduct()\n",
"\n",
"lastScore = 0\n",
"optimal = 1\n",
"\n",
"for i in np.arange(0.005,5, .005):\n",
"    model, sc = GaussProc(alpha=dotAlpha, kernel=kern+WhiteKernel(noise_level=i))\n",
"    if sc > lastScore:\n",
"        lastScore = sc\n",
"        optimal = i\n",
"        if pr:\n",
"            print(\"Last Score: {}\\nOptimal num: {}\".format(lastScore, optimal))\n",
"\n",
"dotWhite = optimal\n",
"print(\"Optimal noise level for dot {} with score {}\\n\\n\\n\".format(dotWhite, lastScore))\n",
"\n",
"\n",
"\n",
"# Final comparison: score each kernel with the parameters selected above.\n",
"print(\"---\\nWhite Kernel\")\n",
"model, _ = GaussProc(alpha=1, kernel=WhiteKernel())\n",
"score(model)\n",
"print(\"\\n\\n\\n\")\n",
"print(\"---\\nDotProduct\")\n",
"model, _ = GaussProc(alpha=dotAlpha, kernel=DotProduct())\n",
"score(model)\n",
"print(\"\\n\\n\\n\")\n",
"print(\"---\\nRbf\")\n",
"model, _ = GaussProc(alpha=rbfAlpha, kernel=RBF())\n",
"score(model)\n",
"print(\"\\n\\n\\n\")\n",
"print(\"---\\nDotWhite\")\n",
"model, _ = GaussProc(alpha=dotAlpha, kernel=DotProduct() + WhiteKernel(noise_level = dotWhite))\n",
"score(model)\n",
"print(\"\\n\\n\\n\")\n",
"print(\"---\\nRbfWhite\")\n",
"model, _ = GaussProc(alpha=rbfAlpha, kernel=RBF() + WhiteKernel(noise_level = rbfWhite))\n",
"score(model)\n",
"#a, b = GaussProc(alpha=0.04908, kernel=DotProduct() * WhiteKernel())\n",
"#print(b)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Random Forest\n",
"\n",
"This model was chosen because a decision tree would be a better fit for predicting which room a given value falls in, had this been a classification problem (unfortunately it is not). By mapping the rooms to larger values (tens instead of values between 0 and 1), the model can still be given simple decisions (e.g. between 0 and 10 is one room, between 10 and 20 is the hallway, etc.). Unfortunately, this is not reflected in the results.\n",
"\n",
"The project did not yet include any form of decision tree, and Random Forest gave the best results for this assignment. The for loops below search for the best values of n_estimators and max_depth: first max_depth with n_estimators fixed at 25, then n_estimators at the best depth found. "
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"-0.03583447235599587 and 0.15000000000000002 for dep\n",
"0.08730520188582598 and 1.0 for dep\n",
"0.11835171292421753 and 1.05 for dep\n",
"0.12800809141118277 and 1.1 for dep\n",
"0.1309682618539385 and 1.2000000000000002 for dep\n",
"0.1325868840100242 and 1.8 for dep\n",
"0.18389298719551705 and 2.0500000000000003 for dep\n",
"0.2033019128173553 and 2.6500000000000004 for dep\n",
"0.024595002239385173 and 1 for est\n",
"0.1414475584387812 and 2 for est\n",
"0.19844434825486879 and 6 for est\n",
"0.2045974213542903 and 23 for est\n",
"\n",
"\n",
"\n",
"Optimal Estimations 23 and optimal depth 2.6500000000000004\n",
"\n",
"\n",
"\n",
"Model score 0.05451023322649307\n",
"CrosValScore [0.11198684 0.02559293 0.35556681]\n",
"Mean 0.16299764209737205\n",
"\n",
"\n",
"Kfold:\n",
"Score: [ 0.35889054 -0.01211664 0.08784591]\n",
"Mean: 0.11315903629642887\n"
]
}
],
"source": [
"from sklearn.ensemble import RandomForestRegressor\n",
"\n",
"\n",
"def rfor(est=5, dep=50):\n",
"    # Fit a random forest on the training split returned by prepTraining (called with a\n",
"    # MinMaxScaler); return it with the mean 3-fold CV score computed on the held-out split.\n",
"    xtrain, xtest, ytrain, ytest = prepTraining(d, scaler=MinMaxScaler())\n",
"    lr = RandomForestRegressor(n_estimators=est, max_depth=dep)\n",
"    lr.fit(xtrain, ytrain)\n",
"    return lr, cross_val_score(lr, xtest, ytest, cv = 3).mean()\n",
"\n",
"# Calculating optimal depth, with n_estimators fixed at 25.\n",
"# NOTE: max_depth is documented as an integer. The fractional values tried here were\n",
"# accepted by the scikit-learn version used for this run; newer releases may reject them.\n",
"lastScore = 0\n",
"optimal = 1\n",
"for i in np.arange(0,20,.05):\n",
"    if i == 0:\n",
"        # Use the first candidate (depth 0.05) as the baseline score.\n",
"        model, lastScore = rfor(dep=0.05, est=25)\n",
"        continue\n",
"    model, sc = rfor(dep=i, est=25)\n",
"    if sc > lastScore:\n",
"        lastScore = sc\n",
"        optimal = i\n",
"        print(\"{} and {} for dep\".format(lastScore, optimal))\n",
"\n",
"# Calculating the optimal number of estimators at the best depth found above.\n",
"de = optimal\n",
"lastScore = 0\n",
"optimal = 1\n",
"\n",
"for i in range(1,140):\n",
"    model, sc = rfor(est=i, dep=de)\n",
"    if sc > lastScore:\n",
"        lastScore = sc\n",
"        optimal = i\n",
"        print(\"{} and {} for est\".format(lastScore, optimal))\n",
"\n",
"print(\"\\n\\n\\nOptimal Estimations {} and optimal depth {}\\n\\n\\n\".format(optimal, de))\n",
"model, sc = rfor(est=optimal, dep=de)\n",
"score(model)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Conclusion\n",
"\n",
"From these findings we can conclude that most models achieve fairly similar scores on this training data. Although random forest should in theory be the best of the models above, its score is only a small improvement over the linear regression we used as a baseline. \n",
"\n",
"## Post-mortem / what could have been better\n",
"\n",
"More time should definitely have gone into selecting the training data and features. These choices strongly influenced the rest of the project and led to low, similar scores. More time to experiment with the Gaussian process method might also have produced better test scores."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}