Deploy site
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> An Introduction to Asyncio | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>An Introduction to Asyncio</h1><div class=time><p>2018-06-13<p>last updated 2020-10-03</div><h2 id=index>Index</h2><ul><li><a href=https://lonami.dev/blog/asyncio/#background>Background</a><li><a href=https://lonami.dev/blog/asyncio/#input_output>Input / Output</a><li><a href=https://lonami.dev/blog/asyncio/#diving_in>Diving In</a><li><a href=https://lonami.dev/blog/asyncio/#a_toy_example>A Toy Example</a><li><a href=https://lonami.dev/blog/asyncio/#a_real_example>A Real Example</a><li><a href=https://lonami.dev/blog/asyncio/#extra_material>Extra Material</a></ul><h2 id=background>Background</h2><p>After seeing some friends struggle with <code>asyncio</code> I decided that it could be a good idea to write a blog post using my own words to explain how I understand the world of asynchronous IO. I will focus on Python's <code>asyncio</code> module but this post should apply to any other language easily.<p>So what is <code>asyncio</code> and what makes it good? Why don't we just use the old and known threads to run several parts of the code concurrently, at the same time?<p>The first reason is that <code>asyncio</code> makes your code easier to reason about, as opposed to using threads, because the amount of ways in which your code can run grows exponentially. Let's see that with an example. 
Imagine you have this code:<pre><code class=language-python data-lang=python>def method(): +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> An Introduction to Asyncio | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>An Introduction to Asyncio</h1><div class=time><p>2018-06-13<p>last updated 2020-10-03</div><h2 id=index>Index</h2><ul><li><a href=https://lonami.dev/blog/asyncio/#background>Background</a><li><a href=https://lonami.dev/blog/asyncio/#input_output>Input / Output</a><li><a href=https://lonami.dev/blog/asyncio/#diving_in>Diving In</a><li><a href=https://lonami.dev/blog/asyncio/#a_toy_example>A Toy Example</a><li><a href=https://lonami.dev/blog/asyncio/#a_real_example>A Real Example</a><li><a href=https://lonami.dev/blog/asyncio/#extra_material>Extra Material</a></ul><h2 id=background>Background</h2><p>After seeing some friends struggle with <code>asyncio</code> I decided that it could be a good idea to write a blog post using my own words to explain how I understand the world of asynchronous IO. I will focus on Python's <code>asyncio</code> module but this post should apply to any other language easily.<p>So what is <code>asyncio</code> and what makes it good? 
Why don't we just use the old and known threads to run several parts of the code concurrently, at the same time?<p>The first reason is that <code>asyncio</code> makes your code easier to reason about, as opposed to using threads, because the amount of ways in which your code can run grows exponentially. Let's see that with an example. Imagine you have this code:<pre><code class=language-python data-lang=python>def method(): line 1 line 2 line 3
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Breaking Risk of Rain | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Breaking Risk of Rain</h1><div class=time><p>2019-01-12</div><p><a href=https://riskofraingame.com/>Risk of Rain</a> is a fun little game you can spend a lot of hours on. It's incredibly challenging for new players, and fun once you have learnt the basics. This blog will go through what I've learnt and how to play the game correctly.<h2 id=getting-started>Getting Started</h2><p>If you're new to the game, you may find it frustrating. You must learn very well to dodge.<p>Your first <a href=http://riskofrain.wikia.com/wiki/Category:Characters>character</a> will be <a href=http://riskofrain.wikia.com/wiki/Commando>Commando</a>. He's actually a very nice character. Use your third skill (dodge) to move faster, pass through large groups of enemies, and negate fall damage.<p>If there are a lot of monsters, remember to <strong>leave</strong> from there! It's really important for survival. Most enemies <strong>don't do body damage</strong>. Not even the body of the <a href=http://riskofrain.wikia.com/wiki/Magma_Worm>Magma Worm</a> or the <a href=http://riskofrain.wikia.com/wiki/Wandering_Vagrant>Wandering Vagrant</a> (just dodge the head and projectiles respectively).<p>The first thing you must do is always <strong>rush for the teleporter</strong>. Completing the levels quick will make the game easier. But make sure to take note of <strong>where the chests are</strong>! When you have time (even when the countdown finishes), go back for them and buy as many as you can. 
Generally, prefer <a href=http://riskofrain.wikia.com/wiki/Chest>chests</a> over <a href=http://riskofrain.wikia.com/wiki/Shrine>shrines</a> since they may eat all your money.<p>Completing the game on <a href=http://riskofrain.wikia.com/wiki/Difficulty>Drizzle</a> is really easy if you follow these tips.<h2 id=requisites>Requisites</h2><p>Before breaking the game, you must obtain several <a href=http://riskofrain.wikia.com/wiki/Item#Artifacts>artifacts</a>. We are interested in particular in the following:<ul><li><a href=http://riskofrain.wikia.com/wiki/Sacrifice>Sacrifice</a>. You really need this one, and may be a bit hard to get. With it, you will be able to farm the first level for 30 minutes and kill the final boss in 30 seconds.<li><a href=http://riskofrain.wikia.com/wiki/Command>Command</a>. You need this unless you want to grind for hours to get enough of the items you really need for the rest of the game. Getting this one is easy.<li><a href=http://riskofrain.wikia.com/wiki/Glass>Glass</a>. Your life will be very small (at the beginning…), but you will be able to one-shot everything easily.<li><a href=http://riskofrain.wikia.com/wiki/Kin>Kin</a> (optional). It makes it easier to obtain a lot of boxes if you restart the first level until you get <a href=http://riskofrain.wikia.com/wiki/Lemurian>lemurians</a> or <a href=http://riskofrain.wikia.com/wiki/Jellyfish>jellyfish</a> as the monster, since they're cheap to spawn.</ul><p>With those, the game becomes trivial. Playing as <a href=http://riskofrain.wikia.com/wiki/Huntress>Huntress</a> is excellent since she can move at high speed while killing everything on screen.<h2 id=breaking-the-game>Breaking the Game</h2><p>The rest is easy! With the command artifact you want the following items.<h3 id=common-items><a href=http://riskofrain.wikia.com/wiki/Category:Common_Items>Common Items</a></h3><ul><li><a href=http://riskofrain.wikia.com/wiki/Soldier's_Syringe>Soldier's Syringe</a>. 
<strong>Stack 13</strong> of these and you will triple your attack speed. You can get started with 4 or so.<li><a href=http://riskofrain.wikia.com/wiki/Paul's_Goat_Hoof>Paul's Goat Hoof</a>. <strong>Stack +30</strong> of these and your movement speed will be insane. You can get a very good speed with 8 or so.<li><a href=http://riskofrain.wikia.com/wiki/Crowbar>Crowbar</a>. <strong>Stack +20</strong> to guarantee you can one-shot bosses.</ul><p>If you want to be safer:<ul><li><a href=http://riskofrain.wikia.com/wiki/Hermit's_Scarf>Hermit's Scarf</a>. <strong>Stack 6</strong> of these to dodge 1/3 of the attacks.<li><a href=http://riskofrain.wikia.com/wiki/Monster_Tooth>Monster Tooth</a>. <strong>Stack 9</strong> of these to recover 50 life on kill. This is plenty, since you will be killing <em>a lot</em>.</ul><p>If you don't have enough and want more fun, get one of these:<ul><li><a href=http://riskofrain.wikia.com/wiki/Gasoline>Gasoline</a>. Burn the ground on kill, and more will die!<li><a href=http://riskofrain.wikia.com/wiki/Headstompers>Headstompers</a>. They make a pleasing sound on fall, and hurt.<li><a href=http://riskofrain.wikia.com/wiki/Lens-Maker's_Glasses>Lens-Maker's Glasses</a>. <strong>Stack 14</strong> and you will always deal a critical strike for double the damage.</ul><h3 id=uncommon-items><a href=http://riskofrain.wikia.com/wiki/Category:Uncommon_Items>Uncommon Items</a></h3><ul><li><a href=http://riskofrain.wikia.com/wiki/Infusion>Infusion</a>. You only really need one of this. Your life will skyrocket after a while, since this gives you 1HP per kill.<li><a href=http://riskofrain.wikia.com/wiki/Hopoo_Feather>Hopoo Feather</a>. <strong>Stack +10</strong> of these. You will pretty much be able to fly with so many jumps.<li><a href=http://riskofrain.wikia.com/wiki/Guardian's_Heart>Guardian's Heart</a>. 
Not really necessary, but useful for early and late game, since it will absorb infinite damage the first hit.</ul><p>If, again, you want more fun, get one of these:<ul><li><a href=http://riskofrain.wikia.com/wiki/Ukulele>Ukelele</a>. Spazz your enemies!<li><a href=http://riskofrain.wikia.com/wiki/Will-o'-the-wisp>Will-o'-the-wisp</a>. Explode your enemies!<li><a href=http://riskofrain.wikia.com/wiki/Chargefield_Generator>Chargefield Generator</a>. It should cover your entire screen after a bit, hurting all enemies without moving a finger.<li><a href=http://riskofrain.wikia.com/wiki/Golden_Gun>Golden Gun</a>. You will be rich, so this gives you +40% damage.<li><a href=http://riskofrain.wikia.com/wiki/Predatory_Instincts>Predatory Instincts</a>. If you got 14 glasses, you will always be doing critical strikes, and this will give even more attack speed.<li><a href=http://riskofrain.wikia.com/wiki/56_Leaf_Clover>56 Leaf Clover</a>. More drops, in case you didn't have enough.</ul><h3 id=rare-items><a href=http://riskofrain.wikia.com/wiki/Category:Rare_Items>Rare Items</a></h3><ul><li><a href=http://riskofrain.wikia.com/wiki/Ceremonial_Dagger>Ceremonial Dagger</a>. <strong>Stack +3</strong>, then killing one thing kills another thing and makes a chain reaction.<li><a href=http://riskofrain.wikia.com/wiki/Alien_Head>Alien Head</a>. <strong>Stack 3</strong>, and you will be able to use your abilities more often.</ul><p>For more fun:<ul><li><a href=http://riskofrain.wikia.com/wiki/Brilliant_Behemoth>Brilliant Behemoth</a>. Boom boom.</ul><h2 id=closing-words>Closing Words</h2><p>You can now beat the game in Monsoon solo with any character. Have fun! 
And be careful with the sadly common crashes.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Breaking Risk of Rain | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Breaking Risk of Rain</h1><div class=time><p>2019-01-12</div><p><a href=https://riskofraingame.com/>Risk of Rain</a> is a fun little game you can spend a lot of hours on. It's incredibly challenging for new players, and fun once you have learnt the basics. This blog will go through what I've learnt and how to play the game correctly.<h2 id=getting-started>Getting Started</h2><p>If you're new to the game, you may find it frustrating. You must learn very well to dodge.<p>Your first <a href=http://riskofrain.wikia.com/wiki/Category:Characters>character</a> will be <a href=http://riskofrain.wikia.com/wiki/Commando>Commando</a>. He's actually a very nice character. Use your third skill (dodge) to move faster, pass through large groups of enemies, and negate fall damage.<p>If there are a lot of monsters, remember to <strong>leave</strong> from there! It's really important for survival. Most enemies <strong>don't do body damage</strong>. 
Not even the body of the <a href=http://riskofrain.wikia.com/wiki/Magma_Worm>Magma Worm</a> or the <a href=http://riskofrain.wikia.com/wiki/Wandering_Vagrant>Wandering Vagrant</a> (just dodge the head and projectiles respectively).<p>The first thing you must do is always <strong>rush for the teleporter</strong>. Completing the levels quick will make the game easier. But make sure to take note of <strong>where the chests are</strong>! When you have time (even when the countdown finishes), go back for them and buy as many as you can. Generally, prefer <a href=http://riskofrain.wikia.com/wiki/Chest>chests</a> over <a href=http://riskofrain.wikia.com/wiki/Shrine>shrines</a> since they may eat all your money.<p>Completing the game on <a href=http://riskofrain.wikia.com/wiki/Difficulty>Drizzle</a> is really easy if you follow these tips.<h2 id=requisites>Requisites</h2><p>Before breaking the game, you must obtain several <a href=http://riskofrain.wikia.com/wiki/Item#Artifacts>artifacts</a>. We are interested in particular in the following:<ul><li><a href=http://riskofrain.wikia.com/wiki/Sacrifice>Sacrifice</a>. You really need this one, and may be a bit hard to get. With it, you will be able to farm the first level for 30 minutes and kill the final boss in 30 seconds.<li><a href=http://riskofrain.wikia.com/wiki/Command>Command</a>. You need this unless you want to grind for hours to get enough of the items you really need for the rest of the game. Getting this one is easy.<li><a href=http://riskofrain.wikia.com/wiki/Glass>Glass</a>. Your life will be very small (at the beginning…), but you will be able to one-shot everything easily.<li><a href=http://riskofrain.wikia.com/wiki/Kin>Kin</a> (optional). 
It makes it easier to obtain a lot of boxes if you restart the first level until you get <a href=http://riskofrain.wikia.com/wiki/Lemurian>lemurians</a> or <a href=http://riskofrain.wikia.com/wiki/Jellyfish>jellyfish</a> as the monster, since they're cheap to spawn.</ul><p>With those, the game becomes trivial. Playing as <a href=http://riskofrain.wikia.com/wiki/Huntress>Huntress</a> is excellent since she can move at high speed while killing everything on screen.<h2 id=breaking-the-game>Breaking the Game</h2><p>The rest is easy! With the command artifact you want the following items.<h3 id=common-items><a href=http://riskofrain.wikia.com/wiki/Category:Common_Items>Common Items</a></h3><ul><li><a href=http://riskofrain.wikia.com/wiki/Soldier's_Syringe>Soldier's Syringe</a>. <strong>Stack 13</strong> of these and you will triple your attack speed. You can get started with 4 or so.<li><a href=http://riskofrain.wikia.com/wiki/Paul's_Goat_Hoof>Paul's Goat Hoof</a>. <strong>Stack +30</strong> of these and your movement speed will be insane. You can get a very good speed with 8 or so.<li><a href=http://riskofrain.wikia.com/wiki/Crowbar>Crowbar</a>. <strong>Stack +20</strong> to guarantee you can one-shot bosses.</ul><p>If you want to be safer:<ul><li><a href=http://riskofrain.wikia.com/wiki/Hermit's_Scarf>Hermit's Scarf</a>. <strong>Stack 6</strong> of these to dodge 1/3 of the attacks.<li><a href=http://riskofrain.wikia.com/wiki/Monster_Tooth>Monster Tooth</a>. <strong>Stack 9</strong> of these to recover 50 life on kill. This is plenty, since you will be killing <em>a lot</em>.</ul><p>If you don't have enough and want more fun, get one of these:<ul><li><a href=http://riskofrain.wikia.com/wiki/Gasoline>Gasoline</a>. Burn the ground on kill, and more will die!<li><a href=http://riskofrain.wikia.com/wiki/Headstompers>Headstompers</a>. They make a pleasing sound on fall, and hurt.<li><a href=http://riskofrain.wikia.com/wiki/Lens-Maker's_Glasses>Lens-Maker's Glasses</a>. 
<strong>Stack 14</strong> and you will always deal a critical strike for double the damage.</ul><h3 id=uncommon-items><a href=http://riskofrain.wikia.com/wiki/Category:Uncommon_Items>Uncommon Items</a></h3><ul><li><a href=http://riskofrain.wikia.com/wiki/Infusion>Infusion</a>. You only really need one of this. Your life will skyrocket after a while, since this gives you 1HP per kill.<li><a href=http://riskofrain.wikia.com/wiki/Hopoo_Feather>Hopoo Feather</a>. <strong>Stack +10</strong> of these. You will pretty much be able to fly with so many jumps.<li><a href=http://riskofrain.wikia.com/wiki/Guardian's_Heart>Guardian's Heart</a>. Not really necessary, but useful for early and late game, since it will absorb infinite damage the first hit.</ul><p>If, again, you want more fun, get one of these:<ul><li><a href=http://riskofrain.wikia.com/wiki/Ukulele>Ukelele</a>. Spazz your enemies!<li><a href=http://riskofrain.wikia.com/wiki/Will-o'-the-wisp>Will-o'-the-wisp</a>. Explode your enemies!<li><a href=http://riskofrain.wikia.com/wiki/Chargefield_Generator>Chargefield Generator</a>. It should cover your entire screen after a bit, hurting all enemies without moving a finger.<li><a href=http://riskofrain.wikia.com/wiki/Golden_Gun>Golden Gun</a>. You will be rich, so this gives you +40% damage.<li><a href=http://riskofrain.wikia.com/wiki/Predatory_Instincts>Predatory Instincts</a>. If you got 14 glasses, you will always be doing critical strikes, and this will give even more attack speed.<li><a href=http://riskofrain.wikia.com/wiki/56_Leaf_Clover>56 Leaf Clover</a>. More drops, in case you didn't have enough.</ul><h3 id=rare-items><a href=http://riskofrain.wikia.com/wiki/Category:Rare_Items>Rare Items</a></h3><ul><li><a href=http://riskofrain.wikia.com/wiki/Ceremonial_Dagger>Ceremonial Dagger</a>. <strong>Stack +3</strong>, then killing one thing kills another thing and makes a chain reaction.<li><a href=http://riskofrain.wikia.com/wiki/Alien_Head>Alien Head</a>. 
<strong>Stack 3</strong>, and you will be able to use your abilities more often.</ul><p>For more fun:<ul><li><a href=http://riskofrain.wikia.com/wiki/Brilliant_Behemoth>Brilliant Behemoth</a>. Boom boom.</ul><h2 id=closing-words>Closing Words</h2><p>You can now beat the game in Monsoon solo with any character. Have fun! And be careful with the sadly common crashes.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Python ctypes and Windows | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Python ctypes and Windows</h1><div class=time><p>2019-06-19</div><p><a href=https://www.python.org/>Python</a>'s <a href=https://docs.python.org/3/library/ctypes.html><code>ctypes</code></a> is quite a nice library to easily load and invoke C methods available in already-compiled <a href=https://en.wikipedia.org/wiki/Dynamic-link_library><code>.dll</code> files</a> without any additional dependencies. And I <em>love</em> depending on as little as possible.<p>In this blog post, we will walk through my endeavors to use <code>ctypes</code> with the <a href=https://docs.microsoft.com/en-us/windows/desktop/api/>Windows API</a>, and do some cool stuff with it.<p>We will assume some knowledge of C/++ and Python, since we will need to read and write a bit of both. Please note that this post is only an introduction to <code>ctypes</code>, and if you need more information you should consult the <a href=https://docs.python.org/3/library/ctypes.html>Python's documentation for <code>ctypes</code></a>.<p>While the post focuses on Windows' API, the code here probably applies to unix-based systems with little modifications.<h2 id=basics>Basics</h2><p>First of all, let's learn how to load a library. 
Let's say we want to load <code>User32.dll</code>:<pre><code class=language-python data-lang=python>import ctypes +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Python ctypes and Windows | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Python ctypes and Windows</h1><div class=time><p>2019-06-19</div><p><a href=https://www.python.org/>Python</a>'s <a href=https://docs.python.org/3/library/ctypes.html><code>ctypes</code></a> is quite a nice library to easily load and invoke C methods available in already-compiled <a href=https://en.wikipedia.org/wiki/Dynamic-link_library><code>.dll</code> files</a> without any additional dependencies. And I <em>love</em> depending on as little as possible.<p>In this blog post, we will walk through my endeavors to use <code>ctypes</code> with the <a href=https://docs.microsoft.com/en-us/windows/desktop/api/>Windows API</a>, and do some cool stuff with it.<p>We will assume some knowledge of C/++ and Python, since we will need to read and write a bit of both. Please note that this post is only an introduction to <code>ctypes</code>, and if you need more information you should consult the <a href=https://docs.python.org/3/library/ctypes.html>Python's documentation for <code>ctypes</code></a>.<p>While the post focuses on Windows' API, the code here probably applies to unix-based systems with little modifications.<h2 id=basics>Basics</h2><p>First of all, let's learn how to load a library. 
Let's say we want to load <code>User32.dll</code>:<pre><code class=language-python data-lang=python>import ctypes ctypes.windll.user32 </code></pre><p>Yes, it's that simple. When you access an attribute of <code>windll</code>, said library will load. Since Windows is case-insensitive, we will use lowercase consistently.<p>Calling a function is just as simple. Let's say you want to call <a href=https://docs.microsoft.com/en-us/windows/desktop/api/winuser/nf-winuser-setcursorpos><code>SetCursorPos</code></a>, which is defined as follows:<pre><code class=language-c data-lang=c>BOOL SetCursorPos(
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Graphs | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Graphs</h1><div class=time><p>2017-06-02</div><p><noscript>There are a few things which won't render unless you enable JavaScript. No tracking, I promise!</noscript><blockquote><p>Don't know English? <a href=https://lonami.dev/blog/graphs/spanish.html>Read the Spanish version instead</a>.</blockquote><p>Let's imagine we have 5 bus stations, which we'll denote by ((s_i)):<div class=matrix>' s_1 ' s_2 ' s_3 ' s_4 ' s_5 \\ s_1 ' ' V ' ' ' \\ s_2 ' V ' ' ' ' V \\ s_3 ' ' ' ' V ' \\ s_4 ' ' V ' V ' ' \\ s_5 ' V ' ' ' V '</div><p>This is known as a "table of direct interconnections". The ((V)) represent connected paths. For instance, on the first row starting at ((s_1)), reaching the ((V)), allows us to turn up to get to ((s_2)).<p>We can see the above table represented in a more graphical way:<p><img src=https://lonami.dev/blog/graphs/example1.svg alt="Table 1 as a Graph"><p>This type of graph is called, well, a graph, and it's a directed graph (or digraph), since the direction on which the arrows go does matter. It's made up of vertices, joined together by edges (also known as lines or directed arcs).<p>One can walk from a node to another through different paths. 
For example, ((s_4 $rightarrow s_2 $rightarrow s_5)) is an indirect path of order two, because we must use two edges to go from ((s_4)) to ((s_5)).<p>Let's now represent its adjacency matrix, called A, which represents the same table but uses 1 instead of V to represent a connection:<div class=matrix>0 ' 1 ' 0 ' 0 ' 0 \\ 1 ' 0 ' 0 ' 0 ' 1 \\ 0 ' 0 ' 0 ' 1 ' 0 \\ 0 ' 1 ' 1 ' 0 ' 0 \\ 1 ' 0 ' 0 ' 1 ' 0</div><p>This way we can see how the ((a_{2,1})) element represents the connection ((s_2 $rightarrow s_1)), and the ((a_{5,1})) element the ((s_5 $rightarrow s_1)) connection, etc.<p>In general, ((a_{i,j})) represents a connection from ((s_i $rightarrow s_j)) as long as ((a_{i,j}$geq 1)).<p>Working with matrices allows us to have a computable representation of any graph, which is very useful.<hr><p>Graphs have a lot of interesting properties besides being representable by a computer. What would happen if, for instance, we calculated ((A^2))? We obtain the following matrix:<div class=matrix>1 ' 0 ' 0 ' 0 ' 1 \\ 1 ' 1 ' 0 ' 1 ' 0 \\ 0 ' 1 ' 1 ' 0 ' 0 \\ 1 ' 0 ' 0 ' 1 ' 1 \\ 0 ' 2 ' 1 ' 0 ' 0</div><p>We can interpret this as the paths of order two. But what does the element ((a_{5,2}=2)) represent? It indicates the number of possible ways to go from ((s_5 $rightarrow s_i $rightarrow s_2)).<p>One can manually multiply the involved row and column to determine which elements we can pass through. This way we have the row (([1 0 0 1 0])) and the column (([1 0 0 1 0])) (written vertically). The elements ((s_i$geq 1)) are ((s_1)) and ((s_4)). 
That is, we can go from ((s_5)) to ((s_2)) via ((s_5 $rightarrow s_1 $rightarrow s_2)) or via ((s_5 $rightarrow s_4 $rightarrow s_2)): <img src=example2.svg><p>It's important to note that graphs do not consider self-connections, that is, ((s_i $rightarrow s_i)) is not allowed; nor do we work with multigraphs here (those which allow multiple connections, for instance, an arbitrary number ((n)) of times).<p>Calculating ((A^3)) gives:<div class=matrix>1 ' 1 ' 0 ' 1 ' 0 \\ 1 ' 2 ' \textbf{1} ' 0 ' 1 \\ 1 ' 0 ' 0 ' 1 ' 1 \\ 1 ' 2 ' 1 ' 1 ' 0 \\ 2 ' 0 ' 0 ' 1 ' 2</div><p>We can see how the first ((1)) just appeared on the element ((a_{2,3})), which means that the shortest path from ((s_2)) to ((s_3)) is of order three.<hr><p>A graph is said to be strongly connected as long as there is a way to reach every one of its elements from any other.<p>We can see all the available paths until now by simply adding up all the direct and indirect ways to reach a node, so for now, we can add ((A+A^2+A^3)) in such a way that:<div class=matrix>2 ' 2 ' 0 ' 1 ' 1 \\ 3 ' 3 ' 1 ' 1 ' 2 \\ 1 ' 1 ' 1 ' 2 ' 1 \\ 2 ' 3 ' 2 ' 2 ' 1 \\ 3 ' 2 ' 1 ' 2 ' 2</div><p>There isn't a connection between ((s_1)) and ((s_3)) yet. If we were to calculate ((A^4)):<div class=matrix>1 ' 2 ' 1 ' ' \\ ' ' ' ' \\ ' ' ' ' \\ ' ' ' ' \\ ' ' ' '</div><p>We don't need to calculate any more. We now know that the graph is strongly connected!<hr><p>Congratulations! You've completed this tiny introduction to graphs. 
Now you can play around with them and design your own connections.<p>Hold the left mouse button on the above area and drag it down to create a new node, or drag a node to this area to delete it.<p>To create new connections, hold the right mouse button on the node you want to start with, and drag it to the node you want it to be connected to.<p>To delete the connections coming from a specific node, middle click it.<table><tr><td style=width:100%;><button onclick=resetConnections()>Reset connections</button> <button onclick=clearNodes()>Clear all the nodes</button> <br> <br> <label for=matrixOrder>Show matrix of order:</label> <input id=matrixOrder type=number min=1 max=5 value=1 oninput=updateOrder()> <br> <label for=matrixAccum>Show accumulated matrix</label> <input id=matrixAccum type=checkbox onchange=updateOrder()> <br> <br> <div><table id=matrixTable></table></div><td><canvas id=canvas width=400 height=400 oncontextmenu="return false;">Looks like your browser won't let you see this fancy example :(</canvas> <br></table><script src=tinyparser.js></script><script src=enhancements.js></script><script src=graphs.js></script></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Graphs | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img 
src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Graphs</h1><div class=time><p>2017-06-02</div><p><noscript>There are a few things which won't render unless you enable JavaScript. No tracking, I promise!</noscript><blockquote><p>Don't know English? <a href=https://lonami.dev/blog/graphs/spanish.html>Read the Spanish version instead</a>.</blockquote><p>Let's imagine we have 5 bus stations, which we'll denote by ((s_i)):<div class=matrix>' s_1 ' s_2 ' s_3 ' s_4 ' s_5 \\ s_1 ' ' V ' ' ' \\ s_2 ' V ' ' ' ' V \\ s_3 ' ' ' ' V ' \\ s_4 ' ' V ' V ' ' \\ s_5 ' V ' ' ' V '</div><p>This is known as a "table of direct interconnections". The ((V)) represent connected paths. For instance, on the first row starting at ((s_1)), reaching the ((V)), allows us to turn up to get to ((s_2)).<p>We can see the above table represented in a more graphical way:<p><img src=https://lonami.dev/blog/graphs/example1.svg alt="Table 1 as a Graph"><p>This type of graph is called, well, a graph, and it's a directed graph (or digraph), since the direction on which the arrows go does matter. It's made up of vertices, joined together by edges (also known as lines or directed arcs).<p>One can walk from a node to another through different paths. 
For example, ((s_4 $rightarrow s_2 $rightarrow s_5)) is an indirect path of order two, because we must use two edges to go from ((s_4)) to ((s_5)).<p>Let's now represent its adjacency matrix, called A, which represents the same table but uses 1 instead of V to represent a connection:<div class=matrix>0 ' 1 ' 0 ' 0 ' 0 \\ 1 ' 0 ' 0 ' 0 ' 1 \\ 0 ' 0 ' 0 ' 1 ' 0 \\ 0 ' 1 ' 1 ' 0 ' 0 \\ 1 ' 0 ' 0 ' 1 ' 0</div><p>This way we can see how the ((a_{2,1})) element represents the connection ((s_2 $rightarrow s_1)), and the ((a_{5,1})) element the ((s_5 $rightarrow s_1)) connection, etc.<p>In general, ((a_{i,j})) represents a connection from ((s_i $rightarrow s_j)) as long as ((a_{i,j}$geq 1)).<p>Working with matrices allows us to have a computable representation of any graph, which is very useful.<hr><p>Graphs have a lot of interesting properties besides being representable by a computer. What would happen if, for instance, we calculated ((A^2))? We obtain the following matrix:<div class=matrix>1 ' 0 ' 0 ' 0 ' 1 \\ 1 ' 1 ' 0 ' 1 ' 0 \\ 0 ' 1 ' 1 ' 0 ' 0 \\ 1 ' 0 ' 0 ' 1 ' 1 \\ 0 ' 2 ' 1 ' 0 ' 0</div><p>We can interpret this as the paths of order two. But what does the element ((a_{5,2}=2)) represent? It indicates the number of possible ways to go from ((s_5 $rightarrow s_i $rightarrow s_2)).<p>One can manually multiply the involved row and column to determine which elements we can pass through. This way we have the row (([1 0 0 1 0])) and the column (([1 0 0 1 0])) (written vertically). The elements ((s_i$geq 1)) are ((s_1)) and ((s_4)). 
That is, we can go from ((s_5)) to ((s_2)) via ((s_5 $rightarrow s_1 $rightarrow s_2)) or via ((s_5 $rightarrow s_4 $rightarrow s_2)): <img src=example2.svg><p>It's important to note that the graphs we consider here do not allow self-connections, that is, ((s_i $rightarrow s_i)) is not allowed; nor do we work with multigraphs here (those which allow multiple connections between the same nodes, for instance, an arbitrary number ((n)) of times).<p>Calculating ((A^3)) gives:<div class=matrix>1 ' 1 ' 0 ' 1 ' 0 \\ 1 ' 2 ' \textbf{1} ' 0 ' 1 \\ 1 ' 0 ' 0 ' 1 ' 1 \\ 1 ' 2 ' 1 ' 1 ' 0 \\ 2 ' 0 ' 0 ' 1 ' 2</div><p>We can see how the first ((1)) just appeared on the element ((a_{2,3})), which means that the shortest path from ((s_2)) to ((s_3)) is of order three.<hr><p>A graph is said to be strongly connected as long as there is a way to reach every node from every other node.<p>We can see all the available paths until now by simply adding up all the direct and indirect ways to reach a node, so for now, we can add ((A+A^2+A^3)) in such a way that:<div class=matrix>2 ' 2 ' 0 ' 1 ' 1 \\ 3 ' 3 ' 1 ' 1 ' 3 \\ 1 ' 1 ' 1 ' 2 ' 1 \\ 2 ' 3 ' 2 ' 2 ' 1 \\ 3 ' 2 ' 1 ' 2 ' 2</div><p>There isn't a connection between ((s_1)) and ((s_3)) yet. If we were to calculate ((A^4)):<div class=matrix>1 ' 2 ' 1 ' ' \\ ' ' ' ' \\ ' ' ' ' \\ ' ' ' ' \\ ' ' ' '</div><p>We don't need to calculate any further: the only missing element, ((a_{1,3})), is no longer zero. We now know that the graph is strongly connected!<hr><p>Congratulations! You've completed this tiny introduction to graphs. 
Now you can play around with them and design your own connections.<p>Hold the left mouse button on the above area and drag it down to create a new node, or drag a node to this area to delete it.<p>To create new connections, hold the right mouse button on the node you want to start with, and drag it to the node you want it to be connected to.<p>To delete the connections coming from a specific node, middle click it.<table><tr><td style=width:100%;><button onclick=resetConnections()>Reset connections</button> <button onclick=clearNodes()>Clear all the nodes</button> <br> <br> <label for=matrixOrder>Show matrix of order:</label> <input id=matrixOrder type=number min=1 max=5 value=1 oninput=updateOrder()> <br> <label for=matrixAccum>Show accumulated matrix</label> <input id=matrixAccum type=checkbox onchange=updateOrder()> <br> <br> <div><table id=matrixTable></table></div><td><canvas id=canvas width=400 height=400 oncontextmenu="return false;">Looks like your browser won't let you see this fancy example :(</canvas> <br></table><script src=tinyparser.js></script><script src=enhancements.js></script><script src=graphs.js></script></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>My Blog</h1><p id=welcome onclick=pls_stop()>Welcome to my blog!<p>Here I occasionally post new entries, mostly tech related. Perhaps it's tips for a new game I'm playing, perhaps it has something to do with FFI, or perhaps I'm fighting the borrow checker (just kidding, I'm over that. Mostly).<hr><ul><li><a href=https://lonami.dev/blog/woce-1/>Writing our own Cheat Engine: Introduction</a><span class=dim> [mod sw; 'windows, 'rust, 'hacking] </span><li><a href=https://lonami.dev/blog/university/>Data Mining, Warehousing and Information Retrieval</a><span class=dim> [mod algos; 'series, 'bigdata, 'databases] </span><li><a href=https://lonami.dev/blog/new-computer/>My new computer</a><span class=dim> [mod hw; 'showoff] </span><li><a href=https://lonami.dev/blog/tips-outpost/>Tips for Outpost</a><span class=dim> [mod games; 'tips] </span><li><a href=https://lonami.dev/blog/ctypes-and-windows/>Python ctypes and Windows</a><span class=dim> [mod sw; 'python, 'ffi, 'windows] </span><li><a href=https://lonami.dev/blog/pixel-dungeon/>Shattered Pixel Dungeon</a><span class=dim> [mod games; 'tips] </span><li><a href=https://lonami.dev/blog/installing-nixos-2/>Installing NixOS, Take 2</a><span class=dim> [mod sw; 'os, 'nixos] </span><li><a href=https://lonami.dev/blog/breaking-ror/>Breaking Risk of Rain</a><span class=dim> [mod games; 'tips] </span><li><a href=https://lonami.dev/blog/world-edit/>WorldEdit Commands</a><span class=dim> [mod games; 'minecraft, 'worldedit, 'tips] </span><li><a 
href=https://lonami.dev/blog/asyncio/>An Introduction to Asyncio</a><span class=dim> [mod sw; 'python, 'asyncio] </span><li><a href=https://lonami.dev/blog/posts/>Atemporal Blog Posts</a><span class=dim> [mod algos; 'algorithms, 'culture, 'debate, 'foodforthought, 'graphics, 'optimization] </span><li><a href=https://lonami.dev/blog/graphs/>Graphs</a><span class=dim> [mod algos; 'graphs] </span><li><a href=https://lonami.dev/blog/installing-nixos/>Installing NixOS</a><span class=dim> [mod sw; 'os, 'nixos] </span></ul><script> +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>My Blog</h1><p id=welcome onclick=pls_stop()>Welcome to my blog!<p>Here I occasionally post new entries, mostly tech related. Perhaps it's tips for a new game I'm playing, perhaps it has something to do with FFI, or perhaps I'm fighting the borrow checker (just kidding, I'm over that. 
Mostly).<hr><ul><li><a href=https://lonami.dev/blog/woce-1/>Writing our own Cheat Engine: Introduction</a><span class=dim> [mod sw; 'windows, 'rust, 'hacking] </span><li><a href=https://lonami.dev/blog/university/>Data Mining, Warehousing and Information Retrieval</a><span class=dim> [mod algos; 'series, 'bigdata, 'databases] </span><li><a href=https://lonami.dev/blog/new-computer/>My new computer</a><span class=dim> [mod hw; 'showoff] </span><li><a href=https://lonami.dev/blog/tips-outpost/>Tips for Outpost</a><span class=dim> [mod games; 'tips] </span><li><a href=https://lonami.dev/blog/ctypes-and-windows/>Python ctypes and Windows</a><span class=dim> [mod sw; 'python, 'ffi, 'windows] </span><li><a href=https://lonami.dev/blog/pixel-dungeon/>Shattered Pixel Dungeon</a><span class=dim> [mod games; 'tips] </span><li><a href=https://lonami.dev/blog/installing-nixos-2/>Installing NixOS, Take 2</a><span class=dim> [mod sw; 'os, 'nixos] </span><li><a href=https://lonami.dev/blog/breaking-ror/>Breaking Risk of Rain</a><span class=dim> [mod games; 'tips] </span><li><a href=https://lonami.dev/blog/world-edit/>WorldEdit Commands</a><span class=dim> [mod games; 'minecraft, 'worldedit, 'tips] </span><li><a href=https://lonami.dev/blog/asyncio/>An Introduction to Asyncio</a><span class=dim> [mod sw; 'python, 'asyncio] </span><li><a href=https://lonami.dev/blog/posts/>Atemporal Blog Posts</a><span class=dim> [mod algos; 'algorithms, 'culture, 'debate, 'foodforthought, 'graphics, 'optimization] </span><li><a href=https://lonami.dev/blog/graphs/>Graphs</a><span class=dim> [mod algos; 'graphs] </span><li><a href=https://lonami.dev/blog/installing-nixos/>Installing NixOS</a><span class=dim> [mod sw; 'os, 'nixos] </span></ul><script> const WELCOME_EN = 'Welcome to my blog!' const WELCOME_ES = '¡Bienvenido a mi blog!' const APOLOGIES = "ok sorry i'll stop"
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Installing NixOS, Take 2 | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Installing NixOS, Take 2</h1><div class=time><p>2019-02-15<p>last updated 2019-02-16</div><p>This is my second take at installing NixOS, after a while being frustrated with Arch Linux and the fact that a few kernel upgrades ago, the system crashed randomly from time to time. <code>journalctl</code> did not have any helpful hints and I thought reinstalling could be worthwhile anyway.<p>This time, I started with more knowledge! The first step is heading to the <a href=https://nixos.org>NixOS website</a> and downloading their minimal installation CD for 64 bits. I didn't go with their graphical live CD, because their <a href=https://nixos.org/nixos/manual>installation manual</a> is a wonderful resource that guides you nicely.<p>Once you have downloaded their <code>.iso</code>, you should probably verify it's <code>sha256sum</code> and make sure that it matches. The easiest thing to do in my opinion is using an USB to burn the image in it. Plug it in and check its device name with <code>fdisk -l</code>. In my case, it was <code>/dev/sdb</code>, so I went ahead with it and ran <code>dd if=nixos.iso of=/dev/sdb status=progress</code>. Make sure to run <code>sync</code> once that's done.<p>If either <code>dd</code> or <code>sync</code> seem "stuck" in the end, they are just flushing the changes to disk to make sure all is good. This is normal, and depends on your drives.<p>Now, reboot your computer with the USB plugged in and make sure to boot into it. 
You should be welcome with a pretty screen. Just select the first option and wait until it logs you in as root. Once you're there you probably want to <code>loadkeys es</code> or whatever your keyboard layout is, or you will have a hard time with passwords, since the characters are all over the place.<p>In a clean disk, you would normally create the partitions now. In my case, I already had the partitions made (100MB for the EFI system, where <code>/boot</code> lives, 40GB for the root <code>/</code> partition with my old Linux installation, and 700G for <code>/home</code>), so I didn't need to do anything here. The manual showcases <code>parted</code>, but I personally use <code>fdisk</code>, which has very helpful help I check every time I use it.<p><strong>Important</strong>: The <code>XY</code> in <code>/dev/sdXY</code> is probably different in your system! Make sure you use <code>fdisk -l</code> to see the correct letters and numbers!<p>With the partitions ready in my UEFI system, I formatted both <code>/</code> and <code>/boot</code> just to be safe with <code>mkfs.ext4 -L nixos /dev/sda2</code> and <code>mkfs.fat -F 32 -n boot /dev/sda1</code> (remember that these are the letters and numbers used in my partition scheme). Don't worry about the warning in the second command regarding lowercase letters and Windows. It's not really an issue.<p>Now, since we gave each partition a label, we can easily mount them through <code>mount /dev/disk/by-label/nixos /mnt</code> and, in UEFI systems, be sure to <code>mkdir -p /mnt/boot</code> and <code>mount /dev/disk/by-label/boot /mnt/boot</code>. I didn't bother setting up swap, since I have 8GB of RAM in my laptop and that's really enough for my use case.<p>With that done, we will now ask the configuration wizard to do some work for us (in particular, generate a template) with <code>nixos-generate-config --root /mnt</code>. This generates a very well documented file that we should edit right now (and this is important!) 
with whatever editor you prefer. I used <code>vim</code>, but you can change it for <code>nano</code> if you prefer.<p>On to the configuration file, we need to enable a few things, so <code>vim /mnt/etc/nixos/configuration.nix</code> and start scrolling down. We want to make sure to uncomment:<pre><code># We really want network! +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Installing NixOS, Take 2 | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Installing NixOS, Take 2</h1><div class=time><p>2019-02-15<p>last updated 2019-02-16</div><p>This is my second take at installing NixOS, after a while being frustrated with Arch Linux and the fact that a few kernel upgrades ago, the system crashed randomly from time to time. <code>journalctl</code> did not have any helpful hints and I thought reinstalling could be worthwhile anyway.<p>This time, I started with more knowledge! The first step is heading to the <a href=https://nixos.org>NixOS website</a> and downloading their minimal installation CD for 64 bits. I didn't go with their graphical live CD, because their <a href=https://nixos.org/nixos/manual>installation manual</a> is a wonderful resource that guides you nicely.<p>Once you have downloaded their <code>.iso</code>, you should probably verify it's <code>sha256sum</code> and make sure that it matches. The easiest thing to do in my opinion is using an USB to burn the image in it. Plug it in and check its device name with <code>fdisk -l</code>. 
In my case, it was <code>/dev/sdb</code>, so I went ahead with it and ran <code>dd if=nixos.iso of=/dev/sdb status=progress</code>. Make sure to run <code>sync</code> once that's done.<p>If either <code>dd</code> or <code>sync</code> seem "stuck" in the end, they are just flushing the changes to disk to make sure all is good. This is normal, and depends on your drives.<p>Now, reboot your computer with the USB plugged in and make sure to boot into it. You should be welcome with a pretty screen. Just select the first option and wait until it logs you in as root. Once you're there you probably want to <code>loadkeys es</code> or whatever your keyboard layout is, or you will have a hard time with passwords, since the characters are all over the place.<p>In a clean disk, you would normally create the partitions now. In my case, I already had the partitions made (100MB for the EFI system, where <code>/boot</code> lives, 40GB for the root <code>/</code> partition with my old Linux installation, and 700G for <code>/home</code>), so I didn't need to do anything here. The manual showcases <code>parted</code>, but I personally use <code>fdisk</code>, which has very helpful help I check every time I use it.<p><strong>Important</strong>: The <code>XY</code> in <code>/dev/sdXY</code> is probably different in your system! Make sure you use <code>fdisk -l</code> to see the correct letters and numbers!<p>With the partitions ready in my UEFI system, I formatted both <code>/</code> and <code>/boot</code> just to be safe with <code>mkfs.ext4 -L nixos /dev/sda2</code> and <code>mkfs.fat -F 32 -n boot /dev/sda1</code> (remember that these are the letters and numbers used in my partition scheme). Don't worry about the warning in the second command regarding lowercase letters and Windows. 
It's not really an issue.<p>Now, since we gave each partition a label, we can easily mount them through <code>mount /dev/disk/by-label/nixos /mnt</code> and, in UEFI systems, be sure to <code>mkdir -p /mnt/boot</code> and <code>mount /dev/disk/by-label/boot /mnt/boot</code>. I didn't bother setting up swap, since I have 8GB of RAM in my laptop and that's really enough for my use case.<p>With that done, we will now ask the configuration wizard to do some work for us (in particular, generate a template) with <code>nixos-generate-config --root /mnt</code>. This generates a very well documented file that we should edit right now (and this is important!) with whatever editor you prefer. I used <code>vim</code>, but you can change it for <code>nano</code> if you prefer.<p>On to the configuration file, we need to enable a few things, so <code>vim /mnt/etc/nixos/configuration.nix</code> and start scrolling down. We want to make sure to uncomment:<pre><code># We really want network! networking.wireless.enable = true; # This "fixes" the keyboard layout. Put the one you use.
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Installing NixOS | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Installing NixOS</h1><div class=time><p>2017-05-13<p>last updated 2019-02-16</div><h2 id=update>Update</h2><p><em>Please see <a href=../installing_nixos_2/index.html>my followup post with NixOS</a> for a far better experience with it</em><hr><p>Today I decided to install <a href=http://nixos.org/>NixOS</a> as a recommendation, a purely functional Linux distribution, since <a href=https://xubuntu.org/>Xubuntu</a> kept crashing. Here's my journey, and how I managed to install it from a terminal for the first time in my life. Steps aren't hard, but they may not seem obvious at first.<ul><li><p>Grab the Live CD, burn it on a USB stick and boot. I recommend using <a href=https://etcher.io/>Etcher</a>.<li><p>Type <code>systemctl start display-manager</code> and wait.<sup class=footnote-reference><a href=#1>1</a></sup><li><p>Open both the manual and the <code>konsole</code>.<li><p>Connect to the network using the GUI.<li><p>Create the disk partitions by using <code>fdisk</code>.</p> <p>You can list them with <code>fdisk -l</code>, modify a certain drive with <code>fdisk /dev/sdX</code> (for instance, <code>/dev/sda</code>) and follow the instructions.</p> <p>To create the file system, use <code>mkfs.ext4 -L <label> /dev/sdXY</code> and swap with <code>mkswap -L <label> /dev/sdXY</code>.</p> <p>The EFI partition should be done with <code>mkfs.vfat</code>.<li><p>Mount the target to <code>/mnt</code> e.g. 
if the label was <code>nixos</code>, <code>mount /dev/disk/by-label/nixos /mnt</code><li><p><code>mkdir /mnt/boot</code> and then mount your EFI partition to it.<li><p>Generate a configuration template with <code>nixos-generate-config --root /mnt</code>, and modify it with <code>nano /etc/nixos/configuration.nix</code>.<li><p>While modifying the configuration, make sure to add <code>boot.loader.grub.device = "/dev/sda"</code><li><p>More useful configuration things are:</p> <ul><li>Uncomment the whole <code>i18n</code> block.<li>Add some essential packages like <code>environment.systemPackages = with pkgs; [wget git firefox pulseaudio networkmanagerapplet];</code>.<li>If you want to use XFCE, add <code>services.xserver.desktopManager.xfce.enable = true;</code>, otherwise, you don't need <code>networkmanagerapplet</code> either. Make sure to add <code>networking.networkmanager.enable = true;</code> too.<li>Define some user for yourself (modify <code>guest</code> name) and use a UID greater than 1000. Also, add yourself to <code>extraGroups = ["wheel" "networkmanager"];</code> (the first to be able to <code>sudo</code>, the second to use network related things).</ul><li><p>Run <code>nixos-install</code>. If you ever modify that file again, to add more packages for instance (this is how they're installed), run <code>nixos-rebuild switch</code> (or use <code>test</code> to test but don't boot to it, or <code>boot</code> not to switch but to use on next boot.<li><p><code>reboot</code>.<li><p>Login as <code>root</code>, and set a password for your user with <code>passwd <user></code>. Done!</ul><p>I enjoyed the process of installing it, and it's really cool that it has versioning and is so clean to keep track of which packages you install. 
But not being able to run arbitrary binaries by default is something very limitting in my opinion, though they've done a good job.<p>I'm now back to Xubuntu, with a fresh install.<h2 id=update-1>Update</h2><p>It is not true that "they don't allow running arbitrary binaries by default", as pointed out in their <a href=https://nixos.org/nixpkgs/manual/#sec-fhs-environments>manual, buildFHSUserEnv</a>:<blockquote><p><code>buildFHSUserEnv</code> provides a way to build and run FHS-compatible lightweight sandboxes. It creates an isolated root with bound <code>/nix/store</code>, so its footprint in terms of disk space needed is quite small. This allows one to run software which is hard or unfeasible to patch for NixOS -- 3rd-party source trees with FHS assumptions, games distributed as tarballs, software with integrity checking and/or external self-updated binaries. It uses Linux namespaces feature to create temporary lightweight environments which are destroyed after all child processes exit, without root user rights requirement.</blockquote><p>Thanks to <a href=https://github.com/bb010g>@bb010g</a> for pointing this out.<h2 id=notes>Notes</h2><div class=footnote-definition id=1><sup class=footnote-definition-label>1</sup><p>The keyboard mapping is a bit strange. 
On my Spanish keyboard, the keys were as follows:</div><table><thead><tr><th>Keyboard<th>Maps to<th>Shift<tbody><tr><td>'<td>-<td>_<tr><td>´<td>'<td>"<tr><td>`<td>[<td><tr><td>+<td>]<td><tr><td>¡<td>=<td><tr><td>-<td>/<td><tr><td>ñ<td>;<td></table></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Installing NixOS | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Installing NixOS</h1><div class=time><p>2017-05-13<p>last updated 2019-02-16</div><h2 id=update>Update</h2><p><em>Please see <a href=../installing_nixos_2/index.html>my followup post with NixOS</a> for a far better experience with it</em><hr><p>Today I decided to install <a href=http://nixos.org/>NixOS</a> as a recommendation, a purely functional Linux distribution, since <a href=https://xubuntu.org/>Xubuntu</a> kept crashing. Here's my journey, and how I managed to install it from a terminal for the first time in my life. Steps aren't hard, but they may not seem obvious at first.<ul><li><p>Grab the Live CD, burn it on a USB stick and boot. 
I recommend using <a href=https://etcher.io/>Etcher</a>.<li><p>Type <code>systemctl start display-manager</code> and wait.<sup class=footnote-reference><a href=#1>1</a></sup><li><p>Open both the manual and the <code>konsole</code>.<li><p>Connect to the network using the GUI.<li><p>Create the disk partitions by using <code>fdisk</code>.</p> <p>You can list them with <code>fdisk -l</code>, modify a certain drive with <code>fdisk /dev/sdX</code> (for instance, <code>/dev/sda</code>) and follow the instructions.</p> <p>To create the file system, use <code>mkfs.ext4 -L <label> /dev/sdXY</code> and swap with <code>mkswap -L <label> /dev/sdXY</code>.</p> <p>The EFI partition should be done with <code>mkfs.vfat</code>.<li><p>Mount the target to <code>/mnt</code> e.g. if the label was <code>nixos</code>, <code>mount /dev/disk/by-label/nixos /mnt</code><li><p><code>mkdir /mnt/boot</code> and then mount your EFI partition to it.<li><p>Generate a configuration template with <code>nixos-generate-config --root /mnt</code>, and modify it with <code>nano /etc/nixos/configuration.nix</code>.<li><p>While modifying the configuration, make sure to add <code>boot.loader.grub.device = "/dev/sda"</code><li><p>More useful configuration things are:</p> <ul><li>Uncomment the whole <code>i18n</code> block.<li>Add some essential packages like <code>environment.systemPackages = with pkgs; [wget git firefox pulseaudio networkmanagerapplet];</code>.<li>If you want to use XFCE, add <code>services.xserver.desktopManager.xfce.enable = true;</code>, otherwise, you don't need <code>networkmanagerapplet</code> either. Make sure to add <code>networking.networkmanager.enable = true;</code> too.<li>Define some user for yourself (modify <code>guest</code> name) and use a UID greater than 1000. Also, add yourself to <code>extraGroups = ["wheel" "networkmanager"];</code> (the first to be able to <code>sudo</code>, the second to use network related things).</ul><li><p>Run <code>nixos-install</code>. 
If you ever modify that file again, to add more packages for instance (this is how they're installed), run <code>nixos-rebuild switch</code> (or use <code>test</code> to test but don't boot to it, or <code>boot</code> not to switch but to use on next boot.<li><p><code>reboot</code>.<li><p>Login as <code>root</code>, and set a password for your user with <code>passwd <user></code>. Done!</ul><p>I enjoyed the process of installing it, and it's really cool that it has versioning and is so clean to keep track of which packages you install. But not being able to run arbitrary binaries by default is something very limitting in my opinion, though they've done a good job.<p>I'm now back to Xubuntu, with a fresh install.<h2 id=update-1>Update</h2><p>It is not true that "they don't allow running arbitrary binaries by default", as pointed out in their <a href=https://nixos.org/nixpkgs/manual/#sec-fhs-environments>manual, buildFHSUserEnv</a>:<blockquote><p><code>buildFHSUserEnv</code> provides a way to build and run FHS-compatible lightweight sandboxes. It creates an isolated root with bound <code>/nix/store</code>, so its footprint in terms of disk space needed is quite small. This allows one to run software which is hard or unfeasible to patch for NixOS -- 3rd-party source trees with FHS assumptions, games distributed as tarballs, software with integrity checking and/or external self-updated binaries. It uses Linux namespaces feature to create temporary lightweight environments which are destroyed after all child processes exit, without root user rights requirement.</blockquote><p>Thanks to <a href=https://github.com/bb010g>@bb010g</a> for pointing this out.<h2 id=notes>Notes</h2><div class=footnote-definition id=1><sup class=footnote-definition-label>1</sup><p>The keyboard mapping is a bit strange. 
On my Spanish keyboard, the keys were as follows:</div><table><thead><tr><th>Keyboard<th>Maps to<th>Shift<tbody><tr><td>'<td>-<td>_<tr><td>´<td>'<td>"<tr><td>`<td>[<td><tr><td>+<td>]<td><tr><td>¡<td>=<td><tr><td>-<td>/<td><tr><td>ñ<td>;<td></table></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> A practical example with Hadoop | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>A practical example with Hadoop</h1><div class=time><p>2020-03-30T01:00:00+00:00<p>last updated 2020-04-18T13:25:43+00:00</div><p>In our <a href=/blog/mdad/introduction-to-hadoop-and-its-mapreduce/>previous Hadoop post</a>, we learnt what it is, how it originated, and how it works, from a theoretical standpoint. Here we will instead focus on a more practical example with Hadoop.<p>This post will reproduce the example on Chapter 2 of the book <a href=http://www.hadoopbook.com/>Hadoop: The Definitive Guide, Fourth Edition</a> (<a href=http://grut-computing.com/HadoopBook.pdf>pdf,</a><a href=http://www.hadoopbook.com/code.html>code</a>), that is, finding the maximum global-wide temperature for a given year.<h2 id=installation>Installation</h2><p>Before running any piece of software, its executable code must first be downloaded into our computers so that we can run it. 
Head over to <a href=http://hadoop.apache.org/releases.html>Apache Hadoop’s releases</a> and download the <a href=https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz>latest binary version</a> at the time of writing (3.2.1).<p>We will be using the <a href=https://linuxmint.com/>Linux Mint</a> distribution because I love its simplicity, although the process shown here should work just fine on any similar Linux distribution such as <a href=https://ubuntu.com/>Ubuntu</a>.<p>Once the archive download is complete, extract it with any tool of your choice (graphical or using the terminal) and execute it. Make sure you have a version of Java installed, such as <a href=https://openjdk.java.net/>OpenJDK</a>.<p>Here are all the three steps in the command line:<pre><code>wget https://apache.brunneis.com/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> A practical example with Hadoop | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>A practical example with Hadoop</h1><div class=time><p>2020-03-30T01:00:00+00:00<p>last updated 2020-04-18T13:25:43+00:00</div><p>In our <a href=/blog/mdad/introduction-to-hadoop-and-its-mapreduce/>previous Hadoop post</a>, we learnt what it is, how it originated, and how it works, from a theoretical standpoint. 
Here we will instead focus on a more practical example with Hadoop.<p>This post will reproduce the example on Chapter 2 of the book <a href=http://www.hadoopbook.com/>Hadoop: The Definitive Guide, Fourth Edition</a> (<a href=http://grut-computing.com/HadoopBook.pdf>pdf,</a><a href=http://www.hadoopbook.com/code.html>code</a>), that is, finding the maximum global-wide temperature for a given year.<h2 id=installation>Installation</h2><p>Before running any piece of software, its executable code must first be downloaded into our computers so that we can run it. Head over to <a href=http://hadoop.apache.org/releases.html>Apache Hadoop’s releases</a> and download the <a href=https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz>latest binary version</a> at the time of writing (3.2.1).<p>We will be using the <a href=https://linuxmint.com/>Linux Mint</a> distribution because I love its simplicity, although the process shown here should work just fine on any similar Linux distribution such as <a href=https://ubuntu.com/>Ubuntu</a>.<p>Once the archive download is complete, extract it with any tool of your choice (graphical or using the terminal) and execute it. Make sure you have a version of Java installed, such as <a href=https://openjdk.java.net/>OpenJDK</a>.<p>Here are all the three steps in the command line:<pre><code>wget https://apache.brunneis.com/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz tar xf hadoop-3.2.1.tar.gz hadoop-3.2.1/bin/hadoop version </code></pre><p>We will be using the two example data files that they provide in <a href=https://github.com/tomwhite/hadoop-book/tree/master/input/ncdc/all>their GitHub repository</a>, although the full dataset is offered by the <a href=https://www.ncdc.noaa.gov/>National Climatic Data Center</a> (NCDC).<p>We will also unzip and concatenate both files into a single text file, to make it easier to work with. 
As a single command pipeline:<pre><code>curl https://raw.githubusercontent.com/tomwhite/hadoop-book/master/input/ncdc/all/190{1,2}.gz | gunzip > 190x
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Big Data | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Big Data</h1><div class=time><p>2020-02-25T01:00:30+00:00<p>last updated 2020-03-18T09:51:17+00:00</div><p>Big Data sounds like a buzzword you may be hearing everywhere, but it’s actually here to stay!<h2 id=what-is-big-data>What is Big Data?</h2><p>And why is it so important? We use this term to refer to the large amount of data available, rapidly growing every day, that cannot be processed in conventional ways. It’s not only about the amount, it’s also about the variety and rate of growth.<p>Thanks to technological advancements, there are new ways to process this insane amount of data, which would otherwise be too costly for processing in traditional database systems.<h2 id=where-does-data-come-from>Where does data come from?</h2><p>It can be pictures in your phone, industry transactions, messages in social networks, a sensor in the mountains. It can come from anywhere, which makes the data very varied.<p>Just to give some numbers, over 12TB of data is generated on Twitter <em>daily</em>. If you purchase a laptop today (as of March 2020), the disk will be roughly 1TB, maybe 2TB. Twitter would fill 6 of those drives every day!<p>What about Facebook? It is estimated they store around 100PB of photos and videos. That would be 50000 laptop disks. Not a small number. And let’s not talk about worldwide network traffic…<h2 id=what-data-can-be-exploited>What data can be exploited?</h2><p>So, we have a lot of data. Should we attempt and process everything? 
We can distinguish several categories.<ul><li><strong>Web and Social Media</strong>: Clickstream Data, Twitter Feeds, Facebook Postings, Web content… Stuff coming from social networks.<li><strong>Biometrics</strong>: Facial Recognition, Genetics… Any kind of personal recognition.<li><strong>Machine-to-Machine</strong>: Utility Smart Meter Readings, RFID Readings, Oil Rig Sensor Readings, GPS Signals… Any sensor shared with other machines.<li><strong>Human Generated</strong>: Call Center Voice Recordings, Email, Electronic Medical Records… Even the voice notes one sends over WhatsApp count.<li><strong>Big Transaction Data</strong>: Healthcare Claims, Telecommunications Call Detail Records, Utility Billing Records… Financial transactions.</ul><p>But asking what to process is asking the wrong question. Instead, one should think about «What problem am I trying to solve?».<h2 id=how-to-exploit-this-data>How to exploit this data?</h2><p>What are some of the ways to deal with this data? If the problem fits the Map-Reduce paradigm then Hadoop is a great option! Hadoop is inspired by Google File System (GFS), and achieves great parallelism across the nodes of a cluster, and has the following components:<ul><li><strong>Hadoop Distributed File System</strong>. Data is divided into smaller «blocks» and distributed across the cluster, which makes it possible to execute the mapping and reduction in smaller subsets, and makes it possible to scale horizontally.<li><strong>Hadoop MapReduce</strong>. First, a data set is «mapped» into a different set, and data becomes a list of tuples (key, value). The «reduce» step works on these tuples and combines them into a smaller subset.<li><strong>Hadoop Common</strong>. These are a set of libraries that ease working with Hadoop.</ul><h2 id=key-insights>Key insights</h2><p>Big Data is a field whose goal is to extract information from very large sets of data, and find ways to do so. 
To summarize its different dimensions, we can refer to what’s known as «the Four V’s of Big Data»:<ul><li><strong>Volume</strong>. Really large quantities.<li><strong>Velocity</strong>. Processing response time matters!<li><strong>Variety</strong>. Data comes from plenty of sources.<li><strong>Veracity.</strong> Can we trust all sources, though?</ul><p>Some sources talk about a fifth V for <strong>Value</strong>; because processing this data is costly, it is important we can get value out of it.<p>…And some other sources go as high as seven V’s, including <strong>Viability</strong> and <strong>Visualization</strong>. Computers can’t make decisions on their own (yet); a human has to. And they can only do so if they’re presented the data (and visualize it) in a meaningful way.<h2 id=infographics>Infographics</h2><p>Let’s see some pictures, we all love pictures:<p><img src=https://lonami.dev/blog/mdad/big-data/4-Vs-of-big-data.jpg><h2 id=common-patterns>Common patterns</h2><h2 id=references>References</h2><ul><li>¿Qué es Big Data? 
– <a href=https://www.ibm.com/developerworks/ssa/local/im/que-es-big-data/>https://www.ibm.com/developerworks/ssa/local/im/que-es-big-data/</a><li>The Four V’s of Big Data – <a href=https://www.ibmbigdatahub.com/infographic/four-vs-big-data>https://www.ibmbigdatahub.com/infographic/four-vs-big-data</a><li>Big data – <a href=https://en.wikipedia.org/wiki/Big_data>https://en.wikipedia.org/wiki/Big_data</a><li>Las 5 V’s del Big Data – <a href=https://www.quanticsolutions.es/big-data/las-5-vs-del-big-data>https://www.quanticsolutions.es/big-data/las-5-vs-del-big-data</a><li>Las 7 V del Big data: Características más importantes – <a href=https://www.iic.uam.es/innovacion/big-data-caracteristicas-mas-importantes-7-v/#viabilidad>https://www.iic.uam.es/innovacion/big-data-caracteristicas-mas-importantes-7-v/</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Big Data | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Big Data</h1><div class=time><p>2020-02-25T01:00:30+00:00<p>last updated 2020-03-18T09:51:17+00:00</div><p>Big Data sounds like a buzzword you may be hearing everywhere, but it’s actually here to stay!<h2 id=what-is-big-data>What is Big Data?</h2><p>And why is it so 
important? We use this term to refer to the large amount of data available, rapidly growing every day, that cannot be processed in conventional ways. It’s not only about the amount, it’s also about the variety and rate of growth.<p>Thanks to technological advancements, there are new ways to process this insane amount of data, which would otherwise be too costly for processing in traditional database systems.<h2 id=where-does-data-come-from>Where does data come from?</h2><p>It can be pictures in your phone, industry transactions, messages in social networks, a sensor in the mountains. It can come from anywhere, which makes the data very varied.<p>Just to give some numbers, over 12TB of data is generated on Twitter <em>daily</em>. If you purchase a laptop today (as of March 2020), the disk will be roughly 1TB, maybe 2TB. Twitter would fill 6 of those drives every day!<p>What about Facebook? It is estimated they store around 100PB of photos and videos. That would be 50000 laptop disks. Not a small number. And let’s not talk about worldwide network traffic…<h2 id=what-data-can-be-exploited>What data can be exploited?</h2><p>So, we have a lot of data. Should we attempt and process everything? 
We can distinguish several categories.<ul><li><strong>Web and Social Media</strong>: Clickstream Data, Twitter Feeds, Facebook Postings, Web content… Stuff coming from social networks.<li><strong>Biometrics</strong>: Facial Recognition, Genetics… Any kind of personal recognition.<li><strong>Machine-to-Machine</strong>: Utility Smart Meter Readings, RFID Readings, Oil Rig Sensor Readings, GPS Signals… Any sensor shared with other machines.<li><strong>Human Generated</strong>: Call Center Voice Recordings, Email, Electronic Medical Records… Even the voice notes one sends over WhatsApp count.<li><strong>Big Transaction Data</strong>: Healthcare Claims, Telecommunications Call Detail Records, Utility Billing Records… Financial transactions.</ul><p>But asking what to process is asking the wrong question. Instead, one should think about «What problem am I trying to solve?».<h2 id=how-to-exploit-this-data>How to exploit this data?</h2><p>What are some of the ways to deal with this data? If the problem fits the Map-Reduce paradigm then Hadoop is a great option! Hadoop is inspired by Google File System (GFS), and achieves great parallelism across the nodes of a cluster, and has the following components:<ul><li><strong>Hadoop Distributed File System</strong>. Data is divided into smaller «blocks» and distributed across the cluster, which makes it possible to execute the mapping and reduction in smaller subsets, and makes it possible to scale horizontally.<li><strong>Hadoop MapReduce</strong>. First, a data set is «mapped» into a different set, and data becomes a list of tuples (key, value). The «reduce» step works on these tuples and combines them into a smaller subset.<li><strong>Hadoop Common</strong>. These are a set of libraries that ease working with Hadoop.</ul><h2 id=key-insights>Key insights</h2><p>Big Data is a field whose goal is to extract information from very large sets of data, and find ways to do so. 
To summarize its different dimensions, we can refer to what’s known as «the Four V’s of Big Data»:<ul><li><strong>Volume</strong>. Really large quantities.<li><strong>Velocity</strong>. Processing response time matters!<li><strong>Variety</strong>. Data comes from plenty of sources.<li><strong>Veracity.</strong> Can we trust all sources, though?</ul><p>Some sources talk about a fifth V for <strong>Value</strong>; because processing this data is costly, it is important we can get value out of it.<p>…And some other sources go as high as seven V’s, including <strong>Viability</strong> and <strong>Visualization</strong>. Computers can’t make decisions on their own (yet); a human has to. And they can only do so if they’re presented the data (and visualize it) in a meaningful way.<h2 id=infographics>Infographics</h2><p>Let’s see some pictures, we all love pictures:<p><img src=https://lonami.dev/blog/mdad/big-data/4-Vs-of-big-data.jpg><h2 id=common-patterns>Common patterns</h2><h2 id=references>References</h2><ul><li>¿Qué es Big Data? 
– <a href=https://www.ibm.com/developerworks/ssa/local/im/que-es-big-data/>https://www.ibm.com/developerworks/ssa/local/im/que-es-big-data/</a><li>The Four V’s of Big Data – <a href=https://www.ibmbigdatahub.com/infographic/four-vs-big-data>https://www.ibmbigdatahub.com/infographic/four-vs-big-data</a><li>Big data – <a href=https://en.wikipedia.org/wiki/Big_data>https://en.wikipedia.org/wiki/Big_data</a><li>Las 5 V’s del Big Data – <a href=https://www.quanticsolutions.es/big-data/las-5-vs-del-big-data>https://www.quanticsolutions.es/big-data/las-5-vs-del-big-data</a><li>Las 7 V del Big data: Características más importantes – <a href=https://www.iic.uam.es/innovacion/big-data-caracteristicas-mas-importantes-7-v/#viabilidad>https://www.iic.uam.es/innovacion/big-data-caracteristicas-mas-importantes-7-v/</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Cassandra: Introducción | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Cassandra: Introducción</h1><div class=time><p>2020-03-05T00:00:33+00:00<p>last updated 2020-03-30T09:28:07+00:00</div><p><img src=https://lonami.dev/blog/mdad/cassandra-introduccion/1200px-Cassandra_logo.png><p>Este es el primer post en la serie sobre Cassandra, en el cuál introduciremos dicha bases de datos NoSQL y veremos sus características e instalación.<p>Otros posts en esta serie:<ul><li><a href=/blog/mdad/cassandra-introduccion/>Cassandra: Introducción</a> (este post)<li><a href=/blog/mdad/cassandra-operaciones-basicas-y-arquitectura/>Cassandra: Operaciones Básicas y Arquitectura</a></ul><p>Este post está hecho en colaboración con un compañero.<hr><h2 id=finalidad-de-la-tecnologia>Finalidad de la tecnología</h2><p>Apache Cassandra es una base de datos NoSQL distribuida y de código abierto (<a href=https://github.com/apache/cassandra>con un espejo en GitHub</a>). Su filosofía es de tipo «clave-valor», y puede manejar grandes volúmenes de datos<p>Entre sus objetivos, busca ser escalable horizontalmente (puede replicarse en varios centros manteniendo la latencia baja) y alta disponibilidad sin ceder en rendimiento.<h2 id=como-funciona>Cómo funciona</h2><p>Instancias de Cassandra se distribuyen en nodos iguales (es decir, no hay maestro-esclavo) que se comunican entre sí (P2P). 
De este modo, da buen soporte entre varios centros de datos, con redundancia y réplicas síncronas.<p><img src=https://lonami.dev/blog/mdad/cassandra-introduccion/multiple-data-centers-and-data-replication-in-cassandra.jpg><p>Con respecto al modelo de datos, Cassandra particiona las filas con el objetivo de re-organizarla a lo largo distintas tablas. Como clave primaria, se usa un primer componente conocido como «clave de la partición». Dentro de cada partición, las filas se agrupan según el resto de columnas de la clave. Cualquier otra columna se puede indexar independientemente de la clave primaria.<p>Las tablas se pueden crear, borrar, actualizar y consultar sin bloqueos. No hay soporte para JOIN o subconsultas, pero Cassandra prefiere de-normalizar los datos haciendo uso de características como coleciones.<p>Para realizar las operaciones sobre cassandra se usa CQL (Cassandra Query Language), que tiene una sintaxis muy similar a SQL.<h2 id=caracteristicas>Características</h2><p>Como ya hemos mencionado antes, la arquitectura de Cassandra es <strong>decentralizada</strong>. 
No tiene un único punto que pudiera fallar porque todos los nodos son iguales (sin maestros), y por lo tanto, cualquiera puede dar servicio a la petición.<p>Los datos se encuentran <strong>replicados</strong> entre los distintos nodos del clúster (lo que ofrece gran <strong>tolerancia a fallos</strong> sin necesidad de interrumpir la aplicación), y es trivial <strong>escalar</strong> añadiendo más nodos al sistema.<p>El nivel de <strong>consistencia</strong> para lecturas y escrituras es configurable.<p>Siendo de la familia Apache, Cassandra ofrece integración con Apache Hadoop para tener soporte MapReduce.<h2 id=arista-dentro-del-teorema-cap>Arista dentro del Teorema CAP</h2><p>Cassandra se encuentra dentro de la esquina «AP» junto con CouchDB y otros, porque garantiza tanto la disponibilidad como la tolerancia a fallos.<p>Sin embargo, puede configurarse como un sistema «CP» si se prefiere respetar la consistencia en todo momento.<p><img src=https://lonami.dev/blog/mdad/cassandra-introduccion/0.jpeg><h2 id=descarga>Descarga</h2><p>Se pueden seguir las instrucciones de la página oficial para <a href=https://cassandra.apache.org/download/>descargar Cassandra</a>. Para ello, se debe clicar en la <a href=https://www.apache.org/dyn/closer.lua/cassandra/3.11.6/apache-cassandra-3.11.6-bin.tar.gz>última versión para descargar el archivo</a>. En nuestro caso, esto es el enlace nombrado «3.11.6», versión que utilizamos.<h2 id=instalacion>Instalación</h2><p>Cassandra no ofrece binarios para Windows, por lo que usaremos Linux para instalarlo. 
En nuestro caso, tenemos un sistema Linux Mint (derivado de Ubuntu), pero una máquina virtual con cualquier Linux debería funcionar.<p>Debemos asegurarnos de tener Java y Python 2 instalado mediante el siguiente comando:<pre><code>apt install openjdk-8-jdk openjdk-8-jre python2.7 +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Cassandra: Introducción | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Cassandra: Introducción</h1><div class=time><p>2020-03-05T00:00:33+00:00<p>last updated 2020-03-30T09:28:07+00:00</div><p><img src=https://lonami.dev/blog/mdad/cassandra-introduccion/1200px-Cassandra_logo.png><p>Este es el primer post en la serie sobre Cassandra, en el cuál introduciremos dicha bases de datos NoSQL y veremos sus características e instalación.<p>Otros posts en esta serie:<ul><li><a href=/blog/mdad/cassandra-introduccion/>Cassandra: Introducción</a> (este post)<li><a href=/blog/mdad/cassandra-operaciones-basicas-y-arquitectura/>Cassandra: Operaciones Básicas y Arquitectura</a></ul><p>Este post está hecho en colaboración con un compañero.<hr><h2 id=finalidad-de-la-tecnologia>Finalidad de la tecnología</h2><p>Apache Cassandra es una base de datos NoSQL distribuida y de código abierto (<a href=https://github.com/apache/cassandra>con un espejo en GitHub</a>). 
Su filosofía es de tipo «clave-valor», y puede manejar grandes volúmenes de datos<p>Entre sus objetivos, busca ser escalable horizontalmente (puede replicarse en varios centros manteniendo la latencia baja) y alta disponibilidad sin ceder en rendimiento.<h2 id=como-funciona>Cómo funciona</h2><p>Instancias de Cassandra se distribuyen en nodos iguales (es decir, no hay maestro-esclavo) que se comunican entre sí (P2P). De este modo, da buen soporte entre varios centros de datos, con redundancia y réplicas síncronas.<p><img src=https://lonami.dev/blog/mdad/cassandra-introduccion/multiple-data-centers-and-data-replication-in-cassandra.jpg><p>Con respecto al modelo de datos, Cassandra particiona las filas con el objetivo de re-organizarla a lo largo distintas tablas. Como clave primaria, se usa un primer componente conocido como «clave de la partición». Dentro de cada partición, las filas se agrupan según el resto de columnas de la clave. Cualquier otra columna se puede indexar independientemente de la clave primaria.<p>Las tablas se pueden crear, borrar, actualizar y consultar sin bloqueos. No hay soporte para JOIN o subconsultas, pero Cassandra prefiere de-normalizar los datos haciendo uso de características como coleciones.<p>Para realizar las operaciones sobre cassandra se usa CQL (Cassandra Query Language), que tiene una sintaxis muy similar a SQL.<h2 id=caracteristicas>Características</h2><p>Como ya hemos mencionado antes, la arquitectura de Cassandra es <strong>decentralizada</strong>. 
No tiene un único punto que pudiera fallar porque todos los nodos son iguales (sin maestros), y por lo tanto, cualquiera puede dar servicio a la petición.<p>Los datos se encuentran <strong>replicados</strong> entre los distintos nodos del clúster (lo que ofrece gran <strong>tolerancia a fallos</strong> sin necesidad de interrumpir la aplicación), y es trivial <strong>escalar</strong> añadiendo más nodos al sistema.<p>El nivel de <strong>consistencia</strong> para lecturas y escrituras es configurable.<p>Siendo de la familia Apache, Cassandra ofrece integración con Apache Hadoop para tener soporte MapReduce.<h2 id=arista-dentro-del-teorema-cap>Arista dentro del Teorema CAP</h2><p>Cassandra se encuentra dentro de la esquina «AP» junto con CouchDB y otros, porque garantiza tanto la disponibilidad como la tolerancia a fallos.<p>Sin embargo, puede configurarse como un sistema «CP» si se prefiere respetar la consistencia en todo momento.<p><img src=https://lonami.dev/blog/mdad/cassandra-introduccion/0.jpeg><h2 id=descarga>Descarga</h2><p>Se pueden seguir las instrucciones de la página oficial para <a href=https://cassandra.apache.org/download/>descargar Cassandra</a>. Para ello, se debe clicar en la <a href=https://www.apache.org/dyn/closer.lua/cassandra/3.11.6/apache-cassandra-3.11.6-bin.tar.gz>última versión para descargar el archivo</a>. En nuestro caso, esto es el enlace nombrado «3.11.6», versión que utilizamos.<h2 id=instalacion>Instalación</h2><p>Cassandra no ofrece binarios para Windows, por lo que usaremos Linux para instalarlo. 
En nuestro caso, tenemos un sistema Linux Mint (derivado de Ubuntu), pero una máquina virtual con cualquier Linux debería funcionar.<p>Debemos asegurarnos de tener Java y Python 2 instalado mediante el siguiente comando:<pre><code>apt install openjdk-8-jdk openjdk-8-jre python2.7 </code></pre><p>Para verificar que la instalación ha sido correcta, podemos mostrar las versiones de los programas:<pre><code>$ java -version openjdk version "1.8.0_242" OpenJDK Runtime Environment (build 1.8.0_242-8u242-b08-0ubuntu3~18.04-b08)
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Cassandra: Operaciones Básicas y Arquitectura | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Cassandra: Operaciones Básicas y Arquitectura</h1><div class=time><p>2020-03-05T02:00:41+00:00<p>last updated 2020-03-20T11:36:18+00:00</div><p>Este es el segundo post en la serie sobre Cassandra, con una breve descripción de las operaciones básicas (tales como inserción, recuperación e indexado), y ejecución por completo junto con el modelo de datos y arquitectura.<p>Otros posts en esta serie:<ul><li><a href=/blog/mdad/cassandra-introduccion/>Cassandra: Introducción</a><li><a href=/blog/mdad/cassandra-operaciones-basicas-y-arquitectura/>Cassandra: Operaciones Básicas y Arquitectura</a> (este post)</ul><p>Este post está hecho en colaboración con un compañero.<hr><p>Antes de poder ejecutar ninguna consulta, debemos lanzar la base de datos en caso de que no se encuentre en ejecución aún. 
Para ello, en una terminal, lanzamos el binario de <code>cassandra</code>:<pre><code>$ cassandra-3.11.6/bin/cassandra +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Cassandra: Operaciones Básicas y Arquitectura | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Cassandra: Operaciones Básicas y Arquitectura</h1><div class=time><p>2020-03-05T02:00:41+00:00<p>last updated 2020-03-20T11:36:18+00:00</div><p>Este es el segundo post en la serie sobre Cassandra, con una breve descripción de las operaciones básicas (tales como inserción, recuperación e indexado), y ejecución por completo junto con el modelo de datos y arquitectura.<p>Otros posts en esta serie:<ul><li><a href=/blog/mdad/cassandra-introduccion/>Cassandra: Introducción</a><li><a href=/blog/mdad/cassandra-operaciones-basicas-y-arquitectura/>Cassandra: Operaciones Básicas y Arquitectura</a> (este post)</ul><p>Este post está hecho en colaboración con un compañero.<hr><p>Antes de poder ejecutar ninguna consulta, debemos lanzar la base de datos en caso de que no se encuentre en ejecución aún. Para ello, en una terminal, lanzamos el binario de <code>cassandra</code>:<pre><code>$ cassandra-3.11.6/bin/cassandra </code></pre><p>Sin cerrar esta consola, abrimos otra en la que podamos usar la <a href=https://cassandra.apache.org/doc/latest/tools/cqlsh.html>CQL shell</a>:<pre><code>$ cassandra-3.11.6/bin/cqlsh Connected to Test Cluster at 127.0.0.1:9042. 
[cqlsh 5.0.1 | Cassandra 3.11.6 | CQL spec 3.4.4 | Native protocol v4]
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Data Warehousing and OLAP | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Data Warehousing and OLAP</h1><div class=time><p>2020-03-23T01:00:00+00:00<p>last updated 2020-04-01T09:45:41+00:00</div><p>Business intelligence (BI) refers to systems used to gain insights from data, traditionally taken from relational databases and being used to build a data warehouse. Performance and scalability are key aspects of BI systems.<p>Commonly, the data in the warehouse is a transformation of the original, operational data into a form better suited for reporting and analysis.<p>This whole process is known as Online Analytical Processing (OLAP), and is different to the approach taken by relational databases, which is known as Online Transaction Processing (OLTP) and is optimized for individual transactions. OLAP is based on multidimensional databases simply by the way it works.<p>The Business Intelligence Semantic Model (BISM) refers to the different semantics in which data can be accessed and queried.<p>On the one hand, MDX is the language used for Microsoft’s BISM of multidimensional mode, and on the other, DAX is the language of tabular mode, based on Excel’s formula language and designed to be easy to use by those familiar with Excel.<h2 id=types-of-data>Types of data</h2><p>The business data is often called detail data or <em>fact</em> data, goes in a de-normalized table called the fact table. The term «facts» literally refers to the facts, such as number of products sold and amount received for products sold. 
Different tables will often represent different dimensions of the data, where «dimensions» simply means different ways to look at the data.<p>Data can also be referred to as measures, because most of it is numbers and subject to aggregations. By measures, we refer to these values and numbers.<p>Multidimensional databases are formed with separate fact and dimension tables, grouped to create a «cube» with both facts and dimensions.<h2 id=places-to-store-data>Places to store data</h2><p>Three different terms are often heard when talking about the places where data is stored: data lakes, data warehouses, and data marts. All of these have different target users, cost, size and growth.<p>The data lake contains <strong>all</strong> the data generated by your business. Nothing is filtered out, not even cancelled or invalid transactions. If there are future plans to use the data, or a need to analyze it in various ways, a data lake is often necessary.<p>The data warehouse contains <strong>structured</strong> data, or has already been modelled. It’s also multi-purpose, but often of a lot smaller scale. Operational users are able to easily evaluate reports or analyze performance here, since it is built for their needs.<p>The data mart contains a <strong>small portion</strong> of the data, and is often part of data warehouses themselves. It can be seen as a subsection built for specific departments, and as a benefit, users get isolated security and performance. The data here is clean, and subject-oriented.<h2 id=ways-to-store-data>Ways to store data</h2><p>Data is often stored de-normalized, because it would not be feasible to store otherwise.<p>There are two main techniques to implement data warehouses, known as Inmon approach and Kimball approach. 
They are named after Ralph Kimball <em>et al.</em> for their work on «The Data Warehouse Lifecycle Toolkit», and Bill Inmon <em>et al.</em> for their work on «Corporate Information Factory» respectively.<p>When several independent systems identify and store data in different ways, we face what’s known as the problem of the stovepipe. Something as simple as trying to connect these systems or use their data in a warehouse results in an overly complicated system.<p>To tackle this issue, Kimball advocates the use of «conformed dimensions», that is, some dimensions will be «of interest», and have the same attributes and rollups (or at least a subset) in different data marts. This way, warehouses contain dimensional databases to ease analysis in the data marts it is composed of, and users query the warehouse.<p>The Inmon approach on the other hand has the warehouse laid out in third normal form, and users query the data marts, not the warehouse (so the data marts are dimensional in nature).<h2 id=key-takeaways>Key takeaways</h2><ul><li>«BI» stands for «Business Intelligence» and refers to the system that <em>perform</em> data analysis.<li>«BISM» stands for «Business Intelligence Semantic Model», and Microsoft has two languages to query data: MDX and DAX.<li>«OLAP» stands for «Online Analytical Processing», and «OLTP» for «Online Transaction Processing».<li>Data mart, warehouse and lake refer to places at different scales and with different needs to store data.<li>Inmon and Kimbal are different ways to implement data warehouses.<li>Data facts contains various measures arranged into different dimensions, which together form a data cube.</ul><h2 id=references>References</h2><ul><li><a href=https://media.wiley.com/product_data/excerpt/03/11181011/1118101103-157.pdf>Chapter 1 – Professional Microsoft SQL Server 2012 Analysis Services with MDX and DAX (Harinath et al., 2012)</a><li><a href=https://youtu.be/m_DzhW-2pWI>YouTube – Data Mining in SQL Server Analysis 
Services</a><li>Almacenes de Datos y Procesamiento Analítico On-Line (Félix R.)<li><a href=https://youtu.be/qkJOace9FZg>YouTube – What are Dimensions and Measures?</a><li><a href=https://www.holistics.io/blog/data-lake-vs-data-warehouse-vs-data-mart/>Data Lake vs Data Warehouse vs Data Mart</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Data Warehousing and OLAP | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Data Warehousing and OLAP</h1><div class=time><p>2020-03-23T01:00:00+00:00<p>last updated 2020-04-01T09:45:41+00:00</div><p>Business intelligence (BI) refers to systems used to gain insights from data, traditionally taken from relational databases and being used to build a data warehouse. Performance and scalability are key aspects of BI systems.<p>Commonly, the data in the warehouse is a transformation of the original, operational data into a form better suited for reporting and analysis.<p>This whole process is known as Online Analytical Processing (OLAP), and is different to the approach taken by relational databases, which is known as Online Transaction Processing (OLTP) and is optimized for individual transactions. 
OLAP is based on multidimensional databases simply by the way it works.<p>The Business Intelligence Semantic Model (BISM) refers to the different semantics in which data can be accessed and queried.<p>On the one hand, MDX is the language used for Microsoft’s BISM of multidimensional mode, and on the other, DAX is the language of tabular mode, based on Excel’s formula language and designed to be easy to use by those familiar with Excel.<h2 id=types-of-data>Types of data</h2><p>The business data, often called detail data or <em>fact</em> data, goes in a de-normalized table called the fact table. The term «facts» literally refers to the facts, such as the number of products sold and the amount received for them. Different tables will often represent different dimensions of the data, where «dimensions» simply means different ways to look at the data.<p>Data can also be referred to as measures, because most of it is numbers and subject to aggregations. By measures, we refer to these values and numbers.<p>Multidimensional databases are formed with separate fact and dimension tables, grouped to create a «cube» with both facts and dimensions.<h2 id=places-to-store-data>Places to store data</h2><p>Three different terms are often heard when talking about the places where data is stored: data lakes, data warehouses, and data marts. All of these have different target users, cost, size and growth.<p>The data lake contains <strong>all</strong> the data generated by your business. Nothing is filtered out, not even cancelled or invalid transactions. If there are future plans to use the data, or a need to analyze it in various ways, a data lake is often necessary.<p>The data warehouse contains <strong>structured</strong> data, that is, data that has already been modelled. It’s also multi-purpose, but often of a much smaller scale. 
Operational users are able to easily evaluate reports or analyze performance here, since it is built for their needs.<p>The data mart contains a <strong>small portion</strong> of the data, and is often part of a data warehouse itself. It can be seen as a subsection built for specific departments, and as a benefit, users get isolated security and performance. The data here is clean, and subject-oriented.<h2 id=ways-to-store-data>Ways to store data</h2><p>Data is often stored de-normalized, because it would not be feasible to store it otherwise.<p>There are two main techniques to implement data warehouses, known as the Inmon approach and the Kimball approach. They are named after Bill Inmon <em>et al.</em> for their work on «Corporate Information Factory», and Ralph Kimball <em>et al.</em> for their work on «The Data Warehouse Lifecycle Toolkit» respectively.<p>When several independent systems identify and store data in different ways, we face what’s known as the problem of the stovepipe. Something as simple as trying to connect these systems or use their data in a warehouse results in an overly complicated system.<p>To tackle this issue, Kimball advocates the use of «conformed dimensions», that is, some dimensions will be «of interest», and have the same attributes and rollups (or at least a subset) in different data marts. 
This way, the warehouse contains dimensional databases to ease analysis in the data marts it is composed of, and users query the warehouse.<p>The Inmon approach on the other hand has the warehouse laid out in third normal form, and users query the data marts, not the warehouse (so the data marts are dimensional in nature).<h2 id=key-takeaways>Key takeaways</h2><ul><li>«BI» stands for «Business Intelligence» and refers to the systems that <em>perform</em> data analysis.<li>«BISM» stands for «Business Intelligence Semantic Model», and Microsoft has two languages to query data: MDX and DAX.<li>«OLAP» stands for «Online Analytical Processing», and «OLTP» for «Online Transaction Processing».<li>Data mart, warehouse and lake refer to places at different scales and with different needs to store data.<li>Inmon and Kimball are different ways to implement data warehouses.<li>Fact data contains various measures arranged into different dimensions, which together form a data cube.</ul><h2 id=references>References</h2><ul><li><a href=https://media.wiley.com/product_data/excerpt/03/11181011/1118101103-157.pdf>Chapter 1 – Professional Microsoft SQL Server 2012 Analysis Services with MDX and DAX (Harinath et al., 2012)</a><li><a href=https://youtu.be/m_DzhW-2pWI>YouTube – Data Mining in SQL Server Analysis Services</a><li>Almacenes de Datos y Procesamiento Analítico On-Line (Félix R.)<li><a href=https://youtu.be/qkJOace9FZg>YouTube – What are Dimensions and Measures?</a><li><a href=https://www.holistics.io/blog/data-lake-vs-data-warehouse-vs-data-mart/>Data Lake vs Data Warehouse vs Data Mart</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Developing a Python application for Cassandra | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Developing a Python application for Cassandra</h1><div class=time><p>2020-03-23T00:00:00+00:00<p>last updated 2020-04-16T07:52:26+00:00</div><p><em><strong>Warning</strong>: this post is, in fact, a shameless self-plug to my own library. If you continue reading, you accept that you are okay with this. Otherwise, please close the tab, shut down your computer, and set it on fire.__(Also, that was a joke. Please don’t do that.)</em><p>Let’s do some programming! Today we will be making a tiny CLI application in <a href=http://python.org/>Python</a> that queries <a href=https://core.telegram.org/api>Telegram’s API</a> and stores the data in <a href=http://cassandra.apache.org/>Cassandra</a>.<h2 id=our-goal>Our goal</h2><p>Our goal is to make a Python console application. This application will connect to <a href=https://telegram.org/>Telegram</a>, and ask for your account credentials. 
Once you have logged in, the application will fetch all of your open conversations and we will store these in Cassandra.<p>With the data saved in Cassandra, we can now very efficiently query information about your conversations given their identifier offline (no need to query Telegram anymore).<p><strong>In short</strong>, we are making an application that performs efficient offline queries to Cassandra to print out information about your Telegram conversations given the ID you want to query.<h2 id=data-model>Data model</h2><p>The application itself is really simple, and we only need one table to store all the relevant information we will be needing. This table called <code>users</code> will contain the following columns:<ul><li><code>id</code>, of type <code>int</code>. This will also be the <code>primary key</code> and we’ll use it to query the database later on.<li><code>first_name</code>, of type <code>varchar</code>. This field contains the first name of the stored user.<li><code>last_name</code>, of type <code>varchar</code>. This field contains the last name of the stored user.<li><code>username</code>, of type <code>varchar</code>. This field contains the username of the stored user.</ul><p>Because Cassandra uses a <a href=https://cassandra.apache.org/doc/latest/architecture/overview.html>wide column storage model</a>, direct access through a key is the most efficient way to query the database. In our case, the key is the primary key of the <code>users</code> table, using the <code>id</code> column. The index for the primary key is ready to be used as soon as we create the table, so we don’t need to create it on our own.<h2 id=dependencies>Dependencies</h2><p>Because we will program it in Python, you need Python installed. 
You can install it using a package manager of your choice or heading over to the <a href=https://www.python.org/downloads/>Python downloads section</a>, but if you’re on Linux, chances are you have it installed already.<p>Once Python 3.5 or above is installed, get a copy of the Cassandra driver for Python and Telethon through <code>pip</code>:<pre><code>pip install cassandra-driver telethon +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Developing a Python application for Cassandra | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Developing a Python application for Cassandra</h1><div class=time><p>2020-03-23T00:00:00+00:00<p>last updated 2020-04-16T07:52:26+00:00</div><p><em><strong>Warning</strong>: this post is, in fact, a shameless self-plug to my own library. If you continue reading, you accept that you are okay with this. Otherwise, please close the tab, shut down your computer, and set it on fire.__(Also, that was a joke. Please don’t do that.)</em><p>Let’s do some programming! Today we will be making a tiny CLI application in <a href=http://python.org/>Python</a> that queries <a href=https://core.telegram.org/api>Telegram’s API</a> and stores the data in <a href=http://cassandra.apache.org/>Cassandra</a>.<h2 id=our-goal>Our goal</h2><p>Our goal is to make a Python console application. This application will connect to <a href=https://telegram.org/>Telegram</a>, and ask for your account credentials. 
Once you have logged in, the application will fetch all of your open conversations and we will store these in Cassandra.<p>With the data saved in Cassandra, we can now very efficiently query information about your conversations given their identifier offline (no need to query Telegram anymore).<p><strong>In short</strong>, we are making an application that performs efficient offline queries to Cassandra to print out information about your Telegram conversations given the ID you want to query.<h2 id=data-model>Data model</h2><p>The application itself is really simple, and we only need one table to store all the relevant information we will be needing. This table called <code>users</code> will contain the following columns:<ul><li><code>id</code>, of type <code>int</code>. This will also be the <code>primary key</code> and we’ll use it to query the database later on.<li><code>first_name</code>, of type <code>varchar</code>. This field contains the first name of the stored user.<li><code>last_name</code>, of type <code>varchar</code>. This field contains the last name of the stored user.<li><code>username</code>, of type <code>varchar</code>. This field contains the username of the stored user.</ul><p>Because Cassandra uses a <a href=https://cassandra.apache.org/doc/latest/architecture/overview.html>wide column storage model</a>, direct access through a key is the most efficient way to query the database. In our case, the key is the primary key of the <code>users</code> table, using the <code>id</code> column. The index for the primary key is ready to be used as soon as we create the table, so we don’t need to create it on our own.<h2 id=dependencies>Dependencies</h2><p>Because we will program it in Python, you need Python installed. 
You can install it using a package manager of your choice or by heading over to the <a href=https://www.python.org/downloads/>Python downloads section</a>, but if you’re on Linux, chances are you have it installed already.<p>Once Python 3.5 or above is installed, get a copy of the Cassandra driver for Python and Telethon through <code>pip</code>:<pre><code>pip install cassandra-driver telethon
 </code></pre><p>For more details on that, see the <a href=https://docs.datastax.com/en/developer/python-driver/3.22/installation/>installation guide for <code>cassandra-driver</code></a>, or the <a href=https://docs.telethon.dev/en/latest/basic/installation.html>installation guide for <code>telethon</code></a>.<p>As we did in our <a href=/blog/mdad/cassandra-operaciones-basicas-y-arquitectura/>previous post</a>, we will set up a new keyspace for this application with <code>cqlsh</code>. We will also create a table to store the users in. This could all be automated in the Python code, but because it’s a one-time thing, we prefer to use <code>cqlsh</code>.<p>Make sure that Cassandra is running in the background. We can’t make queries to it if it’s not running.<pre><code>$ bin/cqlsh
 Connected to Test Cluster at 127.0.0.1:9042.
 [cqlsh 5.0.1 | Cassandra 3.11.6 | CQL spec 3.4.4 | Native protocol v4]
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Privado: Final NoSQL evaluation | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Privado: Final NoSQL evaluation</h1><div class=time><p>2020-05-13T00:00:00+00:00<p>last updated 2020-05-14T08:31:06+00:00</div><p>This evaluation is a bit different from my <a href=/blog/mdad/nosql-evaluation/>previous one</a> because this time I have been tasked with evaluating student <code>a(i - 2)</code>, and because I am <code>i = 11</code> that happens to be <code>a(9) =</code> a classmate.<h2 id=classmate-s-evaluation>Classmate’s Evaluation</h2><p><strong>Grading: A.</strong><p>The post I have evaluated is Trabajo en grupo – Bases de datos NoSQL, 3ª entrada: Aplicación con una Base de datos NoSQL seleccionada.<p>It starts with a very brief introduction covering who has written the post, what data they will be using, and what database they have chosen.<p>They properly describe their objective, how they will do it and what library will be used.<p>They also explain where they obtain the data from, and what other things the site can do, which is a nice bonus.<p>The post continues listing and briefly explaining all the tools used and what they are for, including commands to execute.<p>Finally, they list what files their project uses and what they do, and include a showcase of images which lets the reader know what the application does.<p>All in all, in my opinion, it’s clear they have put work into this entry and I have not noticed any major flaws, so they deserve the highest grade.</main><footer><div><p>Share your thoughts, or simply come hang with me <a 
href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Privado: Final NoSQL evaluation | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Privado: Final NoSQL evaluation</h1><div class=time><p>2020-05-13T00:00:00+00:00<p>last updated 2020-05-14T08:31:06+00:00</div><p>This evaluation is a bit different from my <a href=/blog/mdad/nosql-evaluation/>previous one</a> because this time I have been tasked with evaluating student <code>a(i - 2)</code>, and because I am <code>i = 11</code> that happens to be <code>a(9) =</code> a classmate.<h2 id=classmate-s-evaluation>Classmate’s Evaluation</h2><p><strong>Grading: A.</strong><p>The post I have evaluated is Trabajo en grupo – Bases de datos NoSQL, 3ª entrada: Aplicación con una Base de datos NoSQL seleccionada.<p>It starts with a very brief introduction covering who has written the post, what data they will be using, and what database they have chosen.<p>They properly describe their objective, how they will do it and what library will be used.<p>They also explain where they obtain the data from, and what other things the site can do, which is a nice bonus.<p>The post continues listing and briefly explaining all the tools used and what they are for, including commands to execute.<p>Finally, they list what files their project 
uses and what they do, and include a showcase of images which lets the reader know what the application does.<p>All in all, in my opinion, it’s clear they have put work into this entry and I have not noticed any major flaws, so they deserve the highest grade.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Data Mining and Data Warehousing</h1><p id=welcome onclick=pls_stop()>Welcome to my blog!<p>Here I occasionally post new entries, mostly tech related. Perhaps it's tips for a new game I'm playing, perhaps it has something to do with FFI, or perhaps I'm fighting the borrow checker (just kidding, I'm over that. Mostly).<hr><ul><li><a href=https://lonami.dev/blog/mdad/final-nosql-evaluation/>Privado: Final NoSQL evaluation</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/a-practical-example-with-hadoop/>A practical example with Hadoop</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/introduction-to-hadoop-and-its-mapreduce/>Introduction to Hadoop and its MapReduce</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/data-warehousing-and-olap/>Data Warehousing and OLAP</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/developing-a-python-application-for-cassandra/>Developing a Python application for Cassandra</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/mining-of-massive-datasets/>Mining of Massive Datasets</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/nosql-evaluation/>Privado: NoSQL evaluation</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/visualizing-caceres-opendata/>Visualizing Cáceres’ OpenData</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/mongodb-operaciones-basicas-y-arquitectura/>MongoDB: Operaciones Básicas y Arquitectura</a><span 
class=dim> </span><li><a href=https://lonami.dev/blog/mdad/cassandra-operaciones-basicas-y-arquitectura/>Cassandra: Operaciones Básicas y Arquitectura</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/mongodb-introduction/>MongoDB: Introducción</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/cassandra-introduccion/>Cassandra: Introducción</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/introduction-to-nosql/>Introduction to NoSQL</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/big-data/>Big Data</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/what-is-an-algorithm/>What is an algorithm?</a><span class=dim> </span></ul><script> +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Data Mining and Data Warehousing</h1><p id=welcome onclick=pls_stop()>Welcome to my blog!<p>Here I occasionally post new entries, mostly tech related. Perhaps it's tips for a new game I'm playing, perhaps it has something to do with FFI, or perhaps I'm fighting the borrow checker (just kidding, I'm over that. 
Mostly).<hr><ul><li><a href=https://lonami.dev/blog/mdad/final-nosql-evaluation/>Privado: Final NoSQL evaluation</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/a-practical-example-with-hadoop/>A practical example with Hadoop</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/introduction-to-hadoop-and-its-mapreduce/>Introduction to Hadoop and its MapReduce</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/data-warehousing-and-olap/>Data Warehousing and OLAP</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/developing-a-python-application-for-cassandra/>Developing a Python application for Cassandra</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/mining-of-massive-datasets/>Mining of Massive Datasets</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/nosql-evaluation/>Privado: NoSQL evaluation</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/visualizing-caceres-opendata/>Visualizing Cáceres’ OpenData</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/mongodb-operaciones-basicas-y-arquitectura/>MongoDB: Operaciones Básicas y Arquitectura</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/cassandra-operaciones-basicas-y-arquitectura/>Cassandra: Operaciones Básicas y Arquitectura</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/mongodb-introduction/>MongoDB: Introducción</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/cassandra-introduccion/>Cassandra: Introducción</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/introduction-to-nosql/>Introduction to NoSQL</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/big-data/>Big Data</a><span class=dim> </span><li><a href=https://lonami.dev/blog/mdad/what-is-an-algorithm/>What is an algorithm?</a><span class=dim> </span></ul><script> const WELCOME_EN = 'Welcome to my blog!' 
const WELCOME_ES = '¡Bienvenido a mi blog!'
 const APOLOGIES = "ok sorry i'll stop"
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Introduction to Hadoop and its MapReduce | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Introduction to Hadoop and its MapReduce</h1><div class=time><p>2020-03-30T00:00:00+00:00<p>last updated 2020-04-01T11:01:46+00:00</div><p>Hadoop is an open-source, free, Java-based programming framework that helps process large datasets in a distributed environment and deal with the problems that arise when trying to harness the knowledge from BigData. It is capable of running on thousands of nodes and dealing with petabytes of data. It is based on the Google File System (GFS) and originated from the work on the Nutch open-source project on search engines.<p>Hadoop also offers a distributed filesystem (HDFS) enabling fast transfer among nodes, and a way to program with MapReduce.<p>It strives for the 4 V’s: Volume, Variety, Veracity and Velocity. For veracity, it is a secure environment that can be trusted.<h2 id=milestones>Milestones</h2><p>The creators of Hadoop are Doug Cutting and Mike Cafarella, who just wanted to design a search engine, Nutch, and quickly ran into the problems of dealing with large amounts of data. 
They found their solution in the papers Google published.<p>The name comes from the plush toy of Cutting’s child, a yellow elephant.<ul><li>In July 2005, Nutch used GFS to perform MapReduce operations.<li>In February 2006, Nutch started a Lucene subproject which led to Hadoop.<li>In April 2007, Yahoo used Hadoop in a 1 000-node cluster.<li>In January 2008, Apache took over and made Hadoop a top-level project.<li>In July 2008, Apache tested a 4 000-node cluster. The performance was the fastest compared to other technologies that year.<li>In May 2009, Hadoop sorted a petabyte of data in 17 hours.<li>In December 2011, Hadoop reached 1.0.<li>In May 2012, Hadoop 2.0 was released with the addition of YARN (Yet Another Resource Negotiator) on top of HDFS, splitting MapReduce and other processes into separate components, greatly improving the fault tolerance.</ul><p>From here onwards, many other projects have emerged around the Hadoop ecosystem, like Spark, Hive, Drill, Kafka and HBase.<p>As of 2017, Amazon has clusters between 1 and 100 nodes, Yahoo has over 100 000 CPUs running Hadoop, AOL has clusters with 50 machines, and Facebook has a 320-machine cluster (2 560 cores) with 1.3 PB of raw storage.<h2 id=why-not-use-rdbms>Why not use RDBMS?</h2><p>Relational database management systems do not scale horizontally well, and vertical scaling requires very expensive servers. Similar to RDBMS, Hadoop has a notion of jobs (analogous to transactions), but without ACID or concurrency control. Hadoop supports any form of data (unstructured or semi-structured) in read-only mode, and although failures are common, there’s a simple yet efficient fault tolerance mechanism.<p>So what problems does Hadoop solve? It changes the way we think about problems and how to distribute them, which is key to doing anything related to BigData nowadays. We start working with clusters of nodes, and coordinating the jobs between them. 
Hadoop’s API makes this really easy.<p>Hadoop also takes the loss of data very seriously and guards against it with replication: if a node fails, the data is re-replicated on a different node.<h2 id=major-components>Major components</h2><p>The previously-mentioned HDFS runs on commodity machines, which are cost-friendly. It is very fault-tolerant and efficient enough to process huge amounts of data, because it splits large files into smaller chunks (or blocks) that can be more easily handled. Multiple nodes can work on multiple chunks at the same time.<p>The NameNode stores the metadata of the various data blocks (map of blocks) along with their location. It is the brain and the master in Hadoop’s master-slave architecture, also known as the namespace, and makes use of the DataNodes.<p>A secondary NameNode is a replica that can be used if the first NameNode dies, so that Hadoop doesn’t shut down and can restart.<p>DataNodes store the blocks of data, and are the slaves in the architecture. This data is split into one or more files. Their only job is to manage access to this data. They are often distributed among racks to avoid data loss.<p>The JobTracker creates and schedules jobs from the clients for either map or reduce operations.<p>The TaskTracker runs MapReduce tasks assigned to the current data node.<p>When clients need data, they first interact with the NameNode, which replies with the location of the data in the correct DataNode. The client then proceeds to interact with that DataNode.<h2 id=mapreduce>MapReduce</h2><p>MapReduce, as the name implies, is split into two steps: the map and the reduce. The map stage is the «divide and conquer» strategy, while the reduce part is about combining and reducing the results.<p>The mapper has to process the input data (normally a file or directory), commonly line-by-line, and produce one or more outputs. 
The reducer uses all the results from the mapper as its input to produce a new output file.<p><img src=https://lonami.dev/blog/mdad/introduction-to-hadoop-and-its-mapreduce/bitmap.png><p>When reading the data, some may be junk that we can choose to ignore. If it is valid data, however, we label it with a particular type that can be useful for the upcoming process. Hadoop is responsible for splitting the data across the many nodes available to execute this process in parallel.<p>There is another part to MapReduce, known as the Shuffle-and-Sort. In this part, types or categories from one node get moved to a different node. This happens with all nodes, so that every node can work on a complete category. These categories are known as «keys», and allow Hadoop to scale linearly.<h2 id=references>References</h2><ul><li><a href=https://youtu.be/oT7kczq5A-0>YouTube – Hadoop Tutorial For Beginners | What Is Hadoop? | Hadoop Tutorial | Hadoop Training | Simplilearn</a><li><a href=https://youtu.be/bcjSe0xCHbE>YouTube – Learn MapReduce with Playing Cards</a><li><a href=https://youtu.be/j8ehT1_G5AY?list=PLi4tp-TF_qjM_ed4lIzn03w7OnEh0D8Xi>YouTube – Video Post #2: Hadoop para torpes (I)-¿Qué es y para qué sirve?</a><li><a href=https://youtu.be/NQ8mjVPCDvk?list=PLi4tp-TF_qjM_ed4lIzn03w7OnEh0D8Xi>Video Post #3: Hadoop para torpes (II)-¿Cómo funciona? 
HDFS y MapReduce</a><li><a href=https://hadoop.apache.org/old/releases.html>Apache Hadoop Releases</a><li><a href=https://youtu.be/20qWx2KYqYg?list=PLi4tp-TF_qjM_ed4lIzn03w7OnEh0D8Xi>Video Post #4: Hadoop para torpes (III y fin)- Ecosistema y distribuciones</a><li><a href=http://www.hadoopbook.com/>Chapter 2 – Hadoop: The Definitive Guide, Fourth Edition</a> (<a href=http://grut-computing.com/HadoopBook.pdf>pdf</a>, <a href=http://www.hadoopbook.com/code.html>code</a>)</ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Introduction to Hadoop and its MapReduce | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Introduction to Hadoop and its MapReduce</h1><div class=time><p>2020-03-30T00:00:00+00:00<p>last updated 2020-04-01T11:01:46+00:00</div><p>Hadoop is an open-source, free, Java-based programming framework that helps process large datasets in a distributed environment and deal with the problems that arise when trying to harness the knowledge from BigData. It is capable of running on thousands of nodes and dealing with petabytes of data. 
It is based on the Google File System (GFS) and originated from the work on the Nutch open-source project on search engines.<p>Hadoop also offers a distributed filesystem (HDFS) enabling fast transfer among nodes, and a way to program with MapReduce.<p>It aims for the 4 V’s: Volume, Variety, Veracity and Velocity. For veracity, it is a secure environment that can be trusted.<h2 id=milestones>Milestones</h2><p>The creators of Hadoop are Doug Cutting and Mike Cafarella, who just wanted to design a search engine, Nutch, and quickly found the problems of dealing with large amounts of data. They found their solution with the papers Google published.<p>The name comes from the plush toy of Cutting’s child, a yellow elephant.<ul><li>In July 2005, Nutch used GFS to perform MapReduce operations.<li>In February 2006, Nutch started a Lucene subproject which led to Hadoop.<li>In April 2007, Yahoo used Hadoop in a 1 000-node cluster.<li>In January 2008, Apache took over and made Hadoop a top-level project.<li>In July 2008, Apache tested a 4 000-node cluster. The performance was the fastest compared to other technologies that year.<li>In May 2009, Hadoop sorted a petabyte of data in 17 hours.<li>In December 2011, Hadoop reached 1.0.<li>In May 2012, Hadoop 2.0 was released with the addition of YARN (Yet Another Resource Negotiator) on top of HDFS, splitting MapReduce and other processes into separate components, greatly improving the fault tolerance.</ul><p>From here onwards, many other alternatives have been born, like Spark, Hive & Drill, Kafka, HBase, built around the Hadoop ecosystem.<p>As of 2017, Amazon has clusters between 1 and 100 nodes, Yahoo has over 100 000 CPUs running Hadoop, AOL has clusters with 50 machines, and Facebook has a 320-machine cluster (2 560 cores) with 1.3PB of raw storage.<h2 id=why-not-use-rdbms>Why not use RDBMS?</h2><p>Relational database management systems simply cannot scale horizontally, and vertical scaling will require very expensive servers. 
Similar to RDBMS, Hadoop has a notion of jobs (analogous to transactions), but without ACID or concurrency control. Hadoop supports any form of data (unstructured or semi-structured) in read-only mode, and although failures are common, it has a simple yet efficient fault tolerance mechanism.<p>So what problems does Hadoop solve? It changes the way we think about problems and how to distribute them, which is key to doing anything related to BigData nowadays. We start working with clusters of nodes, and coordinating the jobs between them. Hadoop’s API makes this really easy.<p>Hadoop also takes data loss very seriously and guards against it with replication: if a node fails, its blocks are restored on a different node from their replicas.<h2 id=major-components>Major components</h2><p>The previously-mentioned HDFS runs on commodity machines, which are cost-friendly. It is very fault-tolerant and efficient enough to process huge amounts of data, because it splits large files into smaller chunks (or blocks) that can be more easily handled. Multiple nodes can work on multiple chunks at the same time.<p>NameNode stores the metadata of the various datablocks (map of blocks) along with their location. It is the brain and the master in Hadoop’s master-slave architecture, also known as the namespace, and makes use of the DataNode.<p>A secondary NameNode periodically checkpoints the NameNode’s metadata. It is not a hot standby, but its checkpoint helps Hadoop restart if the first NameNode dies.<p>DataNodes store the blocks of data, and are the slaves in the architecture. This data is split into one or more files. Their only job is to manage access to this data. They are often distributed among racks to avoid data loss.<p>JobTracker creates and schedules jobs from the clients for either map or reduce operations.<p>TaskTracker runs MapReduce tasks assigned to the current data node.<p>When clients need data, they first interact with the NameNode, which replies with the location of the data in the correct DataNode. 
The client then interacts with the DataNode directly.<h2 id=mapreduce>MapReduce</h2><p>MapReduce, as the name implies, is split into two steps: the map and the reduce. The map stage is the «divide and conquer» strategy, while the reduce part is about combining and reducing the results.<p>The mapper has to process the input data (normally a file or directory), commonly line-by-line, and produce one or more outputs. The reducer uses all the results from the mapper as its input to produce a new output file itself.<p><img src=https://lonami.dev/blog/mdad/introduction-to-hadoop-and-its-mapreduce/bitmap.png><p>When reading the data, some may be junk that we can choose to ignore. If it is valid data, however, we label it with a particular type that can be useful for the upcoming process. Hadoop is responsible for splitting the data across the many nodes available to execute this process in parallel.<p>There is another part to MapReduce, known as the Shuffle-and-Sort. In this part, types or categories from one node get moved to a different node. This happens with all nodes, so that every node can work on a complete category. These categories are known as «keys», and allow Hadoop to scale linearly.<h2 id=references>References</h2><ul><li><a href=https://youtu.be/oT7kczq5A-0>YouTube – Hadoop Tutorial For Beginners | What Is Hadoop? | Hadoop Tutorial | Hadoop Training | Simplilearn</a><li><a href=https://youtu.be/bcjSe0xCHbE>YouTube – Learn MapReduce with Playing Cards</a><li><a href=https://youtu.be/j8ehT1_G5AY?list=PLi4tp-TF_qjM_ed4lIzn03w7OnEh0D8Xi>YouTube – Video Post #2: Hadoop para torpes (I)-¿Qué es y para qué sirve?</a><li><a href=https://youtu.be/NQ8mjVPCDvk?list=PLi4tp-TF_qjM_ed4lIzn03w7OnEh0D8Xi>Video Post #3: Hadoop para torpes (II)-¿Cómo funciona? 
HDFS y MapReduce</a><li><a href=https://hadoop.apache.org/old/releases.html>Apache Hadoop Releases</a><li><a href=https://youtu.be/20qWx2KYqYg?list=PLi4tp-TF_qjM_ed4lIzn03w7OnEh0D8Xi>Video Post #4: Hadoop para torpes (III y fin)- Ecosistema y distribuciones</a><li><a href=http://www.hadoopbook.com/>Chapter 2 – Hadoop: The Definitive Guide, Fourth Edition</a> (<a href=http://grut-computing.com/HadoopBook.pdf>pdf,</a><a href=http://www.hadoopbook.com/code.html>code</a>)</ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Introduction to NoSQL | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Introduction to NoSQL</h1><div class=time><p>2020-02-25T02:00:30+00:00<p>last updated 2020-03-18T09:51:33+00:00</div><p>This post will primarly focus on the talk held in the <a href=https://youtu.be/qI_g07C_Q5I>GOTO 2012 conference: Introduction to NoSQL by Martin Fowler</a>. It can be seen as an informal, summarized transcript of the talk<hr><p>The relational database model is affected by the <em><a href=https://en.wikipedia.org/wiki/Object-relational_impedance_mismatch>impedance mismatch problem</a></em>. This occurs because we have to match our high-level design with the separate columns and rows used by relational databases.<p>Taking the in-memory objects and putting them into a relational database (which were dominant at the time) simply didn’t work out. Why? Relational databases were more than just databases, they served as a an integration mechanism across applications, up to the 2000s. For 20 years!<p>With the rise of the Internet and the sheer amount of traffic, databases needed to scale. Unfortunately, relational databases only scale well vertically (by upgrading a <em>single</em> node). This is <em>very</em> expensive, and not something many could afford.<p>The problem are those pesky <code>JOIN</code>‘s, and its friends <code>GROUP BY</code>. Because our program and reality model don’t match the tables used by SQL, we have to rely on them to query the data. 
It is because the model doesn’t map directly.<p>Furthermore, graphs don’t map very well at all to relational models.<p>We needed a way to scale horizontally (by increasing the <em>amount</em> of nodes), something relational databases were not designed to do.<blockquote><p><em>We need to do something different, relational across nodes is an unnatural act</em></blockquote><p>This inspired the NoSQL movement.<blockquote><p><em>#nosql was only meant to be a hashtag to advertise it, but unfortunately it’s how it is called now</em></blockquote><p>It is not possible to define NoSQL, but we can identify some of its characteristics:<ul><li>Non-relational<li><strong>Cluster-friendly</strong> (this was the original spark)<li>Open-source (until now, generally)<li>21st century web culture<li>Schema-less (easier integration or conjugation of several models, structure aggregation)</ul><p>These databases use different data models to those used by the relational model. However, it is possible to identify 4 broad chunks (some may say 3, or even 2!):<ul><li><strong>Key-value store</strong>. With a certain key, you obtain the value corresponding to it. It knows nothing else, nor does it care. We say the data is opaque.<li><strong>Document-based</strong>. It stores an entire mass of documents with complex structure, normally through the use of JSON (XML has been left behind). Then, you can ask for certain fields, structures, or portions. We say the data is transparent.<li><strong>Column-family</strong>. There is a «row key», and within it we store multiple «column families» (columns that fit together, our aggregate). We access by row-key and column-family name.</ul><p>All of these kind of serve to store documents without any <em>explicit</em> schema. Just shove in anything! This gives a lot of flexibility and ease of migration, except… that’s not really true. 
There’s an <em>implicit</em> schema when querying.<p>For example, a query where we may do <code>anOrder['price'] * anOrder['quantity']</code> is assuming that <code>anOrder</code> has both a <code>price</code> and a <code>quantity</code>, and that both of these can be multiplied together. «Schema-less» is a fuzzy term.<p>However, it is the lack of a <em>fixed</em> schema that gives flexibility.<p>One could argue that the line between key-value and document-based is very fuzzy, and they would be right! Key-value databases often let you include additional metadata that behaves like an index, and in document-based, documents often have an identifier anyway.<p>The common notion between these three types is what matters. They save an entire structure as an <em>unit</em>. We can refer to these as «Aggregate Oriented Databases». Aggregate, because we group things when designing or modeling our systems, as opposed to relational databases that scatter the information across many tables.<p>There exists a notable outlier, though, and that’s:<ul><li><strong>Graph</strong> databases. They use a node-and-arc graph structure. They are great for moving on relationships across things. Ironically, relational databases are not very good at jumping across relationships! It is possibly to perform very interesting queries in graph databases which would be really hard and costly on relational models. Unlike the aggregated databases, graphs break things into even smaller units. NoSQL is not <em>the</em> solution. It depends on how you’ll work with your data. Do you need an aggregate database? Will you have a lot of relationships? 
Or would the relational model be good fit for you?</ul><p>NoSQL, however, is a good fit for large-scale projects (data will <em>always</em> grow) and faster development (the impedance mismatch is drastically reduced).<p>Regardless of our choice, it is important to remember that NoSQL is a young technology, which is still evolving really fast (SQL has been stable for <em>decades</em>). But the <em>polyglot persistence</em> is what matters. One must know the alternatives, and be able to choose.<hr><p>Relational databases have the well-known ACID properties: Atomicity, Consistency, Isolation and Durability.<p>NoSQL (except graph-based!) are about being BASE instead: Basically Available, Soft state, Eventual consistency.<p>SQL needs transactions because we don’t want to perform a read while we’re only half-way done with a write! The readers and writers are the problem, and ensuring consistency results in a performance hit, even if the risk is low (two writers are extremely rare but it still must be handled).<p>NoSQL on the other hand doesn’t need ACID because the aggregate <em>is</em> the transaction boundary. Even before NoSQL itself existed! Any update is atomic by nature. When updating many documents it <em>is</em> a problem, but this is very rare.<p>We have to distinguish between logical and replication consistency. During an update and if a conflict occurs, it must be resolved to preserve the logical consistency. Replication consistency on the other hand is preserveed when distributing the data across many machines, for example during sharding or copies.<p>Replication buys us more processing power and resillence (at the cost of more storage) in case some of the nodes die. But what happens if what dies is the communication across the nodes? 
We could drop the requests and preserve the consistency, or accept the risk to continue and instead preserve the availability.<p>The choice on whether trading consistency for availability is acceptable or not depends on the domain rules. It is the domain’s choice, the business people will choose. If you’re Amazon, you always want to be able to sell, but if you’re a bank, you probably don’t want your clients to have negative numbers in their account!<p>Regardless of what we do, in a distributed system, the CAP theorem always applies: Consistecy, Availability, Partitioning-tolerancy (error tolerancy). It is <strong>impossible</strong> to guarantee all 3 at 100%. Most of the times, it does work, but it is mathematically impossible to guarantee at 100%.<p>A database has to choose what to give up at some point. When designing a distributed system, this must be considered. Normally, the choice is made between consistency or response time.<h2 id=further-reading>Further reading</h2><ul><li><a href=https://www.martinfowler.com/articles/nosql-intro-original.pdf>The future is: <del>NoSQL Databases</del> Polyglot Persistence</a><li><a href=https://www.thoughtworks.com/insights/blog/nosql-databases-overview>NoSQL Databases: An Overview</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Introduction to NoSQL | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div 
class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Introduction to NoSQL</h1><div class=time><p>2020-02-25T02:00:30+00:00<p>last updated 2020-03-18T09:51:33+00:00</div><p>This post will primarily focus on the talk held in the <a href=https://youtu.be/qI_g07C_Q5I>GOTO 2012 conference: Introduction to NoSQL by Martin Fowler</a>. It can be seen as an informal, summarized transcript of the talk.<hr><p>The relational database model is affected by the <em><a href=https://en.wikipedia.org/wiki/Object-relational_impedance_mismatch>impedance mismatch problem</a></em>. This occurs because we have to match our high-level design with the separate columns and rows used by relational databases.<p>Taking the in-memory objects and putting them into a relational database (which were dominant at the time) simply didn’t work out. Why? Relational databases were more than just databases, they served as an integration mechanism across applications, up to the 2000s. For 20 years!<p>With the rise of the Internet and the sheer amount of traffic, databases needed to scale. Unfortunately, relational databases only scale well vertically (by upgrading a <em>single</em> node). This is <em>very</em> expensive, and not something many could afford.<p>The problem is those pesky <code>JOIN</code>s, and their friend <code>GROUP BY</code>. Because our program and reality model don’t match the tables used by SQL, we have to rely on them to query the data. 
It is because the model doesn’t map directly.<p>Furthermore, graphs don’t map very well at all to relational models.<p>We needed a way to scale horizontally (by increasing the <em>amount</em> of nodes), something relational databases were not designed to do.<blockquote><p><em>We need to do something different, relational across nodes is an unnatural act</em></blockquote><p>This inspired the NoSQL movement.<blockquote><p><em>#nosql was only meant to be a hashtag to advertise it, but unfortunately it’s how it is called now</em></blockquote><p>It is not possible to define NoSQL, but we can identify some of its characteristics:<ul><li>Non-relational<li><strong>Cluster-friendly</strong> (this was the original spark)<li>Open-source (until now, generally)<li>21st century web culture<li>Schema-less (easier integration or conjugation of several models, structure aggregation)</ul><p>These databases use different data models to those used by the relational model. However, it is possible to identify 4 broad chunks (some may say 3, or even 2!):<ul><li><strong>Key-value store</strong>. With a certain key, you obtain the value corresponding to it. It knows nothing else, nor does it care. We say the data is opaque.<li><strong>Document-based</strong>. It stores an entire mass of documents with complex structure, normally through the use of JSON (XML has been left behind). Then, you can ask for certain fields, structures, or portions. We say the data is transparent.<li><strong>Column-family</strong>. There is a «row key», and within it we store multiple «column families» (columns that fit together, our aggregate). We access by row-key and column-family name.</ul><p>All of these kind of serve to store documents without any <em>explicit</em> schema. Just shove in anything! This gives a lot of flexibility and ease of migration, except… that’s not really true. 
There’s an <em>implicit</em> schema when querying.<p>For example, a query where we may do <code>anOrder['price'] * anOrder['quantity']</code> is assuming that <code>anOrder</code> has both a <code>price</code> and a <code>quantity</code>, and that both of these can be multiplied together. «Schema-less» is a fuzzy term.<p>However, it is the lack of a <em>fixed</em> schema that gives flexibility.<p>One could argue that the line between key-value and document-based is very fuzzy, and they would be right! Key-value databases often let you include additional metadata that behaves like an index, and in document-based, documents often have an identifier anyway.<p>The common notion between these three types is what matters. They save an entire structure as a <em>unit</em>. We can refer to these as «Aggregate Oriented Databases». Aggregate, because we group things when designing or modeling our systems, as opposed to relational databases that scatter the information across many tables.<p>There exists a notable outlier, though, and that’s:<ul><li><strong>Graph</strong> databases. They use a node-and-arc graph structure. They are great for moving along relationships between things. Ironically, relational databases are not very good at jumping across relationships! It is possible to perform very interesting queries in graph databases which would be really hard and costly on relational models. Unlike the aggregated databases, graphs break things into even smaller units. NoSQL is not <em>the</em> solution. It depends on how you’ll work with your data. Do you need an aggregate database? Will you have a lot of relationships? 
Or would the relational model be a good fit for you?</ul><p>NoSQL, however, is a good fit for large-scale projects (data will <em>always</em> grow) and faster development (the impedance mismatch is drastically reduced).<p>Regardless of our choice, it is important to remember that NoSQL is a young technology, which is still evolving really fast (SQL has been stable for <em>decades</em>). But the <em>polyglot persistence</em> is what matters. One must know the alternatives, and be able to choose.<hr><p>Relational databases have the well-known ACID properties: Atomicity, Consistency, Isolation and Durability.<p>NoSQL databases (except graph-based!) are about being BASE instead: Basically Available, Soft state, Eventual consistency.<p>SQL needs transactions because we don’t want to perform a read while we’re only half-way done with a write! The readers and writers are the problem, and ensuring consistency results in a performance hit, even if the risk is low (two writers are extremely rare but it still must be handled).<p>NoSQL on the other hand doesn’t need ACID because the aggregate <em>is</em> the transaction boundary. Even before NoSQL itself existed! Any update is atomic by nature. When updating many documents it <em>is</em> a problem, but this is very rare.<p>We have to distinguish between logical and replication consistency. During an update, if a conflict occurs, it must be resolved to preserve the logical consistency. Replication consistency on the other hand is preserved when distributing the data across many machines, for example during sharding or copies.<p>Replication buys us more processing power and resilience (at the cost of more storage) in case some of the nodes die. But what happens if what dies is the communication across the nodes? 
We could drop the requests and preserve the consistency, or accept the risk to continue and instead preserve the availability.<p>The choice on whether trading consistency for availability is acceptable or not depends on the domain rules. It is the domain’s choice, the business people will choose. If you’re Amazon, you always want to be able to sell, but if you’re a bank, you probably don’t want your clients to have negative numbers in their account!<p>Regardless of what we do, in a distributed system, the CAP theorem always applies: Consistency, Availability, Partition tolerance (tolerance to communication failures between nodes). It is <strong>impossible</strong> to guarantee all 3 at 100%. Most of the time, it does work, but it is mathematically impossible to guarantee at 100%.<p>A database has to choose what to give up at some point. When designing a distributed system, this must be considered. Normally, the choice is made between consistency and response time.<h2 id=further-reading>Further reading</h2><ul><li><a href=https://www.martinfowler.com/articles/nosql-intro-original.pdf>The future is: <del>NoSQL Databases</del> Polyglot Persistence</a><li><a href=https://www.thoughtworks.com/insights/blog/nosql-databases-overview>NoSQL Databases: An Overview</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Mining of Massive Datasets | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Mining of Massive Datasets</h1><div class=time><p>2020-03-16T01:00:00+00:00<p>last updated 2020-03-28T19:09:44+00:00</div><p>In this post we will talk about the Chapter 1 of the book Mining of Massive Datasets Leskovec, J. et al., available online, and I will summarize and share my thoughts on it.<p>Data mining often refers to the discovery of models for data, where the model can be for statistics, machine learning, summarizing, extracting features, or other computational approaches to perform complex queries on the data.<p>Commonly, problems related to data mining involve discovering unusual events hidden in massive data sets. There is another problem when trying to achieve Total Information Awareness (TIA), though, a project that was proposed by the Bush administration but shut down. The problem is, if you look at so much data, and try to find activities that look like (for example) terrorist behavior, inevitably one will find other illicit activities that are not terrorism with bad consequences. So it is important to narrow the activities we are looking for, in this case.<p>When looking at data, even completely random data, for a certain event type, the event will likely occur. With more data, it will occur more times. However, these are bogus results. 
The Bonferroni correction gives a statistically sound way to avoid most of these bogus results, however, the Bonferroni’s Principle can be used as an informal version to achieve the same thing.<p>For that, we calculate the expected number of occurrences of the events you are looking for on the assumption that data is random. If this number is way larger than the number of real instances one hoped to find, then nearly everything will be Bogus.<hr><p>When analysing documents, some words will be more important than others, and can help determine the topic of the document. One could think the most repeated words are the most important, but that’s far from the truth. The most common words are the stop-words, which carry no meaning, reason why we should remove them prior to processing. We are mostly looking for rare nouns.<p>There are of course formal measures for how concentrated into relatively few documents are the occurrences of a given word, known as TF.IDF (Term Frequency times In-verse Document Frequency). We won’t go into details on how to compute it, because there are multiple ways.<p>Hash functions are also frequently used, because they can turn hash keys into a bucket number (the index of the bucket where this hash key belongs). 
They «randomize» and spread the universe of keys into a smaller number of buckets, useful for storage and access.<p>An index is an efficient structure to query for values given a key, and can be built with hash functions and buckets.<p>Having all of these is important when analysing documents when doing data mining, because otherwise it would take far too long.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Mining of Massive Datasets | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Mining of Massive Datasets</h1><div class=time><p>2020-03-16T01:00:00+00:00<p>last updated 2020-03-28T19:09:44+00:00</div><p>In this post we will talk about the Chapter 1 of the book Mining of Massive Datasets Leskovec, J. et al., available online, and I will summarize and share my thoughts on it.<p>Data mining often refers to the discovery of models for data, where the model can be for statistics, machine learning, summarizing, extracting features, or other computational approaches to perform complex queries on the data.<p>Commonly, problems related to data mining involve discovering unusual events hidden in massive data sets. 
There is another problem when trying to achieve Total Information Awareness (TIA), though, a project that was proposed by the Bush administration but shut down. The problem is, if you look at so much data, and try to find activities that look like (for example) terrorist behavior, inevitably one will find other illicit activities that are not terrorism, with bad consequences. So it is important to narrow the activities we are looking for, in this case.<p>When looking at data, even completely random data, for a certain event type, the event will likely occur. With more data, it will occur more times. However, these are bogus results. The Bonferroni correction gives a statistically sound way to avoid most of these bogus results, and Bonferroni’s Principle can be used as an informal version to achieve the same thing.<p>For that, we calculate the expected number of occurrences of the events we are looking for on the assumption that data is random. If this number is much larger than the number of real instances one hoped to find, then nearly everything will be bogus.<hr><p>When analysing documents, some words will be more important than others, and can help determine the topic of the document. One could think the most repeated words are the most important, but that’s far from the truth. The most common words are the stop-words, which carry no meaning, which is why we should remove them prior to processing. We are mostly looking for rare nouns.<p>There are of course formal measures for how concentrated into relatively few documents are the occurrences of a given word, known as TF.IDF (Term Frequency times Inverse Document Frequency). We won’t go into details on how to compute it, because there are multiple ways.<p>Hash functions are also frequently used, because they can turn hash keys into a bucket number (the index of the bucket where this hash key belongs). 
They «randomize» and spread the universe of keys into a smaller number of buckets, useful for storage and access.<p>An index is an efficient structure to query for values given a key, and can be built with hash functions and buckets.<p>Having all of these is important when analysing documents when doing data mining, because otherwise it would take far too long.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> MongoDB: Introduction | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>MongoDB: Introduction</h1><div class=time><p>2020-03-05T01:00:18+00:00<p>last updated 2020-03-20T10:31:10+00:00</div><p>This is the first post in the series about Mongo, in which we introduce this NoSQL database and look at its features and installation.<p>Other posts in this series:<ul><li><a href=/blog/mdad/mongodb-introduction/>MongoDB: Introduction</a> (this post)<li><a href=/blog/mdad/mongodb-operaciones-basicas-y-arquitectura/>MongoDB: Basic Operations and Architecture</a></ul><p>This post was written in collaboration with a classmate.<hr><p><img src=https://lonami.dev/blog/mdad/mongodb-introduction/0LRP4__jIIkJ-0gl8j2RDzWscL1Rto-NwvdqzmYk0jmYBIVbJ78n1ZLByPgV.png><h2 id=definicion>Definition</h2><p>MongoDB is a document-oriented database. This means that instead of storing data in records, it stores data in documents. These documents are stored as BSON, which is a binary representation of JSON. 
One of the main differences from relational databases is that it does not need to follow any schema: documents in the same collection can have different schemas.<p>MongoDB is written in C++, although queries are made by passing JSON objects as parameters.<pre><code>{ +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> MongoDB: Introduction | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>MongoDB: Introduction</h1><div class=time><p>2020-03-05T01:00:18+00:00<p>last updated 2020-03-20T10:31:10+00:00</div><p>This is the first post in the series about Mongo, in which we introduce this NoSQL database and look at its features and installation.<p>Other posts in this series:<ul><li><a href=/blog/mdad/mongodb-introduction/>MongoDB: Introduction</a> (this post)<li><a href=/blog/mdad/mongodb-operaciones-basicas-y-arquitectura/>MongoDB: Basic Operations and Architecture</a></ul><p>This post was written in collaboration with a classmate.<hr><p><img src=https://lonami.dev/blog/mdad/mongodb-introduction/0LRP4__jIIkJ-0gl8j2RDzWscL1Rto-NwvdqzmYk0jmYBIVbJ78n1ZLByPgV.png><h2 id=definicion>Definition</h2><p>MongoDB is a document-oriented database. This means that instead of storing data in records, it stores data in documents. These documents are stored as BSON, which is a binary representation of JSON. 
One of the main differences from relational databases is that it does not need to follow any schema: documents in the same collection can have different schemas.<p>MongoDB is written in C++, although queries are made by passing JSON objects as parameters.<pre><code>{ "_id" : ObjectId("52f602d787945c344bb4bda5"), "name" : "Tyrion", "hobbies" : [
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> MongoDB: Basic Operations and Architecture | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>MongoDB: Basic Operations and Architecture</h1><div class=time><p>2020-03-05T03:00:53+00:00<p>last updated 2020-03-20T11:42:15+00:00</div><p>This is the second post in the series about MongoDB, with a brief description of the basic operations (such as insertion, retrieval and indexing) and a complete run-through, along with the data model and architecture.<p>Other posts in this series:<ul><li><a href=/blog/mdad/mongodb-introduction/>MongoDB: Introduction</a><li><a href=/blog/mdad/mongodb-operaciones-basicas-y-arquitectura/>MongoDB: Basic Operations and Architecture</a> (this post)</ul><p>This post was written in collaboration with a classmate, and in it we will see some examples of the basic operations (<a href=https://stackify.com/what-are-crud-operations/>CRUD</a>) on MongoDB.<hr><p>We will start by seeing how to create a new database in MongoDB and a new collection into which we can insert our documents.<h2 id=creacion-de-una-base-de-datos-e-insercion-de-un-primer-documento>Creating a database and inserting a first document</h2><p>We can see which databases are available by running the command:<pre><code>> show databases +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> MongoDB: Basic Operations and Architecture | Lonami's Blog </title><link 
rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>MongoDB: Basic Operations and Architecture</h1><div class=time><p>2020-03-05T03:00:53+00:00<p>last updated 2020-03-20T11:42:15+00:00</div><p>This is the second post in the series about MongoDB, with a brief description of the basic operations (such as insertion, retrieval and indexing) and a complete run-through, along with the data model and architecture.<p>Other posts in this series:<ul><li><a href=/blog/mdad/mongodb-introduction/>MongoDB: Introduction</a><li><a href=/blog/mdad/mongodb-operaciones-basicas-y-arquitectura/>MongoDB: Basic Operations and Architecture</a> (this post)</ul><p>This post was written in collaboration with a classmate, and in it we will see some examples of the basic operations (<a href=https://stackify.com/what-are-crud-operations/>CRUD</a>) on MongoDB.<hr><p>We will start by seeing how to create a new database in MongoDB and a new collection into which we can insert our documents.<h2 id=creacion-de-una-base-de-datos-e-insercion-de-un-primer-documento>Creating a database and inserting a first document</h2><p>We can see which databases are available by running the command:<pre><code>> show databases admin 0.000GB config 0.000GB local 0.000GB
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Privado: NoSQL evaluation | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Privado: NoSQL evaluation</h1><div class=time><p>2020-03-16T00:00:35+00:00<p>last updated 2020-03-28T19:22:31+00:00</div><p>This evaluation is based on the criteria for the first delivery described by Trabajos en grupo sobre Bases de Datos NoSQL.<p>I have chosen to evaluate the following people and works:<ul><li>a12: Classmate (username) with Druid.<li>a21: Classmate (username) with Neo4J.</ul><h2 id=classmate-s-evaluation>Classmate’s Evaluation</h2><p><strong>Grading: A.</strong><p>The post evaluated is Bases de datos NoSQL – Apache Druid – Primera entrega.<p>It is a very well-written, complete post, with each section meeting one of the points in the required criteria. The only thing that bothered me a little is the abuse of strong emphasis in the text, which I found quite distracting. However, the content deserves the highest grade.<h2 id=classmate-s-evaluation-1>Classmate’s Evaluation</h2><p><strong>Grading: A.</strong><p>The post evaluated is Bases de datos NoSQL – Neo4j – Primera entrega.<p>Well-written post, although a bit shorter than Classmate’s, but that’s not really an issue. It still covers everything it should and includes photos to go along with the text, which helps. 
There is nothing noticeably wrong in it, so it gets the highest grade as well.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Privado: NoSQL evaluation | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Privado: NoSQL evaluation</h1><div class=time><p>2020-03-16T00:00:35+00:00<p>last updated 2020-03-28T19:22:31+00:00</div><p>This evaluation is based on the criteria for the first delivery described by Trabajos en grupo sobre Bases de Datos NoSQL.<p>I have chosen to evaluate the following people and works:<ul><li>a12: Classmate (username) with Druid.<li>a21: Classmate (username) with Neo4J.</ul><h2 id=classmate-s-evaluation>Classmate’s Evaluation</h2><p><strong>Grading: A.</strong><p>The post evaluated is Bases de datos NoSQL – Apache Druid – Primera entrega.<p>It is a very well-written, complete post, with each section meeting one of the points in the required criteria. The only thing that bothered me a little is the abuse of strong emphasis in the text, which I found quite distracting. 
However, the content deserves the highest grade.<h2 id=classmate-s-evaluation-1>Classmate’s Evaluation</h2><p><strong>Grading: A.</strong><p>The post evaluated is Bases de datos NoSQL – Neo4j – Primera entrega.<p>Well-written post, although a bit shorter than Classmate’s, but that’s not really an issue. It still covers everything it should and includes photos to go along with the text, which helps. There is nothing noticeably wrong in it, so it gets the highest grade as well.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Visualizing Cáceres’ OpenData | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Visualizing Cáceres’ OpenData</h1><div class=time><p>2020-03-09T00:00:08+00:00<p>last updated 2020-03-19T14:38:41+00:00</div><p>The city of Cáceres has online services to provide <a href=http://opendata.caceres.es/>Open Data</a> over a wide range of <a href=http://opendata.caceres.es/dataset>categories</a>, all of which are very interesting to explore!<p>We have chosen two different datasets, and will explore four different ways to visualize the data.<p>This post is co-authored with Classmate.<h2 id=obtain-the-data>Obtain the data</h2><p>We are interested in the JSON format for the <a href=http://opendata.caceres.es/dataset/informacion-del-padron-de-caceres-2017>census in 2017</a> and those for the <a href=http://opendata.caceres.es/dataset/vias-urbanas-caceres>vias of the city</a>. This way, we can explore the population and their location in interesting ways! You may follow those two links and select the JSON format under Resources to download it.<p>Why JSON? 
We will be using <a href=https://python.org/>Python</a> (3.7 or above) and <a href=https://matplotlib.org/>matplotlib</a> for quick iteration, and loading the data with <a href=https://docs.python.org/3/library/json.html>Python’s <code>json</code> module</a> will be trivial.<h2 id=implementation>Implementation</h2><h3 id=imports-and-constants>Imports and constants</h3><p>We are going to need a lot of things in this code, such as <code>json</code> to load the data, <code>matplotlib</code> to visualize it, and other data types and type hinting for use in the code.<p>We also want automatic download of the JSON files if they’re missing, so we add their URLs and download paths as constants.<pre><code>import json +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Visualizing Cáceres’ OpenData | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Visualizing Cáceres’ OpenData</h1><div class=time><p>2020-03-09T00:00:08+00:00<p>last updated 2020-03-19T14:38:41+00:00</div><p>The city of Cáceres has online services to provide <a href=http://opendata.caceres.es/>Open Data</a> over a wide range of <a href=http://opendata.caceres.es/dataset>categories</a>, all of which are very interesting to explore!<p>We have chosen two different datasets, and will explore four different ways to visualize the data.<p>This post is co-authored with Classmate.<h2 id=obtain-the-data>Obtain the data</h2><p>We are interested in the JSON format for the <a 
href=http://opendata.caceres.es/dataset/informacion-del-padron-de-caceres-2017>census in 2017</a> and those for the <a href=http://opendata.caceres.es/dataset/vias-urbanas-caceres>vias of the city</a>. This way, we can explore the population and their location in interesting ways! You may follow those two links and select the JSON format under Resources to download it.<p>Why JSON? We will be using <a href=https://python.org/>Python</a> (3.7 or above) and <a href=https://matplotlib.org/>matplotlib</a> for quick iteration, and loading the data with <a href=https://docs.python.org/3/library/json.html>Python’s <code>json</code> module</a> will be trivial.<h2 id=implementation>Implementation</h2><h3 id=imports-and-constants>Imports and constants</h3><p>We are going to need a lot of things in this code, such as <code>json</code> to load the data, <code>matplotlib</code> to visualize it, and other data types and type hinting for use in the code.<p>We also want automatic download of the JSON files if they’re missing, so we add their URLs and download paths as constants.<pre><code>import json import re import os import sys
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> What is an algorithm? | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>What is an algorithm?</h1><div class=time><p>2020-02-25T00:00:16+00:00<p>last updated 2020-03-18T09:51:02+00:00</div><p>Algorithms are a sequence of instructions that can be followed to achieve <em>something</em>. That something can be anything, and depends entirely on your problem!<p>For example, a recipe to cook some really nice food is an algorithm: it guides you, step by step, to cook something nice. People dealing with mathematics also apply algorithms to transform their data. And computers <em>love</em> algorithms, too!<p>In reality, any computer program can basically be thought of as an algorithm. It contains a series of instructions for the computer to execute. Running them is a process that takes time, consumes input and produces output. This is also why terms like «procedure» come up when talking about them.<p>Computer programs (their algorithms) are normally written in some more specific language, like Java or Python. The instructions are very clear here, which is what we need! A natural language like English is a lot harder to process, and ambiguous. 
I’m sure you’ve been in arguments because the other person didn’t understand you!<h2 id=references>References</h2><ul><li>algorithm – definition and meaning: <a href=https://www.wordnik.com/words/algorithm>https://www.wordnik.com/words/algorithm</a><li>Algorithm: <a href=https://en.wikipedia.org/wiki/Algorithm>https://en.wikipedia.org/wiki/Algorithm</a><li>What is a «computer algorithm»?: <a href=https://computer.howstuffworks.com/what-is-a-computer-algorithm.htm>https://computer.howstuffworks.com/what-is-a-computer-algorithm.htm</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> What is an algorithm? | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>What is an algorithm?</h1><div class=time><p>2020-02-25T00:00:16+00:00<p>last updated 2020-03-18T09:51:02+00:00</div><p>Algorithms are a sequence of instructions that can be followed to achieve <em>something</em>. That something can be anything, and depends entirely on your problem!<p>For example, a recipe to cook some really nice food is an algorithm: it guides you, step by step, to cook something nice. People dealing with mathematics also apply algorithms to transform their data. 
And computers <em>love</em> algorithms, too!<p>In reality, any computer program can basically be thought of as an algorithm. It contains a series of instructions for the computer to execute. Running them is a process that takes time, consumes input and produces output. This is also why terms like «procedure» come up when talking about them.<p>Computer programs (their algorithms) are normally written in some more specific language, like Java or Python. The instructions are very clear here, which is what we need! A natural language like English is a lot harder to process, and ambiguous. I’m sure you’ve been in arguments because the other person didn’t understand you!<h2 id=references>References</h2><ul><li>algorithm – definition and meaning: <a href=https://www.wordnik.com/words/algorithm>https://www.wordnik.com/words/algorithm</a><li>Algorithm: <a href=https://en.wikipedia.org/wiki/Algorithm>https://en.wikipedia.org/wiki/Algorithm</a><li>What is a «computer algorithm»?: <a href=https://computer.howstuffworks.com/what-is-a-computer-algorithm.htm>https://computer.howstuffworks.com/what-is-a-computer-algorithm.htm</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> My new computer | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>My new computer</h1><div class=time><p>2020-06-19<p>last updated 2020-07-03</div><p>This post will be mostly me ranting about setting up a new laptop, but I also just want to share my upgrade. If you're considering installing Arch Linux with dual-boot for Windows, maybe this post will help. Or perhaps you will learn something new to troubleshoot systems in the future. Let's begin!<p>Last Sunday, I ordered an Asus Rog Strix G531GT-BQ165 for 900€ (on a 20% discount) with the following specifications:<ul><li>Intel® Core i7-9750H (6 cores, 12MB cache, 2.6GHz up to 4.5GHz, 64-bit)<li>16GB RAM (8GB*2) DDR4 2666MHz<li>512GB SSD M.2 PCIe® NVMe<li>Display 15.6" (1920x1080/16:9) 60Hz<li>Graphics NVIDIA® GeForce® GTX1650 4GB GDDR5 VRAM<li>LAN 10/100/1000<li>Wi-Fi 5 (802.11ac) 2x2 RangeBoost<li>Bluetooth 5.0<li>48Wh battery with 3 cells<li>3 x USB 3.1 (GEN1)</ul><p>I was mostly interested in a general upgrade (better processor, disk, more RAM), although the graphics card is a really nice addition which will allow me to take some time off on more games. After using it for a bit, I really love the feel of the keyboard, and I love the lack of numpad! (No sarcasm, I really don't like numpads.)<p>This is an upgrade from my previous laptop (Asus X554LA-XX822T), which I won in a programming challenge before entering university. 
It has served me really well for the past five years, and had the following specifications:<ul><li>Intel® Core™ i5-5200U<li>4GB RAM DDR3L 1600MHz (which I upgraded to have 8GB)<li>1TB HDD<li>Display 15.6" (1366x768/16:9)<li>Intel® HD Graphics 4400<li>LAN 10/100/1000<li>Wifi 802.11 bgn<li>Bluetooth 4.0<li>Battery 2 cells<li>1 x USB 2.0<li>2 x USB 3.0</ul><p>Prior to this one, I had a Lenovo (also won in the same competition of the previous year), and prior to that (just for the sake of history), it was HP Pavilion, AMD A4-3300M processor, which unfortunately ended with heating problems. But that's very old now.<h2 id=laptop-arrival>Laptop arrival</h2><p>The laptop arrived 2 days ago at roughly 19:00, and I left it charging for 3 hours as the book said. The day after, nightmares began!<p>Trying to boot it the first two times was fun, as it comes with a somewhat loud sound on boot. I don't know why they would do this, and I immediately turned it off in the BIOS.<h2 id=installation-journey>Installation journey</h2><p>I spent all of yesterday trying to set up Windows and Arch Linux (and didn't even finish, it took me this morning too and even now it's only half functional). I absolutely <em>hate</em> the number of partitions the Windows installer creates on a clean disk. So instead, I first went with Arch Linux, and followed the <a href=https://wiki.archlinux.org/index.php/Installation_guide>installation guide on the Arch wiki</a>. Pre-installation, setting up the wireless network, creating the partitions and formatting them all went well. I decided to avoid GRUB at first and go with rEFInd, but alas I missed a big warning on the wiki and after reboot (I would later find out) it was not mounting root properly, so all I had was whatever was in the Initramfs. 
Reboot didn't work, so I had to hold the power button.<p>Anyway, once the partitions were created, I went to install Windows (there was a lot of back and forth burning different <code>.iso</code> images on the USB, which was a bit annoying because it wasn't the fastest thing in the world). This was pretty painless, and the process was standard: select advanced to let me choose the right partition, pick the one, say "no" to everything in the services setup, and done. But this was the first Windows <code>.iso</code> I tried. It was an old revision, and the drivers were causing issues when running (something weird about their <code>.dll</code>, manually installing the <code>.ini</code> driver files seemed to work?). The Nvidia drivers didn't want to be installed on such an old revision, after updating everything I could via Windows updates. So back I went to burning a newer Windows <code>.iso</code> and going through the same process again…<p>Once Windows was ready and I verified that I could boot to it correctly, it was time to have a second go at Arch Linux. And I went through the setup at least three times, getting it wrong every single time, formatting root every single time, redownloading the packages every single pain. If only I had known earlier what the issue was!<p>Why bother with Arch? I was pretty happy with Linux Mint, and I lowkey wanted to try NixOS, but I had used Arch before and it's a really nice distro overall (up-to-date, has AUR, quite minimal, imperative), except for trying to install rEFInd while chrooted…<p>In the end I managed to get something half-working, I still need to properly configure WiFi and pulseaudio in my system but hey it works.<p>I like to be able to dual-boot Windows and Linux because Linux is amazing for productivity, but unfortunately, some games only work fine on Windows. 
Might as well have both systems and use one for gaming, while the other is my daily driver.<h2 id=setting-up-arch-linux>Setting up Arch Linux</h2><p>This is the process I followed to install Arch Linux in the end, along with a brief explanation of what I think the things are doing and why we are doing them. I think the wiki could do a better job at this, but I also know it's hard to get it right for everyone. Something I do dislike is the link colour: after opening a link it becomes gray and it's a lot easier to miss the fact that it is a link in the first place, which was tough when re-reading it because some links actually matter a lot. Furthermore, important information may just be a single line, also easy to skim over. Anyway, on to the installation process…<p>The first thing we want to do is configure our keyboard layout or else the keys won't correspond to what we expect:<pre><code class=language-sh data-lang=sh>loadkeys es +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> My new computer | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>My new computer</h1><div class=time><p>2020-06-19<p>last updated 2020-07-03</div><p>This post will be mostly me ranting about setting up a new laptop, but I also just want to share my upgrade. If you're considering installing Arch Linux with dual-boot for Windows, maybe this post will help. Or perhaps you will learn something new to troubleshoot systems in the future. 
Let's begin!<p>Last Sunday, I ordered an Asus Rog Strix G531GT-BQ165 for 900€ (on a 20% discount) with the following specifications:<ul><li>Intel® Core i7-9750H (6 cores, 12MB cache, 2.6GHz up to 4.5GHz, 64-bit)<li>16GB RAM (8GB*2) DDR4 2666MHz<li>512GB SSD M.2 PCIe® NVMe<li>Display 15.6" (1920x1080/16:9) 60Hz<li>Graphics NVIDIA® GeForce® GTX1650 4GB GDDR5 VRAM<li>LAN 10/100/1000<li>Wi-Fi 5 (802.11ac) 2x2 RangeBoost<li>Bluetooth 5.0<li>48Wh battery with 3 cells<li>3 x USB 3.1 (GEN1)</ul><p>I was mostly interested in a general upgrade (better processor, disk, more RAM), although the graphics card is a really nice addition which will allow me to take some time off on more games. After using it for a bit, I really love the feel of the keyboard, and I love the lack of numpad! (No sarcasm, I really don't like numpads.)<p>This is an upgrade from my previous laptop (Asus X554LA-XX822T), which I won in a programming challenge before entering university. It has served me really well for the past five years, and had the following specifications:<ul><li>Intel® Core™ i5-5200U<li>4GB RAM DDR3L 1600MHz (which I upgraded to have 8GB)<li>1TB HDD<li>Display 15.6" (1366x768/16:9)<li>Intel® HD Graphics 4400<li>LAN 10/100/1000<li>Wifi 802.11 bgn<li>Bluetooth 4.0<li>Battery 2 cells<li>1 x USB 2.0<li>2 x USB 3.0</ul><p>Prior to this one, I had a Lenovo (also won in the same competition of the previous year), and prior to that (just for the sake of history), it was HP Pavilion, AMD A4-3300M processor, which unfortunately ended with heating problems. But that's very old now.<h2 id=laptop-arrival>Laptop arrival</h2><p>The laptop arrived 2 days ago at roughly 19:00, and I left it charging for 3 hours as the book said. The day after, nightmares began!<p>Trying to boot it the first two times was fun, as it comes with a somewhat loud sound on boot. 
I don't know why they would do this, and I immediately turned it off in the BIOS.<h2 id=installation-journey>Installation journey</h2><p>I spent all of yesterday trying to set up Windows and Arch Linux (and didn't even finish, it took me this morning too and even now it's only half functional). I absolutely <em>hate</em> the number of partitions the Windows installer creates on a clean disk. So instead, I first went with Arch Linux, and followed the <a href=https://wiki.archlinux.org/index.php/Installation_guide>installation guide on the Arch wiki</a>. Pre-installation, setting up the wireless network, creating the partitions and formatting them all went well. I decided to avoid GRUB at first and go with rEFInd, but alas I missed a big warning on the wiki and after reboot (I would later find out) it was not mounting root properly, so all I had was whatever was in the Initramfs. Reboot didn't work, so I had to hold the power button.<p>Anyway, once the partitions were created, I went to install Windows (there was a lot of back and forth burning different <code>.iso</code> images on the USB, which was a bit annoying because it wasn't the fastest thing in the world). This was pretty painless, and the process was standard: select advanced to let me choose the right partition, pick the one, say "no" to everything in the services setup, and done. But this was the first Windows <code>.iso</code> I tried. It was an old revision, and the drivers were causing issues when running (something weird about their <code>.dll</code>, manually installing the <code>.ini</code> driver files seemed to work?). The Nvidia drivers didn't want to be installed on such an old revision, after updating everything I could via Windows updates. So back I went to burning a newer Windows <code>.iso</code> and going through the same process again…<p>Once Windows was ready and I verified that I could boot to it correctly, it was time to have a second go at Arch Linux. 
And I went through the setup at least three times, getting it wrong every single time, formatting root every single time, redownloading the packages every single pain. If only I had known earlier what the issue was!<p>Why bother with Arch? I was pretty happy with Linux Mint, and I lowkey wanted to try NixOS, but I had used Arch before and it's a really nice distro overall (up-to-date, has AUR, quite minimal, imperative), except for trying to install rEFInd while chrooted…<p>In the end I managed to get something half-working, I still need to properly configure WiFi and pulseaudio in my system but hey it works.<p>I like to be able to dual-boot Windows and Linux because Linux is amazing for productivity, but unfortunately, some games only work fine on Windows. Might as well have both systems and use one for gaming, while the other is my daily driver.<h2 id=setting-up-arch-linux>Setting up Arch Linux</h2><p>This is the process I followed to install Arch Linux in the end, along with a brief explanation of what I think the things are doing and why we are doing them. I think the wiki could do a better job at this, but I also know it's hard to get it right for everyone. Something I do dislike is the link colour: after opening a link it becomes gray and it's a lot easier to miss the fact that it is a link in the first place, which was tough when re-reading it because some links actually matter a lot. Furthermore, important information may just be a single line, also easy to skim over. Anyway, on to the installation process…<p>The first thing we want to do is configure our keyboard layout or else the keys won't correspond to what we expect:<pre><code class=language-sh data-lang=sh>loadkeys es </code></pre><p>Because we're on a recent system, we want to verify that UEFI works correctly. 
If we see files listed, then it works fine:<pre><code class=language-sh data-lang=sh>ls /sys/firmware/efi/efivars </code></pre><p>The next thing we want to do is configure the WiFi, because I don't have any ethernet cable nearby. To do this, we check what network interfaces our laptop has (we're looking for the one prefixed with "w", presumably for wireless, such as "wlan0" or "wlo1"), we set it up, scan for available wireless networks, and finally connect. In my case, the network has WPA security so we rely on <code>wpa_supplicant</code> to connect, passing the SSID (network name) and password:<pre><code class=language-sh data-lang=sh>ip link ip link set <IFACE> up
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Shattered Pixel Dungeon | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Shattered Pixel Dungeon</h1><div class=time><p>2019-06-03</div><p><a href=https://shatteredpixel.com/shatteredpd/>Shattered Pixel Dungeon</a> is the classic roguelike RPG game with randomly-generated dungeons. As a new player, it was a bit frustrating to be constantly killed on the first levels of the dungeon, but with some practice it's easy to reach high levels if you can kill the first boss.<h2 id=basic-tips>Basic Tips</h2><p>The game comes with its own tips, but here's a short and straight-forward summary:<ul><li><strong>Don't rush into enemies</strong>. Abuse doors and small corridors to kill them one by one. You can use the clock on the bottom left to wait a turn without moving.<li><strong>Explore each level at full</strong>. You will find goodies and gain XP while doing so.<li><strong>Upon finding a special room</strong> (e.g. has a chest but is protected by piranhas), drink all potions that you found in that level until there's one that helps you (e.g. be invisible so piranhas leave you alone). There is guaranteed to be a helpful one per level with special rooms.<li><strong>Drink potions as early as possible</strong>. Harmful potions do less damage on early levels (and if you die, you lose less). This will keep them identified early for the rest of the game.<li><strong>Read scrolls as early as possible</strong> as well. This will keep them identified. 
It may be worth to wait until you have an item which may be cursed and until the level is clear, because some scrolls clean curses and others alert enemies.<li><strong>Food and health are resources</strong> that you have to <em>manage</em>, not keep them always at full. Even if you are starving and taking damage, you may not need to eat <em>just yet</em>, since food is scarce. Eat when you are low on health or in possible danger.<li><strong>Piranhas</strong>. Seriously, just leave them alone if you are melee. They're free food if you're playing ranged, though.<li><strong>Prefer armor over weapons</strong>. And make sure to identify or clean it from curses before wearing anything!<li><strong>Find a dew vial early</strong>. It's often a better idea to store dew (health) for later than to use it as soon as possible.</ul><h2 id=bosses>Bosses</h2><p>There is a boss every 5 levels.<ul><li><strong>Level 5 boss</strong>. Try to stay on water, but don't let <em>it</em> stay on water since it will heal. Be careful when he starts enraging.<li><strong>Level 10 boss</strong>. Ranged weapons are good against it.<li><strong>Level 15 boss</strong>. I somehow managed to tank it with a health potion.<li><strong>Level 20 boss</strong>. I didn't get this far just yet. You are advised to use scrolls of magic mapping in the last levels to skip straight to the boss, since there's nothing else of value.<li><strong>Level 25 boss</strong>. The final boss. Good job if you made it this far!</ul><h2 id=mage>Mage</h2><p>If you followed the basic tips, you will sooner or later make use of two scrolls of upgrade in a single run. This will unlock the mage class, which is ridiculously powerful. He starts with a ranged-weapon, a magic missile wand, which is really helpful to keep enemies at a distance. 
Normally, you want to use this at first to surprise attack them soon, and if you are low on charges, you may go melee on normal enemies if you are confident.<h2 id=luck>Luck</h2><p>This game is all about luck and patience! Some runs will be better than others, and you should thank and pray the RNG gods for them. If you don't, they will only give you cursed items and not a single scroll to clean them. So, good luck and enjoy playing!</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Shattered Pixel Dungeon | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Shattered Pixel Dungeon</h1><div class=time><p>2019-06-03</div><p><a href=https://shatteredpixel.com/shatteredpd/>Shattered Pixel Dungeon</a> is the classic roguelike RPG game with randomly-generated dungeons. As a new player, it was a bit frustrating to be constantly killed on the first levels of the dungeon, but with some practice it's easy to reach high levels if you can kill the first boss.<h2 id=basic-tips>Basic Tips</h2><p>The game comes with its own tips, but here's a short and straight-forward summary:<ul><li><strong>Don't rush into enemies</strong>. Abuse doors and small corridors to kill them one by one. 
You can use the clock on the bottom left to wait a turn without moving.<li><strong>Explore each level at full</strong>. You will find goodies and gain XP while doing so.<li><strong>Upon finding a special room</strong> (e.g. has a chest but is protected by piranhas), drink all potions that you found in that level until there's one that helps you (e.g. be invisible so piranhas leave you alone). There is guaranteed to be a helpful one per level with special rooms.<li><strong>Drink potions as early as possible</strong>. Harmful potions do less damage on early levels (and if you die, you lose less). This will keep them identified early for the rest of the game.<li><strong>Read scrolls as early as possible</strong> as well. This will keep them identified. It may be worth to wait until you have an item which may be cursed and until the level is clear, because some scrolls clean curses and others alert enemies.<li><strong>Food and health are resources</strong> that you have to <em>manage</em>, not keep them always at full. Even if you are starving and taking damage, you may not need to eat <em>just yet</em>, since food is scarce. Eat when you are low on health or in possible danger.<li><strong>Piranhas</strong>. Seriously, just leave them alone if you are melee. They're free food if you're playing ranged, though.<li><strong>Prefer armor over weapons</strong>. And make sure to identify or clean it from curses before wearing anything!<li><strong>Find a dew vial early</strong>. It's often a better idea to store dew (health) for later than to use it as soon as possible.</ul><h2 id=bosses>Bosses</h2><p>There is a boss every 5 levels.<ul><li><strong>Level 5 boss</strong>. Try to stay on water, but don't let <em>it</em> stay on water since it will heal. Be careful when he starts enraging.<li><strong>Level 10 boss</strong>. Ranged weapons are good against it.<li><strong>Level 15 boss</strong>. I somehow managed to tank it with a health potion.<li><strong>Level 20 boss</strong>. 
I didn't get this far just yet. You are advised to use scrolls of magic mapping in the last levels to skip straight to the boss, since there's nothing else of value.<li><strong>Level 25 boss</strong>. The final boss. Good job if you made it this far!</ul><h2 id=mage>Mage</h2><p>If you followed the basic tips, you will sooner or later make use of two scrolls of upgrade in a single run. This will unlock the mage class, which is ridiculously powerful. He starts with a ranged-weapon, a magic missile wand, which is really helpful to keep enemies at a distance. Normally, you want to use this at first to surprise attack them soon, and if you are low on charges, you may go melee on normal enemies if you are confident.<h2 id=luck>Luck</h2><p>This game is all about luck and patience! Some runs will be better than others, and you should thank and pray the RNG gods for them. If you don't, they will only give you cursed items and not a single scroll to clean them. So, good luck and enjoy playing!</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Atemporal Blog Posts | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Atemporal Blog Posts</h1><div class=time><p>2018-02-03</div><p>These are some interesting posts and links I've found around the web. I believe they are quite interesting and nice reads, so if you have the time, I encourage you to check some out.<h2 id=algorithms>Algorithms</h2><ul><li>http://www.tannerhelland.com/4660/dithering-eleven-algorithms-source-code/. Image Dithering: Eleven Algorithms and Source Code. What does it mean and how to achieve it?<li>https://cristian.io/post/bloom-filters/. Idempotence layer on bloom filters. What are they and how can they help?<li>https://en.wikipedia.org/wiki/Huffman_coding. Huffman coding. This encoding is a simple yet interesting way of compressing information.<li>https://github.com/mxgmn/WaveFunctionCollapse. Wave Function Collapse. Bitmap & tilemap generation from a single example with the help of ideas from quantum mechanics.<li>https://blog.nelhage.com/2015/02/regular-expression-search-with-suffix-arrays/. Regular Expression Search with Suffix Arrays. A way to efficiently search large amounts of text.</ul><h2 id=culture>Culture</h2><ul><li>https://www.wired.com/story/ideas-joi-ito-robot-overlords/. Why Westerners Fear Robots and the Japanese Do Not. Explains some possible reasons for this case.<li>http://catb.org/~esr/faqs/smart-questions.html. How To Ask Questions The Smart Way. Some bits of hacker culture and amazing tips on how to ask a question.<li>http://apenwarr.ca/log/?m=201809#14. 
XML, blockchains, and the strange shapes of progress. Some of history about XML and blockchain.<li>https://czep.net/17/legion-of-lobotomized-unices.html. Legion of lobotomized unices. A time where computers are treated a lot more nicely.<li>https://eli.thegreenplace.net/2016/the-expression-problem-and-its-solutions/. The Expression Problem and its solutions. What is it and what can we do to solve it?<li>http://allendowney.blogspot.com/2015/08/the-inspection-paradox-is-everywhere.html. The Inspection Paradox is Everywhere. Interesting and very common phenomena.<li>https://github.com/ChrisKnott/Algojammer. An experimental code editor for writing algorithms. Contains several links to different tools for reverse debugging.<li>http://habitatchronicles.com/2017/05/what-are-capabilities/. What Are Capabilities? Good ideas with great security implications.<li>https://blog.aurynn.com/2015/12/16-contempt-culture. Contempt Culture. Or why you should not speak crap about your non-favourite programming languages.<li>https://www.lesswrong.com/posts/tscc3e5eujrsEeFN4/well-kept-gardens-die-by-pacifism. Well-Kept Gardens Die By Pacifism. Risks any online community can run into.<li>https://ncase.me/. It's Nicky Case! They make some cool things worth checking out, I really like "we become what we behold".</ul><h2 id=debate>Debate</h2><ul><li>https://steemit.com/opensource/@crell/open-source-is-awful. Open Source is awful. Has some points about why is it bad and how it could improve.<li>http://www.mondo2000.com/2018/01/17/pink-lexical-goop-dark-side-autocorrect/. Pink Lexical Goop: The Dark Side of Autocorrect. It can shape how you think.<li>http://blog.ploeh.dk/2015/08/03/idiomatic-or-idiosyncratic/. Idiomatic or idiosyncratic? Can porting code constructs from other languages have a positive effect?<li>https://gamasutra.com/view/news/169296/Indepth_Functional_programming_in_C.php. In-depth: Functional programming in C++. 
Is it useful to bother with functional concepts in a language like C++?<li>https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/. Notes on structured concurrency, or: Go statement considered harmful.<li>https://queue.acm.org/detail.cfm?id=3212479. C Is Not a Low-level Language. Could there be alternative programming models designed for more specialized CPUs?</ul><h2 id=food-for-thought>Food for Thought</h2><ul><li>https://www.hillelwayne.com/post/divide-by-zero/. 1/0 = 0. Explores why it makes sense to redefine mathemathics under some circumstances, and why it is possible to do so.<li>https://jeremykun.com/2018/04/13/for-mathematicians-does-not-mean-equality/. For mathematicians, = does not mean equality. What other definitions does the equal sign have?<li>https://www.lesswrong.com/posts/2MD3NMLBPCqPfnfre/cached-thoughts. Cached Thoughts. How is it possible that our brains work at all?<li>http://tonsky.me/blog/disenchantment/. Software disenchantment. Faster hardware and slower software is a trend. <ul><li>https://blackhole12.com/blog/software-engineering-is-bad-but-it-s-not-that-bad/. Software Engineering Is Bad, But That's Not Why. This post has some good counterpoints to Software disenchantment.</ul><li>http://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/. What Color is Your Function? Spoiler: can we approach asynchronous IO better?<li>https://hackernoon.com/im-harvesting-credit-card-numbers-and-passwords-from-your-site-here-s-how-9a8cb347c5b5. I'm harvesting credit card numbers and passwords from your site. A word of warning when mindlessly adding dependencies.<li>https://medium.com/message/everything-is-broken-81e5f33a24e1. Everything Is Broken. Some of the (probable) truths about our world.</ul><h2 id=funny>Funny</h2><ul><li>http://thedailywtf.com/articles/We-Use-BobX. We Use BobX. BobX.<li>http://thedailywtf.com/articles/the-inner-json-effect. The Inner JSON Effect. 
For some reason, custom languages are in.<li>https://thedailywtf.com/articles/exponential-backup. Exponential Backup. Far better than git.<li>https://thedailywtf.com/articles/ITAPPMONROBOT. ITAPPMONROBOT. Solving software problems with hardware.<li>https://thedailywtf.com/articles/a-tapestry-of-threads. A Tapestry of Threads.More threads must mean faster code, right?<li>https://medium.com/commitlog/a-brief-totally-accurate-history-of-programming-languages-cd93ec806124. A Brief Totally Accurate History Of Programming Languages. Don't take offense for it!</ul><h2 id=graphics>Graphics</h2><ul><li>http://shaunlebron.github.io/visualizing-projections/. Visualizing Projections. Small post about different projection methods.<li>http://www.iquilezles.org/www/index.htm. A <em>lot</em> of useful and quality articles regarding computer graphics.</ul><h2 id=history>History</h2><ul><li>https://twobithistory.org/2018/08/18/ada-lovelace-note-g.html. What Did Ada Lovelace's Program Actually Do?. And other characters that took part in the beginning's of programming.<li>https://chrisdown.name/2018/01/02/in-defence-of-swap.html. In defence of swap: common misconceptions. Swap is still an useful concept.<li>https://www.pacifict.com/Story/. The Graphing Calculator Story. A great classic Apple tale.<li>https://twobithistory.org/2018/10/14/lisp.html. How Lisp Became God's Own Programming Language. Lisp as a foundational programming language.</ul><h2 id=motivational>Motivational</h2><ul><li>https://www.joelonsoftware.com/2002/01/06/fire-and-motion/. Fire And Motion. What does actually take to get things done?<li>https://realmensch.org/2017/08/25/the-parable-of-the-two-programmers/. The Parable of the Two Programmers. 
This tale is about two different types of programmer and their respective endings in a company, illustrating how the one you wouldn't expect to actually ends in a better situation.<li>https://byorgey.wordpress.com/2018/05/06/conversations-with-a-six-year-old-on-functional-programming/. Conversations with a six-year-old on functional programming. Little kids today can be really interested in technological topics.<li>https://bulletproofmusician.com/how-many-hours-a-day-should-you-practice/. How Many Hours a Day Should You Practice?. While the article is about music, it applies to any other areas.<li>http://nathanmarz.com/blog/suffering-oriented-programming.html. Suffering-oriented programming. A possibly new approach on how you could tackle your new projects.<li>https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/. Things You Should Never Do, Part I. There is no need to rewrite your code.</ul><h2 id=optimization>Optimization</h2><ul><li>http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html. What Every C Programmer Should Know About Undefined Behavior #1/3. Explains what undefined behaviour is and why it makes sense.<li>http://ridiculousfish.com/blog/posts/labor-of-division-episode-i.html. Labor of Division (Episode I). Some tricks to divide without division.<li>http://blog.moertel.com/posts/2013-12-14-great-old-timey-game-programming-hack.html. A Great Old-Timey Game-Programming Hack. Abusing instructions to make games playable even on the slowest hardware.<li>https://web.archive.org/web/20191213224640/https://people.eecs.berkeley.edu/~sangjin/2012/12/21/epoll-vs-kqueue.html. Scalable Event Multiplexing: epoll vs kqueue. How good OS primitives can really help performance and scability.<li>https://adamdrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html. Command-line Tools can be 235x Faster than your Hadoop Cluster. 
Or how to use the right tool for the right job.<li>https://nullprogram.com/blog/2018/05/27/. When FFI Function Calls Beat Native C. How lua beat C at it and the explanation behind it.<li>http://igoro.com/archive/gallery-of-processor-cache-effects/. Gallery of Processor Cache Effects. Knowing a few things about the cache can make a big difference.</ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Atemporal Blog Posts | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Atemporal Blog Posts</h1><div class=time><p>2018-02-03</div><p>These are some interesting posts and links I've found around the web. I believe they are quite interesting and nice reads, so if you have the time, I encourage you to check some out.<h2 id=algorithms>Algorithms</h2><ul><li>http://www.tannerhelland.com/4660/dithering-eleven-algorithms-source-code/. Image Dithering: Eleven Algorithms and Source Code. What does it mean and how to achieve it?<li>https://cristian.io/post/bloom-filters/. Idempotence layer on bloom filters. What are they and how can they help?<li>https://en.wikipedia.org/wiki/Huffman_coding. Huffman coding. 
This encoding is a simple yet interesting way of compressing information.<li>https://github.com/mxgmn/WaveFunctionCollapse. Wave Function Collapse. Bitmap & tilemap generation from a single example with the help of ideas from quantum mechanics.<li>https://blog.nelhage.com/2015/02/regular-expression-search-with-suffix-arrays/. Regular Expression Search with Suffix Arrays. A way to efficiently search large amounts of text.</ul><h2 id=culture>Culture</h2><ul><li>https://www.wired.com/story/ideas-joi-ito-robot-overlords/. Why Westerners Fear Robots and the Japanese Do Not. Explains some possible reasons for this case.<li>http://catb.org/~esr/faqs/smart-questions.html. How To Ask Questions The Smart Way. Some bits of hacker culture and amazing tips on how to ask a question.<li>http://apenwarr.ca/log/?m=201809#14. XML, blockchains, and the strange shapes of progress. Some of history about XML and blockchain.<li>https://czep.net/17/legion-of-lobotomized-unices.html. Legion of lobotomized unices. A time where computers are treated a lot more nicely.<li>https://eli.thegreenplace.net/2016/the-expression-problem-and-its-solutions/. The Expression Problem and its solutions. What is it and what can we do to solve it?<li>http://allendowney.blogspot.com/2015/08/the-inspection-paradox-is-everywhere.html. The Inspection Paradox is Everywhere. Interesting and very common phenomena.<li>https://github.com/ChrisKnott/Algojammer. An experimental code editor for writing algorithms. Contains several links to different tools for reverse debugging.<li>http://habitatchronicles.com/2017/05/what-are-capabilities/. What Are Capabilities? Good ideas with great security implications.<li>https://blog.aurynn.com/2015/12/16-contempt-culture. Contempt Culture. Or why you should not speak crap about your non-favourite programming languages.<li>https://www.lesswrong.com/posts/tscc3e5eujrsEeFN4/well-kept-gardens-die-by-pacifism. Well-Kept Gardens Die By Pacifism. 
Risks any online community can run into.<li>https://ncase.me/. It's Nicky Case! They make some cool things worth checking out, I really like "we become what we behold".</ul><h2 id=debate>Debate</h2><ul><li>https://steemit.com/opensource/@crell/open-source-is-awful. Open Source is awful. Has some points about why is it bad and how it could improve.<li>http://www.mondo2000.com/2018/01/17/pink-lexical-goop-dark-side-autocorrect/. Pink Lexical Goop: The Dark Side of Autocorrect. It can shape how you think.<li>http://blog.ploeh.dk/2015/08/03/idiomatic-or-idiosyncratic/. Idiomatic or idiosyncratic? Can porting code constructs from other languages have a positive effect?<li>https://gamasutra.com/view/news/169296/Indepth_Functional_programming_in_C.php. In-depth: Functional programming in C++. Is it useful to bother with functional concepts in a language like C++?<li>https://vorpus.org/blog/notes-on-structured-concurrency-or-go-statement-considered-harmful/. Notes on structured concurrency, or: Go statement considered harmful.<li>https://queue.acm.org/detail.cfm?id=3212479. C Is Not a Low-level Language. Could there be alternative programming models designed for more specialized CPUs?</ul><h2 id=food-for-thought>Food for Thought</h2><ul><li>https://www.hillelwayne.com/post/divide-by-zero/. 1/0 = 0. Explores why it makes sense to redefine mathemathics under some circumstances, and why it is possible to do so.<li>https://jeremykun.com/2018/04/13/for-mathematicians-does-not-mean-equality/. For mathematicians, = does not mean equality. What other definitions does the equal sign have?<li>https://www.lesswrong.com/posts/2MD3NMLBPCqPfnfre/cached-thoughts. Cached Thoughts. How is it possible that our brains work at all?<li>http://tonsky.me/blog/disenchantment/. Software disenchantment. Faster hardware and slower software is a trend. <ul><li>https://blackhole12.com/blog/software-engineering-is-bad-but-it-s-not-that-bad/. Software Engineering Is Bad, But That's Not Why. 
This post has some good counterpoints to Software disenchantment.</ul><li>http://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/. What Color is Your Function? Spoiler: can we approach asynchronous IO better?<li>https://hackernoon.com/im-harvesting-credit-card-numbers-and-passwords-from-your-site-here-s-how-9a8cb347c5b5. I'm harvesting credit card numbers and passwords from your site. A word of warning when mindlessly adding dependencies.<li>https://medium.com/message/everything-is-broken-81e5f33a24e1. Everything Is Broken. Some of the (probable) truths about our world.</ul><h2 id=funny>Funny</h2><ul><li>http://thedailywtf.com/articles/We-Use-BobX. We Use BobX. BobX.<li>http://thedailywtf.com/articles/the-inner-json-effect. The Inner JSON Effect. For some reason, custom languages are in.<li>https://thedailywtf.com/articles/exponential-backup. Exponential Backup. Far better than git.<li>https://thedailywtf.com/articles/ITAPPMONROBOT. ITAPPMONROBOT. Solving software problems with hardware.<li>https://thedailywtf.com/articles/a-tapestry-of-threads. A Tapestry of Threads.More threads must mean faster code, right?<li>https://medium.com/commitlog/a-brief-totally-accurate-history-of-programming-languages-cd93ec806124. A Brief Totally Accurate History Of Programming Languages. Don't take offense for it!</ul><h2 id=graphics>Graphics</h2><ul><li>http://shaunlebron.github.io/visualizing-projections/. Visualizing Projections. Small post about different projection methods.<li>http://www.iquilezles.org/www/index.htm. A <em>lot</em> of useful and quality articles regarding computer graphics.</ul><h2 id=history>History</h2><ul><li>https://twobithistory.org/2018/08/18/ada-lovelace-note-g.html. What Did Ada Lovelace's Program Actually Do?. And other characters that took part in the beginning's of programming.<li>https://chrisdown.name/2018/01/02/in-defence-of-swap.html. In defence of swap: common misconceptions. 
Swap is still an useful concept.<li>https://www.pacifict.com/Story/. The Graphing Calculator Story. A great classic Apple tale.<li>https://twobithistory.org/2018/10/14/lisp.html. How Lisp Became God's Own Programming Language. Lisp as a foundational programming language.</ul><h2 id=motivational>Motivational</h2><ul><li>https://www.joelonsoftware.com/2002/01/06/fire-and-motion/. Fire And Motion. What does actually take to get things done?<li>https://realmensch.org/2017/08/25/the-parable-of-the-two-programmers/. The Parable of the Two Programmers. This tale is about two different types of programmer and their respective endings in a company, illustrating how the one you wouldn't expect to actually ends in a better situation.<li>https://byorgey.wordpress.com/2018/05/06/conversations-with-a-six-year-old-on-functional-programming/. Conversations with a six-year-old on functional programming. Little kids today can be really interested in technological topics.<li>https://bulletproofmusician.com/how-many-hours-a-day-should-you-practice/. How Many Hours a Day Should You Practice?. While the article is about music, it applies to any other areas.<li>http://nathanmarz.com/blog/suffering-oriented-programming.html. Suffering-oriented programming. A possibly new approach on how you could tackle your new projects.<li>https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/. Things You Should Never Do, Part I. There is no need to rewrite your code.</ul><h2 id=optimization>Optimization</h2><ul><li>http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html. What Every C Programmer Should Know About Undefined Behavior #1/3. Explains what undefined behaviour is and why it makes sense.<li>http://ridiculousfish.com/blog/posts/labor-of-division-episode-i.html. Labor of Division (Episode I). Some tricks to divide without division.<li>http://blog.moertel.com/posts/2013-12-14-great-old-timey-game-programming-hack.html. A Great Old-Timey Game-Programming Hack. 
Abusing instructions to make games playable even on the slowest hardware.<li>https://web.archive.org/web/20191213224640/https://people.eecs.berkeley.edu/~sangjin/2012/12/21/epoll-vs-kqueue.html. Scalable Event Multiplexing: epoll vs kqueue. How good OS primitives can really help performance and scability.<li>https://adamdrake.com/command-line-tools-can-be-235x-faster-than-your-hadoop-cluster.html. Command-line Tools can be 235x Faster than your Hadoop Cluster. Or how to use the right tool for the right job.<li>https://nullprogram.com/blog/2018/05/27/. When FFI Function Calls Beat Native C. How lua beat C at it and the explanation behind it.<li>http://igoro.com/archive/gallery-of-processor-cache-effects/. Gallery of Processor Cache Effects. Knowing a few things about the cache can make a big difference.</ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> A practical example with Hadoop | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>A practical example with Hadoop</h1><div class=time><p>2020-04-01T02:00:00+00:00<p>last updated 2020-04-03T08:43:41+00:00</div><p>In our <a href=/blog/ribw/introduction-to-hadoop-and-its-mapreduce/>previous Hadoop post</a>, we learnt what it is, how it originated, and how it works, from a theoretical standpoint. Here we will instead focus on a more practical example with Hadoop.<p>This post will showcase my own implementation to implement a word counter for any plain text document that you want to analyze.<h2 id=installation>Installation</h2><p>Before running any piece of software, its executable code must first be downloaded into our computers so that we can run it. Head over to <a href=http://hadoop.apache.org/releases.html>Apache Hadoop’s releases</a> and download the <a href=https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz>latest binary version</a> at the time of writing (3.2.1).<p>We will be using the <a href=https://linuxmint.com/>Linux Mint</a> distribution because I love its simplicity, although the process shown here should work just fine on any similar Linux distribution such as <a href=https://ubuntu.com/>Ubuntu</a>.<p>Once the archive download is complete, extract it with any tool of your choice (graphical or using the terminal) and execute it. 
Make sure you have a version of Java installed, such as <a href=https://openjdk.java.net/>OpenJDK</a>.<p>Here are all the three steps in the command line:<pre><code>wget https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> A practical example with Hadoop | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>A practical example with Hadoop</h1><div class=time><p>2020-04-01T02:00:00+00:00<p>last updated 2020-04-03T08:43:41+00:00</div><p>In our <a href=/blog/ribw/introduction-to-hadoop-and-its-mapreduce/>previous Hadoop post</a>, we learnt what it is, how it originated, and how it works, from a theoretical standpoint. Here we will instead focus on a more practical example with Hadoop.<p>This post will showcase my own implementation to implement a word counter for any plain text document that you want to analyze.<h2 id=installation>Installation</h2><p>Before running any piece of software, its executable code must first be downloaded into our computers so that we can run it. 
Head over to <a href=http://hadoop.apache.org/releases.html>Apache Hadoop’s releases</a> and download the <a href=https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz>latest binary version</a> at the time of writing (3.2.1).<p>We will be using the <a href=https://linuxmint.com/>Linux Mint</a> distribution because I love its simplicity, although the process shown here should work just fine on any similar Linux distribution such as <a href=https://ubuntu.com/>Ubuntu</a>.<p>Once the archive download is complete, extract it with any tool of your choice (graphical or using the terminal) and execute it. Make sure you have a version of Java installed, such as <a href=https://openjdk.java.net/>OpenJDK</a>.<p>Here are all the three steps in the command line:<pre><code>wget https://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz tar xf hadoop-3.2.1.tar.gz hadoop-3.2.1/bin/hadoop version </code></pre><h2 id=processing-data>Processing data</h2><p>To take advantage of Hadoop, we have to design our code to work in the MapReduce model. Both the map and reduce phase work on key-value pairs as input and output, and both have a programmer-defined function.<p>We will use Java, because it’s a dependency that we already have anyway, so might as well.<p>Our map function needs to split each of the lines we receive as input into words, and we will also convert them to lowercase, thus preparing the data for later use (counting words). There won’t be bad records, so we don’t have to worry about that.<p>Copy or reproduce the following code in a file called <code>WordCountMapper.java</code>, using any text editor of your choice:<pre><code>import java.io.IOException;
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> About Boolean Retrieval | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>About Boolean Retrieval</h1><div class=time><p>2020-02-25T00:00:29+00:00<p>last updated 2020-03-18T09:38:02+00:00</div><p>This entry will discuss the section on the <em><a href=https://nlp.stanford.edu/IR-book/pdf/01bool.pdf>Boolean retrieval</a></em> section of the book <em><a href=https://nlp.stanford.edu/IR-book/pdf/irbookprint.pdf>An Introduction to Information Retrieval</a></em>.<h2 id=summary-on-the-topic>Summary on the topic</h2><p>Boolean retrieval is one of the many ways information retrieval (finding materials that satisfy an information need), often simply called <em>search</em>.<p>A simple way to retrieve information is to <em>grep</em> through the text (term named after the Unix tool <code>grep</code>), scanning text linearly and excluding it on certain criteria. However, this falls short when the volume of the data grows, more complex queries are desired, or one seeks some sort of ranking.<p>To avoid linear scanning, we build an <em>index</em> and record for each document whether it contains each term out of our full dictionary of terms (which may be words in a chapter and words in the book). This results in a binary term-document <em>incidence matrix</em>. 
Such a possible matrix is:<table><tbody><tr><td><em> word/play </em><td><strong> Antony and Cleopatra </strong><td><strong> Julius Caesar </strong><td><strong> The Tempest </strong><td><strong> … </strong><tr><td><strong> Antony </strong><td>1<td>1<td>0<td><tr><td><strong> Brutus </strong><td>1<td>1<td>0<td><tr><td><strong> Caesar </strong><td>1<td>1<td>0<td><tr><td><strong> Calpurnia </strong><td>0<td>1<td>0<td><tr><td><strong> Cleopatra </strong><td>1<td>0<td>0<td><tr><td><strong> mercy </strong><td>1<td>0<td>1<td><tr><td><strong> worser </strong><td>1<td>0<td>1<td><tr><td><strong> … </strong><td><td><td><td></table><p>We can look at this matrix’s rows or columns to obtain a vector for each term indicating where it appears, or a vector for each document indicating the terms it contains.<p>Now, answering a query such as <code>Brutus AND Caesar AND NOT Calpurnia</code> becomes trivial:<pre><code>VECTOR(Brutus) AND VECTOR(Caesar) AND COMPLEMENT(VECTOR(Calpurnia)) +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> About Boolean Retrieval | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>About Boolean Retrieval</h1><div class=time><p>2020-02-25T00:00:29+00:00<p>last updated 2020-03-18T09:38:02+00:00</div><p>This entry will discuss the section on the <em><a href=https://nlp.stanford.edu/IR-book/pdf/01bool.pdf>Boolean retrieval</a></em> section of the book <em><a href=https://nlp.stanford.edu/IR-book/pdf/irbookprint.pdf>An Introduction to Information 
Retrieval</a></em>.<h2 id=summary-on-the-topic>Summary on the topic</h2><p>Boolean retrieval is one of the many ways information retrieval (finding materials that satisfy an information need), often simply called <em>search</em>.<p>A simple way to retrieve information is to <em>grep</em> through the text (term named after the Unix tool <code>grep</code>), scanning text linearly and excluding it on certain criteria. However, this falls short when the volume of the data grows, more complex queries are desired, or one seeks some sort of ranking.<p>To avoid linear scanning, we build an <em>index</em> and record for each document whether it contains each term out of our full dictionary of terms (which may be words in a chapter and words in the book). This results in a binary term-document <em>incidence matrix</em>. Such a possible matrix is:<table><tbody><tr><td><em> word/play </em><td><strong> Antony and Cleopatra </strong><td><strong> Julius Caesar </strong><td><strong> The Tempest </strong><td><strong> … </strong><tr><td><strong> Antony </strong><td>1<td>1<td>0<td><tr><td><strong> Brutus </strong><td>1<td>1<td>0<td><tr><td><strong> Caesar </strong><td>1<td>1<td>0<td><tr><td><strong> Calpurnia </strong><td>0<td>1<td>0<td><tr><td><strong> Cleopatra </strong><td>1<td>0<td>0<td><tr><td><strong> mercy </strong><td>1<td>0<td>1<td><tr><td><strong> worser </strong><td>1<td>0<td>1<td><tr><td><strong> … </strong><td><td><td><td></table><p>We can look at this matrix’s rows or columns to obtain a vector for each term indicating where it appears, or a vector for each document indicating the terms it contains.<p>Now, answering a query such as <code>Brutus AND Caesar AND NOT Calpurnia</code> becomes trivial:<pre><code>VECTOR(Brutus) AND VECTOR(Caesar) AND COMPLEMENT(VECTOR(Calpurnia)) = 110 AND 110 AND COMPLEMENT(010) = 110 AND 110 AND 101 = 100
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Build your own PC | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Build your own PC</h1><div class=time><p>2020-02-25T02:00:12+00:00<p>last updated 2020-03-18T09:38:46+00:00</div><p><em>…where PC obviously stands for Personal Crawler</em>.<hr><p>This post contains the source code for a very simple crawler written in Java. You can compile and run it on any file or directory, and it will calculate the frequency of all the words it finds.<h2 id=source-code>Source code</h2><p>Paste the following code in a new file called <code>Crawl.java</code>:<pre><code>import java.io.*; +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Build your own PC | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Build your own PC</h1><div class=time><p>2020-02-25T02:00:12+00:00<p>last updated 2020-03-18T09:38:46+00:00</div><p><em>…where PC obviously stands for Personal Crawler</em>.<hr><p>This post contains the source code for a very simple crawler written in Java. 
You can compile and run it on any file or directory, and it will calculate the frequency of all the words it finds.<h2 id=source-code>Source code</h2><p>Paste the following code in a new file called <code>Crawl.java</code>:<pre><code>import java.io.*; import java.util.*; import java.util.regex.Matcher; import java.util.regex.Pattern;
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Cassandra: an Introduction | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Cassandra: an Introduction</h1><div class=time><p>2020-03-05T00:00:45+00:00<p>last updated 2020-03-18T09:47:05+00:00</div><p>This is the first post in the Cassandra series, where we will introduce the Cassandra database system and take a look at its features and installation methods.<p>Other posts in this series:<ul><li><a href=/blog/ribw/cassandra-an-introduction/>Cassandra: an Introduction</a> (this post)</ul><p>This post is co-authored wih Classmate.<hr><p><img src=https://lonami.dev/blog/ribw/cassandra-an-introduction/cassandra-database-e1584191543401.jpg alt="NoSQL database – Apache Cassandra – First delivery"><h2 id=purpose-of-technology>Purpose of technology</h2><p>Apache Cassandra is a <strong>NoSQL</strong>, <strong>open-source</strong>, <strong>distributed “key-value” database</strong>. It allows <strong>large volumes of distributed data</strong>. The main **goal **is provide <strong>linear scalability and availabilitywithout compromising performance</strong>. Besides, Cassandra <strong>supports replication</strong> across multiple datacenters, providing low latency.<h2 id=how-it-works>How it works</h2><p>Cassandra’s distributed **architecture **is based on a series of <strong>equal nodes</strong> that communicate with a <strong>P2P protocol</strong> so that <strong>redundancy is maximum</strong>. 
It offers robust support for multiple datacenters, with <strong>asynchronous replication</strong> without the need for a master server.<p>Besides, Cassandra’s <strong>data model consists of partitioning the rows</strong>, which are rearranged into <strong>different tables</strong>. The primary keys of each table have a first component that is the <strong>partition key</strong>. Within a partition, the rows are grouped by the remaining columns of the key. The other columns can be indexed separately from the primary key.<p>These tables can be <strong>created, deleted, updated and queried****at runtime without blocking</strong> each other. However it does <strong>not support joins or subqueries</strong>, but instead <strong>emphasizes denormalization</strong> through features like collections.<p>Nowadays, Cassandra uses its own query language called <strong>CQL</strong> (<strong>Cassandra Query Language</strong>), with a <strong>similar syntax to SQL</strong>. It also allows access from <strong>JDBC</strong>.<p><img src=https://lonami.dev/blog/ribw/cassandra-an-introduction/s0GHpggGZXOFcdhypRWV4trU-PkSI6lukEv54pLZnoirh0GlDVAc4LamB1Dy.png> _ Cassandra architecture _<h2 id=features>Features</h2><ul><li><strong>Decentralized</strong>: there are <strong>no single points of failure</strong>, every **node **in the cluster has the <strong>same role</strong> and there is <strong>no master node</strong>, so each node <strong>can service any request</strong>, besides the data is distributed across the cluster.<li>Supports **replication **and multiple replication of <strong>data center</strong>: the replication strategies are <strong>configurable</strong>.<li>**Scalability: **reading and writing performance increases linearly as new nodes are added, also <strong>new nodes</strong> can be <strong>added without interrupting</strong> application <strong>execution</strong>.<li><strong>Fault tolerance: data replication</strong> is done **automatically **in several nodes in order to 
recover from failures. It is possible to <strong>replace failure nodes****without <strong>making</strong> inactivity time or interruptions</strong> to the application.<li>**Consistency: **a choice of consistency level is provided for <strong>reading and writing</strong>.<li><strong>MapReduce support</strong>: it is **integrated **with <strong>Apache Hadoop</strong> to support MapReduce.<li><strong>Query language</strong>: it has its own query language called **CQL (Cassandra Query Language) **</ul><h2 id=corner-in-cap-theorem>Corner in CAP theorem</h2><p><strong>Apache Cassandra</strong> is usually described as an “<strong>AP</strong>” system because it guarantees <strong>availability</strong> and <strong>partition/fault tolerance</strong>. So it errs on the side of ensuring data availability even if this means <strong>sacrificing consistency</strong>. But, despite this fact, Apache Cassandra <strong>seeks to satisfy all three requirements</strong> (Consistency, Availability and Fault tolerance) simultaneously and can be <strong>configured to behave</strong> like a “<strong>CP</strong>” database, guaranteeing <strong>consistency and partition/fault tolerance</strong>.<p><img src=https://lonami.dev/blog/ribw/cassandra-an-introduction/rf3n9LTOKCQVbx4qrn7NPSVcRcwE1LxR_khi-9Qc51Hcbg6BHHPu-0GZjUwD.png> <em>Cassandra in CAP Theorem</em><h2 id=download>Download</h2><p>In order to download the file, with extension .tar.gz. you must visit the <a href=https://cassandra.apache.org/download/>download site</a> and click on the file “<a href=https://ftp.cixug.es/apache/cassandra/3.11.6/apache-cassandra-3.11.6-bin.tar.gz>https://ftp.cixug.es/apache/cassandra/3.11.6/apache-cassandra-3.11.6-bin.tar.gz</a>”. 
It is important to mention that the previous link is related to the 3.11.6 version.<h2 id=installation>Installation</h2><p>This database can only be installed on Linux distributions and Mac OS X systems, so, it is not possible to install it on Microsoft Windows.<p>The first main requirement is having installed Java 8 in <strong>Ubuntu</strong>, the OS that we will use. Therefore, the Java 8 installation is explained below. First open a terminal and execute the next command:<pre><code>sudo apt update +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Cassandra: an Introduction | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Cassandra: an Introduction</h1><div class=time><p>2020-03-05T00:00:45+00:00<p>last updated 2020-03-18T09:47:05+00:00</div><p>This is the first post in the Cassandra series, where we will introduce the Cassandra database system and take a look at its features and installation methods.<p>Other posts in this series:<ul><li><a href=/blog/ribw/cassandra-an-introduction/>Cassandra: an Introduction</a> (this post)</ul><p>This post is co-authored wih Classmate.<hr><p><img src=https://lonami.dev/blog/ribw/cassandra-an-introduction/cassandra-database-e1584191543401.jpg alt="NoSQL database – Apache Cassandra – First delivery"><h2 id=purpose-of-technology>Purpose of technology</h2><p>Apache Cassandra is a <strong>NoSQL</strong>, <strong>open-source</strong>, <strong>distributed “key-value” database</strong>. 
It allows <strong>large volumes of distributed data</strong>. The main **goal **is provide <strong>linear scalability and availabilitywithout compromising performance</strong>. Besides, Cassandra <strong>supports replication</strong> across multiple datacenters, providing low latency.<h2 id=how-it-works>How it works</h2><p>Cassandra’s distributed **architecture **is based on a series of <strong>equal nodes</strong> that communicate with a <strong>P2P protocol</strong> so that <strong>redundancy is maximum</strong>. It offers robust support for multiple datacenters, with <strong>asynchronous replication</strong> without the need for a master server.<p>Besides, Cassandra’s <strong>data model consists of partitioning the rows</strong>, which are rearranged into <strong>different tables</strong>. The primary keys of each table have a first component that is the <strong>partition key</strong>. Within a partition, the rows are grouped by the remaining columns of the key. The other columns can be indexed separately from the primary key.<p>These tables can be <strong>created, deleted, updated and queried****at runtime without blocking</strong> each other. However it does <strong>not support joins or subqueries</strong>, but instead <strong>emphasizes denormalization</strong> through features like collections.<p>Nowadays, Cassandra uses its own query language called <strong>CQL</strong> (<strong>Cassandra Query Language</strong>), with a <strong>similar syntax to SQL</strong>. 
It also allows access from <strong>JDBC</strong>.<p><img src=https://lonami.dev/blog/ribw/cassandra-an-introduction/s0GHpggGZXOFcdhypRWV4trU-PkSI6lukEv54pLZnoirh0GlDVAc4LamB1Dy.png> _ Cassandra architecture _<h2 id=features>Features</h2><ul><li><strong>Decentralized</strong>: there are <strong>no single points of failure</strong>, every **node **in the cluster has the <strong>same role</strong> and there is <strong>no master node</strong>, so each node <strong>can service any request</strong>, besides the data is distributed across the cluster.<li>Supports **replication **and multiple replication of <strong>data center</strong>: the replication strategies are <strong>configurable</strong>.<li>**Scalability: **reading and writing performance increases linearly as new nodes are added, also <strong>new nodes</strong> can be <strong>added without interrupting</strong> application <strong>execution</strong>.<li><strong>Fault tolerance: data replication</strong> is done **automatically **in several nodes in order to recover from failures. It is possible to <strong>replace failure nodes****without <strong>making</strong> inactivity time or interruptions</strong> to the application.<li>**Consistency: **a choice of consistency level is provided for <strong>reading and writing</strong>.<li><strong>MapReduce support</strong>: it is **integrated **with <strong>Apache Hadoop</strong> to support MapReduce.<li><strong>Query language</strong>: it has its own query language called **CQL (Cassandra Query Language) **</ul><h2 id=corner-in-cap-theorem>Corner in CAP theorem</h2><p><strong>Apache Cassandra</strong> is usually described as an “<strong>AP</strong>” system because it guarantees <strong>availability</strong> and <strong>partition/fault tolerance</strong>. So it errs on the side of ensuring data availability even if this means <strong>sacrificing consistency</strong>. 
But, despite this fact, Apache Cassandra <strong>seeks to satisfy all three requirements</strong> (Consistency, Availability and Fault tolerance) simultaneously and can be <strong>configured to behave</strong> like a “<strong>CP</strong>” database, guaranteeing <strong>consistency and partition/fault tolerance</strong>.<p><img src=https://lonami.dev/blog/ribw/cassandra-an-introduction/rf3n9LTOKCQVbx4qrn7NPSVcRcwE1LxR_khi-9Qc51Hcbg6BHHPu-0GZjUwD.png> <em>Cassandra in CAP Theorem</em><h2 id=download>Download</h2><p>In order to download the file, with extension .tar.gz. you must visit the <a href=https://cassandra.apache.org/download/>download site</a> and click on the file “<a href=https://ftp.cixug.es/apache/cassandra/3.11.6/apache-cassandra-3.11.6-bin.tar.gz>https://ftp.cixug.es/apache/cassandra/3.11.6/apache-cassandra-3.11.6-bin.tar.gz</a>”. It is important to mention that the previous link is related to the 3.11.6 version.<h2 id=installation>Installation</h2><p>This database can only be installed on Linux distributions and Mac OS X systems, so, it is not possible to install it on Microsoft Windows.<p>The first main requirement is having installed Java 8 in <strong>Ubuntu</strong>, the OS that we will use. Therefore, the Java 8 installation is explained below. First open a terminal and execute the next command:<pre><code>sudo apt update sudo apt install openjdk-8-jdk openjdk-8-jre </code></pre><p>In order to establish Java as a environment variable it is needed to open the file “/.bashrc”:<pre><code>nano ~/.bashrc </code></pre><p>And add at the end of it the path where Java is installed, as follows:<pre><code>export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Developing a Python application for MongoDB | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Developing a Python application for MongoDB</h1><div class=time><p>2020-03-25T00:00:04+00:00<p>last updated 2020-04-16T08:01:23+00:00</div><p>This is the third and last post in the MongoDB series, where we will develop a Python application to process and store OpenData inside Mongo.<p>Other posts in this series:<ul><li><a href=/blog/ribw/mongodb-an-introduction/>MongoDB: an Introduction</a><li><a href=/blog/ribw/mongodb-basic-operations-and-architecture/>MongoDB: Basic Operations and Architecture</a><li><a href=/blog/ribw/developing-a-python-application-for-mongodb/>Developing a Python application for MongoDB</a> (this post)</ul><p>This post is co-authored wih a Classmate.<hr><h2 id=what-are-we-making>What are we making?</h2><p>We are going to develop a web application that renders a map, in this case, the town of Cáceres, with which users can interact. When the user clicks somewhere on the map, the selected location will be sent to the server to process. 
This server will perform geospatial queries to Mongo and once the results are ready, the information is presented back at the webpage.<p>The data used for the application comes from <a href=https://opendata.caceres.es/>Cáceres’ OpenData</a>, and our goal is that users will be able to find information about certain areas in a quick and intuitive way, such as precise coordinates, noise level, and such.<h2 id=what-are-we-using>What are we using?</h2><p>The web application will be using <a href=https://python.org/>Python</a> for the backend, <a href=https://svelte.dev/>Svelte</a> for the frontend, and <a href=https://www.mongodb.com/>Mongo</a> as our storage database and processing center.<ul><li><strong>Why Python?</strong> It’s a comfortable language to write and to read, and has a great ecosystem with <a href=https://pypi.org/>plenty of libraries</a>.<li><strong>Why Svelte?</strong> Svelte is the New Thing<strong>™</strong> in the world of component frameworks for JavaScript. It is similar to React or Vue, but compiled and with a lot less boilerplate. Check out their <a href=https://svelte.dev/blog/svelte-3-rethinking-reactivity>Svelte post</a> to learn more.<li><strong>Why Mongo?</strong> We believe NoSQL is the right approach for doing the kind of processing and storage that we expect, and it’s <a href=https://docs.mongodb.com/>very easy to use</a>. In addition, we will be making Geospatial Queries which <a href=https://docs.mongodb.com/manual/geospatial-queries/>Mongo supports</a>.</ul><p>Why didn’t we choose to make a smaller project, you may ask? You will be shocked to hear that we do not have an answer for that!<p>Note that we will not be embedding <strong>all</strong> the code of the project in this post, or it would be too long! 
We will include only the relevant snippets needed to understand the core ideas of the project, and not the unnecessary parts of it (for example, parsing configuration files to easily change the port where the server runs is not included).<h2 id=python-dependencies>Python dependencies</h2><p>Because we will program it in Python, you need Python installed. You can install it using a package manager of your choice or heading over to the <a href=https://www.python.org/downloads/>Python downloads section</a>, but if you’re on Linux, chances are you have it installed already.<p>Once Python 3.7 or above is installed, install <a href=https://motor.readthedocs.io/en/stable/><code>motor</code> (Asynchronous Python driver for MongoDB)</a> and the <a href=https://docs.aiohttp.org/en/stable/web.html><code>aiohttp</code> server</a> through <code>pip</code>:<pre><code>pip install aiohttp motor +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Developing a Python application for MongoDB | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Developing a Python application for MongoDB</h1><div class=time><p>2020-03-25T00:00:04+00:00<p>last updated 2020-04-16T08:01:23+00:00</div><p>This is the third and last post in the MongoDB series, where we will develop a Python application to process and store OpenData inside Mongo.<p>Other posts in this series:<ul><li><a href=/blog/ribw/mongodb-an-introduction/>MongoDB: an Introduction</a><li><a 
href=/blog/ribw/mongodb-basic-operations-and-architecture/>MongoDB: Basic Operations and Architecture</a><li><a href=/blog/ribw/developing-a-python-application-for-mongodb/>Developing a Python application for MongoDB</a> (this post)</ul><p>This post is co-authored wih a Classmate.<hr><h2 id=what-are-we-making>What are we making?</h2><p>We are going to develop a web application that renders a map, in this case, the town of Cáceres, with which users can interact. When the user clicks somewhere on the map, the selected location will be sent to the server to process. This server will perform geospatial queries to Mongo and once the results are ready, the information is presented back at the webpage.<p>The data used for the application comes from <a href=https://opendata.caceres.es/>Cáceres’ OpenData</a>, and our goal is that users will be able to find information about certain areas in a quick and intuitive way, such as precise coordinates, noise level, and such.<h2 id=what-are-we-using>What are we using?</h2><p>The web application will be using <a href=https://python.org/>Python</a> for the backend, <a href=https://svelte.dev/>Svelte</a> for the frontend, and <a href=https://www.mongodb.com/>Mongo</a> as our storage database and processing center.<ul><li><strong>Why Python?</strong> It’s a comfortable language to write and to read, and has a great ecosystem with <a href=https://pypi.org/>plenty of libraries</a>.<li><strong>Why Svelte?</strong> Svelte is the New Thing<strong>™</strong> in the world of component frameworks for JavaScript. It is similar to React or Vue, but compiled and with a lot less boilerplate. Check out their <a href=https://svelte.dev/blog/svelte-3-rethinking-reactivity>Svelte post</a> to learn more.<li><strong>Why Mongo?</strong> We believe NoSQL is the right approach for doing the kind of processing and storage that we expect, and it’s <a href=https://docs.mongodb.com/>very easy to use</a>. 
In addition, we will be making Geospatial Queries which <a href=https://docs.mongodb.com/manual/geospatial-queries/>Mongo supports</a>.</ul><p>Why didn’t we choose to make a smaller project, you may ask? You will be shocked to hear that we do not have an answer for that!<p>Note that we will not be embedding <strong>all</strong> the code of the project in this post, or it would be too long! We will include only the relevant snippets needed to understand the core ideas of the project, and not the unnecessary parts of it (for example, parsing configuration files to easily change the port where the server runs is not included).<h2 id=python-dependencies>Python dependencies</h2><p>Because we will program it in Python, you need Python installed. You can install it using a package manager of your choice or heading over to the <a href=https://www.python.org/downloads/>Python downloads section</a>, but if you’re on Linux, chances are you have it installed already.<p>Once Python 3.7 or above is installed, install <a href=https://motor.readthedocs.io/en/stable/><code>motor</code> (Asynchronous Python driver for MongoDB)</a> and the <a href=https://docs.aiohttp.org/en/stable/web.html><code>aiohttp</code> server</a> through <code>pip</code>:<pre><code>pip install aiohttp motor </code></pre><p>Make sure that Mongo is running in the background (this has been described in previous posts), and we should be able to get to work.<h2 id=web-dependencies>Web dependencies</h2><p>To work with Svelte and its dependencies, we will need <code>[npm](https://www.npmjs.com/)</code> which comes with <a href=https://nodejs.org/en/>NodeJS</a>, so go and <a href=https://nodejs.org/en/download/>install Node from their site</a>. 
The download will be different depending on your operating system.<p>Following <a href=https://svelte.dev/blog/the-easiest-way-to-get-started>the easiest way to get started with Svelte</a>, we will put our project in a <code>client/</code> folder (because this is what the clients see, the frontend). Feel free to tinker a bit with the configuration files to change the name and such, although this isn’t relevant for the rest of the post.<h2 id=finding-the-data>Finding the data</h2><p>We are going to work with the JSON files provided by <a href=http://opendata.caceres.es/>OpenData Cáceres</a>. In particular, we want information about the noise, census, vias and trees. To save you the time from <a href=http://opendata.caceres.es/dataset>searching each of these</a>, we will automate the download with code.<p>If you want to save the data offline or just know what data we’ll be using for other purposes though, you can right click on the following links and select «Save Link As…» with the name of the link:<ul><li><code>[noise.json](http://opendata.caceres.es/GetData/GetData?dataset=om:MedicionRuido&format=json)</code><li><code>[census.json](http://opendata.caceres.es/GetData/GetData?dataset=om:InformacionPadron&year=2017&format=json)</code><li><code>[vias.json](http://opendata.caceres.es/GetData/GetData?dataset=om:InformacionPadron&year=2017&format=json)</code><li><code>[trees.json](http://opendata.caceres.es/GetData/GetData?dataset=om:Arbol&format=json)</code></ul><h2 id=backend>Backend</h2><p>It’s time to get started with some code! We will put it in a <code>server/</code> folder because it will contain the Python server, that is, the backend of our application.<p>We are using <code>aiohttp</code> because we would like our server to be <code>async</code>. We don’t expect a lot of users at the same time, but it’s good to know our server would be well-designed for that use-case. As a bonus, it makes IO points clear in the code, which can help reason about it. 
The implicit synchronization between <code>await</code> points is also a nice bonus.<h3 id=saving-the-data-in-mongo>Saving the data in Mongo</h3><p>Before running the server, we must ensure that the data we need is already stored and indexed in Mongo. Our <code>server/data.py</code> will take care of downloading the files, cleaning them up a little (Cáceres’ OpenData can be a bit awkward sometimes), inserting them into Mongo and indexing them.<p>Downloading the JSON data can be done with <a href="https://aiohttp.readthedocs.io/en/stable/client_reference.html#aiohttp.ClientSession.get"><code>ClientSession.get</code></a>. We also take this opportunity to clean up the messy encoding of the JSON, which does not seem to be UTF-8 in some cases.<pre><code>async def load_json(session, url): fixes = [(old, new.encode('utf-8')) for old, new in [ (b'\xc3\x83\\u2018', 'Ñ'),
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Privado: Final NoSQL evaluation | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Privado: Final NoSQL evaluation</h1><div class=time><p>2020-05-13T00:00:00+00:00<p>last updated 2020-05-14T07:30:08+00:00</div><p>This evaluation is a bit different to my <a href=/blog/ribw/16/nosql-evaluation/>previous one</a> because this time I have been tasked to evaluate the student <code>a(i - 2)</code>, and because I am <code>a = 9</code> that happens to be <code>a(7) =</code> Classmate.<p>Unfortunately for Classmate, the only entry related to NoSQL I have found in their blog is Prima y segunda Actividad: Base de datos NoSQL which does not develop an application as requested for the third entry (as of 14th of May).<p>This means that, instead, I will evaluate <code>a(i - 3)</code> which happens to be <code>a(6) =</code> Classmate and they do have an entry.<h2 id=classmate-s-evaluation>Classmate’s Evaluation</h2><p><strong>Grading: B.</strong><p>The post I have evaluated is BB.DD. NoSQL RethinkDB 3ª Fase. 
Aplicación.<p>It starts with an introduction, properly explaining what database they have chosen and why, but not what application they will be making.<p>This is detailed just below in the next section, although it’s a bit vague.<p>The next section talks about the Python dependencies that are required, but they never said they would be making a Python application or that we need to install Python!<p>The next section talks about the file structure of the project, and they detail what everything part does, although I have missed some code snippets.<p>The final result is pretty cool and contains many interesting graphs, they provide a download to the source code and list all the relevant references used.<p>Except for a weird «necesario falta» in the text, it’s otherwise well-written, although given the issues above I cannot grade it with the highest score.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Privado: Final NoSQL evaluation | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Privado: Final NoSQL evaluation</h1><div class=time><p>2020-05-13T00:00:00+00:00<p>last updated 2020-05-14T07:30:08+00:00</div><p>This evaluation is a bit different to my <a 
href=/blog/ribw/16/nosql-evaluation/>previous one</a> because this time I have been tasked to evaluate the student <code>a(i - 2)</code>, and because I am <code>a = 9</code> that happens to be <code>a(7) =</code> Classmate.<p>Unfortunately for Classmate, the only entry related to NoSQL I have found in their blog is Prima y segunda Actividad: Base de datos NoSQL which does not develop an application as requested for the third entry (as of 14th of May).<p>This means that, instead, I will evaluate <code>a(i - 3)</code> which happens to be <code>a(6) =</code> Classmate and they do have an entry.<h2 id=classmate-s-evaluation>Classmate’s Evaluation</h2><p><strong>Grading: B.</strong><p>The post I have evaluated is BB.DD. NoSQL RethinkDB 3ª Fase. Aplicación.<p>It starts with an introduction, properly explaining what database they have chosen and why, but not what application they will be making.<p>This is detailed just below in the next section, although it’s a bit vague.<p>The next section talks about the Python dependencies that are required, but they never said they would be making a Python application or that we need to install Python!<p>The next section talks about the file structure of the project, and they detail what everything part does, although I have missed some code snippets.<p>The final result is pretty cool and contains many interesting graphs, they provide a download to the source code and list all the relevant references used.<p>Except for a weird «necesario falta» in the text, it’s otherwise well-written, although given the issues above I cannot grade it with the highest score.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,2 +1,2 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Google’s BigTable | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Google’s BigTable</h1><div class=time><p>2020-04-01T00:00:00+00:00<p>last updated 2020-04-03T09:30:05+00:00</div><p>Let’s talk about BigTable, and why it is what it is. But before we get into that, let’s see some important aspects anybody should consider when dealing with a lot of data (something BigTable does!).<h2 id=the-basics>The basics</h2><p>Converting a text document into a different format is often a great way to greatly speed up scanning of it in the future. It allows for efficient searches.<p>In addition, you generally want to store everything in a single, giant file. This will save a lot of time opening and closing files, because everything is in the same file! One proposal to make this happen is <a href=https://trec.nist.gov/file_help.html>Web TREC</a> (see also the <a href=https://en.wikipedia.org/wiki/Text_Retrieval_Conference>Wikipedia page on TREC</a>), which is basically HTML but every document is properly delimited from one another.<p>Because we will have a lot of data, it’s often a good idea to compress it. Most text consists of the same words, over and over again. Classic compression techniques such as <code>DEFLATE</code> or <code>LZW</code> do an excellent job here.<h2 id=so-what-s-bigtable>So what’s BigTable?</h2><p>Okay, enough of an introduction to the basics on storing data. 
BigTable is what Google uses to store documents, and it’s a customized approach to save, search and update web pages.<p>BigTable is a distributed storage system for managing structured data, able to scale to petabytes of data across thousands of commodity servers, with wide applicability, scalability, high performance, and high availability.<p>In a way, it’s kind of like a database and shares many implementation strategies with them, like parallel databases, or main-memory databases, but of course, with a different schema.<p>It consists of a big table known as the «Root tablet», with pointers to many other «tablets» (or metadata in between). These are stored in a replicated filesystem accessible by all BigTable servers. Any change to a tablet gets logged (said log also gets stored in a replicated filesystem).<p>If any of the tablet servers gets locked, a different one can take its place, read the log and deal with the problem.<p>There’s no query language, and transactions occur at row level only. Every read or write in a row is atomic. Each row stores a single web page, and by combining the row and column keys along with a timestamp, it is possible to retrieve a single cell in the row. 
More formally, it’s a map that looks like this:<pre><code>fetch(row: string, column: string, time: int64) -> string +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Google’s BigTable | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Google’s BigTable</h1><div class=time><p>2020-04-01T00:00:00+00:00<p>last updated 2020-04-03T09:30:05+00:00</div><p>Let’s talk about BigTable, and why it is what it is. But before we get into that, let’s see some important aspects anybody should consider when dealing with a lot of data (something BigTable does!).<h2 id=the-basics>The basics</h2><p>Converting a text document into a different format is often a great way to greatly speed up scanning of it in the future. It allows for efficient searches.<p>In addition, you generally want to store everything in a single, giant file. This will save a lot of time opening and closing files, because everything is in the same file! One proposal to make this happen is <a href=https://trec.nist.gov/file_help.html>Web TREC</a> (see also the <a href=https://en.wikipedia.org/wiki/Text_Retrieval_Conference>Wikipedia page on TREC</a>), which is basically HTML but every document is properly delimited from one another.<p>Because we will have a lot of data, it’s often a good idea to compress it. Most text consists of the same words, over and over again. 
Classic compression techniques such as <code>DEFLATE</code> or <code>LZW</code> do an excellent job here.<h2 id=so-what-s-bigtable>So what’s BigTable?</h2><p>Okay, enough of an introduction to the basics of storing data. BigTable is what Google uses to store documents, and it’s a customized approach to save, search and update web pages.<p>BigTable is a distributed storage system for managing structured data, able to scale to petabytes of data across thousands of commodity servers, with wide applicability, scalability, high performance, and high availability.<p>In a way, it’s kind of like a database and shares many implementation strategies with them, like parallel databases, or main-memory databases, but of course, with a different schema.<p>It consists of a big table known as the «Root tablet», with pointers to many other «tablets» (or metadata in between). These are stored in a replicated filesystem accessible by all BigTable servers. Any change to a tablet gets logged (said log also gets stored in a replicated filesystem).<p>If any of the tablet servers gets locked, a different one can take its place, read the log and deal with the problem.<p>There’s no query language, and transactions occur at row level only. Every read or write in a row is atomic. Each row stores a single web page, and by combining the row and column keys along with a timestamp, it is possible to retrieve a single cell in the row. More formally, it’s a map that looks like this:<pre><code>fetch(row: string, column: string, time: int64) -> string </code></pre><p>A row may have as many columns as it needs, and these column groups are the same for everyone (but the columns themselves may vary), which is important to reduce disk read time.<p>Rows are split into different tablets based on the row keys, which simplifies determining an appropriate server for them. 
The keys can be up to 64KB big, although most commonly they range from 10 to 100 bytes.<h2 id=conclusions>Conclusions</h2><p>BigTable is Google’s way to deal with large amounts of data on many of their services, and the ideas behind it are not too complex to understand.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
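<p>As a rough illustration of the <code>fetch(row, column, time)</code> map described in the BigTable post, here is a minimal in-memory sketch in Python. The <code>ToyTable</code> class, its <code>put</code> method and the «newest version written at or before the given time» lookup rule are assumptions made for this example; it only mirrors the shape of the map and is not how BigTable itself is implemented.

```python
class ToyTable:
    """Hypothetical in-memory stand-in for the (row, column, time) -> string map."""

    def __init__(self):
        # (row, column) -> {timestamp: value}
        self._cells = {}

    def put(self, row, column, time, value):
        # Store one timestamped version of a cell.
        self._cells.setdefault((row, column), {})[time] = value

    def fetch(self, row, column, time):
        # Assumption: return the newest version written at or before `time`,
        # or None when no such version exists.
        versions = self._cells.get((row, column), {})
        candidates = [t for t in versions if t <= time]
        return versions[max(candidates)] if candidates else None


table = ToyTable()
table.put('com.example/index', 'contents', 1, '<html>v1</html>')
table.put('com.example/index', 'contents', 5, '<html>v2</html>')
old_version = table.fetch('com.example/index', 'contents', 3)
new_version = table.fetch('com.example/index', 'contents', 5)
```

Keeping every version under its timestamp is what lets a single cell be addressed by row key, column key and time together.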
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> How does Google’s Search Engine work? | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>How does Google’s Search Engine work?</h1><div class=time><p>2020-03-18T01:00:00+00:00<p>last updated 2020-03-28T10:17:09+00:00</div><p>The original implementation was written in C/++ for Linux/Solaris.<p>There are three major components in the system’s anatomy, which can be thought as steps to be performed for Google to be what it is today.<p><img src=https://lonami.dev/blog/ribw/how-does-googles-search-engine-work/image-1024x649.png><p>But before we talk about the different components, let’s take a look at how they store all of this information.<h2 id=data-structures>Data structures</h2><p>A «BigFile» is a virtual file addressable by 64 bits.<p>There exists a repository with the full HTML of every page compressed, along with a document identifier, length and URL.<table><tbody><tr><td>sync<td>length<td>compressed packet</table><p>The Document Index has the document identifier, a pointer into the repository, a checksum and various other statistics.<table><tbody><tr><td>doc id<td>ecode<td>url len<td>page len<td>url<td>page</table><p>A Lexicon stores the repository of words, implemented with a hashtable over pointers linking to the barrels (sorted linked lists) of the Inverted Index.<table><tbody><tr><td>word id<td>n docs<tr><td>word id<td>n docs</table><p>The Hit Lists store occurences of a word in a document.<table><tbody><tr><td><strong> plain </strong><td>cap: 1<td>imp: 3<td>pos: 12<tr><td><strong> fancy </strong><td>cap: 1<td>imp: 7<td>type: 4<td>pos: 
8<tr><td><strong> anchor </strong><td>cap: 1<td>imp: 7<td>type: 4<td>hash: 4<td>pos: 8</table><p>The Forward Index is a barrel with a range of word identifiers (document identifier and list of word identifiers).<table><tbody><tr><td rowspan=3>doc id<td>word id: 24<td>n hits: 8<td>hit hit hit hit hit hit hit hit<tr><td>word id: 24<td>n hits: 8<td>hit hit hit hit hit hit hit hit<tr><td>null word id</table><p>The Inverted Index can be sorted by either document identifier or by ranking of word occurence.<table><tbody><tr><td>doc id: 23<td>n hits: 5<td>hit hit hit hit hit<tr><td>doc id: 23<td>n hits: 3<td>hit hit hit<tr><td>doc id: 23<td>n hits: 4<td>hit hit hit hit<tr><td>doc id: 23<td>n hits: 2<td>hit hit</table><p>Back in 1998, Google compressed its repository to 53GB and had 24 million pages. The indices, lexicon, and other temporary storage required about 55GB.<h2 id=crawling>Crawling</h2><p>The crawling must be reliable, fast and robust, and also respect the decision of some authors not wanting their pages crawled. Originally, it took a week or more, so simultaneous execution became a must.<p>Back in 1998, Google had between 3 and 4 crawlers running at 100 web pages per second maximum. These were implemented in Python.<p>The crawled pages need parsing to deal with typos or formatting issues.<h2 id=indexing>Indexing</h2><p>Indexing is about putting the pages into barrels, converting words into word identifiers, and occurences into hit lists.<p>Once indexing is done, sorting of the barrels happens to have them ordered by word identifier, producing the inverted index. This process also had to be done in parallel over many machines, or would otherwise have been too slow.<h2 id=searching>Searching</h2><p>We need to find quality results efficiently. Plenty of weights are considered nowadays, but at its heart, PageRank is used. 
It is the algorithm they use to map the web, which is formally defined as follows:<p><img src=https://lonami.dev/blog/ribw/how-does-googles-search-engine-work/8e1e61b119e107fcb4bdd7e78f649985.png> <em>PR(A) = (1-d) + d(PR(T1)/C(T1) + … + PR(Tn)/C(Tn))</em><p>Where:<ul><li><code>A</code> is a given page<li><code>T<sub>n</sub></code> are pages that point to <code>A</code><li><code>d</code> is the damping factor in the range <code>[0, 1]</code> (often 0.85)<li><code>C(A)</code> is the number of links going out of page <code>A</code><li><code>PR(A)</code> is the page rank of page <code>A</code></ul><p>This formula indicates the probability that a random surfer visits a certain page, and <code>1 - d</code> models the moment the surfer «gets bored» and stops following links. More intuitively, the page rank of a page will grow as more pages link to it, or when the few that link to it have high page rank.<p>The anchor text in the links also helps provide a better description of the target page, which helps indexing for even better results.<p>While searching, the main concern is disk I/O, which takes up most of the time. Caching is very important, as it can improve performance up to 30 times.<p>Now, in order to turn user queries into something we can search, we must parse the query and convert the words into word identifiers.<h2 id=conclusion>Conclusion</h2><p>Google is designed to be an efficient, scalable, high-quality search engine. 
There are still bottlenecks in CPU, memory, disk speed and network I/O, but major data structures are used to make efficient use of the resources.<h2 id=references>References</h2><ul><li><a href=https://snap.stanford.edu/class/cs224w-readings/Brin98Anatomy.pdf>The anatomy of a large-scale hypertextual Web search engine</a><li><a href=https://www.site.uottawa.ca/%7Ediana/csi4107/Google_SearchEngine.pdf>The Anatomy of a Large-Scale Hypertextual Web Search Engine (slides)</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> How does Google’s Search Engine work? 
| Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>How does Google’s Search Engine work?</h1><div class=time><p>2020-03-18T01:00:00+00:00<p>last updated 2020-03-28T10:17:09+00:00</div><p>The original implementation was written in C/++ for Linux/Solaris.<p>There are three major components in the system’s anatomy, which can be thought as steps to be performed for Google to be what it is today.<p><img src=https://lonami.dev/blog/ribw/how-does-googles-search-engine-work/image-1024x649.png><p>But before we talk about the different components, let’s take a look at how they store all of this information.<h2 id=data-structures>Data structures</h2><p>A «BigFile» is a virtual file addressable by 64 bits.<p>There exists a repository with the full HTML of every page compressed, along with a document identifier, length and URL.<table><tbody><tr><td>sync<td>length<td>compressed packet</table><p>The Document Index has the document identifier, a pointer into the repository, a checksum and various other statistics.<table><tbody><tr><td>doc id<td>ecode<td>url len<td>page len<td>url<td>page</table><p>A Lexicon stores the repository of words, implemented with a hashtable over pointers linking to the barrels (sorted linked lists) of the Inverted Index.<table><tbody><tr><td>word id<td>n docs<tr><td>word id<td>n docs</table><p>The Hit Lists store occurences of a word in a document.<table><tbody><tr><td><strong> plain </strong><td>cap: 1<td>imp: 3<td>pos: 12<tr><td><strong> fancy </strong><td>cap: 1<td>imp: 7<td>type: 4<td>pos: 8<tr><td><strong> anchor </strong><td>cap: 1<td>imp: 7<td>type: 4<td>hash: 4<td>pos: 8</table><p>The Forward Index is a 
barrel with a range of word identifiers (document identifier and list of word identifiers).<table><tbody><tr><td rowspan=3>doc id<td>word id: 24<td>n hits: 8<td>hit hit hit hit hit hit hit hit<tr><td>word id: 24<td>n hits: 8<td>hit hit hit hit hit hit hit hit<tr><td>null word id</table><p>The Inverted Index can be sorted by either document identifier or by ranking of word occurence.<table><tbody><tr><td>doc id: 23<td>n hits: 5<td>hit hit hit hit hit<tr><td>doc id: 23<td>n hits: 3<td>hit hit hit<tr><td>doc id: 23<td>n hits: 4<td>hit hit hit hit<tr><td>doc id: 23<td>n hits: 2<td>hit hit</table><p>Back in 1998, Google compressed its repository to 53GB and had 24 million pages. The indices, lexicon, and other temporary storage required about 55GB.<h2 id=crawling>Crawling</h2><p>The crawling must be reliable, fast and robust, and also respect the decision of some authors not wanting their pages crawled. Originally, it took a week or more, so simultaneous execution became a must.<p>Back in 1998, Google had between 3 and 4 crawlers running at 100 web pages per second maximum. These were implemented in Python.<p>The crawled pages need parsing to deal with typos or formatting issues.<h2 id=indexing>Indexing</h2><p>Indexing is about putting the pages into barrels, converting words into word identifiers, and occurences into hit lists.<p>Once indexing is done, sorting of the barrels happens to have them ordered by word identifier, producing the inverted index. This process also had to be done in parallel over many machines, or would otherwise have been too slow.<h2 id=searching>Searching</h2><p>We need to find quality results efficiently. Plenty of weights are considered nowadays, but at its heart, PageRank is used. 
It is the algorithm they use to map the web, which is formally defined as follows:<p><img src=https://lonami.dev/blog/ribw/how-does-googles-search-engine-work/8e1e61b119e107fcb4bdd7e78f649985.png> <em>PR(A) = (1-d) + d(PR(T1)/C(T1) + … + PR(Tn)/C(Tn))</em><p>Where:<ul><li><code>A</code> is a given page<li><code>T<sub>n</sub></code> are pages that point to <code>A</code><li><code>d</code> is the damping factor in the range <code>[0, 1]</code> (often 0.85)<li><code>C(A)</code> is the number of links going out of page <code>A</code><li><code>PR(A)</code> is the page rank of page <code>A</code></ul><p>This formula indicates the probability that a random surfer visits a certain page, and <code>1 - d</code> models the moment the surfer «gets bored» and stops following links. More intuitively, the page rank of a page will grow as more pages link to it, or when the few that link to it have high page rank.<p>The anchor text in the links also helps provide a better description of the target page, which helps indexing for even better results.<p>While searching, the main concern is disk I/O, which takes up most of the time. Caching is very important, as it can improve performance up to 30 times.<p>Now, in order to turn user queries into something we can search, we must parse the query and convert the words into word identifiers.<h2 id=conclusion>Conclusion</h2><p>Google is designed to be an efficient, scalable, high-quality search engine. 
There are still bottlenecks in CPU, memory, disk speed and network I/O, but major data structures are used to make efficient use of the resources.<h2 id=references>References</h2><ul><li><a href=https://snap.stanford.edu/class/cs224w-readings/Brin98Anatomy.pdf>The anatomy of a large-scale hypertextual Web search engine</a><li><a href=https://www.site.uottawa.ca/%7Ediana/csi4107/Google_SearchEngine.pdf>The Anatomy of a Large-Scale Hypertextual Web Search Engine (slides)</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
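<p>To make the PageRank formula from the search engine post a bit more tangible, here is a small Python sketch that applies <em>PR(A) = (1-d) + d(PR(T1)/C(T1) + … + PR(Tn)/C(Tn))</em> repeatedly over a tiny link graph. The <code>pagerank</code> function, the <code>links</code> dictionary, the starting rank of 1.0 and the fixed number of iterations are all assumptions made for illustration; this is not Google's actual implementation.

```python
def pagerank(links, d=0.85, iterations=50):
    """Repeatedly apply PR(A) = (1 - d) + d * sum(PR(T) / C(T)).

    `links` maps each page to the list of pages it links to (hypothetical input).
    """
    # Every page that appears either as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    ranks = {page: 1.0 for page in pages}  # assumed starting rank

    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Sum PR(T) / C(T) over every page T that links to `page`.
            incoming = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            new_ranks[page] = (1 - d) + d * incoming
        ranks = new_ranks
    return ranks


# Toy graph: a links to b and c, b links to c, c links back to a.
links = {'a': ['b', 'c'], 'b': ['c'], 'c': ['a']}
ranks = pagerank(links)
```

In this toy graph <code>c</code> receives links from both <code>a</code> and <code>b</code>, so it ends up ranked above <code>b</code>, which matches the intuition that rank grows with the number and rank of incoming links.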
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Information Retrieval and Web Search</h1><p id=welcome onclick=pls_stop()>Welcome to my blog!<p>Here I occasionally post new entries, mostly tech related. Perhaps it's tips for a new game I'm playing, perhaps it has something to do with FFI, or perhaps I'm fighting the borrow checker (just kidding, I'm over that. Mostly).<hr><ul><li><a href=https://lonami.dev/blog/ribw/final-nosql-evaluation/>Privado: Final NoSQL evaluation</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/a-practical-example-with-hadoop/>A practical example with Hadoop</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/introduction-to-hadoop-and-its-mapreduce/>Introduction to Hadoop and its MapReduce</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/googles-bigtable/>Google’s BigTable</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/developing-a-python-application-for-mongodb/>Developing a Python application for MongoDB</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/what-is-elasticsearch-and-why-should-you-care/>What is ElasticSearch and why should you care?</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/how-does-googles-search-engine-work/>How does Google’s Search Engine work?</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/integrating-apache-tika-into-our-crawler/>Integrating Apache Tika into our Crawler</a><span class=dim> </span><li><a 
href=https://lonami.dev/blog/ribw/pc-crawler-evaluation-2/>Privado: PC-Crawler evaluation 2</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/nosql-evaluation/>Privado: NoSQL evaluation</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/upgrading-our-baby-crawler/>Upgrading our Baby Crawler</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/mongodb-basic-operations-and-architecture/>MongoDB: Basic Operations and Architecture</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/nosql-databases-basic-operations-and-architecture/>Cassandra: Basic Operations and Architecture</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/mongodb-an-introduction/>MongoDB: an Introduction</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/cassandra-an-introduction/>Cassandra: an Introduction</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/pc-crawler-evaluation/>Privado: PC-Crawler evaluation</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/build-your-own-pc/>Build your own PC</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/introduction-to-nosql/>Introduction to NoSQL</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/about-boolean-retrieval/>About Boolean Retrieval</a><span class=dim> </span></ul><script> +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 
class=title>Information Retrieval and Web Search</h1><p id=welcome onclick=pls_stop()>Welcome to my blog!<p>Here I occasionally post new entries, mostly tech related. Perhaps it's tips for a new game I'm playing, perhaps it has something to do with FFI, or perhaps I'm fighting the borrow checker (just kidding, I'm over that. Mostly).<hr><ul><li><a href=https://lonami.dev/blog/ribw/final-nosql-evaluation/>Privado: Final NoSQL evaluation</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/a-practical-example-with-hadoop/>A practical example with Hadoop</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/introduction-to-hadoop-and-its-mapreduce/>Introduction to Hadoop and its MapReduce</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/googles-bigtable/>Google’s BigTable</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/developing-a-python-application-for-mongodb/>Developing a Python application for MongoDB</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/what-is-elasticsearch-and-why-should-you-care/>What is ElasticSearch and why should you care?</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/how-does-googles-search-engine-work/>How does Google’s Search Engine work?</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/integrating-apache-tika-into-our-crawler/>Integrating Apache Tika into our Crawler</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/pc-crawler-evaluation-2/>Privado: PC-Crawler evaluation 2</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/nosql-evaluation/>Privado: NoSQL evaluation</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/upgrading-our-baby-crawler/>Upgrading our Baby Crawler</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/mongodb-basic-operations-and-architecture/>MongoDB: Basic Operations and Architecture</a><span class=dim> </span><li><a 
href=https://lonami.dev/blog/ribw/nosql-databases-basic-operations-and-architecture/>Cassandra: Basic Operations and Architecture</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/mongodb-an-introduction/>MongoDB: an Introduction</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/cassandra-an-introduction/>Cassandra: an Introduction</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/pc-crawler-evaluation/>Privado: PC-Crawler evaluation</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/build-your-own-pc/>Build your own PC</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/introduction-to-nosql/>Introduction to NoSQL</a><span class=dim> </span><li><a href=https://lonami.dev/blog/ribw/about-boolean-retrieval/>About Boolean Retrieval</a><span class=dim> </span></ul><script> const WELCOME_EN = 'Welcome to my blog!' const WELCOME_ES = '¡Bienvenido a mi blog!' const APOLOGIES = "ok sorry i'll stop"
@@ -1,2 +1,2 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Integrating Apache Tika into our Crawler | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Integrating Apache Tika into our Crawler</h1><div class=time><p>2020-03-18T00:00:00+00:00<p>last updated 2020-03-25T17:38:07+00:00</div><p><a href=/blog/ribw/upgrading-our-baby-crawler/>In our last crawler post</a>, we detailed how our crawler worked, and although it did a fine job, it’s time for some extra upgrading.<h2 id=what-kind-of-upgrades>What kind of upgrades?</h2><p>A small but useful one. We are adding support for file types that contain text but cannot be processed by normal text editors because they are structured and not just plain text (such as PDF files, Excel, Word documents…).<p>And for this task, we will make use of the help offered by <a href=https://tika.apache.org/>Tika</a>, our friendly Apache tool.<h2 id=what-is-tika>What is Tika?</h2><p><a href=https://tika.apache.org/>Tika</a> is a set of libraries offered by <a href=https://en.wikipedia.org/wiki/The_Apache_Software_Foundation>The Apache Software Foundation</a> that we can include in our project in order to extract the text and metadata of files from a <a href=https://tika.apache.org/1.24/formats.html>long list of supported formats</a>.<h2 id=changes-in-the-code>Changes in the code</h2><p>Not much has changed in the structure of the crawler, we simply have added a new method in <code>Utils</code> that uses the class <code>Tika</code> from the previously mentioned library so as to process and extract the text of more filetypes.<p>Then, we use this text just like we would for our 
standard text file (checking the thesaurus and adding it to the word map) and voilà! We have just added support for a big range of file types.<h2 id=incorporating-gradle>Incorporating Gradle</h2><p>In order for the previous code to work, we need to make use of external libraries. To make this process easier and because the project is growing, we decided to use <a href=https://gradle.org/>Gradle</a>, a build system that can be used for projects in various programming languages, such as Java.<p>We followed their <a href=https://guides.gradle.org/building-java-applications/>guide to Building Java Applications</a>, and in a few steps added the required <code>.gradle</code> files. Now we can compile and run the code without having to worry about juggling with Java and external dependencies in a single command:<pre><code>./gradlew run +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Integrating Apache Tika into our Crawler | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Integrating Apache Tika into our Crawler</h1><div class=time><p>2020-03-18T00:00:00+00:00<p>last updated 2020-03-25T17:38:07+00:00</div><p><a href=/blog/ribw/upgrading-our-baby-crawler/>In our last crawler post</a>, we detailed how our crawler worked, and although it did a fine job, it’s time for some extra upgrading.<h2 id=what-kind-of-upgrades>What kind of upgrades?</h2><p>A small but useful one. 
We are adding support for file types that contain text but cannot be processed by normal text editors because they are structured and not just plain text (such as PDF files, Excel, Word documents…).<p>And for this task, we will make use of the help offered by <a href=https://tika.apache.org/>Tika</a>, our friendly Apache tool.<h2 id=what-is-tika>What is Tika?</h2><p><a href=https://tika.apache.org/>Tika</a> is a set of libraries offered by <a href=https://en.wikipedia.org/wiki/The_Apache_Software_Foundation>The Apache Software Foundation</a> that we can include in our project in order to extract the text and metadata of files from a <a href=https://tika.apache.org/1.24/formats.html>long list of supported formats</a>.<h2 id=changes-in-the-code>Changes in the code</h2><p>Not much has changed in the structure of the crawler, we simply have added a new method in <code>Utils</code> that uses the class <code>Tika</code> from the previously mentioned library so as to process and extract the text of more filetypes.<p>Then, we use this text just like we would for our standard text file (checking the thesaurus and adding it to the word map) and voilà! We have just added support for a big range of file types.<h2 id=incorporating-gradle>Incorporating Gradle</h2><p>In order for the previous code to work, we need to make use of external libraries. To make this process easier and because the project is growing, we decided to use <a href=https://gradle.org/>Gradle</a>, a build system that can be used for projects in various programming languages, such as Java.<p>We followed their <a href=https://guides.gradle.org/building-java-applications/>guide to Building Java Applications</a>, and in a few steps added the required <code>.gradle</code> files. 
Now we can compile and run the code without having to worry about juggling with Java and external dependencies in a single command:<pre><code>./gradlew run </code></pre><h2 id=download>Download</h2><p>And here you can download the final result:<p><em>download removed</em></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Introduction to Hadoop and its MapReduce | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Introduction to Hadoop and its MapReduce</h1><div class=time><p>2020-04-01T01:00:00+00:00<p>last updated 2020-04-03T08:43:44+00:00</div><p>Hadoop is an open-source, free, Java-based programming framework that helps process large datasets in a distributed environment and tackle the problems that arise when trying to harness the knowledge from BigData. It is capable of running on thousands of nodes and dealing with petabytes of data. It is based on Google File System (GFS) and originated from the work on the Nutch open-source project on search engines.<p>Hadoop also offers a distributed filesystem (HDFS) enabling fast transfer among nodes, and a way to program with MapReduce.<p>It strives for the 4 V’s: Volume, Variety, Veracity and Velocity. For veracity, it is a secure environment that can be trusted.<h2 id=milestones>Milestones</h2><p>The creators of Hadoop are Doug Cutting and Mike Cafarella, who just wanted to design a search engine, Nutch, and quickly found the problems of dealing with large amounts of data.
They found their solution with the papers Google published.<p>The name comes from the plush toy of Cutting’s child, a yellow elephant.<ul><li>In July 2005, Nutch used GFS to perform MapReduce operations.<li>In February 2006, Nutch started a Lucene subproject which led to Hadoop.<li>In April 2007, Yahoo used Hadoop in a 1 000-node cluster.<li>In January 2008, Apache took over and made Hadoop a top-level project.<li>In July 2008, Apache tested a 4000-node cluster. The performance was the fastest compared to other technologies that year.<li>In May 2009, Hadoop sorted a petabyte of data in 17 hours.<li>In December 2011, Hadoop reached 1.0.<li>In May 2012, Hadoop 2.0 was released with the addition of YARN (Yet Another Resource Negotiator) on top of HDFS, splitting MapReduce and other processes into separate components, greatly improving the fault tolerance.</ul><p>From here onwards, many other alternatives have been born, like Spark, Hive & Drill, Kafka, HBase, built around the Hadoop ecosystem.<p>As of 2017, Amazon has clusters between 1 and 100 nodes, Yahoo has over 100 000 CPUs running Hadoop, AOL has clusters with 50 machines, and Facebook has a 320-machine cluster (2 560 cores) with 1.3PB of raw storage.<h2 id=why-not-use-rdbms>Why not use RDBMS?</h2><p>Relational database management systems simply cannot scale horizontally, and vertical scaling will require very expensive servers.
Hadoop’s API makes this really easy.<p>Hadoop also takes data loss very seriously and guards against it with replication, and if a node fails, its work is moved to a different node.<h2 id=major-components>Major components</h2><p>The previously-mentioned HDFS runs on commodity machines, which are cost-friendly. It is very fault-tolerant and efficient enough to process huge amounts of data, because it splits large files into smaller chunks (or blocks) that can be more easily handled. Multiple nodes can work on multiple chunks at the same time.<p>NameNode stores the metadata of the various datablocks (map of blocks) along with their location. It is the brain and the master in Hadoop’s master-slave architecture, also known as the namespace, and makes use of the DataNode.<p>A secondary NameNode is a replica that can be used if the first NameNode dies, so that Hadoop doesn’t shut down and can restart.<p>DataNodes store the blocks of data, and are the slaves in the architecture. This data is split into one or more files. Their only job is to manage access to this data. They are often distributed among racks to avoid data loss.<p>JobTracker creates and schedules jobs from the clients for either map or reduce operations.<p>TaskTracker runs MapReduce tasks assigned to the current data node.<p>When clients need data, they first interact with the NameNode, which replies with the location of the data in the correct DataNode. The client then proceeds to interact with the DataNode.<h2 id=mapreduce>MapReduce</h2><p>MapReduce, as the name implies, is split into two steps: the map and the reduce. The map stage is the «divide and conquer» strategy, while the reduce part is about combining and reducing the results.<p>The mapper has to process the input data (normally a file or directory), commonly line-by-line, and produce one or more outputs.
The reducer uses all the results from the mapper as its input to produce a new output file of its own.<p><img src=https://lonami.dev/blog/ribw/introduction-to-hadoop-and-its-mapreduce/bitmap.png><p>When reading the data, some may be junk that we can choose to ignore. If it is valid data, however, we label it with a particular type that can be useful for the upcoming process. Hadoop is responsible for splitting the data across the many nodes available to execute this process in parallel.<p>There is another part to MapReduce, known as the Shuffle-and-Sort. In this part, types or categories from one node get moved to a different node. This happens with all nodes, so that every node can work on a complete category. These categories are known as «keys», and allow Hadoop to scale linearly.<h2 id=references>References</h2><ul><li><a href=https://youtu.be/oT7kczq5A-0>YouTube – Hadoop Tutorial For Beginners | What Is Hadoop? | Hadoop Tutorial | Hadoop Training | Simplilearn</a><li><a href=https://youtu.be/bcjSe0xCHbE>YouTube – Learn MapReduce with Playing Cards</a><li><a href=https://youtu.be/j8ehT1_G5AY?list=PLi4tp-TF_qjM_ed4lIzn03w7OnEh0D8Xi>YouTube – Video Post #2: Hadoop para torpes (I)-¿Qué es y para qué sirve?</a><li><a href=https://youtu.be/NQ8mjVPCDvk?list=PLi4tp-TF_qjM_ed4lIzn03w7OnEh0D8Xi>Video Post #3: Hadoop para torpes (II)-¿Cómo funciona?
HDFS y MapReduce</a><li><a href=https://hadoop.apache.org/old/releases.html>Apache Hadoop Releases</a><li><a href=https://youtu.be/20qWx2KYqYg?list=PLi4tp-TF_qjM_ed4lIzn03w7OnEh0D8Xi>Video Post #4: Hadoop para torpes (III y fin)- Ecosistema y distribuciones</a><li><a href=http://www.hadoopbook.com/>Chapter 2 – Hadoop: The Definitive Guide, Fourth Edition</a> (<a href=http://grut-computing.com/HadoopBook.pdf>pdf,</a><a href=http://www.hadoopbook.com/code.html>code</a>)</ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Introduction to Hadoop and its MapReduce | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Introduction to Hadoop and its MapReduce</h1><div class=time><p>2020-04-01T01:00:00+00:00<p>last updated 2020-04-03T08:43:44+00:00</div><p>Hadoop is an open-source, free, Java-based programming framework that helps process large datasets in a distributed environment and tackle the problems that arise when trying to harness the knowledge from BigData. It is capable of running on thousands of nodes and dealing with petabytes of data.
It is based on Google File System (GFS) and originated from the work on the Nutch open-source project on search engines.<p>Hadoop also offers a distributed filesystem (HDFS) enabling fast transfer among nodes, and a way to program with MapReduce.<p>It strives for the 4 V’s: Volume, Variety, Veracity and Velocity. For veracity, it is a secure environment that can be trusted.<h2 id=milestones>Milestones</h2><p>The creators of Hadoop are Doug Cutting and Mike Cafarella, who just wanted to design a search engine, Nutch, and quickly found the problems of dealing with large amounts of data. They found their solution with the papers Google published.<p>The name comes from the plush toy of Cutting’s child, a yellow elephant.<ul><li>In July 2005, Nutch used GFS to perform MapReduce operations.<li>In February 2006, Nutch started a Lucene subproject which led to Hadoop.<li>In April 2007, Yahoo used Hadoop in a 1 000-node cluster.<li>In January 2008, Apache took over and made Hadoop a top-level project.<li>In July 2008, Apache tested a 4000-node cluster. The performance was the fastest compared to other technologies that year.<li>In May 2009, Hadoop sorted a petabyte of data in 17 hours.<li>In December 2011, Hadoop reached 1.0.<li>In May 2012, Hadoop 2.0 was released with the addition of YARN (Yet Another Resource Negotiator) on top of HDFS, splitting MapReduce and other processes into separate components, greatly improving the fault tolerance.</ul><p>From here onwards, many other alternatives have been born, like Spark, Hive & Drill, Kafka, HBase, built around the Hadoop ecosystem.<p>As of 2017, Amazon has clusters between 1 and 100 nodes, Yahoo has over 100 000 CPUs running Hadoop, AOL has clusters with 50 machines, and Facebook has a 320-machine cluster (2 560 cores) with 1.3PB of raw storage.<h2 id=why-not-use-rdbms>Why not use RDBMS?</h2><p>Relational database management systems simply cannot scale horizontally, and vertical scaling will require very expensive servers.
Similar to RDBMS, Hadoop has a notion of jobs (analogous to transactions), but without ACID or concurrency control. Hadoop supports any form of data (unstructured or semi-structured) in read-only mode, and failures are common, but there’s a simple yet efficient fault tolerance mechanism.<p>So what problems does Hadoop solve? It changes the way we should think about problems and how to distribute them, which is key to doing anything related to BigData nowadays. We start working with clusters of nodes, and coordinating the jobs between them. Hadoop’s API makes this really easy.<p>Hadoop also takes data loss very seriously and guards against it with replication, and if a node fails, its work is moved to a different node.<h2 id=major-components>Major components</h2><p>The previously-mentioned HDFS runs on commodity machines, which are cost-friendly. It is very fault-tolerant and efficient enough to process huge amounts of data, because it splits large files into smaller chunks (or blocks) that can be more easily handled. Multiple nodes can work on multiple chunks at the same time.<p>NameNode stores the metadata of the various datablocks (map of blocks) along with their location. It is the brain and the master in Hadoop’s master-slave architecture, also known as the namespace, and makes use of the DataNode.<p>A secondary NameNode is a replica that can be used if the first NameNode dies, so that Hadoop doesn’t shut down and can restart.<p>DataNodes store the blocks of data, and are the slaves in the architecture. This data is split into one or more files. Their only job is to manage access to this data. They are often distributed among racks to avoid data loss.<p>JobTracker creates and schedules jobs from the clients for either map or reduce operations.<p>TaskTracker runs MapReduce tasks assigned to the current data node.<p>When clients need data, they first interact with the NameNode, which replies with the location of the data in the correct DataNode.
The client then proceeds to interact with the DataNode.<h2 id=mapreduce>MapReduce</h2><p>MapReduce, as the name implies, is split into two steps: the map and the reduce. The map stage is the «divide and conquer» strategy, while the reduce part is about combining and reducing the results.<p>The mapper has to process the input data (normally a file or directory), commonly line-by-line, and produce one or more outputs. The reducer uses all the results from the mapper as its input to produce a new output file of its own.<p><img src=https://lonami.dev/blog/ribw/introduction-to-hadoop-and-its-mapreduce/bitmap.png><p>When reading the data, some may be junk that we can choose to ignore. If it is valid data, however, we label it with a particular type that can be useful for the upcoming process. Hadoop is responsible for splitting the data across the many nodes available to execute this process in parallel.<p>There is another part to MapReduce, known as the Shuffle-and-Sort. In this part, types or categories from one node get moved to a different node. This happens with all nodes, so that every node can work on a complete category. These categories are known as «keys», and allow Hadoop to scale linearly.<h2 id=references>References</h2><ul><li><a href=https://youtu.be/oT7kczq5A-0>YouTube – Hadoop Tutorial For Beginners | What Is Hadoop? | Hadoop Tutorial | Hadoop Training | Simplilearn</a><li><a href=https://youtu.be/bcjSe0xCHbE>YouTube – Learn MapReduce with Playing Cards</a><li><a href=https://youtu.be/j8ehT1_G5AY?list=PLi4tp-TF_qjM_ed4lIzn03w7OnEh0D8Xi>YouTube – Video Post #2: Hadoop para torpes (I)-¿Qué es y para qué sirve?</a><li><a href=https://youtu.be/NQ8mjVPCDvk?list=PLi4tp-TF_qjM_ed4lIzn03w7OnEh0D8Xi>Video Post #3: Hadoop para torpes (II)-¿Cómo funciona?
HDFS y MapReduce</a><li><a href=https://hadoop.apache.org/old/releases.html>Apache Hadoop Releases</a><li><a href=https://youtu.be/20qWx2KYqYg?list=PLi4tp-TF_qjM_ed4lIzn03w7OnEh0D8Xi>Video Post #4: Hadoop para torpes (III y fin)- Ecosistema y distribuciones</a><li><a href=http://www.hadoopbook.com/>Chapter 2 – Hadoop: The Definitive Guide, Fourth Edition</a> (<a href=http://grut-computing.com/HadoopBook.pdf>pdf,</a><a href=http://www.hadoopbook.com/code.html>code</a>)</ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Introduction to NoSQL | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Introduction to NoSQL</h1><div class=time><p>2020-02-25T01:00:15+00:00<p>last updated 2020-03-18T09:38:23+00:00</div><p>This post will primarily focus on the talk held in the <a href=https://youtu.be/qI_g07C_Q5I>GOTO 2012 conference: Introduction to NoSQL by Martin Fowler</a>. It can be seen as an informal, summarized transcript of the talk.<hr><p>The relational database model is affected by the <em><a href=https://en.wikipedia.org/wiki/Object-relational_impedance_mismatch>impedance mismatch problem</a></em>. This occurs because we have to match our high-level design with the separate columns and rows used by relational databases.<p>Taking the in-memory objects and putting them into a relational database (which were dominant at the time) simply didn’t work out. Why? Relational databases were more than just databases, they served as an integration mechanism across applications, up to the 2000s. For 20 years!<p>With the rise of the Internet and the sheer amount of traffic, databases needed to scale. Unfortunately, relational databases only scale well vertically (by upgrading a <em>single</em> node). This is <em>very</em> expensive, and not something many could afford.<p>The problem is those pesky <code>JOIN</code>s and their friend <code>GROUP BY</code>. Because our program and reality model don’t match the tables used by SQL, we have to rely on them to query the data.
It is because the model doesn’t map directly.<p>Furthermore, graphs don’t map very well at all to relational models.<p>We needed a way to scale horizontally (by increasing the <em>amount</em> of nodes), something relational databases were not designed to do.<blockquote><p><em>We need to do something different, relational across nodes is an unnatural act</em></blockquote><p>This inspired the NoSQL movement.<blockquote><p><em>#nosql was only meant to be a hashtag to advertise it, but unfortunately it’s how it is called now</em></blockquote><p>It is not possible to define NoSQL, but we can identify some of its characteristics:<ul><li><p>Non-relational<li><p><strong>Cluster-friendly</strong> (this was the original spark)<li><p>Open-source (until now, generally)<li><p>21st century web culture<li><p>Schema-less (easier integration or conjugation of several models, structure aggregation) These databases use different data models to those used by the relational model. However, it is possible to identify 4 broad chunks (some may say 3, or even 2!):<li><p><strong>Key-value store</strong>. With a certain key, you obtain the value corresponding to it. It knows nothing else, nor does it care. We say the data is opaque.<li><p><strong>Document-based</strong>. It stores an entire mass of documents with complex structure, normally through the use of JSON (XML has been left behind). Then, you can ask for certain fields, structures, or portions. We say the data is transparent.<li><p><strong>Column-family</strong>. There is a «row key», and within it we store multiple «column families» (columns that fit together, our aggregate). We access by row-key and column-family name. All of these kind of serve to store documents without any <em>explicit</em> schema. Just shove in anything! This gives a lot of flexibility and ease of migration, except… that’s not really true. 
There’s an <em>implicit</em> schema when querying.</ul><p>For example, a query where we may do <code>anOrder['price'] * anOrder['quantity']</code> is assuming that <code>anOrder</code> has both a <code>price</code> and a <code>quantity</code>, and that both of these can be multiplied together. «Schema-less» is a fuzzy term.<p>However, it is the lack of a <em>fixed</em> schema that gives flexibility.<p>One could argue that the line between key-value and document-based is very fuzzy, and they would be right! Key-value databases often let you include additional metadata that behaves like an index, and in document-based, documents often have an identifier anyway.<p>The common notion between these three types is what matters. They save an entire structure as a <em>unit</em>. We can refer to these as «Aggregate Oriented Databases». Aggregate, because we group things when designing or modeling our systems, as opposed to relational databases that scatter the information across many tables.<p>There exists a notable outlier, though, and that’s:<ul><li><strong>Graph</strong> databases. They use a node-and-arc graph structure. They are great for moving along relationships between things. Ironically, relational databases are not very good at jumping across relationships! It is possible to perform very interesting queries in graph databases which would be really hard and costly on relational models. Unlike the aggregate databases, graphs break things into even smaller units. NoSQL is not <em>the</em> solution. It depends on how you’ll work with your data. Do you need an aggregate database? Will you have a lot of relationships?
Or would the relational model be a good fit for you?</ul><p>NoSQL, however, is a good fit for large-scale projects (data will <em>always</em> grow) and faster development (the impedance mismatch is drastically reduced).<p>Regardless of our choice, it is important to remember that NoSQL is a young technology, which is still evolving really fast (SQL has been stable for <em>decades</em>). But <em>polyglot persistence</em> is what matters. One must know the alternatives, and be able to choose.<hr><p>Relational databases have the well-known ACID properties: Atomicity, Consistency, Isolation and Durability.<p>NoSQL databases (except graph-based!) are about being BASE instead: Basically Available, Soft state, Eventual consistency.<p>SQL needs transactions because we don’t want to perform a read while we’re only half-way done with a write! The readers and writers are the problem, and ensuring consistency results in a performance hit, even if the risk is low (two writers are extremely rare but it still must be handled).<p>NoSQL on the other hand doesn’t need ACID because the aggregate <em>is</em> the transaction boundary. Even before NoSQL itself existed! Any update is atomic by nature. When updating many documents it <em>is</em> a problem, but this is very rare.<p>We have to distinguish between logical and replication consistency. If a conflict occurs during an update, it must be resolved to preserve the logical consistency. Replication consistency on the other hand is preserved when distributing the data across many machines, for example during sharding or copies.<p>Replication buys us more processing power and resilience (at the cost of more storage) in case some of the nodes die. But what happens if what dies is the communication across the nodes?
We could drop the requests and preserve the consistency, or accept the risk to continue and instead preserve the availability.<p>Whether trading consistency for availability is acceptable or not depends on the domain rules. It is the domain’s choice, the business people will choose. If you’re Amazon, you always want to be able to sell, but if you’re a bank, you probably don’t want your clients to have negative numbers in their account!<p>Regardless of what we do, in a distributed system, the CAP theorem always applies: Consistency, Availability, Partition tolerance (error tolerance). It is <strong>impossible</strong> to guarantee all 3 at 100%. Most of the time, it does work, but it is mathematically impossible to guarantee at 100%.<p>A database has to choose what to give up at some point. When designing a distributed system, this must be considered. Normally, the choice is made between consistency and response time.<h2 id=further-reading>Further reading</h2><ul><li><a href=https://www.martinfowler.com/articles/nosql-intro-original.pdf>The future is: <del>NoSQL Databases</del> Polyglot Persistence</a><li><a href=https://www.thoughtworks.com/insights/blog/nosql-databases-overview>NoSQL Databases: An Overview</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Introduction to NoSQL | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div
class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Introduction to NoSQL</h1><div class=time><p>2020-02-25T01:00:15+00:00<p>last updated 2020-03-18T09:38:23+00:00</div><p>This post will primarily focus on the talk held in the <a href=https://youtu.be/qI_g07C_Q5I>GOTO 2012 conference: Introduction to NoSQL by Martin Fowler</a>. It can be seen as an informal, summarized transcript of the talk.<hr><p>The relational database model is affected by the <em><a href=https://en.wikipedia.org/wiki/Object-relational_impedance_mismatch>impedance mismatch problem</a></em>. This occurs because we have to match our high-level design with the separate columns and rows used by relational databases.<p>Taking the in-memory objects and putting them into a relational database (which were dominant at the time) simply didn’t work out. Why? Relational databases were more than just databases, they served as an integration mechanism across applications, up to the 2000s. For 20 years!<p>With the rise of the Internet and the sheer amount of traffic, databases needed to scale. Unfortunately, relational databases only scale well vertically (by upgrading a <em>single</em> node). This is <em>very</em> expensive, and not something many could afford.<p>The problem is those pesky <code>JOIN</code>s and their friend <code>GROUP BY</code>. Because our program and reality model don’t match the tables used by SQL, we have to rely on them to query the data.
It is because the model doesn’t map directly.<p>Furthermore, graphs don’t map very well at all to relational models.<p>We needed a way to scale horizontally (by increasing the <em>amount</em> of nodes), something relational databases were not designed to do.<blockquote><p><em>We need to do something different, relational across nodes is an unnatural act</em></blockquote><p>This inspired the NoSQL movement.<blockquote><p><em>#nosql was only meant to be a hashtag to advertise it, but unfortunately it’s how it is called now</em></blockquote><p>It is not possible to define NoSQL, but we can identify some of its characteristics:<ul><li><p>Non-relational<li><p><strong>Cluster-friendly</strong> (this was the original spark)<li><p>Open-source (until now, generally)<li><p>21st century web culture<li><p>Schema-less (easier integration or conjugation of several models, structure aggregation) These databases use different data models to those used by the relational model. However, it is possible to identify 4 broad chunks (some may say 3, or even 2!):<li><p><strong>Key-value store</strong>. With a certain key, you obtain the value corresponding to it. It knows nothing else, nor does it care. We say the data is opaque.<li><p><strong>Document-based</strong>. It stores an entire mass of documents with complex structure, normally through the use of JSON (XML has been left behind). Then, you can ask for certain fields, structures, or portions. We say the data is transparent.<li><p><strong>Column-family</strong>. There is a «row key», and within it we store multiple «column families» (columns that fit together, our aggregate). We access by row-key and column-family name. All of these kind of serve to store documents without any <em>explicit</em> schema. Just shove in anything! This gives a lot of flexibility and ease of migration, except… that’s not really true. 
There’s an <em>implicit</em> schema when querying.</ul><p>For example, a query where we may do <code>anOrder['price'] * anOrder['quantity']</code> is assuming that <code>anOrder</code> has both a <code>price</code> and a <code>quantity</code>, and that both of these can be multiplied together. «Schema-less» is a fuzzy term.<p>However, it is the lack of a <em>fixed</em> schema that gives flexibility.<p>One could argue that the line between key-value and document-based is very fuzzy, and they would be right! Key-value databases often let you include additional metadata that behaves like an index, and in document-based, documents often have an identifier anyway.<p>The common notion between these three types is what matters. They save an entire structure as a <em>unit</em>. We can refer to these as «Aggregate Oriented Databases». Aggregate, because we group things when designing or modeling our systems, as opposed to relational databases that scatter the information across many tables.<p>There exists a notable outlier, though, and that’s:<ul><li><strong>Graph</strong> databases. They use a node-and-arc graph structure. They are great for traversing relationships between things. Ironically, relational databases are not very good at jumping across relationships! It is possible to perform very interesting queries in graph databases which would be really hard and costly on relational models. Unlike the aggregated databases, graphs break things into even smaller units. NoSQL is not <em>the</em> solution. It depends on how you’ll work with your data. Do you need an aggregate database? Will you have a lot of relationships? 
Or would the relational model be a good fit for you?</ul><p>NoSQL, however, is a good fit for large-scale projects (data will <em>always</em> grow) and faster development (the impedance mismatch is drastically reduced).<p>Regardless of our choice, it is important to remember that NoSQL is a young technology, which is still evolving really fast (SQL has been stable for <em>decades</em>). But <em>polyglot persistence</em> is what matters. One must know the alternatives, and be able to choose.<hr><p>Relational databases have the well-known ACID properties: Atomicity, Consistency, Isolation and Durability.<p>NoSQL databases (except graph-based ones!) are about being BASE instead: Basically Available, Soft state, Eventual consistency.<p>SQL needs transactions because we don’t want to perform a read while we’re only half-way done with a write! The readers and writers are the problem, and ensuring consistency results in a performance hit, even if the risk is low (two writers are extremely rare but it still must be handled).<p>NoSQL on the other hand doesn’t need ACID because the aggregate <em>is</em> the transaction boundary. Even before NoSQL itself existed! Any update is atomic by nature. When updating many documents it <em>is</em> a problem, but this is very rare.<p>We have to distinguish between logical and replication consistency. If a conflict occurs during an update, it must be resolved to preserve the logical consistency. Replication consistency on the other hand is preserved when distributing the data across many machines, for example during sharding or copies.<p>Replication buys us more processing power and resilience (at the cost of more storage) in case some of the nodes die. But what happens if what dies is the communication across the nodes? 
We could drop the requests and preserve the consistency, or accept the risk to continue and instead preserve the availability.<p>Whether trading consistency for availability is acceptable depends on the domain rules. It is the domain’s choice; the business people will choose. If you’re Amazon, you always want to be able to sell, but if you’re a bank, you probably don’t want your clients to have negative numbers in their account!<p>Regardless of what we do, in a distributed system, the CAP theorem always applies: Consistency, Availability, Partition tolerance (error tolerance). It is <strong>impossible</strong> to guarantee all 3 at 100%. Most of the time, it does work, but it is mathematically impossible to guarantee at 100%.<p>A database has to choose what to give up at some point. When designing a distributed system, this must be considered. Normally, the choice is made between consistency and response time.<h2 id=further-reading>Further reading</h2><ul><li><a href=https://www.martinfowler.com/articles/nosql-intro-original.pdf>The future is: <del>NoSQL Databases</del> Polyglot Persistence</a><li><a href=https://www.thoughtworks.com/insights/blog/nosql-databases-overview>NoSQL Databases: An Overview</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> MongoDB: an Introduction | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>MongoDB: an Introduction</h1><div class=time><p>2020-03-05T01:00:06+00:00<p>last updated 2020-04-08T17:38:22+00:00</div><p>This is the first post in the MongoDB series, where we will introduce the MongoDB database system and take a look at its features and installation methods.<p>Other posts in this series:<ul><li><a href=/blog/ribw/mongodb-an-introduction/>MongoDB: an Introduction</a> (this post)<li><a href=/blog/ribw/mongodb-basic-operations-and-architecture/>MongoDB: Basic Operations and Architecture</a><li><a href=/blog/ribw/developing-a-python-application-for-mongodb/>Developing a Python application for MongoDB</a></ul><p>This post is co-authored with Classmate.<hr><p><img src=https://lonami.dev/blog/ribw/mongodb-an-introduction/mongodb.png alt="NoSQL database – MongoDB – First delivery"><h2 id=purpose-of-technology>Purpose of technology</h2><p>MongoDB is a <strong>general purpose, document-based, distributed database</strong> built for modern application developers and for the cloud era, with the scalability and flexibility that you want with the querying and indexing that you need. 
Being a document database means it stores data in JSON-like documents.<p>The Mongo team believes this is the most natural way to think about data, which is (they claim) much more expressive and powerful than the traditional row/column model, since programmers think in objects.<h2 id=how-it-works>How it works</h2><p>MongoDB’s architecture can be summarized as follows:<ul><li>Document data model.<li>Distributed systems design.<li>Unified experience with freedom to run it anywhere.</ul><p>For a more in-depth explanation, MongoDB offers a <a href=https://www.mongodb.com/collateral/mongodb-architecture-guide>download of the MongoDB Architecture Guide</a> with roughly ten pages worth of text.<p><img src=https://lonami.dev/blog/ribw/mongodb-an-introduction/knGHenfTGA4kzJb1PHmS9EQvtZl2QlhbIPN15M38m8fZfZf7ODwYfhf0Tltr.png> <em>Overview of MongoDB’s architecture</em><p>Regarding usage, MongoDB comes with a really nice introduction along with JavaScript, Python, Java, C++ or C# code at our choice, which describes the steps necessary to make it work. Below we will describe a common workflow.<p>First, we must <strong>connect</strong> to a running MongoDB instance. Once the connection succeeds, we can access individual «collections», which we can think of as <em>tables</em> where collections of data are stored.<p>For instance, we could <strong>insert</strong> an arbitrary JSON document into the <code>restaurants</code> collection to store information about a restaurant.<p>At any other point in time, we can <strong>query</strong> these collections. The queries range from trivial, empty ones (which would retrieve all the documents and fields) to richer and more complex queries (for instance, using AND and OR operators, checking if data exists, and then looking for a value in a list).<p>MongoDB also supports the creation of <strong>indices</strong>, similar to those in other database systems. 
It allows for the creation of indices on any field or subfields.<p>In Mongo, the <strong>aggregation pipeline</strong> allows us to filter and analyze data based on a given set of criteria. For example, we could pull all the documents in the <code>restaurants</code> collection that have a <code>category</code> of <code>Bakery</code> using the <code>$match</code> operator. Then, we can group them by their star rating using the <code>$group</code> operator. Using the accumulator operator, <code>$sum</code>, we can see how many bakeries in our collection have each star rating.<h2 id=features>Features</h2><p>The features can be seen all over the place in their site, because it’s something they make a lot of emphasis on:<ul><li><p><strong>Easy development</strong>, thanks to the document data model, something they claim to be «the best way to work with data».<li><p>Data is stored in flexible JSON-like documents.<li><p>This model directly maps to the objects in the application’s code.<li><p>Ad hoc queries, indexing, and real time aggregation provide powerful ways to access and analyze the data.<li><p><strong>Powerful query language</strong>, with a rich and expressive query language that allows filtering and sorting by any field, no matter how nested it may be within a document. The queries are themselves JSON, and thus easily composable.<li><p><strong>Support for aggregations</strong> and other modern use-cases such as geo-based search, graph search, and text search.<li><p><strong>A distributed systems design</strong>, which allows developers to intelligently put data where they want it. 
High availability, horizontal scaling, and geographic distribution are built in and easy to use.<li><p><strong>A unified experience</strong> with the freedom to run anywhere, which allows developers to future-proof their work and eliminate vendor lock-in.</ul><h2 id=corner-in-cap-theorem>Corner in CAP theorem</h2><p>MongoDB’s position in the CAP theorem (Consistency, Availability, Partition Tolerance) depends on the database and driver configurations, and the type of disaster.<ul><li>With <strong>no partitions</strong>, the main focus is <strong>CA</strong>.<li>If there are <strong>partitions</strong> but the system is <strong>strongly connected</strong>, the main focus is <strong>AP</strong>: non-synchronized writes from the old primary are ignored.<li>If there are <strong>partitions</strong> but the system is <strong>not strongly connected</strong>, the main focus is <strong>CP</strong>: only read access is provided to avoid inconsistencies. The general consensus seems to be that Mongo is <strong>CP</strong>.</ul><h2 id=download>Download</h2><p>We will be using the apt-based installation.<p>The Community version can be downloaded by anyone through the <a href=https://www.mongodb.com/download-center/community>MongoDB Download Center</a>, where one can choose the version, Operating System and Package. MongoDB also seems to be <a href=https://packages.ubuntu.com/eoan/mongodb>available in Ubuntu’s repositories</a>.<h2 id=installation>Installation</h2><p>We will be using an Ubuntu-based system, with apt available. 
To install MongoDB, we open a terminal and run the following command:<pre><code>apt install mongodb +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> MongoDB: an Introduction | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>MongoDB: an Introduction</h1><div class=time><p>2020-03-05T01:00:06+00:00<p>last updated 2020-04-08T17:38:22+00:00</div><p>This is the first post in the MongoDB series, where we will introduce the MongoDB database system and take a look at its features and installation methods.<p>Other posts in this series:<ul><li><a href=/blog/ribw/mongodb-an-introduction/>MongoDB: an Introduction</a> (this post)<li><a href=/blog/ribw/mongodb-basic-operations-and-architecture/>MongoDB: Basic Operations and Architecture</a><li><a href=/blog/ribw/developing-a-python-application-for-mongodb/>Developing a Python application for MongoDB</a></ul><p>This post is co-authored with Classmate.<hr><p><img src=https://lonami.dev/blog/ribw/mongodb-an-introduction/mongodb.png alt="NoSQL database – MongoDB – First delivery"><h2 id=purpose-of-technology>Purpose of technology</h2><p>MongoDB is a <strong>general purpose, document-based, distributed database</strong> built for modern application developers and for the cloud era, with the scalability and flexibility that you want with the querying and indexing that you need. 
Being a document database means it stores data in JSON-like documents.<p>The Mongo team believes this is the most natural way to think about data, which is (they claim) much more expressive and powerful than the traditional row/column model, since programmers think in objects.<h2 id=how-it-works>How it works</h2><p>MongoDB’s architecture can be summarized as follows:<ul><li>Document data model.<li>Distributed systems design.<li>Unified experience with freedom to run it anywhere.</ul><p>For a more in-depth explanation, MongoDB offers a <a href=https://www.mongodb.com/collateral/mongodb-architecture-guide>download of the MongoDB Architecture Guide</a> with roughly ten pages worth of text.<p><img src=https://lonami.dev/blog/ribw/mongodb-an-introduction/knGHenfTGA4kzJb1PHmS9EQvtZl2QlhbIPN15M38m8fZfZf7ODwYfhf0Tltr.png> <em>Overview of MongoDB’s architecture</em><p>Regarding usage, MongoDB comes with a really nice introduction along with JavaScript, Python, Java, C++ or C# code at our choice, which describes the steps necessary to make it work. Below we will describe a common workflow.<p>First, we must <strong>connect</strong> to a running MongoDB instance. Once the connection succeeds, we can access individual «collections», which we can think of as <em>tables</em> where collections of data are stored.<p>For instance, we could <strong>insert</strong> an arbitrary JSON document into the <code>restaurants</code> collection to store information about a restaurant.<p>At any other point in time, we can <strong>query</strong> these collections. The queries range from trivial, empty ones (which would retrieve all the documents and fields) to richer and more complex queries (for instance, using AND and OR operators, checking if data exists, and then looking for a value in a list).<p>MongoDB also supports the creation of <strong>indices</strong>, similar to those in other database systems. 
It allows for the creation of indices on any field or subfields.<p>In Mongo, the <strong>aggregation pipeline</strong> allows us to filter and analyze data based on a given set of criteria. For example, we could pull all the documents in the <code>restaurants</code> collection that have a <code>category</code> of <code>Bakery</code> using the <code>$match</code> operator. Then, we can group them by their star rating using the <code>$group</code> operator. Using the accumulator operator, <code>$sum</code>, we can see how many bakeries in our collection have each star rating.<h2 id=features>Features</h2><p>The features can be seen all over the place in their site, because it’s something they make a lot of emphasis on:<ul><li><p><strong>Easy development</strong>, thanks to the document data model, something they claim to be «the best way to work with data».<li><p>Data is stored in flexible JSON-like documents.<li><p>This model directly maps to the objects in the application’s code.<li><p>Ad hoc queries, indexing, and real time aggregation provide powerful ways to access and analyze the data.<li><p><strong>Powerful query language</strong>, with a rich and expressive query language that allows filtering and sorting by any field, no matter how nested it may be within a document. The queries are themselves JSON, and thus easily composable.<li><p><strong>Support for aggregations</strong> and other modern use-cases such as geo-based search, graph search, and text search.<li><p><strong>A distributed systems design</strong>, which allows developers to intelligently put data where they want it. 
High availability, horizontal scaling, and geographic distribution are built in and easy to use.<li><p><strong>A unified experience</strong> with the freedom to run anywhere, which allows developers to future-proof their work and eliminate vendor lock-in.</ul><h2 id=corner-in-cap-theorem>Corner in CAP theorem</h2><p>MongoDB’s position in the CAP theorem (Consistency, Availability, Partition Tolerance) depends on the database and driver configurations, and the type of disaster.<ul><li>With <strong>no partitions</strong>, the main focus is <strong>CA</strong>.<li>If there are <strong>partitions</strong> but the system is <strong>strongly connected</strong>, the main focus is <strong>AP</strong>: non-synchronized writes from the old primary are ignored.<li>If there are <strong>partitions</strong> but the system is <strong>not strongly connected</strong>, the main focus is <strong>CP</strong>: only read access is provided to avoid inconsistencies. The general consensus seems to be that Mongo is <strong>CP</strong>.</ul><h2 id=download>Download</h2><p>We will be using the apt-based installation.<p>The Community version can be downloaded by anyone through the <a href=https://www.mongodb.com/download-center/community>MongoDB Download Center</a>, where one can choose the version, Operating System and Package. MongoDB also seems to be <a href=https://packages.ubuntu.com/eoan/mongodb>available in Ubuntu’s repositories</a>.<h2 id=installation>Installation</h2><p>We will be using an Ubuntu-based system, with apt available. To install MongoDB, we open a terminal and run the following command:<pre><code>apt install mongodb </code></pre><p>After confirming that we do indeed want to install the package, we should be able to run the following command to verify that the installation was successful:<pre><code>mongod --version </code></pre><p>The output should be similar to the following:<pre><code>db version v4.0.16 git version: 2a5433168a53044cb6b4fa8083e4cfd7ba142221
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> MongoDB: Basic Operations and Architecture | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>MongoDB: Basic Operations and Architecture</h1><div class=time><p>2020-03-05T04:00:08+00:00<p>last updated 2020-04-08T17:36:25+00:00</div><p>This is the second post in the MongoDB series, where we will take a look at the <a href=https://stackify.com/what-are-crud-operations/>CRUD operations</a> it supports, the data model and architecture used.<p>Other posts in this series:<ul><li><a href=/blog/ribw/mongodb-an-introduction/>MongoDB: an Introduction</a><li><a href=/blog/ribw/mongodb-basic-operations-and-architecture/>MongoDB: Basic Operations and Architecture</a> (this post)<li><a href=/blog/ribw/developing-a-python-application-for-mongodb/>Developing a Python application for MongoDB</a></ul><p>This post is co-authored with Classmate, and in it we will take an exploratory approach using the <code>mongo</code> command line shell to execute commands against the database. It even has TAB auto-completion, which is awesome!<hr><p>Before creating any documents, we first need to create somewhere for the documents to be in. And before we create anything, the database has to be running, so let’s do that first. 
If we don’t have a service installed, we can run the <code>mongod</code> command ourselves in some local folder to make things easier:<pre><code>$ mkdir -p mongo-database +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> MongoDB: Basic Operations and Architecture | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>MongoDB: Basic Operations and Architecture</h1><div class=time><p>2020-03-05T04:00:08+00:00<p>last updated 2020-04-08T17:36:25+00:00</div><p>This is the second post in the MongoDB series, where we will take a look at the <a href=https://stackify.com/what-are-crud-operations/>CRUD operations</a> it supports, the data model and architecture used.<p>Other posts in this series:<ul><li><a href=/blog/ribw/mongodb-an-introduction/>MongoDB: an Introduction</a><li><a href=/blog/ribw/mongodb-basic-operations-and-architecture/>MongoDB: Basic Operations and Architecture</a> (this post)<li><a href=/blog/ribw/developing-a-python-application-for-mongodb/>Developing a Python application for MongoDB</a></ul><p>This post is co-authored with Classmate, and in it we will take an exploratory approach using the <code>mongo</code> command line shell to execute commands against the database. It even has TAB auto-completion, which is awesome!<hr><p>Before creating any documents, we first need to create somewhere for the documents to be in. And before we create anything, the database has to be running, so let’s do that first. 
If we don’t have a service installed, we can run the <code>mongod</code> command ourselves in some local folder to make things easier:<pre><code>$ mkdir -p mongo-database $ mongod --dbpath mongo-database </code></pre><p>Just like that, we will have Mongo running. Now, let’s connect to it using the <code>mongo</code> command in another terminal (don’t close the terminal where the server is running, we need it!). By default, it connects to localhost, which is just what we need.<pre><code>$ mongo </code></pre><h2 id=create>Create</h2><h3 id=create-a-database>Create a database</h3><p>Let’s list the databases:<pre><code>> show databases
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Cassandra: Basic Operations and Architecture | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Cassandra: Basic Operations and Architecture</h1><div class=time><p>2020-03-05T02:00:36+00:00<p>last updated 2020-03-24T17:57:05+00:00</div><p>This is the second post in the NoSQL Databases series, with a brief description of the basic operations (such as insertion, retrieval, indexing…), and complete execution along with the data model / architecture.<p>Other posts in this series:<ul><li><a href=/blog/ribw/nosql-databases-an-introduction/>Cassandra: an Introduction</a><li><a href=/blog/ribw/nosql-databases-basic-operations-and-architecture/>Cassandra: Basic Operations and Architecture</a> (this post)</ul><hr><p>Cassandra uses its own query language for managing databases, known as <strong>CQL</strong> (<strong>Cassandra Query Language</strong>). Cassandra stores data in <strong><em>tables</em></strong>, as in relational databases, and these tables are grouped in <strong><em>keyspaces</em></strong>. A keyspace defines a number of options that apply to all the tables it contains. The most used option is the <strong>replication strategy</strong>. 
It is recommended to have only one keyspace per application.<p>It is important to mention that <strong>table and keyspace names</strong> are <strong>case insensitive</strong>, so myTable is equivalent to mytable, but it is possible to <strong>force case sensitivity</strong> using <strong>double-quotes</strong>.<p>To begin with the basic operations it is necessary to deploy Cassandra:<ol><li>Open a terminal in the root of the Apache Cassandra folder downloaded in the previous post.<li>Run the command:</ol><pre><code>$ bin/cassandra +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Cassandra: Basic Operations and Architecture | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Cassandra: Basic Operations and Architecture</h1><div class=time><p>2020-03-05T02:00:36+00:00<p>last updated 2020-03-24T17:57:05+00:00</div><p>This is the second post in the NoSQL Databases series, with a brief description of the basic operations (such as insertion, retrieval, indexing…), and complete execution along with the data model / architecture.<p>Other posts in this series:<ul><li><a href=/blog/ribw/nosql-databases-an-introduction/>Cassandra: an Introduction</a><li><a href=/blog/ribw/nosql-databases-basic-operations-and-architecture/>Cassandra: Basic Operations and Architecture</a> (this post)</ul><hr><p>Cassandra uses its own query language for managing databases, known as <strong>CQL</strong> (<strong>Cassandra Query Language</strong>). 
Cassandra stores data in <strong><em>tables</em></strong>, as in relational databases, and these tables are grouped in <strong><em>keyspaces</em></strong>. A keyspace defines a number of options that apply to all the tables it contains. The most used option is the <strong>replication strategy</strong>. It is recommended to have only one keyspace per application.<p>It is important to mention that <strong>table and keyspace names</strong> are <strong>case insensitive</strong>, so myTable is equivalent to mytable, but it is possible to <strong>force case sensitivity</strong> using <strong>double-quotes</strong>.<p>To begin with the basic operations it is necessary to deploy Cassandra:<ol><li>Open a terminal in the root of the Apache Cassandra folder downloaded in the previous post.<li>Run the command:</ol><pre><code>$ bin/cassandra </code></pre><p>Once Cassandra is deployed, it is time to open a <strong>CQL Shell</strong>, in <strong>another terminal</strong>, with the command:<pre><code>$ bin/cqlsh </code></pre><p>We can verify that Cassandra is deployed if the CQL Shell prints the following message:<p><img src=https://lonami.dev/blog/ribw/nosql-databases-basic-operations-and-architecture/uwqQgQte-cuYb_pePFOuY58re23kngrDKNgL1qz4yOfnBDZkqMIH3fFuCrye.png> <em>CQL Shell</em><h2 id=create-insert>Create/Insert</h2><h3 id=ddl-data-definition-language>DDL (Data Definition Language)</h3><h4 id=create-keyspace>Create keyspace</h4><p>A keyspace is created using a <strong>CREATE KEYSPACE</strong> statement:<pre><code>CREATE KEYSPACE [ IF NOT EXISTS ] keyspace_name WITH options; </code></pre><p>The supported “<strong>options</strong>” are:<ul><li>“<strong>replication</strong>”: this is <strong>mandatory</strong> and defines the <strong>replication strategy</strong> and the <strong>replication factor</strong> (the number of nodes that will have a copy of the data). 
Within this option there is a property called “<strong>class</strong>” in which the <strong>replication strategy</strong> is specified (“SimpleStrategy” or “NetworkTopologyStrategy”).<li>“<strong>durable_writes</strong>”: this is <strong>not mandatory</strong> and controls whether the <strong>commit log is used for updates</strong>. Attempting to create an already existing keyspace will return an error unless the <strong>IF NOT EXISTS</strong> directive is used.</ul><p>The following example creates a keyspace named “test_keyspace” with “SimpleStrategy” as the replication “class” and a “replication_factor” of 3.<pre><code>CREATE KEYSPACE test_keyspace
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Privado: NoSQL evaluation | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Privado: NoSQL evaluation</h1><div class=time><p>2020-03-16T00:00:15+00:00<p>last updated 2020-03-27T11:22:45+00:00</div><p>I have decided to evaluate Classmate‘s post and Classmate‘s post, because they review databases I have not seen or used before, and I think it would be interesting to see new ones.<p>The evaluation is based on the requirements defined by Trabajos en grupo sobre Bases de Datos NoSQL:<blockquote><p><strong>1ª entrada:</strong> Descripción de la finalidad de la tecnología y cómo funciona o trabaja la BD NoSQL, sus características, la arista que ocupa en el Teorema CAP, de dónde se descarga, y cómo se instala.</blockquote><p>-- Teacher<h2 id=classmate-s-evaluation>Classmate’s evaluation</h2><p><strong>Grading: A.</strong><p>The post I have evaluated is BB.DD. 
NoSQL: Voldemort 1ª Fase.<p>The post doesn’t start very well, because the first sentence has (emphasis mine):<blockquote><p>En él repasaremos en qué consiste <strong>MongoDB</strong>, sus características, y cómo se instala, entre otros.</blockquote><p>-- Classmate<p>…yet the post is about Voldemort!<p>The post does detail how it works, its architecture, corner in the CAP theorem, download and installation.<p>I have graded the post with A because I think it meets all the requirements, even if they slipped a bit in the beginning.<h2 id=classmate-s-evaluation-1>Classmate’s evaluation</h2><p><strong>Grading: A.</strong><p>The post I have evaluted is Raven.<p>They have done a good job describing the project’s goals, corner in the CAP theorem, download, and provide an extensive installation section.<p>They don’t seem to use some of WordPress features, such as lists, but otherwise the post is good and deserves an A grading.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Privado: NoSQL evaluation | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Privado: NoSQL evaluation</h1><div class=time><p>2020-03-16T00:00:15+00:00<p>last updated 2020-03-27T11:22:45+00:00</div><p>I have decided 
to evaluate Classmate’s post and Classmate’s post, because they review databases I have not seen or used before, and I think it would be interesting to see new ones.<p>The evaluation is based on the requirements defined by Trabajos en grupo sobre Bases de Datos NoSQL:<blockquote><p><strong>1ª entrada:</strong> Descripción de la finalidad de la tecnología y cómo funciona o trabaja la BD NoSQL, sus características, la arista que ocupa en el Teorema CAP, de dónde se descarga, y cómo se instala.</blockquote><p>-- Teacher<h2 id=classmate-s-evaluation>Classmate’s evaluation</h2><p><strong>Grading: A.</strong><p>The post I have evaluated is BB.DD. NoSQL: Voldemort 1ª Fase.<p>The post doesn’t start very well, because the first sentence has (emphasis mine):<blockquote><p>En él repasaremos en qué consiste <strong>MongoDB</strong>, sus características, y cómo se instala, entre otros.</blockquote><p>-- Classmate<p>…yet the post is about Voldemort!<p>The post does detail how it works, its architecture, corner in the CAP theorem, download and installation.<p>I have graded the post with A because I think it meets all the requirements, even if they slipped a bit in the beginning.<h2 id=classmate-s-evaluation-1>Classmate’s evaluation</h2><p><strong>Grading: A.</strong><p>The post I have evaluated is Raven.<p>They have done a good job describing the project’s goals, corner in the CAP theorem, download, and provide an extensive installation section.<p>They don’t seem to use some of WordPress’s features, such as lists, but otherwise the post is good and deserves an A grading.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Privado: PC-Crawler evaluation 2 | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Privado: PC-Crawler evaluation 2</h1><div class=time><p>2020-03-16T01:00:47+00:00<p>last updated 2020-03-28T10:29:49+00:00</div><p>As the student <code>a(i)</code> where <code>i = 9</code>, I have been assigned to evaluate students <code>a(i - 1)</code> and <code>a(i - 2)</code>, these being:<ul><li>a08: Classmate (username)<li>a07: Classmate (username)</ul><p>The evaluation is done according to the criteria described in Segunda entrega del PC-Crawler.<h2 id=classmate-s-evaluation>Classmate’s evaluation</h2><p><strong>Grading: A.</strong><p>This is the evaluation of Crawler – Thesauro.<p>It’s a well-written post, properly using WordPress code blocks, and they explain the process of improving the code and what it does. Because there are no noticeable issues with the post, they get the highest grading.<h2 id=classmate-s-evaluation-1>Classmate’s evaluation</h2><p><strong>Grading: B.</strong><p>This is the evaluation of Actividad 2-Crawler.<p>They start with an introduction on what they will do.<p>Next, they show the code they have written, also describing what it does, although they don’t explain <em>why</em> they chose the data structures they used.<p>The style of the code leaves a lot to be desired, and they should have embedded the code in the post instead of taking screenshots. 
People that rely on screen readers will not be able to see the code.<p>I have graded them B and not A for this last reason.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Privado: PC-Crawler evaluation 2 | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Privado: PC-Crawler evaluation 2</h1><div class=time><p>2020-03-16T01:00:47+00:00<p>last updated 2020-03-28T10:29:49+00:00</div><p>As the student <code>a(i)</code> where <code>i = 9</code>, I have been assigned to evaluate students <code>a(i - 1)</code> and <code>a(i - 2)</code>, these being:<ul><li>a08: Classmate (username)<li>a07: Classmate (username)</ul><p>The evaluation is done according to the criteria described in Segunda entrega del PC-Crawler.<h2 id=classmate-s-evaluation>Classmate’s evaluation</h2><p><strong>Grading: A.</strong><p>This is the evaluation of Crawler – Thesauro.<p>It’s a well-written post, properly using WordPress code blocks, and they explain the process of improving the code and what it does. 
Because there are no noticeable issues with the post, they get the highest grading.<h2 id=classmate-s-evaluation-1>Classmate’s evaluation</h2><p><strong>Grading: B.</strong><p>This is the evaluation of Actividad 2-Crawler.<p>They start with an introduction on what they will do.<p>Next, they show the code they have written, also describing what it does, although they don’t explain <em>why</em> they chose the data structures they used.<p>The style of the code leaves a lot to be desired, and they should have embedded the code in the post instead of taking screenshots. People that rely on screen readers will not be able to see the code.<p>I have graded them B and not A for this last reason.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Privado: PC-Crawler evaluation | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Privado: PC-Crawler evaluation</h1><div class=time><p>2020-03-04T00:00:23+00:00<p>last updated 2020-03-18T09:39:27+00:00</div><p>As the student <code>a(i)</code> where <code>i = 9</code>, I have been assigned to evaluate students <code>a(i + 3)</code> and <code>a(i + 4)</code>, these being:<ul><li>a12: Classmate (username)<li>a13: Classmate (username)</ul><h2 id=classmate-s-evaluation>Classmate’s evaluation</h2><p><strong>Grading: B.</strong><p>I think they mix up a bit their considerations with program usage and how it works, not justifying why the considerations are the ones they chose, or what the alternatives would be.<p>The implementation notes are quite well-written. Even someone without knowledge of Java’s syntax can read the notes and more or less make sense of what’s going on, with the relevant code excerpts on each section.<p>Implementation-wise, some methods could definitely use some improvement:<ul><li><code>esExtensionTextual</code> is overly complicated. It could use a <code>for</code> loop and Java’s <code>String.endsWith</code>.<li><code>calcularFrecuencia</code> has quite some duplication (e.g. 
<code>this.getFicherosYDirectorios().remove(0)</code>) and could definitely be cleaned up.</ul><p>However, all the desired functionality is implemented.<p>Style-wise, some of the newlines and avoiding braces on <code>if</code> and <code>while</code> could be changed to improve the readability.<p>The post is written in Spanish, but uses some words that don’t translate well («remover» could better be said as «eliminar» or «quitar»).<h2 id=classmate-s-evaluation-1>Classmate’s evaluation</h2><p><strong>Grading: B.</strong><p>Their post starts with an explanation on what a crawler is, common uses for them, and what type of crawler they will be developing. This is a very good start. Regarding the post style, it seems they are not properly using some of WordPress features, such as lists, and instead rely on paragraphs with special characters prefixing each list item.<p>The post also contains some details on how to install the requirements to run the program, which can be very useful for someone not used to working with Java.<p>They do not explain their implementation and the filename of the download has a typo.<p>Implementation-wise, the code seems to be well-organized, into several packages and files, although the naming is a bit inconsistent. They even designed a GUI, which is quite impressive.<p>Some of the methods are documented, although the code inside them is not very commented, including missing rationale for the data structures chosen. 
There also seem to be several other unused main functions, which I’m unsure why they were kept.<p>However, all the desired functionality is implemented.<p>Similar to Classmate, the code style could be improved and settled on some standard, as well as making use of Java features such as <code>for</code> loops over iterators instead of manual loops.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Privado: PC-Crawler evaluation | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Privado: PC-Crawler evaluation</h1><div class=time><p>2020-03-04T00:00:23+00:00<p>last updated 2020-03-18T09:39:27+00:00</div><p>As the student <code>a(i)</code> where <code>i = 9</code>, I have been assigned to evaluate students <code>a(i + 3)</code> and <code>a(i + 4)</code>, these being:<ul><li>a12: Classmate (username)<li>a13: Classmate (username)</ul><h2 id=classmate-s-evaluation>Classmate’s evaluation</h2><p><strong>Grading: B.</strong><p>I think they mix up a bit their considerations with program usage and how it works, not justifying why the considerations are the ones they chose, or what the alternatives would be.<p>The implementation notes are quite well-written. 
Even someone without knowledge of Java’s syntax can read the notes and more or less make sense of what’s going on, with the relevant code excerpts on each section.<p>Implementation-wise, some methods could definitely use some improvement:<ul><li><code>esExtensionTextual</code> is overly complicated. It could use a <code>for</code> loop and Java’s <code>String.endsWith</code>.<li><code>calcularFrecuencia</code> has quite some duplication (e.g. <code>this.getFicherosYDirectorios().remove(0)</code>) and could definitely be cleaned up.</ul><p>However, all the desired functionality is implemented.<p>Style-wise, some of the newlines and avoiding braces on <code>if</code> and <code>while</code> could be changed to improve the readability.<p>The post is written in Spanish, but uses some words that don’t translate well («remover» could better be said as «eliminar» or «quitar»).<h2 id=classmate-s-evaluation-1>Classmate’s evaluation</h2><p><strong>Grading: B.</strong><p>Their post starts with an explanation on what a crawler is, common uses for them, and what type of crawler they will be developing. This is a very good start. Regarding the post style, it seems they are not properly using some of WordPress features, such as lists, and instead rely on paragraphs with special characters prefixing each list item.<p>The post also contains some details on how to install the requirements to run the program, which can be very useful for someone not used to working with Java.<p>They do not explain their implementation and the filename of the download has a typo.<p>Implementation-wise, the code seems to be well-organized, into several packages and files, although the naming is a bit inconsistent. They even designed a GUI, which is quite impressive.<p>Some of the methods are documented, although the code inside them is not very commented, including missing rationale for the data structures chosen. 
There also seem to be several other unused main functions, and I’m unsure why they were kept.<p>However, all the desired functionality is implemented.<p>As with Classmate’s code, the style could be improved by settling on a standard and by making use of Java features such as <code>for</code> loops over iterators instead of manual loops.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Upgrading our Baby Crawler | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Upgrading our Baby Crawler</h1><div class=time><p>2020-03-11T00:00:07+00:00<p>last updated 2020-03-18T09:49:33+00:00</div><p>In our <a href=/blog/ribw/build-your-own-pc/>last post on this series</a>, we presented the code for our Personal Crawler. However, we didn’t quite explain what a crawler even is! We will use this moment to go a bit more in-depth, and make some upgrades to it.<h2 id=what-is-a-crawler>What is a Crawler?</h2><p>A crawler is a program whose job is to analyze documents and extract data from them. For example, search engines like <a href=http://duckduckgo.com/>DuckDuckGo</a>, <a href=https://bing.com/>Bing</a> or <a href=http://google.com/>Google</a> all have crawlers to analyze websites and build a database around them. They are some kind of «trackers», because they keep track of everything they find.<p>Their basic behaviour can be described as follows: given a starting list of URLs, follow them all and identify hyperlinks inside the documents. Add these to the list of links to follow, and repeat <em>ad infinitum</em>.<ul><li>This lets us create an index to quickly search across them all.<li>We can also identify broken links.<li>We can gather any other type of information that we found. Our crawler will work offline, within our own computer, scanning the text documents it finds on the root we tell it to scan.</ul><h2 id=design-decissions>Design Decissions</h2><ul><li>We will use Java. 
Its runtime is quite ubiquitous, so it should be able to run in virtually anywhere. The language is typed, which helps catch errors early on.<li>Our solution is iterative. While recursion can be seen as more elegants by some, iterative solutions are often more performant with less need for optimization.</ul><h2 id=requirements>Requirements</h2><p>If you don’t have Java installed yet, you can <a href=https://java.com/en/download/>Download Free Java Software</a> from Oracle’s site. To compile the code, the <a href=https://www.oracle.com/java/technologies/javase-jdk8-downloads.html>Java Development Kit</a> is also necessary.<p>We don’t depend on any other external libraries, for easier deployment and compilation.<h2 id=implementation>Implementation</h2><p>Because the code was getting pretty large, it has been split into several files, and we have also upgraded it to use a Graphical User Interface instead! We decided to use Swing, based on the Java tutorial <a href=https://docs.oracle.com/javase/tutorial/uiswing/>Creating a GUI With JFC/Swing</a>.<h3 id=app>App</h3><p>This file is the entry point of our application. Its job is to initialize the components, lay them out in the main panel, and connect the event handlers.<p>Most widgets are pretty standard, and are defined as class variables. However, some variables are notable. 
The <code>[DefaultTableModel](https://docs.oracle.com/javase/8/docs/api/javax/swing/table/DefaultTableModel.html)</code> is used because it allows to <a href=https://stackoverflow.com/a/22550106>dynamically add rows</a>, and we also have a <code>[SwingWorker](https://docs.oracle.com/javase/8/docs/api/javax/swing/SwingWorker.html)</code> subclass responsible for performing the word analysis (which is quite CPU intensive and should not be ran in the UI thread!).<p>There’s a few utility methods to ease some common operations, such as <code>updateStatus</code> which changes the status label in the main window, informing the user of the latest changes.<h3 id=thesaurus>Thesaurus</h3><p>A thesaurus is a collection of words or terms used to represent concepts. In literature this is commonly known as a dictionary.<p>On the subject of this project, we are using a thesaurus based on how relevant is a word for the meaning of a sentence, filtering out those that barely give us any information.<p>This file contains a simple thesaurus implementation, which can trivially be used as a normal or inverted thesaurus. However, we only treat it as inverted, and its job is loading itself and determining if words are valid or should otherwise be ignored.<h3 id=utils>Utils</h3><p>Several utility functions used across the codebase.<h3 id=wordmap>WordMap</h3><p>This file is the important one, and its implementation hasn’t changed much since our last post. Instances of a word map contain… wait for it… a map of words! It stores the mapping <code>word → count</code> in memory, and offers methods to query the count of a word or iterate over the word count entries.<p>It can be loaded from cache or told to analyze a root path. 
Once an instance is created, additional files could be analyzed one by one if desired.<h2 id=download>Download</h2><p>The code was getting a bit too large to embed it within the blog post itself, so instead you can download it as a<code>.zip</code> file.<p><em>download removed</em></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Upgrading our Baby Crawler | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Upgrading our Baby Crawler</h1><div class=time><p>2020-03-11T00:00:07+00:00<p>last updated 2020-03-18T09:49:33+00:00</div><p>In our <a href=/blog/ribw/build-your-own-pc/>last post on this series</a>, we presented the code for our Personal Crawler. However, we didn’t quite explain what a crawler even is! We will use this moment to go a bit more in-depth, and make some upgrades to it.<h2 id=what-is-a-crawler>What is a Crawler?</h2><p>A crawler is a program whose job is to analyze documents and extract data from them. For example, search engines like <a href=http://duckduckgo.com/>DuckDuckGo</a>, <a href=https://bing.com/>Bing</a> or <a href=http://google.com/>Google</a> all have crawlers to analyze websites and build a database around them. 
They are some kind of «trackers», because they keep track of everything they find.<p>Their basic behaviour can be described as follows: given a starting list of URLs, follow them all and identify hyperlinks inside the documents. Add these to the list of links to follow, and repeat <em>ad infinitum</em>.<ul><li>This lets us create an index to quickly search across them all.<li>We can also identify broken links.<li>We can gather any other type of information that we find.</ul><p>Our crawler will work offline, within our own computer, scanning the text documents it finds on the root we tell it to scan.<h2 id=design-decissions>Design Decisions</h2><ul><li>We will use Java. Its runtime is quite ubiquitous, so it should be able to run virtually anywhere. The language is statically typed, which helps catch errors early on.<li>Our solution is iterative. While recursion can be seen as more elegant by some, iterative solutions are often more performant with less need for optimization.</ul><h2 id=requirements>Requirements</h2><p>If you don’t have Java installed yet, you can <a href=https://java.com/en/download/>Download Free Java Software</a> from Oracle’s site. To compile the code, the <a href=https://www.oracle.com/java/technologies/javase-jdk8-downloads.html>Java Development Kit</a> is also necessary.<p>We don’t depend on any other external libraries, for easier deployment and compilation.<h2 id=implementation>Implementation</h2><p>Because the code was getting pretty large, it has been split into several files, and we have also upgraded it to use a Graphical User Interface instead! We decided to use Swing, based on the Java tutorial <a href=https://docs.oracle.com/javase/tutorial/uiswing/>Creating a GUI With JFC/Swing</a>.<h3 id=app>App</h3><p>This file is the entry point of our application. Its job is to initialize the components, lay them out in the main panel, and connect the event handlers.<p>Most widgets are pretty standard, and are defined as class variables. 
However, some variables are notable. The <a href=https://docs.oracle.com/javase/8/docs/api/javax/swing/table/DefaultTableModel.html><code>DefaultTableModel</code></a> is used because it allows us to <a href=https://stackoverflow.com/a/22550106>dynamically add rows</a>, and we also have a <a href=https://docs.oracle.com/javase/8/docs/api/javax/swing/SwingWorker.html><code>SwingWorker</code></a> subclass responsible for performing the word analysis (which is quite CPU intensive and should not be run on the UI thread!).<p>There are a few utility methods to ease some common operations, such as <code>updateStatus</code>, which changes the status label in the main window, informing the user of the latest changes.<h3 id=thesaurus>Thesaurus</h3><p>A thesaurus is a collection of words or terms used to represent concepts. In literature this is commonly known as a dictionary.<p>On the subject of this project, we are using a thesaurus based on how relevant a word is to the meaning of a sentence, filtering out those that barely give us any information.<p>This file contains a simple thesaurus implementation, which can trivially be used as a normal or inverted thesaurus. However, we only treat it as inverted, and its job is loading itself and determining if words are valid or should otherwise be ignored.<h3 id=utils>Utils</h3><p>Several utility functions used across the codebase.<h3 id=wordmap>WordMap</h3><p>This file is the important one, and its implementation hasn’t changed much since our last post. Instances of a word map contain… wait for it… a map of words! It stores the mapping <code>word → count</code> in memory, and offers methods to query the count of a word or iterate over the word count entries.<p>It can be loaded from cache or told to analyze a root path. 
Once an instance is created, additional files can be analyzed one by one if desired.<h2 id=download>Download</h2><p>The code was getting a bit too large to embed within the blog post itself, so instead you can download it as a <code>.zip</code> file.<p><em>download removed</em></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> What is ElasticSearch and why should you care? | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>What is ElasticSearch and why should you care?</h1><div class=time><p>2020-03-18T02:00:00+00:00<p>last updated 2020-03-27T11:04:45+00:00</div><p>ElasticSearch is a giant search index with powerful analytics capabilities. It’s like a database and search engine on steroids, really easy and fast to get up and running. One can think of it as your own Google, a search engine with analytics.<p>ElasticSearch is rich, stable, performs well, is well maintained, and able to scale to petabytes of any kind of data, whether it’s structured, semi-structured or not at all. It’s cost-effective and can be used to make business decisions.<p>Or, described in 10 seconds:<blockquote><p>Schema-free, REST & JSON based distributed document store Open source: Apache License 2.0 Zero configuration</blockquote><p>-- Alex Reelsen<h2 id=basic-capabilities>Basic capabilities</h2><p>ElasticSearch lets you ask questions about your data, not just make queries. You may think SQL can do this too, but what’s important is making a pipeline of facets, and feed the results from query to query.<p>Instead of changing your data, you can be flexible with your questions with no need to re-index it every time the questions change.<p>ElasticSearch is not just to search for full-text data, either. It can search for structured data and return more than just the results. 
It also yields additional data, such as ranking, highlights, and allows for pagination.<p>It doesn’t take a lot of configuration to get running, either, which can be a good boost on productivity.<h2 id=how-does-it-work>How does it work?</h2><p>ElasticSearch depends on Java, and can work in a distributed cluster if you execute multiple instances. Data will be replicated and sharded as needed. The current version at the time of writing is 7.6.1, and it’s being developed fast!<p>It also has support for plugins, with an ever-growing ecosystem and integration on many programming languages. Tools around it are being built around it, too, like Kibana which helps you visualize your data.<p>The way you use it is through a JSON API, served over HTTP/S.<h2 id=how-can-i-use-it>How can I use it?</h2><p><a href=https://www.elastic.co/downloads/>You can try ElasticSearch out for free on Elastic Cloud</a>, however, it can also be <a href=https://www.elastic.co/downloads/elasticsearch>downloaded and ran offline</a>, which is what we’ll do. Download the file corresponding to your operating system, unzip it, and execute the binary. Running it is as simple as that!<p>Now you can make queries to it over HTTP, with for example <code>curl</code>:<pre><code>curl -X PUT localhost:9200/orders/order/1 -d ' +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> What is ElasticSearch and why should you care? 
| Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>What is ElasticSearch and why should you care?</h1><div class=time><p>2020-03-18T02:00:00+00:00<p>last updated 2020-03-27T11:04:45+00:00</div><p>ElasticSearch is a giant search index with powerful analytics capabilities. It’s like a database and search engine on steroids, really easy and fast to get up and running. One can think of it as your own Google, a search engine with analytics.<p>ElasticSearch is rich, stable, performs well, is well maintained, and able to scale to petabytes of any kind of data, whether it’s structured, semi-structured or not at all. It’s cost-effective and can be used to make business decisions.<p>Or, described in 10 seconds:<blockquote><p>Schema-free, REST & JSON based distributed document store Open source: Apache License 2.0 Zero configuration</blockquote><p>-- Alex Reelsen<h2 id=basic-capabilities>Basic capabilities</h2><p>ElasticSearch lets you ask questions about your data, not just make queries. You may think SQL can do this too, but what’s important is making a pipeline of facets, and feed the results from query to query.<p>Instead of changing your data, you can be flexible with your questions with no need to re-index it every time the questions change.<p>ElasticSearch is not just to search for full-text data, either. It can search for structured data and return more than just the results. 
It also yields additional data, such as ranking, highlights, and allows for pagination.<p>It doesn’t take a lot of configuration to get running, either, which can be a good boost on productivity.<h2 id=how-does-it-work>How does it work?</h2><p>ElasticSearch depends on Java, and can work in a distributed cluster if you execute multiple instances. Data will be replicated and sharded as needed. The current version at the time of writing is 7.6.1, and it’s being developed fast!<p>It also has support for plugins, with an ever-growing ecosystem and integration on many programming languages. Tools around it are being built around it, too, like Kibana which helps you visualize your data.<p>The way you use it is through a JSON API, served over HTTP/S.<h2 id=how-can-i-use-it>How can I use it?</h2><p><a href=https://www.elastic.co/downloads/>You can try ElasticSearch out for free on Elastic Cloud</a>, however, it can also be <a href=https://www.elastic.co/downloads/elasticsearch>downloaded and ran offline</a>, which is what we’ll do. Download the file corresponding to your operating system, unzip it, and execute the binary. Running it is as simple as that!<p>Now you can make queries to it over HTTP, with for example <code>curl</code>:<pre><code>curl -X PUT localhost:9200/orders/order/1 -d ' { "created_at": "2013/09/05 15:45:10", "items": [
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Tips for Outpost | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Tips for Outpost</h1><div class=time><p>2020-05-10<p>last updated 2020-05-22</div><p><a href=https://store.steampowered.com/app/1127110/Outpost/>Outpost</a> is a fun little game by Open Mid Interactive that has popped in recently in my recommended section of Steam, and I decided to give it a try.<p>It's a fun tower-defense game with progression, different graphics and random world generation which makes it quite fun for a few hours. In this post I want to talk about some tips I found useful to get past night 50.<h2 id=build-pattern>Build Pattern</h2><p>At first, you may be inclined to design a checkerboard pattern like the following, where "C" is the Crystal shrine, "S" is a stone launcher and "B" is a booster:<p><img src=https://lonami.dev/blog/tips-outpost/outpost-bad-pattern.svg alt="Bad Outpost build pattern"><p>Indeed, this pattern will apply <strong>4</strong> boosts to every turret, but unfortunately, the other 4 slots of the booster are wasted! This is because boosters are able to power 8 different towers, and you really want to maximize that. Here's a better design:<p><img src=https://lonami.dev/blog/tips-outpost/outpost-good-pattern.svg alt="Good Outpost build pattern"><p>The shrine's tower does get boosted, but it's still not really worth it to boost it. This pattern works good, and it's really easy to tile: just repeat the same 3x3 pattern.<p>Nonetheless, we can do better. 
What if we applied multiple boosters to the same tower while still applying all 8 boosts?<p><img src=https://lonami.dev/blog/tips-outpost/outpost-best-pattern.svg alt="Best Outpost build pattern"><p>That's what peak performance looks like. You can actually apply multiple boosters to the same tower, and it works great.<p>Now, is it really worth it building anywhere except around the shrine? Not really. You never know where a boss will come from, so all sides need a lot of defense if you want to stand a chance.<p>The addition of traps in 1.6 is amazing. You want to build these outside your strong "core", mostly to slow the enemies down so your turrets have more time to finish them off. Don't waste boosters on the traps, and build them at a reasonable distance from the center (the sixth tile is a good spot):<p><img src=https://lonami.dev/blog/tips-outpost/outpost-trap-pattern.svg alt="Trap Outpost build pattern"><p>If you gather enough materials, you can build more trap and cannon layers outside, roughly at enough distance to slow them for enough duration until they reach the next layer of traps, and so on. Probably a single gap of "cannon, booster, cannon" is enough between trap layers, just not in the center where you need a lot of fire power.<h2 id=talents>Talents</h2><p>Talents are the way progression works in the game. Generally, after a run, you will have enough experience to upgrade nearly all talents of roughly the same tier. However, some are worth upgrading more than others (which provide basically no value).<p>The best ones to upgrade are:<ul><li>Starting supplies. Amazing to get good tools early.<li>Shrine shield. Very useful to hold against tough bosses.<li>Better buildings (cannon, boosters, bed and traps). They're a must to deal the most damage.<li>Better pickaxe. Stone is limited, so better make good use of it.<li>Better chests. They provide an insane amount of resources early.<li>Winter slow. 
Turrets will have more time to deal damage, it's perfect.<li>More time. Useful if you're running out, although generally you enter nights early after having a good core anyway.<li>More rocks. Similar to a better pickaxe, more stone is always better.</ul><p>Some decent ones:<ul><li>In-shrine turret. It's okay to get past the first night without building but not much beyond that.<li>Better axe and greaves. Great to save some energy and really nice quality of life to move around.<li>Tree growth. Normally there's enough trees for this not to be an issue but it can save some time gathering wood.<li>Wisps. They're half-decent since they can provide materials once you max out or max out expensive gear.</ul><p>Some okay ones:<ul><li>Extra XP while playing. Generally not needed due to the way XP scales per night, but can be a good boost.<li>Runestones. Not as reliable as chests but some can grant more energy per day.</ul><p>Some crap ones:<ul><li>Boosts for other seasons. I mean, winter is already the best, no use there.<li>Bow. The bow is very useless at the moment, it's not worth your experience.<li>More energy per bush. Not really worth hunting for bushes since you will have enough energy to do well.</ul><h2 id=turrets>Turrets</h2><p>Always build the highest tier, there's no point in anything lower than that. You will need to deal a lot of damage in a small area, which means space is a premium.<h2 id=boosters>Boosters</h2><p>If you're very early in the game, I recommend alternating both the flag and torch in a checkerboard pattern where the boosters should go in the pattern above. This way your towers will get extra speed and extra range, which works great.<p>When you're in mid-game (stone launchers, gears and campfires), I do not recommend using campfires. The issue is their range boost is way too long, and the turrets will miss quite a few shots. It's better to put all your power into fire speed for increased DPS, at least near the center. 
If you manage to build too far out and some of the turrets hardly ever shoot, you may put campfires there.<p>In end-game, of course alternate both of the highest tier upgrades. They are really good, and provide the best benefit / cost ratio.<h2 id=gathering-materials>Gathering Materials</h2><p>It is <strong>very</strong> important to use all your energy every day! Otherwise it will go to waste, and you will need a lot of materials.<p>As of 1.6, you can mine two things at once if they're close enough! I don't know if this is intended or a bug, but it sure is great.<p>Once you're in mid-game, your stone-based fort should stand pretty well against the nights on its own. After playing for a while you will notice, if your base can defend a boss, then it will have no issue carrying you through the nights until the next boss. You can (and should!) spend the nights gathering materials, but only when you're confident that the night won't run out.<p>Before the boss hits (every fifth night), come back to your base and use all of your materials. This is the next fort upgrade that will carry it through the next five nights.<p>You may also speed up time during the night, but make sure you use all your energy beforehand. And also take care, in the current version of the game speeding up time only speeds up monster movement, not the fire rate or projectile speed of your turrets! This means they will miss more shots and can be pretty dangerous. If you're speeding up time, consider speeding it up for a little bit, then go back to normal until things are more calm, and repeat.<p>If you're in the end-game, try to rush for chests. They provide a huge amount of materials which is really helpful to upgrade all your tools early so you can make sure to get the most out of every rock left in the map.<p>In the end-game, after all stone has been collected, you don't really need to use all of your energy anymore. Just enough to have enough wood to build with the remaining stone.
This will also be nice with the bow upgrades, which admittedly can get quite powerful, but it's best to have a strong fort first.<h2 id=season>Season</h2><p>In my opinion, winter is just the best of the seasons. You don't <em>really</em> need that much energy (it gets tiresome), or extra tree drops, or luck. Slower movement means your turrets will be able to shoot enemies for longer, dealing more damage over time, giving them more chance to take enemies out before they reach the shrine.<p>Feel free to re-roll the map a few times (play and exit, or even restart the game) until you get winter if you want to go for The Play.<h2 id=gear>Gear</h2><p>In my opinion, you really should rush for the best pickaxe you can afford. Stone is a limited resource that doesn't regrow like trees, so once you run out, it's over. Better to make the best use out of it with a good pickaxe!<p>You may also upgrade your greaves, we all know faster movement is a <em>really</em> nice quality of life improvement.<p>Of course, you will eventually upgrade your axe to chop wood (otherwise it's wasted energy, really), but it's not as much of a priority as the pickaxe.<p>Now, the bow is completely useless. Don't bother with it. Your energy is better spent gathering materials to build permanent turrets that deal constant damage while you're away, and the damage adds up with every extra turret you build.<p>With regard to items you carry (like sword, or helmet), look for these (from best to worst):<ul><li>Less minion life.<li>Chance to not consume energy.<li>+1 turret damage.<li>Extra energy.<li>+1 drop from trees or stones.<li>+1 free wood or stone per day.</ul><p>Less minion life, nothing to say. You will need it near end-game.<p>The chance to not consume energy is better the more energy you have.
With a 25% chance not to consume energy, you can think of it as 1 extra energy for every 4 energy you have on average.<p>Turret damage is a tough one, it's <em>amazing</em> mid-game (it basically doubles your damage) but falls short once you unlock the cannon where you may prefer other items. Definitely recommended if you're getting started. You may even try to roll it on low tiers by dying on the second night, because it's that good.<p>Extra energy is really good, because it means you can get more materials before it gets too rough. Make sure you have built at least two beds in the first night! This extra energy will pay off for the many nights to come.<p>The problem with free wood or stone per day is that you have, often, five times as much energy per day. By this I mean you can easily get 5 stone every day, which means 5 extra stone, whereas the other would provide just 1 per night. On a good run, you will get around 50 free stone or 250 extra stone. It's a clear winner.<p>In end-game, more quality of life are revealing chests so that you can rush them early, if you like to hunt for them try to make better use of the slot.<h2 id=closing-words>Closing words</h2><p>I hope you enjoy the game as much as I do! Movement is sometimes janky and there are occasional lag spikes, but despite this it should provide at least a few good hours of gameplay.
Beware however a good run can take up to an hour!</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Tips for Outpost | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Tips for Outpost</h1><div class=time><p>2020-05-10<p>last updated 2020-05-22</div><p><a href=https://store.steampowered.com/app/1127110/Outpost/>Outpost</a> is a fun little game by Open Mid Interactive that has popped in recently in my recommended section of Steam, and I decided to give it a try.<p>It's a fun tower-defense game with progression, different graphics and random world generation which makes it quite fun for a few hours. In this post I want to talk about some tips I found useful to get past night 50.<h2 id=build-pattern>Build Pattern</h2><p>At first, you may be inclined to design a checkerboard pattern like the following, where "C" is the Crystal shrine, "S" is a stone launcher and "B" is a booster:<p><img src=https://lonami.dev/blog/tips-outpost/outpost-bad-pattern.svg alt="Bad Outpost build pattern"><p>Indeed, this pattern will apply <strong>4</strong> boosts to every turret, but unfortunately, the other 4 slots of the booster are wasted! 
This is because boosters are able to power 8 different towers, and you really want to maximize that. Here's a better design:<p><img src=https://lonami.dev/blog/tips-outpost/outpost-good-pattern.svg alt="Good Outpost build pattern"><p>The shrine's tower does get boosted, but it's still not really worth it to boost it. This pattern works good, and it's really easy to tile: just repeat the same 3x3 pattern.<p>Nonetheless, we can do better. What if we applied multiple boosters to the same tower while still applying all 8 boosts?<p><img src=https://lonami.dev/blog/tips-outpost/outpost-best-pattern.svg alt="Best Outpost build pattern"><p>That's what peak performance looks like. You can actually apply multiple boosters to the same tower, and it works great.<p>Now, is it really worth it building anywhere except around the shrine? Not really. You never know where a boss will come from, so all sides need a lot of defense if you want to stand a chance.<p>The addition of traps in 1.6 is amazing. You want to build these outside your strong "core", mostly to slow the enemies down so your turrets have more time to finish them off. Don't waste boosters on the traps, and build them at a reasonable distance from the center (the sixth tile is a good spot):<p><img src=https://lonami.dev/blog/tips-outpost/outpost-trap-pattern.svg alt="Trap Outpost build pattern"><p>If you gather enough materials, you can build more trap and cannon layers outside, roughly at enough distance to slow them for enough duration until they reach the next layer of traps, and so on. Probably a single gap of "cannon, booster, cannon" is enough between trap layers, just not in the center where you need a lot of fire power.<h2 id=talents>Talents</h2><p>Talents are the way progression works in the game. Generally, after a run, you will have enough experience to upgrade nearly all talents of roughly the same tier. 
However, some are worth upgrading more than others (which provide basically no value).<p>The best ones to upgrade are:<ul><li>Starting supplies. Amazing to get good tools early.<li>Shrine shield. Very useful to hold against tough bosses.<li>Better buildings (cannon, boosters, bed and traps). They're a must to deal the most damage.<li>Better pickaxe. Stone is limited, so better make good use of it.<li>Better chests. They provide an insane amount of resources early.<li>Winter slow. Turrets will have more time to deal damage, it's perfect.<li>More time. Useful if you're running out, although generally you enter nights early after having a good core anyway.<li>More rocks. Similar to a better pickaxe, more stone is always better.</ul><p>Some decent ones:<ul><li>In-shrine turret. It's okay to get past the first night without building but not much beyond that.<li>Better axe and greaves. Great to save some energy and really nice quality of life to move around.<li>Tree growth. Normally there's enough trees for this not to be an issue but it can save some time gathering wood.<li>Wisps. They're half-decent since they can provide materials once you max out or max out expensive gear.</ul><p>Some okay ones:<ul><li>Extra XP while playing. Generally not needed due to the way XP scales per night, but can be a good boost.<li>Runestones. Not as reliable as chests but some can grant more energy per day.</ul><p>Some crap ones:<ul><li>Boosts for other seasons. I mean, winter is already the best, no use there.<li>Bow. The bow is very useless at the moment, it's not worth your experience.<li>More energy per bush. Not really worth hunting for bushes since you will have enough energy to do well.</ul><h2 id=turrets>Turrets</h2><p>Always build the highest tier, there's no point in anything lower than that. 
You will need to deal a lot of damage in a small area, which means space is a premium.<h2 id=boosters>Boosters</h2><p>If you're very early in the game, I recommend alternating both the flag and torch in a checkerboard pattern where the boosters should go in the pattern above. This way your towers will get extra speed and extra range, which works great.<p>When you're in mid-game (stone launchers, gears and campfires), I do not recommend using campfires. The issue is their range boost is way too long, and the turrets will miss quite a few shots. It's better to put all your power into fire speed for increased DPS, at least near the center. If you manage to build too far out and some of the turrets hardly ever shoot, you may put campfires there.<p>In end-game, of course alternate both of the highest tier upgrades. They are really good, and provide the best benefit / cost ratio.<h2 id=gathering-materials>Gathering Materials</h2><p>It is <strong>very</strong> important to use all your energy every day! Otherwise it will go to waste, and you will need a lot of materials.<p>As of 1.6, you can mine two things at once if they're close enough! I don't know if this is intended or a bug, but it sure is great.<p>Once you're in mid-game, your stone-based fort should stand pretty well against the nights on its own. After playing for a while you will notice, if your base can defend a boss, then it will have no issue carrying you through the nights until the next boss. You can (and should!) spend the nights gathering materials, but only when you're confident that the night won't run out.<p>Before the boss hits (every fifth night), come back to your base and use all of your materials. This is the next fort upgrade that will carry it through the next five nights.<p>You may also speed up time during the night, but make sure you use all your energy beforehand.
And also take care, in the current version of the game speeding up time only speeds up monster movement, not the fire rate or projectile speed of your turrets! This means they will miss more shots and can be pretty dangerous. If you're speeding up time, consider speeding it up for a little bit, then go back to normal until things are more calm, and repeat.<p>If you're in the end-game, try to rush for chests. They provide a huge amount of materials which is really helpful to upgrade all your tools early so you can make sure to get the most out of every rock left in the map.<p>In the end-game, after all stone has been collected, you don't really need to use all of your energy anymore. Just enough to have enough wood to build with the remaining stone. This will also be nice with the bow upgrades, which admittedly can get quite powerful, but it's best to have a strong fort first.<h2 id=season>Season</h2><p>In my opinion, winter is just the best of the seasons. You don't <em>really</em> need that much energy (it gets tiresome), or extra tree drops, or luck. Slower movement means your turrets will be able to shoot enemies for longer, dealing more damage over time, giving them more chance to take enemies out before they reach the shrine.<p>Feel free to re-roll the map a few times (play and exit, or even restart the game) until you get winter if you want to go for The Play.<h2 id=gear>Gear</h2><p>In my opinion, you really should rush for the best pickaxe you can afford. Stone is a limited resource that doesn't regrow like trees, so once you run out, it's over. Better to make the best use out of it with a good pickaxe!<p>You may also upgrade your greaves, we all know faster movement is a <em>really</em> nice quality of life improvement.<p>Of course, you will eventually upgrade your axe to chop wood (otherwise it's wasted energy, really), but it's not as much of a priority as the pickaxe.<p>Now, the bow is completely useless. Don't bother with it.
Your energy is better spent gathering materials to build permanent turrets that deal constant damage while you're away, and the damage adds up with every extra turret you build.<p>With regard to items you carry (like sword, or helmet), look for these (from best to worst):<ul><li>Less minion life.<li>Chance to not consume energy.<li>+1 turret damage.<li>Extra energy.<li>+1 drop from trees or stones.<li>+1 free wood or stone per day.</ul><p>Less minion life, nothing to say. You will need it near end-game.<p>The chance to not consume energy is better the more energy you have. With a 25% chance not to consume energy, you can think of it as 1 extra energy for every 4 energy you have on average.<p>Turret damage is a tough one, it's <em>amazing</em> mid-game (it basically doubles your damage) but falls short once you unlock the cannon where you may prefer other items. Definitely recommended if you're getting started. You may even try to roll it on low tiers by dying on the second night, because it's that good.<p>Extra energy is really good, because it means you can get more materials before it gets too rough. Make sure you have built at least two beds in the first night! This extra energy will pay off for the many nights to come.<p>The problem with free wood or stone per day is that you have, often, five times as much energy per day. By this I mean you can easily get 5 stone every day, which means 5 extra stone, whereas the other would provide just 1 per night. On a good run, you will get around 50 free stone or 250 extra stone. It's a clear winner.<p>In end-game, more quality of life are revealing chests so that you can rush them early, if you like to hunt for them try to make better use of the slot.<h2 id=closing-words>Closing words</h2><p>I hope you enjoy the game as much as I do! Movement is sometimes janky and there are occasional lag spikes, but despite this it should provide at least a few good hours of gameplay.
Beware however a good run can take up to an hour!</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Data Mining, Warehousing and Information Retrieval | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Data Mining, Warehousing and Information Retrieval</h1><div class=time><p>2020-07-03</div><p>During university, there were a few subjects where I had to write blog posts for (either as evaluable tasks or just for fun). I thought it was really fun and I wanted to preserve that work here, with the hopes it's interesting to someone.<p>The posts series were auto-generated from the original HTML files and manually anonymized later.<ul><li><a href=/blog/mdad>Data Mining and Data Warehousing</a><li><a href=/blog/ribw>Information Retrieval and Web Search</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Data Mining, Warehousing and Information Retrieval | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg 
alt=rss></a></div></nav><main><h1 class=title>Data Mining, Warehousing and Information Retrieval</h1><div class=time><p>2020-07-03</div><p>During university, there were a few subjects where I had to write blog posts for (either as evaluable tasks or just for fun). I thought it was really fun and I wanted to preserve that work here, with the hopes it's interesting to someone.<p>The posts series were auto-generated from the original HTML files and manually anonymized later.<ul><li><a href=/blog/mdad>Data Mining and Data Warehousing</a><li><a href=/blog/ribw>Information Retrieval and Web Search</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,4 +1,4 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Writing our own Cheat Engine: Introduction | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Writing our own Cheat Engine: Introduction</h1><div class=time><p>2021-02-07</div><p>This is part 1 on the <em>Writing our own Cheat Engine</em> series.<p><a href=https://cheatengine.org/>Cheat Engine</a> is a tool designed to modify single player games and contains other useful tools within itself that enable its users to debug games or other applications. It comes with a memory scanner, (dis)assembler, inspection tools and a handful other things. In this series, we will be writing our own tiny Cheat Engine capable of solving all steps of the tutorial, and diving into how it all works underneath.<p>Needless to say, we're doing this for private and educational purposes only. One has to make sure to not violate the EULA or ToS of the specific application we're attaching to. This series, much like cheatengine.org, does not condone the illegal use of the code shared.<p>Cheat Engine is a tool for Windows, so we will be developing for Windows as well. However, you can also <a href=https://stackoverflow.com/q/12977179/4759433>read memory from Linux-like systems</a>. <a href=https://github.com/scanmem/scanmem>GameConqueror</a> is a popular alternative to Cheat Engine on Linux systems, so if you feel adventurous, you could definitely follow along too! The techniques shown in this series apply regardless of how we read memory from a process. 
You will learn a fair bit about doing FFI in Rust too.<p>We will be developing the application in Rust, because it enables us to interface with the Windows API easily, is memory safe (as long as we're careful with <code>unsafe</code>!), and is speedy (we will need this for later steps in the Cheat Engine tutorial). You could use any language of your choice though. For example, <a href=https://lonami.dev/blog/ctypes-and-windows/>Python also makes it relatively easy to use the Windows API</a>.<p><a href=https://github.com/cheat-engine/cheat-engine/>Cheat Engine's source code</a> is mostly written in Pascal and C. And it's <em>a lot</em> of code, with a very flat project structure, and files running to thousands of lines of code each. It's daunting<sup class=footnote-reference><a href=#1>1</a></sup>. It's a mature project, with a lot of knowledge encoded in the code base, and a lot of features like distributed scanning or an entire disassembler. Unfortunately, there aren't a lot of comments. For these reasons, I'll do some guesswork when possible as to how it's working underneath, rather than digging into what Cheat Engine is actually doing.<p>With that out of the way, let's get started!<h2 id=welcome-to-the-cheat-engine-tutorial>Welcome to the Cheat Engine Tutorial</h2><details open><summary>Cheat Engine Tutorial: Step 1</summary> <blockquote><p>This tutorial will teach you the basics of cheating in video games. It will also show you foundational aspects of using Cheat Engine (or CE for short). Follow the steps below to get started.<ol><li>Open Cheat Engine if it currently isn't running.<li>Click on the "Open Process" icon (it's the top-left icon with the computer on it, below "File".).<li>With the Process List window now open, look for this tutorial's process in the list. It will look something like "00001F98-Tutorial-x86_64.exe" or "0000047C-Tutorial-i386.exe". 
(The first 8 numbers/letters will probably be different.)<li>Once you've found the process, click on it to select it, then click the "Open" button. (Don't worry about all the other buttons right now. You can learn about them later if you're interested.)</ol><p>Congratulations! If you did everything correctly, the process window should be gone with Cheat Engine now attached to the tutorial (you will see the process name towards the top-center of CE).<p>Click the "Next" button below to continue, or fill in the password and click the "OK" button to proceed to that step.)<p>If you're having problems, simply head over to forum.cheatengine.org, then click on "Tutorials" to view beginner-friendly guides!</blockquote></details><h2 id=enumerating-processes>Enumerating processes</h2><p>Our first step is attaching to the process we want to work with. But we need a way to find that process in the first place! Having to open the task manager, look for the process we care about, note down the process ID (PID), and slap it into the source code is not satisfying at all. Instead, let's enumerate all the processes from within the program, and let the user select one by typing its name.<p>From a quick <a href=https://ddg.gg/winapi%20enumerate%20all%20processes>DuckDuckGo search</a>, we find an official tutorial for <a href=https://docs.microsoft.com/en-us/windows/win32/psapi/enumerating-all-processes>Enumerating All Processes</a>, which leads to the <a href=https://docs.microsoft.com/en-us/windows/win32/api/psapi/nf-psapi-enumprocesses><code>EnumProcesses</code></a> call. Cool! 
Let's slap the <a href=https://crates.io/crates/winapi><code>winapi</code></a> crate into <code>Cargo.toml</code>, because I don't want to write all the definitions by myself:<pre><code class=language-toml data-lang=toml>[dependencies] +<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Writing our own Cheat Engine: Introduction | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Writing our own Cheat Engine: Introduction</h1><div class=time><p>2021-02-07</div><p>This is part 1 of the <em>Writing our own Cheat Engine</em> series.<p><a href=https://cheatengine.org/>Cheat Engine</a> is a tool designed to modify single player games and contains other useful tools within itself that enable its users to debug games or other applications. It comes with a memory scanner, (dis)assembler, inspection tools and a handful of other things. In this series, we will be writing our own tiny Cheat Engine capable of solving all steps of the tutorial, and diving into how it all works underneath.<p>Needless to say, we're doing this for private and educational purposes only. One has to make sure to not violate the EULA or ToS of the specific application we're attaching to. This series, much like cheatengine.org, does not condone the illegal use of the code shared.<p>Cheat Engine is a tool for Windows, so we will be developing for Windows as well. However, you can also <a href=https://stackoverflow.com/q/12977179/4759433>read memory from Linux-like systems</a>. 
<a href=https://github.com/scanmem/scanmem>GameConqueror</a> is a popular alternative to Cheat Engine on Linux systems, so if you feel adventurous, you could definitely follow along too! The techniques shown in this series apply regardless of how we read memory from a process. You will learn a fair bit about doing FFI in Rust too.<p>We will be developing the application in Rust, because it enables us to interface with the Windows API easily, is memory safe (as long as we're careful with <code>unsafe</code>!), and is speedy (we will need this for later steps in the Cheat Engine tutorial). You could use any language of your choice though. For example, <a href=https://lonami.dev/blog/ctypes-and-windows/>Python also makes it relatively easy to use the Windows API</a>.<p><a href=https://github.com/cheat-engine/cheat-engine/>Cheat Engine's source code</a> is mostly written in Pascal and C. And it's <em>a lot</em> of code, with a very flat project structure, and files running to thousands of lines of code each. It's daunting<sup class=footnote-reference><a href=#1>1</a></sup>. It's a mature project, with a lot of knowledge encoded in the code base, and a lot of features like distributed scanning or an entire disassembler. Unfortunately, there aren't a lot of comments. For these reasons, I'll do some guesswork when possible as to how it's working underneath, rather than digging into what Cheat Engine is actually doing.<p>With that out of the way, let's get started!<h2 id=welcome-to-the-cheat-engine-tutorial>Welcome to the Cheat Engine Tutorial</h2><details open><summary>Cheat Engine Tutorial: Step 1</summary> <blockquote><p>This tutorial will teach you the basics of cheating in video games. It will also show you foundational aspects of using Cheat Engine (or CE for short). 
Follow the steps below to get started.<ol><li>Open Cheat Engine if it currently isn't running.<li>Click on the "Open Process" icon (it's the top-left icon with the computer on it, below "File".).<li>With the Process List window now open, look for this tutorial's process in the list. It will look something like "00001F98-Tutorial-x86_64.exe" or "0000047C-Tutorial-i386.exe". (The first 8 numbers/letters will probably be different.)<li>Once you've found the process, click on it to select it, then click the "Open" button. (Don't worry about all the other buttons right now. You can learn about them later if you're interested.)</ol><p>Congratulations! If you did everything correctly, the process window should be gone with Cheat Engine now attached to the tutorial (you will see the process name towards the top-center of CE).<p>Click the "Next" button below to continue, or fill in the password and click the "OK" button to proceed to that step.)<p>If you're having problems, simply head over to forum.cheatengine.org, then click on "Tutorials" to view beginner-friendly guides!</blockquote></details><h2 id=enumerating-processes>Enumerating processes</h2><p>Our first step is attaching to the process we want to work with. But we need a way to find that process in the first place! Having to open the task manager, look for the process we care about, note down the process ID (PID), and slap it into the source code is not satisfying at all. Instead, let's enumerate all the processes from within the program, and let the user select one by typing its name.<p>From a quick <a href=https://ddg.gg/winapi%20enumerate%20all%20processes>DuckDuckGo search</a>, we find an official tutorial for <a href=https://docs.microsoft.com/en-us/windows/win32/psapi/enumerating-all-processes>Enumerating All Processes</a>, which leads to the <a href=https://docs.microsoft.com/en-us/windows/win32/api/psapi/nf-psapi-enumprocesses><code>EnumProcesses</code></a> call. Cool! 
Let's slap the <a href=https://crates.io/crates/winapi><code>winapi</code></a> crate into <code>Cargo.toml</code>, because I don't want to write all the definitions by myself:<pre><code class=language-toml data-lang=toml>[dependencies] winapi = { version = "0.3.9", features = ["psapi"] } </code></pre><p>Because <a href=https://docs.microsoft.com/en-us/windows/win32/api/psapi/nf-psapi-enumprocesses><code>EnumProcesses</code></a> is in <code>Psapi.h</code> (you can see this on its online documentation page), we know we'll need the <code>psapi</code> crate feature. Another option is to search for it in the <a href=https://docs.rs/winapi/><code>winapi</code> documentation</a> and note down the parent module where it's stored.<p>The documentation for the method has the following remark:<blockquote><p>It is a good idea to use a large array, because it is hard to predict how many processes there will be at the time you call <strong>EnumProcesses</strong>.</blockquote><p><em>Sidenote: reading the documentation for the methods we'll use from the Windows API is extremely important. There are a lot of gotchas involved, so we need to make sure we're extra careful.</em><p>1024 is a pretty big number, so let's go with that:<pre><code class=language-rust data-lang=rust>use std::io; use std::mem;
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> WorldEdit Commands | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>WorldEdit Commands</h1><div class=time><p>2018-07-11</div><p><a href=https://dev.bukkit.org/projects/worldedit>WorldEdit</a> is an extremely powerful tool for modifying entire worlds within <a href=https://minecraft.net>Minecraft</a>, which can be used as either a mod for your single-player worlds or as a plugin for your <a href=https://getbukkit.org/>Bukkit</a> servers.<p>This command guide was written for Minecraft 1.12.1, version <a href=https://dev.bukkit.org/projects/worldedit/files/2460562>6.1.7.3</a>, but should work for newer versions too. All WorldEdit commands can be used with a double slash (<code>//</code>) so they don't conflict with built-in commands. This means you can get a list of all commands with <code>//help</code>. Let's explore different categories!<h2 id=movement>Movement</h2><p>In order to edit a world properly you need to learn how to move in said world properly. 
There are several straightforward commands that let you move:<ul><li><code>//ascend</code> goes up one floor.<li><code>//descend</code> goes down one floor.<li><code>//thru</code> lets you pass through walls.<li><code>//jumpto</code> takes you wherever you are looking.</ul><h2 id=information>Information</h2><p>Knowing your world properly is as important as knowing how to move within it, and will also let you change the information in said world if you need to.<ul><li><code>//biomelist</code> shows all known biomes.<li><code>//biomeinfo</code> shows the current biome.<li><code>//setbiome</code> lets you change the biome.</ul><h2 id=blocks>Blocks</h2><p>You can act over all blocks in a radius around you with quite a few commands. Some won't actually act over the entire range you specify, so 100 is often a good number.<h3 id=filling>Filling</h3><p>You can fill pools with <code>//fill water 100</code> or caves with <code>//fillr water 100</code>, both of which act below your feet.<h3 id=fixing>Fixing</h3><p>If the water or lava is buggy use <code>//fixwater 100</code> or <code>//fixlava 100</code> respectively.<p>Some creeper removed the snow or the grass? Fear not, you can use <code>//snow 10</code> or <code>//grass 10</code>.<h3 id=emptying>Emptying</h3><p>You can empty a pool completely with <code>//drain 100</code>, remove the snow with <code>//thaw 10</code>, and remove fire with <code>//ex 10</code>.<h3 id=removing>Removing</h3><p>You can remove blocks above and below you in some area with <code>//removeabove N</code> and <code>//removebelow N</code>. You probably want to set a limit though, or you could fall off the world: <code>//removebelow 1 10</code> sets both the radius and the depth. You can also remove near blocks with <code>//removenear block 10</code>.<h3 id=shapes>Shapes</h3><p>Making a cylinder (or circle) can be done through <code>//cyl stone 10</code>, with an optional third argument for the height. 
The radius can be comma-separated to make an ellipse instead, such as <code>//cyl stone 5,10</code>.<p>Spheres are done with <code>//sphere stone 5</code>. This will build one right at your center, so you can raise it to be on your feet with <code>//sphere stone 5 yes</code>. Similar to cylinders, you can comma-separate the radii <code>x,y,z</code>.<p>Pyramids can be done with <code>//pyramid stone 5</code>.<p>All these commands can be prefixed with "h" to make them hollow. For instance, <code>//hsphere stone 10</code>.<h2 id=regions>Regions</h2><h3 id=basics>Basics</h3><p>Operating over an entire region is really important, and the first thing you need to work comfortably with them is a tool to make selections. The default wooden-axe tool can be obtained with <code>//wand</code>, but you must be near the blocks to select. You can use a different tool, like a golden axe, to use as your "far wand" (wand usable over distance). Once you have one in your hand type <code>//farwand</code> to use it as your "far wand". You can select the two corners of your region with left and right click. If you have selected the wrong tool, use <code>//none</code> to clear it.<p>If there are no blocks but you want to use your current position as a corner, use <code>//pos1</code> or <code>//pos2</code>.<p>If you made a region too small, you can enlarge it with <code>//expand 10 up</code>, or <code>//expand vert</code> for the entire vertical range, etc., or make it smaller with <code>//contract 10 up</code> etc., or <code>//inset</code> it to contract in both directions. You can use short-names for the cardinal directions (NSEW).<p>Finally, if you want to move your selection, you can <code>//shift 1 north</code> it to wherever you need.<h3 id=information-1>Information</h3><p>You can get the <code>//size</code> of the selection or even <code>//count torch</code> in some area. 
If you want to count all blocks, get their distribution with <code>//distr</code>.<h3 id=filling-1>Filling</h3><p>With a region selected, you can <code>//set</code> it to be any block! For instance, you can use <code>//set air</code> to clear it entirely. You can use more than one block evenly by separating them with a comma <code>//set stone,dirt</code>, or with a custom chance <code>//set 20%stone,80%dirt</code>.<p>You can use <code>//replace from to</code> instead if you don't want to override all blocks in your selection.<p>You can make a hollow set with <code>//faces</code>, and if you just want the walls, use <code>//walls</code>.<h3 id=cleaning>Cleaning</h3><p>If someone destroyed your wonderful snow landscape, fear not, you can use <code>//overlay snow</code> over it (although for this you actually have <code>//snow N</code> and its opposite <code>//thaw</code>).<p>If you set some rough area, you can always <code>//smooth</code> it, even more than one time with <code>//smooth 3</code>. You can get your dirt and stone back with <code>//naturalize</code> and put some plants with <code>//flora</code> or <code>//forest</code>, both of which support a density or even the type for the trees. If you already have the dirt use <code>//green</code> instead. If you want some pumpkins, use <code>//pumpkins</code>.<h3 id=moving>Moving</h3><p>You can repeat an entire selection many times by stacking them with <code>//stack N DIR</code>. This is extremely useful to make things like corridors or elevators. For instance, you can make a small section of the corridor, select it entirely, and then repeat it 10 times with <code>//stack 10 north</code>. Or you can make the elevator and then <code>//stack 10 up</code>. If you need to also copy the air use <code>//stackair</code>.<p>Finally, if you don't need to repeat it and simply want to move it a bit in the right direction, you can use <code>//move N</code>. 
The default direction is "me" (towards where you are facing) but you can set one with <code>//move 1 up</code> for example.<h3 id=selecting>Selecting</h3><p>You are not limited to selecting cuboids. You can also select different shapes, or even just points:<ul><li><code>//sel cuboid</code> is the default.<li><code>//sel extend</code> expands the default.<li><code>//sel poly</code> first point with left click and right click to add new points.<li><code>//sel ellipsoid</code> first point to select the center and right click to select the different radii.<li><code>//sel sphere</code> first point to select the center and one more right click for the radius.<li><code>//sel cyl</code> for cylinders, first click being the center.<li><code>//sel convex</code> for convex shapes. This one is extremely useful for <code>//curve</code>.</ul><h2 id=brushes>Brushes</h2><p>Brushes are a way to paint in 3D without first bothering about making a selection, and there are spherical and cylinder brushes with e.g. <code>//brush sphere stone 2</code>, or the shorter form <code>//br s stone</code>. For cylinders, one must use <code>cyl</code> instead of <code>sphere</code>.<p>There also exists a brush to smooth the terrain which can be enabled on the current item with <code>//br smooth</code>, which can be used with right-click like any other brush.<h2 id=clipboard>Clipboard</h2><p>Finally, you can copy and cut things around like you would do with normal text with <code>//copy</code> and <code>//cut</code>. 
The copy is issued from wherever you issue the command, so when you use <code>//paste</code>, remember that if you were 4 blocks apart when copying, it will be 4 blocks apart when pasting.<p>The contents of the clipboard can be flipped to wherever you are looking via <code>//flip</code>, and can be rotated via the <code>//rotate 90</code> command (in degrees).<p>To remove the copy use <code>//clearclipboard</code>.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> WorldEdit Commands | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog class=selected>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>WorldEdit Commands</h1><div class=time><p>2018-07-11</div><p><a href=https://dev.bukkit.org/projects/worldedit>WorldEdit</a> is an extremely powerful tool for modifying entire worlds within <a href=https://minecraft.net>Minecraft</a>, which can be used as either a mod for your single-player worlds or as a plugin for your <a href=https://getbukkit.org/>Bukkit</a> servers.<p>This command guide was written for Minecraft 1.12.1, version <a href=https://dev.bukkit.org/projects/worldedit/files/2460562>6.1.7.3</a>, but should work for newer versions too. 
All WorldEdit commands can be used with a double slash (<code>//</code>) so they don't conflict with built-in commands. This means you can get a list of all commands with <code>//help</code>. Let's explore different categories!<h2 id=movement>Movement</h2><p>In order to edit a world properly you need to learn how to move in said world properly. There are several straightforward commands that let you move:<ul><li><code>//ascend</code> goes up one floor.<li><code>//descend</code> goes down one floor.<li><code>//thru</code> lets you pass through walls.<li><code>//jumpto</code> takes you wherever you are looking.</ul><h2 id=information>Information</h2><p>Knowing your world properly is as important as knowing how to move within it, and will also let you change the information in said world if you need to.<ul><li><code>//biomelist</code> shows all known biomes.<li><code>//biomeinfo</code> shows the current biome.<li><code>//setbiome</code> lets you change the biome.</ul><h2 id=blocks>Blocks</h2><p>You can act over all blocks in a radius around you with quite a few commands. Some won't actually act over the entire range you specify, so 100 is often a good number.<h3 id=filling>Filling</h3><p>You can fill pools with <code>//fill water 100</code> or caves with <code>//fillr water 100</code>, both of which act below your feet.<h3 id=fixing>Fixing</h3><p>If the water or lava is buggy use <code>//fixwater 100</code> or <code>//fixlava 100</code> respectively.<p>Some creeper removed the snow or the grass? Fear not, you can use <code>//snow 10</code> or <code>//grass 10</code>.<h3 id=emptying>Emptying</h3><p>You can empty a pool completely with <code>//drain 100</code>, remove the snow with <code>//thaw 10</code>, and remove fire with <code>//ex 10</code>.<h3 id=removing>Removing</h3><p>You can remove blocks above and below you in some area with <code>//removeabove N</code> and <code>//removebelow N</code>. 
You probably want to set a limit though, or you could fall off the world: <code>//removebelow 1 10</code> sets both the radius and the depth. You can also remove near blocks with <code>//removenear block 10</code>.<h3 id=shapes>Shapes</h3><p>Making a cylinder (or circle) can be done through <code>//cyl stone 10</code>, with an optional third argument for the height. The radius can be comma-separated to make an ellipse instead, such as <code>//cyl stone 5,10</code>.<p>Spheres are done with <code>//sphere stone 5</code>. This will build one right at your center, so you can raise it to be on your feet with <code>//sphere stone 5 yes</code>. Similar to cylinders, you can comma-separate the radii <code>x,y,z</code>.<p>Pyramids can be done with <code>//pyramid stone 5</code>.<p>All these commands can be prefixed with "h" to make them hollow. For instance, <code>//hsphere stone 10</code>.<h2 id=regions>Regions</h2><h3 id=basics>Basics</h3><p>Operating over an entire region is really important, and the first thing you need to work comfortably with them is a tool to make selections. The default wooden-axe tool can be obtained with <code>//wand</code>, but you must be near the blocks to select. You can use a different tool, like a golden axe, to use as your "far wand" (wand usable over distance). Once you have one in your hand type <code>//farwand</code> to use it as your "far wand". You can select the two corners of your region with left and right click. If you have selected the wrong tool, use <code>//none</code> to clear it.<p>If there are no blocks but you want to use your current position as a corner, use <code>//pos1</code> or <code>//pos2</code>.<p>If you made a region too small, you can enlarge it with <code>//expand 10 up</code>, or <code>//expand vert</code> for the entire vertical range, etc., or make it smaller with <code>//contract 10 up</code> etc., or <code>//inset</code> it to contract in both directions. 
You can use short-names for the cardinal directions (NSEW).<p>Finally, if you want to move your selection, you can <code>//shift 1 north</code> it to wherever you need.<h3 id=information-1>Information</h3><p>You can get the <code>//size</code> of the selection or even <code>//count torch</code> in some area. If you want to count all blocks, get their distribution with <code>//distr</code>.<h3 id=filling-1>Filling</h3><p>With a region selected, you can <code>//set</code> it to be any block! For instance, you can use <code>//set air</code> to clear it entirely. You can use more than one block evenly by separating them with a comma <code>//set stone,dirt</code>, or with a custom chance <code>//set 20%stone,80%dirt</code>.<p>You can use <code>//replace from to</code> instead if you don't want to override all blocks in your selection.<p>You can make a hollow set with <code>//faces</code>, and if you just want the walls, use <code>//walls</code>.<h3 id=cleaning>Cleaning</h3><p>If someone destroyed your wonderful snow landscape, fear not, you can use <code>//overlay snow</code> over it (although for this you actually have <code>//snow N</code> and its opposite <code>//thaw</code>).<p>If you set some rough area, you can always <code>//smooth</code> it, even more than one time with <code>//smooth 3</code>. You can get your dirt and stone back with <code>//naturalize</code> and put some plants with <code>//flora</code> or <code>//forest</code>, both of which support a density or even the type for the trees. If you already have the dirt use <code>//green</code> instead. If you want some pumpkins, use <code>//pumpkins</code>.<h3 id=moving>Moving</h3><p>You can repeat an entire selection many times by stacking them with <code>//stack N DIR</code>. This is extremely useful to make things like corridors or elevators. For instance, you can make a small section of the corridor, select it entirely, and then repeat it 10 times with <code>//stack 10 north</code>. 
Or you can make the elevator and then <code>//stack 10 up</code>. If you need to also copy the air use <code>//stackair</code>.<p>Finally, if you don't need to repeat it and simply want to move it a bit in the right direction, you can use <code>//move N</code>. The default direction is "me" (towards where you are facing) but you can set one with <code>//move 1 up</code> for example.<h3 id=selecting>Selecting</h3><p>You are not limited to selecting cuboids. You can also select different shapes, or even just points:<ul><li><code>//sel cuboid</code> is the default.<li><code>//sel extend</code> expands the default.<li><code>//sel poly</code> first point with left click and right click to add new points.<li><code>//sel ellipsoid</code> first point to select the center and right click to select the different radii.<li><code>//sel sphere</code> first point to select the center and one more right click for the radius.<li><code>//sel cyl</code> for cylinders, first click being the center.<li><code>//sel convex</code> for convex shapes. This one is extremely useful for <code>//curve</code>.</ul><h2 id=brushes>Brushes</h2><p>Brushes are a way to paint in 3D without first bothering about making a selection, and there are spherical and cylinder brushes with e.g. <code>//brush sphere stone 2</code>, or the shorter form <code>//br s stone</code>. For cylinders, one must use <code>cyl</code> instead of <code>sphere</code>.<p>There also exists a brush to smooth the terrain which can be enabled on the current item with <code>//br smooth</code>, which can be used with right-click like any other brush.<h2 id=clipboard>Clipboard</h2><p>Finally, you can copy and cut things around like you would do with normal text with <code>//copy</code> and <code>//cut</code>. 
The copy is issued from wherever you issue the command, so when you use <code>//paste</code>, remember that if you were 4 blocks apart when copying, it will be 4 blocks apart when pasting.<p>The contents of the clipboard can be flipped to wherever you are looking via <code>//flip</code>, and can be rotated via the <code>//rotate 90</code> command (in degrees).<p>To remove the copy use <code>//clearclipboard</code>.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Apuntes de bachillerato de Filosofía | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb class=selected>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Apuntes de bachillerato de Filosofía</h1><div class=time><p>2016-06-21</div><p>Some subjects are truly worth it, and philosophy is one of them. Really. You learn a ton of things and open your mind, you compare many points of view and come to realize great things. That's why I want to share my notes with anyone interested.<p>Personally, my favourite author is Friedrich Nietzsche (pronounced <em>/niche/</em>). He is covered in the third term, the most interesting one, although if you'd prefer some context, I recommend reading everything. You can read it as if it were any other book:<ul><li>Download as .pdf <ul><li><a href=https://lonami.dev/golb/filosofia/filo_trimestre1.pdf>Filosofía - Primer trimestre</a><li><a href=https://lonami.dev/golb/filosofia/filo_trimestre2.pdf>Filosofía - Segundo trimestre</a><li><a href=https://lonami.dev/golb/filosofia/filo_trimestre3.pdf>Filosofía - Tercer trimestre</a></ul><li>Download as .odt (you can edit it) <ul><li><a href=https://lonami.dev/golb/filosofia/filo_trimestre1.odt>Filosofía - Primer trimestre</a><li><a href=https://lonami.dev/golb/filosofia/filo_trimestre2.odt>Filosofía - Segundo trimestre</a><li><a href=https://lonami.dev/golb/filosofia/filo_trimestre3.odt>Filosofía - Tercer trimestre</a></ul></ul><p>Note: There are a few somewhat crude words. They add excitement and are no big deal, two or three at most! 
:)</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Apuntes de bachillerato de Filosofía | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb class=selected>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Apuntes de bachillerato de Filosofía</h1><div class=time><p>2016-06-21</div><p>Some subjects are truly worth it, and philosophy is one of them. Really. You learn a ton of things and open your mind, you compare many points of view and come to realize great things. That's why I want to share my notes with anyone interested.<p>Personally, my favourite author is Friedrich Nietzsche (pronounced <em>/niche/</em>). He is covered in the third term, the most interesting one, although if you'd prefer some context, I recommend reading everything. 
You can read it as if it were any other book:<ul><li>Download as .pdf <ul><li><a href=https://lonami.dev/golb/filosofia/filo_trimestre1.pdf>Filosofía - Primer trimestre</a><li><a href=https://lonami.dev/golb/filosofia/filo_trimestre2.pdf>Filosofía - Segundo trimestre</a><li><a href=https://lonami.dev/golb/filosofia/filo_trimestre3.pdf>Filosofía - Tercer trimestre</a></ul><li>Download as .odt (you can edit it) <ul><li><a href=https://lonami.dev/golb/filosofia/filo_trimestre1.odt>Filosofía - Primer trimestre</a><li><a href=https://lonami.dev/golb/filosofia/filo_trimestre2.odt>Filosofía - Segundo trimestre</a><li><a href=https://lonami.dev/golb/filosofia/filo_trimestre3.odt>Filosofía - Tercer trimestre</a></ul></ul><p>Note: There are a few somewhat crude words. They add excitement and are no big deal, two or three at most! :)</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Lonami's Golb </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb class=selected>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>My Golb</h1><p>Welcome to my golb!<p>It's like my blog, but with things that are a bit more… personal? Random? Spanish? Yeah!<hr><ul><li><a href=https://lonami.dev/golb/making-a-difference/>Making a Difference</a><li><a href=https://lonami.dev/golb/sentences/>Sentences</a><li><a href=https://lonami.dev/golb/filosofia/>Apuntes de bachillerato de Filosofía</a><li><a href=https://lonami.dev/golb/reflexion-ia/>Reflexión sobre la Inteligencia artificial</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/>Inteligencia artificial</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Lonami's Golb </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb class=selected>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>My Golb</h1><p>Welcome to my golb!<p>It's like my blog, but with things 
that are a bit more… personal? Random? Spanish? Yeah!<hr><ul><li><a href=https://lonami.dev/golb/making-a-difference/>Making a Difference</a><li><a href=https://lonami.dev/golb/sentences/>Sentences</a><li><a href=https://lonami.dev/golb/filosofia/>Apuntes de bachillerato de Filosofía</a><li><a href=https://lonami.dev/golb/reflexion-ia/>Reflexión sobre la Inteligencia artificial</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/>Inteligencia artificial</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Inteligencia artificial | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb class=selected>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Inteligencia artificial</h1><div class=time><p>2016-02-24<p>last updated 2016-03-05</div><h2 id=indice>Índice</h2><ul><li><a href=https://lonami.dev/golb/inteligencia-artificial/#qu%C3%A9_es>Qué es</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#en_qu%C3%A9_consiste>En qué consiste</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#l%C3%ADmites>Límites</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#tipos_de_inteligencia_artificial>Tipos de inteligencia artificial</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#aplicaciones_pr%C3%A1cticas>Aplicaciones prácticas</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#implicaciones_%C3%A9ticas>Implicaciones éticas</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#ejemplos>Ejemplos</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#conceptos>Conceptos</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#fuentes>Fuentes</a></ul><h2 id=que-es>Qué es</h2><p>La inteligencia artificial es una rama apasionante que tiene su origen en la <strong>informática</strong> y se basa en el concepto de conseguir <em>emular</em><sup class=footnote-reference><a href=#1>1</a></sup> al cerebro humano, mediante el desarrollo un programa que sea capaz de <strong>aprender y mejorar por sí sólo</strong> (normalmente bajo algún tipo de supervisión)<p>Fue un concepto acuñado por <em>John McCarthy</em> en un congreso de informática de 1956, y desde 
entonces este campo ha crecido de manera exponencial con unas buenas previsiones de futuro.<p><img src=https://lonami.dev/golb/inteligencia-artificial/human_progress_edge.svg alt="Progreso humano en la inteligencia artificial"><p><em>Progreso humano en la inteligencia artificial. <a href=http://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1>Fuente</a></em><h2 id=en-que-consiste>En qué consiste</h2><p>La inteligencia artificial no consiste en escribir unas pautas fijas y finitas al igual que hacen la gran mayoría de programas, en los cuales introduces unos datos y producen siempre la misma salida, una salida predecible, programada e invariable, que además, tiene sus límites, ya que si introduces datos para los que la aplicación no está programada, esta aplicación no será capaz de manejarlos. No los entenderá y no producirá ningún resultado.<p>La inteligencia artificial consiste en dar un paso <strong>más allá</strong>. Una inteligencia artificial <em>entrenada</em> es capaz de manejar datos para los cuales no ha sido programada de manera explícita<sup class=footnote-reference><a href=#2>2</a></sup><h2 id=limites>Límites</h2><p>Actualmente, la inteligencia artificial se ve limitada por la velocidad y capacidad de los dispositivos (ordenadores, teléfonos inteligentes).<p>A día de hoy, ya hemos conseguido emular el cerebro de un gusano de un milímetro de longitud, que consiste de un total de trescientas dos neuronas. El <strong>cerebro humano</strong> contiene unas <strong>cien mil millones de neuronas</strong>.<p><img src=https://lonami.dev/golb/inteligencia-artificial/exponential_grow.gif alt="Progreso en la velocidad de los dispositivos"><p><em>Crecimiento en la velocidad de procesado de los dispositivos. 
<a href=http://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1>Fuente</a></em><p>Comparado con las neuronas de un cerebro humano (cuya velocidad<sup class=footnote-reference><a href=#3>3</a></sup> máxima oscilan entre los 200Hz), los procesadores de hoy en día (mucho más lentos que los que tendremos dentro de algunos años) ya tienen una velocidad superior a los 2Ghz, es decir, <strong>10 millones de veces</strong> más rápidos que las neuronas. Y la comunicación interna del cerebro, que oscila entre los 120m/s, queda infinitamente distante de la velocidad de los ordenadores que se comunican de manera óptica a la <strong>velocidad de la luz</strong>.<p>Además de todo esto, la capacidad de los dispositivos puede ser <strong>ampliada</strong>, a diferencia del cerebro que tiene un tamaño ya determinado. Y, por último, un procesador puede estar <strong>trabajando sin parar</strong> nunca, sin cansarse.<h2 id=tipos-de-inteligencia-artificial>Tipos de inteligencia artificial</h2><h3 id=segun-el-tipo-de-aprendizaje>Según el tipo de aprendizaje</h3><ul><li><strong>Aprendizaje supervisado</strong>: se le presenta una entrada de datos y produce una salida de los datos procesados, y un "tutor" es el que determina si la salida es correcta o no.<li><strong>Aprendizaje sin supervisar</strong>: se le presenta una entrada de datos sin presentarle ningún otro tipo de información, para que encuentre la estructura de los datos por sí sóla.<li><strong>Aprendizaje por refuerzo</strong>: un ordenador interactua con un entorno variable en el que debe llevar a cabo una tarea concreta, sin que un tutor le indique cómo explícitamente.</ul><h3 id=segun-la-forma-de-llevarlo-a-cabo-principales-metodos>Según la forma de llevarlo a cabo (principales métodos)</h3><ul><li><p><strong>Aprendizaje por árbol de decisiones</strong>. 
Este aprendizaje usa un árbol de decisiones, que almacena observaciones y conclusiones.</p> <p><img src=https://lonami.dev/golb/inteligencia-artificial/decision_tree.svg alt="Árbol de decisiones"><li><p><strong>Aprendizaje por asociación de reglas</strong>. Utilizado para descubrir relaciones en grandes bases de datos<sup class=footnote-reference><a href=#4>4</a></sup>.<li><p><strong>Red neuronal artificial (RNA)</strong>. Inspirado en redes neuronales biológicas**. Los cálculos se estructuran en un grupo de neuronas artificiales interconectadas.<li><p><strong>Programación lógica inductiva (PLI)</strong>. Se aproxima de manera hipotética, dado un transfondo lógico y una entrada, a una solución que no se le había presentado antes.<li><p><strong>Máquinas de soporte vectorial (MSV)</strong>. Se usan para clasificar y problemas que necesitan de regresión<sup class=footnote-reference><a href=#5>5</a></sup>. Dado una serie de ejemplos, una entrada será clasificada de una forma u otra.<li><p><em><strong>Clustering</strong></em>. Este tipo de análisis consiste en asignar observaciones a ciertas subcategorías (denominadas <em>clústeres</em>), para que aquellas que están en el mismo <em>clúster</em> sean similares**. Este tipo de aprendizaje es una técnica común en análisis estadístico.<li><p><strong>Redes bayesianas</strong>. Es un modelo probabilístico que organiza variables al azar según unas determinadas condiciones mediante un gráfico**. Un ejemplo de red bayesiana es el siguiente:</p> <p><img src=https://lonami.dev/golb/inteligencia-artificial/bayesian_network.svg alt="Red bayesiana"><li><p><strong>Algoritmos genéticos</strong>. 
Imita el proceso evolutivo de la selección natural, y usa métodos como la mutación para generar nuevos "genotipos" que, con algo de suerte, serán mejores en encontrar la solución correcta.</ul><h2 id=aplicaciones-practicas>Aplicaciones prácticas</h2><p>La inteligencia artificial ya se encuentra desde hace algún tiempo entre nosotros, como por ejemplo el archiconocido <strong>buscador Google</strong>, que filtra los resultados más relevantes mediante una inteligencia artificial. Otros ejemplos son el reconocimiento de caracteres a partir de una foto, o incluso reconocimiento del habla con <strong>asistentes virtuales como Cortana o Siri</strong>, en los videojuegos, en bolsa, en los <strong>hospitales</strong>, industria pesada, transportes, juguetes, música, aviación, robótica, filtros anti-spam... y un largo etcétera.<h2 id=implicaciones-eticas>Implicaciones éticas</h2><p>Una vez tengamos la tecnología necesaria para recrear un cerebro humano, si enseñáramos a esta inteligencia artificial al igual que un humano, ¿llegaría a tener <strong>sentimientos</strong>? ¿Sería consciente de su existencia? ¿Podría sentir felicidad, alegría, tristeza, enfado? ¿Tendría <strong>creatividad</strong>? ¿Derecho a propiedad? Si la respuesta es que sí, y es la respuesta más lógica, significa que, en realidad, los sentimientos no son nada más que una manera de entender las cosas. No tienen valor por sí mismos. Seríamos capaces de recrearlos, y tendrían el mismo valor que un sentimiento humano, aunque esa inteligencia viviera dentro de un ordenador. Y acabar con esta inteligencia sería acabar con esta vida, <strong>una vida</strong> casi, por no decir enteramente, <strong>humana</strong>. Además, todo esto implicaría que todo comportamiento humano es predecible. Por último, si esta inteligencia es realmente como un humano, al utilizarla, ¿la estaríamos esclavizando al obligarla a trabajar para nosotros? 
¿En qué momento dejaremos de llamarlos "ordenadores" y comenzaremos a tratarles como "humanos"? ¿Será la humanidad capaz de adaptarse al cambio?<h2 id=ejemplos>Ejemplos</h2><p>En el siguiente algorítmo genético podemos ver como una figura aprende a saltar, obedeciendo a las leyes físicas (ver en <a href=https://youtu.be/Gl3EjiVlz_4>YouTube</a>):</p><iframe width=420 height=315 src=https://www.youtube.com/embed/Gl3EjiVlz_4 frameborder=0 allowfullscreen></iframe><p>Por el contrario, en el siguiente ejemplo, un algorítmo genético aprende a "luchar" contra otra figura: (ver en <a href=https://youtu.be/u2t77mQmJiY>YouTube</a>):</p><iframe width=560 height=315 src=https://www.youtube.com/embed/u2t77mQmJiY frameborder=0 allowfullscreen></iframe><p>Estos cuatro increíbles ejemplos siguientes muestran un proceso evolutivo similar al sufrido por cualquier tipo de ser (ver en <a href=https://youtu.be/GOFws_hhZs8>YouTube</a>):</p><iframe width=560 height=315 src=https://www.youtube.com/embed/GOFws_hhZs8 frameborder=0 allowfullscreen></iframe><iframe width=560 height=315 src=https://www.youtube.com/embed/31dsH2Fs1IQ frameborder=0 allowfullscreen></iframe><iframe width=560 height=315 src=https://www.youtube.com/embed/IVcvvqxtNwE frameborder=0 allowfullscreen></iframe><iframe width=560 height=315 src=https://www.youtube.com/embed/KrTbJUJsDSw frameborder=0 allowfullscreen></iframe><h2 id=conceptos>Conceptos</h2><div class=footnote-definition id=1><sup class=footnote-definition-label>1</sup><p><strong>Emular</strong>. Tratar de imitar un modelo, aproximarse a este. Copiar su comportamiento o incluso mejorarlo.</div><div class=footnote-definition id=2><sup class=footnote-definition-label>2</sup><p><strong>Explícito</strong>. Suceso que ocurre de manera previamente avisada de una forma directa, anticipado <em>sin rodeos</em>.</div><div class=footnote-definition id=3><sup class=footnote-definition-label>3</sup><p><strong>Velocidad (en hercios)</strong>. 
Número de cálculos realizados por segundo. Un procesador con una velocidad de 100Hz es capaz de realizar 100 cálculos por segundo.</div><div class=footnote-definition id=4><sup class=footnote-definition-label>4</sup><p><strong>Base de datos</strong>. Lugar en el que se almacena de manera estructurada una información, como por ejemplo un censo que indique el nombre de las personas, sus apellidos, edad, etcétera.</div><div class=footnote-definition id=5><sup class=footnote-definition-label>5</sup><p><strong>Regresión</strong>. Las pruebas de regresión consisten en someter a un programa a una serie de pruebas para descubrir fallos en este cometidos accidentalmente con anterioridad en versiones anteriores.</div><h2 id=fuentes>Fuentes</h2><ul><li><a href=http://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1>Evolución de la inteligencia artificial - Wait but why</a><li><a href=https://en.wikipedia.org/wiki/Machine_learning><em>Machine learning</em> - Wikipedia</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Inteligencia artificial | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb class=selected>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Inteligencia artificial</h1><div class=time><p>2016-02-24<p>last updated 
2016-03-05</div><h2 id=indice>Índice</h2><ul><li><a href=https://lonami.dev/golb/inteligencia-artificial/#qu%C3%A9_es>Qué es</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#en_qu%C3%A9_consiste>En qué consiste</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#l%C3%ADmites>Límites</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#tipos_de_inteligencia_artificial>Tipos de inteligencia artificial</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#aplicaciones_pr%C3%A1cticas>Aplicaciones prácticas</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#implicaciones_%C3%A9ticas>Implicaciones éticas</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#ejemplos>Ejemplos</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#conceptos>Conceptos</a><li><a href=https://lonami.dev/golb/inteligencia-artificial/#fuentes>Fuentes</a></ul><h2 id=que-es>Qué es</h2><p>La inteligencia artificial es una rama apasionante que tiene su origen en la <strong>informática</strong> y se basa en el concepto de conseguir <em>emular</em><sup class=footnote-reference><a href=#1>1</a></sup> al cerebro humano, mediante el desarrollo un programa que sea capaz de <strong>aprender y mejorar por sí sólo</strong> (normalmente bajo algún tipo de supervisión)<p>Fue un concepto acuñado por <em>John McCarthy</em> en un congreso de informática de 1956, y desde entonces este campo ha crecido de manera exponencial con unas buenas previsiones de futuro.<p><img src=https://lonami.dev/golb/inteligencia-artificial/human_progress_edge.svg alt="Progreso humano en la inteligencia artificial"><p><em>Progreso humano en la inteligencia artificial. 
<a href=http://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1>Fuente</a></em><h2 id=en-que-consiste>En qué consiste</h2><p>La inteligencia artificial no consiste en escribir unas pautas fijas y finitas al igual que hacen la gran mayoría de programas, en los cuales introduces unos datos y producen siempre la misma salida, una salida predecible, programada e invariable, que además, tiene sus límites, ya que si introduces datos para los que la aplicación no está programada, esta aplicación no será capaz de manejarlos. No los entenderá y no producirá ningún resultado.<p>La inteligencia artificial consiste en dar un paso <strong>más allá</strong>. Una inteligencia artificial <em>entrenada</em> es capaz de manejar datos para los cuales no ha sido programada de manera explícita<sup class=footnote-reference><a href=#2>2</a></sup><h2 id=limites>Límites</h2><p>Actualmente, la inteligencia artificial se ve limitada por la velocidad y capacidad de los dispositivos (ordenadores, teléfonos inteligentes).<p>A día de hoy, ya hemos conseguido emular el cerebro de un gusano de un milímetro de longitud, que consiste de un total de trescientas dos neuronas. El <strong>cerebro humano</strong> contiene unas <strong>cien mil millones de neuronas</strong>.<p><img src=https://lonami.dev/golb/inteligencia-artificial/exponential_grow.gif alt="Progreso en la velocidad de los dispositivos"><p><em>Crecimiento en la velocidad de procesado de los dispositivos. <a href=http://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1>Fuente</a></em><p>Comparado con las neuronas de un cerebro humano (cuya velocidad<sup class=footnote-reference><a href=#3>3</a></sup> máxima oscilan entre los 200Hz), los procesadores de hoy en día (mucho más lentos que los que tendremos dentro de algunos años) ya tienen una velocidad superior a los 2Ghz, es decir, <strong>10 millones de veces</strong> más rápidos que las neuronas. 
Y la comunicación interna del cerebro, que oscila entre los 120m/s, queda infinitamente distante de la velocidad de los ordenadores que se comunican de manera óptica a la <strong>velocidad de la luz</strong>.<p>Además de todo esto, la capacidad de los dispositivos puede ser <strong>ampliada</strong>, a diferencia del cerebro que tiene un tamaño ya determinado. Y, por último, un procesador puede estar <strong>trabajando sin parar</strong> nunca, sin cansarse.<h2 id=tipos-de-inteligencia-artificial>Tipos de inteligencia artificial</h2><h3 id=segun-el-tipo-de-aprendizaje>Según el tipo de aprendizaje</h3><ul><li><strong>Aprendizaje supervisado</strong>: se le presenta una entrada de datos y produce una salida de los datos procesados, y un "tutor" es el que determina si la salida es correcta o no.<li><strong>Aprendizaje sin supervisar</strong>: se le presenta una entrada de datos sin presentarle ningún otro tipo de información, para que encuentre la estructura de los datos por sí sóla.<li><strong>Aprendizaje por refuerzo</strong>: un ordenador interactua con un entorno variable en el que debe llevar a cabo una tarea concreta, sin que un tutor le indique cómo explícitamente.</ul><h3 id=segun-la-forma-de-llevarlo-a-cabo-principales-metodos>Según la forma de llevarlo a cabo (principales métodos)</h3><ul><li><p><strong>Aprendizaje por árbol de decisiones</strong>. Este aprendizaje usa un árbol de decisiones, que almacena observaciones y conclusiones.</p> <p><img src=https://lonami.dev/golb/inteligencia-artificial/decision_tree.svg alt="Árbol de decisiones"><li><p><strong>Aprendizaje por asociación de reglas</strong>. Utilizado para descubrir relaciones en grandes bases de datos<sup class=footnote-reference><a href=#4>4</a></sup>.<li><p><strong>Red neuronal artificial (RNA)</strong>. Inspirado en redes neuronales biológicas**. Los cálculos se estructuran en un grupo de neuronas artificiales interconectadas.<li><p><strong>Programación lógica inductiva (PLI)</strong>. 
Se aproxima de manera hipotética, dado un transfondo lógico y una entrada, a una solución que no se le había presentado antes.<li><p><strong>Máquinas de soporte vectorial (MSV)</strong>. Se usan para clasificar y problemas que necesitan de regresión<sup class=footnote-reference><a href=#5>5</a></sup>. Dado una serie de ejemplos, una entrada será clasificada de una forma u otra.<li><p><em><strong>Clustering</strong></em>. Este tipo de análisis consiste en asignar observaciones a ciertas subcategorías (denominadas <em>clústeres</em>), para que aquellas que están en el mismo <em>clúster</em> sean similares**. Este tipo de aprendizaje es una técnica común en análisis estadístico.<li><p><strong>Redes bayesianas</strong>. Es un modelo probabilístico que organiza variables al azar según unas determinadas condiciones mediante un gráfico**. Un ejemplo de red bayesiana es el siguiente:</p> <p><img src=https://lonami.dev/golb/inteligencia-artificial/bayesian_network.svg alt="Red bayesiana"><li><p><strong>Algoritmos genéticos</strong>. Imita el proceso evolutivo de la selección natural, y usa métodos como la mutación para generar nuevos "genotipos" que, con algo de suerte, serán mejores en encontrar la solución correcta.</ul><h2 id=aplicaciones-practicas>Aplicaciones prácticas</h2><p>La inteligencia artificial ya se encuentra desde hace algún tiempo entre nosotros, como por ejemplo el archiconocido <strong>buscador Google</strong>, que filtra los resultados más relevantes mediante una inteligencia artificial. Otros ejemplos son el reconocimiento de caracteres a partir de una foto, o incluso reconocimiento del habla con <strong>asistentes virtuales como Cortana o Siri</strong>, en los videojuegos, en bolsa, en los <strong>hospitales</strong>, industria pesada, transportes, juguetes, música, aviación, robótica, filtros anti-spam... 
y un largo etcétera.<h2 id=implicaciones-eticas>Implicaciones éticas</h2><p>Una vez tengamos la tecnología necesaria para recrear un cerebro humano, si enseñáramos a esta inteligencia artificial al igual que un humano, ¿llegaría a tener <strong>sentimientos</strong>? ¿Sería consciente de su existencia? ¿Podría sentir felicidad, alegría, tristeza, enfado? ¿Tendría <strong>creatividad</strong>? ¿Derecho a propiedad? Si la respuesta es que sí, y es la respuesta más lógica, significa que, en realidad, los sentimientos no son nada más que una manera de entender las cosas. No tienen valor por sí mismos. Seríamos capaces de recrearlos, y tendrían el mismo valor que un sentimiento humano, aunque esa inteligencia viviera dentro de un ordenador. Y acabar con esta inteligencia sería acabar con esta vida, <strong>una vida</strong> casi, por no decir enteramente, <strong>humana</strong>. Además, todo esto implicaría que todo comportamiento humano es predecible. Por último, si esta inteligencia es realmente como un humano, al utilizarla, ¿la estaríamos esclavizando al obligarla a trabajar para nosotros? ¿En qué momento dejaremos de llamarlos "ordenadores" y comenzaremos a tratarles como "humanos"? 
¿Será la humanidad capaz de adaptarse al cambio?<h2 id=ejemplos>Ejemplos</h2><p>En el siguiente algorítmo genético podemos ver como una figura aprende a saltar, obedeciendo a las leyes físicas (ver en <a href=https://youtu.be/Gl3EjiVlz_4>YouTube</a>):</p><iframe width=420 height=315 src=https://www.youtube.com/embed/Gl3EjiVlz_4 frameborder=0 allowfullscreen></iframe><p>Por el contrario, en el siguiente ejemplo, un algorítmo genético aprende a "luchar" contra otra figura: (ver en <a href=https://youtu.be/u2t77mQmJiY>YouTube</a>):</p><iframe width=560 height=315 src=https://www.youtube.com/embed/u2t77mQmJiY frameborder=0 allowfullscreen></iframe><p>Estos cuatro increíbles ejemplos siguientes muestran un proceso evolutivo similar al sufrido por cualquier tipo de ser (ver en <a href=https://youtu.be/GOFws_hhZs8>YouTube</a>):</p><iframe width=560 height=315 src=https://www.youtube.com/embed/GOFws_hhZs8 frameborder=0 allowfullscreen></iframe><iframe width=560 height=315 src=https://www.youtube.com/embed/31dsH2Fs1IQ frameborder=0 allowfullscreen></iframe><iframe width=560 height=315 src=https://www.youtube.com/embed/IVcvvqxtNwE frameborder=0 allowfullscreen></iframe><iframe width=560 height=315 src=https://www.youtube.com/embed/KrTbJUJsDSw frameborder=0 allowfullscreen></iframe><h2 id=conceptos>Conceptos</h2><div class=footnote-definition id=1><sup class=footnote-definition-label>1</sup><p><strong>Emular</strong>. Tratar de imitar un modelo, aproximarse a este. Copiar su comportamiento o incluso mejorarlo.</div><div class=footnote-definition id=2><sup class=footnote-definition-label>2</sup><p><strong>Explícito</strong>. Suceso que ocurre de manera previamente avisada de una forma directa, anticipado <em>sin rodeos</em>.</div><div class=footnote-definition id=3><sup class=footnote-definition-label>3</sup><p><strong>Velocidad (en hercios)</strong>. Número de cálculos realizados por segundo. 
Un procesador con una velocidad de 100Hz es capaz de realizar 100 cálculos por segundo.</div><div class=footnote-definition id=4><sup class=footnote-definition-label>4</sup><p><strong>Base de datos</strong>. Lugar en el que se almacena de manera estructurada una información, como por ejemplo un censo que indique el nombre de las personas, sus apellidos, edad, etcétera.</div><div class=footnote-definition id=5><sup class=footnote-definition-label>5</sup><p><strong>Regresión</strong>. Las pruebas de regresión consisten en someter a un programa a una serie de pruebas para descubrir fallos en este cometidos accidentalmente con anterioridad en versiones anteriores.</div><h2 id=fuentes>Fuentes</h2><ul><li><a href=http://waitbutwhy.com/2015/01/artificial-intelligence-revolution-1>Evolución de la inteligencia artificial - Wait but why</a><li><a href=https://en.wikipedia.org/wiki/Machine_learning><em>Machine learning</em> - Wikipedia</a></ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Making a Difference | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb class=selected>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Making a Difference</h1><div class=time><p>2020-08-24</div><p>When I've thought about what "making a difference" means, I've always seen it as having to do something at very large scales. Something that changes everyone's lives. But I've realized that it doesn't need the case.<p>I'm thinking about certain people. I'm thinking about middle-school.<p>I'm thinking about my math teacher, who I remember saying that if he made a student fail with a grade very close to passing, then he would be a bad teacher because he could just "let them pass". But if he just passed that one student, he would fail as a teacher, because it's his job to get people to actually <em>learn</em> his subject. He didn't want to be mean, he was just trying to have everybody properly learn the subject. That made a difference on me, but I never got the chance to thank him.<p>I'm thinking about my English teacher, who has had to put up with a lot of stupidity from the rest of students, making the class not enjoyable. But I thought she was nice, and she thought I was nice. I remember of a day when she was grading my assignement and debating what grade I should get. I thought to myself, she should just grade whatever she considered fair. But she went something along the lines of "what the heck, you deserve it", and graded in my favour. I think of her as an honest person who also just wants to make other people learn, despite the other students not liking her much. 
I never got a chance to thank her.<p>I'm thinking about my philosophy teacher, who was a nice chap. He tried to make the lectures fun and had some really interesting ways of thinking. He was nice to talk to overall, but I never got to thank him for what he taught us.<p>I'm thinking about one of my lecturers at university who has let me express my feelings to her and helped me make the last push I needed to finish my university degree (I was really dreading some subjects and considering dropping out, but those days are finally over).<p>I'm thinking about all the people who has been in a long-distance relationship with me. None of the three I've had have worked out in the long-term so far, and I'm in a need of a break from those. But they were really invaluable to help me grow and learn a lot about how things actually work. I'm okay with the first two people now, maybe the third one can be my friend once more in the future as well. I'm sure I've told them how important they have been to me and my life.<p>I'm thinking about all the people who I've met online and have had overall a healthy relation, sharing interesting things between each other, playtime, thoughts, and other various lessons.<p>What I'm trying to get across is that you may be more impactful than you think you really are. And even if people don't say it, some are extremely thankful of your interactions with them. You can see this post as a sort of a "call for action" to be more thankful to the people that have affected you in important ways. 
If people take things for granted because they Just Work, the person who made those things should be proud of this achievement.<p>Thanks to all of them, to everyone who has shared good moments with me, and to all the people who enjoy the things I make.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Making a Difference | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb class=selected>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Making a Difference</h1><div class=time><p>2020-08-24</div><p>When I've thought about what "making a difference" means, I've always seen it as having to do something at very large scales. Something that changes everyone's lives. But I've realized that it doesn't need the case.<p>I'm thinking about certain people. I'm thinking about middle-school.<p>I'm thinking about my math teacher, who I remember saying that if he made a student fail with a grade very close to passing, then he would be a bad teacher because he could just "let them pass". But if he just passed that one student, he would fail as a teacher, because it's his job to get people to actually <em>learn</em> his subject. He didn't want to be mean, he was just trying to have everybody properly learn the subject. 
That made a difference on me, but I never got the chance to thank him.<p>I'm thinking about my English teacher, who has had to put up with a lot of stupidity from the rest of students, making the class not enjoyable. But I thought she was nice, and she thought I was nice. I remember of a day when she was grading my assignement and debating what grade I should get. I thought to myself, she should just grade whatever she considered fair. But she went something along the lines of "what the heck, you deserve it", and graded in my favour. I think of her as an honest person who also just wants to make other people learn, despite the other students not liking her much. I never got a chance to thank her.<p>I'm thinking about my philosophy teacher, who was a nice chap. He tried to make the lectures fun and had some really interesting ways of thinking. He was nice to talk to overall, but I never got to thank him for what he taught us.<p>I'm thinking about one of my lecturers at university who has let me express my feelings to her and helped me make the last push I needed to finish my university degree (I was really dreading some subjects and considering dropping out, but those days are finally over).<p>I'm thinking about all the people who has been in a long-distance relationship with me. None of the three I've had have worked out in the long-term so far, and I'm in a need of a break from those. But they were really invaluable to help me grow and learn a lot about how things actually work. I'm okay with the first two people now, maybe the third one can be my friend once more in the future as well. I'm sure I've told them how important they have been to me and my life.<p>I'm thinking about all the people who I've met online and have had overall a healthy relation, sharing interesting things between each other, playtime, thoughts, and other various lessons.<p>What I'm trying to get across is that you may be more impactful than you think you really are. 
And even if people don't say it, some are extremely thankful of your interactions with them. You can see this post as a sort of a "call for action" to be more thankful to the people that have affected you in important ways. If people take things for granted because they Just Work, the person who made those things should be proud of this achievement.<p>Thanks to all of them, to everyone who has shared good moments with me, and to all the people who enjoy the things I make.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Reflexión sobre la Inteligencia artificial | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb class=selected>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Reflexión sobre la Inteligencia artificial</h1><div class=time><p>2016-06-13</div><blockquote><p>Nota: esta reflexión ha sido sacada de una conversación en Telegram, aunque ha sido lo más adaptada posible a formato de blog.</blockquote><h2 id=conversacion-del-12-03-16>Conversación del 12.03.16</h2><p>Pienso que para conseguir una verdadera inteligencia artificial debemos abstraernos mucho. Es decir, siempre hay una pequeña parte de <em>Pero es que el ser humano, los sentimientos, tal, cual</em>... Igual simplemente, absolutamente todo esté programado. Cuando actúas de manera que no sabes por qué por ejemplo, seguramente sea una serie de estímulos adecuados que producen esa respuesta porque se ha formado ese camino de neuronas en tu mente. Por ejemplo, el arco reflejo, que es un arco innato.<ul><li><em>Sensación de quemar → retirar</em><li><em><a href=https://es.wikipedia.org/wiki/Condicionamiento_cl%C3%A1sico>Un estímulo X → una respuesta Y</a></em></ul><p>Es una ida y vuelta instantanea entre sensación y respuesta. El cerebro simplemente es capaz de trabajar con combinaciones más complejas, como por ejemplo la suma, este número con este otro → sale otro número, y se le añade a otro... 
Hay una especie de <a href=https://es.wikipedia.org/wiki/Recursi%C3%B3n>recursión</a> también, aunque en realidad es que es el mismo estímulo el que <a href=https://es.wikipedia.org/wiki/M%C3%A9todo_%28inform%C3%A1tica%29>"llama"</a> a un determinado camino.<p>Pero lo verdaderamente impresionante es la consciencia, igual no tenemos consciencia de verdad, igual es como lo sentimos. Siempre nos han hablado de la consciencia pero nadie ha sabido probarla, por lo que sólo tenemos una idea, un concepto. Es aún más impresionante es el hecho de recordar y <a href=https://es.wikipedia.org/wiki/Memoria_de_trabajo>trabajar</a> con la información.<p>Cuando hablamos de consciencia, estamos simplemente tratando información sobre esa misma información, ¿cómo cojones sentimos lo que pensamos? Yo sé que estoy pensando porque hemos definido <em>pensar</em> como este proceso. ¿Pero cómo coño entiendo yo eso? Es decir, ¿cómo me doy cuenta? por qué lo situo en mi cabeza? Probablemente, aunque el cerebro esté trabajando con todo eso, la sensación sea externa a nosotros, es decir, ocurre en mi cabeza. ¿Pero de verdad lo siento en mi cabeza?<p>Me estoy rayando.<p>El verdadero problema está en saber cómo sabemos que estamos pensando. El cerebro se compone de neuronas y conexiones, esa es la base a parte de donde se encuentra todo y tal la base es esa, y las sensaciones táctiles son igual de complejas, las procesa mi cerebro pero las siento en mi mano. ¿Será cosa de costumbre? Yo siento algo y lo situo ahí. Sin embargo sentimos ahí y no por encima o por debajo de, vamos a poner, los dedos de la mano, lo siento justo ahí. ¿Qué coño es realmente la <a href=https://es.wikipedia.org/wiki/Memoria_a_corto_plazo>memoria a corto plazo</a>? 
(porque lo de la <a href=https://es.wikipedia.org/wiki/Memoria_a_largo_plazo>memoria a largo plazo</a> se traslada ahí cuando la necesitamos para trabajar con ella, por eso es <em>MCP</em> o memoria de trabajo según ciertas teorías).<p>En fin.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Reflexión sobre la Inteligencia artificial | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb class=selected>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Reflexión sobre la Inteligencia artificial</h1><div class=time><p>2016-06-13</div><blockquote><p>Nota: esta reflexión ha sido sacada de una conversación en Telegram, aunque ha sido lo más adaptada posible a formato de blog.</blockquote><h2 id=conversacion-del-12-03-16>Conversación del 12.03.16</h2><p>Pienso que para conseguir una verdadera inteligencia artificial debemos abstraernos mucho. Es decir, siempre hay una pequeña parte de <em>Pero es que el ser humano, los sentimientos, tal, cual</em>... Igual simplemente, absolutamente todo esté programado. Cuando actúas de manera que no sabes por qué por ejemplo, seguramente sea una serie de estímulos adecuados que producen esa respuesta porque se ha formado ese camino de neuronas en tu mente.
Por ejemplo, el arco reflejo, que es un arco innato.<ul><li><em>Sensación de quemar → retirar</em><li><em><a href=https://es.wikipedia.org/wiki/Condicionamiento_cl%C3%A1sico>Un estímulo X → una respuesta Y</a></em></ul><p>Es una ida y vuelta instantanea entre sensación y respuesta. El cerebro simplemente es capaz de trabajar con combinaciones más complejas, como por ejemplo la suma, este número con este otro → sale otro número, y se le añade a otro... Hay una especie de <a href=https://es.wikipedia.org/wiki/Recursi%C3%B3n>recursión</a> también, aunque en realidad es que es el mismo estímulo el que <a href=https://es.wikipedia.org/wiki/M%C3%A9todo_%28inform%C3%A1tica%29>"llama"</a> a un determinado camino.<p>Pero lo verdaderamente impresionante es la consciencia, igual no tenemos consciencia de verdad, igual es como lo sentimos. Siempre nos han hablado de la consciencia pero nadie ha sabido probarla, por lo que sólo tenemos una idea, un concepto. Es aún más impresionante es el hecho de recordar y <a href=https://es.wikipedia.org/wiki/Memoria_de_trabajo>trabajar</a> con la información.<p>Cuando hablamos de consciencia, estamos simplemente tratando información sobre esa misma información, ¿cómo cojones sentimos lo que pensamos? Yo sé que estoy pensando porque hemos definido <em>pensar</em> como este proceso. ¿Pero cómo coño entiendo yo eso? Es decir, ¿cómo me doy cuenta? por qué lo situo en mi cabeza? Probablemente, aunque el cerebro esté trabajando con todo eso, la sensación sea externa a nosotros, es decir, ocurre en mi cabeza. ¿Pero de verdad lo siento en mi cabeza?<p>Me estoy rayando.<p>El verdadero problema está en saber cómo sabemos que estamos pensando. El cerebro se compone de neuronas y conexiones, esa es la base a parte de donde se encuentra todo y tal la base es esa, y las sensaciones táctiles son igual de complejas, las procesa mi cerebro pero las siento en mi mano. ¿Será cosa de costumbre? Yo siento algo y lo situo ahí. 
Sin embargo sentimos ahí y no por encima o por debajo de, vamos a poner, los dedos de la mano, lo siento justo ahí. ¿Qué coño es realmente la <a href=https://es.wikipedia.org/wiki/Memoria_a_corto_plazo>memoria a corto plazo</a>? (porque lo de la <a href=https://es.wikipedia.org/wiki/Memoria_a_largo_plazo>memoria a largo plazo</a> se traslada ahí cuando la necesitamos para trabajar con ella, por eso es <em>MCP</em> o memoria de trabajo según ciertas teorías).<p>En fin.</main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -1,1 +1,1 @@
-<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Sentences | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul><li><a href=/>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb class=selected>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1 class=title>Sentences</h1><div class=time><p>2018-01-31</div><blockquote><p>Don't know English? <a href=https://lonami.dev/golb/sentences/spanish.html>Read the Spanish version instead</a>.</blockquote><p>Just a few sentences that I've been gathering among the years and I think are worthy of being kept somewhere.<ul><li>So far, you've survived 100% of your worst days. You're doing great<li>Money is not a concern, perfection is<li>The only limit to our realization of tomorrow will be our doubts today<li>Not all cultures deserve respect.<li>It's not the same knowing that you're not free that not being free.<li>Being alive is not a biolgical matter, rather mental.<li>Every mountain starts off as a grain of sand.<li>Time goes against desire.<li>The only way to respect is fear.<li>For those who have nothing I have so much.<li>One isn't what they think, it's what they do.<li>When your goal seems impossible, don't change the goal. Find new ways to get to it.<li>Be the change you wish to see in the world.<li>Tell me something, I'll forget. Show me something, I'll remember. But make me part of it and I'll understand it.<li>If you want something to happen, go and do it. Or wait sitting like a stupid until it happens that it never happens.<li>I'd rather be happy than be right.<li>The engines don’t move the ship at all. 
The ship stays where it is and the engines move the universe around it.<li>If I never try getting there, I'll never get there.</ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
+<!DOCTYPE html><html lang=en><head><meta charset=utf-8><meta name=description content="Official Lonami's website"><meta name=viewport content="width=device-width, initial-scale=1.0, user-scalable=yes"><title> Sentences | Lonami's Blog </title><link rel=stylesheet href=/style.css><body><article><nav class=sections><ul class=left><li><a href=/>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb class=selected>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1 class=title>Sentences</h1><div class=time><p>2018-01-31</div><blockquote><p>Don't know English? <a href=https://lonami.dev/golb/sentences/spanish.html>Read the Spanish version instead</a>.</blockquote><p>Just a few sentences that I've been gathering among the years and I think are worthy of being kept somewhere.<ul><li>So far, you've survived 100% of your worst days. You're doing great<li>Money is not a concern, perfection is<li>The only limit to our realization of tomorrow will be our doubts today<li>Not all cultures deserve respect.<li>It's not the same knowing that you're not free that not being free.<li>Being alive is not a biolgical matter, rather mental.<li>Every mountain starts off as a grain of sand.<li>Time goes against desire.<li>The only way to respect is fear.<li>For those who have nothing I have so much.<li>One isn't what they think, it's what they do.<li>When your goal seems impossible, don't change the goal.
Find new ways to get to it.<li>Be the change you wish to see in the world.<li>Tell me something, I'll forget. Show me something, I'll remember. But make me part of it and I'll understand it.<li>If you want something to happen, go and do it. Or wait sitting like a stupid until it happens that it never happens.<li>I'd rather be happy than be right.<li>The engines don’t move the ship at all. The ship stays where it is and the engines move the universe around it.<li>If I never try getting there, I'll never get there.</ul></main><footer><div><p>Share your thoughts, or simply come hang with me <a href=https://t.me/LonamiWebs><img src=/img/telegram.svg alt=Telegram></a> <a href=mailto:totufals@hotmail.com><img src=/img/mail.svg alt=Mail></a></div></footer></article><p class=abyss>Glaze into the abyss… Oh hi there!
@@ -0,0 +1,8 @@
+<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24">
+  <rect fill="#e77717" width="24" height="24" />
+  <g fill="#fff">
+    <path d="M4,4 q16,0 16,16 h-3 q0,-13 -13,-13" />
+    <path d="M4,10 q10,0 10,10 h-3 q0,-7 -7,-7" />
+    <circle cx="6" cy="18" r="2" />
+  </g>
+</svg>
@@ -7,7 +7,7 @@
 .golb:hover {
     transform: scaleY(1);
 }
-</style><body><article><nav class=sections><ul><li><a href=/ class=selected>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb>golb</a><li><a href=/blog/atom.xml>rss</a></ul></nav><main><h1>Lonami's Site</h1><p>Welcome to my personal website! This page has had several redesigns over time, but now I've decided to let it be as minimalist as possible (proud to be under 32KB!).<h2 id=about>About me</h2><p>Spanish male <span id=age><noscript>born in 1998</noscript></span>. I have been programming <span id=programming><noscript>since 2012</noscript></span> and it is my passion.<p>I enjoy nature, taking pictures, playing video-games, drawing vector graphics, or just chatting online.<p>I can speak perfect Spanish, read and write perfect English and Python, and have programmed in C#, Java, JavaScript, Rust, some C and C++, and designed pages like this with plain HTML and CSS.<p>On the Internet, I'm often known as <i>Lonami</i>, although my real name is simply my nick name, put the other way round.<h2 id=projects>Project highlights</h2><ul><li><a href=https://github.com/LonamiWebs/Telethon/>Telethon</a>: Python implementation of the Telegram's API.<li><a href=klooni>1010! Klooni</a>: libGDX simple puzzle game based on the original <i>1010!</i>.<li><a href=https://github.com/LonamiWebs/Stringlate/>Stringlate</a>: Android application that makes it easy to translate other FOSS apps.</ul><p>These are only my <i>Top 3</i> projects, the ones I consider to be the most successful.
If you're curious about what else I've done, feel free to check out my <a href=https://github.com/LonamiWebs/>GitHub</a>.<h2 id=more-links>More links</h2><dl><dt><a href=https://t.me/LonamiWebs><img src=img/telegram.svg> My Telegram</a><dd>Come meet me at my group in Telegram and talk about anything!<dt><a href=/blog><img src=img/blog.svg alt=blog> My blog</a><dd>Sometimes I blog about things, whether it's games, techy stuff, or random life stuff.<dt><a href=/golb><img src=img/blog.svg class=golb alt=golb> My golb</a><dd>What? You don't know what a golb is? It's like a blog, but less conventional.<dt><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github> My GitHub</a><dd>By far what I'm most proud of. I love releasing my projects as open source. There is no reason not to!<dt><a href=/utils><img src=img/utils.svg alt=utilities> Several Utilities</a><dd>Random things I've put online because I keep forgetting about them.<dt><a href=/stopwatch.html><img src=stopwatch.svg width=24 height=24 alt=stopwatch> stopwatch</a><dd>An extremely simple JavaScript-based stopwatch.<dt><a href=donate><img src=img/bitcoin.svg alt=donate> Donate</a><dd>Some people like what I do and want to compensate me for it, but I'm fine with compliments if you can't afford a donation!<dt><a href=humans.txt><img src=img/humans.svg alt=humans.txt> humans.txt</a><dd><a href=http://humanstxt.org/>We are humans, not robots.</a></dl><h2 id=contact>Contact</h2><p>If you use Telegram you can join <a href=https://t.me/LonamiWebs>@LonamiWebs</a> and just chat about any topics you like politely.<p>If you prefer, you can also send me a private email to <a href=mailto:totufals@hotmail.com>totufals[at]hotmail[dot]com</a> and I will try to reply as soon as I can. 
Please don't use the email if you need help with a specific project, this is better discussed in the group where everyone can benefit from it.</p><script>
+</style><body><article><nav class=sections><ul class=left><li><a href=/ class=selected>lonami's site</a><li><a href=/blog>blog</a><li><a href=/golb>golb</a></ul><div class=right><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github></a><a href=/blog/atom.xml><img src=/img/rss.svg alt=rss></a></div></nav><main><h1>Lonami's Site</h1><p>Welcome to my personal website! This page has had several redesigns over time, but now I've decided to let it be as minimalist as possible (proud to be under 32KB!).<h2 id=about>About me</h2><p>Spanish male <span id=age><noscript>born in 1998</noscript></span>. I have been programming <span id=programming><noscript>since 2012</noscript></span> and it is my passion.<p>I enjoy nature, taking pictures, playing video-games, drawing vector graphics, or just chatting online.<p>I can speak perfect Spanish, read and write perfect English and Python, and have programmed in C#, Java, JavaScript, Rust, some C and C++, and designed pages like this with plain HTML and CSS.<p>On the Internet, I'm often known as <i>Lonami</i>, although my real name is simply my nick name, put the other way round.<h2 id=projects>Project highlights</h2><ul><li><a href=https://github.com/LonamiWebs/Telethon/>Telethon</a>: Python implementation of the Telegram's API.<li><a href=klooni>1010! Klooni</a>: libGDX simple puzzle game based on the original <i>1010!</i>.<li><a href=https://github.com/LonamiWebs/Stringlate/>Stringlate</a>: Android application that makes it easy to translate other FOSS apps.</ul><p>These are only my <i>Top 3</i> projects, the ones I consider to be the most successful.
If you're curious about what else I've done, feel free to check out my <a href=https://github.com/LonamiWebs/>GitHub</a>.<h2 id=more-links>More links</h2><dl><dt><a href=https://t.me/LonamiWebs><img src=img/telegram.svg> My Telegram</a><dd>Come meet me at my group in Telegram and talk about anything!<dt><a href=/blog><img src=img/blog.svg alt=blog> My blog</a><dd>Sometimes I blog about things, whether it's games, techy stuff, or random life stuff.<dt><a href=/golb><img src=img/blog.svg class=golb alt=golb> My golb</a><dd>What? You don't know what a golb is? It's like a blog, but less conventional.<dt><a href=https://github.com/LonamiWebs><img src=img/github.svg alt=github> My GitHub</a><dd>By far what I'm most proud of. I love releasing my projects as open source. There is no reason not to!<dt><a href=/utils><img src=img/utils.svg alt=utilities> Several Utilities</a><dd>Random things I've put online because I keep forgetting about them.<dt><a href=/stopwatch.html><img src=stopwatch.svg width=24 height=24 alt=stopwatch> stopwatch</a><dd>An extremely simple JavaScript-based stopwatch.<dt><a href=donate><img src=img/bitcoin.svg alt=donate> Donate</a><dd>Some people like what I do and want to compensate me for it, but I'm fine with compliments if you can't afford a donation!<dt><a href=humans.txt><img src=img/humans.svg alt=humans.txt> humans.txt</a><dd><a href=http://humanstxt.org/>We are humans, not robots.</a></dl><h2 id=contact>Contact</h2><p>If you use Telegram you can join <a href=https://t.me/LonamiWebs>@LonamiWebs</a> and just chat about any topics you like politely.<p>If you prefer, you can also send me a private email to <a href=mailto:totufals@hotmail.com>totufals[at]hotmail[dot]com</a> and I will try to reply as soon as I can. 
Please don't use the email if you need help with a specific project, this is better discussed in the group where everyone can benefit from it.</p><script>
 now = (new Date()).getFullYear();
 document.getElementById("age").innerHTML = "aged " + (now - 1999);
 document.getElementById("programming").innerHTML = "for " + (now - 2012) + " years";
@@ -23,7 +23,8 @@
 /* navigation */
 nav.sections {
     padding-top: 20px;
-    display: block;
+    display: flex;
+    justify-content: space-between;
 }
 
 nav.sections ul {
@@ -40,15 +41,27 @@
     margin: 0 -2px;
     font-size: 1.3em;
 }
-nav.sections a {
+nav.sections .left a {
     color: #787878;
     border-bottom: solid 2px #A8A8A8;
     padding: 0 16px;
 }
 
-nav.sections a.selected, nav.sections a:hover {
+nav.sections .left a.selected, nav.sections .left a:hover {
     color: #000000;
     border-bottom: solid 2px #444444;
+}
+
+nav.sections img {
+    margin: 0 4px;
+}
+
+.left {
+    justify-self: flex-start;
+}
+
+.right {
+    justify-self: flex-end;
 }
 
 /* footer */