Wednesday, October 8, 2014

Binary Search Trees - Search

Note - The code examples used in this post are located at my GitHub account - Binary Search Trees.

Imagine you were hired to develop an algorithm for finding a name in a phonebook. 

The simplest way to achieve this would be to start from the beginning of the phonebook and systematically flip through every page until we have found the name. Whilst this will certainly provide the right answer every time, is it the most efficient? 

Think what would happen if we had to find someone with the surname of “Jerry”, we’d start from A and have to flip through the phonebook to J to find it. If our phonebook contained millions of names, this could take quite a while. 

The structure of the data in a phonebook is special because it has the benefit of being ordered alphabetically. We can take advantage of this to make our search run faster. 

Now imagine that we begin our search for “Jerry” in the middle of the phonebook at the letter M. We know that “Jerry” must be in the left half of our phonebook since J comes before M, so we find the midpoint in the left half of the phonebook (A and M) and start the search process again. Each time we don’t find the name we continue the search on the left or right side, depending on where we are in the phonebook and the name that we want to find. If the name we want comes before the letter we are currently searching, then we search the left side and vice versa. 

With these steps we will reach two eventual outcomes -
  1. We will find the name in the phonebook.
  2. We will run out of phonebook space to search.
Because we discard half of the remaining phonebook each time we split left or right, if the name isn’t in the phonebook we will keep on eliminating parts of the phonebook to search until there is nothing. When we get to this point, it means that the name isn’t in the phonebook.

Starting in the middle, doing a comparison and continuing our search in either the left or right subspace if we do not find the item, is what we call a binary search

Sunday, September 28, 2014

Binary Search Trees

Note - The code examples used in this post are located at my GitHub account - Binary Search Trees.

A binary search tree or BST is a node based data structure that contains the following properties in each node -
  • A key or value - The item contained within the node that can be used in comparisons.
  • A pointer to the left subtree - A reference to the children in its left subtree.
  • A pointer to the right subtree - A reference to the children in its right subtree.



Figure 2 - A Binary Search Tree Node.

Each node in the tree comprises of the following properties - 
  1. All nodes in its left subtree contain key/values that are smaller than the current key/value that is stored in the node. 
  2. All nodes in its right subtree contain key/values that are larger than the current key/value that is stored in the node.
  3. If the current node is a right child of its parent, the key/value in the current node will be larger than that of its parent.
  4. If the current node is a left child of its parent, the key/value in the current node will be smaller than that of its parent. 
A tree comprised of a single node is still a binary search tree. A larger tree can be formed by joining a bunch of individual nodes together so long as they still abide by the properties that are defined above. For example, the illustration below shows a BST built with 7 individual nodes (numbered 1 through to 7). It is a BST because it still maintains the left/right child and parent relationship properties.



Figure 3 - A Binary Search Tree composed of many individual nodes.

It is called a binary search tree because we can determine if a node exists in the tree by using the steps from a binary search. We will go into more detail about how this works in a followup post.

A BST is seen as a recursive data structure because it uses the same node structure repeatedly to build the tree in addition to all nodes containing the same properties. We can leverage these relationships to allow us to perform some pretty nifty operations on these trees with a small amount of code. 

Stacks and Queues - An Introduction

Previously we've discussed about two types of containers for storage of data, arrays and linked structures. To expand a little further, we'll introduce two abstract data types that can be built from linked structures or arrays, stacks and queues.

An abstract data type (ADT) provides a definition rather than the concrete implementation. It defines what the data structure or data type can and cannot do (operations it can perform, and constraints of the structure). ADT's are generally implemented using existing data structures or data types. 

Stacks and queues are examples of ADT's. Conceptually each provide the description of the operations it can perform, and the limitations as to how they are supposed to operate. 

Stacks - 

The stack data structure functions as literally as the name suggests. Imagine at a dockyard, a crane stacking a number of shipping containers one on top of another. 


Figure 1 - Stacks of shipping containers 

Figure 1 shows a bunch of shipping containers all stacked on top of another. If we treat each column of containers as a single stack, we begin to notice that we're unable to remove the containers towards the bottom without removing the top most containers first. This is exactly how the ADT stack works.

A stack functions as a "Last in First Out" (LIFO) data structure, meaning the last item to be placed onto the stack is the first item taken out. Returning to the shipping container example, it means that we cannot remove the bottom most container until we've removed all the others above it. 

The process of adding and removing items from a stack is called push and pop.
  • Push (x, s) - Is the equivalent of adding an item onto the stack. This method definition means "to push item x onto the stack s".

  • Pop (s) - Is the equivalent of removing the top most item from the stack. This method definition means "to pop the top most item from stack s".

Push and pop are the two most basic operations that are performed on any stack. There is a third operation top (also known as peek), that is sometimes defined. 
  • Top/Peek (s) - Is the equivalent of asking "which item is at the top of the stack". It returns the requested item (if any) without removing it from the stack.


Queues - 

The queue data structure is just as literal as the definition for a stack. Imagine a single filed line of people at a bank like the illustration below.



Figure 2 - A queue of people in a line

Figure 2 illustrates exactly what a queue data structure represents. The person at the front of the line is the first one out of the line, all new people must follow the rules and line up at the end, no queue jumping or asking for favours. 

A queue functions as a "First in First Out" (FIFO) data structure. The two main operations to add and remove items from a queue are called enqueue and dequeue.

  • Enqueue (x, q) - Is the equivalent of "insert item x at the end of the queue q" or telling someone to line up at the end of the line. 

  • Dequeue(q) - Is the equivalent of "remove and return the first item from the queue q", or having the bank teller say "next person please!".

These are the two most basic operations on a queue, however like a stack, there is also a third operation that can also be implemented, front (or peek).
  • Front/Peek(q) - Is the equivalent of looking to see who's at the front of the queue, without actually removing them from the line. 

Summary - 

Stacks and queues can be implemented with both arrays and linked structures. To determine which structure is best to use is dependant on the situation.

If the we know the size of the stack or queue in advance, then we can use an array, otherwise it would be better to use a linked list since it is more dynamic when adding additional items. 

Stacks are useful to use when retrieval order of items does not really matter like in batch jobs. It is also useful for reversing a particular order of items since we can take advantage of the "Last in First Out" property of a stack.

Queues are useful when we want to control the waiting times for particular items. A queue allows us to minimise the maximum waiting time.  It is also useful if ordering of items is important.

Thursday, August 7, 2014

The Lightbulb Moment

So here we are, the decision to pursue a startup.

I've been wanting to do this for a while now, always with the same lingering questions running through my mind - 

  • "Do I know enough to even do it?"
  • "What if someone has already taken my idea?"
  • "Will this idea even work?"
  • "Have I done enough analysis to test the feasibility of this idea?"
  • "Will this make money?"
  • "Where do I begin?"

For me, that last point has always been the biggest stumbling block, how does one even begin to build something like a startup?

Motivations -

Progressing along in your professional career, you slowly start to learn more about yourself and the driving forces that get you out of bed every day. 

Being in the technology sector, we're constantly involved in discussions about the most successful startups, Facebook, Twitter, Instagram, WhatsApp, just to name a few. It's ingrained in our minds in such a way that makes us believe that we could be sitting on the next billion dollar idea. 

I'd have to admit that the idea of being the "next big thing" is certainly one of the motivating reasons to go on this journey, but surprisingly it isn't my primary reason to do so. 

I remember watching a YC Startup School presentation from Chase Adam of Watsi (video here) that I found to be really inspiring. Watsi by the way is a non-profit that utilises crowd sourcing to fund medical treatments for those in need by using a 100% donation model. 

The inspiring part of it is that Watsi leverages the power of technology to change the game in terms of how fundraising is achieved. It's not about the fame or fortune, its about making a positive difference in the world. 

Watching the video made me want to pursue a similar goal, to do something you truly enjoy and make a difference in peoples lives.

What Now? -

I started bouncing some ideas around friends, colleagues and even my university professors and I received some really constructive feedback. Being from Melbourne, the tech scene is still not as big as Silicon Valley but in saying so it still presents an excellent opportunity to build something local, testing it then expanding it out. 

Even with the feedback, I was still stuck on what to do next. Having an idea is one thing but turning it into a product or service is a completely different thing altogether. I got a little bogged down in the details of what to do and from that I became a little deflated, it almost seemed as though I wasn't progressing with this, still stuck on the start line.

There's probably a conventional way of trying to build a startup, having an idea, a team, a business plan etc, but at some point it is a matter of just going ahead and doing it. I do can all the planning the world but it isn't going to get me to where I want to be if I don't start building it. 

What's Next? -

This is most probably the wrong way to go about it, but at the moment the aim is to build a prototype over the next two months and go from there. Currently its a one man show but I do hope that this can change over time. 

This will definitely be hard but whats life without the challenges. I'm sure there will be more failures than successes but you can neither succeed nor fail at what you do if you don't give it a shot. 

I am open to any pieces of advice, everything from here on out will be a learning process.

So here we go...

Thursday, June 26, 2014

Deleting from a Doubly Linked List

Note - The code examples used in this post are located at my GitHub account - Doubly Linked Lists

In the past couple of posts we've covered the process of searching and inserting items into a doubly linked list. This post will cover the last of the three basic operations on doubly linked lists, deletion.

Since there is no way to directly access items in a doubly linked list like we can with an array, deleting an item involves the following steps. 
  1. Search through the list to find the node we want to remove (if it exists). 
  2. Connect the predecessor of the found node to the next of found node. 
Step 2 is important as if we do not complete this step then we would have two separate lists since we would have removed the connecting node.

Inserting into a Doubly Linked List

Note - The code examples used in this post are located at my GitHub account - Doubly Linked Lists.

Previously we covered the algorithm for recursively searching through a doubly linked list. It provided a detailed insight into how to traverse through a linked structure and to find and return items within it. 

Searching is an important functionality of a doubly linked list, however without any content to search for, it doesn't really do much. In this post, we'll cover the insertion algorithm so that we can start putting items into our doubly linked list and search for them. 



Thursday, June 12, 2014

Doubly Linked Lists (Searching)

Note - The code examples used in this post are located at my GitHub account - Doubly Linked Lists

A doubly linked list is another example of a linked structure. As with single linked lists, it also comprises of a group of nodes connected together in a chain to form the list. 

The nodes in a doubly linked list contain the following properties -
  1. The data field(s) - each node can contain any number of data fields.
  2. Pointer to next - each node contains a pointer to the successor node in the list.
  3. Pointer to previous - each node contains a pointer to its predecessor node in the list. 
The doubly linked list gets its name from the number of pointers that each node has. With the two pointers, each node knows about both its successor and predecessor.




Figure 1 - Doubly Linked List Structure.