Monday, March 30, 2009

Prim-Jarnik's algorithm for extracting Minimum Spanning Tree (MST)

This algorithm comes under the greedy method, which means that the objects are chosen to join a growing collection by iteratively picking an object that minimizes some cost function.

Bearing the above greedy method in mind, the Prim-Jarnik algorithm is similar to Dijkstra's shortest-path algorithm: it grows the Minimum Spanning Tree (MST) as a single cluster, starting from an arbitrarily chosen root vertex.

The Prim-Jarnik algorithm can be broken down as follows:
Step 1: Define the starting "cloud" of vertices [C] by initially selecting some vertex [v] at random
Step 2: Iteratively choose a min-weight edge [e=(v,u)] connecting a vertex [v] inside the cloud [C] to a vertex [u] outside of [C]. After choosing [e], the vertex [u] is brought into the cloud [C]. This repeats until a spanning tree emerges.


Pseudo-code of the Prim-Jarnik algorithm
// Prim-Jarnik(G) [Time complexity = O(m log n)]
// Input: A weighted connected graph [G] with [n] vertices & [m] edges
// Output: A minimum spanning tree [T] for [G]

// Pick a root vertex [v] of [G]
D[v] <- 0

// For every other vertex, initialise the cost function.
// The cost function "D[u] = +infinity" while vertex [u] is not yet connected to the cloud.
for each vertex u != v do
D[u] <- +infinity

// Initialise the MST
T <- null

// Initialise a priority queue Q with an entry ((u,null),D[u]) for each vertex [u],
// where (u,null) is the element & D[u] is the key
while Q is not empty do
(u,e) <- Q.removeMin()
Add vertex u and edge e to T

// Perform the relaxation procedure on each edge (u,z), for every vertex [z] adjacent to [u] that is still in Q
for each vertex z adjacent to u such that z is in Q do
if W((u,z)) < D[z] then
D[z] <- W((u,z))
Change to (z,(u,z)) the element of vertex z in Q
Change to D[z] the key of vertex z in Q
return the tree T
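
To make the pseudo-code concrete, here is a minimal Python sketch of the same idea. It is only a sketch under the following assumptions: the graph is a dictionary mapping each vertex to its weighted neighbours, and a "lazy" priority queue (Python's heapq) with stale-entry skipping is used in place of the key-change operations above.

import heapq

def prim_jarnik(graph, root):
    # graph: dict mapping vertex -> {neighbour: edge weight}
    # Returns the MST as a list of (u, v, weight) edges.
    mst = []
    cloud = {root}                                      # the growing cloud C
    heap = [(w, root, v) for v, w in graph[root].items()]
    heapq.heapify(heap)                                 # key = edge weight
    while heap and len(cloud) < len(graph):
        w, u, v = heapq.heappop(heap)                   # min-weight bridge edge
        if v in cloud:
            continue                                    # stale entry, skip it
        cloud.add(v)                                    # bring v into the cloud
        mst.append((u, v, w))
        for z, wz in graph[v].items():
            if z not in cloud:
                heapq.heappush(heap, (wz, v, z))
    return mst

# Example (hypothetical 4-vertex graph):
# g = {'a': {'b': 1, 'c': 4}, 'b': {'a': 1, 'c': 2, 'd': 5},
#      'c': {'a': 4, 'b': 2, 'd': 3}, 'd': {'b': 5, 'c': 3}}
# prim_jarnik(g, 'a')  ->  [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)]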








Kruskal's algorithm for extracting Minimum Spanning Tree (MST)

This algorithm comes under the greedy method, which means that the objects are chosen to join a growing collection by iteratively picking an object that minimizes some cost function.

Bearing the above greedy method in mind, Kruskal's algorithm considers the edges of the graph in increasing order of weight and grows the Minimum Spanning Tree (MST) as a collection of clusters that gradually merge.

Kruskal's algorithm can be broken down as follows:
Step 1: In a given undirected graph [G], consider each vertex to be a cluster all by itself, i.e. if [G] has four vertices then there are four clusters initially
Step 2: Examine the edges in turn, ordered by increasing weight, i.e. arrange all the edges in ascending order
Step 3: Start with the smallest-weight edge. If this edge connects two different clusters, it is taken as a min-weight "bridge" edge [e] and added to the set of edges of the MST; the two clusters it connects are then merged into a single cluster. However, if a given [e] connects two vertices that are already in the same cluster, then this [e] is discarded.
Step 4: Once the algorithm has added enough edges [e] to form a spanning tree, it terminates and outputs this tree as the MST.


Pseudo-code of Kruskal's algorithm
// Kruskal(G) [Time complexity = O(m log n)]
// Input: A simple connected weighted graph [G] with [n] vertices and [m] edges
// Output: A minimum spanning tree [T] for [G]
for each vertex [v] in [G] do
Define an elementary cluster C(v) <- {v}

// Initialise a priority queue Q to contain all edges in [G], using the weights as keys
T <- null
// The above [T] will ultimately contain the edges of the MST

while [T] has fewer than n-1 edges do
(u,v) <- Q.removeMin()
// Let C(v) be the cluster containing v, and let C(u) be the cluster containing u.
if C(v) != C(u) then
Add edge (u,v) to T
Merge C(v) & C(u)
// The above union of C(v) & C(u) yields a single cluster
return tree T
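
A minimal Python sketch of the same idea, under these assumptions: edges are given as (weight, u, v) tuples, and the clusters are tracked with a small union-find structure instead of explicit cluster sets.

def kruskal(vertices, edges):
    # vertices: iterable of vertex names; edges: list of (weight, u, v) tuples.
    parent = {v: v for v in vertices}       # each vertex starts in its own cluster

    def find(v):
        # Return the cluster representative of v (with path compression).
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    mst = []
    for w, u, v in sorted(edges):           # consider edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                        # a "bridge" edge between two clusters
            mst.append((w, u, v))
            parent[ru] = rv                 # merge the two clusters
        # otherwise both endpoints are already in the same cluster: discard
        if len(mst) == len(parent) - 1:     # n-1 edges -> spanning tree complete
            break
    return mst

# Example (hypothetical graph):
# vs = ['a', 'b', 'c', 'd']
# es = [(1, 'a', 'b'), (4, 'a', 'c'), (2, 'b', 'c'), (5, 'b', 'd'), (3, 'c', 'd')]
# kruskal(vs, es)  ->  [(1, 'a', 'b'), (2, 'b', 'c'), (3, 'c', 'd')]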

The example below shows how Kruskal's algorithm extracts the MST from a weighted graph. The graph depicts a subset of the Finnish intercity roadway routes:

--------------------------------------------
Stage 1.
--------------------------------------------
Main cluster (C) contains the following vertices: null
Elementary Cluster (C1) contains the following vertices: null

Priority Queue (Q): 110, 137, 141, 164, 177, 294, 381, 409, 534, 549

Set of min-weight "bridge" edges e = {null}
Set of discarded edges d = {null}

--------------------------------------------
Stage 2.
--------------------------------------------
Main cluster (C) contains the following vertices: null
Elementary Cluster (C1) contains the following vertices: Pori+Tampere

Priority Queue (Q): 137, 141, 164, 177, 294, 381, 409, 534, 549

Set of min-weight "bridge" edges e = {110}
Set of discarded edges d = {null}

--------------------------------------------
Stage 3.
--------------------------------------------
Main cluster (C) contains the following vertices: null
Elementary Cluster (C1) contains the following vertices: Pori+Tampere
Elementary Cluster (C2) contains the following vertices: Helsinki+Kouvola

Priority Queue (Q): 141, 164, 177, 294, 381, 409, 534, 549

Set of min-weight "bridge" edges e = {110, 137}
Set of discarded edges d = {null}

--------------------------------------------
Stage 4.
--------------------------------------------
Main cluster (C) contains the following vertices: null
Elementary Cluster (C1) contains the following vertices: Pori+Tampere+Turku
Elementary Cluster (C2) contains the following vertices: Helsinki+Kouvola

Priority Queue (Q): 164, 177, 294, 381, 409, 534, 549

Set of min-weight "bridge" edges e = {110, 137, 141}
Set of discarded edges d = {null}

--------------------------------------------
Stage 5.
--------------------------------------------
Main cluster (C) contains the following vertices: null
Elementary Cluster (C1) contains the following vertices: Pori+Tampere+Turku+Helsinki+Kouvola

Priority Queue (Q): 177, 294, 381, 409, 534, 549

Set of min-weight "bridge" edges e = {110, 137, 141, 164}
Set of discarded edges d = {null}

--------------------------------------------
Stage 6.
--------------------------------------------
Main cluster (C) contains the following vertices: null
Elementary Cluster (C1) contains the following vertices: Pori+Tampere+Turku+Helsinki+Kouvola

Priority Queue (Q): 294, 381, 409, 534, 549

Set of min-weight "bridge" edges e = {110, 137, 141, 164}
Set of discarded edges d = {177}

--------------------------------------------
Stage 7.
--------------------------------------------
Main cluster (C) contains the following vertices: null
Elementary Cluster (C1) contains the following vertices: Pori+Tampere+Turku+Helsinki+Kouvola+Kuopio

Priority Queue (Q): 381, 409, 534, 549

Set of min-weight "bridge" edges e = {110, 137, 141, 164, 294}
Set of discarded edges d = {177}

--------------------------------------------
Stage 8.
--------------------------------------------
Main cluster (C) contains the following vertices: null
Elementary Cluster (C1) contains the following vertices: Pori+Tampere+Turku+Helsinki+Kouvola+Kuopio

Priority Queue (Q): 409, 534, 549

Set of min-weight "bridge" edges e = {110, 137, 141, 164, 294}
Set of discarded edges d = {177, 381}

--------------------------------------------
Stage 9.
--------------------------------------------
Main cluster (C) contains the following vertices: null
Elementary Cluster (C1) contains the following vertices: Pori+Tampere+Turku+Helsinki+Kouvola+Kuopio

Priority Queue (Q): 534, 549

Set of min-weight "bridge" edges e = {110, 137, 141, 164, 294}
Set of discarded edges d = {177, 381, 409}

--------------------------------------------
Stage 10.
--------------------------------------------
Main cluster (C) contains the following vertices: null
Elementary Cluster (C1) contains the following vertices: Pori+Tampere+Turku+Helsinki+Kouvola+Kuopio+Oulu

Priority Queue (Q): 549

Set of min-weight "bridge" edges e = {110, 137, 141, 164, 294, 534}
Set of discarded edges d = {177, 381, 409}

--------------------------------------------
Stage 11.
--------------------------------------------
Main cluster (C) contains the following vertices = C1
Elementary Cluster (C1) contains the following vertices: Pori+Tampere+Turku+Helsinki+Kouvola+Kuopio+Oulu+Ivalo

Priority Queue (Q): null

Set of min-weight "bridge" edges e = {110, 137, 141, 164, 294, 534, 549}
Set of discarded edges d = {177, 381, 409}







Spanning Tree & Minimum Spanning Tree (MST)

Spanning Tree - A tree that contains every vertex of a connected graph G is referred to as a spanning tree. The figure below depicts the Finnish intercity roadway routes as a spanning tree.



Minimum Spanning Tree (MST) - The problem of computing a spanning tree with the smallest total weight is known as the Minimum Spanning Tree problem. The figure below depicts the MST that houses a collection of the smallest-weight routes between the Finnish cities. Routes not taken into the MST are: 177, 381, 409.

Friday, March 27, 2009

Application of a Binary Tree

Binary trees are applicable in the fields mentioned below:
- Used in the compilers of high-level programming languages for intermediate representation
- Used as a searching technique
- Used in databases for storing data
- To solve arithmetic expressions

Exercise-Binary Tree Construction

Given data items: 17, 6, 3, 9, 12, 15, 1, 18, 22
The resulting binary tree is:
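
Since the figure of the resulting tree is not reproduced here, the Python sketch below builds it by inserting the items in the given order, assuming standard (unbalanced) binary search tree insertion.

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    # Standard BST insertion: smaller keys go left, larger (or equal) keys go right.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def inorder(root):
    return inorder(root.left) + [root.key] + inorder(root.right) if root else []

root = None
for item in [17, 6, 3, 9, 12, 15, 1, 18, 22]:
    root = insert(root, item)

print(inorder(root))   # [1, 3, 6, 9, 12, 15, 17, 18, 22]
# Resulting shape: 17 is the root; 6 is its left child (with 3-1 below on the left
# and 9-12-15 chained to the right); 18 is its right child, with 22 to its right.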

What is a Binary Tree?

A binary tree embodies a finite set of data items that is either empty or partitioned into three disjoint subsets. The first subset contains a single data item referred to as the root of the binary tree; the other two subsets are the left and right subtrees. The data items are referred to as the nodes of the binary tree.

From this figure we can observe the following:
A - Root node of the binary tree
A - Parent node of the nodes B & C
B & C - Children nodes of A and these nodes are in a level one more than the root node A
C - Right child node of A
B - Left child node of A
D & E - These node are without a child, so they are referred to as leafs or external nodes
B & C - These nodes have a child each, so they are referred to as internal nodes

Depth of this binary tree - The length of the longest path from the root node A to any leaf node, for example node D. Therefore, the depth of this binary tree is two.
Indegree of a node - The number of edges merging into a node. For example, the indegree of node B is one, i.e. one edge merges into it.
Outdegree of a node - The number of edges coming out of a node. For example, the outdegree of node A is two, i.e. two edges come out of this root node.

Concerning the binary trees, do refer to the following blog entries:
- Exercise-Binary Tree Construction
- Application of a Binary Tree

What is a Complete Binary Tree?

A complete binary tree of depth 'd' is a strictly binary tree in which all the leaves are at level 'd', i.e. no node has an empty left or right subtree except the leaves, and all the leaves sit at the same level.
In a complete binary tree all the internal nodes have equal degree, which means there is one node at the root level, two nodes at level 2, four nodes at level 3, eight nodes at level 4, etc., as the figure below represents:

Thursday, March 26, 2009

Hashing, Hash Data Structure and Hash Table

Hashing is the process of mapping a large collection of data items to a smaller table with the help of a hashing function. The essence of hashing is to provide a faster searching method than linear or binary search. The advantage of this searching method is its efficiency in handling vast numbers of data items in a given collection (i.e. large collection sizes).

The result of this hashing process is a hash data structure that can store or retrieve data items in constant average time, regardless of the collection size.

A Hash Table is the result of storing the hash data structure in a smaller table which incorporates the hash function within itself. The hash function is primarily responsible for mapping between the original data item and the smaller table. The mapping works as follows: when a given data item (of any data type) is provided for storage, the hash function produces an output integer in a consistent range, and this integer determines the location of the data item in the smaller table. In terms of implementation, the hash table is constructed with the help of an array, and the indices of this array correspond to the output integer range.

Hash Table Example:
Here, we construct a hash table for storing and retrieving data related to the citizens of a county, and the social-security numbers of the citizens are used as the keys that are mapped to the indices of the array implementation. Let's assume that the table size is 12; therefore the hash function is the value modulo 12.

Hence, the Hash Function would equate to:
(sum of numeric values of the characters in the data item) %12
Note! % is the modulus operator

Let us consider the following social-security numbers and produce a hashcode (letters are given the numeric values A=10, B=11, C=12, and so on):
120388113D => 1+2+0+3+8+8+1+1+3+13=40
Hence, (40)%12 => Hashcode=4

310181312E => 3+1+0+1+8+1+3+1+2+14=34
Hence, (34)%12 => Hashcode=10

041176438A => 0+4+1+1+7+6+4+3+8+10=44
Hence, (44)%12 => Hashcode=8

Therefore, the Hashtable content would be as follows:
-----------------------------------------------------
0:empty
1:empty
2:empty
3:empty
4:occupied Name:Drew Smith SSN:120388113D
5:empty
6:empty
7:empty
8:occupied Name:Andy Conn SSN:041176438A
9:empty
10:occupied Name:Igor Barton SSN:310181312E
11:empty
-----------------------------------------------------
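
A small Python sketch of the scheme above. The character-to-number mapping (digits keep their value; letters are assumed to map to A=10, B=11, C=12, and so on) is inferred from the worked hashcodes, and collisions are simply ignored to keep the sketch short.

TABLE_SIZE = 12

def char_value(c):
    # Digits keep their numeric value; letters are assumed to map A=10, B=11, ...
    return int(c) if c.isdigit() else ord(c.upper()) - ord('A') + 10

def hashcode(ssn):
    return sum(char_value(c) for c in ssn) % TABLE_SIZE

table = [None] * TABLE_SIZE
for name, ssn in [("Drew Smith", "120388113D"),
                  ("Andy Conn", "041176438A"),
                  ("Igor Barton", "310181312E")]:
    table[hashcode(ssn)] = (name, ssn)      # no collision handling in this sketch

print(hashcode("120388113D"))   # 4
print(table[4])                 # ('Drew Smith', '120388113D')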

Tuesday, March 24, 2009

Find shortest path using Dijkstra's Algorithm

Dijkstra's algorithm entails the following procedure:
- The breadth-first approach is used to traverse the network; however, the difference here is that instead of a queue data structure to store the traversed vertices, this algorithm uses a priority queue. The priority queue removes items in order of their values rather than in the order they were added to the queue.
- Keep an account of the lowest total path weight (i.e. weightsum) from the start vertex to each vertex
- Keep an account of the immediate predecessor of each vertex on that path
- Keep an account of the items in the priority queue

Exercise- Find the shortest path from Pori to Kuopio


1
----------------------------------------------------
Vertex - Weightsum - Predecessor
----------------------------------------------------
Helsinki - null - null
Ivalo - null - null
Kouvola - null - null
Kuopio - 409 - Pori
Oulu - null - null
Pori - 0 - Pori
Tampere - 110 - Pori
Turku - 141 - Pori
----------------------------------------------------
Visited: Pori
Priority Queue: (Tampere,110), (Turku,141), (Kuopio,409)

2
----------------------------------------------------
Vertex - Weightsum - Predecessor
----------------------------------------------------
Helsinki - 287 - Tampere
Ivalo - null - null
Kouvola - null - null
Kuopio - 404 - Tampere
Oulu - null - null
Pori - 0 - Pori
Tampere - 110 - Pori
Turku - 141 - Pori
----------------------------------------------------
Visited: Pori,(Tampere,110)
Priority Queue: (Turku,141),(Helsinki,287),(Kuopio,404),(Kuopio,409)

3
----------------------------------------------------
Vertex - Weightsum - Predecessor
----------------------------------------------------
Helsinki - 287 - Tampere
Ivalo - null - null
Kouvola - null - null
Kuopio - 404 - Tampere
Oulu - null - null
Pori - 0 - Pori
Tampere - 110 - Pori
Turku - 141 - Pori
----------------------------------------------------
Visited: Pori,(Tampere,110),(Turku,141)
Priority Queue: (Helsinki,287),(Kuopio,404),(Kuopio,409)

4
----------------------------------------------------
Vertex - Weightsum - Predecessor
----------------------------------------------------
Helsinki - 287 - Tampere
Ivalo - null - null
Kouvola - 364 - Helsinki
Kuopio - 404 - Tampere
Oulu - null - null
Pori - 0 - Pori
Tampere - 110 - Pori
Turku - 141 - Pori
----------------------------------------------------
Visited: Pori,(Tampere,110),(Turku,141),(Helsinki,287)
Priority Queue: (Kouvola,364),(Kuopio,404),(Kuopio,409)

5
----------------------------------------------------
Vertex - Weightsum - Predecessor
----------------------------------------------------
Helsinki - 287 - Tampere
Ivalo - null - null
Kouvola - 364 - Helsinki
Kuopio - 404 - Tampere
Oulu - 898 - Kouvola
Pori - 0 - Pori
Tampere - 110 - Pori
Turku - 141 - Pori
----------------------------------------------------
Visited: Pori,(Tampere,110),(Turku,141),(Helsinki,287),(Kouvola,364)
Priority Queue: (Kuopio,404),(Kuopio,409),(Oulu,898)

6
----------------------------------------------------
Vertex - Weightsum - Predecessor
----------------------------------------------------
Helsinki - 287 - Tampere
Ivalo - null - null
Kouvola - 364 - Helsinki
Kuopio - 404 - Tampere
Oulu - 898 - Kouvola
Pori - 0 - Pori
Tampere - 110 - Pori
Turku - 141 - Pori
----------------------------------------------------
Visited: Pori,(Tampere,110),(Turku,141),(Helsinki,287),(Kouvola,364),(Kuopio,404)
Priority Queue: (Kuopio,409),(Oulu,898)

Therefore the shortest path from Pori to Kuopio is Pori, (Tampere,110), (Kuopio,404), i.e. Pori -> Tampere -> Kuopio with a total weight of 404.
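
A minimal Python sketch of this procedure, assuming the network is given as a dictionary mapping each vertex to its weighted neighbours. Like the walkthrough above, it uses a "lazy" priority queue, so stale duplicate entries (such as the two Kuopio entries) are simply skipped when they are popped.

import heapq

def dijkstra(graph, start):
    # graph: dict mapping vertex -> {neighbour: edge weight}
    # Returns (weightsum, predecessor) dictionaries for every reachable vertex.
    weightsum = {start: 0}
    predecessor = {start: start}
    visited = set()
    queue = [(0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u in visited:
            continue                        # stale duplicate entry, skip it
        visited.add(u)
        for v, w in graph[u].items():
            if v not in visited and (v not in weightsum or d + w < weightsum[v]):
                weightsum[v] = d + w        # found a cheaper path to v
                predecessor[v] = u
                heapq.heappush(queue, (d + w, v))
    return weightsum, predecessor

For the road network pictured above (the figure is not reproduced here), this yields weightsum['Kuopio'] == 404 with predecessor['Kuopio'] == 'Tampere', matching step 6 of the walkthrough.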

Friday, March 20, 2009

What is a Network in the context of Graph?

A network refers to a graph in which the edges are bound with weights. Here, weights are numbers assigned to the edges. Hence, a graph of this nature is referred to as a weighted graph or a network.

Thursday, March 19, 2009

Graph Traversal Methods

There are two possible traversal methods for a graph:
i. Breadth-First Traversal
ii. Depth-First Traversal

i. Breadth-First Traversal: Traverse all the vertices starting from a given 'start vertex'. Such a traversal follows the 'neighbours-first' principle while observing the following:
- No vertex is visited more than once
- Only those vertices that can be reached are visited

Importantly, here we make use of the Queue data structure. In effect, we place the vertices to be visited in the queue, and they are visited in the order in which they were added to this queue, i.e. first-in, first-out.

Visiting a vertex involves returning the data item in the vertex and adding its neighbour vertices to the queue. However, we don't carry out any action under the following condition:
- the neighbour is already present in the queue or has already been visited

As an example of the breadth-first traversal, we traverse the major cities of Finland with the start vertex as 'Pori'. Note! The neighbour(s) of a given vertex are added to the queue in alphabetical order.



1
Visited: Pori
Queue: Tampere, Turku

2
Visited: Pori, Tampere
Queue: Turku, Helsinki, Kuopio

3
Visited: Pori, Tampere, Turku
Queue: Helsinki, Kuopio

4
Visited: Pori, Tampere, Turku, Helsinki
Queue: Kuopio, Kouvola

5
Visited: Pori, Tampere, Turku, Helsinki, Kuopio
Queue: Kouvola

6
Visited: Pori, Tampere, Turku, Helsinki, Kuopio, Kouvola
Queue: Oulu

7
Visited: Pori, Tampere, Turku, Helsinki, Kuopio, Kouvola, Oulu
Queue: empty

Therefore, the route yielded by the Breadth-First traversal:
Pori, Tampere, Turku, Helsinki, Kuopio, Kouvola, Oulu
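
A small Python sketch of this breadth-first traversal. The adjacency dictionary below is inferred from the walkthrough above (the original figure is not reproduced), so treat it as an assumption.

from collections import deque

roads = {
    'Pori':     ['Tampere', 'Turku'],
    'Tampere':  ['Helsinki', 'Kuopio', 'Pori'],
    'Turku':    ['Helsinki', 'Pori'],
    'Helsinki': ['Kouvola', 'Kuopio', 'Tampere', 'Turku'],
    'Kuopio':   ['Helsinki', 'Tampere'],
    'Kouvola':  ['Helsinki', 'Oulu'],
    'Oulu':     ['Kouvola'],
}

def breadth_first(graph, start):
    visited = []
    queue = deque([start])
    seen = {start}                      # vertices already visited or queued
    while queue:
        u = queue.popleft()             # first-in, first-out
        visited.append(u)
        for v in sorted(graph[u]):      # neighbours in alphabetical order
            if v not in seen:           # skip if already visited or in the queue
                seen.add(v)
                queue.append(v)
    return visited

print(breadth_first(roads, 'Pori'))
# ['Pori', 'Tampere', 'Turku', 'Helsinki', 'Kuopio', 'Kouvola', 'Oulu']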


ii. Depth-First Traversal: Traverse all the vertices starting from a given 'start vertex'.
Such a traversal follows the 'go-deep-first' principle, i.e. it keeps following unvisited neighbours as far as possible before backtracking, while observing the following:
- No vertex is visited more than once
- Only those vertices that can be reached are visited

Importantly, here we make use of the Stack data structure. In effect, we stack the vertices so that the vertex to be visited next is the one most recently pushed onto the stack, i.e. last-in, first-out.

Visiting a vertex involves returning the data item in the vertex and adding its neighbour vertices to the stack. However, we don't carry out any action if the following occurs:
- the neighbour is already present in the stack or has already been visited


As an example of the depth-first traversal, we traverse the major cities of Finland with the start vertex as 'Pori'. Note! The neighbour(s) of a given vertex are pushed onto the stack in alphabetical order.

1
Visited: Pori
Stack:
-------
Turku
-------
Tampere
-------

2
Visited: Pori, Turku
Stack:
-------
Helsinki
-------
Tampere
-------

3
Visited: Pori, Turku, Helsinki
Stack:
-------
Kuopio
-------
Kouvola
-------
Tampere
-------

4
Visited: Pori, Turku, Helsinki, Kuopio
Stack:
-------
Kouvola
-------
Tampere
-------

5
Visited: Pori, Turku, Helsinki, Kuopio, Kouvola
Stack:
-------
Oulu
-------
Tampere
-------

6
Visited: Pori, Turku, Helsinki, Kuopio, Kouvola, Oulu
Stack:
-------
Tampere
-------

7
Visited: Pori, Turku, Helsinki, Kuopio, Kouvola, Oulu, Tampere
Stack:
-------
Empty
-------

Therefore, the route yielded by the Depth-First traversal:
Pori, Turku, Helsinki, Kuopio, Kouvola, Oulu, Tampere
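
A matching Python sketch of the depth-first traversal, reusing the same (inferred) adjacency dictionary as in the breadth-first sketch above.

roads = {   # adjacency inferred from the walkthrough; the figure is not reproduced
    'Pori': ['Tampere', 'Turku'], 'Tampere': ['Helsinki', 'Kuopio', 'Pori'],
    'Turku': ['Helsinki', 'Pori'], 'Helsinki': ['Kouvola', 'Kuopio', 'Tampere', 'Turku'],
    'Kuopio': ['Helsinki', 'Tampere'], 'Kouvola': ['Helsinki', 'Oulu'], 'Oulu': ['Kouvola'],
}

def depth_first(graph, start):
    visited = []
    stack = [start]
    while stack:
        u = stack.pop()                     # last-in, first-out
        if u in visited:
            continue
        visited.append(u)
        for v in sorted(graph[u]):          # push in alphabetical order,
            if v not in visited and v not in stack:   # so the last pushed is on top
                stack.append(v)
    return visited

print(depth_first(roads, 'Pori'))
# ['Pori', 'Turku', 'Helsinki', 'Kuopio', 'Kouvola', 'Oulu', 'Tampere']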

Types of Graphs

There are two types of graphs:
i. Undirected Graphs
ii. Directed Graphs

Undirected Graph: A graph whose edges are unordered pairs of vertices, i.e. the edges have no direction defined. An example of such a graph is the 'Family tree of the Greek gods'.

Directed Graph: A graph whose edges are ordered pairs of vertices, with the direction indicated by an arrow. An example of such a graph is 'A finite-state machine of a light switch'.

Tuesday, March 17, 2009

What is a Graph data structure?

A graph is an Abstract Data Type (ADT) of a non-linear structure entailing a set of nodes and a set of connectors. Here, the connectors help to establish relationships between nodes. However, in the graph ADT literature the nodes are referred to as Vertices and the connectors are referred to as Edges.
Mathematically, a graph G entails vertices V, which are connected by edges E. Hence, a graph is defined as an ordered pair G=(V,E), where V is a finite set and E is a set consisting of two-element subsets of V.
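
As a small illustration of the definition, here is one possible (assumed) Python representation: E is stored as two-element subsets of V, and an adjacency list is derived from it.

# A tiny undirected graph G = (V, E).
V = {'a', 'b', 'c', 'd'}
E = {frozenset(('a', 'b')), frozenset(('b', 'c')),
     frozenset(('a', 'c')), frozenset(('c', 'd'))}

adjacency = {v: set() for v in V}
for edge in E:
    u, w = tuple(edge)          # each edge is a two-element subset of V
    adjacency[u].add(w)
    adjacency[w].add(u)

print(sorted(adjacency['c']))   # ['a', 'b', 'd']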

Wednesday, March 11, 2009

What is Bucket Sort?

Bucket Sort (alternatively known as bin sort) is an algorithm that sorts numbers and comes under the category of distribution sorts.
This sorting algorithm doesn't compare the numbers against each other but distributes them; it works as follows:
1. Partition the range of a given array into a number of buckets
2. One by one, the numbers in the array are placed in their designated buckets
3. Now sort only the buckets that contain numbers, e.g. by recursively applying this bucket sort algorithm
4. Now, visiting the buckets in order, the numbers come out sorted
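
A minimal Python sketch of the idea, assuming non-negative numeric keys spread over a known range. For brevity each bucket is sorted with the built-in sort rather than by recursing, a common simplification.

def bucket_sort(numbers, bucket_count=10):
    if not numbers:
        return []
    lo, hi = min(numbers), max(numbers)
    width = (hi - lo) / bucket_count or 1           # size of each bucket's range
    buckets = [[] for _ in range(bucket_count)]     # 1. partition into buckets
    for x in numbers:                               # 2. place each number in its bucket
        index = min(int((x - lo) / width), bucket_count - 1)
        buckets[index].append(x)
    result = []
    for b in buckets:                               # 4. visit the buckets in order
        result.extend(sorted(b))                    # 3. sort each non-empty bucket
    return result

print(bucket_sort([29, 25, 3, 49, 9, 37, 21, 43]))
# [3, 9, 21, 25, 29, 37, 43, 49]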

What is Radix Sort?

Radix Sort is an algorithm that sorts a list of numbers and comes under the category of distribution sorts.
This sorting algorithm doesn't compare the numbers against each other but distributes them; it works as follows:
1. Sorting takes place by distributing the list of numbers into buckets by passing through the individual digits of each number one by one, beginning with the least significant digit. Here, there are ten buckets in total, keyed 0 through 9.
2. After each pass, the numbers are collected from the buckets, keeping the numbers in bucket order
3. Now redistribute the numbers as in step '1' above, but take into account the next most significant digit of each number, and then follow with step '2'. Repeat until the most significant digit has been processed.

For an illustration concerning this sorting algorithm check the Radix Sorting exercise
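
A minimal Python sketch of a least-significant-digit radix sort for non-negative integers, using ten buckets keyed 0 to 9; it reproduces exercise (a) of the Exercise-Radix Sorting post.

def radix_sort(numbers):
    if not numbers:
        return []
    digit = 1
    while max(numbers) // digit > 0:
        buckets = [[] for _ in range(10)]           # ten buckets keyed 0..9
        for x in numbers:
            buckets[(x // digit) % 10].append(x)    # distribute by the current digit
        numbers = [x for b in buckets for x in b]   # collect, keeping bucket order
        digit *= 10                                 # move to the next significant digit
    return numbers

print(radix_sort([92, 66, 74, 81, 76, 52, 93, 16, 12, 20]))
# [12, 16, 20, 52, 66, 74, 76, 81, 92, 93]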

Tuesday, March 10, 2009

What is Heap Sort?

Heap Sort is a sorting algorithm that sorts data items into ascending or descending order; it comes under the category of comparison-based sorting and is a family member of the selection sorts. As the name suggests, this sorting algorithm uses the concept of a heap, a structure based on the binary tree, and works as follows:
1. Using an array and the given data items, build a heap, i.e. either a min-heap or a max-heap as per the requested sorting order
2. From the heap, remove the root node's data item and insert it into a separate array (or move it to the end of the same array)
3. Wait until the heap order is restored, which produces a new root node. Now repeat procedure '2' above until there are no data items left in the heap.

Click the link -> Heap Sort for the pseudocode

Monday, March 9, 2009

What is Quick Sort?

Quick Sort is a sorting algorithm that sorts data items into ascending or descending order; it comes under the category of comparison-based sorting.
This algorithm works as follows:
1. Reorder by splitting the sequence into left and right halves with a pivot data item in between. Here, the pivot data item is placed by comparison, i.e. it must be greater than or equal to every data item in the left half and less than or equal to every data item in the right half.
2. Now, recursively sort the two halves separately.
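
A minimal (not in-place) Python sketch of the idea; taking the middle element as the pivot is just one possible choice.

def quick_sort(items):
    if len(items) <= 1:
        return items
    pivot = items[len(items) // 2]              # pick a pivot data item
    left = [x for x in items if x < pivot]      # items smaller than the pivot
    middle = [x for x in items if x == pivot]
    right = [x for x in items if x > pivot]     # items greater than the pivot
    return quick_sort(left) + middle + quick_sort(right)   # sort the halves separately

print(quick_sort([34, 54, 66, 12, 43, 28, 83]))
# [12, 28, 34, 43, 54, 66, 83]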

What is Merge Sort?

Merge sort is a sorting algorithm that sorts data items into ascending or descending order; it comes under the category of comparison-based sorting.
Here we apply the divide-and-conquer strategy to sort a given sequence of data items, which can be described as follows:
1. Recursively split the sequence into two halves (i.e. subsequences) until each subsequence contains only a single data item (i.e. a singleton subsequence)
2. Now, recursively merge these subsequences back together, preserving the required order (i.e. ascending or descending)
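
A minimal Python sketch of the idea, sorting into ascending order.

def merge_sort(items):
    if len(items) <= 1:                  # a singleton subsequence is already sorted
        return items
    mid = len(items) // 2
    left = merge_sort(items[:mid])       # 1. recursively split into two halves
    right = merge_sort(items[mid:])
    merged = []                          # 2. merge the halves back in order
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 9, 1, 7, 3]))    # [1, 2, 3, 5, 7, 9]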

Friday, March 6, 2009

What is Selection Sort?

Selection Sort is a sorting algorithm that sorts data items into ascending or descending order; it comes under the category of in-place comparison sorts.
Sorting takes place by stepping through all the data items one by one while looking for either the largest or the smallest data item, and making only one swap after finding it. Hence, this sorting algorithm is referred to as selection sort because on each pass it selects either the largest or the smallest of the remaining unsorted data items and places it in its correct position.

click the link -> Selection Sort for the pseudocode

What is Bubble Sort?

Bubble sort is a sorting algorithm that sorts data items presumed to be arranged in a sequential order (i.e. like an array). Sorting takes place by stepping through the data items one by one in pairs, comparing adjacent data items and swapping each pair that is out of order. The idea of the swapping operation is to gradually move the larger or smaller data items towards the right or the left. This swapping pass is repeated over the data items (i.e. the array) until no swaps are required, indicating that the data items are in ascending or descending order.

click the link -> Bubble Sort for the pseudocode.

Thursday, March 5, 2009

What is a non-linear data structure?

A non-linear data structure is a data structure in which a data item can be connected to several other data items, so that a given data item can reach one or more other data items.

Pros
  • Uses memory efficiently, in that free contiguous memory is not a requirement for allocating data items
  • The number of data items need not be known prior to allocation
Cons
  • Overhead of the link to the next data item
Examples of non-linear data structures are Graphs and Trees. However, Linked Lists and Arrays are linear data structures.

What is a linear data structure?

A linear data structure is one in which the data items are placed in memory contiguously, i.e. one after the other.
Pros
  • Once the first data item is found, it is very easy to reach the following data items
Cons
  • The size of the array must be known prior to allocation
  • Requires contiguous memory; if the free memory space is disjoint, that free space cannot be utilized for the allocation
An example of a linear data structure is an array.

Wednesday, March 4, 2009

What is a sorting algorithm?

A sorting algorithm is an algorithm that sorts the data into ascending or descending order.

Examples of such sorting algorithms are:

Tuesday, March 3, 2009

What is a Search Tree?

A Search Tree is based on the Binary Search Tree (BST), in which each node has at most a left and a right child. All the children to the left of a node have key values smaller than the node's, and all the children to the right of the node have values greater than the node's.

For example, a sequence of numbers like 34, 54, 66, 12, 43, 28, 83 in a Search Tree is represented as follows:
Now, in order to search for a given value, i.e. 43, in this constructed tree, start at the root node. This is done by making a comparison with the node's value: when the search value is greater than the node's value, move to the right child, and vice versa. In this case, we move right because 43 is greater than 34.

From now on, just repeat the comparison procedure until you find a matching value or the comparison ends at a leaf node.

Continuing our search, we now compare 54 with 43; we notice 43 is less than 54, hence we move left and compare the leaf node's value, i.e. 43, with the search value, i.e. 43. Therefore 43 equals 43... match found!

Advantages of such a search tree: Insertion, Removal and Search are performed efficiently. In particular, in a balanced tree each searching iteration reduces the number of elements left to search by roughly one half.
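
A small Python sketch of this search, building the tree from the same sequence and then walking down from the root. The dictionary-based node layout is only an assumption for illustration.

def bst_insert(root, key):
    # root is a nested dict {'key', 'left', 'right'} or None.
    if root is None:
        return {'key': key, 'left': None, 'right': None}
    side = 'left' if key < root['key'] else 'right'
    root[side] = bst_insert(root[side], key)
    return root

def bst_search(root, key):
    while root is not None:
        if key == root['key']:
            return True                   # match found
        root = root['left'] if key < root['key'] else root['right']
    return False                          # fell off a leaf: value not present

root = None
for k in [34, 54, 66, 12, 43, 28, 83]:
    root = bst_insert(root, k)

print(bst_search(root, 43))   # True  (comparison path: 34 -> 54 -> 43)
print(bst_search(root, 50))   # False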


Monday, March 2, 2009

What is Priority Queue?

A priority queue is a data structure for maintaining items or elements in a queue ordered by priority, usually implemented using an array-based heap. A queue of this nature allows every item to be inserted with an assigned priority, and items are removed in order of priority rather than in order of insertion.
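
A tiny illustration using Python's heapq module, which keeps the priority queue as a binary heap stored in an array (a list); items come out by priority, not by insertion order.

import heapq

pq = []                                   # the array (list) behind the priority queue
heapq.heappush(pq, (2, 'write report'))   # (priority, item); lower number = higher priority
heapq.heappush(pq, (1, 'fix bug'))
heapq.heappush(pq, (3, 'tidy desk'))

print(heapq.heappop(pq))   # (1, 'fix bug') - removed by priority, not insertion order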

Sunday, March 1, 2009

What is a Linked List?

A Linked List consists of a sequence of nodes. These nodes are made up of the following:
  • A data field housing the data item
  • One or two reference(s) pointing at other node(s), i.e. pointing to the next/previous node(s)
In this data structure, nodes can be inserted and removed at any point in the list in constant time; however, random access is not possible.
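
A minimal Python sketch of a singly linked node, illustrating constant-time insertion after a known node and the lack of random access.

class ListNode:
    def __init__(self, data):
        self.data = data      # the data field
        self.next = None      # reference pointing at the next node

def insert_after(node, data):
    # Constant-time insertion after a node we already hold a reference to.
    new_node = ListNode(data)
    new_node.next = node.next
    node.next = new_node
    return new_node

head = ListNode('a')
insert_after(head, 'c')
insert_after(head, 'b')       # the list is now a -> b -> c

node = head
while node is not None:       # no random access: walk the list node by node
    print(node.data)
    node = node.next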


Thursday, February 19, 2009

Binary tree traversal: Preorder, Inorder, and Postorder

In order to illustrate a few of the binary tree traversals, let us consider the binary tree below (a binary search tree holding the keys 0 to 10, rooted at 7):

Preorder traversal: To traverse a binary tree in Preorder, the following operations are carried out: (i) Visit the root, (ii) Traverse the left subtree in Preorder, and (iii) Traverse the right subtree in Preorder.
Therefore, the Preorder traversal of the above tree outputs:
7, 1, 0, 3, 2, 5, 4, 6, 9, 8, 10

Inorder traversal: To traverse a binary tree in Inorder, the following operations are carried out: (i) Traverse the left subtree in Inorder, (ii) Visit the root, and (iii) Traverse the right subtree in Inorder.
Therefore, the Inorder traversal of the above tree outputs:
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Postorder traversal: To traverse a binary tree in Postorder, the following operations are carried out: (i) Traverse the left subtree in Postorder, (ii) Traverse the right subtree in Postorder, and (iii) Visit the root.
Therefore, the Postorder traversal of the above tree outputs:
0, 2, 4, 6, 5, 3, 1, 8, 10, 9, 7
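
A small Python sketch that reproduces the three traversals. The tree is reconstructed here from the preorder and inorder outputs given above, since the original figure is not shown.

# Each node is a tuple (key, left subtree, right subtree); None marks an empty subtree.
def node(key, left=None, right=None):
    return (key, left, right)

tree = node(7,
            node(1, node(0), node(3, node(2), node(5, node(4), node(6)))),
            node(9, node(8), node(10)))

def preorder(t):
    return [] if t is None else [t[0]] + preorder(t[1]) + preorder(t[2])

def inorder(t):
    return [] if t is None else inorder(t[1]) + [t[0]] + inorder(t[2])

def postorder(t):
    return [] if t is None else postorder(t[1]) + postorder(t[2]) + [t[0]]

print(preorder(tree))    # [7, 1, 0, 3, 2, 5, 4, 6, 9, 8, 10]
print(inorder(tree))     # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(postorder(tree))   # [0, 2, 4, 6, 5, 3, 1, 8, 10, 9, 7]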

Wednesday, February 18, 2009

String Searching Algorithm - KMP Algorithm

The Knuth-Morris-Pratt string searching algorithm:
- This algorithm searches for a word->w within the main string text->s
- Such a search employs the following observations:
i) A given match (or mismatch) determines where the next match could begin
ii) This information is obtained from the search word itself, which contains sufficient information for making this decision (i.e. determining where the next match begins)

Advantage of this string searching algorithm: It helps to avoid the re-examination of already matched characters

Step1:

returns: No match found!
Search algorithm's observation: We notice no occurrence of the word's first character (i.e. word position i[0]) between positions w[0] & w[8] in the main string text, except at w[5]. Hence, the next character matching operation begins at w[5].

Step2:

returns: Match found!
Search algorithm's observation: We notice that all the characters of the word occur between positions w[5] & w[3] in the main string text. The next character matching operation begins at w[4].

Step3:

returns: No match found!
Search algorithm's observation: We notice no occurrence of the word's first character (i.e. word position i[0]) between positions w[4] & w[9] in the main string text. The character matching operation ends here.
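
A minimal Python sketch of the KMP idea: a failure (prefix) table is computed from the word itself and consulted on a mismatch, so already matched characters are never re-examined. The example strings below are made up for illustration.

def kmp_search(text, word):
    # failure[i] = length of the longest proper prefix of word[:i+1]
    # that is also a suffix of it.
    failure = [0] * len(word)
    k = 0
    for i in range(1, len(word)):
        while k > 0 and word[i] != word[k]:
            k = failure[k - 1]
        if word[i] == word[k]:
            k += 1
        failure[i] = k

    # Scan the text; on a mismatch, fall back using the table instead of
    # re-examining the characters that already matched.
    matches, k = [], 0
    for i, c in enumerate(text):
        while k > 0 and c != word[k]:
            k = failure[k - 1]
        if c == word[k]:
            k += 1
        if k == len(word):
            matches.append(i - k + 1)      # start index of a full match
            k = failure[k - 1]
    return matches

print(kmp_search("ababcabcabababd", "ababd"))   # [10]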

Tuesday, February 17, 2009

Exercise-Radix Sorting

a) Given elements for radix sorting = {92,66,74,81,76,52,93,16,12,20}

Sorted sequence = {12,16,20,52,66,74,76,81,92,93}


b) Given elements for radix sorting = {24,16,63,42,47,30,29,32,27,55,52}

Sorted sequence = {16,24,27,29,30,32,42,47,52,55,63}

Exercise-Manipulating an element in an array (Replace/Insert/Remove)

Given array sequence: S=(6, 5, 4, 3, 2, 1)

a) S.replaceElementAtRank(0,7)
(7, 5, 4, 3, 2, 1)

b) S.insertElementAtRank(1,7); S.replaceElementAtRank(3,8)
(6, 7, 5, 4, 3, 2, 1) => (6, 7, 5, 8, 3, 2, 1)

c) S.removeElementAtRank(2); S.insertElementAtRank(4,S.elementAtRank(2))
(6, 5, 3, 2, 1) => (6, 5, 3, 2, 3, 1)

d) S.insertElementAtRank(1,S.elementAtRank(2)); S.removeElementAtRank(S.elementAtRank(4))
(6, 4, 5, 4, 3, 2, 1) => (6, 4, 5, 3, 2, 1)

Monday, February 16, 2009

Exercise-Reverse Polish Notation (RPN)

a) 2+3+4
=> 34+
=> 234++

b) (2+3)*4
=> 23+
=> 423+*

c) 2*(5+2*3)
=> 23*
=> 23*5+
=> 223*5+*

d) (3+4)*(20-(3*4+2))
=> 34*
=> 34*2+
=> 2034*2+-
=> 34+2034*2+-*

e) 5*(3+4)-(2*(2+2*(1+2)))
=> 12+
=> 212+*
=> 2212+*+
=> 22212+*+*
=> 34+22212+*+*-
=> 534+*22212+*+*-
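
As a complement to the exercise, here is a small stack-based evaluator for postfix expressions. Tokens are space-separated here (an adaptation of the notation above) so that multi-digit operands such as 20 can be told apart from single digits.

def evaluate_rpn(expression):
    stack = []
    for token in expression.split():
        if token.lstrip('-').isdigit():
            stack.append(int(token))       # operand: push it
        else:
            right = stack.pop()            # operand order matters for '-'
            left = stack.pop()
            stack.append({'+': left + right,
                          '-': left - right,
                          '*': left * right}[token])
    return stack.pop()

print(evaluate_rpn('2 3 4 + +'))                  # 9   (exercise a)
print(evaluate_rpn('3 4 + 20 3 4 * 2 + - *'))     # 42  (exercise d)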

Sorting Algorithm - HeapSort & Heapify

HeapSort(A)
---------------------------------
01 BuildMaxHeap(A)
02 For i <- A.Length downto 2 do
03 Exchange A[1] <-> A[i]
04 A.HeapSize <- A.HeapSize-1
05 PercolateDown(A,1)
06 end for
---------------------------------

BuildMaxHeap(A) (i.e. Max-Heapify)
---------------------------------
01 A.HeapSize <- A.Length
02 For i <- (A.Length/2) downto 1 do
03 PercolateDown(A,i)
04 end for
---------------------------------
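
A runnable Python version of the pseudo-code above, with the PercolateDown helper written out (it is referred to above but not listed). Note that it uses 0-based array indices instead of the 1-based indices of the pseudo-code.

def percolate_down(a, i, heap_size):
    # Sift a[i] down until the max-heap property holds below it.
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < heap_size and a[left] > a[largest]:
            largest = left
        if right < heap_size and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def build_max_heap(a):
    for i in range(len(a) // 2 - 1, -1, -1):
        percolate_down(a, i, len(a))

def heap_sort(a):
    build_max_heap(a)
    for end in range(len(a) - 1, 0, -1):
        a[0], a[end] = a[end], a[0]        # move the current maximum to the back
        percolate_down(a, 0, end)          # restore the heap on the shrunken prefix
    return a

print(heap_sort([12, 3, 44, 5, 9, 27]))    # [3, 5, 9, 12, 27, 44]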

Thursday, February 12, 2009

Algorithm that returns the middle element of a double linked list

FindTheMiddleElementofDoubleLinkedList(L)
01 i <- head
02 j <- tail
03 while i != j and i.next != j do
04 i <- i.next
05 j <- j.previous
06 return i

Time complexity - Big O of data structures

a) 2-4 tree - O(log n) [order of log n]

Time complexity - Big O of algorithms

a) if statement - O(n) [order of n]
b) for loop - O(n) [order of n]
c) for loop within a for loop - O(n*n) [order of n squared]
d) for loop within an if statement, where the if statement is within a for loop - O(n + n*n) [order of n plus n squared, i.e. O(n*n)]
e) while loop - O(n) [order of n]
f) MergeSort - O(n log n) [order of n log n]
g) HeapSort - O(n log n) [order of n log n]
h) KMP algorithm - O(m+n) [order of m plus n]

Wednesday, February 11, 2009

Checklist before selecting the right data structure for a given problem

(i) Analyse the problem in order to determine the resource constraints
(ii) Determine the basic operations that need to be supported, such as Insert/Remove/Search
(iii) Now, select the data structure that supports the required operations at the lowest cost

Algorithm to print-all-pairs in the tables (X, Y)

PrintAllPairs(X[0, 1, 2, ..., n-1], Y[0, 1, 2, ..., m-1])
01 for i<-0 to n-1 do
02 for j<-0 to m-1 do
03 Print(X[i], Y[j])
04 end for
05 end for

Algorithm to Sum-up n number of items in the table

Sum-up n number of items in the table and return the computed Sum

SumUp(X[0, 1, 2, ..., n-1])
01 Sum<-0
02 for i<-0 to n-1 do
03 Sum <- Sum + X[i]
04 end for
05 return Sum

Time complexity of algorithms - In particular, Big O of algorithms

Big O, the so-called asymptotic upper bound, of algorithm constructs such as sequential statements, if statements and looping constructs

(i) Sequential statements:
The Big O of this construct equals the sum of the time complexities of the individual statements
Therefore, O[g(n)] = O[f1(n) + f2(n) + ... + fk(n)]
Conclusion: The overall complexity of this construct equates to O[g(n)]

(ii) If statements:
The Big O of this construct equals the maximum of the time complexities of the alternative branches, i.e. the then branch -> f(n) and the else branch -> g(n)
Therefore, O[h(n)] = O[max(f(n), g(n))]
Conclusion: The overall complexity of this construct equates to O[h(n)]

(iii) Looping constructs:
The Big O of this construct equals the product of the following: (a) the execution time of one iteration (i.e. the elementary operations executed in a given iteration) -> f(n), and (b) the number of iterations -> g(n)
Therefore, O[f(n) * g(n)]
Conclusion: The overall complexity of this construct equates to O[f(n) * g(n)]

Tuesday, February 10, 2009

Operation time complexity for a LINKED LIST

The time complexity of handling the operations in a data structure like a LINKED LIST are as follows:

i. Most of the operations are O(n) [i.e. order of n]
ii. Remove/Insert operations on an element between other elements require O(1) [i.e. constant time], given a reference to the position. This is because the nodes hold references/pointers to their neighbours, so the elements don't need to be shifted. However, due to these references/pointers this data structure requires additional memory!

Operation time complexity for an ARRAY

The time complexity of handling the operations in a data structure like an ARRAY are as follows:

i. Almost all the operations are O(1) [i.e. constant time]
ii. Remove/Insert operations on an element between other elements require O(n) [i.e. order of n] because the elements need to be shifted.
Note! Hence, an array as a data structure is used where remove/insert operations are not required.

Monday, February 9, 2009

Selection Sorting An Array

Selection sorting an array A[0, 1, 2, ..., n-1]

01 SelectionSort(A[0, 1, 2, ..., n-1])
02 for (j <- 0) to (n-2) do
03 m <- j
04 for (i <- j+1) to (n-1) do
05 if A[m] > A[i] then
06 m <- i
07 end if
08 end for
09 exchange A[m] <-> A[j]
10 end for

Bubble Sorting An Array

Bubble sorting an array A[0, 1, 2, ..., n-1]

01 BubbleSort(A[0, 1, 2, ..., n-1])
02 i <- 0
03 repeat
04 exch <- False
05 for (j <- 0) to (n-2-i) do
06 if A[j] > A[j+1] then
07 exchange A[j] <-> A[j+1]
08 exch <- True
09 end if
10 end for
11 i <- i+1
12 Until exch = False

Click here for explanation on what is bubble sort?

Computing The Median of An Array

Pseudocode for computing the median of an array A[0, 1, 2, ..., n-1]

01 SelectMedian(A[0, 1, 2, ..., n-1])
02 BubbleSort(A) or SelectionSort(A)
03 if n is odd then
04 r <- A[(n-1)/2]
05 else
06 r <- (A[n/2] + A[(n/2)-1])/2
07 end if
08 return r