Lecture 5 - Data Structures
- we cannot always malloc more memory for an array if the memory after the current array position is already taken
- one solution is to move array to new location, however that adds operation time of O(n)
- This problem introduces the challenge in computer science of balancing the form of the memory that are allocated for programs against the worst-case running times
- i.e. we can make improvements for one which usually has a detrimental impact on the other
- This balancing act comes is manipulated in the data structures used in a program
- Data structures are complex ways to organise data in memory, allowing information to be stored in the different layouts
- Tools needed to build data structures are
- structs: to create custome data types
- . to access properties in a structure
- * to go to an address in memory pointed to by a pointer
- - > to access propertied in a structure pointed to by a pointer
- this is a new operator, and it is what linked list uses to go to a new address and look inside of structure.
Linked Lists
- with a linked list we can store a list of values which can be easily grown by storing subsequent values in different parts of memory
- To track all values in a linked list, we need to allocate enough memory for both the value of each element we want to store and the address of the next element
- the dual values are called a node
- a node contains both an item and a pointer called next
Example structure for a node
typedef struct node
{
int number;
struct node *next;
}
node;
- The tradeoff of this flexibility is we need to allocate twice as much memory for each element in order to spend less time adding values
- We can no longer use binary search either
- by convention, an additional element to the null indicating the end of list, is an element which is a pointer to the first item in the list
Stacks and Queues
- A queue is one type of data structure
- has the form of first in first out
- an item can be enqueued, that is the item can be put in queue
- an item can be dequeued, that is the item leaves the queue
- A stack is another type
- has the form of last in and last out
- an item can be pushed on stack, that is put on top of stack
- the top item can be popped off stack, that is taken from the stack
- there is a function in C called realloc which can reallocate memory for an array which has run out of contiguous memory space
- this function takes two arguments, the array to be copied and the size of the new array
- realloc handles all the following stages for you
- creates a new array of requested size
- copies values from array to be copied to new array
- creates a pointer for new array
- once copying is done, realloc frees the memory of first array
- the pointer for previous array is then changed, pointing now to the new array
- useful as amount of data in array changes, realloc can be used to grow or shrink the stack or queue
Trees
- Binary search trees are another data structure
- They store data more efficiently than stacks or queues by allowing faster searching and retrieval
- searching tree can be implemented by using recursion
- a tree can offer dynamism because it grows and shrinks easily
Dictionaries
- another data structure which have a key and value, much like a word and definition in a language dictionary
Hashing a Hash Tables
- hashing is the idea of taking a value or part of it and using that that hash as a shortcut to finding it later
- for example, in a data structure containing fruit, apples could be hashed by using the letter a, putting all fruit with the letter a in that ‘bucket’, and then find the apple data by using the shortcut of looking in the ‘a’ bucket.
- This example demonstrates the concept of bucketising
- A hash function is an algorithm which reduces a larger value into something small and predictable
- generally involves representing a new array item with a number which represents where item should be placed
- A hash table is a combination of arrays and linked lists.
- it is an array of pointers to nodes
Tries
- tries are a form of data structure searchable in real time
- one downside is they use a lot of memory