miniDB Project 2022-2023 P20176, P20235 #218

irosolonaki · 2023-02-20T21:31:47Z

Team members and student IDs:
Σολωνάκη Ηρώ - Π20176
Ραφαηλίδη Σοφία Ιωάννα - Π20235

Issues 1, 2, and 3a have been implemented.

Issue 1:

a) The changes were made mainly in table.py and misc.py, where the condition is parsed to check whether the operators “not” or “between” are present.
For not, the keyword “not” is removed from the condition; the operators are then reversed to their negated form in misc.py and the result is returned to table.py. For between, the condition is split once on between and once on and, since the syntax is “salary between 67000 and 80000”, which translates to “salary >= 67000 and salary <= 80000”. The query “salary not between 67000 and 80000” is also supported and translates to “salary < 67000 or salary > 80000”. Besides select, the operators are also supported in delete and update.
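The rewriting described above can be sketched roughly as follows. This is only an illustration under assumptions, not the code in table.py or misc.py: the function name `rewrite_condition` and the `NEGATED` table are hypothetical, and the sketch handles a single simple condition.

```python
import re

# Hypothetical table mapping each comparison operator to its negation.
NEGATED = {"=": "!=", "!=": "=", ">": "<=", "<=": ">", "<": ">=", ">=": "<"}

# Matches "<column> [not] between <low> and <high>".
BETWEEN = re.compile(r"(\w+)\s+(not\s+)?between\s+(\S+)\s+and\s+(\S+)", re.IGNORECASE)


def rewrite_condition(condition: str) -> str:
    """Rewrite a 'not' or 'between' condition into plain comparisons."""
    cond = condition.strip()
    match = BETWEEN.fullmatch(cond)
    if match:
        column, negated, low, high = match.groups()
        if negated:
            return f"{column} < {low} or {column} > {high}"
        return f"{column} >= {low} and {column} <= {high}"
    if cond.lower().startswith("not "):
        rest = cond[4:].strip()
        # Try two-character operators first so '<=' is not read as '<'.
        for op in ("<=", ">=", "!=", "=", "<", ">"):
            if op in rest:
                left, right = rest.split(op, 1)
                return f"{left.strip()} {NEGATED[op]} {right.strip()}"
    return cond
```

For instance, under these assumptions `rewrite_condition("not salary <= 72000")` yields `salary > 72000`.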

Execution examples:

select * from instructor where not name = "gold"

select * from instructor where not salary > 67000

select * from instructor where not salary <= 72000

select * from instructor where salary between 62000 and 80000

select * from instructor where salary not between 62000 and 80000

delete from instructor where not salary > 72000

update table instructor set dept_name=finance where not salary < 87000

b) The logic is similar to 1a: we split the condition on the operators “and” or “or” and build 2 lists that hold all the conditions and the rows that match them. For “and”, we evaluate the sub-conditions one by one, collect the matching rows for each, and convert them to sets to find their intersection, that is, their common elements. For “or”, we split the compound condition in the same way, but we remove duplicates by checking for common elements across the sublists of the results list (each sublist holds the indexes of the matching records), and finally add the correct rows to the final result.
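A minimal sketch of the two combination steps, assuming each sub-condition has already been evaluated to a list of row indexes. The function names are hypothetical and are not taken from the PR.

```python
def rows_matching_all(row_lists):
    """AND: keep only the row indexes present in every condition's result."""
    if not row_lists:
        return []
    common = set(row_lists[0])
    for rows in row_lists[1:]:
        common &= set(rows)  # set intersection
    return sorted(common)


def rows_matching_any(row_lists):
    """OR: merge the results, skipping indexes that were already collected."""
    seen = set()
    merged = []
    for rows in row_lists:
        for index in rows:
            if index not in seen:
                seen.add(index)
                merged.append(index)
    return merged
```

With the inputs `[[0, 2, 5], [2, 5, 7]]`, the first function keeps rows 2 and 5, while the second returns rows 0, 2, 5 and 7 without repeating any of them.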

Execution examples:

select * from course where title = "game design" and dept_name = "comp.sci"

select * from course where title = "game design" or dept_name = "biology"

Issue 2:

First, for the unique constraint, mdb.py was modified so that the keyword dictionary for create index also receives the specific column that is meant to carry the unique constraint. A columns_unique key is also added to the dictionary. The same attribute was added to the Table object as columns_unique, so that, as with the primary key index, there is now also a unique index: a list containing the names of the columns with a unique constraint.

The syntax is:
create table games(id int primary key, title str unique, studio str)

a) The main implementation is in database.py, where we added 2 new columns to the meta_indexes table for the unique column and the index type. We had to create a new database, which is why the examples are inside the test database. In the select method we check whether the table has an index, on which column, and whether that column is unique or the primary key, and then run the search that matches the index type. The create_index method was changed so that a btree index can be created not only on the primary key but also on columns with the unique constraint.
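The lookup in select could look roughly like the sketch below. It assumes meta_indexes can be read as a list of dictionaries with the hypothetical keys `table_name`, `column` and `index_type`; the real table layout in database.py may differ. The returned names `select_with_btree` and `select_with_hash` follow the commit notes in this PR, while `linear_scan` is only a placeholder label.

```python
def choose_select_method(meta_indexes, table_name, column, operator):
    """Pick a select strategy from the index metadata (hypothetical layout)."""
    for entry in meta_indexes:
        if entry["table_name"] == table_name and entry["column"] == column:
            # A hash index only helps equality lookups; range queries
            # fall back to a plain scan.
            if entry["index_type"] == "hash" and operator != "=":
                return "linear_scan"
            return f"select_with_{entry['index_type']}"
    # No index on this column.
    return "linear_scan"
```

Keeping the decision in one small function makes it easier to support several indexes on the same table, since only the metadata entry for the queried column matters.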

Execution examples:

create index btree_gamestitle on games(title) using btree

Duplicates cannot be inserted into a column with a unique constraint:

b) As in 2a, we check whether the table has an index. For the hash index we had to implement a complete hashtable object that uses extendible hashing; specifically, the MSB variant was used. A new method was also created in database.py for the construct_index_hash functionality, and another in table.py so that selection can use the hash index. For range queries we fall back to a simple linear search.
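For readers unfamiliar with the technique, here is a compact sketch of extendible hashing in the MSB variant, where the directory slot is taken from the leading bits of the hash. It is not the HashTable in hash.py: the class names, the bucket capacity of 2 and the use of `zlib.crc32` as the hash function are all assumptions made for illustration.

```python
import zlib

HASH_BITS = 32    # crc32 yields a 32-bit value (assumed hash function)
BUCKET_SIZE = 2   # assumed bucket capacity


class Bucket:
    def __init__(self, depth):
        self.depth = depth  # local depth
        self.items = {}


class ExtendibleHash:
    def __init__(self):
        self.global_depth = 0
        self.directory = [Bucket(0)]

    def _slot(self, key):
        # MSB variant: the directory index is the top global_depth bits.
        digest = zlib.crc32(repr(key).encode())
        return digest >> (HASH_BITS - self.global_depth)

    def get(self, key):
        return self.directory[self._slot(key)].items.get(key)

    def insert(self, key, value):
        while True:
            bucket = self.directory[self._slot(key)]
            has_room = len(bucket.items) < BUCKET_SIZE
            # A bucket at maximum depth cannot split, so it may overflow.
            if key in bucket.items or has_room or bucket.depth == HASH_BITS:
                bucket.items[key] = value
                return
            self._split(bucket)

    def _split(self, bucket):
        if bucket.depth == self.global_depth:
            # Double the directory: old slot i becomes slots 2i and 2i+1.
            self.directory = [b for b in self.directory for _ in (0, 1)]
            self.global_depth += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth)
        # Slots of one bucket are contiguous; the upper half moves to the sibling.
        slots = [i for i, b in enumerate(self.directory) if b is bucket]
        for i in slots[len(slots) // 2:]:
            self.directory[i] = sibling
        # Redistribute the old items between the bucket and its sibling.
        old_items, bucket.items = bucket.items, {}
        for k, v in old_items.items():
            self.directory[self._slot(k)].items[k] = v
```

Because the slot comes from the leading bits, all slots that share a bucket sit next to each other in the directory, which is why a split only has to reassign the upper half of that run. Equality lookups touch a single bucket, whereas a range query gains nothing from this layout, consistent with the linear-search fallback mentioned above.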

Execution examples:

create index hash_gamesid on games(id) using hash

Creating the indexes:

Issue 3:

a) It contains 5 equivalence rules based on DBMS theory. It works by changing the order in which the sub-queries of each original query are executed, through the dictionary used for the query plan in mdb.py. We build sublists of temporary queries that are waiting to be executed and pass them to the rules, getting back the equivalent modified query.
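As an illustration of what such a rule can look like, the sketch below applies one textbook equivalence, pushing a selection below a join when its column belongs to only one input. It is not necessarily one of the 5 rules in this PR. The plan layout (nested dictionaries with `select`, `from`, `where`, `join`, `left`, `right`, `on` keys), the `columns` catalog and the function name are assumptions.

```python
def push_selection_into_join(plan, columns):
    """Rewrite sigma_c(A join B) as sigma_c(A) join B when only A has c's column."""
    source, condition = plan.get("from"), plan.get("where")
    if condition is None or not isinstance(source, dict) or "join" not in source:
        return plan
    # Assumption: the condition is written as "<column> <op> <value>".
    column = condition.split()[0]
    owners = [
        side for side in ("left", "right")
        if isinstance(source[side], str) and column in columns.get(source[side], ())
    ]
    if len(owners) != 1:
        return plan  # ambiguous or unknown column: leave the plan untouched
    side = owners[0]
    pushed = {"select": "*", "from": source[side], "where": condition}
    # Build new dictionaries so the original plan is not mutated.
    return {**plan, "from": {**source, side: pushed}, "where": None}
```

Filtering one input before the join reduces the number of rows the join has to process. A column present in both inputs, such as dept_name in the example query below, is left alone by this sketch because the condition does not say which table it refers to.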

Execution examples:

select * from instructor inner join department on instructor.dept_name=department.dept_name where dept_name!=biology

b) Not implemented. The goal was to compute the cost through the execution plan tree.

The table.py and misc.py files were mainly changed so that they recognise both operators and change the condition accordingly. NOT Operator: We take the condition and check whether the 'not' keyword exists; if it does, we remove it and set a parameter named 'notcheck' to true, so that in the misc.py file the operator is reversed to its negated operator. Finally we reconstruct the condition and send it back to the table.py file. BETWEEN Operator: We check for two keywords in the condition, 'between' and 'and', and split the condition twice, once on each keyword. We get the 2 values of the wanted range and check whether each entry in the table is within that range. We return the correct records to the 'values' list and then add them to the rows.

Fixed some issues in the between and not operators, added 'and' and 'or' functionality. AND: The conditions are put in a list and split accordingly to create an intersection of the rows that correspond with the conditions. OR: Similar functionality, but the rows are created by checking each index in the list of rows that again correspond with at least one condition (meaning at least one is true) and adding those to the final rows.

…nality. UNIQUE constraint: Changed the dictionary of the query plan to specify the column in which the index will be created, then changed the Table object to have the unique attribute. Btree Indexing functionality: 2 columns were added to the meta-indexes table for the specific column and index type created. The table.py, database.py and mdb.py files were mainly changed. *Issue with the B+Tree nodes was found.

*not fully commented yet, needs code clean up. The Hash-Index functionality is based on a HashTable object, created in the hash.py file, which uses extendible hashing with the MSB variant and the bucket-splitting method. Also added the ability to create more than one index on a table (one column with a btree index and one with a hash index, for example); the correct index is fetched from the meta_indexes table and used to choose the appropriate select method (select_with_btree or select_with_hash). Previously only one index type at a time was supported.

Optimizer object that changes the order of the sub-queries inside the items of the query plan

irosolonaki added 9 commits January 16, 2023 17:11

Code clean up and comments

3065336

Added all operators in update_table and fixed bug in delete_from

c16b17a

Comment unnecessary prints

ff4ef4f

Added Unique in headers

7f7ce88

Approach to RA Expressions

0556c1d

Optimizer object that changes the order of the sub-queries inside the items of the query plan
