W4111 -- Introduction to Databases
Homework 3
Spring 2019, Sections 03, V03, H03

Introduction

This is the specification for Homework 3 of W4111 - Introduction to Databases, sections 03, H03, and V03, for the spring semester of 2019. This document is always the current version of the specification. Developers are responsible for continuously reviewing the document for changes.

Document Control

Roles

Author                 UNI       Role
Ferguson, Donald, F.   dff9      Instructor

Approver               UNI       Role
Ferguson, Donald, F.   dff9      Instructor

Reviewer               UNI       Role
Dalchand, Samantha     sd2995    Assistant Instructor
Dhillon, Kirit         ksd2142   Assistant Instructor
Gandikota, Chandana    cg3111    Assistant Instructor
Gorrela, Meghna        mg3740    Assistant Instructor
Huang, Rose            rh2805    Assistant Instructor
Hudson, Alysha         alh2202   Assistant Instructor
Karasev, Mikhail       mak2257   Assistant Instructor
Peterson, Ara          alp2210   Assistant Instructor
Saosun, Tahsina        ts2931    Assistant Instructor
Swaroop, Vatsala       vs2671    Assistant Instructor
Tan, Xinyue            xt2215    Assistant Instructor

Change Log

Change No.   Date   Document Version   Changes

Change Process

Students should post clarification requests on this Piazza thread. The current version of this document and the change log will note changes/clarifications. There will not be any other source documenting changes or clarifications.

Overview

This project has two parts:
1. Implement indexes on top of CSV data in CSVDataTable implementations.
2. Query optimization:
   1. Add a JOIN function to the CSVDataTable and implement query optimizations.
   2. Use Access Paths based on index selection to optimize find_by_template() and join().

Allowed Frameworks/Libraries

● You MAY only use libraries that are part of the core Python environment, e.g. csv, json, etc.
● You MUST NOT use Pandas.

Indexes and File

1. The CSVDataTable manages dictionaries (also known as maps or name-value pairs). An individual dictionary represents a row. The CSVDataTable as a whole represents all of the rows in a CSV file.
2. A template is also a dictionary.
   A row matches a template if, for each key in the template, the row has that key with exactly the same value as in the template.
3. Your CSVDataTable must implement the following operations:
   1. insert(row)
   2. find_by_template(template, field_list, index_allowed)
      1. This must return a CSVDataTable.
      2. The table contains dictionaries that match the template and contain the requested fields (dictionary keys and values).
      3. If index_allowed is True, the find may use an index if one supports the template.
   3. delete(template) deletes all rows matching a template.
   4. add_index(name, kind, column_list):
      1. name is a caller-defined name.
      2. kind is one of “PRIMARY”, “UNIQUE”, “INDEX”.
         1. “UNIQUE” means that at most one row may exist in the table for a set of column values specified by the column_list.
         2. “PRIMARY” has the same behavior as “UNIQUE”, but there can be only one “PRIMARY” index.
         3. “INDEX” allows duplicate values.
      3. column_list is the set of column names that comprise the index definition.
   5. import(rows): rows is a list of dictionaries. This operation inserts the rows into the CSVDataTable.
   6. save(): This function saves the CSVDataTable data (rows) and index information to a single file.
   7. load(): This function loads the rows and index information from a single file.
4. Data file behavior:
   1. Index information/state must persist between a save() and load(). You may not rebuild indexes on data load.
   2. load() loads the entire data and indexes.
   3. save() saves all of the data and index information.
5. Your implementation should perform input validation on methods.
6. Indexes only need to support equality comparisons.

Join and Query Optimization

Your CSVDataTable implementation must support the following operation:

join(other_table, on_columns, where_template, field_list):
● other_table is a reference to a CSVDataTable.
● on_columns is a list of column names common to both tables.
  The join() function implements an equi-join using these columns.
● where_template is a dictionary. The only comparison operator is “==” and the resulting rows must match the template completely. The keys in the dictionary are of the form:
  ○ table_name.column_name
  ○ This specifies the name of the table and column to which the template element applies.
● field_list is a list of columns of the same form as the template column definition. The query only returns the requested columns.

The join() function returns a CSVDataTable, which supports all CSVDataTable operations.

Your implementation MUST implement at least three optimizations that are analogous to SQL optimizations covered in lectures.

Your implementation should test for obvious error conditions and demonstrate improvements from your optimizations.

Submission Format

You will submit your homework as a zip file. The file will have the following directory structure:
● /src: Your implementation code.
● /tests: Your test code.
● /test_output: The console output from running the tests.
● /CSVFile: Contains any CSV files you use in testing.
● /DB: Contains the holding tables/indexes.

You must name the zip file following guidelines previously posted on Piazza.
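
The template-matching, index, and equi-join semantics described in this specification can be illustrated with a minimal sketch. It is not the required CSVDataTable implementation: it works on plain lists of dictionaries rather than a class, returns lists rather than a CSVDataTable, and omits index kinds, persistence, input validation, and the table_name.column_name form of where_template. The function names matches_template, build_index, and equi_join, the extra parameters index and index_columns, and the sample rows are all assumptions made up for illustration.

```python
from collections import defaultdict


def matches_template(row, template):
    """True if the row has every template key with exactly the same value."""
    return all(k in row and row[k] == v for k, v in template.items())


def build_index(rows, column_list):
    """Map each tuple of indexed column values to the positions of matching rows."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[tuple(row[c] for c in column_list)].append(position)
    return index


def find_by_template(rows, template, field_list=None, index=None, index_columns=()):
    """Return matching rows projected onto field_list.

    The index is used only when the template supplies every indexed column;
    otherwise the function falls back to a full scan.
    """
    if index is not None and all(c in template for c in index_columns):
        key = tuple(template[c] for c in index_columns)
        candidates = [rows[i] for i in index.get(key, [])]
    else:
        candidates = rows
    matched = [r for r in candidates if matches_template(r, template)]
    if field_list is None:
        return [dict(r) for r in matched]
    return [{f: r[f] for f in field_list} for r in matched]


def equi_join(left_rows, right_rows, on_columns):
    """Hash join: index the right side once, then probe it for each left row."""
    right_index = build_index(right_rows, on_columns)
    joined = []
    for left in left_rows:
        key = tuple(left[c] for c in on_columns)
        for position in right_index.get(key, []):
            joined.append({**right_rows[position], **left})
    return joined


# Invented sample data, for illustration only.
people = [{"id": "p1", "name": "Ada"}, {"id": "p2", "name": "Baker"}]
scores = [
    {"id": "p1", "term": "fall", "score": "90"},
    {"id": "p1", "term": "spring", "score": "85"},
    {"id": "p2", "term": "fall", "score": "70"},
]

people_by_id = build_index(people, ["id"])
found = find_by_template(people, {"id": "p2"}, ["name"], people_by_id, ["id"])
joined = equi_join(people, scores, ["id"])
print(found)
print(len(joined))
```

Probing a prebuilt index replaces the nested scan over both tables with one lookup per row on the probing side. That is one possible access-path choice; whether it counts toward the three required optimizations is for the course staff to decide.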