Day 1 Practice Problems


Practice the concepts you learned on Day 1.

1) Data Science Pipeline

What practices are done in Structure Extraction? (Select all that apply)

2) Tables

According to the principles of 'good' table design, why is high redundancy (repeating information) problematic?

If you have a many-to-many relationship, what is a good general approach?
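
To experiment with this question, here is a minimal sketch of one common way to model a many-to-many relationship: a third table that holds one row per pairing. The table names (users, all_groups, memberships), the users rows, and the membership data are assumptions made for illustration; only the gid and group_name columns come from this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: each entity gets its own table ...
    CREATE TABLE users (uid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE all_groups (gid INTEGER PRIMARY KEY, group_name TEXT);
    -- ... and a third table stores one row per (user, group) pairing.
    CREATE TABLE memberships (
        uid INTEGER REFERENCES users(uid),
        gid INTEGER REFERENCES all_groups(gid),
        PRIMARY KEY (uid, gid)
    );
    INSERT INTO users VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO all_groups VALUES (101, 'Hikers'), (102, 'Coders');
    INSERT INTO memberships VALUES (1, 101), (1, 102), (2, 102);
""")

# Recover the user/group pairs by joining through the pairing table.
pairs = conn.execute("""
    SELECT u.name, g.group_name
    FROM memberships AS m
    JOIN users AS u ON u.uid = m.uid
    JOIN all_groups AS g ON g.gid = m.gid
    ORDER BY u.name, g.group_name
""").fetchall()
print(pairs)
conn.close()
```

In this sketch each user name and each group name is stored exactly once, however many memberships exist.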

Which relational algebra operation is used to filter out rows based on a specific predicate (condition)?
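
For intuition, row filtering by a predicate can be sketched over a plain list of dictionaries. The helper name filter_rows and the third row are assumptions made for illustration, not part of the course material.

```python
# Hypothetical sample rows, modeled on the groups data used in this section.
sample_groups = [
    {"gid": 101, "group_name": "Hikers"},
    {"gid": 102, "group_name": "Coders"},
    {"gid": 103, "group_name": "Bakers"},  # extra row added for illustration
]


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true; columns are unchanged."""
    return [row for row in rows if predicate(row)]


# Keep the rows whose gid is greater than 101.
filtered = filter_rows(sample_groups, lambda row: row["gid"] > 101)
print(filtered)
```

The operation changes how many rows come back, but every surviving row keeps all of its columns.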

        # Perform a projection on groups to retrieve only the list of group names.
        # Submit your answer as a Python iterable.
        groups = [
            {"gid": 101, "group_name": "Hikers"},
            {"gid": 102, "group_name": "Coders"}
        ]
        

3) SQL and Pandas

In SQL, what is the result of using a LEFT JOIN on Table A and Table B if a row in Table A has no matching record in Table B?
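
One way to check your answer is to run a small experiment with Python's built-in sqlite3 module. The tables a and b and their contents below are made up for illustration; the second row of a deliberately has no partner in b.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables standing in for Table A and Table B.
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (a_id INTEGER, note TEXT);
    INSERT INTO a VALUES (1, 'has match'), (2, 'no match');
    INSERT INTO b VALUES (1, 'hello');
""")

# Join every row of a to its matching rows in b, if any.
left_join_rows = conn.execute("""
    SELECT a.id, a.name, b.note
    FROM a
    LEFT JOIN b ON b.a_id = a.id
    ORDER BY a.id
""").fetchall()

for row in left_join_rows:
    print(row)
conn.close()
```

Look at what the row with id 2 contains in the column that comes from b.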

When performing an aggregate query in SQL, what is the functional difference between COUNT(*) and COUNT(column_name)?
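
The difference is easiest to see on a column that contains a NULL. This is a minimal sketch using Python's built-in sqlite3 module; the members table and its rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table: one of the three rows has no nickname.
    CREATE TABLE members (uid INTEGER, nickname TEXT);
    INSERT INTO members VALUES (1, 'ace'), (2, NULL), (3, 'bee');
""")

# Compute both aggregates over the same three rows.
count_all, count_nickname = conn.execute(
    "SELECT COUNT(*), COUNT(nickname) FROM members"
).fetchone()

print("COUNT(*):", count_all)
print("COUNT(nickname):", count_nickname)
conn.close()
```

Compare the two printed numbers and relate the gap to the row whose nickname is NULL.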

Which of the following describes an 'Analytics' workload as opposed to a 'Transaction' workload? (Select all that apply)

How does a B-Tree index improve the performance of a query looking for a specific value?
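
To see an index change how a lookup is executed, you can compare query plans before and after creating one. The sketch below uses SQLite, whose ordinary indexes are B-trees. The users table, the index name idx_users_email, and the helper plan are assumptions made for illustration, and the exact plan wording varies across SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(i, f"user{i}@example.com") for i in range(1000)],
)

query = "SELECT uid FROM users WHERE email = 'user500@example.com'"


def plan(sql):
    """Return the human-readable detail text of SQLite's plan for sql."""
    # The last column of each EXPLAIN QUERY PLAN row is the detail string.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


plan_before = plan(query)  # no index on email exists yet
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan(query)   # the same query, now with an index available

print("before:", plan_before)
print("after: ", plan_after)
conn.close()
```

The first plan has to visit the table's rows one by one, while the second can navigate the index to the matching entry.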

In Pandas, what is the specific purpose of the iloc indexer compared to the loc indexer?
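
A quick way to explore this is a DataFrame whose index labels are deliberately out of step with row positions. This sketch assumes pandas is installed; the index values and the 'Bakers' row are made up for illustration.

```python
import pandas as pd

# The index labels (102, 101, 100) intentionally differ from positions (0, 1, 2).
df = pd.DataFrame(
    {"group_name": ["Hikers", "Coders", "Bakers"]},
    index=[102, 101, 100],
)

by_label = df.loc[101, "group_name"]  # the row whose index label is 101
by_position = df.iloc[0, 0]           # the first row and first column by position

print("loc[101, 'group_name']:", by_label)
print("iloc[0, 0]:", by_position)
```

Try df.loc[0] on this frame and note that it raises a KeyError, because no row carries the label 0.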