Skip to content

Bumblebee Sela G#3

Open
SelaCrow wants to merge 11 commits into
Ada-C23:mainfrom
SelaCrow:main
Open

Bumblebee Sela G#3
SelaCrow wants to merge 11 commits into
Ada-C23:mainfrom
SelaCrow:main

Conversation

@SelaCrow
Copy link
Copy Markdown

No description provided.

Copy link
Copy Markdown

@anselrognlie anselrognlie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, but please review my comments, and let me know if you have any questions.

Comment thread tests/test_wave_01.py
Comment on lines +175 to +179
assert updated_data["watched"] == [{
"title": MOVIE_TITLE_1,
"genre": GENRE_1,
"rating": RATING_1
}]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Nice use of the fact that Python compares dictionaries by their contents to simplify these checks, rather than writing out the checks key by key.

Comment thread tests/test_wave_01.py
assert len(updated_data["watched"]) == 1

raise Exception("Test needs to be completed.")
assert updated_data["watchlist"] == []
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking that the watchlist is empty isn't necessary here, since we already checked that the length was 0 earlier.

Comment thread tests/test_wave_01.py
# ****** Add assertions here to test that the correct movie was added to "watched" **********
# *******************************************************************************************
assert updated_data["watchlist"] == [FANTASY_1]
assert updated_data["watched"] == [FANTASY_2, movie_to_watch]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The project description doesn't require that the watched movie ends up at any particular location in the watched list. As a result, rather than checking the results of the watched list against a literal list (which implies an ordering), we might use in checks. We could do the same for the watchlist check, but since there should be only a single movie left there, we actually do know exactly where in the list it is!

Comment thread tests/test_wave_03.py
Comment on lines +65 to +67
assert INTRIGUE_3 in friends_unique_movies
assert HORROR_1 in friends_unique_movies
assert FANTASY_4 in friends_unique_movies
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 The get_friends_unique_watched method doesn't specify the order of the movies (and doing so could restrict the ways in which we might perform uniquing on the movie records). Using in to check only for each movie's presence is a great idea. And since we also know the number of expected records, these checks completely describe the contents of the result.

Comment thread tests/test_wave_05.py
}
recommendations = get_new_rec_by_genre(sonyas_data)

assert len(recommendations) == 0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the specification says this should return a list, it would be more appropriate to check not only that the length of the result is 0, but that the result is itself a list. This is most succinctly checked as

    assert recommendations == []

Note that while in application code, it's considered idiomatic to check for an empty list as not some_list_variable, we should be aware that this is a pretty loose check. It really checks that whatever some_list_variable refers to is falsy. In a test, we like our checks to be more specific than checks in regular code. If we're more strict in our tests than we need to be, then application code can use more idiomatic code without concern, but can also be more precise where desired.

Comment thread viewing_party/party.py
# get list of friends unique movie
# add to empty genre rec only if genre of the movies is equal to user favorite
# return list
user_genre = get_most_watched_genre(user_data)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 The description for this function states that we should "consider the user's most frequently watched genre." This sounds a lot like what get_most_watched_genre calculates! Reusing get_most_watched_genre gets us the required most watched genre information while keeping our code DRY.

Comment thread viewing_party/party.py
# add to empty genre rec only if genre of the movies is equal to user favorite
# return list
user_genre = get_most_watched_genre(user_data)
friends_unique = get_friends_unique_watched(user_data)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 The description for this function states that the results should only include movies that a friend has watched, but the user has not. This sounds a lot like what get_friends_unique_watched calculates! Reusing get_friends_unique_watched gets us the unique movies watched by friends, while keeping our code DRY.

All we have left to do is filter the list down to the movies in the same genre as the user's most watched genre.

Comment thread viewing_party/party.py
Comment on lines +136 to +139
genre_reqs = []
for data in friends_unique:
if data["genre"] == user_genre:
genre_reqs.append(data)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, we're filtering a list of movies based on a relatively simple criterion: get the movies that have a matching genre. This would be a great fit for a list comprehension. This could be expressed as

    genre_reqs = [movie for movie in friends_unique if movie["genre"] == user_genre]

This may look a little odd, but they pop up all over the place in Python code for cases like this, so it's worth getting used to reading and writing them.

Comment thread viewing_party/party.py
# put movies in list only if in user favorites
# AND NOT in friend's watched
favorites_recs = []
user_unique = get_unique_watched(user_data)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍The description for this function states that the results should only include movies that no friends have watched. This sounds a lot like what get_unique_watched calculates! However, this problem differs slightly in that we're starting from the user's favorites list, not their watched list, whereas get_unique_watched starts from the watched list.

If we make a bit of an assumption that a user cannot have made a movie a favorite that they have not yet already watched, then we could approach this by getting the unique user-watched movies (no friends will have watched these) and then filter them to the movies in the user's favorites list. It turns out this is a safe assumption to make for this project, allowing us to reuse our get_unique_watched.

However, if we assumed that there were no direct connection between the user's watched list and their favorites (maybe a user decides to use their "favorites" data to track some other collection of movies or to just register approval of the concept of a movie), we might not choose to reuse get_unique_watched directly.

In this case, it would be possible to make another helper function, say filter_movies_against_friends_watched(movie_list), that performs the same logic that we have in get_unique_watched, just that it starts with the list of movies passed in rather than assuming it should start with user_data["watched"].

Then for get_unique_watched we could pass in user_data["watched"], while for get_rec_from_favorites we could pass in user_data["favorites"]!

Comment thread viewing_party/party.py
favorites_recs = []
user_unique = get_unique_watched(user_data)
for data in user_unique:
if data in user_data["favorites"]:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on how large we're expecting the favorites list to become, it could be worth extracting the titles into a collection with more favorable lookup time complexity rather than directly checking that each movie is in the favorites list (linear time).

Copy link
Copy Markdown

@anselrognlie anselrognlie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update!

Comment thread viewing_party/party.py
Comment on lines +64 to +70
for movie in user_data["watched"]:
genre = movie["genre"]

if genre in genre_counts: # can maybe do this instead: genre_counts[genre] = genre_counts.get(genre, 0) + 1
genre_counts[genre] += 1
else:
genre_counts[genre] = 1
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Nice use of a frequency map to count up the appearances of each genre in a single pass through the list of movies. This has better time complexity than trying to count the occurrences of each genre while looking for the most frequent genre, which results in more nested iterations. Instead, we can trade some space to store the results of counting them all once.

Comment thread viewing_party/party.py
for movie in user_data["watched"]:
genre = movie["genre"]

if genre in genre_counts: # can maybe do this instead: genre_counts[genre] = genre_counts.get(genre, 0) + 1
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 I like the .get() approach since it hides the conditional behind the method call, making this code simpler to read. The .get() approach isn't any more or less efficient (internally, it basically has to do the same thing we're doing here), so we'd really be choosing this for the improved readability.

Comment thread viewing_party/party.py
Comment on lines +72 to +74
if genre_counts[genre] > top_count:
top_count = genre_counts[genre]
top_genre = genre
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer to pull this out of the loop that's building the frequency map into its own loop. Doing so would allow for greater decomposability making it easier to refactor the major steps. When everything is mixed into a single loop, it can be difficult to meaningfully untangle them.

But we could instead write

    for movie in user_data["watched"]:
        genre = movie["genre"]

        if genre in genre_counts:
            genre_counts[genre] += 1
        else:
            genre_counts[genre] = 1

    for genre, count in genre_counts.items():
        if count > top_count:
            top_count = count
            top_genre = genre

which has exactly same complexity, but could be further refactored to

    genre_counts = count_genres(user_data["watched"])
    top_genre = get_max_genre(genre_counts)

assuming we split those two loops into the functions listed above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants