Try It Yourself
You are almost done with the course. Nice job.
Fortunately, we have a couple more interesting problems for you before you go.
As always, run the setup code below before working on the questions.
from learntools.core import binder; binder.bind(globals())
from learntools.python.ex6 import *
print('Setup complete.')
Setup complete.
Exercises
0.
Let's start with a string lightning round to warm up. What are the lengths of the strings below?
For each of the five strings below, predict what len() would return when passed that string. Use the variable length to record your answer, then run the cell to check whether you were right.
a = ""
length = 0
q0.a.check()
Correct:
The empty string has length zero. Note that the empty string is also the only string that Python treats as False when converted to a boolean.
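The truthiness remark above can be checked directly. This is a minimal sketch; the variable names are illustrative and not part of the exercise.

```python
# Only the empty string is falsy. Any non-empty string is truthy,
# even one that holds a single space or the character "0".
empty = ""
blank = " "
zero_text = "0"

print(len(empty), bool(empty))          # 0 False
print(len(blank), bool(blank))          # 1 True
print(len(zero_text), bool(zero_text))  # 1 True
```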
b = "it's ok"
length = 7
q0.b.check()
Correct:
Keep in mind Python includes spaces (and punctuation) when counting string length.
c = 'it\'s ok'
# length = ____
length = 7
q0.c.check()
Correct:
Even though we use different syntax to create it, the string c is identical to b. In particular, note that the backslash is not part of the string, so it doesn't contribute to its length.
d = """hey"""
# length = ____
length = 3
q0.d.check()
Correct:
The fact that this string was created using triple-quote syntax doesn't make any difference in terms of its content or length. This string is exactly the same as 'hey'.
e = '\n'
# length = ____
length = 1
q0.e.check()
Correct:
The newline character is just a single character! (Even though we represent it to Python using a combination of two characters.)
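The points about quoting and escapes in this lightning round can be verified in one cell. The sketch below also adds a raw string for contrast, which the exercise itself does not cover.

```python
# Different quoting styles can produce identical strings.
b = "it's ok"
c = 'it\'s ok'
print(b == c)                # True: the backslash only escapes the quote
print("""hey""" == 'hey')    # True: triple quotes do not change the content

# An escape sequence is two characters in source code but one in the string.
print(len('\n'))             # 1

# Added for contrast: in a raw string the backslash is kept literally.
print(len(r'\n'))            # 2
```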
1.
There is a saying that "Data scientists spend 80% of their time cleaning data, and 20% of their time complaining about cleaning data." Let's see if you can write a function to help clean US zip code data. Given a string, it should return whether or not that string represents a valid zip code. For our purposes, a valid zip code is any string consisting of exactly 5 digits.
HINT: str has a method that will be useful here. Use help(str) to review a list of string methods.
def is_valid_zip(zip_code):
    """
    Returns whether the input string is a valid (5 digit) zip code
    """
    return zip_code.isdigit() and len(zip_code) == 5
# Check your answer
q1.check()
Correct
# q1.hint()
q1.solution()
Solution:
def is_valid_zip(zip_code):
    return len(zip_code) == 5 and zip_code.isdigit()
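One caveat worth knowing: str.isdigit() also returns True for non-ASCII digit characters, such as fullwidth digits. The exercise does not require handling this, but if the data might contain such characters, a stricter check could look like the sketch below. The name is_valid_zip_ascii is hypothetical, and str.isascii() assumes Python 3.7 or later.

```python
def is_valid_zip_ascii(zip_code):
    """Hypothetical stricter variant: exactly five ASCII digits (0-9)."""
    return len(zip_code) == 5 and zip_code.isascii() and zip_code.isdigit()

# Fullwidth digits 1-5, written with escapes to avoid encoding issues.
fullwidth = "\uff11\uff12\uff13\uff14\uff15"

print(is_valid_zip_ascii("12345"))    # True
print(is_valid_zip_ascii("1234"))     # False: too short
print(is_valid_zip_ascii("1234x"))    # False: contains a letter
print(fullwidth.isdigit())            # True: isdigit() accepts these characters
print(is_valid_zip_ascii(fullwidth))  # False: rejected by the ASCII check
```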
2.
A researcher has gathered thousands of news articles. But she wants to focus her attention on articles including a specific word. Complete the function below to help her filter her list of articles.
Your function should meet the following criteria:
- Do not include documents where the keyword string shows up only as a part of a larger word. For example, if she were looking for the keyword “closed”, you would not include the string “enclosed.”
- She does not want you to distinguish upper case from lower case letters. So the phrase “Closed the case.” would be included when the keyword is “closed”.
- Do not let periods or commas affect what is matched. “It is closed.” would be included when the keyword is “closed”. But you can assume there are no other types of punctuation.
def word_search(doc_list, keyword):
    """
    Takes a list of documents (each document is a string) and a keyword.
    Returns list of the index values into the original list for all documents
    containing the keyword.

    Example:
    >>> doc_list = ["The Learn Python Challenge Casino.", "They bought a car", "Casinoville"]
    >>> word_search(doc_list, 'casino')
    [0]
    """
    # Lowercase the keyword too, so the match is case-insensitive on both sides.
    keyword = keyword.lower()
    # Use enumerate instead of doc_list.index(doc): index() returns only the
    # first position of a document that appears more than once in the list.
    return [i for i, doc in enumerate(doc_list)
            if keyword in doc.lower().replace('.', ' ').replace(',', ' ').split()]
# Check your answer
q2.check()
Correct
# q2.hint()
q2.solution()
Solution:
def word_search(doc_list, keyword):
    # list to hold the indices of matching documents
    indices = []
    # Iterate through the indices (i) and elements (doc) of documents
    for i, doc in enumerate(doc_list):
        # Split the string doc into a list of words (according to whitespace)
        tokens = doc.split()
        # Make a transformed list where we 'normalize' each word to facilitate matching.
        # Periods and commas are removed from the end of each word, and it's set to all lowercase.
        normalized = [token.rstrip('.,').lower() for token in tokens]
        # Is there a match? If so, update the list of matching indices.
        if keyword.lower() in normalized:
            indices.append(i)
    return indices
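To see how the solution behaves against the three criteria, here is a small check. The sample documents are made up for illustration, and the last two lines show why enumerate is a safer way to collect indices than list.index when the same document can appear twice.

```python
def word_search(doc_list, keyword):
    # Same logic as the solution above, repeated so this cell runs on its own.
    indices = []
    for i, doc in enumerate(doc_list):
        normalized = [token.rstrip('.,').lower() for token in doc.split()]
        if keyword.lower() in normalized:
            indices.append(i)
    return indices

# Illustrative documents: note that the first and last are identical.
docs = ["It is closed.", "Enclosed is the form", "Closed the case.", "It is closed."]

print(word_search(docs, "closed"))   # [0, 2, 3]

# Pitfall: list.index returns the first match, so the duplicate maps back to 0.
via_index = [docs.index(d) for d in docs if "closed" in d.lower().rstrip('.').split()]
print(via_index)                     # [0, 2, 0]
```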
3.
Now the researcher wants to supply multiple keywords to search for. Complete the function below to help her.
(You're encouraged to use the word_search function you just wrote when implementing this function. Reusing code in this way makes your programs more robust and readable - and it saves typing!)
def multi_word_search(doc_list, keywords):
    """
    Takes list of documents (each document is a string) and a list of keywords.
    Returns a dictionary where each key is a keyword, and the value is a list of indices
    (from doc_list) of the documents containing that keyword

    >>> doc_list = ["The Learn Python Challenge Casino.", "They bought a car and a casino", "Casinoville"]
    >>> keywords = ['casino', 'they']
    >>> multi_word_search(doc_list, keywords)
    {'casino': [0, 1], 'they': [1]}
    """
    return {keyword: word_search(doc_list, keyword) for keyword in keywords}
# Check your answer
q3.check()
Correct
q3.solution()
Solution:
def multi_word_search(documents, keywords):
    keyword_to_indices = {}
    for keyword in keywords:
        keyword_to_indices[keyword] = word_search(documents, keyword)
    return keyword_to_indices
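As a quick sanity check, the docstring example can be run end to end. The sketch below repeats both functions so that it is self-contained. A keyword that matches nothing, such as 'bus' here (an added illustration), still gets a key with an empty list.

```python
def word_search(doc_list, keyword):
    # Same logic as the earlier solution, repeated so this cell runs on its own.
    indices = []
    for i, doc in enumerate(doc_list):
        normalized = [token.rstrip('.,').lower() for token in doc.split()]
        if keyword.lower() in normalized:
            indices.append(i)
    return indices

def multi_word_search(documents, keywords):
    # Build one dictionary entry per keyword by reusing word_search.
    keyword_to_indices = {}
    for keyword in keywords:
        keyword_to_indices[keyword] = word_search(documents, keyword)
    return keyword_to_indices

doc_list = ["The Learn Python Challenge Casino.", "They bought a car and a casino", "Casinoville"]
result = multi_word_search(doc_list, ['casino', 'they', 'bus'])
print(result)   # {'casino': [0, 1], 'they': [1], 'bus': []}
```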
Keep Going
You've learned a lot. But even the best programmers rely heavily on "libraries" of code from other programmers. You'll learn about that in the last lesson.