This notebook is an exercise in the Pandas course. You can reference the tutorial at this link.
Introduction
Run the following cell to load your data and some utility functions.
import pandas as pd
reviews = pd.read_csv("../input/wine-reviews/winemag-data-130k-v2.csv", index_col=0)
from learntools.core import binder; binder.bind(globals())
from learntools.pandas.renaming_and_combining import *
print("Setup complete.")
Setup complete.
Exercises
View the first several lines of your data by running the cell below:
reviews.head()
| | country | description | designation | points | price | province | region_1 | region_2 | taster_name | taster_twitter_handle | title | variety | winery |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Italy | Aromas include tropical fruit, broom, brimston... | Vulkà Bianco | 87 | NaN | Sicily & Sardinia | Etna | NaN | Kerin O’Keefe | @kerinokeefe | Nicosia 2013 Vulkà Bianco (Etna) | White Blend | Nicosia |
| 1 | Portugal | This is ripe and fruity, a wine that is smooth... | Avidagos | 87 | 15.0 | Douro | NaN | NaN | Roger Voss | @vossroger | Quinta dos Avidagos 2011 Avidagos Red (Douro) | Portuguese Red | Quinta dos Avidagos |
| 2 | US | Tart and snappy, the flavors of lime flesh and... | NaN | 87 | 14.0 | Oregon | Willamette Valley | Willamette Valley | Paul Gregutt | @paulgwine | Rainstorm 2013 Pinot Gris (Willamette Valley) | Pinot Gris | Rainstorm |
| 3 | US | Pineapple rind, lemon pith and orange blossom ... | Reserve Late Harvest | 87 | 13.0 | Michigan | Lake Michigan Shore | NaN | Alexander Peartree | NaN | St. Julian 2013 Reserve Late Harvest Riesling ... | Riesling | St. Julian |
| 4 | US | Much like the regular bottling from 2012, this... | Vintner's Reserve Wild Child Block | 87 | 65.0 | Oregon | Willamette Valley | Willamette Valley | Paul Gregutt | @paulgwine | Sweet Cheeks 2012 Vintner's Reserve Wild Child... | Pinot Noir | Sweet Cheeks |
1.
region_1 and region_2 are pretty uninformative names for locale columns in the dataset. Create a copy of reviews with these columns renamed to region and locale, respectively.
# Your code here
#renamed = ____
renamed = reviews.rename(columns={'region_1': 'region', 'region_2': 'locale'})
# Check your answer
q1.check()
Correct
#q1.hint()
q1.solution()
Solution:
renamed = reviews.rename(columns=dict(region_1='region', region_2='locale'))
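As a side note, rename never modifies the original DataFrame; it returns a new one. A minimal sketch on a small toy frame (not the graded dataset) showing both points:

```python
import pandas as pd

# Toy frame standing in for the wine reviews data.
df = pd.DataFrame({"region_1": ["Etna", "Douro"], "region_2": ["Sicily", None]})

# Pass a {old: new} mapping for columns; df itself is left untouched.
renamed = df.rename(columns={"region_1": "region", "region_2": "locale"})
print(list(renamed.columns))  # ['region', 'locale']
print(list(df.columns))       # ['region_1', 'region_2']
```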
2.
Set the index name in the dataset to wines.
#reindexed = ____
reindexed = reviews.rename_axis("wines", axis="rows")
# Check your answer
q2.check()
Correct
#q2.hint()
q2.solution()
Solution:
reindexed = reviews.rename_axis('wines', axis='rows')
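rename_axis names the axis itself rather than any individual label. A quick illustration on a toy frame (not the graded dataset): axis="rows" (equivalently axis=0) targets the row index, and the name shows up as index.name.

```python
import pandas as pd

df = pd.DataFrame({"points": [87, 87]})

# Name the row index "wines"; the column axis could be named with axis="columns".
named = df.rename_axis("wines", axis="rows")
print(named.index.name)  # wines
```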
3.
The Things on Reddit dataset includes product links from a selection of top-ranked forums ("subreddits") on reddit.com. Run the cell below to load a dataframe of products mentioned on the /r/gaming subreddit and another dataframe for products mentioned on the /r/movies subreddit.
gaming_products = pd.read_csv("../input/things-on-reddit/top-things/top-things/reddits/g/gaming.csv")
gaming_products['subreddit'] = "r/gaming"
movie_products = pd.read_csv("../input/things-on-reddit/top-things/top-things/reddits/m/movies.csv")
movie_products['subreddit'] = "r/movies"
Create a DataFrame of products mentioned on either subreddit.
#combined_products = ____
combined_products = pd.concat([gaming_products, movie_products])
# Check your answer
q3.check()
Correct
#q3.hint()
q3.solution()
Solution:
combined_products = pd.concat([gaming_products, movie_products])
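pd.concat stacks the frames vertically and keeps each frame's original row labels, so the combined index can contain duplicates; passing ignore_index=True renumbers the rows instead. A small sketch with made-up product rows (not the real Reddit data):

```python
import pandas as pd

a = pd.DataFrame({"name": ["controller"], "subreddit": ["r/gaming"]})
b = pd.DataFrame({"name": ["popcorn maker"], "subreddit": ["r/movies"]})

# Without ignore_index both rows would keep label 0; with it, rows are 0..n-1.
combined = pd.concat([a, b], ignore_index=True)
print(len(combined), list(combined.index))  # 2 [0, 1]
```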
4.
The Powerlifting Database dataset on Kaggle includes one CSV table for powerlifting meets and a separate one for powerlifting competitors. Run the cell below to load these datasets into dataframes:
Kaggle 上的 举重数据库 数据集包括一张用于举重比赛的 CSV 表和一张单独的举重参赛者表格。 运行下面的单元格将这些数据集加载到DataFrame中:
powerlifting_meets = pd.read_csv("../input/powerlifting-database/meets.csv")
powerlifting_competitors = pd.read_csv("../input/powerlifting-database/openpowerlifting.csv")
Both tables include references to a MeetID, a unique key for each meet (competition) included in the database. Using this, generate a dataset combining the two tables into one.
#powerlifting_combined = ____
# join() matches rows on the index, so set MeetID as the index of both frames first
powerlifting_combined = powerlifting_meets.set_index("MeetID").join(powerlifting_competitors.set_index("MeetID"))
# Check your answer
q4.check()
Correct
# q4.hint()
q4.solution()
Hint: Use pd.DataFrame.join().
Solution:
powerlifting_combined = powerlifting_meets.set_index("MeetID").join(powerlifting_competitors.set_index("MeetID"))
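The same combination can also be expressed with pd.merge, which takes an explicit key column instead of requiring the key to be on the index. A sketch with tiny invented tables (the column names other than MeetID are hypothetical):

```python
import pandas as pd

meets = pd.DataFrame({"MeetID": [1, 2], "MeetName": ["Open A", "Open B"]})
lifters = pd.DataFrame({"MeetID": [1, 1, 2], "Name": ["Ann", "Bo", "Cy"]})

# join(): match on the index, so reindex both frames by MeetID first.
via_join = meets.set_index("MeetID").join(lifters.set_index("MeetID"))

# merge(): match on a named column directly; same rows result.
via_merge = pd.merge(meets, lifters, on="MeetID")
print(len(via_join), len(via_merge))  # 3 3
```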
Congratulations!
You've finished the Pandas micro-course. Many data scientists consider fluency with Pandas their most useful and practical skill, because it lets them make quick progress on any project.
If you'd like to apply your new skills to examining geospatial data, you're encouraged to check out our Geospatial Analysis micro-course.
You can also take advantage of your Pandas skills by entering a Kaggle Competition or by answering a question you find interesting using Kaggle Datasets.