Skip to content

Latest commit

 

History

History
193 lines (150 loc) · 5.68 KB

File metadata and controls

193 lines (150 loc) · 5.68 KB
comments difficulty edit_url tags
true
Medium
Database

中文文档

Description

Table: cities

+-------------+---------+
| Column Name | Type    | 
+-------------+---------+
| state       | varchar |
| city        | varchar |
+-------------+---------+
(state, city) is the combination of columns with unique values for this table.
Each row of this table contains the state name and the city name within that state.

Write a solution to find all the cities in each state and analyze them based on the following requirements:

  • Combine all cities into a comma-separated string for each state.
  • Only include states that have at least 3 cities.
  • Only include states where at least one city starts with the same letter as the state name.

Return the result table ordered by the count of matching-letter cities in descending order and then by state name in ascending order.

The result format is in the following example.

 

Example:

Input:

cities table:

+--------------+---------------+
| state        | city          |
+--------------+---------------+
| New York     | New York City |
| New York     | Newark        |
| New York     | Buffalo       |
| New York     | Rochester     |
| California   | San Francisco |
| California   | Sacramento    |
| California   | San Diego     |
| California   | Los Angeles   |
| Texas        | Tyler         |
| Texas        | Temple        |
| Texas        | Taylor        |
| Texas        | Dallas        |
| Pennsylvania | Philadelphia  |
| Pennsylvania | Pittsburgh    |
| Pennsylvania | Pottstown     |
+--------------+---------------+

Output:

+-------------+-------------------------------------------+-----------------------+
| state       | cities                                    | matching_letter_count |
+-------------+-------------------------------------------+-----------------------+
| Pennsylvania| Philadelphia, Pittsburgh, Pottstown       | 3                     |
| Texas       | Dallas, Taylor, Temple, Tyler             | 3                     |
| New York    | Buffalo, Newark, New York City, Rochester | 2                     |
+-------------+-------------------------------------------+-----------------------+

Explanation:

  • Pennsylvania:
    <ul>
    	<li>Has 3 cities (meets minimum requirement)</li>
    	<li>All 3 cities start with &#39;P&#39; (same as state)</li>
    	<li>matching_letter_count = 3</li>
    </ul>
    </li>
    <li><strong>Texas</strong>:
    <ul>
    	<li>Has 4 cities (meets minimum requirement)</li>
    	<li>3 cities (Taylor, Temple, Tyler) start with &#39;T&#39; (same as state)</li>
    	<li>matching_letter_count = 3</li>
    </ul>
    </li>
    <li><strong>New York</strong>:
    <ul>
    	<li>Has 4 cities (meets minimum requirement)</li>
    	<li>2 cities (Newark, New York City) start with &#39;N&#39; (same as state)</li>
    	<li>matching_letter_count = 2</li>
    </ul>
    </li>
    <li><strong>California</strong> is not included in the output because:
    <ul>
    	<li>Although it has 4 cities (meets minimum requirement)</li>
    	<li>No cities start with &#39;C&#39; (doesn&#39;t meet the matching letter requirement)</li>
    </ul>
    </li>
    

Note:

  • Results are ordered by matching_letter_count in descending order
  • When matching_letter_count is the same (Texas and New York both have 2), they are ordered by state name alphabetically
  • Cities in each row are ordered alphabetically

Solutions

Solution 1: Group Aggregation + Filtering

We can group the cities table by the state field, then apply filtering on each group to retain only the groups that meet the specified conditions.

MySQL

# Write your MySQL query statement below
SELECT
    state,
    GROUP_CONCAT(city ORDER BY city SEPARATOR ', ') AS cities,
    COUNT(
        CASE
            WHEN LEFT(city, 1) = LEFT(state, 1) THEN 1
        END
    ) AS matching_letter_count
FROM cities
GROUP BY 1
HAVING COUNT(city) >= 3 AND matching_letter_count > 0
ORDER BY 3 DESC, 1;

Pandas

import pandas as pd


def state_city_analysis(cities: pd.DataFrame) -> pd.DataFrame:
    cities["matching_letter"] = cities["city"].str[0] == cities["state"].str[0]

    result = (
        cities.groupby("state")
        .agg(
            cities=("city", lambda x: ", ".join(sorted(x))),
            matching_letter_count=("matching_letter", "sum"),
            city_count=("city", "count"),
        )
        .reset_index()
    )

    result = result[(result["city_count"] >= 3) & (result["matching_letter_count"] > 0)]

    result = result.sort_values(
        by=["matching_letter_count", "state"], ascending=[False, True]
    )

    result = result.drop(columns=["city_count"])

    return result